BIS501
Chapter 1
• Programmers have some tools that allow them to build new tools
Computer
Programmer
Hardware + Software
From a software creator’s point of view, we build the software. The end
users (stakeholders/actors) are our masters - who we want to please -
often they pay us money when they are pleased. But the data,
information, and networks are our problem to solve on their behalf.
The hardware and software are our friends and allies in this quest.
What is Code? Software? A Program?
https://www.youtube.com/watch?v=XiBYM6g8Tck
Programs for Humans...
while music is playing:
    Left hand out and up
    Right hand out and up
    Flip Left hand
    Flip Right hand
    Left hand to right shoulder
    Right hand to left shoulder
    Left hand to back of head
    Right hand to back of head
    Left hand to right hip
    Right hand to left hip
    Left hand on left bottom
    Right hand on right bottom
    Wiggle
    Wiggle
    Jump
https://www.youtube.com/watch?v=XiBYM6g8Tck
text = input('Enter text:')
counts = dict()
# Looping over a string visits one character at a time
for line in text:
    words = line.split()
    for word in words:
        counts[word] = counts.get(word,0) + 1
# Find the most common (non-whitespace) character
bigcount = None
bigword = None
for word,count in counts.items():
    if bigcount is None or count > bigcount:
        bigword = word
        bigcount = count
print(bigword, bigcount)
Hardware Architecture
http://upload.wikimedia.org/wikipedia/commons/3/3d/RaspberryPi.jpg
[Figure: Generic Computer, showing Software, Input and Output Devices, the Central Processing Unit (asking “What Next?”), Main Memory, and Secondary Memory]
Definitions
• Central Processing Unit: Runs the Program - The CPU is always wondering “what to do next”. Not the brains exactly - very dumb but very very fast
• Main Memory: Fast small temporary storage - lost on reboot - aka RAM
• Secondary Memory: Slower large permanent storage - lasts until deleted - disk drive / memory stick
[Figure: the Generic Computer diagram again, annotated with the Python fragment: if x < 3: print]
[Figure: the Generic Computer diagram again, annotated with Machine Language: 01001001 00111001]
Totally Hot CPU
http://www.youtube.com/watch?v=y39D4529FM4
Hard Disk in Action
http://www.youtube.com/watch?v=9eMWG3fwiEU
Python as a Language
Python is the language of the Python
Interpreter and those who can converse with
it. An individual who can speak Python is
known as a Pythonista. It is a very uncommon
skill, and may be hereditary. Nearly all known
Pythonistas use software initially developed
by Guido van Rossum.
Early Learner: Syntax Errors
• We need to learn the Python language so we can communicate our instructions to
Python. In the beginning we will make lots of mistakes and speak gibberish like
small children.
• When you make a mistake, the computer does not think you are “cute”. It says
“syntax error” - given that it knows the language and you are just learning it. It
seems like Python is cruel and unfeeling.
• You must remember that you are intelligent and can learn. The computer is
simple and very fast, but cannot learn. So it is easier for you to learn Python than
for the computer to learn English...
Talking to Python
csev$ python3
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 5 2015, 21:12:44)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 1
>>> print(x)
1
>>> x = x + 1
>>> print(x)
2
>>> exit()
At the >>> prompt, Python is asking “What next?”. This is a good test to make sure that you have Python correctly installed. Note that quit() also works to end the interactive session.
What Do We Say?
Elements of Python
• Vocabulary / Words - Variables and Reserved words (Chapter 2)
A short “story” about how to count characters in Python
counts = dict()
for line in text:
    words = line.split()
    for word in words:
        counts[word] = counts.get(word,0) + 1
bigcount = None
bigword = None
for word,count in counts.items():
    if bigcount is None or count > bigcount:
        bigword = word
        bigcount = count
print(bigword, bigcount)
Reserved Words
You cannot use reserved words as variable names / identifiers
x = 2 Assignment statement
x = x + 2 Assignment with expression
print(x) Print statement
• Most programs are much longer, so we type them into a file and tell
Python to run the commands in the file.
• Script
Program:
x = 5
if x < 10:
    print('Smaller')
if x > 20:
    print('Bigger')
print('Finis')
Output:
Smaller
Finis
Repeated Steps
Program:
n = 5
while n > 0 :
    print(n)
    n = n - 1
print('Blastoff!')
Output:
5
4
3
2
1
Blastoff!
Loops (repeated steps) have iteration variables that change each time through a loop.
A short Python “Story” about how to count characters
text = input('Enter text:')
This first line is a sequential step, and input is a word used to read data from a user. The rest of the story (the counts loop and the search for the biggest count) is the program shown earlier.
x = 12.2
y = 14
[Figure: memory boxes labeled x (holding 12.2) and y (holding 14)]
Variables
• A variable is a named place in the memory where a programmer can store
data and later retrieve the data using the variable “name”
• Case Sensitive
http://en.wikipedia.org/wiki/Mnemonic
What are these bits of code doing?
x1q3z9ocd = 35.0
x1q3z9afd = 12.50
x1q3p9afd = x1q3z9ocd * x1q3z9afd
print(x1q3p9afd)

hours = 35.0
rate = 12.50
pay = hours * rate
print(pay)
Sentences or Lines
x = 2 Assignment statement
x = x + 2 Assignment with expression
print(x) Print statement
x = 3.9 * x * ( 1 - x )
A variable is a memory location used to store a value (0.6).
The right side is an expression. Once the expression is evaluated, the result is placed in (assigned to) the variable on the left side (i.e., x).
With x holding 0.6, 1 - x is 0.4 and 3.9 * 0.6 * 0.4 is 0.936, so 0.936 becomes the new value of x.
Expressions…
Numeric Expressions
• Because of the lack of mathematical symbols on computer keyboards - we use “computer-speak” to express the classic math operations
• Asterisk is multiplication
Operator    Operation
+           Addition
-           Subtraction
*           Multiplication
/           Division
Order of Evaluation
• When we string operators together - Python must know which one
to do first
x = 1 + 2 * 3 - 4 / 5 ** 6
Operator Precedence Rules
Highest precedence rule to lowest precedence rule:
• Parentheses
• Power
• Multiplication (and division)
• Addition (and subtraction)
• Left to right
>>> x = 1 + 2 ** 3 / 4 * 5
>>> print(x)
11.0
>>>
Step by step:
1 + 2 ** 3 / 4 * 5
1 + 8 / 4 * 5
1 + 2 * 5
1 + 10
11
Operator Precedence
• Remember the rules top to bottom
• When writing code - use parentheses
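As a small sketch of the advice to use parentheses, the two expressions from these slides can be written in fully parenthesized form and compared with the originals. The variable names same and y_explicit are only for this illustration.

```python
# The slide's example, and the same expression with the precedence made explicit.
x = 1 + 2 ** 3 / 4 * 5
same = 1 + (((2 ** 3) / 4) * 5)
print(x, same)          # both are 11.0

# The longer expression from the Order of Evaluation slide.
y = 1 + 2 * 3 - 4 / 5 ** 6
y_explicit = (1 + (2 * 3)) - (4 / (5 ** 6))
print(y == y_explicit)  # True: the parentheses only spell out what Python already does
```

The parenthesized versions compute exactly the same values, but a reader no longer has to recall the precedence rules to follow them.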
• Why comment?
• Operator precedence
Exercise
Enter Hours: 35
Enter Rate: 2.75
Pay: 96.25
Comparison Operators
• Boolean expressions ask a question and produce a Yes or No result which we use to control program flow
• Boolean expressions using comparison operators evaluate to True / False or Yes / No
• Comparison operators look at variables but do not change the variables
Python    Meaning
<         Less than
<=        Less than or Equal to
==        Equal to
>=        Greater than or Equal to
>         Greater than
!=        Not equal
Remember: “=” is used for assignment.
http://en.wikipedia.org/wiki/George_Boole
Comparison Operators
x = 5
if x == 5 :
    print('Equals 5')
if x > 4 :
    print('Greater than 4')
if x >= 5 :
    print('Greater than or Equals 5')
if x < 6 : print('Less than 6')
if x <= 5 :
    print('Less than or Equals 5')
if x != 6 :
    print('Not equal 6')
Output:
Equals 5
Greater than 4
Greater than or Equals 5
Less than 6
Less than or Equals 5
Not equal 6
One-Way Decisions
x = 5
print('Before 5')
if x == 5 :
    print('Is 5')
    print('Is Still 5')
    print('Third 5')
print('Afterwards 5')
print('Before 6')
if x == 6 :
    print('Is 6')
    print('Is Still 6')
    print('Third 6')
print('Afterwards 6')
Output:
Before 5
Is 5
Is Still 5
Third 5
Afterwards 5
Before 6
Afterwards 6
Indentation
• Increase indent after an if statement or for statement (after : )
• Maintain indent to indicate the scope of the block (which lines are affected by the if/for)
• Most text editors can turn tabs into spaces - make sure to enable this feature. This will save you much unnecessary pain.
• Python cares a *lot* about how far a line is indented. If you mix tabs and spaces, you may get “indentation errors” even if everything looks fine
increase / maintain after if or for
decrease to indicate end of block
x = 5
if x > 2 :
    print('Bigger than 2')
    print('Still bigger')
print('Done with 2')
for i in range(5) :
    print(i)
    if i > 2 :
        print('Bigger than 2')
    print('Done with i', i)
print('All Done')
Think About begin/end Blocks
x = 5
if x > 2 :
    print('Bigger than 2')
    print('Still bigger')
print('Done with 2')
for i in range(5) :
    print(i)
    if i > 2 :
        print('Bigger than 2')
    print('Done with i', i)
print('All Done')
Nested Decisions
x = 42
if x > 1 :
    print('More than one')
    if x < 100 :
        print('Less than 100')
print('All done')
Two-way Decisions
• Sometimes we want to do one thing if a logical expression is true and something else if the expression is false
x = 4
if x > 2 :
    print('Bigger')
else :
    print('Smaller')
print('All done')
More Conditional Structures…
Multi-way
if x < 2 :
    print('small')
elif x < 10 :
    print('Medium')
else :
    print('LARGE')
print('All done')
Multi-way
Tracing the same if / elif / else program for three values of x:
x = 0: x < 2 is true, so it prints small, then All done
x = 5: x < 2 is false and x < 10 is true, so it prints Medium, then All done
x = 20: both tests are false, so the else branch prints LARGE, then All done
Multi-way
# No Else
x = 5
if x < 2 :
    print('Small')
elif x < 10 :
    print('Medium')
print('All done')

if x < 2 :
    print('Small')
elif x < 10 :
    print('Medium')
elif x < 20 :
    print('Big')
elif x < 40 :
    print('Large')
elif x < 100:
    print('Huge')
else :
    print('Ginormous')
Multi-way Puzzles
Which will never print regardless of the value for x?
if x < 2 :
    print('Below 2')
elif x >= 2 :
    print('Two or more')
else :
    print('Something else')

if x < 2 :
    print('Below 2')
elif x < 20 :
    print('Below 20')
elif x < 10 :
    print('Below 10')
else :
    print('Something else')
The try / except Structure
astr = 'Hello Bob'
try:
    istr = int(astr)
except:
    istr = -1
print('First', istr)
astr = '123'
try:
    istr = int(astr)
except:
    istr = -1
print('Second', istr)
When the first conversion fails - it just drops into the except: clause and the program continues.
When the second conversion succeeds - it just skips the except: clause and the program continues.
$ python tryexcept.py
First -1
Second 123
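The slide uses a bare except:, which catches every kind of error. A common variation, sketched here as an alternative rather than as the slide's method, names the specific error that int() raises for bad text (ValueError). The helper name to_int and its default parameter are made up for this example.

```python
def to_int(text, default=-1):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:      # only conversion failures are handled here
        return default

print('First', to_int('Hello Bob'))
print('Second', to_int('123'))
```

Naming the exception means an unrelated mistake elsewhere in the try block is not silently turned into -1.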
try / except
astr = 'Bob'
try:
    print('Hello')
    istr = int(astr)
    print('There')
except:
    istr = -1
print('Done', istr)
Enter Hours: 45
Enter Rate: 10
Pay: 475.0
475 = 40 * 10 + 5 * 15
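The arithmetic 475 = 40 * 10 + 5 * 15 can be read as regular pay for the first 40 hours plus 1.5 times the rate for the hours beyond 40. The sketch below assumes exactly that rule (the 40-hour threshold and the 1.5 factor are inferred from the numbers shown, not stated in the slides), and the function name compute_pay is hypothetical.

```python
def compute_pay(hours, rate):
    """Pay with time-and-a-half for hours above 40 (assumed rule)."""
    if hours <= 40:
        return hours * rate
    overtime_hours = hours - 40
    return 40 * rate + overtime_hours * rate * 1.5

print('Pay:', compute_pay(45.0, 10.0))   # Pay: 475.0
```

In an interactive version, hours and rate would come from input() and be converted with float() before calling the function.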
Exercise
Enter Hours: 20
Enter Rate: nine
Error, please enter numeric input
Breaking Out of a Loop
• The break statement ends the current loop and jumps to the statement immediately following the loop
• It is like a loop test that can happen anywhere in the body of the loop
while True:
    line = input('> ')
    if line == 'done' :
        break
    print(line)
print('Done!')
Sample session:
> hello there
hello there
> finished
finished
> done
Done!
[Figure: flowchart of the loop; break jumps directly to print('Done!')]
http://en.wikipedia.org/wiki/Transporter_(Star_Trek)
Finishing an Iteration with continue
The continue statement ends the current iteration and jumps to the top of the loop and starts the next iteration
while True:
    line = input('> ')
    if line[0] == '#' :
        continue
    if line == 'done' :
        break
    print(line)
print('Done!')
Sample session:
> hello there
hello there
> # don't print this
> print this!
print this!
> done
Done!
Indefinite Loops
• The loops we have seen so far are pretty easy to examine to see if they will terminate or if they will be “infinite loops”
Definite Loops
• We can write a loop to run the loop once for each of the items in a set using the Python for construct
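A minimal sketch of the for construct: the loop body runs once for each item in the list, and the iteration variable i takes each value in turn. The list countdown is only there so the visited values can be inspected afterwards.

```python
countdown = []
for i in [5, 4, 3, 2, 1]:   # i is the iteration variable
    countdown.append(i)
    print(i)
print('Blastoff!')
```

Because the list has a fixed number of items, the loop is guaranteed to end, which is why it is called a definite loop.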
What is the Largest Number?
3 41 12 9 74 15
Look at the numbers one at a time, keeping track of the largest seen so far:
largest_so_far starts at -1
after 3: largest_so_far is 3
after 41: largest_so_far is 41
after 12: largest_so_far is 41
after 9: largest_so_far is 41
after 74: largest_so_far is 74
after 15: largest_so_far is 74
The largest number is 74
Finding the Largest Value
largest_so_far = -1
print('Before', largest_so_far)
for the_num in [9, 41, 12, 3, 74, 15] :
    if the_num > largest_so_far :
        largest_so_far = the_num
    print(largest_so_far, the_num)
print('After', largest_so_far)
$ python largest.py
Before -1
9 9
41 41
41 12
41 3
74 74
74 15
After 74
We make a variable that contains the largest value we have seen so far. If the current number we are looking at is larger, it is the new largest value we have seen so far.
More Loop Patterns…
Counting in a Loop
zork = 0
print('Before', zork)
for thing in [9, 41, 12, 3, 74, 15] :
    zork = zork + 1
    print(zork, thing)
print('After', zork)
$ python countloop.py
Before 0
1 9
2 41
3 12
4 3
5 74
6 15
After 6
If we just want to search and know if a value was found, we use a variable that
starts at False and is set to True as soon as we find what we are looking for.
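The search idiom described above can be sketched as follows, reusing the list from the other loop examples. The choice of 3 as the value to look for is an assumption made for this illustration.

```python
found = False
print('Before', found)
for value in [9, 41, 12, 3, 74, 15]:
    if value == 3:
        found = True          # stays True for the rest of the loop
    print(found, value)
print('After', found)
```

The flag is never set back to False inside the loop, so one match anywhere in the list is enough.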
How to Find the Smallest Value
How would we change the largest-value program above to make it find the smallest value in the list?
Finding the Smallest Value
smallest_so_far = -1
print('Before', smallest_so_far)
for the_num in [9, 41, 12, 3, 74, 15] :
    if the_num < smallest_so_far :
        smallest_so_far = the_num
    print(smallest_so_far, the_num)
print('After', smallest_so_far)
We switched the variable name to smallest_so_far and switched the > to <
$ python smallbad.py
Before -1
-1 9
-1 41
-1 12
-1 3
-1 74
-1 15
After -1
This does not work: no value in the list is smaller than the starting value -1, so smallest_so_far never changes.
Finding the Smallest Value
smallest = None $ python smallest.py
print('Before') Before
for value in [9, 41, 12, 3, 74, 15] : 99
if smallest is None :
9 41
smallest = value
elif value < smallest : 9 12
smallest = value 33
print(smallest, value) 3 74
print('After', smallest) 3 15
After 3
We still have a variable that is the smallest so far. The first time through the loop
smallest is None, so we take the first value to be the smallest.
The is and is not Operators
smallest = None
print('Before')
for value in [3, 41, 12, 9, 74, 15] :
    if smallest is None :
        smallest = value
    elif value < smallest :
        smallest = value
    print(smallest, value)
print('After', smallest)
• Python has an is operator that can be used in logical expressions
• Implies “is the same as”
• Similar to, but stronger than ==
• is not also is a logical operator
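To make “stronger than ==” concrete, here is a small sketch: == compares values, while is asks whether two names refer to the very same object. The lists a, b, and c are made up for this illustration.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name for the same object

print(a == b)   # True: same values
print(a is b)   # False: two different objects
print(a is c)   # True: one object, two names

smallest = None
print(smallest is None)       # True
print(smallest is not None)   # False
```

In practice, is and is not are mostly used for comparisons against None, as in the loop above.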
Summary
• While loops (indefinite)
• Infinite loops
• Using break
• Using continue
• None constants and variables
• For loops (definite)
• Iteration variables
• Loop idioms
• Largest or smallest
Acknowledgements / Contributions
These slides are Copyright 2010- Charles R. Severance ...
(www.dr-chuck.com) of the University of Michigan School of
Information and open.umich.edu and made available under a
Creative Commons Attribution 4.0 License. Please maintain this
last slide in all copies of the document to comply with the
attribution requirements of the license. If you make a change,
feel free to add your name and organization to the list of
contributors on this page as you republish the materials.
Max Function
>>> big = max('Hello world')
>>> print(big)
w
>>> tiny = min('Hello world')
>>> print(tiny)

>>>
print(tiny) shows what looks like an empty line because the smallest character in 'Hello world' is the space.
A function is some stored code that we use. A function takes some input and produces an output.
• This defines the function but does not execute the body of the function
def print_lyrics():
    print("I'm a lumberjack, and I'm okay.")
    print('I sleep all night and I work all day.')

x = 5
print('Hello')
def print_lyrics():
    print("I'm a lumberjack, and I'm okay.")
    print('I sleep all night and I work all day.')
print('Yo')
x = x + 2
print(x)
Output:
Hello
Yo
7
Definitions and Uses
• Once we have defined a function, we can call (or invoke) it as many times as we like
x = 5
print('Hello')
def print_lyrics():
    print("I'm a lumberjack, and I'm okay.")
    print('I sleep all night and I work all day.')
print('Yo')
print_lyrics()
x = x + 2
print(x)
Output:
Hello
Yo
I'm a lumberjack, and I'm okay.
I sleep all night and I work all day.
7
Arguments
• An argument is a value we pass into the function as its input when we call the function
def greet():
    return "Hello"
print(greet(), "Glenn")
print(greet(), "Sally")
Output:
Hello Glenn
Hello Sally
Return Value
• A “fruitful” function is one that produces a result (or return value)
• The return statement ends the function execution and “sends back” the result of the function
>>> def greet(lang):
...     if lang == 'es':
...         return 'Hola'
...     elif lang == 'fr':
...         return 'Bonjour'
...     else:
...         return 'Hello'
...
>>> print(greet('en'),'Glenn')
Hello Glenn
>>> print(greet('es'),'Sally')
Hola Sally
>>> print(greet('fr'),'Michael')
Bonjour Michael
>>>
Arguments, Parameters, and Results
>>> big = max('Hello world')
>>> print(big)
w
def max(inp):
    blah
    blah
    for x in inp:
        blah
        blah
    return 'w'
Argument: 'Hello world'    Parameter: inp    Result: 'w'
Multiple Parameters / Arguments
• We can define more than one parameter in the function definition
• We simply add more arguments when we call the function
def addtwo(a, b):
    added = a + b
    return added
x = addtwo(3, 5)
print(x)
http://www.py4e.com/code/mbox-short.txt
Opening a File
• Before we can read the contents of the file, we must tell Python which file we are going to work with and what we will be doing with the file
filename is a string
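A minimal sketch of opening a file with the built-in open() function, which takes the filename as a string and an optional mode such as 'r' for reading. So that the example runs anywhere, it first writes a small sample file to a temporary directory. The file name sample.txt and its two lines are invented for this illustration.

```python
import os
import tempfile

# Create a small sample file so the example is self-contained (made-up content).
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
with open(path, 'w') as out:
    out.write('From: alice@example.com\nSubject: hello\n')

fhand = open(path, 'r')   # 'r' means read; it is also the default mode
lines = []
for line in fhand:        # a file handle can be used as a sequence of lines
    lines.append(line)
fhand.close()
print(len(lines))
```

open() does not read the data itself. It returns a handle that the for loop then uses to fetch one line at a time.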
File Processing
A text file has newlines at the end of each line
• Remember - a sequence is an ordered set
Counting Lines in a File
• Open a file read-only
• Use a for loop to read each line
• Count the lines and print out the number of lines
fhand = open('mbox.txt')
count = 0
for line in fhand:
    count = count + 1
print('Line Count:', count)
$ python open.py
Line Count: 132045
Reading the *Whole* File
We can read the whole file (newlines and all) into a single string
>>> fhand = open('mbox-short.txt')
>>> inp = fhand.read()
>>> print(len(inp))
94626
>>> print(inp[:20])
From stephen.marquar
Searching Through a File
We can put an if statement in our for loop to only print lines that meet some criteria
fhand = open('mbox-short.txt')
for line in fhand:
    if line.startswith('From:') :
        print(line)
OOPS!
What are all these blank lines doing here?
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
From: [email protected]\n
\n
...
• Each line from the file has a newline at the end
• The print statement adds a newline to each line
Searching Through a File (fixed)
We can strip the whitespace from the right-hand side of the string using rstrip() from the string library
The newline is considered “white space” and is stripped
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if line.startswith('From:') :
        print(line)
From: [email protected]
From: [email protected]
From: [email protected]
From: [email protected]
....
Skipping with continue
We can conveniently skip a line by using the continue statement
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if not line.startswith('From:') :
        continue
    print(line)
Using in to Select Lines
We can look for a string anywhere in a line as our selection criteria
fhand = open('mbox-short.txt')
for line in fhand:
    line = line.rstrip()
    if '@uct.ac.za' not in line :
        continue
    print(line)
Names
quit()
count = 0
for line in fhand:
    if line.startswith('Subject:') :
        count = count + 1
print('There were', count, 'subject lines in', fname)
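The fragment above uses fname and fhand without showing where they come from. One plausible way to complete it, offered as a sketch under assumptions rather than as the slide's missing code, is to attempt the open inside try / except and stop early when the file cannot be opened. The function name count_subject_lines and the nonexistent file name are hypothetical.

```python
def count_subject_lines(fname):
    """Count lines starting with 'Subject:'; return None if the file cannot be opened."""
    try:
        fhand = open(fname)
    except OSError:
        print('File cannot be opened:', fname)
        return None
    count = 0
    with fhand:                      # closes the file when the loop is finished
        for line in fhand:
            if line.startswith('Subject:'):
                count = count + 1
    return count

result = count_subject_lines('no-such-file-for-this-example.txt')
print(result)
```

In a script that prompts the user, fname would come from input(), and the early return could be replaced by quit().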
• Data Structure
- A particular way of organizing data in a computer
https://en.wikipedia.org/wiki/Algorithm
https://en.wikipedia.org/wiki/Data_structure
What is Not a “Collection”?
Most of our variables have one value in them - when we put a new
value in the variable, the old value is overwritten
$ python
>>> x = 2
>>> x = 4
>>> print(x)
4
A List is a Kind of
Collection
• A collection allows us to put many values in a single “variable”
Just like strings, we can get at any single element in a list using an
index specified in square brackets
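A short sketch of index lookup, showing that the same square-bracket syntax works on a list and on a string. Positions start at 0. The string 'banana' is only an example value.

```python
friends = ['Joseph', 'Glenn', 'Sally']
print(friends[1])     # Glenn: the second element, since counting starts at 0
print(friends[-1])    # Sally: negative indexes count from the end
print(len(friends))   # 3

word = 'banana'
print(word[1])        # a
```

An index outside the valid range (for example friends[3] here) raises an IndexError.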
http://docs.python.org/tutorial/datastructures.html
Building a List from Scratch
• We can create an empty list and then add elements using the append method
• The list stays in order and new elements are added at the end of the list
>>> stuff = list()
>>> stuff.append('book')
>>> stuff.append(99)
>>> print(stuff)
['book', 99]
>>> stuff.append('cookie')
>>> print(stuff)
['book', 99, 'cookie']
Is Something in a List?
• Python provides two operators that let you check if an item is in a list
• These are logical operators that return True or False
• They do not modify the list
>>> some = [1, 9, 21, 10, 16]
>>> 9 in some
True
>>> 15 in some
False
>>> 20 not in some
True
>>>
Lists are in Order
• A list can hold many items and keeps those items in the order until we do something to change the order
• A list can be sorted (i.e., change its order)
>>> friends = [ 'Joseph', 'Glenn', 'Sally' ]
>>> friends.sort()
>>> print(friends)
['Glenn', 'Joseph', 'Sally']
>>> print(friends[1])
Joseph
>>>
Split breaks a string into parts and produces a list of strings. We think of these as words. We can access a particular word or loop through all the words.
● When you do not specify a delimiter, multiple spaces are treated like one delimiter
● You can specify what delimiter character to use in the splitting
>>> line = 'A lot of spaces'
>>> etc = line.split()
>>> print(etc)
['A', 'lot', 'of', 'spaces']
>>>
>>> line = 'first;second;third'
>>> thing = line.split()
>>> print(thing)
['first;second;third']
>>> print(len(thing))
1
>>> thing = line.split(';')
>>> print(thing)
['first', 'second', 'third']
>>> print(len(thing))
3
>>>
The Double Split Pattern
From [email protected] Sat Jan 5 09:14:16 2008
words = line.split()
email = words[1]              # [email protected]
pieces = email.split('@')     # ['stephen.marquard', 'uct.ac.za']
print(pieces[1])              # uct.ac.za
List Summary
• Concept of a collection • Slicing lists
>>> t = tuple()
>>> dir(t)
['count', 'index']
Tuples are More Efficient
• Since Python does not have to build tuple structures to be
modifiable, they are simpler and more efficient in terms of
memory use and performance than lists
• So in our program when we are making “temporary variables”
we prefer tuples over lists
Tuples and Assignment
• We can also put a tuple on the left-hand side of an assignment
statement
• We can even omit the parentheses
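A brief sketch of tuple assignment, with and without parentheses. The values and the variable names are arbitrary examples.

```python
(x, y) = (4, 'fred')      # tuple on the left-hand side
print(y)                  # fred

a, b = 99, 98             # the parentheses can be omitted
a, b = b, a               # a common use: swapping two values
print(a, b)               # 98 99

# Each (key, value) pair from items() is unpacked the same way.
for key, val in {'csev': 2}.items():
    print(key, val)       # csev 2
```

The number of names on the left must match the number of values on the right, otherwise Python raises a ValueError.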
lst = []
for key, val in counts.items():
    newtup = (val, key)
    lst.append(newtup)
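Building (value, key) tuples like this is typically a first step toward sorting a dictionary by its values, because tuples compare on their first element first. The sketch below continues the idea with made-up sample counts. The sorting step is an assumption about where the fragment is heading.

```python
counts = {'csev': 2, 'zqian': 1, 'cwen': 3}   # hypothetical sample data

lst = []
for key, val in counts.items():
    newtup = (val, key)        # value first, so sorting orders by value
    lst.append(newtup)

lst = sorted(lst, reverse=True)   # largest count first
for val, key in lst:
    print(key, val)
```

With this sample data the loop prints cwen 3, then csev 2, then zqian 1.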
A Story of Two Collections..
• List
• Dictionary
perfume
money
candy
http://en.wikipedia.org/wiki/Associative_array
Dictionaries
• Dictionaries are Python’s most powerful data collection
counts = dict()
names = ['csev', 'cwen', 'csev', 'zqian', 'cwen']
for name in names :
    if name not in counts:
        counts[name] = 1
    else :
        counts[name] = counts[name] + 1
print(counts)
Output:
{'csev': 2, 'zqian': 1, 'cwen': 2}
(In Python 3.7 and later the keys print in insertion order: csev, cwen, zqian.)
The get Method for Dictionaries
The pattern of checking to see if a if name in counts:
key is already in a dictionary and x = counts[name]
assuming a default value if the key else :
is not there is so common that there x = 0
is a method called get() that does
this for us
x = counts.get(name, 0)
counts = dict()
names = ['csev', 'cwen', 'csev', 'zqian', 'cwen']
for name in names :
    counts[name] = counts.get(name, 0) + 1
print(counts)
http://www.youtube.com/watch?v=EHJ9uYx5L58
Counting Words in Text
Writing programs (or programming) is a very creative and rewarding activity. You can write
programs for many reasons ranging from making your living to solving a difficult data analysis
problem to having fun to helping someone else solve a problem. This book assumes that
everyone needs to know how to program and that once you know how to program, you will figure
out what you want to do with your newfound skills.
We are surrounded in our daily lives with computers ranging from laptops to cell phones. We
can think of these computers as our “personal assistants” who can take care of many things on
our behalf. The hardware in our current-day computers is essentially built to continuously ask us
the question, “What would you like me to do next?”
Our computers are fast and have vast amounts of memory and could be very helpful to us if we
only knew the language to speak to explain to the computer what we would like it to do next. If
we knew this language we could tell the computer to do tasks on our behalf that were repetitive.
Interestingly, the kinds of things computers can do best are often the kinds of things that we
humans find boring and mind-numbing.
Counting Pattern
The general pattern to count the words in a line of text is to split the line into words, then loop through the words and use a dictionary to track the count of each word independently.
counts = dict()
print('Enter a line of text:')
line = input('')
words = line.split()
print('Words:', words)
print('Counting...')
for word in words:
    counts[word] = counts.get(word,0) + 1
print('Counts', counts)
python wordcount.py
Enter a line of text:
the clown ran after the car and the car ran into the tent and the tent fell down on the clown and the car
Words: ['the', 'clown', 'ran', 'after', 'the', 'car', 'and', 'the', 'car', 'ran', 'into', 'the', 'tent', 'and', 'the', 'tent', 'fell', 'down', 'on', 'the', 'clown', 'and', 'the', 'car']
Counting...
http://www.flickr.com/photos/71502646@N00/2526007974/
Object
[Figure: a program drawn as boxes of Code/Data between Input and Output, with a String object]
• Objects get created and used
• Objects are bits of code and data
• Objects hide detail - they allow us to ignore the detail of the “rest of the program”.
• Objects hide detail - they allow the “rest of the program” to ignore the detail about “us”.
Definitions
• Class - a template
• Method or Message - A defined capability of a class
• Field or Attribute - A bit of data in a class
• Object or Instance - A particular instance of a class
Terminology: Class
Defines the abstract characteristics of a thing (object), including the
thing's characteristics (its attributes, fields or properties) and the
thing's behaviors (the things it can do, or methods, operations or
features). One might say that a class is a blueprint or factory that
describes the nature of something. For example, the class Dog would
consist of traits shared by all dogs, such as breed and fur color
(characteristics), and the ability to bark and sit (behaviors).
http://en.wikipedia.org/wiki/Object-oriented_programming
Terminology: Instance
One can have an instance of a class or a particular object.
The instance is the actual object created at runtime. In
programmer jargon, the Lassie object is an instance of the
Dog class. The set of values of the attributes of a particular
object is called its state. The object consists of state and the
behavior that's defined in the object's class.
Object and Instance are often used interchangeably.
http://en.wikipedia.org/wiki/Object-oriented_programming
Terminology: Method
An object's abilities. In language, methods are verbs. Lassie, being a
Dog, has the ability to bark. So bark() is one of Lassie's methods. She
may have other methods as well, for example sit() or eat() or walk() or
save_timmy(). Within the program, using a method usually affects
only one particular object; all Dogs can bark, but you need only one
particular dog to do the barking.
http://en.wikipedia.org/wiki/Object-oriented_programming
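The three terms above (class, instance, method) can be seen together in a few lines. This is only a sketch: the Dog class, its attributes, and the breed value 'Collie' are assumptions chosen to mirror the Lassie example, not code from the original slides.

```python
class Dog:
    """A minimal Dog class: attributes hold state, methods are behaviors."""

    def __init__(self, name, breed):
        # Instance attributes: together their values form this dog's state.
        self.name = name
        self.breed = breed

    def bark(self):
        # A method acts on one particular object, passed in as `self`.
        return self.name + ' says Woof!'

lassie = Dog('Lassie', 'Collie')   # the Lassie object is an instance of Dog
print(type(lassie).__name__)
print(lassie.bark())
```

All Dog objects share the bark() method, but each call uses the state of the one dog it is called on.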
Some Python Objects
>>> x = 'abc'
>>> type(x)
<class 'str'>
>>> type(2.5)
<class 'float'>
>>> type(2)
<class 'int'>
>>> y = list()
>>> type(y)
<class 'list'>
>>> z = dict()
>>> type(z)
<class 'dict'>
>>> dir(x)
[ … 'capitalize', 'casefold', 'center', 'count',
'encode', 'endswith', 'expandtabs', 'find',
'format', … 'lower', 'lstrip', 'maketrans',
'partition', 'replace', 'rfind', 'rindex', 'rjust',
'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase',
'title', 'translate', 'upper', 'zfill']
>>> dir(y)
[… 'append', 'clear', 'copy', 'count', 'extend',
'index', 'insert', 'pop', 'remove', 'reverse',
'sort']
>>> dir(z)
[…, 'clear', 'copy', 'fromkeys', 'get', 'items',
'keys', 'pop', 'popitem', 'setdefault', 'update',
'values']
A Sample Class
class is a reserved word. The class statement below is the template
for making PartyAnimal objects.
class PartyAnimal:
    x = 0

    def party(self) :
        self.x = self.x + 1
        print("So far",self.x)

an = PartyAnimal()
an.party()
an.party()
an.party()
$ python party1.py
So far 1
So far 2
So far 3
[Figure: the variable an refers to a PartyAnimal object holding x and party(); inside party(), self refers to that same object.]
The call an.party() is equivalent to PartyAnimal.party(an).
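The equivalence between an.party() and PartyAnimal.party(an) can be checked directly. In this sketch party() returns the count instead of printing it, which is a small change made here only so the result is easy to inspect.

```python
class PartyAnimal:
    x = 0   # class-level default, shared until an instance assigns its own x

    def party(self):
        # Reading self.x falls back to the class value the first time;
        # the assignment then stores x on this one instance.
        self.x = self.x + 1
        return self.x

an = PartyAnimal()
an.party()                # self is bound to an automatically
PartyAnimal.party(an)     # the same call, with the instance passed explicitly
print(an.x)
print(PartyAnimal.x)
```

Both calls update the same object, so an.x ends at 2 while the class-level default stays at 0.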
Playing with dir() and type()
A Nerdy Way to Find Capabilities
• The dir() command lists capabilities
• Ignore the ones with underscores - these are used by Python itself
• The rest are real operations that the object can perform
• It is like type() - it tells us something *about* a variable
>>> y = list()
>>> type(y)
<class 'list'>
>>> dir(y)
['__add__', '__class__',
'__contains__', '__delattr__',
'__delitem__', '__doc__', … '__setitem__',
'__str__',
'append', 'clear', 'copy',
'count', 'extend', 'index',
'insert', 'pop', 'remove',
'reverse', 'sort']
>>>
We can use dir() to find the “capabilities” of our newly created class.
class PartyAnimal:
    x = 0

    def party(self) :
        self.x = self.x + 1
        print("So far",self.x)

an = PartyAnimal()
print("Type", type(an))
print("Dir ", dir(an))
$ python party3.py
Type <class '__main__.PartyAnimal'>
Dir ['__class__', ... 'party', 'x']
Try dir() with a String
>>> x = 'Hello there'
>>> dir(x)
['__add__', '__class__', '__contains__', '__delattr__',
'__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__getnewargs__', '__gt__',
'__hash__', '__init__', '__le__', '__len__', '__lt__',
'__repr__', '__rmod__', '__rmul__', '__setattr__', '__str__',
'capitalize', 'center', 'count', 'encode', 'endswith',
'expandtabs', 'find', 'index', 'isalnum', 'isalpha', 'isdigit',
'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust',
'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex',
'rjust', 'rpartition', 'rsplit', 'rstrip', 'split',
'splitlines', 'startswith', 'strip', 'swapcase', 'title',
'translate', 'upper', 'zfill']
Object Lifecycle
http://en.wikipedia.org/wiki/Constructor_(computer_science)
• Objects are created, used, and discarded
• We have special blocks of code (methods) that get called
- At the moment of creation (constructor)
- At the moment of destruction (destructor)
• Constructors are used a lot
• Destructors are seldom used
Constructor
The primary purpose of the constructor is to set up some
instance variables to have the proper initial values when
the object is created
class PartyAnimal:
    x = 0

    def __init__(self):
        print('I am constructed')

    def party(self) :
        self.x = self.x + 1
        print('So far',self.x)

    def __del__(self):
        print('I am destructed', self.x)

an = PartyAnimal()
an.party()
an.party()
an = 42
print('an contains',an)
$ python party4.py
I am constructed
So far 1
So far 2
I am destructed 2
an contains 42
The constructor and destructor are optional. The constructor is
typically used to set up variables. The destructor is seldom used.
Constructor
In object-oriented programming, a constructor in a class
is a special block of statements called when an object is
created.
http://en.wikipedia.org/wiki/Constructor_(computer_science)
Many Instances
• We can create lots of objects - the class is the template
for the object
party5.py
class PartyAnimal:
    x = 0
    name = ""

    def __init__(self, z):
        self.name = z
        print(self.name,"constructed")

    def party(self) :
        self.x = self.x + 1
        print(self.name,"party count",self.x)

s = PartyAnimal("Sally")
j = PartyAnimal("Jim")
s.party()
j.party()
s.party()
[Figure: s refers to one object (x: 0, name: Sally) and j to another (x: 0, name: Jim). We have two independent instances.]
Sally constructed
Jim constructed
Sally party count 1
Jim party count 1
Sally party count 2
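The independence of the two instances can be verified by inspecting their attributes after the calls. This sketch is a variation on party5.py, not the original: it sets x inside the constructor (an assumption made here to show the common idiom) and returns the count rather than printing it.

```python
class PartyAnimal:
    def __init__(self, name):
        self.name = name   # set per instance by the constructor
        self.x = 0         # each instance gets its own counter

    def party(self):
        self.x = self.x + 1
        return self.x

s = PartyAnimal('Sally')
j = PartyAnimal('Jim')
s.party()
j.party()
s.party()
print(s.name, s.x)
print(j.name, j.x)
```

Calling party() on s never changes j, which is what "independent instances" means in practice.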
Inheritance
http://www.ibiblio.org/g2swap/byteofpython/read/inheritance.html
• When we make a new class - we can reuse an existing
class and inherit all the capabilities of an existing class
and then add our own little bit to make our new class
• Another form of store and reuse
• Write once - reuse many times
• The new class (child) has all the capabilities of the old
class (parent) - and then some more
Terminology: Inheritance
http://en.wikipedia.org/wiki/Object-oriented_programming
class PartyAnimal:
    x = 0
    name = ""

    def __init__(self, nam):
        self.name = nam
        print(self.name,"constructed")

    def party(self) :
        self.x = self.x + 1
        print(self.name,"party count",self.x)

class FootballFan(PartyAnimal):
    points = 0

    def touchdown(self):
        self.points = self.points + 7
        self.party()
        print(self.name,"points",self.points)

s = PartyAnimal("Sally")
s.party()

j = FootballFan("Jim")
j.party()
j.touchdown()
FootballFan is a class which extends PartyAnimal. It has all
the capabilities of PartyAnimal and more.
[Figure: s is a PartyAnimal object with x and name: Sally; j is a FootballFan object with x, name: Jim, and points.]
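A child class can be checked against its parent with isinstance(). The sketch below is a variation, not the original slide code: it moves the attributes into constructors and uses super().__init__() to reuse the parent's setup, both of which are assumptions added for illustration.

```python
class PartyAnimal:
    def __init__(self, name):
        self.name = name
        self.x = 0

    def party(self):
        self.x = self.x + 1
        return self.x

class FootballFan(PartyAnimal):
    def __init__(self, name):
        super().__init__(name)   # reuse the parent's constructor
        self.points = 0

    def touchdown(self):
        self.points = self.points + 7
        self.party()             # inherited from PartyAnimal
        return self.points

j = FootballFan('Jim')
j.party()
j.touchdown()
print(j.x, j.points)
print(isinstance(j, PartyAnimal))
```

A FootballFan is still a PartyAnimal, so any code written for the parent class also works with the child.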
Definitions
• Class - a template
• Attribute – A variable within a class
• Method - A function within a class
• Object - A particular instance of a class
• Constructor – Code that runs when an object is created
• Inheritance - The ability to extend a class to make a new class.
Summary
• Object Oriented programming is a very structured
approach to code reuse
• Photo from the television program Lassie: Lassie watches as Jeff (Tommy Rettig) works on his bike. The photo is in the
public domain.
https://en.wikipedia.org/wiki/Lassie#/media/File:Lassie_and_Tommy_Rettig_1956.JPG
Introducing DataFrames
DATA MANIPULATION WITH PANDAS
pandas is built on NumPy and Matplotlib
pandas is popular
https://pypistats.org/packages/pandas
Rectangular data
Name Breed Color Height (cm) Weight (kg) Date of Birth
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 6 columns):
name 7 non-null object
breed 7 non-null object
color 7 non-null object
height_cm 7 non-null int64
weight_kg 7 non-null int64
date_of_birth 7 non-null object
dtypes: int64(2), object(4)
Exploring a DataFrame: .shape
dogs.shape
(7, 6)
Exploring a DataFrame: .describe()
dogs.describe()
height_cm weight_kg
count 7.000000 7.000000
mean 49.714286 27.428571
std 17.960274 22.292429
min 18.000000 2.000000
25% 44.500000 19.500000
50% 49.000000 23.000000
75% 57.500000 27.000000
max 77.000000 74.000000
Components of a DataFrame: .values
dogs.values
dogs.index
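To try the exploration attributes without the course data file, a small dogs DataFrame can be built by hand. This is a sketch under stated assumptions: the values are reconstructed from the tables shown elsewhere in these slides, the date_of_birth column is left out, and so the shape here has five columns rather than six.

```python
import pandas as pd

# Values reconstructed from the tables in these slides; date_of_birth is omitted.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "St. Bernard"],
    "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

print(dogs.shape)              # (rows, columns)
print(dogs.describe())         # summary statistics for the numeric columns
print(dogs.columns.tolist())   # column labels
print(dogs.index)              # row labels
```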
https://peps.python.org/pep-0020/
Sorting and subsetting
Richie Cotton
Curriculum Architect at DataCamp
Sorting
dogs.sort_values("weight_kg")
Subsetting columns
dogs["name"]
0 Bella
1 Charlie
2 Lucy
3 Cooper
4 Max
5 Stella
6 Bernie
Name: name, dtype: object
Subsetting multiple columns
dogs[["breed", "height_cm"]]
cols_to_subset = ["breed", "height_cm"]
dogs[cols_to_subset]
Both forms give the same result:
breed height_cm
0 Labrador 56
1 Poodle 43
2 Chow Chow 46
3 Schnauzer 49
4 Labrador 59
5 Chihuahua 18
6 St. Bernard 77
Subsetting rows
dogs["height_cm"] > 50
0 True
1 False
2 False
3 False
4 True
5 False
6 True
Name: height_cm, dtype: bool
Subsetting rows
dogs[dogs["height_cm"] > 50]
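Sorting and row subsetting can be combined in a short runnable sketch. The DataFrame is rebuilt by hand from values shown in these slides (an assumption, since the course file is not available here), and the variable names is_tall and tall_dogs are illustrative.

```python
import pandas as pd

dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper", "Max", "Stella", "Bernie"],
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

lightest_first = dogs.sort_values("weight_kg")
is_tall = dogs["height_cm"] > 50       # boolean Series, one value per row
tall_dogs = dogs[is_tall]              # keeps only the rows where the mask is True

print(lightest_first["name"].iloc[0])
print(tall_dogs["name"].tolist())
```

The comparison produces the True/False column first; indexing with it is what actually filters the rows.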
Adding a new column
dogs["height_m"] = dogs["height_cm"] / 100
print(dogs)
Maggie Matsui
Content Developer at DataCamp
Summarizing numerical data
dogs["height_cm"].mean()
Other summary methods:
.median(), .mode()
.min(), .max()
.sum()
.quantile()
Summarizing dates
Oldest dog:
dogs["date_of_birth"].min()
'2011-12-11'
Youngest dog:
dogs["date_of_birth"].max()
'2018-02-27'
The .agg() method
def pct30(column):
    return column.quantile(0.3)
dogs["weight_kg"].agg(pct30)
22.599999999999998
Summaries on multiple columns
dogs[["weight_kg", "height_cm"]].agg(pct30)
weight_kg 22.6
height_cm 45.4
dtype: float64
Multiple summaries
def pct40(column):
    return column.quantile(0.4)
dogs["weight_kg"].agg([pct30, pct40])
pct30 22.6
pct40 24.0
Name: weight_kg, dtype: float64
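The three .agg() calls can be reproduced on a hand-built DataFrame. This is a sketch: the heights and weights are reconstructed from tables in these slides (an assumption), and the result variable names are illustrative.

```python
import pandas as pd

dogs = pd.DataFrame({
    "height_cm": [56, 43, 46, 49, 59, 18, 77],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

def pct30(column):
    return column.quantile(0.3)

def pct40(column):
    return column.quantile(0.4)

one_column = dogs["weight_kg"].agg(pct30)                     # a single number
two_columns = dogs[["weight_kg", "height_cm"]].agg(pct30)     # one value per column
two_functions = dogs["weight_kg"].agg([pct30, pct40])         # one value per function

print(one_column)
print(two_columns)
print(two_functions)
```

The shape of the result follows the shape of the request: one column and one function give a scalar, and adding columns or functions adds labels to the output.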
Cumulative sum
dogs["height_cm"]
0 56
1 43
2 46
3 49
4 59
5 18
6 77
Name: height_cm, dtype: int64
dogs["height_cm"].cumsum()
0 56
1 99
2 145
3 194
4 253
5 271
6 348
Name: height_cm, dtype: int64
Cumulative statistics
.cummax()
.cummin()
.cumprod()
Walmart
sales.head()
Avoiding double counting
Vet visits
print(vet_visits)
Counts by breed, unsorted:
Labrador 2
Schnauzer 1
St. Bernard 1
Chow Chow 2
Poodle 1
Chihuahua 1
Name: breed, dtype: int64
Counts by breed, sorted:
Labrador 2
Chow Chow 2
Schnauzer 1
St. Bernard 1
Poodle 1
Chihuahua 1
Name: breed, dtype: int64
Proportions
unique_dogs["breed"].value_counts(normalize=True)
Labrador 0.250
Chow Chow 0.250
Schnauzer 0.125
St. Bernard 0.125
Poodle 0.125
Chihuahua 0.125
Name: breed, dtype: float64
Grouped summary statistics
Summaries by group
dogs[dogs["color"] == "Black"]["weight_kg"].mean()
dogs[dogs["color"] == "Brown"]["weight_kg"].mean()
dogs[dogs["color"] == "White"]["weight_kg"].mean()
dogs[dogs["color"] == "Gray"]["weight_kg"].mean()
dogs[dogs["color"] == "Tan"]["weight_kg"].mean()
26.5
24.0
74.0
17.0
2.0
Grouped summaries
dogs.groupby("color")["weight_kg"].mean()
color
Black 26.5
Brown 24.0
Gray 17.0
Tan 2.0
White 74.0
Name: weight_kg, dtype: float64
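The grouped summary replaces the five hand-written filters with one call. The sketch below rebuilds the relevant columns from values in these slides (an assumption) and also passes the aggregation names as strings, which pandas accepts in place of the built-in functions.

```python
import pandas as pd

dogs = pd.DataFrame({
    "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

# One filter per color, written out by hand ...
black_mean = dogs[dogs["color"] == "Black"]["weight_kg"].mean()

# ... versus one groupby that handles every color at once.
mean_by_color = dogs.groupby("color")["weight_kg"].mean()
summary = dogs.groupby("color")["weight_kg"].agg(["min", "max", "sum"])

print(black_mean)
print(mean_by_color)
print(summary)
```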
Multiple grouped summaries
dogs.groupby("color")["weight_kg"].agg([min, max, sum])
Grouping by multiple variables
dogs.groupby(["color", "breed"])["weight_kg"].mean()
color breed
Black Chow Chow 25
Labrador 29
Poodle 24
Brown Chow Chow 24
Labrador 24
Gray Schnauzer 17
Tan Chihuahua 2
White St. Bernard 74
Name: weight_kg, dtype: int64
Many groups, many summaries
dogs.groupby(["color", "breed"])[["weight_kg", "height_cm"]].mean()
weight_kg height_cm
color breed
Black Labrador 29 59
Poodle 24 43
Brown Chow Chow 24 46
Labrador 24 56
Gray Schnauzer 17 49
Tan Chihuahua 2 18
White St. Bernard 74 77
Pivot tables
Group by to pivot table
dogs.groupby("color")["weight_kg"].mean()
color
Black 26.5
Brown 24.0
Gray 17.0
Tan 2.0
White 74.0
Name: weight_kg, dtype: float64
dogs.pivot_table(values="weight_kg", index="color")
weight_kg
color
Black 26.5
Brown 24.0
Gray 17.0
Tan 2.0
White 74.0
Different statistics
import numpy as np
dogs.pivot_table(values="weight_kg", index="color", aggfunc=np.median)
weight_kg
color
Black 26.5
Brown 24.0
Gray 17.0
Tan 2.0
White 74.0
Multiple statistics
dogs.pivot_table(values="weight_kg", index="color", aggfunc=[np.mean, np.median])
mean median
weight_kg weight_kg
color
Black 26.5 26.5
Brown 24.0 24.0
Gray 17.0 17.0
Tan 2.0 2.0
White 74.0 74.0
Pivot on two variables
dogs.groupby(["color", "breed"])["weight_kg"].mean()
dogs.pivot_table(values="weight_kg", index="color", columns="breed",
                 fill_value=0, margins=True)
breed Chihuahua Chow Chow Labrador Poodle Schnauzer St. Bernard All
color
Black 0 0 29 24 0 0 26.500000
Brown 0 24 24 0 0 0 24.000000
Gray 0 0 0 0 17 0 17.000000
Tan 2 0 0 0 0 0 2.000000
White 0 0 0 0 0 74 74.000000
All 2 24 26 24 17 74 27.714286
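The two-variable pivot table with filled gaps and an "All" margin can be reproduced on a hand-built DataFrame. This is a sketch: the data values are reconstructed from the tables in these slides (an assumption), and fill_value and margins are the standard pivot_table parameters for replacing missing combinations and adding row and column summaries.

```python
import pandas as pd

dogs = pd.DataFrame({
    "breed": ["Labrador", "Poodle", "Chow Chow", "Schnauzer",
              "Labrador", "Chihuahua", "St. Bernard"],
    "color": ["Brown", "Black", "Brown", "Gray", "Black", "Tan", "White"],
    "weight_kg": [24, 24, 24, 17, 29, 2, 74],
})

pivot = dogs.pivot_table(
    values="weight_kg",
    index="color",
    columns="breed",
    fill_value=0,     # color/breed pairs with no dogs show 0 instead of NaN
    margins=True,     # adds an "All" row and column with the overall means
)
print(pivot)
```

The margins are computed from the original rows, not from the filled zeros, so the overall mean is unaffected by fill_value.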
Explicit indexes
The dog dataset, revisited
print(dogs)
dogs.index
print(dogs_ind)
dogs_ind.loc[["Bella", "Stella"]]
Slicing lists
breeds = ["Labrador", "Poodle",
          "Chow Chow", "Schnauzer",
          "Labrador", "Chihuahua",
          "St. Bernard"]
breeds[2:5]
['Chow Chow', 'Schnauzer', 'Labrador']
breeds[:3]
['Labrador', 'Poodle', 'Chow Chow']
breeds[:]
['Labrador','Poodle','Chow Chow','Schnauzer',
'Labrador','Chihuahua','St. Bernard']
Sort the index before you slice
dogs_srt = dogs.set_index(["breed", "color"]).sort_index()
print(dogs_srt)
A bigger dog dataset
print(dog_pack)
color
Black 43.973563
Brown 48.717917
Gray 48.107667
Tan 44.934738
White 44.465208
dtype: float64
Calculating summary stats across columns
dogs_height_by_breed_vs_color.mean(axis="columns")
breed
Beagle 36.362667
Boxer 59.358667
Chihuahua 19.561250
Chow Chow 52.413333
Dachshund 20.236667
Labrador 55.875000
Poodle 51.637750
St. Bernard 66.654300
dtype: float64
Intro. to Data Visualization
Simple Graphs in Python using Matplotlib
Incrementally modify the figure.
plt.plot(x, y1)
plt.plot(x, y2)
plt.xlabel("x")
plt.ylabel("y")
plt.ylim(-2000, 2000)
plt.axhline(0) # horizontal line
plt.axvline(0) # vertical line
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
plt.plot(x, y)
no return value?
# x axis values
x = [1,2,3]
# corresponding y axis values
y = [2,4,1]
# line 2 points
x2 = [1,2,3]
y2 = [4,1,3]
# plotting the line 2 points
plt.plot(x2, y2, label = "line 2")
# heights of bars
height = [10, 24, 36, 40, 5]
# labels for bars
names = ['one','two','three','four','five']
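# frequencies
ages=[2,5,70,40,30,45,50,45,43,40,44,60,7,13,57,18,90,77,32,21,20,40]
# number of bins and value range (assumed values for illustration)
bins = 10
value_range = (0, 100)
# plotting a histogram
plt.hist(ages, bins=bins, range=value_range, color='green',histtype='bar',rwidth=0.8)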
# x-axis label
plt.xlabel('age')
# frequency label
plt.ylabel('No. of people')
# plot title
plt.title('My histogram')
x_values = [0,1,2,3,4,5]
y_values = [0,1,4,9,16,25]
# x-axis values
x = [1,2,3,4,5,6,7,8,9,10]
# y-axis values
y = [2,4,5,7,6,8,9,11,12,12]
# x-axis label
plt.xlabel('x - axis')
# frequency label
plt.ylabel('y - axis')
# plot title
plt.title('My scatter plot!')
# showing legend
plt.legend()
# defining labels
activities = ['eat', 'sleep', 'work', 'play']
# plotting legend
plt.legend()
Histograms
import matplotlib.pyplot as plt
dog_pack["height_cm"].hist()
plt.show()
Histograms
dog_pack["height_cm"].hist(bins=20)
plt.show()
dog_pack["height_cm"].hist(bins=5)
plt.show()
Bar plots
avg_weight_by_breed = dog_pack.groupby("breed")["weight_kg"].mean()
print(avg_weight_by_breed)
breed
Beagle 10.636364
Boxer 30.620000
Chihuahua 1.491667
Chow Chow 22.535714
Dachshund 9.975000
Labrador 31.850000
Poodle 20.400000
St. Bernard 71.576923
Name: weight_kg, dtype: float64
Bar plots
avg_weight_by_breed.plot(kind="bar")
plt.show()
avg_weight_by_breed.plot(kind="bar",
                         title="Mean Weight by Dog Breed")
plt.show()
Line plots
sully.head()
date weight_kg
0 2019-01-31 36.1
1 2019-02-28 35.3
2 2019-03-31 32.0
3 2019-04-30 32.9
4 2019-05-31 32.0
sully.plot(x="date",
           y="weight_kg",
           kind="line")
plt.show()
Rotating axis labels
sully.plot(x="date", y="weight_kg", kind="line", rot=45)
plt.show()
Scatter plots
dog_pack.plot(x="height_cm", y="weight_kg", kind="scatter")
plt.show()
Layering plots
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist()
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist()
plt.show()
Add a legend
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist()
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist()
plt.legend(["F", "M"])
plt.show()
Transparency
dog_pack[dog_pack["sex"]=="F"]["height_cm"].hist(alpha=0.7)
dog_pack[dog_pack["sex"]=="M"]["height_cm"].hist(alpha=0.7)
plt.legend(["F", "M"])
plt.show()
Missing values
What's a missing value?
Name Breed Color Height (cm) Weight (kg) Date of Birth
name False
breed False
color False
height_cm False
weight_kg True
date_of_birth False
dtype: bool
Counting missing values
dogs.isna().sum()
name 0
breed 0
color 0
height_cm 0
weight_kg 2
date_of_birth 0
dtype: int64
Plotting missing values
import matplotlib.pyplot as plt
dogs.isna().sum().plot(kind="bar")
plt.show()
Removing missing values
dogs.dropna()
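Detecting, counting, and removing missing values can be tried on a tiny DataFrame. This is a sketch: which dogs lack a weight is not stated in the slides, so the two gaps below are hypothetical and chosen only so that the count matches the two missing weights reported above.

```python
import pandas as pd

dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper"],
    "weight_kg": [24.0, None, 24.0, None],   # hypothetical gaps for illustration
})

any_missing = dogs.isna().any()          # True for columns with at least one gap
missing_per_column = dogs.isna().sum()   # number of gaps in each column
complete_rows = dogs.dropna()            # rows that have no gaps at all

print(any_missing)
print(missing_per_column)
print(complete_rows)
```

dropna() returns a new DataFrame, so the original dogs still contains the rows with gaps.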
Dictionaries
my_dict = {
    "key1": value1,
    "key2": value2,
    "key3": value3
}
my_dict["key1"]
my_dict = {
    "title": "Charlotte's Web",
    "author": "E.B. White",
    "published": 1952
}
my_dict["title"]
list_of_dicts = [
]
List of dictionaries - by row
name breed height (cm) weight (kg) date of birth
new_dogs = pd.DataFrame(list_of_dicts)
print(new_dogs)
Dictionary of lists - by column
name breed height (cm) weight (kg) date of birth
new_dogs = pd.DataFrame(dict_of_lists)
print(new_dogs)
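The two construction styles produce the same DataFrame. In this sketch the two rows are taken from the example CSV file shown later in these slides; the key names (height_cm, weight_kg, date_of_birth) are assumptions chosen to match the earlier dogs columns.

```python
import pandas as pd

# By row: one dictionary per dog.
list_of_dicts = [
    {"name": "Ginger", "breed": "Dachshund", "height_cm": 22,
     "weight_kg": 10, "date_of_birth": "2019-03-14"},
    {"name": "Scout", "breed": "Dalmatian", "height_cm": 59,
     "weight_kg": 25, "date_of_birth": "2019-05-09"},
]

# By column: one list per column.
dict_of_lists = {
    "name": ["Ginger", "Scout"],
    "breed": ["Dachshund", "Dalmatian"],
    "height_cm": [22, 59],
    "weight_kg": [10, 25],
    "date_of_birth": ["2019-03-14", "2019-05-09"],
}

by_row = pd.DataFrame(list_of_dicts)
by_column = pd.DataFrame(dict_of_lists)
print(by_row)
print(by_row.equals(by_column))
```

Which style to use mostly depends on how the data arrives: records one at a time suit the list of dictionaries, whole columns suit the dictionary of lists.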
What's a CSV file?
CSV = comma-separated values
Most database and spreadsheet programs can use them or create them
Example CSV file
new_dogs.csv
name,breed,height_cm,weight_kg,d_o_b
Ginger,Dachshund,22,10,2019-03-14
Scout,Dalmatian,59,25,2019-05-09
CSV to DataFrame
import pandas as pd
new_dogs = pd.read_csv("new_dogs.csv")
print(new_dogs)
print(new_dogs)
new_dogs_with_bmi.csv
name,breed,height_cm,weight_kg,d_o_b,bmi
Ginger,Dachshund,22,10,2019-03-14,206.611570
Scout,Dalmatian,59,25,2019-05-09,71.818443
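The round trip from CSV text to a DataFrame and back can be sketched without touching the disk. Two things here are assumptions: io.StringIO stands in for the new_dogs.csv file, and the bmi formula (weight in kg divided by height in metres squared) is inferred from the values shown in new_dogs_with_bmi.csv.

```python
import io
import pandas as pd

csv_text = """name,breed,height_cm,weight_kg,d_o_b
Ginger,Dachshund,22,10,2019-03-14
Scout,Dalmatian,59,25,2019-05-09
"""

# read_csv accepts any file-like object, not only a file name.
new_dogs = pd.read_csv(io.StringIO(csv_text))

# Assumed formula: weight in kg divided by (height in metres) squared.
new_dogs["bmi"] = new_dogs["weight_kg"] / (new_dogs["height_cm"] / 100) ** 2

# Write the result back out as CSV text; index=False leaves out the row labels.
output = io.StringIO()
new_dogs.to_csv(output, index=False)
print(output.getvalue())
```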
https://www.monkeyuser.com/2019/bug-fixing-ways/
Lecture Overview
• Debugging
• Exception Handling
• Testing
Disclaimer: Much of the material and slides for this lecture were borrowed from
R. Anderson, M. Ernst and B. Howe in University of Washington CSE 140
https://www.reddit.com/r/ProgrammerHumor/comments/1r0cw7/the_5_stages_of_debugging/
The Problem
“Computers are good at following instructions, but not at reading
your mind.” - Donald Knuth
There is a bug!
1. Create a hypothesis
2. Design an experiment to test that hypothesis
– Ensure that it yields insight
3. Understand the result of your experiment
– If you don’t understand, then possibly suspend
your main line of work to understand that
The Scientific Method
Tips:
• Be systematic
– Never do anything if you don't have a reason
– Don’t just flail
• Random guessing is likely to dig you into a deeper hole
Common Error Types
• IndexError
– Raised when a sequence subscript is out of range.
• KeyError
– Raised when a mapping (dictionary) key is not found in the set of
existing keys.
• KeyboardInterrupt
– Raised when the user hits the interrupt key (normally Control-C or
Delete).
• NameError
– Raised when a local or global name is not found.
• SyntaxError
– Raised when the parser encounters a syntax error.
• IndentationError
– Base class for syntax errors related to incorrect indentation.
• TypeError
– Raised when an operation or function is applied to an object of
inappropriate type.
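Most of the error types listed above can be triggered on purpose in a few lines. This is a sketch: the helper name error_name is an assumption for illustration, and KeyboardInterrupt is left out because it needs a real key press.

```python
def error_name(action):
    """Run `action` and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as error:
        return type(error).__name__
    return None

index_error = error_name(lambda: [1, 2, 3][5])        # subscript out of range
key_error = error_name(lambda: {'a': 1}['b'])         # key not in the dictionary
name_error = error_name(lambda: undefined_name)       # name was never defined
type_error = error_name(lambda: 'abc' + 1)            # str + int is not allowed
# compile() parses source text, so syntax problems can be caught at runtime.
syntax_error = error_name(lambda: compile('if x', '<demo>', 'exec'))
indentation_error = error_name(
    lambda: compile('if True:\nprint(1)\n', '<demo>', 'exec'))

print(index_error, key_error, name_error, type_error,
      syntax_error, indentation_error)
```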
Divide and Conquer in the Program Code
Three approaches:
1. Test one function at a time
2. Add assertions or print statements
– The defect is executed before the failing assertion
(and maybe after a succeeding assertion)
3. Split complex expressions into simpler ones
Example: Failure in
    result = set({graph.neighbors(user)})
Change it to
    nbors = graph.neighbors(user)
    nbors_set = {nbors}
    result = set(nbors_set)
The error occurs on the “nbors_set = {nbors}” line
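Splitting the expression shows which step fails. The slides do not say why the line fails, so this sketch assumes that the neighbors call returns a list, which cannot be placed inside a set literal; the function neighbors below is a hypothetical stand-in for graph.neighbors.

```python
def neighbors(user):
    # Hypothetical stand-in for graph.neighbors(user): returns a list of names.
    return ['alice', 'bob']

failing_step = None

nbors = neighbors('carol')            # step 1 works
try:
    nbors_set = {nbors}               # step 2 fails: a list is not hashable
except TypeError as error:
    failing_step = 'nbors_set = {nbors}'
    print(failing_step, '->', error)

# Under this assumption, building the set from the list directly is the fix.
result = set(nbors)
print(result)
```

With the steps on separate lines, the traceback points at one small statement instead of one long expression.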
Divide and Conquer in Test Cases
These “innocent” and unnoticed changes happen more than you would think!
• You add a comment, and the indentation changes.
• You add a print statement, and a function is evaluated twice.
• You move a file, and the wrong one is being read
• You are on a different computer, and the library is a different version
Once You are on Solid Ground You can
Set Out Again
• Once you have something that works and something that
doesn’t work, it is only a matter of time
• Variation: Perhaps your code works with one input, but fails
with another. Incrementally change the good input into the
bad input to expose the problem.
Simple Debugging Tools
print
– shows what is happening whether there is a problem or
not
– does not stop execution
assert
– Raises an exception if some condition is not met
– Does nothing if everything works
– Example: assert len(rj.edges()) == 16
– Use this liberally! Not just for debugging!
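An assertion stays silent when its condition holds and raises AssertionError when it does not. This is a minimal sketch; the function mean and the message text are illustrative and not taken from the slides.

```python
def mean(numbers):
    # The assertion documents and enforces the function's precondition.
    assert len(numbers) > 0, 'mean() needs at least one number'
    return sum(numbers) / len(numbers)

print(mean([1, 2, 3]))      # condition holds, so the assert does nothing

try:
    mean([])
except AssertionError as error:
    message = str(error)
    print('assertion failed:', message)
```

Note that Python skips assert statements when run with the -O option, so they should not be the only guard on user input.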
Lecture Overview
• Debugging
• Exception Handling
• Testing
What is an Exception?
• An exception is an abnormal condition (and thus
rare) that arises in a code sequence at runtime.
• For instance:
– Dividing a number by zero
– Accessing an element that is out of bounds of an array
– Attempting to open a file which does not exist
What is an Exception?
• When an exceptional condition arises, an object
representing that exception is created and thrown in
the code that caused the error
except IOError:
print("I can't open the file!")
except ZeroDivisionError:
print("You can't divide by zero!")
try:
    f = open(arg, 'r')
except IOError:
    print('cannot open', arg)
else:
    print(arg, 'has', len(f.readlines()), 'lines')
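The try / except / else pattern can be exercised with a temporary file so that both branches run. This is a sketch: count_lines and the file names are assumptions for illustration, and OSError is used because IOError is an alias for it in Python 3.

```python
import os
import tempfile

def count_lines(path):
    """Return the number of lines in the file, or None if it cannot be opened."""
    try:
        f = open(path, 'r')
    except OSError:            # IOError is an alias of OSError in Python 3
        return None
    else:
        # Runs only when open() succeeded.
        with f:
            return len(f.readlines())

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.txt')
    with open(path, 'w') as out:
        out.write('one\ntwo\nthree\n')
    found = count_lines(path)
    missing = count_lines(os.path.join(folder, 'no-such-file.txt'))

print(found, missing)
```

Keeping only the open() call inside try means an error while reading is not mistaken for a failure to open the file.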
finally Statement
try:
    You do your operations here
except:
    Execute this block.
finally:
    This block will definitely be executed.
import os
try:
    file = open('out.txt', 'w')
    do something…
finally:
    file.close()
    os.remove('out.txt')
Nested try Blocks
• When an exception occurs inside a try block;
– If the try block does not have a matching except, then the outer
try statement’s except clauses are inspected for a match
– If a matching except is found, that except block is executed
– If no matching except exists, execution flow continues to find a
matching except by inspecting the outer try statements
– If a matching except cannot be found at all, the exception will be
caught by Python’s exception handler.
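The search for a matching except clause can be traced with two nested handlers. This is a sketch; the function names and the messages recorded in events are illustrative.

```python
def inner(events):
    try:
        int('not a number')          # raises ValueError
    except KeyError:
        # Not a match for ValueError, so this block is skipped.
        events.append('inner handled KeyError')

def outer():
    events = []
    try:
        inner(events)
    except ValueError:
        # The exception propagated out of inner() and is matched here.
        events.append('outer handled ValueError')
    return events

print(outer())
```

If outer() had no matching clause either, the exception would reach Python's default handler and the program would stop with a traceback.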
raise Statement
def getRatios(vect1, vect2):
    ratios = []
    for index in range(len(vect1)):
        try:
            ratios.append(vect1[index]/vect2[index])
        except ZeroDivisionError:
            ratios.append(float('nan')) # nan = Not a Number
        except:
            raise ValueError('getRatios called with bad arguments')
    return ratios
try:
    print(getRatios([1.0, 2.0, 7.0, 6.0], [1.0,2.0,0.0,3.0]))
    print(getRatios([], []))
    print(getRatios([1.0, 2.0], [3.0]))
except ValueError as msg:
    print(msg)
Output:
[1.0, 1.0, nan, 2.0]
[]
getRatios called with bad arguments
raise Statement
• Avoid raising a generic Exception! To catch it, you'll have
to catch all other more specific exceptions that subclass it.
def demo_bad_catch():
    try:
        raise ValueError('a hidden bug, do not catch this')
        raise Exception('This is the exception you expect to handle')
    except Exception as error:
        print('caught this error: ' + repr(error))
>>> demo_bad_catch()
caught this error: ValueError('a hidden bug, do not catch this')
raise Statement
• and more specific catches won't catch the general exception:
def demo_no_catch():
    try:
        raise Exception('general exceptions not caught by specific handling')
    except ValueError as e:
        print('we will not catch e')
>>> demo_no_catch()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in demo_no_catch
Exception: general exceptions not caught by specific handling
Custom Exceptions
• Users can define their own exception by creating a
new class in Python.
class ValueTooSmallError(Exception):
    """Raised when the input value is too small"""
    pass
class ValueTooLargeError(Exception):
    """Raised when the input value is too large"""
    pass
Custom Exceptions
number = 10 # you need to guess this number
while True:
    try:
        i_num = int(input("Enter a number: "))
        if i_num < number:
            raise ValueTooSmallError
        elif i_num > number:
            raise ValueTooLargeError
        break
    except ValueTooSmallError:
        print("This value is too small, try again!")
    except ValueTooLargeError:
        print("This value is too large, try again!")
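The guessing logic can be tested without keyboard input by moving it into functions. This is a sketch: check_guess and feedback are hypothetical names, and the return strings are illustrative.

```python
class ValueTooSmallError(Exception):
    """Raised when the input value is too small"""

class ValueTooLargeError(Exception):
    """Raised when the input value is too large"""

def check_guess(guess, number=10):
    """Raise a custom exception unless `guess` equals `number`."""
    if guess < number:
        raise ValueTooSmallError(guess)
    if guess > number:
        raise ValueTooLargeError(guess)
    return True

def feedback(guess):
    try:
        check_guess(guess)
    except ValueTooSmallError:
        return 'too small'
    except ValueTooLargeError:
        return 'too large'
    return 'correct'

print(feedback(3), feedback(42), feedback(10))
```

Because each problem has its own exception class, the caller can react to each one separately instead of inspecting an error message.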
c = a + b
if c > 100:
    print("Tested")
print("Passed")
def mean(numbers):
    """Returns the average of the argument list.
    The argument must be a non-empty number list."""
    return sum(numbers)/len(numbers)
# Tests
assert mean([1, 2, 3, 4, 5]) == 3
assert mean([1, 2, 3]) == 2
# Inputs that violate the documented precondition
mean([1, 2, "hello"])
mean("hello")
mean([])
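The three bad inputs each fail in a specific way, and a test can assert that too. This is a sketch: the helper raises is an assumption for illustration, and mean uses true division so that it returns the average described in its docstring.

```python
def mean(numbers):
    """Returns the average of the argument list.
    The argument must be a non-empty number list."""
    return sum(numbers) / len(numbers)

def raises(exception_type, function, *args):
    """Return True if calling function(*args) raises exception_type."""
    try:
        function(*args)
    except exception_type:
        return True
    return False

# Normal cases
assert mean([1, 2, 3, 4, 5]) == 3
assert mean([1, 2, 3]) == 2

# Bad inputs: each one should fail loudly rather than return nonsense.
bad_mixed = raises(TypeError, mean, [1, 2, "hello"])   # int + str
bad_string = raises(TypeError, mean, "hello")          # sum() over characters
bad_empty = raises(ZeroDivisionError, mean, [])        # 0 / 0
print(bad_mixed, bad_string, bad_empty)
```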
Test suite
• Want to find a collection of inputs that has high
likelihood of revealing bugs, yet is efficient
– Partition space of inputs into subsets that provide
equivalent information about correctness
• Partition divides a set into group of subsets such that each
element of set is in exactly one subset
• Construct test suite that contains one input from
each element of partition
• Run test suite
Example of partition
def bigger(x,y):
    """ Assumes x and y are ints returns 1
    if x is less than y else returns 0 """
    if x<-1:
        return -x
    else:
        return x
Hugo Bowne-Anderson
Data Scientist at DataCamp
URL
Uniform/Universal Resource Locator
Ingredients:
Protocol identifier - http:
Resource name - datacamp.com
http://en.wikipedia.org/wiki/Regular_expression
Really smart “Find” or “Search”
Understanding Regular Expressions
• Very powerful and quite cryptic
• Fun once you understand them
• Regular expressions are a language unto themselves
• A language of “marker characters” - programming with
characters
• It is kind of an “old school” language - compact
http://xkcd.com/208/
Regular Expression Quick Guide
^ Matches the beginning of a line
$ Matches the end of the line
. Matches any character
\s Matches whitespace
\S Matches any non-whitespace character
* Repeats a character zero or more times
*? Repeats a character zero or more times (non-greedy)
+ Repeats a character one or more times
+? Repeats a character one or more times (non-greedy)
[aeiou] Matches a single character in the listed set
[^XYZ] Matches a single character not in the listed set
[a-z0-9] The set of characters can include a range
( Indicates where string extraction is to start
) Indicates where string extraction is to end
https://www.py4e.com/lectures3/Pythonlearn-11-Regex-Handout.txt
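Several entries from the quick guide can be tried on one sample line. This is a sketch: the sample text is an assumed mail-header-style line chosen for illustration, and raw strings (the r prefix) are used so that backslashes reach the regular expression unchanged.

```python
import re

text = 'X-DSPAM-Confidence: 0.8475'   # assumed sample line

starts_with_x = bool(re.search('^X-', text))        # ^ anchors at the start
ends_with_digits = bool(re.search('8475$', text))   # $ anchors at the end
chunks = re.findall(r'\S+', text)                   # runs of non-whitespace
digits = re.findall('[0-9]+', text)                 # runs of digits
vowels = re.findall('[aeiou]', 'regular')           # one character from a set
not_xyz = re.findall('[^XYZ]', 'XaYbZ')             # one character not in a set
number = re.findall(r'^X-\S+: ([0-9.]+)', text)     # ( ) mark what to extract

print(starts_with_x, ends_with_digits)
print(chunks, digits)
print(vowels, not_xyz, number)
```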
The Regular Expression Module
• Before you can use regular expressions in your program, you
must import the library using “import re”
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if line.find('From:') >= 0:
        print(line)
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search('From:', line) :
        print(line)
Using re.search() Like startswith()
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if line.startswith('From:') :
        print(line)
import re
hand = open('mbox-short.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:', line) :
        print(line)
^X.*:
^ matches the start of the line, . matches any character, and * repeats it many times.
X-Sieve: CMU Sieve 2.3
X-DSPAM-Result: Innocent
X-Plane is behind schedule: two weeks
X-: Very short
Fine-Tuning Your Match
Depending on how “clean” your data is and the purpose of your
application, you may want to narrow your match down a bit
^X-\S+:
^ matches the start of the line, \S matches any non-whitespace character, and + repeats it one or more times.
X-Sieve: CMU Sieve 2.3
X-DSPAM-Result: Innocent
X-: Very Short
X-Plane is behind schedule: two weeks
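Running both patterns over the four sample lines shows what the narrower pattern filters out. This is a small sketch; the list names loose and strict are illustrative.

```python
import re

lines = [
    'X-Sieve: CMU Sieve 2.3',
    'X-DSPAM-Result: Innocent',
    'X-Plane is behind schedule: two weeks',
    'X-: Very short',
]

# Any characters between X and a colon.
loose = [line for line in lines if re.search('^X.*:', line)]

# X-, then one or more non-whitespace characters, then a colon.
strict = [line for line in lines if re.search(r'^X-\S+:', line)]

print(len(loose))
print(strict)
```

The loose pattern accepts all four lines; the strict one rejects the line with spaces before the colon and the line with nothing between X- and the colon.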
Matching and Extracting Data
• re.search() returns a match object (truthy) or None (falsy), depending on whether the string
matches the regular expression
• To extract the matching strings themselves, use re.findall()
>>> import re
>>> x = 'My 2 favorite numbers are 19 and 42'
>>> y = re.findall('[0-9]+',x)
>>> print(y)
['2', '19', '42']
>>> y = re.findall('[AEIOU]+',x)
>>> print(y)
[]
Warning: Greedy Matching
The repeat characters (* and +) push outward in both directions
(greedy) to match the largest possible string
>>> import re
>>> x = 'From: Using the : character'
>>> y = re.findall('^F.+:', x)
>>> print(y)
['From: Using the :']
In ^F.+: the .+ matches one or more characters.
>>> y = re.findall('\S+@\S+',x)
>>> print(y)
['[email protected]']
In \S+@\S+ each \S+ matches at least one non-whitespace character.
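Adding ? after the repeat character switches it from greedy to non-greedy, as listed in the quick guide. A minimal sketch using the same sample string:

```python
import re

x = 'From: Using the : character'

greedy = re.findall('^F.+:', x)        # .+ expands as far as it can
non_greedy = re.findall('^F.+?:', x)   # .+? stops at the first colon

print(greedy)
print(non_greedy)
```

The greedy version runs to the last colon in the string, while the non-greedy version stops at the first one.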
Fine-Tuning String Extraction
Parentheses are not part of the match - but they tell where to start
and stop what string to extract
>>> y = re.findall('\S+@\S+',x)
>>> print(y)
['[email protected]']
>>> y = re.findall('^From (\S+@\S+)',x)
>>> print(y)
['[email protected]']
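The effect of the parentheses is easier to see with a concrete line. The address in the slides is not recoverable from this copy, so the sketch below uses a hypothetical address on example.org; the variable names are illustrative.

```python
import re

# Hypothetical mail header line; the address is made up for illustration.
x = 'From someone@example.org Sat Jan  5 09:14:16 2008'

whole_match = re.findall(r'\S+@\S+', x)          # no parentheses: whole match
address = re.findall(r'^From (\S+@\S+)', x)      # only the part in ( )
host = re.findall(r'^From .*@([^ ]*)', x)        # only what follows the @

print(whole_match)
print(address)
print(host)
```

The text outside the parentheses still has to match, but only the parenthesized part is returned.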
String Parsing Examples…
21 31
['uct.ac.za']
'@([^ ]*)'
['uct.ac.za']
'@([^ ]*)'
['uct.ac.za']
'@([^ ]*)'
['uct.ac.za']
'^From .*@([^ ]*)'
Starting at the beginning of the line, look for the string 'From '
Even Cooler Regex Version
From [email protected] Sat Jan 5 09:14:16 2008
import re
lin = 'From [email protected] Sat Jan 5 09:14:16 2008'
y = re.findall('^From .*@([^ ]*)',lin)
print(y)
['uct.ac.za']
'^From .*@([^ ]*)'
( Start extracting
[^ ] Match a non-blank character
* Many times
) Stop extracting
The variant '^From .*@([^ ]+)' requires at least one non-blank character after the at sign
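For comparison, here is a sketch of three ways to pull the host name out of a line: find and slice, a double split, and the regular expression above. The sample line uses a made-up address, so the result is `example.org` rather than the host shown on the slides.

```python
import re

# Hypothetical header line; the address is made up for illustration
lin = 'From alice@example.org Sat Jan  5 09:14:16 2008'

# 1. find() and slice: locate the at sign, then the next space after it
atpos = lin.find('@')
sppos = lin.find(' ', atpos)
host_slice = lin[atpos + 1:sppos]

# 2. Double split: take the second word, then split it on the at sign
email = lin.split()[1]
host_split = email.split('@')[1]

# 3. Regular expression: extract the non-blank characters after the at sign
host_regex = re.findall('^From .*@([^ ]*)', lin)[0]

print(host_slice, host_split, host_regex)
```

All three approaches should agree; the regular expression simply packs the same steps into one pattern.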
Escape Character
If you want a special regular expression character to just behave
normally (most of the time) you prefix it with '\'
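A small sketch of escaping in practice. The dollar sign normally means "end of the line" in a regular expression, so it has to be escaped to match a literal `$`. The sample sentence is illustrative.

```python
import re

x = 'We just received $10.00 for cookies.'

# Escaped: '\$' matches a real dollar sign, followed by digits or periods
escaped = re.findall(r'\$[0-9.]+', x)

# Unescaped: '$' means end of line, so nothing can follow it and nothing matches
unescaped = re.findall('$[0-9.]+', x)

print(escaped)
print(unescaped)
```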
import urllib.request, urllib.parse, urllib.error
fhand = urllib.request.urlopen('http://data.pr4e.org/romeo.txt')
for line in fhand:
    print(line.decode().strip())
urllib1.py
Reading Web Pages
import urllib.request, urllib.parse, urllib.error
fhand = urllib.request.urlopen('http://www.dr-chuck.com/page1.htm')
for line in fhand:
    print(line.decode().strip())
[Diagram: a Python Dictionary is serialized to a wire format (JSON) and de-serialized into a Java HashMap]
{
  "name" : "Chuck",
  "phone" : "303-4456"
}
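On the Python side, the serialize and de-serialize steps can be sketched with the standard json module. This only illustrates the round trip inside one program; the Java side of the diagram is not shown, and the variable names are made up.

```python
import json

# The in-memory structure inside the Python program
person = {'name': 'Chuck', 'phone': '303-4456'}

# Serialize: turn the dictionary into a string that can cross the network
wire = json.dumps(person)
print(type(wire).__name__, wire)

# De-serialize: the receiving program turns the string back into a structure
restored = json.loads(wire)
print(restored['name'], restored['phone'])
```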
XML
Marking up data to send across the network...
http://en.wikipedia.org/wiki/XML
XML “Elements” (or Nodes)
• Simple Element (holds only text, such as <name>)
• Complex Element (holds other elements, such as <person>)
<people>
  <person>
    <name>Chuck</name>
    <phone>303 4456</phone>
  </person>
  <person>
    <name>Noah</name>
    <phone>622 7421</phone>
  </person>
</people>
eXtensible Markup Language
• Primary purpose is to help information systems share structured
data
http://en.wikipedia.org/wiki/XML
XML Basics
• Start Tag: <person>
• End Tag: </person>
• Text Content: Chuck
• Attribute: type="intl"
• Self Closing Tag: <email hide="yes" />
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>
White Space
Line ends do not matter. White space is generally discarded on text elements.
We indent only to be readable.
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>
<person>
<name>Chuck</name>
<phone type="intl">+1 734 303 4456</phone>
<email hide="yes" />
</person>
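One practical note, sketched below: Python's ElementTree parser keeps the white space around text content, so a program usually calls strip() itself to treat the two layouts above as the same data. The variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

spaced = '''<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>'''

compact = ('<person><name>Chuck</name>'
           '<phone type="intl">+1 734 303 4456</phone>'
           '<email hide="yes" /></person>')

a = ET.fromstring(spaced).find('phone').text
b = ET.fromstring(compact).find('phone').text

print(repr(a))           # includes the line ends and indentation
print(repr(b))
print(a.strip() == b)    # the same data once the white space is stripped
```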
XML Terminology
• Tags indicate the beginning and ending of elements
• Attributes - Keyword/value pairs on the opening tag of XML
• Serialize / De-Serialize - Convert data in one program into a
common format that can be stored and/or transmitted between
systems in a programming language-independent manner
http://en.wikipedia.org/wiki/Serialization
XML as a Tree
<a>
  <b>X</b>
  <c>
    <d>Y</d>
    <e>Z</e>
  </c>
</a>
Tree (elements, with text nodes after the colon):
a
  b: X
  c
    d: Y
    e: Z
XML Text and Attributes
<a>
  <b w="5">X</b>
  <c>
    <d>Y</d>
    <e>Z</e>
  </c>
</a>
Tree (elements, with attribute and text nodes):
a
  b
    attrib w: 5
    text node: X
  c
    d: Y
    e: Z
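The same tree can be explored from Python. This is a minimal sketch using ElementTree, where each element exposes its tag, its text, and a dictionary of attributes.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<a><b w="5">X</b><c><d>Y</d><e>Z</e></c></a>')

# An element carries its text node and its attribute nodes separately
b = root.find('b')
print(b.tag, b.text, b.attrib)
print(b.get('w'))

# iter() walks the whole tree in document order, starting with the root
tags = [elem.tag for elem in root.iter()]
print(tags)
```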
XML as Paths
<a>
  <b>X</b>
  <c>
    <d>Y</d>
    <e>Z</e>
  </c>
</a>
/a/b X
/a/c/d Y
/a/c/e Z
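The paths above map closely onto ElementTree's find() method. One detail in this sketch: ElementTree paths are relative to the element being searched, so starting from the root `<a>`, the path `/a/c/d` is written `c/d`. The dictionary of path pairs is made up for the illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<a><b>X</b><c><d>Y</d><e>Z</e></c></a>')

# Full path from the slide -> path relative to the root element <a>
paths = {'/a/b': 'b', '/a/c/d': 'c/d', '/a/c/e': 'c/e'}

values = {}
for full, relative in paths.items():
    values[full] = root.find(relative).text
    print(full, values[full])
```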
XML Schema
Describing a “contract” as to what is acceptable XML
http://en.wikipedia.org/wiki/Xml_schema
http://en.wikibooks.org/wiki/XML_Schema
XML Schema
• Description of the legal format of an XML document
XML Validation
[Diagram: an XML Document and an XML Schema Contract go into a Validator]
XML Document
<person>
  <lastname>Severance</lastname>
  <age>17</age>
  <dateborn>2001-04-17</dateborn>
</person>
- http://en.wikipedia.org/wiki/Document_Type_Definition
- http://en.wikipedia.org/wiki/SGML
- http://en.wikipedia.org/wiki/XML_Schema_(W3C)
http://en.wikipedia.org/wiki/Xml_schema
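Python's standard library does not include an XML Schema validator, so the sketch below is only a hand-written stand-in for the idea of a "contract": it checks that a person document has the expected elements and that their text converts to the expected type. The `CONTRACT` table and the `validate_person` function are hypothetical names invented for this illustration, not a real schema language.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical contract: required child tag -> function that converts its text
CONTRACT = {
    'lastname': str,
    'age': int,
    'dateborn': date.fromisoformat,
}

def validate_person(xml_text):
    """Return a list of problems; an empty list means the document passed."""
    problems = []
    person = ET.fromstring(xml_text)
    if person.tag != 'person':
        problems.append('root element must be <person>')
    for tag, convert in CONTRACT.items():
        child = person.find(tag)
        if child is None or child.text is None:
            problems.append('missing <' + tag + '>')
            continue
        try:
            convert(child.text.strip())
        except ValueError:
            problems.append('bad value in <' + tag + '>: ' + child.text)
    return problems

good = ('<person><lastname>Severance</lastname><age>17</age>'
        '<dateborn>2001-04-17</dateborn></person>')
bad = '<person><lastname>Severance</lastname><age>seventeen</age></person>'

good_problems = validate_person(good)
bad_problems = validate_person(bad)
print(good_problems)
print(bad_problems)
```

A real validator reads the contract from a schema document (DTD, XSD, and so on) instead of a Python dictionary, but the outcome is the same kind of accept-or-reject answer.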
xml1.py
import xml.etree.ElementTree as ET
data = '''<person>
<name>Chuck</name>
<phone type="intl">
+1 734 303 4456
</phone>
<email hide="yes"/>
</person>'''
tree = ET.fromstring(data)
print('Name:',tree.find('name').text)
print('Attr:',tree.find('email').get('hide'))
xml2.py
import xml.etree.ElementTree as ET
input = '''<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>'''
stuff = ET.fromstring(input)
lst = stuff.findall('users/user')
print('User count:', len(lst))
for item in lst:
    print('Name', item.find('name').text)
    print('Id', item.find('id').text)
    print('Attribute', item.get("x"))
JavaScript Object Notation
JSON represents data as nested “lists” and “dictionaries”
json1.py
import json
data = '''{
  "name" : "Chuck",
  "phone" : {
    "type" : "intl",
    "number" : "+1 734 303 4456"
  },
  "email" : {
    "hide" : "yes"
  }
}'''
info = json.loads(data)
print('Name:',info["name"])
print('Hide:',info["email"]["hide"])
JSON represents data as nested “lists” and “dictionaries”
json2.py
import json
input = '''[
  { "id" : "001",
    "x" : "2",
    "name" : "Chuck"
  } ,
  { "id" : "009",
    "x" : "7",
    "name" : "Chuck"
  }
]'''
info = json.loads(input)
print('User count:', len(info))
for item in info:
    print('Name', item['name'])
    print('Id', item['id'])
    print('Attribute', item['x'])
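Because the parsed result is made of ordinary Python lists and dictionaries, the usual list and dictionary tools apply. A short sketch, assuming the same two-user data as above; the `email` key is hypothetical and deliberately absent, to show a safe lookup.

```python
import json

data = '''[
  { "id" : "001", "x" : "2", "name" : "Chuck" },
  { "id" : "009", "x" : "7", "name" : "Chuck" }
]'''

info = json.loads(data)

# The outer structure is a list and each item in it is a dictionary
kinds = (type(info).__name__, type(info[0]).__name__)
print(kinds)

# Ordinary list and dictionary operations work on the parsed data
ids = [item['id'] for item in info]
print(ids)

# get() avoids a KeyError when a key (here a hypothetical one) is missing
email = info[0].get('email', 'n/a')
print(email)
```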
Service Oriented Approach
http://en.wikipedia.org/wiki/Service-oriented_architecture
• Most non-trivial web applications use services
http://en.wikipedia.org/wiki/API
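import urllib.request, urllib.parse, urllib.error
import json
# Assumption: the slide does not show how url is built. serviceurl is a
# hypothetical placeholder; replace it with the geocoding service you use.
serviceurl = 'http://example.com/geocode?'
while True:
    address = input('Enter location: ')
    if len(address) < 1: break
    # Encode the address as a query parameter so spaces and commas are safe
    url = serviceurl + urllib.parse.urlencode({'q': address})
    print('Retrieving', url)
    uh = urllib.request.urlopen(url)
    data = uh.read().decode()
    print('Retrieved', len(data), 'characters')
    try:
        js = json.loads(data)
    except json.JSONDecodeError:
        js = None
    # Skip this address if the response was not JSON or held no results
    if not js:
        print('Failure to retrieve')
        continue
    print(js[0]['lat'])
    print(js[0]['lon'])
    print(js[0]['display_name'])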
geojson.py
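The two fiddly parts of the program above, building the URL and reading the response safely, can be tried without a network connection. This is a sketch under stated assumptions: `serviceurl` is a hypothetical placeholder, the `q` parameter name is assumed, and the sample response is made up in the shape the program expects (a JSON list of results with lat, lon and display_name).

```python
import json
import urllib.parse

# Hypothetical endpoint; not the service used on the slide
serviceurl = 'http://example.com/geocode?'

def build_url(address):
    # urlencode() escapes spaces and punctuation so the address is safe in a URL
    return serviceurl + urllib.parse.urlencode({'q': address})

def first_result(data):
    """Return (lat, lon, display_name) from a JSON list of results, or None."""
    try:
        js = json.loads(data)
    except json.JSONDecodeError:
        return None
    if not isinstance(js, list) or len(js) < 1:
        return None
    top = js[0]
    return top.get('lat'), top.get('lon'), top.get('display_name')

# Made-up response in the shape the program expects
sample = '[{"lat": "42.28", "lon": "-83.74", "display_name": "Ann Arbor, Michigan"}]'

url = build_url('Ann Arbor, MI')
print(url)
print(first_result(sample))
print(first_result('not json'))
print(first_result('[]'))
```

Keeping the parsing in its own function makes it easy to handle a bad or empty response in one place instead of letting the program stop with a traceback.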
Acknowledgements / Contributions
These slides are Copyright 2010- Charles R. Severance
(www.dr-chuck.com) of the University of Michigan School of Information
and open.umich.edu and made available under a Creative
Commons Attribution 4.0 License. Please maintain this last slide
in all copies of the document to comply with the attribution
requirements of the license. If you make a change, feel free to
add your name and organization to the list of contributors on this
page as you republish the materials.
...