Py Toolbox 4 Iterators

This document discusses iterators in Python. It begins by explaining how to iterate over lists, strings, ranges, and dictionaries using for loops. It then defines the difference between iterables and iterators, with iterables being objects that can return an iterator and iterators keeping state and producing the next value. Examples are given of using next() on an iterator and unpacking iterators. The document also demonstrates using enumerate() and zip() to iterate over multiple iterables at once. Finally, it shows how to iterate over large files in chunks using iterators to avoid loading the entire file into memory at once.

Introduction to iterators
PYTHON DATA SCIENCE TOOLBOX (PART 2)

Hugo Bowne-Anderson
Data Scientist at DataCamp
Iterating with a for loop
We can iterate over a list using a for loop

employees = ['Nick', 'Lore', 'Hugo']

for employee in employees:
    print(employee)

Nick
Lore
Hugo

Iterating with a for loop
We can iterate over a string using a for loop

for letter in 'DataCamp':
    print(letter)

D
a
t
a
C
a
m
p

Iterating with a for loop
We can iterate over a range object using a for loop

for i in range(4):
    print(i)

0
1
2
3

Iterators vs. iterables
An iterable is an object that can return an iterator, while an iterator is an object that keeps state and produces the next value when you call next() on it.

Iterable
Examples: lists, strings, dictionaries, file connections

An object with an associated __iter__() method, which the built-in iter() calls

Applying iter() to an iterable creates an iterator

Iterator
Produces next value with next()
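
The distinction can be sketched with a pair of small classes. This is only an illustration: the names Countdown and CountdownIterator are hypothetical and do not come from the slides. The iterable hands out a fresh iterator each time iter() is applied, while the iterator keeps the state.

```python
class Countdown:
    """Hypothetical iterable: counts down from `start` to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call returns a new iterator with its own state.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Hypothetical iterator: keeps state and produces the next value."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # An iterator is itself iterable and simply returns itself.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
it = iter(countdown)      # calls countdown.__iter__()
first = next(it)          # calls it.__next__()
remaining = list(it)      # consumes whatever is left
again = list(countdown)   # a new iterator starts from the top
```

Here first is 3, remaining is [2, 1], and again is [3, 2, 1], because the iterable itself is never used up.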

Iterating over iterables: next()
word = 'Da'
it = iter(word)
next(it)

'D'

next(it)

'a'

next(it)

StopIteration                     Traceback (most recent call last)
<ipython-input-11-2cdb14c0d4d6> in <module>()
----> 1 next(it)

StopIteration:
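
A for loop handles StopIteration for you. As a rough sketch of what the loop does behind the scenes (the variable names letters and leftover are chosen here for illustration), the same iteration can be written with next() and a try/except:

```python
word = 'Da'
it = iter(word)
letters = []

# Roughly what `for letter in word:` does for us.
while True:
    try:
        letter = next(it)
    except StopIteration:
        # The iterator has no more values, so leave the loop.
        break
    letters.append(letter)

# next() also accepts a default, returned instead of raising StopIteration.
leftover = next(it, None)
```

After this runs, letters is ['D', 'a'] and leftover is None.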

Iterating at once with *
word = 'Data'
it = iter(word)

print(*it)

D a t a

print(*it)

(prints an empty line: the iterator is exhausted, so there are no more values to go through)
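
A short sketch of the same exhaustion behaviour, using list() instead of print() so the results can be inspected (the variable names are illustrative only):

```python
word = 'Data'
it = iter(word)

first_pass = list(it)     # consumes all four letters
second_pass = list(it)    # the iterator is exhausted, so nothing is left

# The string itself is untouched; a new iterator starts from the beginning.
fresh_pass = list(iter(word))
```

Here second_pass is an empty list, while fresh_pass again holds all four letters.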

Iterating over dictionaries
pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}

for key, value in pythonistas.items():
    print(key, value)

hugo bowne-anderson
francis castro
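
As a small sketch of the other ways to iterate over the same dictionary: iterating over the dictionary directly yields its keys, .values() yields its values, and .items() can be passed to iter() like any other iterable. Since Python 3.7, dictionaries preserve insertion order, which is what the results below rely on.

```python
pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}

keys = [key for key in pythonistas]     # iterating a dict yields its keys
values = list(pythonistas.values())     # values, in insertion order

it = iter(pythonistas.items())          # an iterator over (key, value) pairs
first_pair = next(it)
```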

Iterating over file connections
file = open('file.txt')
it = iter(file)

print(next(it))

This is the first line.

print(next(it))

This is the second line.
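
A self-contained sketch of the same idea, assuming a throwaway temporary file in place of file.txt (the temporary file and the variable names are illustrative). It uses a with block so the file is closed automatically, and it shows that each line keeps its trailing newline, which is why print() appears to add a blank line.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on file.txt.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('This is the first line.\nThis is the second line.\n')
    path = tmp.name

with open(path) as file:
    # A file object is its own iterator, so next() works on it directly.
    first = next(file)
    # The remaining lines can be consumed with a loop or comprehension.
    rest = [line.rstrip('\n') for line in file]

os.remove(path)
```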

Playing with iterators
Using enumerate()
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
e = enumerate(avengers)

print(type(e))

<class 'enumerate'>

e_list = list(e)

print(e_list)

[(0, 'hawkeye'), (1, 'iron man'), (2, 'thor'), (3, 'quicksilver')]
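
The enumerate object is itself an iterator, so it produces pairs lazily and can only be consumed once. A brief sketch (variable names are illustrative):

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
e = enumerate(avengers)

first = next(e)       # pairs are produced one at a time
rest = list(e)        # list() collects whatever has not been consumed yet
leftover = list(e)    # nothing remains after that
```

Here first is (0, 'hawkeye'), rest holds the remaining three pairs, and leftover is empty.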

enumerate() and unpack
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
for index, value in enumerate(avengers):
    print(index, value)

0 hawkeye
1 iron man
2 thor
3 quicksilver

for index, value in enumerate(avengers, start=10):
    print(index, value)

10 hawkeye
11 iron man
12 thor
13 quicksilver

Using zip()
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']

z = zip(avengers, names)

print(type(z))

<class 'zip'>

z_list = list(z)

print(z_list)

[('hawkeye', 'barton'), ('iron man', 'stark'), ('thor', 'odinson'), ('quicksilver', 'maximoff')]
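
One detail worth knowing is that zip() stops at the shortest input. The sketch below shortens the names list on purpose to show this, and uses itertools.zip_longest from the standard library as one way to pad the missing values; the fill value 'unknown' is an arbitrary choice for illustration.

```python
from itertools import zip_longest

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark']   # deliberately shorter than avengers

# zip() stops as soon as the shortest iterable runs out.
short = list(zip(avengers, names))

# zip_longest() keeps going and fills the gaps with a chosen value.
padded = list(zip_longest(avengers, names, fillvalue='unknown'))
```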

zip() and unpack
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']

for z1, z2 in zip(avengers, names):
    print(z1, z2)

hawkeye barton
iron man stark
thor odinson
quicksilver maximoff

Print zip with *
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)
print(*z)

('hawkeye', 'barton') ('iron man', 'stark') ('thor', 'odinson') ('quicksilver', 'maximoff')
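
The same * unpacking can be combined with zip() to reverse the pairing. A minimal sketch, with result1 and result2 as illustrative names:

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)

# Unpacking the zip object into zip() again "unzips" the pairs into tuples.
result1, result2 = zip(*z)

# The original zip object has been consumed by the unpacking.
leftover = list(z)
```

Here result1 and result2 are tuples holding the same elements as the original lists, and leftover is empty.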

Using iterators to load large files into memory
Loading data in chunks
There can be too much data to hold in memory

Solution: load data in chunks!

pandas function: read_csv()

Specify the chunk size with the chunksize argument

Iterating over data
import pandas as pd
result = []

for chunk in pd.read_csv('data.csv', chunksize=1000):
    result.append(sum(chunk['x']))

total = sum(result)

print(total)

4252532

Iterating over data
import pandas as pd
total = 0

for chunk in pd.read_csv('data.csv', chunksize=1000):
    total += sum(chunk['x'])

print(total)

4252532
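
To see the chunking idea without pandas, here is a sketch using only the standard library. It is a swapped-in technique, not how read_csv() works internally: the helper read_in_chunks is hypothetical, and the in-memory data stands in for a large CSV file with a column named x.

```python
import csv
import io
from itertools import islice


def read_in_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows from an iterator of rows."""
    while True:
        # islice() pulls the next `chunk_size` items and remembers nothing else.
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a large CSV file: a header 'x' and the values 1..10.
data = io.StringIO('x\n' + '\n'.join(str(n) for n in range(1, 11)) + '\n')
reader = csv.DictReader(data)

total = 0
chunk_lengths = []
for chunk in read_in_chunks(reader, chunk_size=4):
    chunk_lengths.append(len(chunk))
    total += sum(int(row['x']) for row in chunk)
```

Only one chunk of rows is held in memory at a time; here the ten rows arrive as chunks of 4, 4, and 2, and total ends up as 55.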

Congratulations!
What’s next?
List comprehensions and generators

List comprehensions:
Create lists from other lists, DataFrame columns, etc.

Single line of code

Often more efficient than building the same list with a for loop
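
As a small preview, the sketch below builds the same list twice, once with a for loop and once with a list comprehension; the sample numbers are made up for illustration.

```python
nums = [12, 8, 21, 3, 16]

# for loop version: create an empty list, then append to it.
squares_loop = []
for num in nums:
    squares_loop.append(num ** 2)

# List comprehension version: the same result in a single line.
squares_comp = [num ** 2 for num in nums]
```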
