0% found this document useful (0 votes)
9 views

04 DataContainer

Uploaded by

林恩玉
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
9 views

04 DataContainer

Uploaded by

林恩玉
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 48

Python Programming

Data Containers
Prof. Chang-Chieh Cheng
Information Technology Service Center
National Yang Ming Chiao Tung University
Lists
• A list is an ordered data sequence
• Random access
• Each element can be accessed by an index
• Duplicate elements are allowed
L = [3, 3, 2, 2, 2, 4, 1]
print(L[0])
L[0] = L[1] + L[2]
print(L[0])
print(L)

2
Some useful methods of lists
• list() L = list()
• create an empty list. print(L) # []

• list.append(x) L = [5, 6, 7]
print(L) # [5, 6, 7]
• add an item x to the end of the list
L.append(99)
print(L) # [5, 6, 7, 99]

• list.insert(i, x) L = ['A', 'B', 'C']


• Insert an item x at a given position i. L.insert(1, 'X')
print(L) # ??
• Then, i will be the index of x.

• list.reverse() L = [5, 6, 7]
• Reverse the elements of the list in place. L.reverse()
print(L) # [7, 6, 5]

3
Some useful methods of lists
• list.clear() L = [5, 6, 7]
• Remove all items. print(L) # [5, 6, 7]
L.clear()
print(L) # []
• list.remove(x)
• Remove the first item from the list whose value is x. It is an error if there is no such item.

L = [5, 6, 7, 6]
print(L) # [5, 6, 7, 6]
L.remove(6)
print(L) # [5, 7, 6]
L.remove(6)
print(L) # [5, 7]
L.remove(6) # An exception of ValueError thrown.

4
del
• Another way to remove an item from a list
L= [-1, 1, 66.25, 333, 333, 1234.5]
del(L[0])
print(L) # [1, 66.25, 333, 333, 1234.5]
del(L[2:4])
print(L) # [1, 66.25, 1234.5]
del(L[:])
print(L) # []

• Let's try it
• A list L contains a set of integers
• Find the range of the longest repeated numbers
• For example, L= [1, 1, 6, 3, 3, 3, 4, 4, 3, 3]
• Then, the range is [3:6]
• Remove the longest repeated numbers
• In the above example, L will be [1, 1, 6, 4, 4, 3, 3]

5
Some useful methods of lists
• list.sort(key=None, reverse=False)
• Sorting, where key specifies the comparison method. Just let it be None in most cases.
L = ['cat', 'mouse', 'pig', 'dog', 'bird']
L.sort()
print(L) # ['bird', 'cat', 'dog', 'mouse', 'pig']

L = ['cat', 'mouse', 'pig', 'dog', 'bird']


L.sort(key = len)
print(L) # ['cat', 'pig', 'dog', 'bird', 'mouse']

L = ['cat', 'mouse', 'pig', 'dog', 'bird']


L.sort(key = len, reverse = True)
print(L) # ['mouse', 'bird', 'cat', 'pig', 'dog']

• Let's try it
• Sort a list that contains a set of integers by the descending order of the number of digits
• If any two numbers have the same digit number, their order in the original list must be kept.
• For example,
• If L = [123, 4, 567, 9801, 1234, 0, 2341]
• The result is [9801, 1234, 2341, 123, 567, 4, 0] 6
Multidimensional arrays
• 3x4 array
L = [[1, 2, 3 ,4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(L)
print(len(L)) # 3
print(len(L[0])) # 4
print(L[0][0], L[0][1], L[1][0], L[2][3])
Print(L[2])

• 2 x 3 x 2 array
L = [[[1, 2], [3 ,4], [5, 6]],
[[7, 8], [9, 10], [11, 12]]]
print(L)
print(len(L)) # 2
print(len(L[0])) # 3
print(len(L[0][0])) # 2
print(L[0][0][0], L[0][1][0], L[1][2][1])
print(L[1])
print(L[0][2])

7
Multidimensional arrays
• 1 x 2 x 3 x 2 array
L = [[[[1, 2], [3 ,4], [5, 6]],
[[7, 8], [9, 10], [11, 12]]]]
print(L)
print(len(L)) # 1
print(len(L[0])) # 2
print(len(L[0][0])) # 3
print(len(L[0][0][0])) # 2
print(L[0][0][0][0], L[0][1][2][1])

8
Shallow copy & deep copy
• Shallow copy
L1 = [[1, 2], [3, 4], 5, 6]
L2 = list(L1)

L2[2] += 1
print(L1, L2)

L2[0][0] += 10
print(L1, L2)

9
Shallow copy & deep copy
• Deep copy
import copy
L1 = [[1, 2], [3, 4], 5, 6]
L2 = copy.deepcopy(L1)

L2[2] += 1
print(L1, L2)

L2[0][0] += 10
print(L1, L2)

10
List comprehension
• Using a for-statement to generate a list
L1 = [ x for x in range(5) ]
print(L1) # [0, 1, 2, 3, 4]

L2 = [ x * 2 for x in range(5) ]
print(L2) # [0, 2, 4, 6, 8]

L3 = [ x * x for x in range(5) if x % 2 == 0]
print(L3) # [0, 4, 16]

11
List comprehension
• Multiple layers of for-statement and 2D list

L4 = [y for x in range(3) for y in range(4)]


print(L4)
# [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]

L5 = [[y for x in range(3)] for y in range(4)]


print(L5)
# [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]

L6 = [[y * 3 + x for x in range(3)] for y in range(4)]


print(L6)
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

12
Exercise 1
• Using list comprehension to generate a m * n list as follows
• m = 6, n = 3:
[[0, 1, 2],
[5, 4, 3],
[6, 7, 8],
[11, 10, 9],
[12, 13, 14],
[17, 16, 15]]

• m = 8, n = 4:
[[0, 1, 2, 3],
[7, 6, 5, 4],
[8, 9, 10, 11],
[15, 14, 13, 12],
[16, 17, 18, 19],
[23, 22, 21, 20],
[24, 25, 26, 27],
[31, 30, 29, 28]]

13
Tuples
• A tuple is also a data container to store a set of data
objects
• Like list, using an index to access an item of a tuple
t = 12345, 54321, 'hello!' # without any parenthesis
print(t[0]) # 12345
print(t) # (12345, 54321, 'hello!')

u = t, (1, 2, 3, 4, 5)
# A tuple also can be an item of another tuple
print(u) # ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
# Notice the parentheses

• Each item is immutable (read only).

t = 12345, 54321, 'hello!'


t[1] = 0 # Error!

14
Tuples
• Tuple appending
t = 1, 2, 3
t += 4 # Error!
t += (4) # Error! Why?

t += (4,) # OK!

15
Tuples
• Let's try it
• The following code can draw a figure from two lists that are X axis's data and Y axis's
data.
import matplotlib.pyplot as plt
X = [0, 1, 2, 3]
Y = [10, 5, 8, 9]
plt.plot(X, Y)
plt.scatter(X, Y)
plt.show()

• Let's see what kind figure will be drawn if you changed some number in X and Y.

16
Tuples
• Let's try it
• Given a set of tuple of two elements
T = (0, 10), (1, 5), (2, 8), (3, 9)

• Write a program to transform T as two lists, X and Y, such that we can use the
program in previous page to draw a figure for T.

17
Sets
• Unordered collection with no duplicate elements.
A = {7, 9, 1, 1, 9, 2, 1, 2}
print(A) # {9, 2, 1, 7}
# The order is undefined
# because set is unordered
print(A[0]) # Error!

• Item appending and removing


A = {4, 6, 1, 2, 2, 1, 3}
A.add(5)
print(len(A)) # 6
print(A) # {1, 2, 3, 4, 5, 6}
A.remove(2)
print(len(A)) # 5
print(A) # {1, 3, 4, 5, 6}
A.clear()
print(len(A)) # 0
print(A) # set()

18
Sets
• Element accessing
• By for statement
A = {7, 9, 1, 1, 9, 2, 1, 2}
for x in A:
print(x)

19
Sets
• Creating an empty set
A = set()
print(len(A)) # 0

A.add(7)
A.add(1)
A.add(1)
A.add(7)
A.add(9)
print(len(A)) # 3
print(A) # {1, 9, 7}

20
Sets
• Let's try it
• Use the following to input a series of positive numbers and store them into a list L.

S = set()
while True:
x = int(input('Input a positive number: '))
if x < 0:
break
else:
S.add(x)
print('The size of S is ', len(S))
print(S)
L = list(S);
print(L)

21
Sets
• Let's try it
• Given a list L.
• Find the median of L
• Using a set S to store all elements in L
• Find the median of S
• For example,
• if L = [4, 6, 1, 2, 2, 1, 3 ], its median is 2
• Then, S maybe is {3, 2, 1, 4, 6}, the median of S is 3

• if L = [7, 8, 8, 6, 5, 2, 2, 3, 3 ], its median is 5


• Then, S maybe is {5, 3, 2, 7 ,6 ,8}, the median of S is 6

22
Sets
• Because a set is unordered, we cannot sort a set
• But sorted function can return a list of sorted data from a set

A = {7, 9, 1, 1, 9, 2, 1, 2}
L = sorted(A)
print(A) # {9, 2, 1, 7}
print(L) # [1, 2, 7, 9]

23
Sets
• Union two sets
A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = A.union(B)
print(A) # {1, 2, 3, 4}
print(B) # {3, 4, 5, 6}
print(C) # {1, 2, 3, 4, 5, 6}

• Intersection of two sets


A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = A.intersection(B)
print(A) # {1, 2, 3, 4}
print(B) # {3, 4, 5, 6}
print(C) # {3, 4}

24
Sets
• Check whether two sets are disjointed
A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = {5, 6}
print(A.isdisjoint(B)) # False
print(A.isdisjoint(C)) # True

• Check set B is a subset of set A


A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = {5, 6}
print(C.issubset(A)) # False
print(C.issubset(B)) # True
print(B.issubset(C)) # False

25
Sets
• Check set B is superset of set A
A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = {5, 6}
print(C.issuperset(A)) # False
print(C.issuperset(B)) # False
print(B.issuperset(C)) # True

• A-B
A = {4, 3, 1, 2}
B = {3, 6, 5, 4}
C = A.difference(B)
D = B.difference(A)
print(C) # {1, 2}
print(D) # {5, 6}
C = A.symmetric_difference(B)
D = B.symmetric_difference(A)
print(C) # {1, 2, 5, 6}
print(D) # {1, 2, 5, 6}

26
enumerate
• Creating a sequence of tuples for a data container and each tuple contains
(index, and data).

L = ['ABC', 'DEF', 'GHI']


E = enumerate(L)
for x in E:
print(x)

S = {'ABC', 'DEF', 'GHI'}


E = enumerate(S)
for x in E:
print(x)

27
Dictionaries
• A dictionary is similar to a list, but each element is indexed by a key rather
than an integer
Scores = {'James':82, 'Mary':98, 'Yamamoto':93}
print(Scores['Mary']) # 98
Scores['Yamamoto'] += 7
print(Scores) # {'James':82, 'Mary':98, 'Yamamoto':100}
print(len(Scores)) # 3

• Like a set, each element is unique in a dictionary.


• Notice that all elements are unsorted

Scores = {'Yamamoto':93, 'James':82, 'Mary':68, 'Mary':98, 'James':80}


print(Scores) # {'Yamamoto': 93, 'James': 80, 'Mary': 98}

28
Dictionaries
• Element updating and appending
Scores = {'James':82, 'Mary':98, 'Yamamoto':93}
print(Scores) # {'James':80, 'Mary':98, 'Yamamoto':93}
Scores.update({'Yamamoto':84})
print(Scores['Yamamoto'])
Scores.update({'Hideo':77})
print(Scores) # {'James':80, 'Mary':98, 'Yamamoto':93, 'Hideo':77}

• Element removing
Scores = {'James':82, 'Mary':98, 'Yamamoto':93}
print(Scores) # {'James':80, 'Mary':98, 'Yamamoto':93}
Scores.pop('James')
print(Scores) # {'Mary':98, 'Yamamoto':93}

29
Dictionaries
• Create a new dictionary
Scores = dict()

Scores.update({'Yamamoto':84})
Scores.update({'Hideo':77})
print(Scores) # {'Yamamoto':93, 'Hideo':77}

Scores = {}

Scores.update({'Yamamoto':84})
Scores.update({'Hideo':77})
print(Scores) # {'Yamamoto':93, 'Hideo':77}

30
Dictionaries
• DO NOT use a floating point number to be a key
D = {}
i = 0.0
while i <= 1.0:
print(i)
i += 0.1
D[i] = i * 100

print(D[0.8]) # Key error!

x = 0.7
• Why? y = 0.1
• Storing a floating number may generate an error z = x + y
print(x, y, z)
u = 0.9
v = -0.1
w = u + v
print(u, v, w)
if z == w:
print("!")
31
Dictionaries
• for loop and dictionaries
Scores = {'James':82, 'Mary':98, 'Yamamoto':93}
for key in Scores:
print(key ) # list all keys

for key in Scores:


print(key , "=", Scores[key]) # list keys and values

32
Dictionaries
• Sort by key
Scores = {'James':82, 'Mary':98, 'Yamamoto':93}

L1 = sorted(Scores)
print(L1)
# ['James', 'Mary', 'Yamamoto']

L2 = sorted(Scores.items())
print(L2)
# [('James', 82), ('Mary', 98), ('Yamamoto', 93)]

33
Dictionaries
• Sort by value
from operator import itemgetter

Scores = {'James':82, 'Mary':98, 'Yamamoto':93}


L = sorted(Scores.items(), key = itemgetter(1))
print(L)
# [('James', 82), ('Yamamoto', 93), ('Mary', 98)]

• itemgetter is a function generator

34
Dictionaries
• Let's try it
• Modify all examples of dictionary such that each student can store three scores

35
Dictionaries
• A dictionary with multiple keys
• Example:
student1 = {'Name':'James', 'ID':'01008', 'Score':90}
student2 = {'Name':'Mary', 'ID':'01003', 'Score':98}
student3 = {'Name':'Yamamoto', 'ID':'01005', 'Score':93}
print(student1)
print(student2)
print(student3)

L = list()
L.append({'Name':'James', 'ID':'01008', 'Score':90})
L.append({'Name':'Mary', 'ID':'01003', 'Score':98})
L.append({'Name':'Yamamoto', 'ID':'01005', 'Score':93})
for student in L:
print(student)

36
Dictionaries
• Data selection from a list of dictionaries
• Example:
L = list()
L.append({'Name':'James', 'ID':'01008', 'Score':90})
L.append({'Name':'Ruby', 'ID':'01024', 'Score':89})
L.append({'Name':'Mary', 'ID':'01003', 'Score':98})
L.append({'Name':'Yamamoto', 'ID':'01005', 'Score':93})
L.append({'Name':'Judy', 'ID':'01021', 'Score':73})

L2 = [x for x in L if x['Score'] < 90 ]

for student in L2:


print(student)

L3 = [x for x in L if x['Name'][-1] == 'y' and x['Score'] >= 80 ]

for student in L3:


print(student)

37
Dictionaries
• Data selection from a list of dictionaries
• Example:
from operator import itemgetter
L = list()
L.append({'Name':'James', 'ID':'01008', 'Score':90})
L.append({'Name':'Ruby', 'ID':'01024', 'Score':89})
L.append({'Name':'Mary', 'ID':'01003', 'Score':98})
L.append({'Name':'Yamamoto', 'ID':'01005', 'Score':93})
L.append({'Name':'Judy', 'ID':'01021', 'Score':73})

L4 = [{'ID':x['ID'], 'Score':x['Score']} for x in L ]


L4.sort(key = itemgetter('ID'))
for student in L4:
print(student)

38
Strings
• All string operations
• https://docs.python.org/3/library/stdtypes.html#string-methods
• Substring finding
s = 'Hello! My firends!'
if 'My' in s:
print('OK')

if 'my' in s:
print('OK')

39
Strings
• str.index(substring, start=0, end=len(string))
• Get the lowest index of substring in str

s = 'Hello! My firends!'
print(s.index('My'))
print(s.index('my')) # Error!
print(s.index('!'))
print(s.index('!', s.index('!') + 1))

40
Strings
• str.split(sep=None, maxsplit=-1)
• Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given,
at most maxsplit splits are done.

s = 'aaa*bbb*ccc eee fff*ggg '


L1 = s.split(sep = '*')
L2 = s.split(sep = ' ')
L3 = s.split(sep = '*', maxsplit = 2)
print(L1) # ['aaa', 'bbb', 'ccc eee fff', 'ggg ']
print(L2) # ['aaa*bbb*ccc', 'eee', 'fff*ggg', '']
print(L3) # ['aaa', 'bbb', 'ccc eee fff*ggg ']

• Notice zero-length substrings

s = 'aaa*bbb**ccc***eee'
L = s.split(sep = '*')
print(L) # ['aaa', 'bbb', '', 'ccc', '', '', 'eee']

41
Strings
• re.split
• Splitting a string with multiple separators

import re
s = 'aaa*bbb*ccc eee fff*ggg '
L4 = re.split('ccc|\*| ', s)
print(L4)

42
Exercise
• Input two strings, s1 and s2.
• Using a set to store all words in s1.
• Count how many number of words in both s1 and s2.
• For example, s1 is 'How do you do' and s2 is 'What do you
think'
• The result is 2, which are 'do' and 'you'.

43
Strings
• str.replace(old, new[, count])
• Return a copy of the string with all occurrences of substring old replaced by new. If the optional
argument count is given, only the first count occurrences are replaced.

s = 'AAA ABC ABC ccc abc ABC ABC ddd'


s1 = s.replace('ABC', '*')
s2 = s.replace('ABC', '*', 3)
print(s1) # AAA * * ccc abc * * ddd
print(s2) # AAA * * ccc abc * ABC ddd

44
Strings
• str.count(sub[, start[, end]])
• Return the number of non-overlapping occurrences of substring sub in the range
[start, end].
s = 'aaa bbb aaa aaa bbb ccc'
print(s.count('aaa')) # 3
print(s.count('bbb')) # 2
print(s.count('ccc')) # 1
print(s.count('ddd')) # 0

45
collections::Counter
• collections is a standard library of Python that includes a lot of many useful data
containers
• collections/Counter
• Counter is an unordered collection where elements are stored as dictionary keys and their counts are
stored as dictionary values.
from collections import Counter
s = 'cccbbbaaabbbaaabbb'
cnt = Counter(s)
print('----------')
print(cnt) # Counter({'b': 9, 'a': 6, 'c': 3})
print('----------')
print(cnt.keys()) # dict_keys(['c', 'b', 'a'])
print('----------')
print(cnt.items()) # dict_items([('c', 3), ('b', 9), ('a', 6)])

46
collections::Counter
• Accessing elements of Counter

from collections import Counter


s = 'cccbbbaaabbbaaabbb'
cnt = Counter(s)
for item in cnt:
print(item, '\t', cnt[item])

"""
c 3
b 9 without sorting!
a 6
"""

47
Exercise 2
• Design a program to count the number of occurrences of each word in a text
• Assuming that each word is separated by a space character.
• Only count non-zero-length words
• Just ignore zero-length words
• Print the results by the number of occurrences in ascending order
• for example, given a text,
" XYZ abc XYZ abc xyz abc ABC xyz ",
the results will be
ABC: 1
XYZ: 2
xyz: 2
abc: 3

Another example,
" xyz abc xyz abc XYZ abc ABC XYZ ",
the results will be
ABC: 1
xyz: 2
XYZ: 2
abc: 3 48

You might also like