CS1010S Lecture 08 - Implementing Data Structures
Lecture 8
Implementing Data Structures
12 Oct 2022
Python you should know
Python Statements:
• def
• return
• lambda
• if, elif, else
• for, while, break, continue
• import
Assume we have:
def make_game_state(n, m):
    ...
where n and m are the number of coins in each pile.
What Else Do We Need?
def size_of_pile(game_state, p):
    ...
where p is the number of the pile
def total_size(game_state):
    return size_of_pile(game_state, 1) + size_of_pile(game_state, 2)
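A minimal sketch of one possible representation, assuming the game state is a tuple of the two pile sizes. The bodies of make_game_state and size_of_pile are elided on the slides, so the versions below are assumptions for illustration only, as is remove_coins_from_pile (used later in the lecture but not shown).

```python
# Assumption: a game state is an immutable tuple
# (coins in pile 1, coins in pile 2).
def make_game_state(n, m):
    return (n, m)

def size_of_pile(game_state, p):
    # p is the pile number: 1 or 2
    return game_state[p - 1]

def total_size(game_state):
    return size_of_pile(game_state, 1) + size_of_pile(game_state, 2)

def remove_coins_from_pile(game_state, n, p):
    # Hypothetical helper: returns a NEW state with n coins removed from pile p.
    if p == 1:
        return make_game_state(size_of_pile(game_state, 1) - n,
                               size_of_pile(game_state, 2))
    else:
        return make_game_state(size_of_pile(game_state, 1),
                               size_of_pile(game_state, 2) - n)

state = make_game_state(3, 5)
state = remove_coins_from_pile(state, 2, 2)
print(size_of_pile(state, 2), total_size(state))  # 3 6
```

Because the rest of the game only calls size_of_pile and remove_coins_from_pile, the tuple could later be swapped for a list or dictionary without touching the game logic.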
Announcing winner/loser:
def announce_winner(player):
    if player == "human":
        print("You lose. Better luck next time.")
    else:
        print("You win. Congratulations.")
Getting Human Player’s Move
def human_move(game_state):
    p = input("Which pile will you remove from?")
    n = input("How many coins do you want to remove?")
    return remove_coins_from_pile(game_state, int(n), int(p))
Artificial Intelligence
def computer_move(game_state):
    pile = 1 if size_of_pile(game_state, 1) > 0 else 2
    print("Computer removes 1 coin from pile " + str(pile))
    return remove_coins_from_pile(game_state, 1, pile)
Changes to Nim
def human_move(game_state):
    p = input("Which pile will you remove from?")
    n = input("How many coins do you want to remove?")
    if int(p) == 0:
        return handle_undo(game_state)
    else:
        push(game_stack, game_state)
        return remove_coins_from_pile(game_state, int(n), int(p))
def handle_undo(game_state):
    if is_empty(game_stack):
        print("No more previous moves!")
        return human_move(game_state)
    old_state = pop(game_stack)
    display_game_state(old_state)
    return human_move(old_state)
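The undo code relies on push, pop, is_empty and game_stack, none of which are defined on these slides. A minimal sketch, assuming the stack is a plain Python list whose last element is the top:

```python
# Assumption: the undo stack is a Python list; the top is the last element.
game_stack = []

def push(stack, item):
    stack.append(item)

def pop(stack):
    # The caller should check is_empty first, as handle_undo does.
    return stack.pop()

def is_empty(stack):
    return len(stack) == 0

push(game_stack, (3, 5))     # state before the first move
push(game_stack, (3, 4))     # state before the second move
print(pop(game_stack))       # (3, 4): the most recent state comes back first
print(is_empty(game_stack))  # False: (3, 5) is still on the stack
```

Last-in, first-out order is what makes repeated undo step back through the moves one at a time.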
Data Structures: Design Principles
When designing a data structure, need to spell out:
- Specification
- Implementation
Specification (contract)
- What does it do?
- Allows others to use it.
Implementation
- How is it realized?
- Users do not need to know this.
- Choice of implementation.
Nim: Game state
- Specification: piles, coins in each pile; size, remove-coin
- Implementation: multiple representations possible
Specification
• Conceptual description of data structure.
- Overview of data structure.
- State assumptions, contracts, guarantees.
- Give examples.
Specification
• Operations:
- Constructors
- Selectors (Accessors)
- Predicates
- Printers
Example: Lists
• Specs:
- A list is a collection of objects, in a given order.
• e.g. [], [3, 4, 1]
Example: Lists
Specs:
• Constructors: list(), [ ]
• Selectors: [ ]
• Predicates: type, in
• Printer: print
Multiset: Specs
A multiset (or bag or mset)
• is a modified set that allows duplicate elements
• count of each element is called the multiplicity
• arrangement of elements does not matter
Example:
- {a, b, b, a}, {b, a, b, a} are the same
- both elements a and b have multiplicity of 2
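As a quick illustration of these properties, Python's standard library class collections.Counter behaves like a multiset. It is not the implementation developed in this lecture; it is used here only to show that order does not matter while multiplicity does.

```python
from collections import Counter

# Counter maps element -> multiplicity, so insertion order is irrelevant.
m1 = Counter(['a', 'b', 'b', 'a'])
m2 = Counter(['b', 'a', 'b', 'a'])

print(m1 == m2)          # True
print(m1['a'], m1['b'])  # 2 2
print(m1['c'])           # 0: absent elements have multiplicity 0
```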
Multisets: Specs
Constructors:
make_empty_mset, adjoin_mset,
union_mset, intersection_mset
Selectors:
Predicates:
multiplicity_of, is_empty_mset
Printers:
print_set
Multisets: Contract
For any multiset 𝑆, and any object 𝑥
MSets: Implementation #1
Predicates:
def is_empty_mset(s):
    return not s
def multiplicity_of(x, s): # linear search
    count = 0
    for ele in s:
        if ele == x:
            count += 1
    return count
Time complexity: O(n), where n is the size of the set
MSets: Implementation #1
Constructors:
def adjoin_mset(x, s):
    s.append(x)
Time complexity: O(1)
MSets: Implementation #1
Constructors:
def intersection_of(s1, s2): # complete matching
    result = []
    for ele in s1: # O(n)
        if ele not in result: # O(n)
            n = min(multiplicity_of(ele, s1),
                    multiplicity_of(ele, s2))
            result.extend(n * [ele])
    return result
Predicates:
def is_empty_set(s):
    return not s # as before
MSets: Implementation #2
def adjoin_mset(x, s):
    # binary search
    low, high = 0, len(s) - 1
    ...
    # found at mid, or not found
    s.insert(mid, x)
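The binary search itself is elided on the slide. As a hedged sketch, the standard library module bisect can stand in for it: insort finds the position by binary search and inserts there, and the two bisect functions bracket the run of equal elements. This is a substitute technique, not the lecture's own code.

```python
import bisect

def adjoin_mset(x, s):
    # s is a sorted list. The position is found in O(log n),
    # but the insertion still shifts elements, so the whole call is O(n).
    bisect.insort(s, x)

def multiplicity_of(x, s):
    # Copies of x occupy the slice between the leftmost and rightmost
    # insertion points for x; both are found by binary search.
    return bisect.bisect_right(s, x) - bisect.bisect_left(s, x)

s = []
for x in [4, 1, 8, 4, 3]:
    adjoin_mset(x, s)

print(s)                      # [1, 3, 4, 4, 8]
print(multiplicity_of(4, s))  # 2
print(multiplicity_of(5, s))  # 0
```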
Intersection
Set 1: {1 3 4 4 8}
Set 2: {1 4 4 4 6 8 9}
→ 1 = 1, so 1 in intersection, forward both set1 & set2 cursors
Result: {1}
→ 3 < 4, 3 not in intersection, forward set1 cursor only
Result: {1}
→ 4 = 4, so 4 in intersection, forward both set1 & set2 cursors
Result: {1 4}
→ 4 = 4, so 4 in intersection, forward both set1 & set2 cursors
Result: {1 4 4}
→ 8 > 4, this 4 not in intersection, forward set2 cursor
Result: {1 4 4}
→ 8 > 6, 6 not in intersection, forward set2 cursor
Result: {1 4 4}
→ 8 = 8, so 8 in intersection, forward both set1 & set2 cursors
Result: {1 4 4 8}
→ set1 exhausted, return result
MSets: Implementation #2
def intersection_of(s1, s2): # "merge" algorithm
    result = []
    i, j = 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            result.append(s1[i])
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            i += 1
        else:
            j += 1
    return result
Time complexity: O(n), faster than previous!
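To connect the merge algorithm with the cursor walkthrough, here is a small traced variant. The name intersection_trace and the step labels are hypothetical; the logic is the same merge loop, with one label recorded per iteration to say which cursor moved.

```python
def intersection_trace(s1, s2):
    # Same merge loop as intersection_of, plus a log of cursor movements.
    result, steps = [], []
    i, j = 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] == s2[j]:
            result.append(s1[i])
            steps.append('both')
            i += 1
            j += 1
        elif s1[i] < s2[j]:
            steps.append('set1')
            i += 1
        else:
            steps.append('set2')
            j += 1
    return result, steps

result, steps = intersection_trace([1, 3, 4, 4, 8], [1, 4, 4, 4, 6, 8, 9])
print(result)  # [1, 4, 4, 8]
print(steps)   # ['both', 'set1', 'both', 'both', 'set2', 'set2', 'both']
```

Each iteration advances at least one cursor, so the loop runs at most len(s1) + len(s2) times.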
Comparing Implementations
Time Complexity    Unordered List            Ordered List
adjoin_mset        append, O(1)              insert, O(n)
multiplicity_of    linear search, O(n)       binary search, O(log n)
intersection_of    complete match, O(n^2)    merge algorithm, O(n)
MSets: Implementation #3
• Representation: binary tree
- Empty set represented by empty tree.
- Objects are sorted.
MSets: Implementation #3
• Each node stores 1 object.
• Left subtree contains objects smaller than this.
• Right subtree contains objects greater than this.
MSets: Implementation #3
[Figure: three different binary search trees, rooted at 7, 3 and 5, each representing the set {1, 3, 5, 7, 9, 11}]
def entry(tree):
    return tree[0]
def left_branch(tree):
    return tree[1]
def right_branch(tree):
    return tree[2]
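The selectors above imply that a tree is a three-element sequence. A minimal sketch of the matching constructor, assuming a tree is the list [entry, left, right] and the empty tree is []. make_tree is called later in the lecture, but its definition is not shown, so this body is an assumption.

```python
def make_tree(item, left, right):
    # Assumption: a tree is [entry, left subtree, right subtree].
    return [item, left, right]

def entry(tree):
    return tree[0]

def left_branch(tree):
    return tree[1]

def right_branch(tree):
    return tree[2]

# The set {1, 3, 5, 7, 9, 11} as a binary search tree rooted at 7.
t = make_tree(7,
              make_tree(3, make_tree(1, [], []), make_tree(5, [], [])),
              make_tree(9, [], make_tree(11, [], [])))

print(entry(t))                              # 7
print(entry(left_branch(t)))                 # 3
print(entry(right_branch(right_branch(t))))  # 11
```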
MSets: Implementation #3
• Each node in the tree contains
- The element
- The count
def make_mset():
    '''returns a new, empty set'''
    return []
MSets: Implementation #3
def adjoin_mset(x, s): # binary search
    if is_empty_set(s):
        s.extend(make_tree([x, 1], [], []))
    elif x == entry(s)[0]:
        entry(s)[1] += 1 # O(1) update
    elif x < entry(s)[0]:
        adjoin_mset(x, left_branch(s))
    else:
        adjoin_mset(x, right_branch(s))
Time complexity: O(log n), if the tree is balanced
MSets: Implementation #3
def multiplicity_of(x, s): # binary search
    if is_empty_set(s):
        return 0
    elif x == entry(s)[0]:
        return entry(s)[1]
    elif x < entry(s)[0]:
        return multiplicity_of(x, left_branch(s))
    else:
        return multiplicity_of(x, right_branch(s))
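A runnable sketch that puts the tree operations together. It assumes make_tree builds the list [entry, left, right] (its definition is not on the slides) and that each entry is a two-element list [element, count]. The point to notice is that adjoin_mset fills an empty list in place with extend, so the parent node, which holds a reference to that same list, sees the new subtree.

```python
def make_tree(item, left, right):   # assumption: [entry, left, right]
    return [item, left, right]

def entry(tree): return tree[0]
def left_branch(tree): return tree[1]
def right_branch(tree): return tree[2]

def make_mset():
    return []

def is_empty_set(s):
    return not s

def adjoin_mset(x, s):
    if is_empty_set(s):
        # Mutate the empty list in place rather than rebinding s.
        s.extend(make_tree([x, 1], [], []))
    elif x == entry(s)[0]:
        entry(s)[1] += 1
    elif x < entry(s)[0]:
        adjoin_mset(x, left_branch(s))
    else:
        adjoin_mset(x, right_branch(s))

def multiplicity_of(x, s):
    if is_empty_set(s):
        return 0
    elif x == entry(s)[0]:
        return entry(s)[1]
    elif x < entry(s)[0]:
        return multiplicity_of(x, left_branch(s))
    else:
        return multiplicity_of(x, right_branch(s))

s = make_mset()
for x in [5, 3, 9, 3, 5, 5]:
    adjoin_mset(x, s)

print(s)  # [[5, 3], [[3, 2], [], []], [[9, 1], [], []]]
print(multiplicity_of(5, s), multiplicity_of(3, s), multiplicity_of(7, s))  # 3 2 0
```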
Question of the Day
• How do we convert an unbalanced binary tree into a
balanced tree?
• Write a function balance_tree that will take a
binary tree and return a balanced tree (or as
balanced as you can make it)
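One possible approach, offered as a sketch rather than a model answer: flatten the tree into a sorted list with an in-order traversal, then rebuild by repeatedly choosing the middle entry as the root. It assumes the [entry, left, right] list representation with [] as the empty tree; in_order, build_balanced and height are hypothetical helper names.

```python
def make_tree(item, left, right):   # assumption: [entry, left, right]
    return [item, left, right]

def in_order(tree):
    # Left-mid-right traversal yields the entries in sorted order.
    if not tree:
        return []
    return in_order(tree[1]) + [tree[0]] + in_order(tree[2])

def build_balanced(entries):
    # The middle entry becomes the root; each half becomes a subtree.
    if not entries:
        return []
    mid = len(entries) // 2
    return make_tree(entries[mid],
                     build_balanced(entries[:mid]),
                     build_balanced(entries[mid + 1:]))

def balance_tree(tree):
    return build_balanced(in_order(tree))

def height(tree):
    return 0 if not tree else 1 + max(height(tree[1]), height(tree[2]))

# A degenerate tree: every node has only a right child.
chain = []
for x in [11, 9, 7, 5, 3, 1]:
    chain = make_tree(x, [], chain)

balanced = balance_tree(chain)
print(height(chain), height(balanced))  # 6 3
print(in_order(balanced))               # [1, 3, 5, 7, 9, 11]
```

This version builds a new tree and leaves the original untouched; the list slicing and concatenation keep the code short at the cost of extra copying.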
MSets: Implementation #3
def intersection_of(s1, s2):
    # traversing a BST left-mid-right will give the
    # elements in order
    ...
Time complexity: O(i + j), where i and j are the number of unique items in the sets
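The body of this function is not given on the slide. A hedged sketch of the idea in the comment: traverse both trees in order to get sorted sequences of [element, count] entries, then merge them as in the ordered-list version, keeping the smaller count. The tree layout [entry, left, right] is an assumption, and for simplicity the result is returned as a sorted list of entries rather than rebuilt into a tree.

```python
def in_order(tree):
    # Yields [element, count] entries in ascending element order.
    if tree:
        yield from in_order(tree[1])
        yield tree[0]
        yield from in_order(tree[2])

def intersection_of(s1, s2):
    a, b = list(in_order(s1)), list(in_order(s2))
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i][0] == b[j][0]:
            # Common element: keep the smaller multiplicity.
            result.append([a[i][0], min(a[i][1], b[j][1])])
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            i += 1
        else:
            j += 1
    return result  # sorted list of [element, count], not a tree

# {1 3 4 4 8} and {1 4 4 4 6 8 9} as trees of [element, count] entries
t1 = [[4, 2], [[1, 1], [], [[3, 1], [], []]], [[8, 1], [], []]]
t2 = [[6, 1], [[4, 3], [[1, 1], [], []], []], [[8, 1], [], [[9, 1], [], []]]]

print(intersection_of(t1, t2))  # [[1, 1], [4, 2], [8, 1]]
```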
Comparing Implementations
Time Complexity    Unordered List            Ordered List               Binary Search Tree
adjoin_mset        append, O(1)              insert, O(n)               binary search, O(log n)
multiplicity_of    linear search, O(n)       binary search, O(log n)    binary search, O(log n)
intersection_of    complete match, O(n^2)    merge, O(n + m)            merge, O(i + j)
Python Dictionary
• Often convenient to have a data structure that allows
retrieval by keyword, i.e. put + get
• Table of key-value pairs, commonly called an associative
array
• Python dictionaries use the curly braces {}
Dictionaries
English Dictionary: Word → Its meaning
Python Dictionary: Key → Value
Key           Value
'wind'        0
'desc'        'cloudy'
'temp'        [25.5, 29.0]
'rainfall'    {2:15, 15:7, 18:22}
Python Dictionary
{key1:value1, key2:value2, ...}
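A short example of put and get on a dictionary, using weather-style keys and values. The variable name weather and the key 'humidity' are made up for illustration.

```python
weather = {'wind': 0, 'desc': 'cloudy', 'temp': [25.5, 29.0]}

weather['rainfall'] = {2: 15, 15: 7, 18: 22}  # put: add a new key
weather['wind'] = 3                           # put: overwrite an existing key

print(weather['desc'])          # cloudy
print(weather['rainfall'][15])  # 7: values can themselves be dictionaries
print(weather.get('humidity'))  # None: get() avoids a KeyError for missing keys
print('temp' in weather)        # True
```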
MSets: Implementation #4
def make_mset():
    '''returns a new, empty set'''
    return {}
def adjoin_mset(x, s):
    if x in s:
        s[x] += 1
    else:
        s[x] = 1
def multiplicity_of(x, s):
    return s.get(x, 0)
Time complexity: O(1)
def intersection_of(s1, s2):
    d = {}
    for x in s1:
        if x in s2:
            d[x] = min(s1[x], s2[x])
    return d
Time complexity: O(i)
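A compact variant of the dictionary representation, cross-checked against the standard library. The rewritten bodies (get with a default, a dict comprehension) are stylistic alternatives with the same behaviour, and collections.Counter with its & operator is used only as an independent check, not as part of the lecture's implementation.

```python
from collections import Counter

def adjoin_mset(x, s):
    s[x] = s.get(x, 0) + 1   # same effect as the if/else version

def intersection_of(s1, s2):
    # Keep each common element with the smaller of its two multiplicities.
    return {x: min(s1[x], s2[x]) for x in s1 if x in s2}

s1, s2 = {}, {}
for x in [1, 3, 4, 4, 8]:
    adjoin_mset(x, s1)
for x in [1, 4, 4, 4, 6, 8, 9]:
    adjoin_mset(x, s2)

common = intersection_of(s1, s2)
print(common)  # {1: 1, 4: 2, 8: 1}

# Counter's & operator also takes the minimum count per element.
print(dict(Counter(s1) & Counter(s2)) == common)  # True
```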
Comparing Implementations
Time Complexity    Unordered List            Ordered List               Binary Search Tree         Dictionary
adjoin_mset        append, O(1)              insert, O(n)               binary search, O(log n)    dict access, O(1)
multiplicity_of    linear search, O(n)       binary search, O(log n)    binary search, O(log n)    dict access, O(1)
intersection_of    full match, O(n^2)        merge, O(n + m)            merge, O(i + j)            linear search, O(i)
Multiple representations
• You have seen that for compound data, multiple
representations are possible:
- e.g. multisets as:
1. Unordered list
2. Ordered list
3. Binary search tree
4. Dictionary
Multiple representations
• Each representation has its pros/cons:
- Typically, some operations are more efficient, some are
less efficient.
- “Best” representation may depend on how the object is
used.
Typically in large software projects, multiple representations co-exist.
Why?
Many possible reasons:
• Because large projects have long lifetimes, and project
requirements change over time.
• Because no single representation is suitable for every
purpose.
• Because programmers work independently and
develop their own representations for the same
thing.
Multiple representations
Therefore, you must learn to manage different co-existing
representations.
- What are the issues?
- What strategies are available?
- What are the pros/cons?
Summary
• Lots of wishful thinking (top-down)
• Design Principles
- Specification
- Implementation
• Abstraction Barriers allow for multiple
implementations
• Choice of implementation affects performance!
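A small sketch of what the abstraction barrier buys you: the same client code runs unchanged on two representations because it only calls the operations named in the specification. Bundling each implementation's operations in a dictionary, and the names list_impl, dict_impl and count_votes, are assumptions made for this illustration.

```python
# Representation A: unordered list
list_impl = {
    'make': lambda: [],
    'adjoin': lambda x, s: s.append(x),
    'multiplicity': lambda x, s: s.count(x),
}

# Representation B: dictionary of element -> count
def dict_adjoin(x, s):
    s[x] = s.get(x, 0) + 1

dict_impl = {
    'make': lambda: {},
    'adjoin': dict_adjoin,
    'multiplicity': lambda x, s: s.get(x, 0),
}

def count_votes(impl, votes, candidate):
    # Client code: uses only the specified operations and never
    # touches the underlying list or dictionary directly.
    s = impl['make']()
    for v in votes:
        impl['adjoin'](v, s)
    return impl['multiplicity'](candidate, s)

votes = ['a', 'b', 'b', 'a', 'b']
print(count_votes(list_impl, votes, 'b'))  # 3
print(count_votes(dict_impl, votes, 'b'))  # 3
```

Both calls give the same answer; only the running time differs, which is the sense in which the choice of implementation affects performance but not correctness.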
If you have a lot of time on your hands…
• Play nim (dumb version)
• Re-write nim to allow for arbitrary number of piles of
coins
• Write a smarter version of computer_move