19 Hashing
Sungmin Cha
New York University
11.13.2024
Outline
• Notice
• Hashing
– Hash Table
– Hash Function
– Collision Resolution
Notice
• Final exam
– Date: Dec 16th from 14:00 to 15:30
– Location: 60 Fifth Ave 110
– The exam scope covers all topics
▪ Questions in the final exam may involve topics learned before the
midterm (e.g., Linked List, Time Complexity, Sorting)
▪ However, no question will directly test content learned before
the midterm
▪ Main topics covered in the final exam
: Tree (BST and AVL Tree), Hash Table, Graph
Notice
• Changed Schedule
AVL Tree
• Repairing a tree
– So far, we have learned about cases of imbalance in a subtree t
▪ And how to resolve them using rotations
– Sometimes, resolving the imbalance of a subtree t can lead to a
new imbalance in the subtree of t's parent
AVL Tree
• Example: repairing a tree
– 1) Balanced tree
[Figure: the balanced AVL tree before the deletion]
AVL Tree
• Example: repairing a tree
– 2) Call delete(9)
[Figure: the AVL tree after the deletion]
AVL Tree
• Example: repairing a tree
– 2) Imbalance in the subtree of the node with 20
[Figure: the AVL tree with the imbalanced subtree rooted at node 20, labeled "Left rotation"]
AVL Tree
• Example: repairing a tree
– 3) Apply left rotation to the subtree
[Figure: the AVL tree after the left rotation; the rotated subtree is labeled "Balanced!"]
AVL Tree
• Example: repairing a tree
– 4) Determine whether there is an imbalance in the upper subtree
[Figure: the AVL tree with an upper subtree labeled "Left rotation"]
AVL Tree
• The impact of the recursive implementation of
insert() and delete()
– After inserting or deleting a node, check for imbalance in
the subtree where the node is located
▪ And repair it if needed
– As the recursive calls return, check each parent node's subtree
for imbalance in turn and repair it where needed
– Finally, the check reaches the root node of the entire tree, and
the root node is returned
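The recursive repair described above can be sketched as follows. This is a minimal illustration, not the course's reference implementation: the names Node, repair, rotate_left, and rotate_right are hypothetical, and deletion of a node with two children uses the in-order successor, which is one common choice assumed here.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a single node has height 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    # Positive: left-heavy. Negative: right-heavy.
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def repair(node):
    """Recompute the height and rotate if this subtree is imbalanced."""
    update(node)
    if balance(node) > 1:                 # left-heavy
        if balance(node.left) < 0:        # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance(node) < -1:                # right-heavy
        if balance(node.right) > 0:       # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return repair(node)  # runs for every ancestor as the calls return


def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        succ = node.right                 # in-order successor (assumed choice)
        while succ.left:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return repair(node)  # runs for every ancestor as the calls return


root = None
for k in range(1, 8):
    root = insert(root, k)
print(root.key)  # 4
```

Because repair() is called on the way back up from every recursive call, each ancestor of the changed node is checked once, and the value returned by the outermost call is the root of the whole tree.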
Towards the Most Efficient Data Structure
• Linked List
– At least one operation is O(n)
• Binary Search Tree
– Average case: O(log n)
– Worst case: O(n)
• AVL Tree
– For all cases: O(log n)
Hashing
• Hashing
– A data structure where the position of a key is determined by
the key’s value
– In other words, the goal is to find the position for storing a
key based on its value without comparing it with the stored
keys
▪ Also, it aims to do this calculation just once (O(1))
Hashing
– Hash table
: A table with m slots for storing keys. Each slot is identified by a
hash value ranging from 0 to m-1
– Hash function
: Receives an arbitrary key and returns one of the hash values
Hashing
Key (x)
Hash Table
Hash value   Key
0
1
2
3
4
Hashing
• Hashing example
– 1) Insert a key value of 1
Key (1) → h(1) = 1
Hash Table
Hash value   Key
0
1            1
2
3
4
Hashing
• Hashing example
– 2) Insert a key value of 15
Key (15) → h(15) = 0
Hash Table
Hash value   Key
0            15
1            1
2
3
4
Hashing
• Hashing example
– 3) Insert a key value of 24
Key (24) → h(24) = 4
Hash Table
Hash value   Key
0            15
1            1
2
3
4            24
Hashing
• Hashing example
– 4) Search for a key value of 15
Key (15) → h(15) = 0
Hash Table
Hash value   Key
0            15
1            1
2
3
4            24
Hashing
• Hashing example
– 5) Delete a key value of 1
Key (1) → h(1) = 1
Hash Table
Hash value   Key
0            15
1            1
2
3
4            24
▪ Time complexity: O(1)
: Since the hash function computes the slot for searching, storing, or
deleting a key in one step, a time complexity of O(1) is achievable
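The five steps of the example can be replayed in a few lines. This is a minimal sketch: it assumes the hash function is h(x) = x mod 5, which reproduces the hash values shown in the example (h(1) = 1, h(15) = 0, h(24) = 4), and it ignores collisions entirely. The function names are illustrative.

```python
M = 5  # table size m, matching the five slots in the example


def h(key):
    # Assumed hash function: remainder of the key divided by m.
    return key % M


table = [None] * M


def insert(key):
    table[h(key)] = key  # one hash computation, one array access


def search(key):
    return table[h(key)] == key


def delete(key):
    if table[h(key)] == key:
        table[h(key)] = None


insert(1)
insert(15)
insert(24)
print(table)       # [15, 1, None, None, 24]
print(search(15))  # True
delete(1)
print(table)       # [15, None, None, None, 24]
```

Each operation computes h(key) once and touches exactly one slot, independent of how many keys are stored.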
Hashing
Key (46) → h(46) = 1
Hash Table
Hash value   Key
0            15
1            46
2
3
4            24
Hashing
Key (91) → h(91) = 1
Hash Table
Hash value   Key
0            15
1            46
2
3
4            24
▪ Since the slot with hash value 1 already holds 46, the key 91 cannot be
inserted into the slot that its hash value points to
: Hash collision
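The collision between 46 and 91 can be reproduced with a small check before storing. This sketch assumes the same h(x) = x mod m as the example values suggest; try_insert is a hypothetical helper that only detects the collision and does not resolve it.

```python
def try_insert(table, key):
    """Store key in its slot if the slot is free; report a collision otherwise."""
    slot = key % len(table)  # assumed hash function: key mod m
    if table[slot] is not None and table[slot] != key:
        return False         # hash collision: the slot holds another key
    table[slot] = key
    return True


table = [15, None, None, None, 24]
print(try_insert(table, 46))  # True
print(try_insert(table, 91))  # False
print(table)                  # [15, 46, None, None, 24]
```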
Hashing
▪ When the hash function maps a large number of keys into a smaller
range of hash values, collisions are more likely due to multiple keys
being assigned to the same hash value.
Hashing
▪ A hash function that doesn't distribute keys evenly across the hash table
can lead to clusters of keys in certain slots
▪ Therefore, the goal of hash function design is to evenly distribute
input keys across the entire hash table
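One way to see the difference between an uneven and an even distribution is to count how many keys land in each slot. The two hash functions below are hypothetical examples chosen for illustration, not functions from the lecture.

```python
from collections import Counter


def slot_counts(keys, hash_fn, m):
    """Return the number of keys mapped to each of the m slots."""
    counts = Counter(hash_fn(k) for k in keys)
    return [counts.get(slot, 0) for slot in range(m)]


m = 5
keys = range(50)


def clustered(k):
    return k // 25  # hypothetical poor hash: only slots 0 and 1 are ever used


def spread(k):
    return k % m    # hypothetical better hash for these keys


print(slot_counts(keys, clustered, m))  # [25, 25, 0, 0, 0]
print(slot_counts(keys, spread, m))     # [10, 10, 10, 10, 10]
```

With the clustered function, every slot in use holds 25 keys while three slots stay empty; with the spread function, each slot receives the same share.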
Hashing
• ADT of Hashing
Hash Function
Key (46) → h(46) = 1
Hash Table
Hash value   Key
0            15
1            46
2
3
4            24
Hash Function
• Division method
– Pros
▪ The most basic hash function
▪ Simple yet allows for fast computation
– Cons
▪ Excess space can be left relatively unused, leading to memory inefficiency
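A minimal sketch of the division method, assuming the usual definition h(x) = x mod m, which matches the hash values used in the earlier examples with m = 5. The second half illustrates one way slots can go unused: the keys and table sizes there are made up for the illustration.

```python
def h_division(key, m):
    # Division method (assumed definition): remainder of key divided by m.
    return key % m


# Hash values from the earlier examples, with m = 5.
print([h_division(k, 5) for k in (1, 15, 24, 46, 91)])  # [1, 0, 4, 1, 1]

# Hypothetical key set: multiples of 4.
keys = range(0, 100, 4)

# With m = 8, only two of the eight slots are ever used.
print(sorted({h_division(k, 8) for k in keys}))  # [0, 4]

# With m = 7, the same keys reach every slot.
print(sorted({h_division(k, 7) for k in keys}))  # [0, 1, 2, 3, 4, 5, 6]
```

When the keys share a common factor with m, part of the table is never addressed, which is one way the unused space mentioned above can arise.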
Hash Function
• Multiplication method
– Example
▪ m = 65,536 and A = 0.6180339887
▪ Key x = 1,025,390 is given
▪ xA = 1,025,390 × 0.6180339887 = 633,725.871673093
▪ Take the fractional part of xA, multiply it by m, and round down:
⌊0.871673093 × 65,536⌋ = 57,125
Key (1,025,390) → Hash Function → h(1,025,390) = 57,125
Hash Table
Hash value   Key
0
1
…
57,125       1,025,390
…
65,535
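The arithmetic in this example can be checked with a short function. The sketch assumes the multiplication method is h(x) = ⌊m · (xA mod 1)⌋, which is consistent with the numbers worked out above; the name h_multiplication is illustrative.

```python
import math


def h_multiplication(key, m, a):
    # Keep only the fractional part of key * a, then scale it to 0..m-1.
    frac = (key * a) % 1.0
    return math.floor(m * frac)


print(h_multiplication(1_025_390, 65_536, 0.6180339887))  # 57125
```

Since the fractional part is always in [0, 1), the result is always a valid hash value between 0 and m-1.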
Hash Function
• Multiplication method
– Pros
▪ Distributes hash values evenly
▪ Simple implementation
▪ Efficient performance
: when the hash table size is appropriately chosen, operations can often be
done in O(1) time on average
– Cons
▪ Sensitive to the constant value (A)
: if it is poorly chosen, it can lead to increased collisions
▪ Limited flexibility
: it may not be as flexible or adaptive to changing data distributions or
hash table sizes
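The sensitivity to A can be seen by counting which slots a set of keys reaches under two different constants. This sketch assumes h(x) = ⌊m · (xA mod 1)⌋; the table size m = 16, the keys 1 to 100, and the poor choice A = 0.5 are made up for the illustration.

```python
import math


def h_multiplication(key, m, a):
    # Assumed form of the multiplication method.
    return math.floor(m * ((key * a) % 1.0))


def slots_used(a, m=16, keys=range(1, 101)):
    """Return the sorted list of distinct slots reached by the keys."""
    return sorted({h_multiplication(k, m, a) for k in keys})


# Poor choice: key * 0.5 has fractional part 0 or 0.5 only.
print(slots_used(0.5))  # [0, 8]

# The constant from the example spreads the same keys over many more slots.
print(len(slots_used(0.6180339887)))
```

With A = 0.5, all 100 keys collide in just two slots, while the constant from the example reaches far more of the table.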
Concluding Remarks
• Hashing
– The motivation of hashing
– Hash table and function
– Hash collision
Thank you!
E-mail: [email protected]