19hashing

This lecture covers hashing: hash tables, hash functions, and collision resolution. It shows how hashing can make search, insert, and delete run in O(1) time when no collision occurs. It also reviews how an AVL tree is repaired with rotations, continuing from the previous lecture.


Data Structures

<Lecture 19: Hashing>

Sungmin Cha
New York University

11.13.2024
Outline
• Notice

• Review the previous lecture

• Hashing
– Hash Table
– Hash Function
– Collision Resolution

Notice
• Final exam
– Date: Dec 16th from 14:00 to 15:30
– Location: 60 Fifth Ave 110
– The exam scope covers all topics
▪ Questions in the final exam may draw on topics learned before
the midterm (e.g., Linked List, Time Complexity, Sorting)
▪ However, questions that directly test content learned before
the midterm will not be included
▪ Main topics covered in the final exam
: Tree (BST and AVL Tree), Hash Table, Graph

Notice
• Changed Schedule

– Lecture 25 will be a pre-recorded video lecture

– HW4 is about Hash and Graph
▪ Implementation of Hash
▪ Solving problems using Graph algorithms
o 3-4 problems?
Notice
• Question about HW3 BST and grading midterm exam
– I am checking them now, and I will post an announcement about
them on Ed Discussion this week.

AVL Tree
• Repairing a tree
– So far, we have learned about cases of imbalance in subtree t
▪ And how to resolve it using rotations
– Sometimes, resolving the imbalance of a subtree t can lead to a
new imbalance in the parent tree of t

AVL Tree
• Example: repairing a tree
– 1) Balanced tree
[Figure: the balanced AVL tree before the deletion]

AVL Tree
• Example: repairing a tree
– 2) Call delete(9)
[Figure: the AVL tree on which delete(9) is called]

AVL Tree
• Example: repairing a tree
– 2) Imbalance in the subtree of the node with 20
[Figure: the AVL tree after the deletion; the unbalanced subtree is labeled "Left rotation"]

AVL Tree
• Example: repairing a tree
– 3) Apply left rotation to the subtree
[Figure: the AVL tree after the left rotation; the rotated subtree is labeled "Balanced!"]

AVL Tree
• Example: repairing a tree
– 4) Determine whether there is an imbalance in the upper subtree
[Figure: the AVL tree with a subtree higher up labeled "Left rotation"]

AVL Tree
• The impact of the recursive implementation of
insert() and delete()
– After inserting or deleting a specific node, check for imbalance in
the subtree where the node is located
▪ And perform repairing if needed
– As the recursive function calls return, sequentially check for
imbalance in the parent node’s subtree and proceed with
repairing
– Finally, check up to the root node of the entire tree and return
the root node
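
The bottom-up checking described above can be sketched in Python as follows. This is a minimal illustration for insert() only, not the course's reference implementation; the names Node, repair, rotate_left, and rotate_right are assumptions made for this sketch. A recursive delete() would call repair() in the same way as its calls return.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1              # a single node has height 1


def height(t):
    return t.height if t else 0


def update(t):
    t.height = 1 + max(height(t.left), height(t.right))


def balance(t):
    # Positive: left-heavy, negative: right-heavy
    return height(t.left) - height(t.right)


def rotate_right(t):
    new_root = t.left
    t.left = new_root.right
    new_root.right = t
    update(t)
    update(new_root)
    return new_root


def rotate_left(t):
    new_root = t.right
    t.right = new_root.left
    new_root.left = t
    update(t)
    update(new_root)
    return new_root


def repair(t):
    # Check the subtree rooted at t and rotate if it is unbalanced
    update(t)
    if balance(t) > 1:
        if balance(t.left) < 0:      # left-right case
            t.left = rotate_left(t.left)
        return rotate_right(t)
    if balance(t) < -1:
        if balance(t.right) > 0:     # right-left case
            t.right = rotate_right(t.right)
        return rotate_left(t)
    return t


def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    # Runs as each recursive call returns, so every ancestor up to the
    # root is checked, and the (possibly new) subtree root is returned
    return repair(t)


root = None
for k in range(1, 8):                # insert 1..7 in sorted order
    root = insert(root, k)
```

Because every call returns the root of its repaired subtree, the caller must always reassign the result, as in root = insert(root, k).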

Towards the Most Efficient Data Structure

• Array and Linked List

– Array List

– Linked List

– Time complexity of search, insert and delete

▪ In any case, at least one operation is O(n)
Towards the Most Efficient Data Structure

• Binary Search Tree

– Time complexity of search, insert and delete

▪ In the average and best case, the time complexity is O(log n),
while in the worst case it is O(n)
Towards the Most Efficient Data Structure

• AVL Tree

– Time complexity of search, insert and delete

▪ In all cases, the time complexity is O(log n)

Towards the Most Efficient Data Structure

• Improvement of time complexity

– For insert, delete and search operations

▪ Linked and Array List: at least one operation is O(n)
▪ BST: average case O(log n), worst case O(n)
▪ AVL Tree: O(log n) for all cases

Is it possible to consider faster data structures?

Can operations be performed in constant average time (O(1)) regardless of the amount of stored data?
Towards the Most Efficient Data Structure

• Think different - the core limitation of BST

– Why can’t BST avoid an average time complexity of O(log n)?

▪ It is based on key comparisons between the given key and
the keys in the tree
▪ Because it finds a key's position only by comparing keys, it cannot
have a time complexity better than O(log n)

Towards the Most Efficient Data Structure

• Think different – Hashing

– Can we determine the position directly based on the key?

Key (12), Item (a) → An algorithm or function → Position = 3

Position  key  Item
0
1
2
3         12   a
4

– This is the key idea of Hashing!

Hashing

• Hashing
– A data structure where the position of a key is determined by
the key’s value
– In other words, the goal is to find the position for storing a
key based on its value without comparing it with the stored
keys
▪ Also, it aims to do this calculation just once (O(1))

Hashing

• Components for Hashing

– Hash table and function
Key (12) → Hash Function → Hash value = 3

Hash Table
Hash v.  key
0
1
2
3        12
4

– Hash table
: A table capable of storing m keys. Each slot is identified by a hash value
ranging from 0 to m-1
– Hash function
: Receives an arbitrary key and returns one of the hash values
Hashing

Key (x)

Hash Table
Hash v.  key
0
1
2
3
4

Hashing

• Hashing example
– 1) Insert a key value of 1

Key (1) → h(1) = 1

Hash Table
Hash v.  key
0
1        1
2
3
4

▪ Time complexity: O(1)

Hashing

• Hashing example
– 2) Insert a key value of 15

Key (15) → h(15) = 0

Hash Table
Hash v.  key
0        15
1        1
2
3
4

▪ Time complexity: O(1)

Hashing

• Hashing example
– 3) Insert a key value of 24

Key (24) → h(24) = 4

Hash Table
Hash v.  key
0        15
1        1
2
3
4        24

▪ Time complexity: O(1)

Hashing

• Hashing example
– 4) Search for a key value of 15

Key (15) → h(15) = 0

Hash Table
Hash v.  key
0        15
1        1
2
3
4        24

▪ Time complexity: O(1)

Hashing

• Hashing example
– 5) Delete a key value of 1

Key (1) → h(1) = 1

Hash Table
Hash v.  key
0        15
1        1
2
3
4        24

▪ Time complexity: O(1)
▪ Since the hash function calculates the location for searching,
storing, or deleting a key in one step, a time complexity of O(1)
is achievable.
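
The five steps of this example can be sketched in Python as below. The hash function h(key) = key mod 5 is an assumption: it reproduces the values shown in the example (h(1) = 1, h(15) = 0, h(24) = 4), but it is chosen here only for illustration. The sketch deliberately ignores collisions.

```python
M = 5                        # table size of the example (slots 0..4)
table = [None] * M


def h(key):
    # Assumed hash function: key mod 5 matches the values in the example
    return key % M


def insert(key):
    table[h(key)] = key      # one index computation, one write


def search(key):
    return table[h(key)] == key


def delete(key):
    if table[h(key)] == key:
        table[h(key)] = None


for k in (1, 15, 24):        # steps 1) to 3)
    insert(k)
found = search(15)           # step 4)
delete(1)                    # step 5)
```

Each operation computes one index and touches one slot, which is why no step depends on how many keys are stored.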
Hashing

• Hashing example – hash collision

– 6) Insert a key value of 46

Key (46) → h(46) = 1

Hash Table
Hash v.  key
0        15
1        46
2
3
4        24

Hashing

• Hashing example - hash collision

– 7) Insert a key value of 91

Key (91) → h(91) = 1

Hash Table
Hash v.  key
0        15
1        46
2
3
4        24

▪ Since the slot with hash value 1 already holds 46, the key 91 cannot be
inserted into the slot that its hash value points to
: Hash collision
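
A minimal Python sketch of how this collision can be detected is shown below. The function h(key) = key mod 5 is again an assumption that matches h(46) = 1 and h(91) = 1, and try_insert is a hypothetical helper that only reports the collision without resolving it.

```python
M = 5
table = [15, 46, None, None, 24]     # table state before inserting 91


def h(key):
    return key % M                   # assumed hash function


def try_insert(key):
    slot = h(key)
    if table[slot] is not None and table[slot] != key:
        return False                 # slot holds a different key: collision
    table[slot] = key
    return True


inserted = try_insert(91)            # 91 maps to slot 1, which holds 46
```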
Hashing

• Three factors that can cause a hash collision

– 1) Limited hash value range
Key (91) → h(91) = 1

Hash Table
Hash v.  key
0        15
1        46
2
3
4        24

▪ When the hash function maps a large number of keys into a smaller
range of hash values, collisions are more likely due to multiple keys
being assigned to the same hash value.

Hashing

• Three factors that can cause a hash collision

– 2) High load factor
Key (28) → h(28) = 3

Hash Table
Hash v.  key
0        15
1        46
2
3        28
4        24

Load factor = 4/5
▪ As the load factor approaches or exceeds 1, that is, as the number of
keys stored in the hash table approaches or exceeds the number of
available slots, collisions become more frequent
o load factor: number of stored keys / size of table
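
As a small worked example, the load factor of the table above can be computed as follows. The helper name load_factor and the use of None for an empty slot are assumptions of this sketch.

```python
def load_factor(table):
    # Number of stored keys divided by the size of the table
    stored = sum(1 for slot in table if slot is not None)
    return stored / len(table)


table = [15, 46, None, 28, 24]       # 4 keys stored in 5 slots
alpha = load_factor(table)
```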
Hashing

• Three factors that can cause a hash collision

– 3) Poor hash function design
Key (91) → h(91) = 1

Hash Table
Hash v.  key
0        15
1        46
2                 (available slot)
3                 (available slot)
4        24

▪ A hash function that doesn't distribute keys evenly across the hash table
can lead to clusters of keys in certain slots
▪ Therefore, the goal of hash function design is to evenly distribute
input keys across the entire hash table
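
The effect of uneven distribution can be illustrated with a short Python comparison. The key set and both hash functions below are hypothetical examples, not taken from the lecture: the keys are all multiples of 10, so a modulus of 10 sends every key to the same slot, while a modulus of 7 spreads them out.

```python
from collections import Counter

keys = list(range(0, 100, 10))       # hypothetical keys: 0, 10, ..., 90


def poor_hash(key):
    return key % 10                  # every key ends in 0, so all map to slot 0


def better_hash(key):
    return key % 7                   # modulus shares no factor with the key step


poor = Counter(poor_hash(k) for k in keys)
better = Counter(better_hash(k) for k in keys)
```

Here poor counts all ten keys in a single slot, while better uses all seven slots with at most two keys each.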
Hashing

• Three factors that can cause a hash collision

– 1) Limited hash value range - Solution: collision resolution
– 2) High load factor - Solution: collision resolution
– 3) Poor hash function design - Solution: consider a more
efficient hash function

– Hash collisions can prevent operations from achieving a time complexity of O(1)

▪ Therefore, preventing and handling them is a fundamental concern in hashing

• Key concepts in hashing

– 1) Hash function: division and multiplication method
– 2) Collision resolution
▪ Chaining
▪ Open Addressing
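
To give a first idea of collision resolution, here is a minimal Python sketch of chaining, in which each slot holds a list of all keys that hash to it. This is an illustration under assumptions (h(key) = key mod 5, Python lists as chains), not the version presented in the course; open addressing is not shown.

```python
M = 5
buckets = [[] for _ in range(M)]     # one chain (list) per slot


def h(key):
    return key % M                   # assumed hash function


def insert(key):
    chain = buckets[h(key)]
    if key not in chain:             # avoid storing duplicates
        chain.append(key)


def search(key):
    return key in buckets[h(key)]


def delete(key):
    chain = buckets[h(key)]
    if key in chain:
        chain.remove(key)


for k in (15, 46, 24, 91):           # 46 and 91 both hash to slot 1
    insert(k)
```

With chaining, 91 is stored next to 46 in slot 1 instead of being rejected. The cost of an operation then depends on the length of the chain it has to scan.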
Hashing

• ADT of Hashing

– Table[] can be an array list or a linked list

– The search(), insert(), and delete() operations in hashing are
almost identical to those in arrays or linked lists.
▪ However, we need to consider the hash function and collision resolution
Hash Function

• The goal of designing a hash function

– The input keys should be evenly distributed across the entire
hash table for storage

• Representative hash functions

– Division method
– Multiplication method

Hash Function

Key (46) → h(46) = 1

Hash Table
Hash v.  key
0        15
1        46
2
3
4        24

Hash Function

• Division method
– Pros
▪ The most basic hash function
▪ Simple yet allows for fast computation

– Cons
▪ Excess space can be left relatively unused, leading to memory inefficiency
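
The division method is commonly defined as h(key) = key mod m, where m is the table size. A minimal Python sketch, assuming non-negative integer keys and m = 5 as in the earlier examples:

```python
def division_hash(key, m):
    # Division method: the remainder of key divided by the table size m
    return key % m


m = 5
values = [division_hash(k, m) for k in (15, 46, 24, 28, 91)]
```

The results agree with the hash values shown in the earlier examples: h(15) = 0, h(46) = 1, h(24) = 4, h(28) = 3, and h(91) = 1.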

Hash Function

• Multiplication method
– Example
▪ m = 65,536 and A = 0.6180339887
▪ Key = 1,025,390 is given
▪ xA = 1,025,390 * 0.6180339887 = 633,725.871673093
▪ 0.871673093 * 65,536 = 57,125.97 (approximately), rounded down to 57,125
▪ h(1,025,390) = 57,125

Key (1,025,390) → Hash Function → h(1,025,390) = 57,125

Hash Table
Hash v.  key
0
1
…
57,125   1,025,390
…
65,535
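
The calculation above follows the usual form of the multiplication method: take the fractional part of key * A, multiply it by m, and round down. A minimal Python sketch (the function name multiplication_hash is an assumption, and floating-point arithmetic is used for simplicity):

```python
import math


def multiplication_hash(key, m, A):
    frac = (key * A) % 1.0           # fractional part of key * A
    return math.floor(m * frac)      # scale to 0..m-1 and round down


slot = multiplication_hash(1_025_390, 65_536, 0.6180339887)
```

For the values of this example the function returns 57,125, matching the slot computed by hand.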

Hash Function

• Multiplication method
– Pros
▪ Distribute hash values evenly
▪ Simple Implementation
▪ Efficient performance
: when the hash table size is appropriately chosen, operations can often be
done in constant time on average, O(1)

– Cons
▪ Sensitive to constant value (A)
: if it is poorly chosen, it can lead to increased collisions
▪ Limited flexibility
: it may not be as flexible or adaptive to changing data distributions or
hash table sizes
Concluding Remarks

• Hashing
– The motivation of hashing
– Hash table and function
– Hash collision

Thank you!

E-mail: [email protected]
