Hash Tables

Hash tables store data in an array, where each value is placed at an index computed from its key. A hash function maps data to these indices by taking the input, applying a mathematical algorithm, and outputting an index value. Collisions occur when two different inputs map to the same index. Common techniques to handle collisions include separate chaining and open addressing. Hash tables allow O(1) average-time operations by directly accessing data locations. They are widely used in applications that require fast lookups, such as databases, caches, and associative arrays.

Uploaded by

Waleed Ahmed
Copyright
© All Rights Reserved
HASH TABLES
AN IMPLEMENTATION OF HASHING ALGORITHM
Contents
In this presentation we will explain this data structure in depth: the algorithm behind it, different collision techniques, and some use case examples covering different scenarios, with the hashing technique explained for each of them.
• HASH FUNCTION
  • Integer Key: Division Remainder Method, Mid-Square Method, Multiplication Method
  • String/Char Key: Sum of ASCII Values
• WORKING OF HASH TABLE HASHING ALGORITHM
  • No-Collision
  • Collision:
    • Separate Chaining (Open Hashing)
    • Open Addressing (Closed Hashing): Linear Probing, Quadratic Probing, Double Hashing
• Operations: Insert, Find, Delete
What is a Hash Table?
• A hash table is a data structure which stores data in an associative manner.
• In a hash table, data is stored in an array format, where each data value has its own unique index value.
• The hash table data structure stores elements in key-value pairs, where:
  • Key: a unique identifier (an integer in the simplest case) that is used for indexing the values
  • Value: the data associated with the key
• A hash table uses a hash function to compute an index, also called a hash value, into an array of buckets or slots, from which the desired value can be found.
Why Hash Table?
Figure 1. Time Complexity Difference
• Hashing gives a quicker and more flexible way of retrieving data than searching lists or arrays, since most searches on those run in O(n) time.
• Hashing allows O(1) queries/inserts/deletes to the table on average.
• The idea of a hash table is to provide direct access to its items. That is why it calculates the "hash code" of the key and uses it, instead of the key itself, to decide where to store the item.
Figure 2. Array vs Hash Table
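The O(n) versus O(1) contrast can be sketched in a few lines of Python. This is only an illustration: the function names, the table size of 10, and the sample keys are assumptions, and the keys were chosen so that no two of them share a slot.

```python
def linear_search(items, target):
    """Scan a list front to back; return (position, comparisons made)."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons
    return -1, comparisons


def direct_lookup(table, key):
    """Compute the slot from the key and inspect only that one slot."""
    index = key % len(table)          # the "hash code" of the key
    return index if table[index] == key else -1


keys = [15, 22, 37, 48, 59]           # sample keys with distinct remainders mod 10
table = [None] * 10
for k in keys:
    table[k % len(table)] = k

position, steps = linear_search(keys, 59)   # has to walk the whole list
slot = direct_lookup(table, 59)             # goes straight to one slot
```

The list search needs one comparison per element it passes, while the table lookup does a single index computation regardless of how many keys are stored.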
Uses of Hashing and Hash Tables:
Hash tables can perform nearly all methods (except list) very fast, in O(1) time.
• They are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches and sets.
• Compilers use hash tables to keep track of declared variables.
• Used for checking circularity of array.
• In most programming languages, there are built-in data types or data structures in the standard library that are based on hash tables, e.g., dictionary in Python or HashMap in Java.
Hash Function/Hashing Algorithm:
• Hashing algorithms are functions that generate a fixed-size result (the hash value) from a given input.
• A hash function is used to map data of arbitrary size to fixed-size values.
• The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
• The values are used to index a fixed-size table called a hash table.
What is Collision?
A hash collision occurs when a hash algorithm produces the same hash value for two different input values.
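The "arbitrary size in, fixed size out" property can be seen with Python's standard `hashlib` module. This is a sketch under assumptions: SHA-256 is a cryptographic hash chosen here only because it is deterministic and built in (hash tables normally use much cheaper functions), and `TABLE_SIZE` and `to_index` are illustrative names.

```python
import hashlib

TABLE_SIZE = 10  # assumed table size, for illustration only


def to_index(data: bytes, table_size: int = TABLE_SIZE) -> int:
    """Hash bytes of any length to a fixed-size digest, then reduce it to a slot."""
    digest = hashlib.sha256(data).digest()        # always 32 bytes long
    return int.from_bytes(digest, "big") % table_size


short_digest = hashlib.sha256(b"a").digest()            # 1-byte input
long_digest = hashlib.sha256(b"a" * 10_000).digest()    # 10,000-byte input
index = to_index(b"hash tables")                         # a slot in 0..TABLE_SIZE-1
```

Both digests have the same length even though the inputs differ greatly in size, and the final modulo step is what turns the hash value into a table index.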
Hash Function/Hashing Algorithm:
• A hash function can be categorized as:
• Good Hash Function: (Minimum or No Collisions)
Some of the properties of a good hash function are:
  • Very fast to compute (nearly constant time)
  • One way; cannot be reversed
  • Output does not reveal information on input
  • It should minimize duplication of output values (hard to find collisions)
(The one-way and no-information properties matter mainly for cryptographic hashes rather than for table indexing.)
Figure 1. Example of a Good Hash Function
• Bad Hash Function:
It has a high possibility of duplication of output values (producing the same indices, resulting in collisions).
Figure 2. Example of a bad Hash Function
Hash Table & its Operations:
Hash Function/Hashing Algorithm:
Hash algorithms vary depending on the input key, the size of the key, the size of the table and the type of keys. There are many hash algorithms for other purposes, such as data encryption. Here we will discuss some of the hashing algorithms used in hash tables, depending on the input key:
• String/Character Key:
Different logic can be implemented in a hash function to output an index for an input string or character key, based on mathematical operations on the ASCII values of the characters.
• Integer Key:
When the key of the input value is numeric, the hash function can be implemented using various mathematical operations to calculate a viable and, ideally, unique index at which to place the value in the array.
Integer key Hash Functions:
• Mid-Square Method: This method is not good for large values of k, and there is also a good chance of collisions.
• Division/Mod Method: This method is relatively fast but has a high chance of collisions.
• Multiplication Method: This method is generally suitable when the table size is a power of two, because the whole process of computing the index from the key using multiplication hashing is then very fast.
Integer key Hash Function Examples:
• Division/Mod Method
• Multiplication Method
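The formulas on the original slides are not available, so the following Python sketch uses the common textbook definitions of the three integer-key methods. The function names, the two-digit middle window, and the constant `a` (the fractional part of the golden ratio, a frequently suggested choice) are assumptions, and the multiplication method is written with floating-point arithmetic rather than the bit-shift form that makes power-of-two table sizes fast.

```python
import math


def division_hash(k: int, m: int) -> int:
    """Division (remainder) method: h(k) = k mod m."""
    return k % m


def mid_square_hash(k: int, digits: int = 2) -> int:
    """Mid-square method: square the key and keep `digits` middle digits."""
    squared = str(k * k)
    start = max((len(squared) - digits) // 2, 0)
    return int(squared[start:start + digits])


def multiplication_hash(k: int, m: int, a: float = (math.sqrt(5) - 1) / 2) -> int:
    """Multiplication method: h(k) = floor(m * frac(k * a)), with 0 < a < 1."""
    return int(m * ((k * a) % 1.0))


division_index = division_hash(135, 10)                # 135 mod 10
mid_square_index = mid_square_hash(60)                 # 60 * 60 = 3600, middle digits "60"
multiplication_index = multiplication_hash(123456, 16384)
```

With a two-digit window the mid-square result lies in 0..99, so it suits a table of 100 slots; other table sizes would need a different window or a final modulo.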
String key Hash Functions:
• ASCII Value SUM
• Hash Function for Char Key: It returns the same hash value for the same character in either upper or lower case.
• Another Hash Function: Primes are used because you have the best chance of obtaining a unique value when multiplying values by the chosen prime number and adding them all up.
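A minimal Python sketch of a case-insensitive character hash and a prime-based string hash follows. The slide's own functions are not available, so the names, the prime 31, and the table sizes are assumptions; the string hash shown is the usual "multiply the running total by a prime, then add the next character code" scheme.

```python
def char_hash(ch: str, table_size: int = 26) -> int:
    """Case-insensitive hash for a single character key."""
    return ord(ch.lower()) % table_size


def prime_string_hash(key: str, table_size: int, prime: int = 31) -> int:
    """Multiply the running total by a prime before adding each character code."""
    total = 0
    for ch in key:
        total = total * prime + ord(ch)
    return total % table_size


upper_index = char_hash("A")                 # same slot as "a"
lower_index = char_hash("a")
ab_index = prime_string_hash("ab", 101)      # (97 * 31 + 98) mod 101
ba_index = prime_string_hash("ba", 101)      # (98 * 31 + 97) mod 101
```

Because each character is weighted by its position, "ab" and "ba" land in different slots here, which a plain ASCII sum cannot achieve.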
Collisions Example:
This is an example of a bad hash function resulting in collisions.
Let's give it the input of a student whose marks we want to store against the name.
(Student Name -> KEY, Marks -> Value)
• String key Hash Function:
Let the name be Mahad; its ASCII values sum to 475, whose modulus can be taken to produce an index.
Let another name be Ahmad; its ASCII values also sum to 475, so the values of both keys will be placed at the same index, which causes a collision.
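The collision can be checked directly in Python. The sum of 475 corresponds to the spellings "Mahad" and "Ahmad" with only the first letter capitalized; the table size of 10 and the function names are assumptions made for this sketch.

```python
def ascii_sum(key: str) -> int:
    """Add up the ASCII codes of all characters in the key."""
    return sum(ord(ch) for ch in key)


def ascii_sum_hash(key: str, table_size: int) -> int:
    """Bad hash function: ASCII sum reduced modulo the table size."""
    return ascii_sum(key) % table_size


TABLE_SIZE = 10                                      # assumed table size
mahad_sum = ascii_sum("Mahad")                       # 77 + 97 + 104 + 97 + 100 = 475
ahmad_sum = ascii_sum("Ahmad")                       # 65 + 104 + 109 + 97 + 100 = 475
mahad_index = ascii_sum_hash("Mahad", TABLE_SIZE)    # 475 mod 10 = 5
ahmad_index = ascii_sum_hash("Ahmad", TABLE_SIZE)    # 475 mod 10 = 5, a collision
```

Since the sum ignores character order, many different names can produce the same total, which is why this function collides so easily.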
Implementing a simple Hash Table:
(Considering no collisions)
• Hash Table Operations
• Hash Function
• Hash Table Class
• Main Method for Dry Run
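The code from these slides is not available, so here is a minimal Python sketch of what a no-collision hash table could look like. The class name, the division hash, the table size, and the sample names are all assumptions; the table simply overwrites a slot on insert, which is only safe when keys never collide.

```python
class SimpleHashTable:
    """Fixed-size table with integer keys and no collision handling."""

    def __init__(self, size: int = 10):
        self.size = size
        self.slots = [None] * size      # each slot holds one (key, value) pair or None

    def _hash(self, key: int) -> int:
        return key % self.size

    def insert(self, key: int, value) -> None:
        # Overwrites whatever is in the slot: valid only if keys never collide.
        self.slots[self._hash(key)] = (key, value)

    def find(self, key: int):
        entry = self.slots[self._hash(key)]
        return entry[1] if entry is not None and entry[0] == key else None

    def delete(self, key: int) -> bool:
        index = self._hash(key)
        entry = self.slots[index]
        if entry is not None and entry[0] == key:
            self.slots[index] = None
            return True
        return False


# Dry run with keys that map to distinct slots (1 and 5)
table = SimpleHashTable(size=10)
table.insert(21, "Ali")
table.insert(45, "Sara")
```

Each operation touches exactly one slot, which is where the O(1) behaviour comes from.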


Strategies to handle Hash Collisions:
In case no collision is possible (either our algorithm is ideal, or our keys are sufficiently different), we can simply use an algorithm to calculate the index and place the values in an array.
But if there is a possibility of collision, we use the following techniques to tackle it:
• Open Hashing or Separate Chaining
• Closed Hashing or Open Addressing:
  • Linear Probing
  • Quadratic Probing
  • Double Hashing
Separate Chaining:
• Separate chaining is a collision resolution strategy where collisions are resolved by storing all colliding keys in the same slot (using a linked list or some other data structure).
• Each slot stores a pointer to another data structure (usually a linked list or an AVL tree).

Running Time
Operation   Best    Worst
insert      O(1)    O(n)
find        O(1)    O(n)
delete      O(1)    O(n)
(worst case if insertions are always at the end of the linked list)
Performance of Separate Chaining:
n = Number of keys to be inserted in hash table
c = Number of slots in hash table
Load factor λ = n/c
Average Cases of Run Time Complexity:
Expected time to search = O(1 + λ)
Expected time to delete = O(1 + λ)
Time to insert = O(1)
Time complexity of search, insert and delete is O(1) if λ is O(1), that is, if the load factor stays bounded by a constant (for example λ ≤ 1).
Implementing a Hash Table:
(Collision handling by Separate Chaining)
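The implementation shown on these slides is not available, so the following Python sketch is one possible version of a chained hash table. Assumptions: Python lists stand in for the linked lists, the hash function is the ASCII sum modulo the table size, and the class name, table size, and marks are made up for illustration.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by keeping a chain per slot."""

    def __init__(self, size: int = 10):
        self.size = size
        self.count = 0
        self.buckets = [[] for _ in range(size)]   # one chain (list) per slot

    def _hash(self, key: str) -> int:
        return sum(ord(ch) for ch in key) % self.size

    def insert(self, key: str, value) -> None:
        bucket = self.buckets[self._hash(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:                # key already present: update it
                bucket[i] = (key, value)
                return
        bucket.append((key, value))                # otherwise add to the end of the chain
        self.count += 1

    def find(self, key: str):
        for existing_key, value in self.buckets[self._hash(key)]:
            if existing_key == key:
                return value
        return None

    def delete(self, key: str) -> bool:
        bucket = self.buckets[self._hash(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self.count -= 1
                return True
        return False

    def load_factor(self) -> float:
        return self.count / self.size              # λ = n / c


marks = ChainedHashTable(size=10)
marks.insert("Mahad", 85)      # ASCII sum 475 -> slot 5
marks.insert("Ahmad", 92)      # ASCII sum 475 -> slot 5 as well, chained behind "Mahad"
```

Both colliding names end up in the same chain and remain individually retrievable; search and delete cost grows with the chain length, which is what the O(1 + λ) estimate expresses.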
(Open Addressing Methods)
(1) Linear Probing:
Open addressing is a collision resolution strategy where collisions are resolved by storing the colliding key in a different location when the natural choice is full.

Simple Hash Function:


H(k) = k % length
Linear Probing Hash Function:
H’(k,i) = [ H(k) + i ] % length

Example:
length = 10
keys/values are
43, 135, 72, 23, 99, 19, 82
Advantage: No Extra Space
Disadvantage: Search Time O(n), Deletion Difficult,
Clustering (primary)
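The example can be traced with a short Python sketch that follows H’(k,i) = [ H(k) + i ] % length exactly. The function name and the use of `None` for empty slots are assumptions.

```python
def linear_probe_insert(table, key):
    """Insert key using linear probing; return the slot used, or None if the table is full."""
    length = len(table)
    home = key % length                      # H(k) = k % length
    for i in range(length):
        index = (home + i) % length          # H'(k, i) = [H(k) + i] % length
        if table[index] is None:
            table[index] = key
            return index
    return None


table = [None] * 10
placements = {key: linear_probe_insert(table, key)
              for key in [43, 135, 72, 23, 99, 19, 82]}
# 23 collides with 43 at slot 3 and moves to 4; 19 collides with 99 at slot 9 and wraps to 0;
# 82 collides at slot 2 and walks past 3, 4 and 5 before settling in 6.
```

The long walk taken by 82 is an instance of primary clustering: occupied slots 2 to 5 form a run that every later key hashing into it must traverse.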
(Open Addressing Methods)
(2) Quadratic Probing:
Open addressing is a collision resolution strategy where collisions are resolved by storing the colliding key in a different location when the natural choice is full.

Simple Hash Function:


H(k) = k % length
Quadratic Probing Hash Function:
H’(k,i) = [ H(k) + i^2 ] % length

Example:
length = 10
keys/values are
42, 16, 91, 33, 18, 27, 36, 62
Advantage: No Extra Space, Clustering Resolved
Disadvantage: Search Time O(n), No Guarantee of
finding free slot.
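Tracing the example with H’(k,i) = [ H(k) + i^2 ] % length in Python shows both the advantage and the disadvantage. The function name and the choice to stop after `length` probes are assumptions; the original worked figure is not available, so this trace follows the formula as stated.

```python
def quadratic_probe_insert(table, key):
    """Insert key using quadratic probing; return the slot used, or None if no probe is free."""
    length = len(table)
    home = key % length                          # H(k) = k % length
    for i in range(length):
        index = (home + i * i) % length          # H'(k, i) = [H(k) + i^2] % length
        if table[index] is None:
            table[index] = key
            return index
    return None


table = [None] * 10
placements = {key: quadratic_probe_insert(table, key)
              for key in [42, 16, 91, 33, 18, 27, 36, 62]}
# 36 collides at slot 6, then at 7, and lands in slot 0 on the third probe.
# 62 probes slots 2, 3, 6, 1, 8, 7, 8, 1, 6, 3: all occupied, so it cannot be placed
# even though slots 4, 5 and 9 are still empty.
```

The failure of 62 illustrates the "no guarantee of finding a free slot" disadvantage: with length 10 the quadratic probe sequence from slot 2 never reaches slots 4, 5 or 9.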
(Open Addressing Methods)
(3) Double Hashing:
• Double hashing is a collision resolving technique in open addressed hash tables.
• Double hashing uses the idea of applying a second hash function to the key when a collision occurs.
Simple Hash Function:
H1(k) = k % length
Default hash function when there is no collision.
Double Hashing Function:
H’(k,i) = [ H1(k) + ( i x H2(k) ) ] % length
This is only used when a collision occurs.
Where H2(k) = prime - ( k % prime )
prime can be any prime number, or any other number relatively prime to length, with prime < length.
Advantage: No Extra Space, No Clustering
Disadvantage: Search Time O(n)
Double Hashing Example:
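The worked example from the slides is not available, so this Python sketch applies the double hashing formulas to assumed inputs: length = 10, prime = 7, and the same keys used in the linear probing example. The function name is also an assumption.

```python
def double_hash_insert(table, key, prime=7):
    """Insert key using double hashing; return the slot used, or None if no probe is free."""
    length = len(table)
    h1 = key % length                        # H1(k) = k % length
    h2 = prime - (key % prime)               # H2(k) = prime - (k % prime), always in 1..prime
    for i in range(length):
        index = (h1 + i * h2) % length       # H'(k, i) = [H1(k) + i x H2(k)] % length
        if table[index] is None:
            table[index] = key
            return index
    return None


table = [None] * 10
placements = {key: double_hash_insert(table, key)
              for key in [43, 135, 72, 23, 99, 19, 82]}
# 23: H1 = 3 is taken, H2 = 7 - 2 = 5, so it jumps to slot 8.
# 19: H1 = 9 is taken, H2 = 7 - 5 = 2, so it wraps to slot 1.
# 82: H1 = 2 is taken, H2 = 7 - 5 = 2, so it moves to slot 4.
```

Compared with linear probing on the same keys, the colliding keys are scattered by key-dependent step sizes instead of piling up next to each other. Note that a step size sharing a factor with length (such as 2 or 5 with length 10) visits only part of the table, which is why a prime table length is often preferred in practice.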
Thank You!
ANY QUESTIONS ?
