Ds 5 Update

The document provides an overview of hashing, a technique for uniquely identifying objects using hash functions and hash tables. It discusses the implementation of hashing, the role of hash functions, basic operations of hash tables, and various collision handling methods such as separate chaining and open addressing. Additionally, it covers applications of hash tables and provides code examples for insertion and search operations.

Uploaded by

s.dhanapal13

VIDHYA SAGAR WOMENS COLLEGE, CHENGALPATTU

DEPT OF COMPUTER SCIENCE


ONLINE CLASS
DATA STRUCTURE
Staff Incharge: D.Preethi M.Sc.,M.Phil.,NET
Asst.Professor
Dept of Computer Science
HASHING
• Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects.
• Examples:
  • In universities, each student is assigned a unique roll number that can be used to retrieve information about them.
  • In libraries, each book is assigned a unique number that can be used to determine information about the book, such as its exact position in the library or the users it has been issued to.
• In both these examples, the students and books were hashed to a unique number.
• In hashing, large keys are converted into small keys by using hash functions.
• The values are then stored in a data structure called a hash table.
• The idea of hashing is to distribute entries (key/value pairs) uniformly across an array.
• Each element is assigned a key (the converted key). By using that key you can access the element in O(1) time on average.
IMPLEMENTATION:
• Hashing is implemented in two steps:
  • An element is converted into an integer by using a hash function. This integer can be used as an index at which to store the original element in the hash table.
  • The element is stored in the hash table, where it can be quickly retrieved using the hashed key.
hash = hashfunc(key)
index = hash % array_size
• In this method, the hash is independent of the array size, and it is then reduced to an index (a number between 0 and array_size − 1) by using the modulo operator (%).
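The two steps above can be sketched in C++ as follows. This is a minimal illustration only: `hashfunc` here simply adds up the character codes of a string key, and `indexFor` is a hypothetical helper name, neither of which comes from the slides.

```cpp
#include <cstddef>
#include <string>

// Step 1 (hypothetical hash function, for illustration only):
// adds up the character codes. The result does not depend on the table size.
std::size_t hashfunc(const std::string& key) {
    std::size_t hash = 0;
    for (unsigned char c : key) {
        hash += c;
    }
    return hash;
}

// Step 2: reduce the hash to an index between 0 and array_size - 1.
// Assumption: array_size is greater than 0.
std::size_t indexFor(const std::string& key, std::size_t array_size) {
    return hashfunc(key) % array_size;
}
```

For the key "abc" the character codes are 97, 98 and 99, so the hash is 294 and, for an array of size 20, the index is 294 % 20 = 14.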
HASH FUNCTION
• A hash function is any function that can be used to map data of arbitrary size to a value of fixed size that falls within the hash table.
• The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.
• To achieve a good hashing mechanism, it is important to have a good hash function with the following basic requirements:
  • Easy to compute: It should be easy to compute and must not become an algorithm in itself.
  • Uniform distribution: It should provide a uniform distribution across the hash table and should not result in clustering.
  • Fewer collisions: Collisions occur when two different elements are mapped to the same hash value. These should be kept to a minimum.
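To make the "collisions" requirement concrete, the sketch below compares two hypothetical string hash functions. Neither appears in the slides: `sumHash` ignores character order, so anagrams always collide, while `polyHash` weights each character by its position. The multiplier 31 is a common choice and is assumed here purely for illustration.

```cpp
#include <cstddef>
#include <string>

// Weak choice: character order is ignored, so "ab" and "ba" always collide.
std::size_t sumHash(const std::string& key) {
    std::size_t h = 0;
    for (unsigned char c : key) {
        h += c;
    }
    return h;
}

// Polynomial hash: each character is weighted by its position in the key.
// Unsigned arithmetic wraps around on overflow, which is well defined.
std::size_t polyHash(const std::string& key) {
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = h * 31 + c;
    }
    return h;
}
```

With these definitions, sumHash gives 195 for both "ab" and "ba", whereas polyHash gives 3105 and 3135, which map to different indices (5 and 15) in a table of size 20.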
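Example: the following (key, value) pairs are placed in a hash table of size 20, using key % 20 as the hash.
{(1,20), (2,70), (42,80), (4,25), (12,44), (14,32), (17,11), (13,78), (37,98)}

Sr.No.   Key   Hash            Array Index
1        1     1 % 20 = 1      1
2        2     2 % 20 = 2      2
3        42    42 % 20 = 2     2
4        4     4 % 20 = 4      4
5        12    12 % 20 = 12    12
6        14    14 % 20 = 14    14
7        17    17 % 20 = 17    17
8        13    13 % 20 = 13    13
9        37    37 % 20 = 17    17

Keys 2 and 42 both map to index 2, and keys 17 and 37 both map to index 17. These are collisions.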
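The table above can be reproduced with a few lines of C++. This is a sketch under the assumption that all keys are non-negative integers; the function names `indicesFor` and `countCollisions` are made up for this illustration.

```cpp
#include <vector>

// Index of each key in a table of the given size, using index = key % tableSize.
// Assumption: keys are non-negative and tableSize is positive.
std::vector<int> indicesFor(const std::vector<int>& keys, int tableSize) {
    std::vector<int> indices;
    for (int key : keys) {
        indices.push_back(key % tableSize);
    }
    return indices;
}

// Number of keys that land on an index already used by an earlier key.
int countCollisions(const std::vector<int>& keys, int tableSize) {
    std::vector<bool> used(tableSize, false);
    int collisions = 0;
    for (int key : keys) {
        int index = key % tableSize;
        if (used[index]) {
            ++collisions;
        }
        used[index] = true;
    }
    return collisions;
}
```

For the nine keys in the example and a table size of 20, countCollisions returns 2 (key 42 collides with key 2, and key 37 collides with key 17).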
HASH TABLE
• A hash table is a data structure that is used to store key/value pairs.
• It uses a hash function to compute an index into an array in which an element will be inserted or searched.
• By using a good hash function, hashing can work well.
• Let us consider a string S. You are required to count the frequency of all the characters in this string.
string S = "ababcd";
• A simple approach is to take each possible character in turn and count how often it occurs in S. The time complexity of this approach is O(26*N), where N is the size of the string and there are 26 possible characters.
#include <iostream>
#include <string>
using namespace std;

void countFre(string S)
{
    // For each letter, scan the whole string and count the matches
    for(char c = 'a'; c <= 'z'; ++c)
    {
        int frequency = 0;
        for(size_t i = 0; i < S.length(); ++i)
            if(S[i] == c)
                frequency++;
        cout << c << ' ' << frequency << endl;
    }
}

Output
a 2
b 2
c 1
d 1
e 0
f 0
...
z 0
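Hashing can reduce this to a single pass over the string. The sketch below is one possible way to do it and is not taken from the slides: it uses the expression c - 'a' as a very simple hash function that maps the letters 'a' to 'z' onto the indices 0 to 25, and it assumes that any other character should be ignored.

```cpp
#include <array>
#include <string>

// Counts lowercase letters in one pass: O(N) instead of O(26*N).
// The "hash function" is c - 'a', which maps 'a'..'z' to the indices 0..25.
// Assumption: characters outside 'a'..'z' are skipped.
std::array<int, 26> countFrequencies(const std::string& S) {
    std::array<int, 26> frequency{};  // all counters start at zero
    for (char c : S) {
        if (c >= 'a' && c <= 'z') {
            ++frequency[c - 'a'];
        }
    }
    return frequency;
}
```

Calling countFrequencies("ababcd") yields 2 at index 0 ('a'), 2 at index 1 ('b'), 1 at indices 2 and 3, and 0 everywhere else.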
BASIC OPERATIONS
• Following are the basic primary operations of a hash table.
  • Search − searches for an element in a hash table.
  • Insert − inserts an element into a hash table.
  • Delete − deletes an element from a hash table.
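In C++, the standard container std::unordered_map is a hash table, so the three operations can be tried without writing a table by hand. The sketch below is only an illustration: the roll-number table and the helper names `insertStudent`, `hasStudent` and `removeStudent` are hypothetical.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical table for illustration: roll number -> student name.
using StudentTable = std::unordered_map<int, std::string>;

// Insert: adds the pair; returns false if the roll number was already present.
bool insertStudent(StudentTable& table, int roll, const std::string& name) {
    return table.emplace(roll, name).second;
}

// Search: returns true if the roll number is present.
bool hasStudent(const StudentTable& table, int roll) {
    return table.find(roll) != table.end();
}

// Delete: removes the roll number; returns true if an entry was removed.
bool removeStudent(StudentTable& table, int roll) {
    return table.erase(roll) > 0;
}
```

Each of these operations takes constant time on average, which matches the O(1) access that hashing aims for.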
APPLICATIONS
• Associative arrays: Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects).
• Database indexing: Hash tables may also be used as disk-based data structures and database indices (such as in dbm).
• Caches: Hash tables can be used to implement caches, i.e. auxiliary data tables that are used to speed up access to data that is primarily stored in slower media.
• Object representation: Several dynamic languages, such as Perl, Python, JavaScript, and Ruby, use hash tables to implement objects.
• Hash functions are also used in various algorithms to make their computation faster.
HASH TABLE COLLISION HANDLING
• There are two basic methods: separate chaining and open addressing.
• Open addressing can use one of the following probing schemes:
  • Linear probing
  • Quadratic probing
  • Double hashing
SEPARATE CHAINING
• Separate chaining is one of the most commonly used collision resolution techniques. It is usually implemented using linked lists.
• In separate chaining, each element of the hash table is a linked list.
• To store an element in the hash table, you must insert it into a specific linked list. If there is a collision (i.e. two different elements have the same hash value), then store both elements in the same linked list.
IMPLEMENTATION OF HASH TABLES WITH SEPARATE CHAINING
Assumption
• The hash function will return an integer from 0 to 19.
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int hashFunc(string s);   // assumed to be defined elsewhere; returns 0 to 19

// Each slot holds a chain of elements (a vector is used here in place of a linked list)
vector<string> hashTable[20];
int hashTableSize = 20;
Insert
void insert(string s)
{
    // Compute the index using the hash function
    int index = hashFunc(s);
    // Insert the element into the chain at that index
    hashTable[index].push_back(s);
}
Search
void search(string s)
{
    // Compute the index using the hash function
    int index = hashFunc(s);
    // Search the chain at that index
    for(size_t i = 0; i < hashTable[index].size(); i++)
    {
        if(hashTable[index][i] == s)
        {
            cout << s << " is found!" << endl;
            return;
        }
    }
    cout << s << " is not found!" << endl;
}
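For readers who want something they can compile as it stands, here is a self-contained sketch of the same idea. It differs from the code above in a few ways that are assumptions of this example: it uses std::list for the chains, wraps the table in a hypothetical class named ChainedHashTable, returns a bool from the search instead of printing, and uses a simple sum-of-character-codes hash function.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

class ChainedHashTable {
public:
    // Assumption: size is greater than 0.
    explicit ChainedHashTable(std::size_t size) : buckets_(size) {}

    // Appends s to the chain at its hashed index.
    void insert(const std::string& s) {
        buckets_[indexOf(s)].push_back(s);
    }

    // Walks the chain at the hashed index and looks for s.
    bool contains(const std::string& s) const {
        for (const std::string& item : buckets_[indexOf(s)]) {
            if (item == s) {
                return true;
            }
        }
        return false;
    }

    // Length of the chain that s hashes to (shows how many elements collided).
    std::size_t chainLength(const std::string& s) const {
        return buckets_[indexOf(s)].size();
    }

private:
    // Hypothetical hash function: sum of character codes, reduced by modulo.
    std::size_t indexOf(const std::string& s) const {
        std::size_t sum = 0;
        for (unsigned char c : s) {
            sum += c;
        }
        return sum % buckets_.size();
    }

    std::vector<std::list<std::string>> buckets_;
};
```

With a table of size 20, the strings "ab" and "ba" both hash to index 15, so after inserting both, the chain at that index has length 2 and both strings are still found.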
LINEAR PROBING (OPEN ADDRESSING OR CLOSED HASHING)
• In open addressing, all entry records are stored in the array itself instead of in linked lists.
• When a new entry has to be inserted, the hash index of the hashed value is computed and then the array is examined (starting with the hashed index).
• If the slot at the hashed index is unoccupied, the entry record is inserted in that slot; otherwise, the insertion proceeds in some probe sequence until it finds an unoccupied slot.
• When searching for an entry, the array is scanned in the same sequence until either the target element is found or an unused slot is found. An unused slot indicates that there is no such key in the table.
• The name "open addressing" refers to the fact that the location or address of the item is not determined solely by its hash value.
• Linear probing is when the interval between successive probes is fixed (usually to 1).
Let’s assume that the hashed index for a particular entry is index. The probing sequence for linear probing will be:
index = index % hashTableSize
index = (index + 1) % hashTableSize
index = (index + 2) % hashTableSize
and so on…
IMPLEMENTATION OF HASH TABLE WITH LINEAR PROBING
Assumption
• There are no more than 20 elements in the data set.
• The hash function will return an integer from 0 to 19.
• The data set must have unique elements.
#include <iostream>
#include <string>
using namespace std;

int hashFunc(string s);   // assumed to be defined elsewhere; returns 0 to 19

// 21 slots for at most 20 elements, so at least one slot is always unused.
// An empty string marks an unused slot.
string hashTable[21];
int hashTableSize = 21;
Insert
void insert(string s)
{
    // Compute the index using the hash function
    int index = hashFunc(s);
    // Probe for an unused slot; wrap around to 0 if the index passes the end of the table
    while(hashTable[index] != "")
        index = (index + 1) % hashTableSize;
    hashTable[index] = s;
}
Search
void search(string s)
{
    // Compute the index using the hash function
    int index = hashFunc(s);
    // Probe until the element or an unused slot is found; wrap around past the end of the table
    while(hashTable[index] != s && hashTable[index] != "")
        index = (index + 1) % hashTableSize;
    // Check if the element is present in the hash table
    if(hashTable[index] == s)
        cout << s << " is found!" << endl;
    else
        cout << s << " is not found!" << endl;
}
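The following self-contained sketch shows linear probing in a form that compiles as it stands. Several details are assumptions of this example rather than part of the slides: the class name LinearProbingTable, the sum-of-character-codes hash function, the bounded loops (so a full table is reported instead of looping forever), and the return values used in place of printed messages. It also assumes that stored strings are non-empty, because an empty string marks an unused slot.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class LinearProbingTable {
public:
    // Assumption: size is greater than 0.
    explicit LinearProbingTable(std::size_t size) : slots_(size) {}

    // Returns the slot that was used, or -1 if the table is full.
    int insert(const std::string& s) {
        std::size_t start = home(s);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            std::size_t index = (start + i) % slots_.size();
            if (slots_[index].empty()) {
                slots_[index] = s;
                return static_cast<int>(index);
            }
        }
        return -1;
    }

    // Follows the same probe sequence until s or an unused slot is found.
    bool contains(const std::string& s) const {
        std::size_t start = home(s);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            std::size_t index = (start + i) % slots_.size();
            if (slots_[index] == s) {
                return true;
            }
            if (slots_[index].empty()) {
                return false;  // unused slot: s is not in the table
            }
        }
        return false;
    }

private:
    // Hypothetical hash function: sum of character codes, reduced by modulo.
    std::size_t home(const std::string& s) const {
        std::size_t sum = 0;
        for (unsigned char c : s) {
            sum += c;
        }
        return sum % slots_.size();
    }

    std::vector<std::string> slots_;  // an empty string marks an unused slot
};
```

In a table of size 21, "ab" and "ba" both hash to index 6, so "ab" is stored in slot 6 and "ba" moves on to slot 7.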
QUADRATIC PROBING
• Quadratic probing is similar to linear probing; the only difference is the interval between successive probes or entry slots.
• When the slot at a hashed index for an entry record is already occupied, you must start traversing until you find an unoccupied slot.
• The interval between slots is computed by adding the successive values of a quadratic polynomial (here the squares 1², 2², 3², …) to the original hashed index.
Let us assume that the hashed index for an entry is index and at index there is an occupied slot.
The probe sequence will be as follows (on the right-hand side, index is always the original hashed index):
index = index % hashTableSize
index = (index + 1²) % hashTableSize
index = (index + 2²) % hashTableSize
index = (index + 3²) % hashTableSize
and so on…
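The probe sequence can be listed with a short helper. This is a sketch that assumes the offsets are the squares 1, 4, 9, … added to the original hashed index; the function names `quadraticProbes` and `reachableSlots` are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// First `count` slots visited by quadratic probing from `start`:
// (start + i*i) % tableSize for i = 0, 1, 2, ...
// Assumption: tableSize is greater than 0.
std::vector<std::size_t> quadraticProbes(std::size_t start, std::size_t tableSize,
                                         std::size_t count) {
    std::vector<std::size_t> probes;
    for (std::size_t i = 0; i < count; ++i) {
        probes.push_back((start + i * i) % tableSize);
    }
    return probes;
}

// Number of distinct slots that the probe sequence can ever reach.
// The values of i*i % tableSize repeat after tableSize steps,
// so checking the first tableSize probes is enough.
std::size_t reachableSlots(std::size_t start, std::size_t tableSize) {
    std::vector<std::size_t> probes = quadraticProbes(start, tableSize, tableSize);
    return std::set<std::size_t>(probes.begin(), probes.end()).size();
}
```

Starting from index 5 in a table of size 21, the first five probes are 5, 6, 9, 14 and 0. Note that quadratic probing does not necessarily visit every slot: with a table size of 21 only 8 distinct slots are reachable from any starting index, while a prime size such as 23 reaches 12.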
IMPLEMENTATION OF HASH TABLE WITH QUADRATIC PROBING
Assumption
• There are no more than 20 elements in the data set.
• The hash function will return an integer from 0 to 19.
• The data set must have unique elements.
#include <iostream>
#include <string>
using namespace std;

int hashFunc(string s);   // assumed to be defined elsewhere; returns 0 to 19

string hashTable[21];     // an empty string marks an unused slot
int hashTableSize = 21;
Insert
void insert(string s)
{
    // Compute the index using the hash function
    int start = hashFunc(s);
    int index = start;
    // Probe start + 1*1, start + 2*2, ... (wrapping around with %) until an
    // unused slot is found. Quadratic probing may not visit every slot, so
    // this loop assumes that a reachable unused slot exists.
    int h = 1;
    while(hashTable[index] != "")
    {
        index = (start + h*h) % hashTableSize;
        h++;
    }
    hashTable[index] = s;
}
Search
void search(string s)
{
    // Compute the index using the hash function
    int start = hashFunc(s);
    int index = start;
    // Follow the same probe sequence until the element or an unused slot is found
    int h = 1;
    while(hashTable[index] != s && hashTable[index] != "")
    {
        index = (start + h*h) % hashTableSize;
        h++;
    }
    // Is the element present in the hash table?
    if(hashTable[index] == s)
        cout << s << " is found!" << endl;
    else
        cout << s << " is not found!" << endl;
}
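The loops above keep probing until they reach an unused slot. Because quadratic probing does not necessarily visit every slot, a bounded variant may be safer. The following is a minimal sketch under stated assumptions, not the slides' method: the caller passes in the hashed index as `start`, the probe offsets are the squares 0, 1, 4, 9, …, stored strings are non-empty, and the names `quadraticInsert` and `quadraticContains` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Tries the slots (start + i*i) % size for i = 0 .. size-1.
// Returns true and stores s if an unused slot was found, false otherwise.
bool quadraticInsert(std::vector<std::string>& table, std::size_t start,
                     const std::string& s) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        std::size_t index = (start + i * i) % table.size();
        if (table[index].empty()) {
            table[index] = s;
            return true;
        }
    }
    return false;  // every reachable slot is occupied
}

// Follows the same probe sequence until s or an unused slot is found.
bool quadraticContains(const std::vector<std::string>& table, std::size_t start,
                       const std::string& s) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        std::size_t index = (start + i * i) % table.size();
        if (table[index] == s) {
            return true;
        }
        if (table[index].empty()) {
            return false;
        }
    }
    return false;
}
```

In a table of size 4, for example, a starting index of 0 can only reach slots 0 and 1, so a third insert with the same starting index returns false even though slots 2 and 3 are still unused.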
DOUBLE HASHING
• Double hashing is similar to linear probing; the only difference is the interval between successive probes.
• The interval between probes is computed by using two hash functions.
• Let us say that the hashed index for an entry record is an index that is computed by one hash function, and the slot at that index is already occupied.
• You must start traversing in a specific probing sequence to look for an unoccupied slot.
• The probing sequence will be (on the right-hand side, index is the original hashed index):
index = (index + 1 * indexH) % hashTableSize;
index = (index + 2 * indexH) % hashTableSize;
and so on…
• Here, indexH is the hash value that is computed by another hash function. It must not be 0; otherwise the probe never moves to another slot.
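IMPLEMENTATION OF HASH TABLE WITH DOUBLE HASHING
Assumption
• There are no more than 20 elements in the data set.
• The hash functions will return an integer from 0 to 19.
• The data set must have unique elements.
#include <iostream>
#include <string>
using namespace std;

int hashFunc1(string s);  // assumed to be defined elsewhere; returns 0 to 19
int hashFunc2(string s);  // assumed to be defined elsewhere; must not return 0

// An empty string marks an unused slot. The loops below end only if the probe
// sequence reaches an unused slot, which needs indexH to be non-zero and to
// share no common factor with hashTableSize.
string hashTable[21];
int hashTableSize = 21;
Insert
void insert(string s)
{
    // Compute the index using the first hash function
    int index = hashFunc1(s);
    // Compute the step between probes using the second hash function
    int indexH = hashFunc2(s);
    // Probe for an unused slot; wrap around if the index passes the end of the table
    while(hashTable[index] != "")
        index = (index + indexH) % hashTableSize;
    hashTable[index] = s;
}
Search
void search(string s)
{
    // Compute the index and the step using the two hash functions
    int index = hashFunc1(s);
    int indexH = hashFunc2(s);
    // Probe until the element or an unused slot is found; wrap around past the end of the table
    while(hashTable[index] != s && hashTable[index] != "")
        index = (index + indexH) % hashTableSize;
    // Is the element present in the hash table?
    if(hashTable[index] == s)
        cout << s << " is found!" << endl;
    else
        cout << s << " is not found!" << endl;
}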
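A self-contained sketch of double hashing is given below. Everything specific in it is an assumption made for illustration and does not come from the slides: the class name DoubleHashTable, the two hash functions (a sum of character codes for the index and a polynomial hash for the step), a prime table size so that every non-zero step visits every slot, and bounded loops that report failure instead of looping forever. Stored strings are assumed to be non-empty.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class DoubleHashTable {
public:
    // Assumption: size is a prime number (at least 3), so every step in the
    // range 1 .. size-1 eventually visits every slot.
    explicit DoubleHashTable(std::size_t size) : slots_(size) {}

    // Returns the slot that was used, or -1 if no unused slot was found.
    int insert(const std::string& s) {
        std::size_t index = hash1(s);
        std::size_t step = hash2(s);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[index].empty()) {
                slots_[index] = s;
                return static_cast<int>(index);
            }
            index = (index + step) % slots_.size();
        }
        return -1;
    }

    // Follows the same probe sequence until s or an unused slot is found.
    bool contains(const std::string& s) const {
        std::size_t index = hash1(s);
        std::size_t step = hash2(s);
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[index] == s) return true;
            if (slots_[index].empty()) return false;
            index = (index + step) % slots_.size();
        }
        return false;
    }

private:
    // Hypothetical first hash: sum of character codes, gives the starting index.
    std::size_t hash1(const std::string& s) const {
        std::size_t sum = 0;
        for (unsigned char c : s) sum += c;
        return sum % slots_.size();
    }

    // Hypothetical second hash: polynomial hash mapped to 1 .. size-1, never 0.
    std::size_t hash2(const std::string& s) const {
        std::size_t h = 0;
        for (unsigned char c : s) h = h * 31 + c;
        return 1 + h % (slots_.size() - 1);
    }

    std::vector<std::string> slots_;  // an empty string marks an unused slot
};
```

In a table of size 23, the strings "ab", "ba" and "mm" all start at index 11, but their steps differ (4, 12 and 13), so they end up in slots 11, 0 and 1 rather than in one cluster.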
