
DS Module 5 Hashing

Data Structure BCS304 Module 5 PPT Hashing

Uploaded by

ashwiniiseait

DS Module 5

Hashing
Priority queues
Hashing
• Hashing is a technique that is used to uniquely identify a
specific object from a group of similar objects.
Some examples of how hashing is used in our lives include:
• In universities, each student is assigned a unique roll number
that can be used to retrieve information about them.
• In libraries, each book is assigned a unique number that can
be used to determine information about the book, such as its
exact position in the library or the users it has been issued to
etc.
• In both these examples the students and books were hashed
to a unique number.
Hashing
• Hashing is a technique for mapping keys and their associated values
into a hash table by using a hash function.
• It is done for faster access to elements.
• The hash function maps a given key to an index in the table, so that
the element can be located directly.
• In a hash table, data is stored in an array format, where each data
value has its own index value.
• A hash function converts a given numeric or alphanumeric key to a
small, practical integer value.
• When the hash function computes the same index for several
strings (a collision), those strings can be stored in the hash
table in list format.
• Because the index of all these strings is the same, you can
create a list at that index and insert all the strings into
that list.
Types of Hash functions

1. Division Method.
2. Mid Square Method.
3. Folding Method.
4. Digit Analysis
5. Converting keys to integers
Division Method
• This is the simplest method for generating a hash value. The
hash function divides the key k by N and uses the remainder
obtained.
• Formula: h(k) = k mod N
• Here, k is the key value, and N is the size of the hash table.
• N is best chosen to be a prime number, since that helps
distribute the keys more uniformly. The hash function depends
only on the remainder of a division.
• k = 12345, N = 95
h(12345) = 12345 mod 95 = 90
• k = 1276, N = 11
h(1276) = 1276 mod 11 = 0
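The division method can be sketched in C as follows. The function name hash_division and the choice of unsigned int for keys are assumptions made for illustration; they do not come from the slides.

```c
#include <assert.h>

/* Division-method hash: the remainder of the key divided by the
   table size. table_size must be non-zero; a prime is preferred. */
unsigned int hash_division(unsigned int key, unsigned int table_size)
{
    assert(table_size > 0);   /* the remainder is undefined for 0 */
    return key % table_size;
}
```

With the values from the examples above, hash_division(12345, 95) gives 90 and hash_division(1276, 11) gives 0.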
Mid Square Method
The mid-square method involves two steps to compute the hash
value:
1. Square the value of the key k, i.e. compute k^2.
2. Extract the middle r digits as the hash value.

Formula: h(k) = middle r digits of (k x k)

• Here, k is the key value. The value of r can be decided based on
the size of the table.
Example: Suppose the hash table has 100 memory locations. Then
r = 2, because two digits are required to map the key to a
memory location. For k = 60,
k x k = 60 x 60 = 3600
h(60) = 60 (the middle two digits of 3600)
The hash value obtained is 60
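A minimal sketch of the mid-square method in C is shown below. The name hash_mid_square is hypothetical. The slides do not say what to do when the middle r digits cannot be centred exactly, so this sketch assumes the extra digit is left on the high-order side, and it returns the whole square when the square has r digits or fewer.

```c
#include <assert.h>

/* Mid-square hash: square the key and keep the middle r decimal digits. */
unsigned int hash_mid_square(unsigned int key, unsigned int r)
{
    unsigned long long square = (unsigned long long)key * key;
    unsigned long long tmp = square;
    unsigned long long modulus = 1;
    unsigned int digits = 0, drop, i;

    assert(r >= 1 && r <= 9);

    /* Count the decimal digits of the square. */
    do {
        digits++;
        tmp /= 10;
    } while (tmp > 0);

    if (digits <= r)
        return (unsigned int)square;   /* nothing to trim */

    /* Drop the low-order digits to the right of the middle window
       (assumption: round down when the window cannot be centred). */
    drop = (digits - r) / 2;
    for (i = 0; i < drop; i++)
        square /= 10;

    /* Keep the r lowest remaining digits. */
    for (i = 0; i < r; i++)
        modulus *= 10;
    return (unsigned int)(square % modulus);
}
```

For the example above, hash_mid_square(60, 2) squares 60 to 3600 and returns 60.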
Folding Method
This method involves two steps:
1. Divide the key k into a number of parts k1, k2, k3, ..., kn,
where each part has the same number of digits except for the
last part, which can have fewer digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring
the last carry, if any.

Formula: k = k1, k2, k3, k4, ..., kn
h(k) = s = k1 + k2 + k3 + k4 + ... + kn
Example:
• k = 12345
• k1 = 12, k2 = 34, k3 = 5
• h(k) = s = k1 + k2 + k3 = 12 + 34 + 5 = 51
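The folding steps can be sketched in C as below. The name hash_fold and the part_digits parameter are assumptions for illustration. "Ignoring the last carry" is read here as keeping only the lowest part_digits digits of the sum, which is one common interpretation rather than something the slides spell out.

```c
#include <assert.h>
#include <stdio.h>

/* Folding hash: split the decimal digits of key, from the left, into
   parts of part_digits digits (the last part may be shorter), add
   the parts, and drop any carry beyond part_digits digits. */
unsigned int hash_fold(unsigned int key, unsigned int part_digits)
{
    char digits[16];
    unsigned int sum = 0, part = 0, count = 0, modulus = 1, i;
    int len;

    assert(part_digits >= 1 && part_digits <= 4);

    len = snprintf(digits, sizeof digits, "%u", key);
    for (i = 0; i < (unsigned int)len; i++) {
        part = part * 10 + (unsigned int)(digits[i] - '0');
        /* Close the part when it is full or the key is exhausted. */
        if (++count == part_digits || i + 1 == (unsigned int)len) {
            sum += part;
            part = 0;
            count = 0;
        }
    }

    for (i = 0; i < part_digits; i++)
        modulus *= 10;
    return sum % modulus;   /* discard the final carry, if any */
}
```

For the example above, hash_fold(12345, 2) adds 12 + 34 + 5 and returns 51. For 98765 the parts sum to 179, and dropping the carry leaves 79.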
Digit Analysis
• Digit analysis is used with static files.
• A static file is one in which all the identifiers are known in
advance.
• Using this method, we first transform the identifiers into
numbers using some radix, r.
• We then examine the digits of each identifier, deleting those
digits that have the most skewed distributions.
• We continue deleting digits until the number of remaining
digits is small enough to give an address in the range of the
hash table.
• The digits used to calculate the hash address must be the
same for all identifiers and must not have abnormally high
peaks or valleys (the standard deviation must be small).
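One step of digit analysis, finding the digit position with the most skewed distribution, might be sketched in C as below. This is only an illustration under stated assumptions: the identifiers are already converted to decimal digit strings of equal width, and skew is measured as the sum of squared deviations of the digit counts from a uniform distribution (scaled by 10 to stay in integers). The names position_skew and most_skewed_position are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Skew of one digit position: sum over the digits 0..9 of
   (10 * count - n)^2, which is zero for a perfectly uniform column. */
static unsigned long position_skew(const char *const keys[], size_t n,
                                   size_t pos)
{
    long counts[10] = {0};
    unsigned long skew = 0;
    size_t i;
    int d;

    for (i = 0; i < n; i++)
        counts[keys[i][pos] - '0']++;
    for (d = 0; d < 10; d++) {
        long diff = 10 * counts[d] - (long)n;
        skew += (unsigned long)(diff * diff);
    }
    return skew;
}

/* Return the index of the digit position with the largest skew,
   i.e. the first candidate for deletion. Ties go to the leftmost. */
size_t most_skewed_position(const char *const keys[], size_t n,
                            size_t width)
{
    size_t pos, worst = 0;
    unsigned long worst_skew = 0;

    assert(n > 0 && width > 0);
    for (pos = 0; pos < width; pos++) {
        unsigned long s = position_skew(keys, n, pos);
        if (s > worst_skew) {
            worst_skew = s;
            worst = pos;
        }
    }
    return worst;
}
```

For the keys "1203", "1317", "1428" and "1539", every key has the digit 1 in position 0, so that position is reported as the most skewed and would be deleted first.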
Converting keys to integers
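• To use a hash function, it is necessary to convert keys into
non-negative integers (these need not be unique).
• Each character maps to an integer in the range 0 to 255.

unsigned int stringToInt(char *key)
{
    /* Additive conversion: sum the character codes of the key. */
    unsigned int number = 0;
    while (*key)
        number += (unsigned char)*key++;   /* keep each character in 0..255 */
    return number;
}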
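The additive conversion can be combined with the division method to hash a string directly into a bucket. The sketch below assumes the hypothetical names string_to_int and hash_string; it illustrates the idea and is not code from the slides.

```c
#include <assert.h>

/* Sum the character codes of a string (additive transformation). */
unsigned int string_to_int(const char *key)
{
    unsigned int number = 0;
    while (*key)
        number += (unsigned char)*key++;
    return number;
}

/* Hash a string into one of `buckets` buckets: additive
   transformation followed by the division method. */
unsigned int hash_string(const char *key, unsigned int buckets)
{
    assert(buckets > 0);
    return string_to_int(key) % buckets;
}
```

With ASCII character codes and 13 buckets, the identifier "for" sums to 102 + 111 + 114 = 327 and lands in bucket 2, while "if" and "function" both land in bucket 12, which is a collision.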
Overflow handling
Two ways to handle overflows:
1. Open addressing
2. Chaining

The four open addressing methods are:
1. Linear probing or linear open addressing
2. Quadratic probing
3. Rehashing
4. Random probing
Example: Additive transformation
Identifier   Additive transformation                          x     Hash (x mod 13)
for          102 + 111 + 114                                  327   2
do           100 + 111                                        211   3
while        119 + 104 + 105 + 108 + 101                      537   4
if           105 + 102                                        207   12
else         101 + 108 + 115 + 101                            425   9
function     102 + 117 + 110 + 99 + 116 + 105 + 111 + 110     870   12
• Hash table with linear probing (13 buckets, one slot per bucket)

[0] function
[1]
[2] for
[3] do
[4] while
[5]
[6]
[7]
[8]
[9] else
[10]
[11]
[12] if
Linear probing
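element *search(int k)
{
    /* Search the linear-probing hash table ht for key k. Return a
       pointer to the element, or NULL if k is not in the table. */
    int hBucket, cBucket;
    hBucket = h(k);                       /* home bucket of k */
    for (cBucket = hBucket; ht[cBucket] && ht[cBucket]->key != k; )
    {
        cBucket = (cBucket + 1) % b;      /* treat the table as circular */
        if (cBucket == hBucket)
            return NULL;                  /* back at the start: k is absent */
    }
    /* The loop stops at an empty bucket (NULL) or at the bucket holding k. */
    return ht[cBucket];
}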
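Insertion with linear probing follows the same probe sequence as the search routine above. The sketch below is a minimal, self-contained illustration. The element type with a single int key, the table size of 13, the hash h(k) = k mod 13, and the function name insert are all assumptions chosen to match the 13-bucket example.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 13               /* number of buckets (assumed) */

typedef struct { int key; } element;

element *ht[TABLE_SIZE];            /* NULL marks an empty bucket */

/* Division-method hash, adjusted so negative keys also map to 0..12. */
int h(int k)
{
    return ((k % TABLE_SIZE) + TABLE_SIZE) % TABLE_SIZE;
}

/* Insert e using linear probing. Returns 1 on success and 0 if the
   key is already present or the table is full. */
int insert(element *e)
{
    int home, cur;

    assert(e != NULL);
    home = h(e->key);
    cur = home;
    while (ht[cur] != NULL) {
        if (ht[cur]->key == e->key)
            return 0;               /* duplicate key */
        cur = (cur + 1) % TABLE_SIZE;
        if (cur == home)
            return 0;               /* wrapped around: table is full */
    }
    ht[cur] = e;                    /* first empty bucket on the probe path */
    return 1;
}
```

Inserting the sums 207 (if) and then 870 (function) from the additive transformation example places 207 in bucket 12; 870 also hashes to 12, finds it occupied, and wraps around to bucket 0, as in the table shown earlier.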
Chaining

• When the hash function computes the same index
for several strings, those strings are stored
in the hash table in linked list format.
• Because the index of all these strings is the same, you
can create a list at that index and insert all the
strings into that list.
Chain Search
element *search(int k)
{
    /* Walk the chain of the home bucket and compare keys. */
    nodePointer current;
    int hBucket = h(k);
    for (current = ht[hBucket]; current; current = current->link)
        if (current->data.key == k)
            return &current->data;
    return NULL;                    /* k is not in the chain */
}
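Insertion into a chained hash table can be sketched as below. The type names element and nodePointer mirror the search routine above, but their definitions, the 13-bucket table, and the names chain_table, chain_hash and chain_insert are assumptions for illustration. New nodes are added at the head of the chain, which is one simple choice among several.

```c
#include <assert.h>
#include <stdlib.h>

#define BUCKETS 13                        /* assumed table size */

typedef struct { int key; } element;
typedef struct node *nodePointer;
struct node {
    element data;
    nodePointer link;
};

nodePointer chain_table[BUCKETS];         /* NULL marks an empty chain */

int chain_hash(int k)
{
    return ((k % BUCKETS) + BUCKETS) % BUCKETS;
}

/* Insert e at the head of its bucket's chain. Returns 1 on success,
   0 if the key already exists or memory allocation fails. */
int chain_insert(element e)
{
    int bucket = chain_hash(e.key);
    nodePointer cur, fresh;

    assert(bucket >= 0 && bucket < BUCKETS);
    for (cur = chain_table[bucket]; cur; cur = cur->link)
        if (cur->data.key == e.key)
            return 0;                     /* duplicate key */

    fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return 0;                         /* out of memory */
    fresh->data = e;
    fresh->link = chain_table[bucket];    /* old head becomes second node */
    chain_table[bucket] = fresh;
    return 1;
}
```

Unlike linear probing, two keys that hash to the same bucket, such as 207 and 870 with 13 buckets, simply share one chain, so an overflow never spills into neighbouring buckets.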
Example of hash chain

[0]  -> acos -> atoi -> atol
[1]  -> NULL
[2]  -> char -> ceil -> cos -> ctime
[3]  -> define
[4]  -> exp
[5]  -> float -> floor
[6]  -> NULL
...
[26] -> NULL

Single ended and Double ended priority queues
• A priority queue is a collection of elements in which each
element has an associated priority.

Operations supported by single ended priority queues are
SP1: Return an element with minimum priority
SP2: Insert an element with an arbitrary priority
SP3: Delete an element with minimum priority

Operations supported by double ended priority queues are
DP1: Return an element with minimum priority
DP2: Return an element with maximum priority
DP3: Insert an element with an arbitrary priority
DP4: Delete an element with minimum priority
DP5: Delete an element with maximum priority
