Unit-1-1

The document discusses various searching algorithms, including linear and binary search, detailing their processes and time complexities. It introduces hashing as an efficient searching technique with constant time complexity O(1) and explains different hashing methods, such as static and dynamic hashing, along with hash functions. Additionally, it covers collision resolution techniques, particularly separate chaining and open addressing, to handle cases where multiple keys map to the same hash table index.

Uploaded by

Sanika Deshmukh
Copyright
© © All Rights Reserved

UNIT 5

Searching
What is Search?
• Searching is the process of finding the position of a given value in a list of values.
Linear Search Algorithm (Sequential Search)
• The linear search algorithm finds a given element in a list with O(n) time complexity, where n is the total number of elements in the list.
• The search starts by comparing the search element with the first element in the list.
• If they match, the result is "element found". Otherwise, the search element is compared with the next element in the list.
• This is repeated until the search element has been compared with the last element in the list. If the last element also does not match, the result is "Element not found in the list".
• In other words, the search element is compared with the list element by element.
Algorithm
Step 1: Read the search element from the user.
Step 2: Compare the search element with the first element in the list.
Step 3: If they match, display "Given element found!!!" and terminate the function.
Step 4: If they do not match, compare the search element with the next element in the list.
Step 5: Repeat steps 3 and 4 until the search element has been compared with the last element in the list.
Step 6: If the last element also does not match, display "Element not found!!!" and terminate the function.
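The linear search steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `linear_search` and the sample list are assumptions made for the example, and the function returns an index (or -1) instead of displaying a message.

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is not present."""
    for index, value in enumerate(items):
        # Compare the search element with each element in turn.
        if value == target:
            return index
    # Every element was compared and none matched.
    return -1


# Hypothetical sample data, not taken from the slides.
data = [65, 20, 10, 55, 32, 12, 50, 99]
print(linear_search(data, 12))  # 5
print(linear_search(data, 7))   # -1
```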
Binary Search Algorithm
• The binary search algorithm finds a given element in a list with O(log n) time complexity, where n is the total number of elements in the list.
• Binary search can be used only on a sorted list, that is, a list whose elements are already arranged in order. It cannot be used on a list whose elements are in random order.
• The search starts by comparing the search element with the middle element in the list.
• If they match, the result is "element found". Otherwise, we check whether the search element is smaller or larger than the middle element.
• If the search element is smaller, we repeat the same process on the left sublist of the middle element.
• If the search element is larger, we repeat the same process on the right sublist of the middle element.
• We repeat this process until we find the search element or until we are left with a sublist of only one element.
• If that element also does not match the search element, the result is "Element not found in the list".
Algorithm
Step 1: Read the search element from the user.
Step 2: Find the middle element in the sorted list.
Step 3: Compare the search element with the middle element in the sorted list.
Step 4: If they match, display "Given element found!!!" and terminate the function.
Step 5: If they do not match, check whether the search element is smaller or larger than the middle element.
Step 6: If the search element is smaller than the middle element, repeat steps 2, 3, 4 and 5 for the left sublist of the middle element.
Step 7: If the search element is larger than the middle element, repeat steps 2, 3, 4 and 5 for the right sublist of the middle element.
Step 8: Repeat the same process until the search element is found or until the sublist contains only one element.
Step 9: If that element also does not match the search element, display "Element not found in the list!!!" and terminate the function.
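The binary search steps can be sketched iteratively in Python. This is one possible rendering under stated assumptions: the name `binary_search` and the sample list are hypothetical, and the loop tracks the current sublist with two indices instead of creating new lists.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2          # middle of the current sublist
        if sorted_items[mid] == target:
            return mid
        if target < sorted_items[mid]:
            high = mid - 1               # continue in the left sublist
        else:
            low = mid + 1                # continue in the right sublist
    return -1


# Hypothetical sorted sample data, not taken from the slides.
data = [10, 12, 20, 32, 50, 55, 65, 80, 99]
print(binary_search(data, 12))  # 1
print(binary_search(data, 7))   # -1
```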
HASH TABLES
In data structures,
• There are several searching techniques, such as linear search, binary search and search trees.
• In these techniques, the time taken to search for a particular element depends on the total number of elements.

Example
• Linear search takes O(n) time to search an unsorted array of n elements.
• Binary search takes O(log n) time to search a sorted array of n elements.
• Searching a balanced binary search tree of n elements takes O(log n) time.

Drawback
The main drawback of these techniques is
• As the number of elements increases, the time taken to perform the search also increases.
• This becomes problematic when the total number of elements becomes very large.
HASHING
• Hashing is a well-known technique for searching for a particular element among several elements.
• It minimizes the number of comparisons performed during the search.

ADVANTAGE
Unlike the other searching techniques,
• Hashing is extremely efficient.
• On average, the time taken to perform the search does not depend on the total number of elements.
• It completes the search with constant average time complexity O(1), provided collisions are rare.
HASHING MECHANISM
In hashing,
• An array data structure called a hash table is used to store the data items.
• Data items are inserted into the hash table based on their hash key value.
TYPES OF HASHING
There are two types of hashing.
1. Static hashing
2. Dynamic hashing
STATIC HASHING
Static hashing is a hashing technique in which keys are stored in a hash table of fixed size.

DYNAMIC HASHING
In dynamic hashing, the hash function is modified dynamically as the number of records grows.
HASH FUNCTION
A hash function is a function that maps any large number or string to a small integer value.
• The hash function takes the data item as input and returns a small integer value as output.
• This small integer value is called the hash value.
• The hash value of the data item is then used as the index at which it is stored in the hash table.
TYPES OF HASH FUNCTIONS
• Mid Square Hash Function
• Division Hash Function
• Folding Hash Function, etc.

The choice of hash function is left to the user.

PROPERTIES OF HASH FUNCTION
The properties of a good hash function are
• It is efficiently computable.
• It minimizes the number of collisions.
• It distributes the keys uniformly over the table.
Bucket and Home Bucket
The hash function H(key) is used to map dictionary entries into the hash table. Each position of the hash table is called a bucket. The bucket H(key) is the home bucket for the dictionary pair whose key is key.

HASH TABLE
INDEX   KEY VALUE
5
4       15
3
2       10
1
0

In the above hash table, locations 2 and 4 are home buckets (of the keys 10 and 15, respectively), and locations 0, 1, 3 and 5 are buckets.
Division hash function method
The hash function depends on the remainder of a division. Typically the divisor is the table length.
SYNTAX OR FORMULA
H(key) = key % table size
EXAMPLE:
Insert the records 54, 72, 89, 37 into a hash table of size 10 (indices 0 to 9) using the division hash function.
H(54) = 54 % 10 = 4, so the record 54 is inserted at location 4.
H(72) = 72 % 10 = 2, so the record 72 is inserted at location 2.
H(89) = 89 % 10 = 9, so the record 89 is inserted at location 9.
H(37) = 37 % 10 = 7, so the record 37 is inserted at location 7.
The following hash table shows the result of inserting the records 54, 72, 89, 37.
INDEX   KEY
9       89
8
7       37
6
5
4       54
3
2       72
1
0
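The division method can be sketched in a few lines of Python. The helper names `division_hash` and `insert_all` are assumptions made for this illustration, and the sketch deliberately ignores collisions, which the example keys do not produce.

```python
def division_hash(key, table_size):
    """Division method: the index is the remainder of key / table_size."""
    return key % table_size


def insert_all(keys, table_size):
    """Place each key at its home bucket (no collision handling in this sketch)."""
    table = [None] * table_size
    for key in keys:
        table[division_hash(key, table_size)] = key
    return table


table = insert_all([54, 72, 89, 37], 10)
for index in range(9, -1, -1):
    print(index, table[index])
```

With the four example records, indices 2, 4, 7 and 9 hold 72, 54, 37 and 89, and every other bucket stays empty.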
MID SQUARE HASH FUNCTION

In the mid square method, the key is squared and the middle part of the result is used as the index.

SYNTAX OR FORMULA
H(key) = middle digits of K^2

EXAMPLE:
The records 3111, 3112 and 3113 are inserted into a hash table. Assume that the hash table size is 1000, so the middle three digits of K^2 form the index.

For the record 3111:
K^2 = (3111)^2 = 9678321
783 is the middle part of 9678321. So, 783 is the index of 3111.

For the record 3112:
K^2 = (3112)^2 = 9684544
845 is the middle part of 9684544. So, 845 is the index of 3112.

For the record 3113:
K^2 = (3113)^2 = 9690769
907 is the middle part of 9690769. So, 907 is the index of 3113.

The resulting entries in the hash table (indices 0 to 999):
KEY     INDEX
3111    783
3112    845
3113    907
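A small Python sketch of the mid square method follows. The function name `mid_square_hash` and its `digits` parameter are assumptions made for the illustration. The slides do not say which digits to take when the square has no exact middle, so this sketch simply leans toward the left in that case.

```python
def mid_square_hash(key, digits=3):
    """Square the key and return its middle `digits` digits as the index."""
    squared = str(key * key)
    # Start position of the middle slice; integer division leans left
    # when the number of leftover digits is odd (an assumption).
    start = (len(squared) - digits) // 2
    return int(squared[start:start + digits])


for key in (3111, 3112, 3113):
    print(key, mid_square_hash(key))
# 3111 783
# 3112 845
# 3113 907
```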
MULTIPLICATIVE HASH FUNCTION
The given key is multiplied by a constant value. The formula for computing the hash key is
H(key) = floor(P * (fractional part of key * A))
where P is an integer constant and A is a real constant with 0 < A < 1.
Donald Knuth suggested using the constant A = 0.6180339887.

EXAMPLE:
Insert the records 107, 108, 109, 110 into a hash table. Here P = 50, so the indices range from 0 to 49.
For 107:
key * A = 107 * 0.6180339887 = 66.129637, fractional part = 0.129637
H(107) = floor(50 * 0.129637) = floor(6.4818) = 6
For 108:
key * A = 108 * 0.6180339887 = 66.747671, fractional part = 0.747671
H(108) = floor(50 * 0.747671) = floor(37.3835) = 37
For 109:
key * A = 109 * 0.6180339887 = 67.365705, fractional part = 0.365705
H(109) = floor(50 * 0.365705) = floor(18.2852) = 18
For 110:
key * A = 110 * 0.6180339887 = 67.983739, fractional part = 0.983739
H(110) = floor(50 * 0.983739) = floor(49.1869) = 49

The resulting entries in the hash table (indices 0 to 49):
KEY     INDEX
107     6
108     37
109     18
110     49
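The formula H(key) = floor(P * (fractional part of key * A)) can be checked with a short Python sketch. The function name `multiplicative_hash` is an assumption, and A is taken as (sqrt(5) - 1) / 2, approximately 0.6180339887, the constant attributed to Knuth. Because only the fractional part of key * A is used, every index falls in the range 0 to P - 1.

```python
import math

A = 0.6180339887  # approximately (sqrt(5) - 1) / 2


def multiplicative_hash(key, p):
    """Return floor(p * frac(key * A)), an index in the range 0 .. p - 1."""
    fractional = (key * A) % 1.0   # keep only the fractional part
    return math.floor(p * fractional)


for key in (107, 108, 109, 110):
    print(key, multiplicative_hash(key, 50))
# 107 6
# 108 37
# 109 18
# 110 49
```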
DIGIT FOLDING OR FOLDING HASH FUNCTION

The key value is divided into separate parts, and these parts are combined using some simple operation to produce the hash key.

EXAMPLE:

Consider the record 12365412. It is divided into the separate parts 123, 654 and 12, and these are all added together.

H(key) = 123 + 654 + 12 = 789

The record 12365412 will be placed at location 789 in the hash table.
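The folding step can be sketched in Python as below. The name `folding_hash` and the `part_size` parameter are assumptions made for this illustration. The sketch only adds the parts; in practice the sum is usually also reduced modulo the table size so that it always fits the table.

```python
def folding_hash(key, part_size=3):
    """Split the decimal digits of key into groups of part_size and add them."""
    digits = str(key)
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    return sum(parts)


print(folding_hash(12365412))  # 123 + 654 + 12 = 789
```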
COLLISION RESOLUTION TECHNIQUE
• Hashing is a well-known searching technique.
• It minimizes the number of comparisons performed during the search.
• On average, it completes the search with constant time complexity O(1).

COLLISION
When the hash value of a key maps to an already occupied bucket of the hash table, it is called a collision.

When a collision occurs, it must be handled by applying some technique. Such techniques are called collision resolution techniques.

The goal of collision resolution techniques is to handle collisions while keeping their cost low. There are two methods of handling collisions.
COLLISION RESOLUTION TECHNIQUES

1. Open hashing or separate chaining

2. Closed hashing or open addressing

SEPARATE CHAINING OR OPEN HASHING
To handle a collision,
• This technique creates a linked list at the slot where the collision occurs.
• The new key is then inserted into the linked list.
• These linked lists attached to the slots look like chains.
• That is why this technique is called separate chaining.

Time Complexity
For Searching
• In the worst case, all the keys might map to the same bucket of the hash table.
• In such a case, all the keys will be present in a single linked list.
• A sequential search will have to be performed on the linked list.
• So, the time taken for searching in the worst case is O(n).
For Deletion
• In the worst case, the key has to be searched for first and then deleted.
• In the worst case, the time taken for searching is O(n).
• So, the time taken for deletion in the worst case is O(n).

Load Factor (α)

The load factor (α) is defined as
α = number of keys stored in the hash table / number of buckets in the hash table

If the load factor (α) is constant, then the expected time complexity of Insert, Search and Delete is Θ(1).
PRACTICE PROBLEM BASED ON SEPARATE CHAINING

Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table.

50, 700, 76, 85, 92, 73 and 101

Use the separate chaining technique for collision resolution.

SOLUTION

The given sequence of keys will be inserted in the hash table as follows.

STEP-01:
• Draw an empty hash table.
• For the given hash function, the possible range of hash values is [0, 6].
• So, draw an empty hash table consisting of 7 buckets.

STEP-02:
• Insert the given keys in the hash table one by one.
• The first key to be inserted in the hash table = 50.
• Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
• So, key 50 will be inserted in bucket-1 of the hash table.

STEP-03:
• The next key to be inserted in the hash table = 700.
• Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
• So, key 700 will be inserted in bucket-0 of the hash table.

STEP-04:
• The next key to be inserted in the hash table = 76.
• Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
• So, key 76 will be inserted in bucket-6 of the hash table.

STEP-05:
• The next key to be inserted in the hash table = 85.
• Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• Separate chaining handles the collision by creating a linked list at bucket-1.
• So, key 85 will be inserted in the linked list of bucket-1.

STEP-06:
• The next key to be inserted in the hash table = 92.
• Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
• Since bucket-1 is already occupied, a collision occurs.
• Separate chaining handles the collision by adding the key to the linked list at bucket-1.
• So, key 92 will be inserted in the linked list of bucket-1.

STEP-07:
• The next key to be inserted in the hash table = 73.
• Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
• So, key 73 will be inserted in bucket-3 of the hash table.

STEP-08:
• The next key to be inserted in the hash table = 101.
• Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
• Since bucket-3 is already occupied, a collision occurs.
• Separate chaining handles the collision by creating a linked list at bucket-3.
• So, key 101 will be inserted in the linked list of bucket-3.
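The practice problem can be replayed with a short Python sketch. As a simplifying assumption, each chain is a Python list standing in for a linked list, and new keys are appended at the end of the chain; the names `build_chained_table` and `chained_search` are hypothetical.

```python
def build_chained_table(keys, size):
    """Insert keys using 'key mod size', chaining colliding keys in one bucket."""
    table = [[] for _ in range(size)]   # one (initially empty) chain per bucket
    for key in keys:
        table[key % size].append(key)   # a collision simply extends the chain
    return table


def chained_search(table, key):
    """Sequentially search the chain of the bucket the key maps to."""
    return key in table[key % len(table)]


table = build_chained_table([50, 700, 76, 85, 92, 73, 101], 7)
for bucket, chain in enumerate(table):
    print(bucket, chain)
# 0 [700]
# 1 [50, 85, 92]
# 2 []
# 3 [73, 101]
# 4 []
# 5 []
# 6 [76]
```

Searching for 92 only scans the chain of bucket 1, which is why the cost of a search grows with the length of the longest chain.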
OPEN ADDRESSING OR CLOSED HASHING
In open addressing,
• Unlike separate chaining, all the keys are stored inside the hash table.
• No key is stored outside the hash table.

TECHNIQUES USED FOR OPEN ADDRESSING:
• Linear Probing
• Quadratic Probing
• Double Hashing

OPERATIONS IN OPEN ADDRESSING
• Insert Operation
• Search Operation
• Delete Operation
INSERT OPERATION
• The hash function is used to compute the hash value of the key to be inserted.
• The hash value is then used as an index to store the key in the hash table.

IN CASE OF COLLISION,
• Probing is performed until an empty bucket is found.
• Once an empty bucket is found, the key is inserted.
• Probing is performed in accordance with the technique used for open addressing.
SEARCH OPERATION
To search for a particular key,
• Its hash value is obtained using the hash function.
• Using the hash value, the corresponding bucket of the hash table is checked.
• If the required key is found there, the search succeeds.
• Otherwise, the subsequent buckets are checked until the required key or an empty bucket is found.
• An empty bucket indicates that the key is not present in the hash table.

DELETE OPERATION
• The key is first searched for and then deleted.
• After deleting the key, that particular bucket is marked as “deleted”, so that later searches continue probing past it instead of stopping there.
OPEN ADDRESSING TECHNIQUES

1. Linear Probing
• When a collision occurs, we linearly probe for the next bucket.
• We keep probing until an empty bucket is found.

ADVANTAGE
• It is easy to compute.

DISADVANTAGE
• The main problem with linear probing is clustering.
• Many consecutive elements form groups.
• As a result, it takes longer to search for an element or to find an empty bucket.
TIME COMPLEXITY

The worst-case time to search for an element with linear probing is O(table size).

This is because even if only one element is present and all the other elements have been deleted, the “deleted” markers left in the hash table can force the search to scan the entire table.
EXAMPLE:

Consider that the following keys are to be inserted in the hash table: 131, 4, 8, 7, 21, 5, 31, 61, 9, 29. The hash table size is 10.

Initially we will put the keys 131, 4, 8, 7 in the hash table. We will use the division hash function, which means that keys are placed using the formula

H(key) = key % table size

For instance, the element 131 is placed at H(131) = 131 % 10 = 1. Index 1 will be the home bucket for 131. Continuing in this fashion, we will place 4, 8 and 7.

The next key to be inserted is 21. According to the hash function, H(21) = 21 % 10 = 1.
• But index 1 is already occupied by 131, i.e., a collision occurs. To resolve this collision we move linearly down from index 1 until an empty location is found.
• Therefore 21 will be placed at index 2.
• If the next element is 5, we get index 5 as its home bucket. This bucket is empty, so we put the element 5 at index 5.

The keys 31 and 61 also hash to index 1. Probing linearly, 31 is placed at index 3 and 61 at index 6.

The next record key is 9. According to the division hash function, its home bucket is 9. Hence we will place 9 at index 9.

• The final record key is 29, and it hashes to index 9. But home bucket 9 is already occupied, and there is no next bucket because the table ends at index 9. To handle this overflow we wrap around to bucket 0; since that location is empty, 29 will be placed at index 0.

The hash table after each stage of insertion:
INDEX   After 7   After 5   After 61   After 9   After 29
0       Null      Null      Null       Null      29
1       131       131       131        131       131
2       Null      21        21         21        21
3       Null      Null      31         31        31
4       4         4         4          4         4
5       Null      5         5          5         5
6       Null      Null      61         61        61
7       7         7         7          7         7
8       8         8         8          8         8
9       Null      Null      Null       9         9
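The linear probing example can be reproduced with the following Python sketch. The function name `linear_probe_insert` is an assumption, `None` stands for a Null bucket, and deletion markers are left out to keep the sketch short.

```python
def linear_probe_insert(table, key):
    """Insert key with linear probing and return the index where it lands."""
    size = len(table)
    home = key % size                      # home bucket from the division method
    for step in range(size):
        index = (home + step) % size       # wrap around past the last bucket
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("hash table is full")


table = [None] * 10
for key in [131, 4, 8, 7, 21, 5, 31, 61, 9, 29]:
    linear_probe_insert(table, key)
print(table)  # [29, 131, 21, 31, 4, 5, 61, 7, 8, 9]
```

The modulo in the probe index is what moves 29 from the occupied bucket 9 back to bucket 0.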
QUADRATIC PROBING
Quadratic probing operates by taking the original hash value and adding successive values of a quadratic polynomial to the starting value.

This method uses the following formula.

H_i(key) = (H(key) + i^2) % m

where m can be the table size or any prime number, and i = 0, 1, 2, ... is the probe number.

EXAMPLE: Insert the following elements into a hash table of size 10.

37, 19, 55, 22, 17, 49, 87

Initially we will put the keys 37, 19, 55, 22 into the hash table at indices 7, 9, 5 and 2.

Now, placing 17 causes a collision: H(17) = 17 % 10 = 7, but bucket 7 already holds the element 37. Hence we apply quadratic probing to insert this record in the hash table.

H_i(key) = (H(key) + i^2) % m
i = 0: (7 + 0^2) % 10 = 7 % 10 = 7
i = 1: (7 + 1^2) % 10 = 8 % 10 = 8
Bucket 8 is empty. Hence we place the element 17 at index 8.

Placing 49 also causes a collision: H(49) = 49 % 10 = 9, and bucket 9 is already occupied by 19. Hence we apply quadratic probing to insert this record in the hash table.
i = 0: (9 + 0^2) % 10 = 9 % 10 = 9
i = 1: (9 + 1^2) % 10 = 10 % 10 = 0
Bucket 0 is empty. Hence the value 49 is inserted at index 0.

To place 87, where H(87) = 87 % 10 = 7, we again use quadratic probing.
i = 0: (7 + 0^2) % 10 = 7 % 10 = 7 (occupied by 37)
i = 1: (7 + 1^2) % 10 = 8 % 10 = 8 (occupied by 17)
i = 2: (7 + 2^2) % 10 = 11 % 10 = 1
Bucket 1 is empty. Hence 87 is inserted at index 1.

The hash table after each stage of insertion:
INDEX   After 22   After 17   After 49   After 87
0                             49         49
1                                        87
2       22         22         22         22
3
4
5       55         55         55         55
6
7       37         37         37         37
8                  17         17         17
9       19         19         19         19
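The quadratic probing example can be replayed with a short Python sketch. The name `quadratic_probe_insert` is an assumption, m is taken to be the table size, and the loop gives up after m probes because a quadratic probe sequence is not guaranteed to reach every bucket.

```python
def quadratic_probe_insert(table, key):
    """Insert key using probes (H(key) + i^2) % m and return the index used."""
    m = len(table)
    home = key % m
    for i in range(m):
        index = (home + i * i) % m
        if table[index] is None:
            table[index] = key
            return index
    raise OverflowError("no free bucket found on the probe sequence")


table = [None] * 10
for key in [37, 19, 55, 22, 17, 49, 87]:
    quadratic_probe_insert(table, key)
print(table)  # [49, 87, 22, None, None, 55, None, 37, 17, 19]
```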
DOUBLE PROBING OR DOUBLE HASHING
Double hashing is a technique in which a second hash function is applied to the key when a collision occurs. The second hash function gives the number of positions to jump from the point of collision to reach the bucket where the key is inserted.

Double hashing uses the following formulas.

H1(key) = key % table size
H2(key) = M - (key % M)

where M is a prime number smaller than the size of the table.

Example: Consider the following elements to be placed in a hash table of size 10.

37, 90, 45, 22, 17, 49, 55

Initially, insert the elements 37, 90, 45, 22 using the formula for H1(key).
37 % 10 = 7
90 % 10 = 0
45 % 10 = 5
22 % 10 = 2

Now if 17 is to be inserted, then
H1(17) = 17 % 10 = 7
Here a collision occurs because position 7 is already occupied by the record 37. So we apply the second hash function to the key.
H2(key) = M - (key % M)
Let the prime number be M = 7.
H2(17) = 7 - (17 % 7) = 7 - 3 = 4 (4 jumps from the collision index)
Taking 4 jumps from index 7 and wrapping around gives (7 + 4) % 10 = 1, so 17 will be placed at index 1.

Next, 49 is inserted at position 9, since 49 % 10 = 9 and that bucket is empty.

Now insert 55.
H1(55) = 55 % 10 = 5
A collision occurs because location 5 is already occupied by 45. So we apply the second hash function.
H2(55) = 7 - (55 % 7) = 7 - 6 = 1 (1 jump from the collision index)
That means we take one jump from index 5 and place 55 at index 6.

The hash table after each stage of insertion:
INDEX   After 22   After 17   After 49   After 55
0       90         90         90         90
1                  17         17         17
2       22         22         22         22
3
4
5       45         45         45         45
6                                        55
7       37         37         37         37
8
9                             49         49
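A Python sketch of double hashing under the same formulas, H1(key) = key % table size and H2(key) = M - (key % M) with M = 7, is given below. The name `double_hash_insert` and the `prime` parameter are assumptions. Repeating the jump when the first jump also lands on an occupied bucket is a common convention that the example never needs, so treat that part as an assumption as well.

```python
def double_hash_insert(table, key, prime=7):
    """Insert key with double hashing and return the index where it lands."""
    size = len(table)
    home = key % size                  # H1(key)
    step = prime - (key % prime)       # H2(key): jump length, always >= 1
    for i in range(size):
        index = (home + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    # With a table size that shares a factor with the step, some buckets
    # are never visited, so the probe sequence can fail before the table is full.
    raise OverflowError("no free bucket found on the probe sequence")


table = [None] * 10
for key in [37, 90, 45, 22, 17, 49, 55]:
    double_hash_insert(table, key)
print(table)  # [90, 17, 22, None, None, 45, 55, 37, None, 49]
```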
SEPARATE CHAINING vs OPEN ADDRESSING
Key storage
• Separate chaining: Keys are stored inside the hash table as well as outside the hash table.
• Open addressing: All the keys are stored only inside the hash table. No key is present outside the hash table.
Number of keys
• Separate chaining: The number of keys to be stored in the hash table can even exceed the size of the hash table.
• Open addressing: The number of keys to be stored in the hash table can never exceed the size of the hash table.
Deletion
• Separate chaining: Deletion is easier.
• Open addressing: Deletion is difficult.
Extra space
• Separate chaining: Extra space is required for the pointers to store the keys outside the hash table.
• Open addressing: No extra space is required.
Cache performance
• Separate chaining: Cache performance is poor. This is because of the linked lists which store the keys outside the hash table.
• Open addressing: Cache performance is better. This is because no linked lists are used.
Bucket usage
• Separate chaining: Some buckets of the hash table are never used, which leads to wastage of space.
• Open addressing: Buckets may be used even if no key maps to those particular buckets.
Rehashing
Rehashing is a technique in which the table is resized: a new table of about double the size is created. It is preferable for the size of the new table to be a prime number. There are situations in which rehashing is required:

i) When the table is completely full.

ii) With quadratic probing, when the table is half full.

iii) When an insertion fails due to overflow.

In such situations, we have to transfer the entries from the old table to the new table, re-inserting each entry using the hash function with the new table size.
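Rehashing can be sketched in Python as follows. Several choices here are assumptions for illustration only: the new size is the first prime at or above twice the old size, collisions in the new table are resolved by linear probing, and the helper names `is_prime`, `next_prime` and `rehash` are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small table sizes."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def next_prime(n):
    """Return the smallest prime greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def rehash(old_table):
    """Move every key into a new table of roughly double, prime size."""
    new_size = next_prime(2 * len(old_table))
    new_table = [None] * new_size
    for key in old_table:
        if key is None:
            continue
        index = key % new_size             # hash again with the new size
        while new_table[index] is not None:
            index = (index + 1) % new_size  # linear probing (assumption)
        new_table[index] = key
    return new_table


old_table = [29, 131, 21, 31, 4, 5, 61, 7, 8, 9]   # a full table of size 10
new_table = rehash(old_table)
print(len(new_table))  # 23
```

Every key is hashed again because its index depends on the table size; for example, 131 moves from index 1 to index 131 % 23 = 16.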
THANK YOU
