Collision Resolution Techniques
• Quadratic Probing
• Double Hashing
• Rehashing
• Algorithms for:
– insert
– find
– withdraw
Open Addressing: Quadratic Probing
• Quadratic probing eliminates primary clusters.
• c(i) is a quadratic function of i of the form c(i) = a*i^2 + b*i. Usually c(i) is chosen
as:
c(i) = i^2 for i = 0, 1, . . . , tableSize – 1
or
c(i) = ±i^2 for i = 0, 1, . . . , (tableSize – 1) / 2
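The two choices of c(i) can be compared by listing the cells each one probes. The following is a minimal sketch, not part of the original slides: the class name QuadraticProbeSequence and the methods plain and alternating are hypothetical, and trying +i^2 before -i^2 is an assumed ordering.

```java
import java.util.ArrayList;
import java.util.List;

class QuadraticProbeSequence {
    // Cells probed with c(i) = i^2 for i = 0 .. tableSize - 1.
    static List<Integer> plain(int home, int tableSize) {
        List<Integer> probes = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            probes.add((home + i * i) % tableSize);
        }
        return probes;
    }

    // Cells probed with c(i) = ±i^2 for i = 0 .. (tableSize - 1) / 2.
    // Assumption: +i^2 is tried before -i^2 for each i >= 1.
    static List<Integer> alternating(int home, int tableSize) {
        List<Integer> probes = new ArrayList<>();
        probes.add(home % tableSize);
        for (int i = 1; i <= (tableSize - 1) / 2; i++) {
            // floorMod keeps the index in [0, tableSize) even when home - i^2 < 0.
            probes.add(Math.floorMod(home + i * i, tableSize));
            probes.add(Math.floorMod(home - i * i, tableSize));
        }
        return probes;
    }

    public static void main(String[] args) {
        System.out.println(plain(1, 7));        // [1, 2, 5, 3, 3, 5, 2]
        System.out.println(alternating(1, 7));  // [1, 2, 0, 5, 4, 3, 6]
    }
}
```

For a table of size 7 and home index 1, the plain sequence repeats cells and never reaches cells 0, 4, and 6, while the alternating sequence reaches all seven cells.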
Quadratic Probing (cont’d)
h_i(key) = (h(key) ± i^2) % 7,  i = 0, 1, 2, 3

h_0(23) = (23 % 7) % 7 = 2
h_0(13) = (13 % 7) % 7 = 6
h_0(21) = (21 % 7) % 7 = 0
h_0(14) = (14 % 7) % 7 = 0  collision
h_1(14) = (0 + 1^2) % 7 = 1
h_0(7) = (7 % 7) % 7 = 0  collision
h_1(7) = (0 + 1^2) % 7 = 1  collision
h_{-1}(7) = (0 - 1^2) % 7 = -1
    NORMALIZE: (-1 + 7) % 7 = 6  collision
h_2(7) = (0 + 2^2) % 7 = 4
h_0(8) = (8 % 7) % 7 = 1  collision
h_1(8) = (1 + 1^2) % 7 = 2  collision
h_{-1}(8) = (1 - 1^2) % 7 = 0  collision
h_2(8) = (1 + 2^2) % 7 = 5
h_0(15) = (15 % 7) % 7 = 1  collision
h_1(15) = (1 + 1^2) % 7 = 2  collision
h_{-1}(15) = (1 - 1^2) % 7 = 0  collision
h_2(15) = (1 + 2^2) % 7 = 5  collision
h_{-2}(15) = (1 - 2^2) % 7 = -3
    NORMALIZE: (-3 + 7) % 7 = 4  collision
h_3(15) = (1 + 3^2) % 7 = 3

Resulting table (O = occupied):
index  state  key
0      O      21
1      O      14
2      O      23
3      O      15
4      O      7
5      O      8
6      O      13
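The computations above can be checked mechanically. The sketch below is an illustration under assumptions, not the slides' implementation: the class name QuadraticProbingExample and the method build are hypothetical, keys are assumed non-negative, and null marks an empty slot.

```java
import java.util.Arrays;

class QuadraticProbingExample {
    // Inserts each key using h_i(key) = (key % size ± i^2) % size,
    // trying +i^2 before -i^2, as in the worked example.
    static Integer[] build(int[] keys, int size) {
        Integer[] table = new Integer[size];  // null marks an empty slot
        for (int key : keys) {
            int home = key % size;
            boolean placed = false;
            for (int i = 0; i <= (size - 1) / 2 && !placed; i++) {
                int[] candidates = (i == 0)
                        ? new int[] { home }
                        : new int[] { Math.floorMod(home + i * i, size),
                                      Math.floorMod(home - i * i, size) };
                for (int index : candidates) {
                    if (table[index] == null) {
                        table[index] = key;
                        placed = true;
                        break;
                    }
                }
            }
            if (!placed) {
                throw new IllegalStateException("No free slot found for " + key);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = { 23, 13, 21, 14, 7, 8, 15 };
        System.out.println(Arrays.toString(build(keys, 7)));
        // prints [21, 14, 23, 15, 7, 8, 13]
    }
}
```

Math.floorMod performs the NORMALIZE step in one call, since Java's % operator can return a negative result for a negative left operand.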
Secondary Clusters
• Quadratic probing is better than linear probing because it eliminates primary
clustering.
• However, it may result in secondary clustering: if h(k1) = h(k2), the probe
sequences for k1 and k2 are exactly the same. This sequence of locations is called
a secondary cluster.
• Secondary clustering is less harmful than primary clustering because secondary
clusters do not combine to form large clusters.
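• Example of Secondary Clustering: Suppose keys k0, k1, k2, k3, and k4 are
inserted in the given order into an originally empty hash table using quadratic
probing with c(i) = i^2. Assume that each of the keys hashes to the same array
index x. A secondary cluster then develops and grows in size: every key probes the
same cells x, x + 1, x + 4, x + 9, . . . (modulo tableSize) and must step past all
previously inserted keys before it finds a free cell.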
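The growth of a secondary cluster can be made visible by counting probes per insertion. This is a small sketch under assumptions: the class SecondaryClusterDemo, the table size 23, and the keys 3, 26, 49, 72, 95 (all congruent to 3 modulo 23) are hypothetical choices for illustration.

```java
import java.util.Arrays;

class SecondaryClusterDemo {
    // Inserts keys with c(i) = i^2 and records how many cells each
    // insertion had to probe before it found a free one.
    static int[] probesPerInsert(int[] keys, int tableSize) {
        boolean[] occupied = new boolean[tableSize];
        int[] probes = new int[keys.length];
        for (int k = 0; k < keys.length; k++) {
            int home = keys[k] % tableSize;
            for (int i = 0; i < tableSize; i++) {
                probes[k]++;
                int index = (home + i * i) % tableSize;
                if (!occupied[index]) {
                    occupied[index] = true;
                    break;
                }
            }
        }
        return probes;
    }

    public static void main(String[] args) {
        // Every key hashes to index 3, so all keys share one probe sequence.
        int[] keys = { 3, 26, 49, 72, 95 };
        System.out.println(Arrays.toString(probesPerInsert(keys, 23)));
        // prints [1, 2, 3, 4, 5]
    }
}
```

Each new synonym needs one more probe than the previous one, because it retraces the cells already taken at x, x + 1, x + 4, and x + 9.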
Double Hashing
• To eliminate secondary clustering, synonyms must have different probe sequences.
• Double hashing achieves this by using two hash functions, both of which depend on
the key.
• The function c(i) = i*hp(key) satisfies Property 2 (the probe sequence reaches every
cell of the table) provided hp(key) and tableSize are relatively prime.
Example: Load the keys 18, 26, 35, 9, 64, 47, 96, 36, and 70, in this order, into an
empty hash table of size 13
(a) using double hashing with the first hash function: h(key) = key % 13 and the
second hash function: hp(key) = 1 + key % 12
(b) using double hashing with the first hash function: h(key) = key % 13 and
the second hash function: hp(key) = 7 - key % 7
Show all computations.
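One way to check hand computations for this example is to simulate both variants. The sketch below is illustrative only: the class DoubleHashingExample and the method build are hypothetical names, the probe formula h_i(key) = (h(key) + i*hp(key)) % 13 follows from c(i) = i*hp(key), and null marks an empty slot.

```java
import java.util.Arrays;
import java.util.function.IntUnaryOperator;

class DoubleHashingExample {
    // Inserts each key at the first free cell of the sequence
    // (key % size + i * hp(key)) % size, for i = 0, 1, 2, ...
    static Integer[] build(int[] keys, int size, IntUnaryOperator hp) {
        Integer[] table = new Integer[size];  // null marks an empty slot
        for (int key : keys) {
            int home = key % size;
            int step = hp.applyAsInt(key);
            boolean placed = false;
            for (int i = 0; i < size; i++) {
                int index = (home + i * step) % size;
                if (table[index] == null) {
                    table[index] = key;
                    placed = true;
                    break;
                }
            }
            if (!placed) {
                throw new IllegalStateException("No free slot found for " + key);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = { 18, 26, 35, 9, 64, 47, 96, 36, 70 };
        // (a) hp(key) = 1 + key % 12
        System.out.println(Arrays.toString(build(keys, 13, k -> 1 + k % 12)));
        // prints [26, null, null, 70, null, 18, 9, 96, 47, 35, 36, null, 64]
        // (b) hp(key) = 7 - key % 7
        System.out.println(Arrays.toString(build(keys, 13, k -> 7 - k % 7)));
        // prints [26, 9, null, null, null, 18, 70, 96, 47, 35, 36, null, 64]
    }
}
```

In both variants hp(key) lies between 1 and 12, so it is relatively prime to the prime table size 13 and the probe sequence can reach every cell.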
Double Hashing (cont’d)
Implementation of Open Addressing
public class OpenScatterTable extends AbstractHashTable {
    protected Entry[] array;   // one Entry (object plus state) per slot

    // Possible states of a slot
    protected static final int EMPTY = 0;
    protected static final int OCCUPIED = 1;
    protected static final int DELETED = 2;
Implementation of Open Addressing (cont’d)
/* Finds the index of the first unoccupied slot
   in the probe sequence of obj. */
protected int findIndexUnoccupied(Comparable obj) {
    int hashValue = h(obj);
    int tableSize = getLength();
    int indexDeleted = -1;   // first DELETED slot seen so far, or -1 if none
    for (int i = 0; i < tableSize; i++) {
        // Note: Java's % can return a negative value if c(i) is negative
        // (e.g. c(i) = -i^2); such an index must be normalized first.
        int index = (hashValue + c(i)) % tableSize;
        if (array[index].state == OCCUPIED
                && obj.equals(array[index].object)) {
            throw new IllegalArgumentException("Error: Duplicate key");
        } else if (array[index].state == EMPTY
                || (array[index].state == DELETED
                    && obj.equals(array[index].object))) {
            // The search can stop here: prefer an earlier DELETED slot if one was seen.
            return indexDeleted == -1 ? index : indexDeleted;
        } else if (array[index].state == DELETED
                && indexDeleted == -1) {
            indexDeleted = index;
        }
    }
    if (indexDeleted != -1) return indexDeleted;
    throw new IllegalArgumentException("Error: Hash table is full");
}
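How the three slot states interact across insert, find, and withdraw can be sketched in a self-contained form. This is a simplified illustration and not the OpenScatterTable class: TinyOpenTable and its methods are hypothetical, keys are plain ints, and linear probing c(i) = i is assumed to keep the example short.

```java
class TinyOpenTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state;  // every slot starts as EMPTY (0)

    TinyOpenTable(int size) {
        keys = new int[size];
        state = new int[size];
    }

    private int c(int i) { return i; }  // assumption: linear probing

    private int probe(int key, int i) {
        return Math.floorMod(Math.floorMod(key, keys.length) + c(i), keys.length);
    }

    // Returns the slot holding key, or -1. The search stops at the first
    // EMPTY slot but continues past DELETED slots.
    int find(int key) {
        for (int i = 0; i < keys.length; i++) {
            int index = probe(key, i);
            if (state[index] == EMPTY) return -1;
            if (state[index] == OCCUPIED && keys[index] == key) return index;
        }
        return -1;
    }

    // Stores key and returns its slot, reusing the first DELETED slot seen.
    int insert(int key) {
        int firstDeleted = -1;
        for (int i = 0; i < keys.length; i++) {
            int index = probe(key, i);
            if (state[index] == OCCUPIED) {
                if (keys[index] == key) throw new IllegalArgumentException("Duplicate key");
            } else if (state[index] == EMPTY) {
                return store(firstDeleted != -1 ? firstDeleted : index, key);
            } else if (firstDeleted == -1) {
                firstDeleted = index;
            }
        }
        if (firstDeleted != -1) return store(firstDeleted, key);
        throw new IllegalStateException("Hash table is full");
    }

    private int store(int index, int key) {
        keys[index] = key;
        state[index] = OCCUPIED;
        return index;
    }

    // Marks the slot DELETED rather than EMPTY so later searches still pass through it.
    boolean withdraw(int key) {
        int index = find(key);
        if (index == -1) return false;
        state[index] = DELETED;
        return true;
    }

    public static void main(String[] args) {
        TinyOpenTable t = new TinyOpenTable(7);
        t.insert(7);                        // stored in slot 0
        t.insert(14);                       // slot 0 is taken, stored in slot 1
        t.withdraw(7);                      // slot 0 becomes DELETED
        System.out.println(t.find(14));     // 1: the search passes the DELETED slot
        System.out.println(t.insert(21));   // 0: the DELETED slot is reused
    }
}
```

If withdraw reset the slot to EMPTY instead, the search for 14 would stop at slot 0 and wrongly report the key as absent.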
Implementation of Open Addressing (cont’d)
protected int findObjectIndex(Comparable obj){
    int hashValue = h(obj);
    int tableSize = getLength();
2. For quadratic probing with
c(i) = i^2,
we discussed that this function does not, in general, satisfy
Property 2. Which cells are missed by this probing formula for a
hash table of size 17? Characterize, using a formula if possible,
the cells that are not examined by this function for a hash table
of size n.
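An answer to this exercise can be checked empirically by enumerating the offsets that the probe function reaches. The helper below is a hypothetical sketch: the class MissedCells and the method missedOffsets are illustrative names, and offsets are measured from the home cell h(key), so a missed offset d corresponds to the cell (h(key) + d) % n.

```java
import java.util.TreeSet;

class MissedCells {
    // Returns the offsets d in [0, n) that c(i) = i^2 never produces
    // (modulo n) for i = 0 .. n - 1.
    static TreeSet<Integer> missedOffsets(int n) {
        boolean[] reached = new boolean[n];
        for (int i = 0; i < n; i++) {
            // long arithmetic avoids overflow of i * i for large n
            reached[(int) ((long) i * i % n)] = true;
        }
        TreeSet<Integer> missed = new TreeSet<>();
        for (int d = 0; d < n; d++) {
            if (!reached[d]) missed.add(d);
        }
        return missed;
    }

    public static void main(String[] args) {
        System.out.println(missedOffsets(17));
    }
}
```

Running it for several table sizes can help in spotting the pattern asked for in the second part of the question.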