Dynamic Hashing and Indexing
[Figure sequence: worked example of extendable hashing. Records 1 to 15 are inserted one at a time. The directory depth (labelled d) grows from 0 to 4, so the directory entries go from a single entry to 0/1, then 00..11, 000..111, and finally 0000..1111. Each bucket carries its own depth (labelled d1), which is at most d. Inserting records 3, 5, 7, 8, 10, and 13 causes an overflow, and the overflowing bucket is split.]
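The insert-and-split behaviour illustrated above can be sketched in a few lines of Python. This is a minimal illustration, not the exact scheme in the figures: the names `Bucket` and `ExtendableHash` are hypothetical, the bucket capacity of 2 is an assumption, keys are non-negative integers hashed by the identity function, and the directory is indexed by the low-order bits of the hash (the slides may use the leading bits instead).

```python
class Bucket:
    """A bucket with its own local depth (d1 in the figures)."""
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.keys = []


class ExtendableHash:
    """Sketch of extendable hashing; assumes distinct integer keys."""
    def __init__(self, capacity=2):
        self.capacity = capacity          # assumed bucket capacity
        self.global_depth = 0             # d in the figures
        self.directory = [Bucket(0)]      # 2**global_depth entries

    def _index(self, key):
        # Identity hash (assumption); use the low-order global_depth bits.
        return key & ((1 << self.global_depth) - 1)

    def lookup(self, key):
        return key in self.directory[self._index(key)].keys

    def insert(self, key):
        # Note: keys sharing an identical full hash could split forever;
        # this sketch assumes distinct integer keys.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.keys:
                return
            if len(bucket.keys) < self.capacity:
                bucket.keys.append(key)
                return
            self._split(bucket)           # overflow: split, then retry

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory; both halves point to the same buckets.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        new_bit = 1 << (bucket.local_depth - 1)
        # Redirect the directory entries whose new bit is set.
        for i, b in enumerate(self.directory):
            if b is bucket and (i & new_bit):
                self.directory[i] = sibling
        # Redistribute the keys of the overflowing bucket.
        old_keys, bucket.keys = bucket.keys, []
        for k in old_keys:
            (sibling if k & new_bit else bucket).keys.append(k)


table = ExtendableHash(capacity=2)
for record in range(1, 16):
    table.insert(record)
print(table.global_depth, len(table.directory))
```

Only the overflowing bucket is rewritten on a split. The directory doubles only when that bucket's local depth already equals the global depth, which is why several directory entries can share one bucket.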
Advantages and Disadvantages
Benefits of extendable hashing:
Hash performance does not degrade with growth of file
Minimal space overhead
Disadvantages of extendable hashing:
Extra level of indirection (through the bucket address table) to find the desired record
Bucket address table may itself become very big (larger than memory)
In that case, a tree structure is needed to locate the desired entry within the table itself
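A rough calculation shows how quickly the bucket address table can grow. The sketch below assumes one pointer per directory entry and an 8-byte pointer. Both are assumptions for illustration, and `directory_bytes` is a hypothetical helper name.

```python
def directory_bytes(global_depth, pointer_bytes=8):
    """Size of a bucket address table with 2**global_depth entries."""
    return (1 << global_depth) * pointer_bytes


# The table doubles every time the depth increases by one.
for depth in (4, 20, 30):
    print(depth, directory_bytes(depth))
```

Under these assumptions a depth of 30 already needs 8 GiB for the table alone, regardless of how many buckets actually exist.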
Index files are typically much smaller than the original file
Two basic kinds of indices:
Ordered indices: search keys are stored in sorted order
Hash indices: search keys are distributed uniformly across "buckets" using a "hash function".
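The difference between the two kinds of index can be sketched as follows. This is an illustration under stated assumptions: the in-memory list `data_file`, the modulo hash function, and the fixed count of four buckets are hypothetical stand-ins for a real file and hash function.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data file: position -> (search key, payload), unsorted.
data_file = [(30, "c"), (10, "a"), (20, "b"), (40, "d")]

# Ordered index: (search key, record position) pairs kept in key order.
ordered_index = sorted((key, pos) for pos, (key, _) in enumerate(data_file))
ordered_keys = [key for key, _ in ordered_index]


def ordered_lookup(key):
    """Binary search for one key in the ordered index."""
    i = bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return data_file[ordered_index[i][1]]
    return None


def ordered_range(lo, hi):
    """All records with lo <= key <= hi, returned in key order."""
    i = bisect_left(ordered_keys, lo)
    j = bisect_right(ordered_keys, hi)
    return [data_file[pos] for _, pos in ordered_index[i:j]]


# Hash index: keys are spread over buckets by a hash function.
NUM_BUCKETS = 4  # assumed fixed bucket count
hash_index = [[] for _ in range(NUM_BUCKETS)]
for pos, (key, _) in enumerate(data_file):
    hash_index[key % NUM_BUCKETS].append((key, pos))


def hash_lookup(key):
    """Equality lookup: scan only the bucket the key hashes to."""
    for k, pos in hash_index[key % NUM_BUCKETS]:
        if k == key:
            return data_file[pos]
    return None
```

The ordered index answers range queries directly because neighbouring keys sit next to each other. The hash index scatters neighbouring keys across buckets, so it suits equality lookups only.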
THANK YOU.