Module 12a: Dynamic Hashing
Database System Concepts, 6th Ed.
©Silberschatz, Korth and Sudarshan
See [Link] for conditions on re-use
Dynamic Hashing
Allows the hash function to be modified dynamically
Extendable hashing – one form of dynamic hashing
Hash function generates values over a large range — typically
b-bit integers, with b = 32.
At any time only a prefix of the hash function is used to index
into a table of bucket addresses.
Let the length of the prefix be i bits, 0 ≤ i ≤ 32.
Bucket address table size = 2^i. Initially i = 0
Value of i grows and shrinks as the size of the database
grows and shrinks.
Multiple entries in the bucket address table may point to the
same bucket.
Thus, the actual number of buckets is ≤ 2^i
The number of buckets also changes dynamically due to
coalescing and splitting of buckets.
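
The prefix indexing described above can be sketched in a few lines. This is only an illustration under stated assumptions: hash32 is a hypothetical stand-in for the hash function h (here built on zlib.crc32), and table_index is an illustrative helper name, neither taken from the slides.

```python
import zlib

B = 32  # number of bits in a hash value (b in the text)

def hash32(key: str) -> int:
    """Hypothetical stand-in for h: maps a search key to a 32-bit integer."""
    return zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF

def table_index(x: int, i: int) -> int:
    """Return the first i high-order bits of the b-bit value x.

    With i = 0 the shift discards every bit, so the only index is 0.
    """
    return x >> (B - i)

# The bucket address table has 2**i entries, so every index fits in it.
for i in (0, 1, 3):
    idx = table_index(hash32("Mozart"), i)
    assert 0 <= idx < 2 ** i
```

Growing i by one appends one more bit of the same hash value to the index, which is why the table can double without rehashing any key.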
Database System Concepts - 6th Edition 12.2 ©Silberschatz, Korth and Sudarshan
General Extendable Hash Structure
In this structure each bucket holds 2 entries.
i_2 = i_3 = i
i_1 = i – 1
For a bucket with i_1 = i – 1, such as the one reached from entry 0 of the bucket address table, the number of pointers to it is 2^(i – i_1) = 2
Use of Extendable Hash Structure
Each bucket j is associated with a value i_j
All the entries in the bucket address table that point to the same
bucket have the same values on the first i_j bits.
To locate the bucket containing search-key K:
1. Compute h(K) = X
2. Use the first i high order bits of X as a displacement into bucket
address table, and follow the pointer to appropriate bucket
To insert a record with search-key value K:
Follow the same procedure as look-up and locate the bucket, say j.
If there is room in bucket j, insert the record in the bucket.
Otherwise, the bucket must be split and the insertion re-attempted (see
next slide).
Overflow buckets are used in some cases (discussed shortly)
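
The look-up steps above can be sketched as follows. This is a minimal illustration, not the book's code: the Bucket class, the toy hash h (which assumes keys are already 32-bit integers), and the hand-built table are all assumptions made for the example.

```python
from dataclasses import dataclass, field

B = 32  # bits in a hash value

@dataclass
class Bucket:
    local_depth: int                      # the value i_j of this bucket
    records: dict = field(default_factory=dict)

def h(key: int) -> int:
    """Toy hash (assumption): the key is already a 32-bit integer."""
    return key & 0xFFFFFFFF

def locate(table, i, key):
    """Steps 1 and 2: hash the key, index the table with the first i bits."""
    x = h(key)
    return table[x >> (B - i)]

def lookup(table, i, key):
    """Return the record stored under key, or None if it is absent."""
    return locate(table, i, key).records.get(key)

# Hand-built structure with i = 2: entries 00 and 01 share one bucket
# (i_1 = 1), entries 10 and 11 each have their own bucket (i_2 = i_3 = 2).
b1, b2, b3 = Bucket(1), Bucket(2), Bucket(2)
table = [b1, b1, b2, b3]

key = 0x80000000                          # hash prefix 10
locate(table, 2, key).records[key] = "rec"
found = lookup(table, 2, key)
```

Note that both table entries whose indexes start with bit 0 lead to b1, which matches the rule that entries pointing to the same bucket agree on the first i_j bits.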
Insertion in Extendable Hash Structure
Splitting a bucket j when inserting record with search-key value K:
If i > i_j (more than one pointer to bucket j)
Allocate a new bucket z, and set i_j = i_z = (i_j + 1)
Update the second half of the bucket address table entries
originally pointing to j, to point to z
Remove each record in bucket j and reinsert (in j or z)
Recompute new bucket for K and insert record in the bucket
(further splitting is required if the bucket is still full)
If i = i_j (only one pointer to bucket j)
If i reaches some limit b, or too many splits have happened in
this insertion, create an overflow bucket
Else
Increment i and double the size of the bucket address table.
Replace each entry in the table by two entries that point to the
same bucket.
Recompute new bucket address table entry for K
Now i > i_j so use the first case above.
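
The two splitting cases above can be sketched as a small in-memory structure. This is a simplified illustration under stated assumptions, not a definitive implementation: the class and method names are hypothetical, keys are assumed to be hashed by a caller-supplied function h into 32-bit integers, the overflow bucket is modelled as a plain dictionary, and only the depth limit (not the "too many splits" condition) triggers overflow.

```python
B = 32            # bits in a hash value
BUCKET_SIZE = 2   # records per bucket, as in the slides' example

class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth   # i_j
        self.records = {}                # key -> record
        self.overflow = {}               # stand-in for an overflow bucket

class ExtendableHash:
    def __init__(self, h, max_depth=B):
        self.h = h                       # hash function (assumption: 32-bit ints)
        self.i = 0                       # current prefix length
        self.max_depth = max_depth       # limit on i before overflowing
        self.table = [Bucket(0)]         # bucket address table, 2**i entries

    def _index(self, key):
        return self.h(key) >> (B - self.i)

    def lookup(self, key):
        j = self.table[self._index(key)]
        return j.records.get(key, j.overflow.get(key))

    def insert(self, key, record):
        while True:                      # re-attempt after every split
            j = self.table[self._index(key)]
            if key in j.records or len(j.records) < BUCKET_SIZE:
                j.records[key] = record
                return
            if j.local_depth == self.i:  # only one pointer to bucket j
                if self.i >= self.max_depth:
                    j.overflow[key] = record
                    return
                # Double the table: each entry becomes two adjacent entries.
                self.table = [b for b in self.table for _ in range(2)]
                self.i += 1
            self._split(j)               # now i > i_j

    def _split(self, j):
        z = Bucket(j.local_depth + 1)
        j.local_depth += 1
        # Entries pointing to j are contiguous; the second half moves to z.
        slots = [s for s, b in enumerate(self.table) if b is j]
        for s in slots[len(slots) // 2:]:
            self.table[s] = z
        # Remove each record from j and reinsert it in j or z.
        old, j.records = j.records, {}
        for k, r in old.items():
            self.table[self._index(k)].records[k] = r

eh = ExtendableHash(h=lambda k: k & 0xFFFFFFFF)
for k in (0x00000000, 0x40000000, 0x80000000, 0xC0000000, 0xE0000000):
    eh.insert(k, hex(k))
```

With these five keys the table doubles twice, ending with i = 2 and three buckets: entries 00 and 01 still share the first bucket because it was never split a second time. Keys whose hash values collide completely cannot be separated by splitting, which is what the depth limit and overflow path are for.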
Deletion in Extendable Hash Structure
To delete a key value,
Locate it in its bucket and remove it.
The bucket itself can be removed if it becomes empty (with
appropriate updates to the bucket address table).
Coalescing of buckets can be done (a bucket can coalesce only with
a “buddy” bucket having the same value of i_j and the same i_j – 1
prefix, if it is present)
Decreasing bucket address table size is also possible
Note: decreasing bucket address table size is an
expensive operation and should be done only if number
of buckets becomes much smaller than the size of the
table
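
The buddy rule and the table-shrinking step can be sketched as below. This is an illustrative sketch under assumptions: the Bucket class and the helper names try_coalesce and try_halve are hypothetical, the table is indexed by high-order hash bits as in the look-up procedure, and shrinking is attempted eagerly here only to keep the example short (the note above recommends doing it rarely).

```python
from dataclasses import dataclass, field

@dataclass
class Bucket:
    local_depth: int                      # i_j
    records: dict = field(default_factory=dict)

def try_coalesce(table, i, slot, capacity):
    """Merge the bucket at table[slot] with its buddy, if the rule allows."""
    j = table[slot]
    if j.local_depth == 0:
        return False                      # a single bucket has no buddy
    # The buddy's prefix differs only in the last of the first i_j bits.
    z = table[slot ^ (1 << (i - j.local_depth))]
    if z is j or z.local_depth != j.local_depth:
        return False                      # buddy was split further
    if len(j.records) + len(z.records) > capacity:
        return False                      # merged bucket would overflow
    j.records.update(z.records)
    j.local_depth -= 1
    for s, b in enumerate(table):         # redirect z's pointers to j
        if b is z:
            table[s] = j
    return True

def try_halve(table, i):
    """Halve the table when no bucket needs all i bits."""
    if i == 0 or any(b.local_depth == i for b in table):
        return table, i
    return table[::2], i - 1              # entries 2s and 2s+1 were identical

b0 = Bucket(1, {0x00000000: "a"})
z = Bucket(2, {0x80000000: "b"})
z2 = Bucket(2, {0xC0000000: "c"})
table, i = [b0, b0, z, z2], 2

blocked = try_coalesce(table, i, 0, capacity=2)  # buddy of b0 has i_j = 2
merged = try_coalesce(table, i, 2, capacity=2)   # z and z2 are buddies
table, i = try_halve(table, i)
```

In this example the first attempt fails because the buddy region of b0 holds buckets with a larger i_j, while the second succeeds and leaves every bucket with i_j = 1, so the table can shrink from four entries to two.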
Use of Extendable Hash Structure: Example
Example (Cont.)
Initial hash structure; bucket size = 2
Example (Cont.)
Hash structure after insertion of the records “Mozart”, “Srinivasan”,
and “Wu”
Example (Cont.)
Hash structure after insertion of Einstein record
Example (Cont.)
Hash structure after insertion of Gold and El Said records
Example (Cont.)
Hash structure after insertion of Katz record
Example (Cont.)
Hash structure after insertion of eleven records
Example (Cont.)
Hash structure after insertion of the Kim record into the previous
hash structure
Extendable Hashing vs. Other Schemes
Benefits of extendable hashing:
Hash performance does not degrade with growth of file
Minimal space overhead
Disadvantages of extendable hashing:
Extra level of indirection to find desired record
Bucket address table may itself become very big (larger than
memory)
Cannot allocate very large contiguous areas on disk either
Solution: B+-tree structure to locate desired record in bucket
address table
Changing size of bucket address table is an expensive
operation
Linear hashing is an alternative mechanism
Allows incremental growth of its directory (equivalent to bucket
address table)
At the cost of more bucket overflows
End of Module 12a