Hashing Concepts in DBMS PDF
Introduction
Hash file organization is a method in which records are stored in data blocks whose addresses are
generated by a hash function. The memory location where these records are stored is called a data
block or data bucket. A data bucket can hold one or more records.
The hash function can use any column value to generate the address. Most of the time, it uses the
primary key to generate the hash index, that is, the address of the data block. A hash function can
be anything from a simple mathematical function to a complex one. We can even use the primary key
itself as the address of the data block, so that each row is stored in the data block whose address
equals its primary key. This shows how simple a hash function in a database can be.
The diagram above depicts data block addresses that are the same as the primary key values. The hash
function can also be a simple mathematical function such as mod, sin, cos or exponential. Suppose the
hash function is mod(5). What happens in the case above? Applying mod(5) to the primary keys
generates 3, 3, 1, 4 and 2 respectively, and the records are stored at those data block addresses.
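As a minimal sketch of the mod(5) idea, the address computation could be written as follows in Python. The function name `bucket_address` and the sample keys are illustrative assumptions; they are not the keys from the original diagram.

```python
NUM_BUCKETS = 5  # mod(5) yields bucket addresses 0..4


def bucket_address(primary_key: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Return the data bucket address for a key using a mod hash function."""
    return primary_key % num_buckets


# Hypothetical primary keys, chosen only for illustration.
for key in (103, 104, 106):
    print(key, "->", bucket_address(key))
```

The same key always maps to the same address, which is the property the rest of the section relies on.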
https://www.tutorialcup.com/dbms/hashingconcepts.htm 1/7
6/5/2017 HashingConceptsinDBMS
From the two diagrams above, it is now clear how a hash function works.
There are two types of hash file organization: static hashing and dynamic hashing.
Static Hashing
In this method, the data bucket address generated for a given key is always the same. For example,
if we generate the address for EMP_ID = 103 using the mod(5) hash function, the result is always
bucket address 3. The bucket address never changes. Hence the number of data buckets in memory
remains constant throughout. In our example, there are five data buckets in memory to store the
data.
Searching a record
Using the hash function, the data bucket address is generated for the hash key, and the record is
retrieved from that location. For example, to retrieve the whole record for ID 104 when the hash
function is mod(5) on ID, the generated address is 4. We then go directly to address 4 and retrieve
the whole record for ID 104. Here ID acts as the hash key.
Inserting a record
When a new record needs to be inserted into the table, we generate an address for it based on its
hash key. Once the address is generated, the record is stored at that location.
Deleting a record
Using the hash function, we first locate the record to be deleted. We then remove the record at
that address in memory.
Updating a record
The record to be updated is located using the static hash function, and the record at that address
is then updated.
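The four operations can be sketched together in a small in-memory model. This is only an illustration under stated assumptions: the class name `StaticHashFile` is hypothetical, buckets are plain Python lists with no capacity limit (so bucket overflow is ignored here), and the hash function is mod on an integer key.

```python
class StaticHashFile:
    """Toy static hash file: a fixed number of buckets, each a list of (key, record)."""

    def __init__(self, num_buckets=5):
        self.num_buckets = num_buckets  # never changes in static hashing
        self.buckets = [[] for _ in range(num_buckets)]

    def _address(self, key):
        # Same key always yields the same bucket address.
        return key % self.num_buckets

    def insert(self, key, record):
        self.buckets[self._address(key)].append((key, record))

    def search(self, key):
        # Go directly to the computed bucket and scan only that bucket.
        for k, rec in self.buckets[self._address(key)]:
            if k == key:
                return rec
        return None

    def update(self, key, record):
        bucket = self.buckets[self._address(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, record)
                return True
        return False

    def delete(self, key):
        bucket = self.buckets[self._address(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False
```

For example, `StaticHashFile().insert(104, "some record")` places the record in bucket 4, and a later `search(104)` looks only in that bucket.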
Suppose we have to insert some records into the file, but the data bucket address generated by the
hash function is full, or data already exists at that address. How do we insert the data? In static
hashing this situation is called bucket overflow, and it is one of the critical drawbacks of the
method. Where do we save the data in this case? We cannot lose it. There are various methods to
overcome this situation. The most commonly used are listed below.
Closed Hashing
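In this method we introduce a new data bucket with the same address and link it after the full data
bucket. This way of overcoming bucket overflow is called closed hashing or overflow chaining. (This
is the naming used in database texts; general data-structure texts often use "open" and "closed"
hashing the other way round.)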
Suppose we have to insert a new record R2 into the table, and the static hash function generates
the data bucket address AACDBF. This bucket is too full to store the new data. In this case a new
data bucket is added at the end of the AACDBF data bucket and linked to it, and the new record R2 is
inserted into the new bucket. The static hashing address is thus preserved. Any number of new data
buckets can be added to the chain as buckets fill up.
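Overflow chaining can be sketched as a linked chain of buckets. The names `Bucket`, `insert_with_chaining` and `chain_length` are hypothetical, and the bucket capacity of 2 used in the usage note is an assumption made only for illustration.

```python
class Bucket:
    """A data bucket with a fixed capacity and a link to an overflow bucket."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.overflow = None  # next bucket in the chain, if any


def insert_with_chaining(bucket, record):
    """Insert into the first bucket in the chain with room, adding a bucket if needed."""
    while len(bucket.records) >= bucket.capacity:
        if bucket.overflow is None:
            # The home bucket keeps its address; the new bucket is linked after it.
            bucket.overflow = Bucket(bucket.capacity)
        bucket = bucket.overflow
    bucket.records.append(record)


def chain_length(bucket):
    """Count the buckets in the chain starting at the home bucket."""
    count = 0
    while bucket is not None:
        count += 1
        bucket = bucket.overflow
    return count
```

With a home bucket of capacity 2 that already holds two records, inserting R2 creates one overflow bucket and stores R2 there. A search then has to follow the chain, so long chains make lookups slower.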
Open Hashing
In this method, the next available data block is used to store the new record, instead of
overwriting the older one. This method is called open hashing or linear probing.
In the example below, R2 is a new record which needs to be inserted. The hash function generates
address 237 for it, but that bucket is already full. So the system searches for the next available
data bucket, 238, and assigns R2 to it.
In linear probing, the difference between the older bucket and the new bucket is usually fixed, and
in most cases it is 1.
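A minimal sketch of linear probing, assuming one record per slot, an integer key, and a mod hash function over the table size. The function name `linear_probe_insert` and the table size of 1000 in the usage note are assumptions chosen so that the addresses 237 and 238 from the text can be reproduced.

```python
def linear_probe_insert(table, key, record, step=1):
    """Store (key, record) in the first free slot at or after the home address.

    `table` is a list in which None marks a free slot. With a step other than 1,
    the step should share no common factor with len(table), otherwise some
    slots are never visited.
    """
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i * step) % size  # fixed difference between probes
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    raise RuntimeError("hash file is full")
```

With `table = [None] * 1000`, inserting key 237 fills slot 237; a later key 1237 hashes to the same home address and lands in slot 238.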
Quadratic Probing
This is similar to linear probing, but here the difference between the old and the new bucket is
not fixed. A quadratic function is used to determine the new bucket address.
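A minimal sketch of quadratic probing, assuming one record per slot and the common probe sequence home + i*i (the text does not say which quadratic function is used, so this choice is an assumption). The function name and the table size of 11 are hypothetical.

```python
def quadratic_probe_insert(table, key, record):
    """Store (key, record) using the probe sequence home, home+1, home+4, home+9, ...

    `table` is a list in which None marks a free slot. Quadratic probing does
    not necessarily visit every slot, so insertion can fail before the table
    is completely full.
    """
    size = len(table)
    home = key % size
    for i in range(size):
        slot = (home + i * i) % size  # offset grows quadratically
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    raise RuntimeError("no free slot found on the probe sequence")
```

In a table of 11 slots, keys 4, 15 and 26 all have home address 4; they end up in slots 4, 5 and 8, so colliding records are spread further apart than with a fixed step of 1.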
Double Hashing
This is another probing method. Here the difference is fixed for a given key, as in linear
probing, but it is calculated using a second hash function. Hence the name double hashing.
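A minimal sketch of double hashing, assuming one record per slot and a prime table size. The second hash function `1 + key % (size - 1)` is a common textbook choice and an assumption here, as are the function names.

```python
def second_hash(key, size):
    """Step size derived from the key; always between 1 and size - 1."""
    return 1 + key % (size - 1)


def double_hash_insert(table, key, record):
    """Store (key, record), probing with a key-dependent fixed step.

    `table` is a list in which None marks a free slot. If len(table) is prime,
    every step value visits all slots before repeating.
    """
    size = len(table)
    home = key % size
    step = second_hash(key, size)
    for i in range(size):
        slot = (home + i * step) % size
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    raise RuntimeError("hash file is full")
```

In a table of 11 slots, keys 4, 15 and 26 share home address 4 but get steps 5, 6 and 7, so they land in slots 4, 10 and 0 rather than queuing up along one probe sequence.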
Dynamic Hashing
This hashing method is used to overcome the bucket overflow problem of static hashing. In this
method, the number of data buckets grows or shrinks as the number of records increases or decreases.
It is also known as the extendible hashing method. Let us see an example to understand it.
Suppose there are three records R1, R2 and R4 in the table, and they generate the addresses 100100,
010110 and 110110 respectively. This method of storing considers only part of the address, here only
the first bit, to store the data. So it tries to load the three records at addresses 0 and 1.
What happens to R3 here? There is no bucket space for R3. The buckets have to grow dynamically to
accommodate it. So the method changes the address to use 2 bits rather than 1, updates the existing
data to use 2-bit addresses, and then tries to accommodate R3.
Now we can see that the addresses of R1 and R2 have changed to reflect the new addressing, and R3
has also been inserted. As the size of the data increases, the method tries to insert into the
existing buckets. If no bucket is available, the number of bits is increased to consider a larger
address, which increases the number of buckets. If we delete records and the data can be stored in
fewer buckets, the number of buckets shrinks, the opposite of what we have seen above. This is how
dynamic hashing works. Initially only part of the index/address generated by the hash function is
considered when storing the data. As the amount of data increases and more buckets are needed, a
larger part of the index is considered.
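The growth step can be sketched as follows. This is a simplified illustration, not a full extendible hashing implementation: when a bucket is full it re-addresses every record with one more leading bit, whereas real extendible hashing splits only the overflowing bucket and tracks buckets through a bucket address table. The class name `PrefixHashFile`, the bucket capacity of 2, and the address used for R3 are assumptions; the text does not give them.

```python
class PrefixHashFile:
    """Toy dynamic hash file addressed by the leading bits of a fixed-width hash."""

    def __init__(self, bucket_capacity=2, address_bits=6):
        self.capacity = bucket_capacity    # assumed records per bucket
        self.address_bits = address_bits   # width of the full hash address
        self.bits = 1                      # leading bits currently in use
        self.buckets = {}                  # prefix -> list of (name, address)

    def _prefix(self, address):
        # Keep only the first `self.bits` bits of the address.
        return address >> (self.address_bits - self.bits)

    def insert(self, name, address):
        while True:
            bucket = self.buckets.setdefault(self._prefix(address), [])
            if len(bucket) < self.capacity:
                bucket.append((name, address))
                return
            if self.bits == self.address_bits:
                raise RuntimeError("all address bits used; overflow handling needed")
            self._grow()

    def _grow(self):
        # Use one more bit and re-address all existing records.
        records = [rec for bucket in self.buckets.values() for rec in bucket]
        self.bits += 1
        self.buckets = {}
        for name, address in records:
            self.buckets.setdefault(self._prefix(address), []).append((name, address))


f = PrefixHashFile()
f.insert("R1", 0b100100)
f.insert("R2", 0b010110)
f.insert("R4", 0b110110)
f.insert("R3", 0b110101)  # hypothetical address for R3
```

After the first three inserts one bit is in use and bucket 1 holds R1 and R4. Inserting R3 finds that bucket full, so the file switches to two bits: R1 moves to 10, R2 to 01, and R4 and R3 share 11.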
Advantages of Dynamic Hashing
Performance does not degrade as the data in the system grows. The method simply increases the memory size to accommodate the data.
Since it grows and shrinks with the data, memory is well utilized. Little memory is left lying unused.
It is good for dynamic databases where data grows and shrinks frequently.
Disadvantages of Dynamic Hashing
As the data size increases, the number of buckets also increases. The bucket addresses are maintained in bucket address tables, because the address of the data keeps changing as buckets grow and shrink. When there is a huge increase in data, maintaining this bucket address table becomes tedious.
Bucket overflow can occur in this case too, but it may take longer to reach this situation than with static hashing.
Comparison of Ordered Indexing and Hashing
| Ordered Indexing | Hashing |
|---|---|
| Addresses in memory are sorted by key value. This key value can be the primary key or any other column in the table. | Addresses are generated by applying a hash function to the key value. This key value can be the primary key or any other column in the table. |
| Performance of this method comes down as the data in the file increases. Since it stores the data in sorted form, every insert/delete/update operation needs extra effort to keep the records sorted, which reduces performance. | Performance of dynamic hashing is good when data is added and deleted frequently, but if the database is very large, maintenance becomes costlier. Static hashing is good for smaller databases where the data size is known in advance. If the data grows, it runs into serious problems such as bucket overflow. |
| Delete/update operations leave unused data blocks, and these blocks are not released for reuse. Hence periodic maintenance of the memory is required; otherwise memory is wasted and performance degrades. Maintaining the memory is also a cost overhead. | In both static and dynamic hashing, memory is well managed. Static hashing handles bucket overflow with the overflow methods described earlier, while data blocks are designed to shrink and grow in dynamic hashing. However, dynamic hashing carries the overhead of maintaining the bucket address table when the database grows very large. |
| Preferred for range retrieval: when data has to be retrieved for a particular range of key values, this method is best suited. | Suitable for retrieving a particular record based on the search key. It does not perform well if the hash function is not on the search key. |
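The last row of the comparison can be illustrated with Python's built-in structures: a `dict` is a hash table and suits single-key lookups, while a sorted list searched with the standard `bisect` module behaves like an ordered index and suits range retrieval. The record keys, values and function names below are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical records keyed by an integer ID.
records = {101: "A", 103: "B", 104: "C", 107: "D", 110: "E"}

hash_index = dict(records)     # hash-based access path
ordered_keys = sorted(records) # ordered access path


def point_lookup(key):
    """Fetch one record by its key through the hash-based path."""
    return hash_index.get(key)


def range_lookup(low, high):
    """Fetch all records with low <= key <= high through the ordered path."""
    start = bisect_left(ordered_keys, low)
    end = bisect_right(ordered_keys, high)
    return [(k, records[k]) for k in ordered_keys[start:end]]
```

A hash-based structure has no notion of neighbouring keys, so answering `range_lookup(103, 107)` from `hash_index` alone would mean testing every possible key in the range or scanning all records.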