Hashing Concepts in DBMS PDF

Hash file organization stores data in data blocks whose addresses are generated by a hash function. The hash function typically uses the primary key to generate the address of the data block. There are two types of hashing - static hashing stores records at fixed addresses while dynamic hashing allows data blocks to grow and shrink as records are added or removed. Dynamic hashing overcomes problems with static hashing like bucket overflow by increasing the number of bits used to generate addresses as more space is needed.

6/5/2017 Hashing Concepts in DBMS

Introduction

Hash file organization is a method in which data is stored in data blocks whose addresses are
generated by a hash function. The memory location where these records are stored is called a data
block or data bucket. A data bucket can store one or more records.

The hash function can use any column value to generate the address. Most of the time, the hash
function uses the primary key to generate the hash index, that is, the address of the data block.
The hash function can be anything from a simple mathematical function to a complex one. We can even
use the primary key itself as the address of the data block, so that each row is stored in the data
block whose address equals its primary key. This shows how simple a hash function in a database can be.

The diagram above depicts data block addresses that are the same as the primary key values. The hash
function can also be a simple mathematical function such as mod, sin, cos or exponential. Suppose the
hash function is mod(5). What happens in the case above? It applies mod(5) to the primary keys,
generates 3, 3, 1, 4 and 2 respectively, and the records are stored at those data block addresses.
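
The mod(5) mapping can be sketched in a few lines of Python. The primary keys below are hypothetical (the keys in the original diagram are not available in the text); they were chosen only so that the results match the addresses 3, 3, 1, 4 and 2 quoted above. The name `bucket_address` is likewise an illustrative assumption.

```python
def bucket_address(primary_key: int, num_buckets: int = 5) -> int:
    """Map a primary key to a data block address using a mod hash."""
    return primary_key % num_buckets


# Hypothetical primary keys, picked to reproduce the addresses in the text.
keys = [103, 108, 101, 104, 102]
addresses = [bucket_address(k) for k in keys]
print(addresses)  # [3, 3, 1, 4, 2]
```

Note that two of the keys land at address 3, which is exactly the kind of collision that the overflow-handling methods are meant to deal with.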

https://www.tutorialcup.com/dbms/hashingconcepts.htm 1/7
From the two diagrams above it is now clear how a hash function works.

There are two types of hash file organization: static hashing and dynamic hashing.

Static Hashing

In this method of hashing, the resulting data bucket address is always the same. That means that if
we generate the address for EMP_ID = 103 using the mod(5) hash function, it always results in the
same bucket address, 3. The bucket address never changes. Hence the number of data buckets in memory
remains constant throughout static hashing. In our example, five data buckets in memory are used to
store the data.

Searching a record

Using the hash function, the data bucket address is generated for the hash key, and the record is
then retrieved from that location. For example, if we want to retrieve the whole record for ID 104
and the hash function is mod(5) on ID, the generated address is 4. We then go directly to address 4
and retrieve the whole record for ID 104. Here ID acts as the hash key.

Inserting a record

When a new record needs to be inserted into the table, we generate an address for it based on its
hash key. Once the address is generated, the record is stored at that location.

Delete a record

Using the hash function, we first fetch the record that is to be deleted. Then we remove the record
stored at that address in memory.

Update a record

The data record marked for update is located using the static hash function, and the record at that
address is then updated.
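
The four operations above can be sketched as a small Python class. This is a minimal illustration, not a real storage engine: the class name `StaticHashFile`, the use of a per-bucket `dict`, and the sample record are all assumptions, and bucket capacity is deliberately ignored here so that overflow does not arise.

```python
class StaticHashFile:
    """Fixed number of buckets; a given key always hashes to the same bucket."""

    def __init__(self, num_buckets=5):
        self.num_buckets = num_buckets
        # One dict per bucket, mapping hash key -> record (capacity not modelled).
        self.buckets = [{} for _ in range(num_buckets)]

    def address(self, key):
        # mod hash on the key, as in the mod(5) example.
        return key % self.num_buckets

    def insert(self, key, record):
        self.buckets[self.address(key)][key] = record

    def search(self, key):
        # Go directly to the computed bucket; returns None if the key is absent.
        return self.buckets[self.address(key)].get(key)

    def update(self, key, record):
        bucket = self.buckets[self.address(key)]
        if key not in bucket:
            raise KeyError(key)
        bucket[key] = record

    def delete(self, key):
        # Locate the bucket first, then remove the record from it.
        return self.buckets[self.address(key)].pop(key, None)


emp = StaticHashFile()
emp.insert(104, {"name": "hypothetical employee"})
print(emp.address(104))  # 4
```

Every operation starts with the same address computation, which is why the number of buckets must stay fixed for the lifetime of the file.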

Suppose we have to insert some records into the file, but the data bucket at the address generated
by the hash function is full, or data already exists at that address. How do we insert the data? In
static hashing this situation is called bucket overflow, and it is one of the critical drawbacks of
the method. Where do we save the data in this case? We cannot lose it. There are various methods to
overcome this situation. The most commonly used ones are listed below:

Closed hashing

In this method we introduce a new data bucket with the same address and link it after the full data
bucket. This way of overcoming bucket overflow is called closed hashing or overflow chaining.

Suppose we have to insert a new record R2 into the table. The static hash function generates the data
bucket address AACDBF, but this bucket is full and cannot store the new data. In this case a new data
bucket is added at the end of the AACDBF data bucket and linked to it. The new record R2 is then
inserted into the new bucket. The static hashing address is thus maintained. Further data buckets can
be added to the chain whenever the last one becomes full.
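
Overflow chaining can be sketched as a linked list of buckets. The `Bucket` class, the capacity of 2 and the record names other than R2 are assumptions made for illustration; the source does not state how many records a bucket holds.

```python
class Bucket:
    """A data bucket with a fixed capacity and a link to an overflow bucket."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.overflow = None  # next bucket in the chain, if any


def chained_insert(bucket, record):
    """Insert into the first bucket of the chain that still has room."""
    while len(bucket.records) >= bucket.capacity:
        if bucket.overflow is None:
            # The bucket is full: link a new bucket to the end of the chain.
            bucket.overflow = Bucket(bucket.capacity)
        bucket = bucket.overflow
    bucket.records.append(record)


primary = Bucket(capacity=2)      # stands in for the bucket at address AACDBF
for rec in ["R0", "R1", "R2"]:    # R0 and R1 fill the bucket, R2 overflows
    chained_insert(primary, rec)
print(primary.records, primary.overflow.records)  # ['R0', 'R1'] ['R2']
```

The address produced by the hash function never changes; only the length of the chain behind it does, so lookups on a long chain have to walk through every linked bucket.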

Open Hashing

In this method, the next available data block is used to store the new record, instead of overwriting
the older one. This method is called open hashing or linear probing.

In the example below, R2 is a new record that needs to be inserted. The hash function generates
address 237, but that bucket is already full. So the system searches for the next available data
bucket, 238, and assigns R2 to it.

In linear probing, the difference between the older bucket and the new bucket is usually fixed, and
in most cases it is 1.
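
A minimal sketch of linear probing follows, assuming one record per bucket, a table of 1000 buckets and a mod hash. None of these details is given in the source; the keys 237 and 1237 are hypothetical and were chosen only so that both hash to address 237.

```python
def linear_probe_insert(table, key, record):
    """Store the record in the first free bucket at or after its home address."""
    n = len(table)
    home = key % n
    for i in range(n):
        slot = (home + i) % n  # step of 1, wrapping around at the end
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    raise RuntimeError("all buckets are full")


table = [None] * 1000
first = linear_probe_insert(table, 237, "R1")    # home bucket 237 is free
second = linear_probe_insert(table, 1237, "R2")  # 237 is taken, so 238 is used
print(first, second)  # 237 238
```

A search has to follow the same probe sequence, starting at the home address and stopping at the first empty bucket.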

Quadratic probing

This is similar to linear probing, but here the difference between the old and new bucket is not
fixed. A quadratic function is used to determine the new bucket address.
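
As a rough sketch, one common choice of quadratic function is to try the offsets 0, 1, 4, 9, ... from the home address. The table size of 11, the mod hash and the keys are assumptions for illustration; the source does not name a specific function.

```python
def quadratic_probe_insert(table, key, record):
    """Probe home, home + 1, home + 4, home + 9, ... (mod table size)."""
    n = len(table)
    home = key % n
    for i in range(n):
        slot = (home + i * i) % n
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    # Quadratic probing does not visit every bucket, so this can be raised
    # even when some buckets are still free.
    raise RuntimeError("no free bucket found on the probe sequence")


table = [None] * 11
slots = [quadratic_probe_insert(table, k, f"rec-{k}") for k in (5, 16, 27)]
print(slots)  # [5, 6, 9]
```

All three hypothetical keys have home address 5, yet they spread out to 5, 6 and 9 instead of filling consecutive buckets as linear probing would.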

Double Hashing

This is another variant of linear probing. Here the difference is fixed, as in linear probing, but
this fixed difference is calculated using another hash function. Hence the name double hashing.
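
A hedged sketch of double hashing: the first hash gives the home address and a second hash gives the step. The second hash used here, `1 + key % (n - 1)`, is a common textbook choice rather than anything stated in the source, and the table size of 11 and the keys are likewise assumptions.

```python
def double_hash_insert(table, key, record):
    """Probe home, home + step, home + 2*step, ... with a key-dependent step."""
    n = len(table)            # assumed prime, so every bucket is reachable
    home = key % n            # first hash function
    step = 1 + key % (n - 1)  # second hash function; never zero
    for i in range(n):
        slot = (home + i * step) % n
        if table[slot] is None:
            table[slot] = (key, record)
            return slot
    raise RuntimeError("all buckets are full")


table = [None] * 11
slots = [double_hash_insert(table, k, f"rec-{k}") for k in (5, 16, 27)]
print(slots)  # [5, 1, 2]
```

Keys that share a home address usually get different steps (7 for key 16 and 8 for key 27 here), so their probe sequences diverge after the first collision.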

Dynamic Hashing

This hashing method is used to overcome the problems of static hashing, such as bucket overflow. In
this method of hashing, data buckets grow or shrink as the number of records increases or decreases.
This method is also known as extendible hashing. Let us look at an example to understand it.

Suppose three records R1, R2 and R4 are in the table. These records generate the addresses 100100,
010110 and 110110 respectively. This method of storing considers only part of the address, initially
only the first bit, to store the data. So it tries to load the three records at addresses 0 and 1.

What happens to R3 here? There is no bucket space for R3. The bucket has to grow dynamically to
accommodate R3. So the address is changed to use 2 bits rather than 1 bit, and the existing data is
updated to have 2-bit addresses. Then it tries to accommodate R3.

Now we can see that the addresses of R1 and R2 have changed to reflect the new addressing, and R3 has
also been inserted. As the size of the data increases, it tries to insert into the existing buckets.
If no bucket is available, the number of bits is increased to consider a larger address, which
increases the number of buckets. If we delete records and the data can be stored in fewer buckets,
the number of buckets shrinks, the opposite of what we have seen above. This is how dynamic hashing
works. Initially only a partial index/address generated by the hash function is considered for
storing the data. As the amount of data increases and more buckets are needed, a larger part of the
index is considered.
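
The growth step can be sketched as follows. This is a deliberately simplified model, not a full extendible hashing implementation: it redistributes every record whenever the prefix grows, has no per-bucket depth, and omits shrinking. The bucket capacity of 2 and the address 100000 for R3 are assumptions; the source gives addresses only for R1, R2 and R4.

```python
class PrefixHashFile:
    """Buckets are addressed by the leading bits of a fixed-width hash address."""

    def __init__(self, address_bits=6, bucket_capacity=2):
        self.address_bits = address_bits
        self.bucket_capacity = bucket_capacity
        self.prefix_bits = 1             # start by looking at the first bit only
        self.buckets = {0: [], 1: []}    # prefix -> list of (address, record)

    def _prefix(self, address):
        # Keep only the leading prefix_bits of the address.
        return address >> (self.address_bits - self.prefix_bits)

    def insert(self, address, record):
        while len(self.buckets[self._prefix(address)]) >= self.bucket_capacity:
            if self.prefix_bits == self.address_bits:
                raise RuntimeError("bucket overflow: no address bits left")
            self._grow()
        self.buckets[self._prefix(address)].append((address, record))

    def _grow(self):
        # Use one more address bit and move existing records to their new buckets.
        old = [item for bucket in self.buckets.values() for item in bucket]
        self.prefix_bits += 1
        self.buckets = {p: [] for p in range(2 ** self.prefix_bits)}
        for address, record in old:
            self.buckets[self._prefix(address)].append((address, record))


f = PrefixHashFile()
f.insert(0b100100, "R1")
f.insert(0b010110, "R2")
f.insert(0b110110, "R4")
f.insert(0b100000, "R3")  # hypothetical address; bucket 1 is full, so grow
print(f.prefix_bits)      # 2
```

After the last insert, R2 sits in bucket 01, R1 and R3 share bucket 10, and R4 is in bucket 11. The mapping from prefixes to buckets plays the role of the bucket address table.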

Advantages of Dynamic hashing

Performance does not degrade as the data in the system grows. It simply increases the memory size to
accommodate the data.
Since it grows and shrinks with the data, memory is well utilized. No unused memory is left lying idle.
Good for dynamic databases where data grows and shrinks frequently.

Disadvantages of Dynamic hashing

As the data size increases, the number of buckets also increases. The bucket addresses are maintained
in a bucket address table, because the address of the data keeps changing as buckets grow and shrink.
When there is a huge increase in data, maintaining this bucket address table becomes tedious.
Bucket overflow can occur in this case too, but it may take longer to reach this situation than in
static hashing.

Comparison of Ordered Indexing and Hashing

Ordered Indexing: Addresses in memory are sorted by key value. This key value can be the primary key
or any other column in the table.
Hashing: Addresses are generated using a hash function on the key value. This key value can be the
primary key or any other column in the table.

Ordered Indexing: Performance of this method comes down as the data in the file increases. Since it
stores the data in sorted form, every insert/delete/update operation needs extra effort to sort the
records. This reduces its performance.
Hashing: Performance of dynamic hashing is good when there is frequent addition and deletion of data.
But if the database is very large, maintenance is costlier. Static hashing is good for smaller
databases where the record size is known in advance. If the data grows, it results in serious
problems like bucket overflow.

Ordered Indexing: There will be unused data blocks due to delete/update operations. These data blocks
are not released for reuse, so periodic maintenance of the memory is required. Otherwise memory is
wasted and performance also degrades. Maintaining the memory is also a cost overhead.
Hashing: In both static and dynamic hashing, memory is well managed. Bucket overflow is also handled
to a better extent in static hashing. Data blocks are designed to shrink and grow in dynamic hashing.
But there is an overhead of maintaining the bucket address table in dynamic hashing when the database
grows hugely.

Ordered Indexing: Preferred for range retrieval of data. When data has to be retrieved for a
particular range, this method is best suited.
Hashing: This method is suitable for retrieving a particular record based on the search key. But it
will not perform well if the hash function is not on the search key.
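
The last row of the comparison can be illustrated with a small sketch. The records, the helper names and the in-memory structures are all hypothetical; a sorted Python list stands in for an ordered index and a list of mod(5) buckets stands in for a hash file.

```python
from bisect import bisect_left, bisect_right

# Hypothetical records keyed by ID.
records = {101: "A", 102: "B", 103: "C", 104: "D", 105: "E"}

# Ordered index: keys kept in sorted order.
sorted_keys = sorted(records)

# Hash file: five buckets addressed by ID mod 5.
buckets = [dict() for _ in range(5)]
for key, rec in records.items():
    buckets[key % 5][key] = rec


def range_lookup(low, high):
    """Ordered index: two binary searches, then one contiguous slice."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return [records[k] for k in sorted_keys[start:end]]


def point_lookup(key):
    """Hashing: compute the address and read exactly one bucket."""
    return buckets[key % 5].get(key)


def range_lookup_hashed(low, high):
    """Hashing: neighbouring keys are scattered, so every bucket is scanned."""
    hits = [(k, r) for b in buckets for k, r in b.items() if low <= k <= high]
    return [r for _, r in sorted(hits)]


print(range_lookup(102, 104))  # ['B', 'C', 'D']
print(point_lookup(104))       # D
```

Both range functions return the same records, but the hashed version has to visit all five buckets, which is why ordered indexing is preferred for range retrieval.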
