distributed-systems-pranay

ds material

Uploaded by

Varaha Giri

DISTRIBUTED SYSTEMS

B.TECH. / CSE & IT / R18

SYLLABUS

UNIT–I

Characterization of Distributed Systems: Introduction, Examples of Distributed Systems, Resource Sharing and the Web, Challenges.

System Models: Introduction, Architectural and Fundamental Models, Networking and Internetworking, Interprocess Communication.

Distributed Objects and Remote Invocation: Introduction, Communication between Distributed Objects, RPC, Events and Notifications, Case Study - Java RMI.

UNIT–II

Operating System Support: Introduction, OS Layer, Protection, Processes and Threads, Communication and Invocation, Operating System Architecture, Distributed File Systems - Introduction, File Service Architecture.

UNIT–III

Peer to Peer Systems: Introduction, Napster and its Legacy, Peer to Peer Middleware, Routing Overlays, Overlay Case Studies - Pastry, Tapestry, Application Case Studies - Squirrel, OceanStore.

Time and Global States: Introduction, Clocks, Events and Process States, Synchronizing Physical Clocks, Logical Time and Logical Clocks, Global States, Distributed Debugging.

Coordination and Agreement: Introduction, Distributed Mutual Exclusion, Elections, Multicast Communication, Consensus and Related Problems.

UNIT–IV

Transactions and Concurrency Control: Introduction, Transactions, Nested Transactions, Locks, Optimistic Concurrency Control, Timestamp Ordering.

Distributed Transactions: Introduction, Flat and Nested Distributed Transactions, Atomic Commit Protocols, Concurrency Control in Distributed Transactions, Distributed Deadlocks, Transaction Recovery.

UNIT–V

Replication: Introduction, System Model and Group Communication, Fault-Tolerant Services, Transactions with Replicated Data.

Distributed Shared Memory, Design and Implementation Issues, Consistency Models.

*****

UNIT-I

CHARACTERIZATION OF DISTRIBUTED SYSTEMS

INTRODUCTION

Definition of Distributed System: A distributed system is one in which components located at networked computers communicate and coordinate their actions only by passing messages.

Introduction:

We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages.
This simple definition covers the entire range of systems in which networked computers can usefully be deployed.
Computers that are connected by a network may be spatially separated by any distance.
They may be on separate continents, in the same building or in the same room.

Our definition of distributed systems has the following significant consequences:

1) Concurrency: In a network of computers, concurrent program execution is the norm. I can do my work on my computer while you do your work on yours, sharing resources such as web pages or files when necessary. The capacity of the system to handle shared resources can be increased by adding more resources (for example, computers) to the network.
2) No global clock: When programs need to cooperate they coordinate their actions by exchanging messages. Close coordination often depends on a shared idea of the time at which the programs' actions occur. But it turns out that there are limits to the accuracy with which the computers in a network can synchronize their clocks: there is no single global notion of the correct time.
3) Independent failures: All computer systems can fail, and it is the responsibility of system designers to plan for the consequences of possible failures. Distributed systems can fail in new ways. Faults in the network result in the isolation of the computers that are connected to it, but that doesn't mean that they stop running. In fact, the programs on them may not be able to detect whether the network has failed or has become unusually slow. Similarly, the failure of a computer, or the unexpected termination of a program somewhere in the system (a crash), is not immediately made known to the other components with which it communicates. Each component of the system can fail independently, leaving the others still running.

EXAMPLES OF DISTRIBUTED SYSTEMS

Our examples are based on familiar and widely used computer networks:

1) The Internet,
2) Intranets and
3) The emerging technology of networks based on mobile devices.

1) The Internet:

The Internet is a vast interconnected collection of computer networks of many different types.
Programs running on the computers connected to it interact by passing messages, employing a common means of communication.
The Internet is also a very large distributed system. It enables users, wherever they are, to make use of services such as the World Wide Web, email and file transfer.
Internet Service Providers (ISPs) are companies that provide modem links and other types of connection to individual users and small organizations, enabling them to access services anywhere in the Internet as well as providing local services such as email and web hosting.
Multimedia services are available in the Internet, enabling users to access audio and video data including music, radio and TV channels, and to hold phone and video conferences.

2) Intranets:

An intranet is a portion of the Internet that is separately administered and has a boundary that can be configured to enforce local security policies.
It is composed of several local area networks (LANs) linked by backbone connections.
The network configuration of a particular intranet is the responsibility of the organization that administers it and may vary widely, ranging from a LAN on a single site to a connected set of LANs belonging to branches of a company or other organization in different countries.
An intranet is connected to the Internet via a router, which allows the users inside the intranet to make use of services elsewhere such as the Web or email. It also allows the users in other intranets to access the services it provides.
Many organizations need to protect their own services from unauthorized use by possibly malicious users elsewhere.
For example, a company will not want secure information to be accessible to users in competing organizations, and a hospital will not want sensitive patient data to be revealed.
Companies also want to protect themselves from harmful programs such as viruses entering and attacking the computers in the intranet and possibly destroying valuable data.
The role of a firewall is to protect an intranet by preventing unauthorized messages leaving or entering.
A firewall is implemented by filtering incoming and outgoing messages, for example according to their source or destination.
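
The filtering rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real firewall: the `Message` type, the `INTRANET` address range, the `TRUSTED_EXTERNAL` set and the policy itself are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Message:
    source: str       # sender's IP address
    destination: str  # receiver's IP address


# Hypothetical policy: the intranet occupies the 10.0.0.0/8 range.
INTRANET = ip_network("10.0.0.0/8")
# Hypothetical external hosts that are allowed to send messages inwards.
TRUSTED_EXTERNAL = {ip_address("203.0.113.7")}


def firewall_allows(msg: Message) -> bool:
    """Decide whether a message may cross the intranet boundary."""
    src = ip_address(msg.source)
    dst = ip_address(msg.destination)
    if src in INTRANET:
        return True                      # internal or outgoing message
    if dst in INTRANET:
        return src in TRUSTED_EXTERNAL   # incoming: filter on the source
    return False                         # message does not concern this intranet
```

A production firewall would also filter on ports, protocols and connection state; the sketch only shows the source/destination decision mentioned in the text.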

3) Mobile and ubiquitous computing:

Technological advances in device miniaturization and wireless networking have led increasingly to the integration of small and portable computing devices into distributed systems.
These devices include:
i. Laptop computers.
ii. Handheld devices, including personal digital assistants (PDAs), mobile phones, pagers, video cameras and digital cameras.
iii. Wearable devices, such as smart watches with functionality similar to a PDA.
iv. Devices embedded in appliances such as washing machines, hi-fi systems, cars and refrigerators.

The portability of many of these devices, together with their ability to connect conveniently to networks in different places, makes mobile computing possible.

RESOURCE SHARING

We routinely share hardware resources such as printers, data resources such as files, and resources with more specific functionality such as search engines.
Looked at from the point of view of hardware provision, we share equipment such as printers and disks to reduce costs.
But of far greater significance to users is the sharing of the higher-level resources that play a part in their applications and in their everyday work and social activities.
For example, users are concerned with sharing data in the form of a shared database or a set of web pages, not the disks and processors on which they are implemented.
In practice, patterns of resource sharing vary widely in their scope and in how closely users work together.
At one extreme, a search engine on the Web provides a facility to users throughout the world, users who need never come into contact with one another directly.
We use the term service for a distinct part of a computer system that manages a collection of related resources and presents their functionality to users and applications.
For example, we access shared files through a file service; we send documents to printers through a printing service; we buy goods through an electronic payment service.

The only access we have to the service is via the set of operations that it exports. For example, a file service provides read, write and delete operations on files.
Resources in a distributed system are physically encapsulated within computers and can only be accessed from other computers by means of communication.
For effective sharing, each resource must be managed by a program that offers a communication interface enabling the resource to be accessed and updated reliably and consistently.
The term server is probably familiar to most readers. It refers to a running program (a process) on a networked computer that accepts requests from programs running on other computers to perform a service and responds appropriately.
The requesting processes are referred to as clients, and the overall approach is known as client-server computing.
In this approach, requests are sent in messages from clients to a server and replies are sent in messages from the server to the clients.
When the client sends a request for an operation to be carried out, we say that the client invokes an operation upon the server.
A complete interaction between a client and a server, from the point when the client sends its request to when it receives the server's response, is called a remote invocation.
Clients are active (making requests) and servers are passive (only waking up when they receive requests); servers run continuously, whereas clients last only as long as the applications of which they form a part.
If a distributed system is written in an object-oriented language, resources may be encapsulated as objects and accessed by client objects, in which case we speak of a client object invoking a method upon a server object.
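
The request/reply pattern above can be illustrated with a small in-process sketch. It assumes a hypothetical `FileServer` class exporting the read, write and delete operations mentioned earlier, and a hypothetical `invoke` helper that stands in for sending a request message over a network and waiting for the reply; no real networking is involved.

```python
class FileServer:
    """Hypothetical server that encapsulates a set of files."""

    def __init__(self):
        self._files = {}  # resource state, reachable only via handle()

    def handle(self, request: dict) -> dict:
        """Carry out the operation named in a request message and build the reply."""
        op, name = request["op"], request["name"]
        if op == "write":
            self._files[name] = request["data"]
            return {"status": "ok"}
        if op == "read":
            if name in self._files:
                return {"status": "ok", "data": self._files[name]}
            return {"status": "error", "reason": "not found"}
        if op == "delete":
            if name in self._files:
                del self._files[name]
                return {"status": "ok"}
            return {"status": "error", "reason": "not found"}
        return {"status": "error", "reason": "unknown operation"}


def invoke(server: FileServer, op: str, name: str, data=None) -> dict:
    """Client side of one invocation: build a request, return the server's reply."""
    request = {"op": op, "name": name, "data": data}
    return server.handle(request)  # stands in for a network round trip
```

For instance, `invoke(server, "write", "notes.txt", "hello")` followed by `invoke(server, "read", "notes.txt")` returns a reply carrying the stored data; the client never touches `_files` directly, only the exported operations.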

WEB

The World Wide Web is an evolving system for publishing and accessing resources and services across the Internet.
The Web is based on three main standard technological components:
1) The HyperText Markup Language (HTML) is a language for specifying the contents and layout of pages as they are displayed by web browsers.
2) Uniform Resource Locators (URLs), which identify documents and other resources stored as part of the Web.
3) A client-server system architecture, with standard rules for interaction (the HyperText Transfer Protocol - HTTP) by which browsers and other clients fetch documents and other resources from web servers.

1) HyperText Markup Language (HTML):

The HyperText Markup Language is used to specify the text and images that make up the contents of a web page, and to specify how they are laid out and formatted for presentation to the user.
A web page contains such structured items as headings, paragraphs, tables and images.
HTML is also used to specify links and which resources are associated with them.

Example:

<html>
<head>
<title>My Web Page</title>
</head>
<body bgcolor="yellow">
<h1 align="center">My Favourite Websites</h1>
<center><img src="picture.jpg" width="200" height="200"/></center>
<a href="http://www.google.co.in/">Google</a><br/>
<a href="http://www.yahoomail.com/">Yahoo mail</a>
</body>
</html>

This HTML text is stored in a file that a web server can access; let us say the file sample.html.

2) Uniform Resource Locators (URLs):

The purpose of a Uniform Resource Locator is to identify a resource.
Browsers examine URLs in order to access the corresponding resources.
Sometimes the user types a URL into the browser.
More commonly, the browser looks up the corresponding URL when the user clicks on a link or selects one of their 'bookmarks', or when the browser fetches a resource embedded in a web page, such as an image.
HTTP URLs are the most widely used, for accessing resources using the standard HTTP protocol.
An HTTP URL has two main jobs to do:
i. To identify which web server maintains the resource, and
ii. To identify which of the resources at that server is required.
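
The two jobs of an HTTP URL can be made concrete with Python's standard `urllib.parse` module. The helper name `split_http_url` and the example URLs are illustrative assumptions; the sketch simply separates the part that names the server from the part that names the resource at that server.

```python
from urllib.parse import urlsplit

# Well-known default ports used when the URL does not give one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_http_url(url: str):
    """Return ((host, port), resource) for an HTTP or HTTPS URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"not an HTTP URL: {url!r}")
    # Job i: which web server maintains the resource.
    server = (parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme])
    # Job ii: which resource at that server is required.
    resource = parts.path or "/"
    if parts.query:
        resource += "?" + parts.query
    return server, resource
```

With this helper, `split_http_url("http://www.google.co.in/search?q=ds")` yields the server `("www.google.co.in", 80)` and the resource `"/search?q=ds"`.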

3) HyperText Transfer Protocol (HTTP):

The HyperText Transfer Protocol defines the ways in which browsers and other types of client interact with web servers.

Features:

i. Request-reply interactions: HTTP is a 'request-reply' protocol. The client sends a request message to the server containing the URL of the required resource. The server looks up the pathname and, if it exists, sends back the file's contents in a reply message to the client. Otherwise, it sends back an error response such as the familiar '404 Not Found'.
ii. Content types: Browsers are not necessarily capable of handling every type of content. When a browser makes a request, it includes a list of the types of content it prefers; for example, in principle it may be able to display images in 'GIF' format but not 'JPEG' format.
iii. One resource per request: Clients specify one resource per HTTP request. If a web page contains nine images, say, then the browser will issue a total of ten separate requests to obtain the entire contents of the page. Browsers typically make several requests concurrently, to reduce the overall delay to the user.
iv. Simple access control: By default, any user with network connectivity to a web server can access any of its published resources. If users wish to restrict access to a resource, then they can configure the server to issue a 'challenge' to any client that requests it. The corresponding user then has to prove that they have the right to access the resource, for example by typing in a password.
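
The "one resource per request" count can be checked with a short sketch using Python's standard `html.parser` module. The names `ImageCounter` and `requests_needed` are hypothetical, and the sketch assumes that every image is distinct, that nothing is cached, and that images are the only embedded resources.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Collects the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)


def requests_needed(page_html: str) -> int:
    """One request for the page itself plus one per embedded image."""
    counter = ImageCounter()
    counter.feed(page_html)
    return 1 + len(counter.images)
```

A page with nine `<img>` tags therefore needs ten requests, matching the figure given in the text.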

CHALLENGES

The following are the various challenges of distributed systems.

1) Heterogeneity:

The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks.
Heterogeneity (that is, variety and difference) applies to all of the following:
i. networks;
ii. computer hardware;
iii. operating systems;
iv. programming languages;
v. implementations by different developers.

Use of Middleware:

The term middleware applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages.
The Common Object Request Broker Architecture (CORBA) is an example. Some middleware, such as Java Remote Method Invocation (RMI), supports only a single programming language.
Most middleware is implemented over the Internet protocols, which themselves mask the differences of the underlying networks, but all middleware deals with the differences in operating systems and hardware.
In addition to solving the problems of heterogeneity, middleware provides a uniform computational model for use by the programmers of servers and distributed applications. Possible models include remote object invocation, remote event notification, remote SQL access and distributed transaction processing.
For example, CORBA provides remote object invocation, which allows an object in a program running on one computer to invoke a method of an object in a program running on another computer.

Heterogeneity and mobile code:

The term mobile code is used to refer to program code that can be transferred from one computer to another and run at the destination; Java applets are an example.
Code suitable for running on one computer is not necessarily suitable for running on another because executable programs are normally specific both to the instruction set and to the host operating system.
The virtual machine approach provides a way of making code executable on a variety of host computers: the compiler for a particular language generates code for a virtual machine instead of a particular hardware order code. For example, the Java compiler produces code for a Java virtual machine, which executes it by interpretation.
The Java virtual machine needs to be implemented once for each type of computer to enable Java programs to run.

2) Openness:

The openness of a computer system is the characteristic that determines whether the system can be extended and reimplemented in various ways.
The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and be made available for use by a variety of client programs.
Openness cannot be achieved unless the specification and documentation of the key software interfaces of the components of a system are made available to software developers.

3) Security:

Many of the information resources that are made available and maintained in distributed systems are very important to their users.
Their security is therefore of considerable importance.
Security for information resources has three components:
i. Confidentiality: protection against disclosure to unauthorized individuals.
ii. Integrity: protection against alteration or corruption.
iii. Availability: protection against interference with the means to access the resources.

In a distributed system, clients send requests to access data managed by servers, which involves sending information in messages over a network.

For example:

i. A doctor might request access to hospital patient data or send additions to that data.
ii. In electronic commerce and banking, users send their credit card numbers across the Internet.

In both examples, the challenge is to send sensitive information in a message over a network in a secure manner.

The second challenge here is to identify a remote user or other agent correctly. Both of these challenges can be met by the use of encryption techniques developed for this purpose. However, the following two security challenges have not yet been fully met:

Denial of service attacks: Another security problem is that a user may wish to disrupt a service for some reason. This can be achieved by bombarding the service with such a large number of pointless requests that the serious users are unable to use it. This is called a denial of service attack.

Security of mobile code: Mobile code needs to be handled with care. Consider someone who receives an executable program as an electronic mail attachment: the possible effects of running the program are unpredictable; for example, it may seem to display an interesting picture but in reality it may access local resources.

4) Scalability:

Distributed systems operate effectively and efficiently at many different scales, ranging from a small intranet to the Internet.
A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users.
The number of computers and servers in the Internet has increased dramatically.
Table 1.1 shows the increasing number of computers and web servers over time.

Table 1.1: Growth of the Internet (computers and web servers)

5) Failure handling:

Computer systems sometimes fail.
When faults occur in hardware or software, programs may produce incorrect results or may stop before they have completed the intended computation.
Failures in a distributed system are partial: that is, some components fail while others continue to function.
Therefore the handling of failures is particularly difficult.

The following are techniques for dealing with failures:

Detecting failures: Some failures can be detected. For example, checksums can be used to detect corrupted data in a message or a file.
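
As an illustration of checksum-based detection, the sketch below appends a CRC-32 checksum (from Python's standard `zlib` module) to a message and verifies it on receipt. The choice of CRC-32, the 4-byte trailer layout and the function names are assumptions made for the example; the notes do not prescribe a particular checksum.

```python
import zlib

CHECKSUM_BYTES = 4  # a CRC-32 value fits in four bytes


def attach_checksum(payload: bytes) -> bytes:
    """Sender side: append the CRC-32 of the payload to the message."""
    return payload + zlib.crc32(payload).to_bytes(CHECKSUM_BYTES, "big")


def is_intact(message: bytes) -> bool:
    """Receiver side: recompute the CRC-32 and compare with the trailer."""
    if len(message) < CHECKSUM_BYTES:
        return False
    payload, received = message[:-CHECKSUM_BYTES], message[-CHECKSUM_BYTES:]
    return zlib.crc32(payload).to_bytes(CHECKSUM_BYTES, "big") == received
```

A checksum of this kind detects accidental corruption; it offers no protection against deliberate tampering, since an attacker can recompute it.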

Masking failures: Some failures that have been detected can be hidden or made less severe. Two examples of hiding failures:

i. Messages can be retransmitted when they fail to arrive.
ii. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.
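
The first masking technique, retransmission, might be sketched as follows. The `send_once` callable is a hypothetical stand-in for one transmission attempt that returns True when an acknowledgement arrives, and `make_lossy_channel` is a made-up test double that drops a fixed number of messages; neither corresponds to a real network API.

```python
def send_with_retransmission(send_once, message, max_attempts=3):
    """Retransmit until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if send_once(message):  # True means an acknowledgement was received
            return attempt
    # The failure could not be masked within the allowed attempts.
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


def make_lossy_channel(losses: int):
    """Hypothetical channel that loses the first `losses` messages."""
    state = {"remaining": losses}

    def send_once(message) -> bool:
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return False  # message (or its acknowledgement) was lost
        return True

    return send_once
```

With a channel that loses two messages, the caller sees a successful send on the third attempt and never observes the losses; if the losses exceed `max_attempts`, the failure is no longer masked and must be tolerated or reported.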

Tolerating failures: Most of the services in the Internet do exhibit failures; it would not be practical for them to attempt to detect and hide all of the failures that might occur in such a large network with so many components. Their clients can be designed to tolerate failures, which generally involves the users tolerating them as well.

For example, when a web browser cannot contact a web server, it does not make the user wait forever while it keeps on trying; it informs the user about the problem, leaving them free to try again later.

Recovery from failures: Recovery involves the design of software so that the state of permanent data can be recovered or “rolled back” after a server has crashed.

6) Concurrency:

Both services and applications provide resources that can be shared by clients in a distributed system.
There is therefore a possibility that several clients will attempt to access a shared resource at the same time.
The process that manages a shared resource could take one client request at a time.
But that approach limits throughput. Therefore services and applications generally allow multiple client requests to be processed concurrently.
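
When requests are processed concurrently, updates to the shared resource have to be kept consistent. The notes do not name a mechanism, so the sketch below assumes one common choice, a mutual-exclusion lock from Python's standard `threading` module; `SharedCounter` and `serve_concurrently` are hypothetical names.

```python
import threading


class SharedCounter:
    """A shared resource whose updates must not interleave."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one client request updates at a time
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def serve_concurrently(counter, clients=8, requests_per_client=1000):
    """Run several simulated clients in parallel threads."""
    def client_requests():
        for _ in range(requests_per_client):
            counter.increment()

    threads = [threading.Thread(target=client_requests) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Because every increment happens under the lock, the final value equals the total number of requests regardless of how the threads are scheduled.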

7) Transparency:

Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components.
The two most important transparencies are access and location transparency; their presence or absence most strongly affects the utilization of distributed resources.
They are sometimes referred to together as network transparency.

SYSTEM MODELS

INTRODUCTION

Systems that are intended for use in real-world environments should be designed to function correctly in the widest possible range of circumstances and in the face of many possible difficulties and threats.
There is no global time in a distributed system, so the clocks on different computers do not necessarily give the same time as one another.
All communication between processes is achieved by means of messages. Message communication over a computer network can be affected by delays, can suffer from a variety of failures and is vulnerable to security attacks.

These issues are addressed by three models:

1) The interaction model deals with performance and with the difficulty of setting time limits in a distributed system, for example for message delivery.
2) The failure model attempts to give a precise specification of the faults that can be exhibited by processes and communication channels. It defines reliable communication and correct processes.
3) The security model discusses the possible threats to processes and communication channels.

ARCHITECTURAL MODELS

“An architectural model of a distributed system is concerned with the placement of its parts and the relationships between them.”
Examples include the client-server model and the peer-to-peer model.
An architectural model of a distributed system first simplifies and abstracts the functions of the individual components of a distributed system and then it considers:
⮡ The placement of the components across a network of computers, seeking to define useful patterns for the distribution of data and workload.
⮡ The interrelationships between the components, that is, their functional roles and the patterns of communication between them.

Software layers: The term software architecture referred originally to the structuring of software as layers or modules in a single computer or group of computers.

Platform: The lowest-level hardware and software layers are often referred to as a platform for distributed systems and applications. These low-level layers provide services to the layers above them.

Middleware: It is a layer of software whose purpose is to mask heterogeneity and to provide a convenient programming model to application programmers. Examples: CORBA, Java RMI.

System architectures: The division of responsibilities between system components (applications, servers and other processes) and the placement of the components on computers in the network is perhaps the most evident aspect of distributed system design.

The two main types of architectural model are:

Client-server: This is the architecture that is most often cited when distributed systems are discussed. It is historically the most important and remains the most widely employed. Figure 2.2 illustrates the simple structure in which client processes interact with individual server processes in separate host computers in order to access the shared resources that they manage.

Figure 2.2

Peer-to-peer: In this architecture all of the processes involved in a task or activity play similar roles, interacting cooperatively as peers without any distinction between client and server processes or the computers that they run on. While the client-server model offers a direct and relatively simple approach to the sharing of data and other resources, it scales poorly.

Figure 2.3 A distributed application based on the peer-to-peer architecture

Figure 2.3 illustrates the form of a peer-to-peer application. Applications are composed of large numbers of peer processes running on separate computers and the pattern of communication between them depends entirely on application requirements.

Variations: Several variations on the above models can be derived from the consideration of the following factors:

Services provided by multiple servers: Services may be implemented as several server processes in separate host computers interacting as necessary to provide a service to client processes (Figure 2.4). The servers may partition the set of objects on which the service is based and distribute them between themselves, or they may maintain replicated copies of them on several hosts.

Figure 2.4

Proxy servers and caches: A cache is a store of recently used data objects that is closer than the objects themselves. When a new object is received at a computer it is added to the cache store, replacing some existing objects if necessary. When an object is needed by a client process the caching service first checks the cache and supplies the object from there if an up-to-date copy is available. If not, an up-to-date copy is fetched.
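
The caching behaviour described above can be sketched as a small class. The notes do not say which existing objects are replaced, so this example assumes a least-recently-used policy; `ObjectCache` and the `fetch` callable (which stands in for retrieving the object from its origin server) are hypothetical, and checking whether a cached copy is still up to date is deliberately left out.

```python
from collections import OrderedDict


class ObjectCache:
    """Keeps recently used objects; evicts the least recently used when full."""

    def __init__(self, fetch, capacity=2):
        self._fetch = fetch          # called on a miss to get the object
        self._capacity = capacity
        self._store = OrderedDict()  # oldest entry first

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)     # mark as recently used
            return self._store[key]          # served from the cache
        value = self._fetch(key)             # miss: go to the origin
        self._store[key] = value
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # replace an existing object
        return value
```

Repeated requests for the same object reach the origin only once, until the object is evicted to make room for others.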

Figure 2.5

Mobile code: Applets are a well-known and widely used example of mobile code: the user running a browser selects a link to an applet whose code is stored on a web server; the code is downloaded to the browser and runs there, as shown in the accompanying figure.

An advantage of running the downloaded code locally is that it can give good interactive response since it does not suffer from the delays or variability of bandwidth associated with network communication.

Mobile agents: A mobile agent is a running program (including both code and data) that travels from one computer to another in a network carrying out a task on someone's behalf, such as collecting information, eventually returning with the results.

Design requirements for distributed architectures: The factors motivating the distribution of objects and processes in a distributed system are numerous and their significance is considerable.

i. Performance issues: Performance issues arising from the limited processing and communication capacities of computers and networks are considered under the following subheadings: responsiveness and throughput.
ii. Quality of service: Once users are provided with the functionality that they require of a service, such as the file service in a distributed system, we can go on to ask about the quality of the service provided. The main non-functional properties of systems that affect the quality of the service experienced by clients and users are reliability, security and performance.
iii. Use of caching and replication: Caches and web proxy servers were introduced above without discussing how cached copies of resources can be kept up to date when the resource at a server is updated. A browser or proxy can validate a cached response by checking with the original web server to see whether it is still up to date. If it fails the test, the web server returns a fresh response, which is cached instead of the stale response.
iv. Dependability issues: Dependability is a requirement in most application domains. We define the dependability of computer systems as correctness, security and fault tolerance. Dependable applications should continue to function correctly in the presence of faults in hardware, software and networks.

FUNDAMENTAL MODELS

The aspects of distributed systems that we wish to capture in our fundamental models are intended to help us to discuss and reason about:

Interaction: Computation occurs within processes; the processes interact by passing messages, resulting in communication (i.e., information flow) and coordination (synchronization and ordering of activities) between processes.

Failure: The correct operation of a distributed system is threatened whenever a fault occurs in any of the computers on which it runs.

Security: The modular nature of distributed systems and their openness exposes them to attack by both external and internal agents.

Interaction model: The discussion of system architectures indicates that distributed systems are composed of many processes, interacting in complex ways. For example:

Multiple server processes may cooperate with one another to provide a service.
A set of peer processes may cooperate with one another to achieve a common goal.

Interacting processes perform all of the activity in a distributed system. Each process has its own state, consisting of the set of data that it can access and update, including the variables in its program. The state belonging to each process is completely private; that is, it cannot be accessed or updated by any other process.

In this section, we discuss two significant factors affecting interacting processes in a distributed system:

1) Communication performance is often a limiting characteristic;
2) It is impossible to maintain a single global notion of time.

Performance of communication channels: The communication channels in our model are realized in a variety of ways in distributed systems, for example by an implementation of streams or by simple message passing over a computer network.

Communication over a computer network has the following performance characteristics relating to latency, bandwidth and jitter:

1) Latency: The delay between the start of a message's transmission from one process and the beginning of its receipt by another is referred to as latency.
2) Bandwidth: The bandwidth of a computer network is the total amount of information that can be transmitted over it in a given time. When a large number of communication channels are using the same network, they have to share the available bandwidth.
3) Jitter: Jitter is the variation in the time taken to deliver a series of messages.
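
A rough worked example may help relate the three quantities. The sketch assumes a simple first-order model in which the time to deliver a message is the latency plus the message size divided by the bandwidth, and it measures jitter as the spread (maximum minus minimum) of a series of delivery times. Both the model and the choice of spread as the jitter measure are assumptions for illustration, as are the function names and numbers.

```python
def transfer_time(latency_s: float, size_bits: float, bandwidth_bps: float) -> float:
    """Estimated delivery time: fixed delay plus time to push the bits through."""
    return latency_s + size_bits / bandwidth_bps


def jitter(delivery_times_s) -> float:
    """Variation in delivery time, measured here as max minus min."""
    return max(delivery_times_s) - min(delivery_times_s)


# A 1 MB message (8,000,000 bits) over a 100 Mbit/s link with 5 ms latency:
estimate = transfer_time(0.005, 8_000_000, 100_000_000)  # 0.005 + 0.08 seconds
```

The example shows why bandwidth dominates for large messages (80 ms of the estimate) while latency dominates for small ones.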

Computer clocks and timing events: Each computer in a distributed system has its own internal clock, which can be used by local processes to obtain the value of the current time. Therefore two processes running on different computers can associate timestamps with their events.

Failure model: In a distributed system both processes and communication channels may fail; that is, they may depart from what is considered to be correct or desirable behavior. These are presented under the headings omission failures, arbitrary failures and timing failures.

1) Omission failures: The faults classified as omission failures refer to cases when a process or communication channel fails to perform actions that it is supposed to do.

i. Process omission failures: The chief omission failure of a process is to crash. When we say that a process has crashed we mean that it has halted and will not execute any further steps of its program ever.
ii. Timeouts: A timeout is a method in which one process allows a fixed period of time for something to occur. In an asynchronous system a timeout can indicate only that a process is not responding: it may have crashed or may be slow, or the messages may not have arrived.

2) Arbitrary failures:

The term arbitrary failure is used to describe the worst possible failure semantics, in which any type of error may occur. For example, a process may set wrong values in its data items, or it may return a wrong value in response to an invocation.
An arbitrary failure of a process is one in which it arbitrarily omits intended processing steps or takes unintended processing steps. Therefore arbitrary failures in processes cannot be detected by seeing whether the process responds to invocations, because it might arbitrarily omit to reply.

3) Timing failures:

Timing failures are applicable in synchronous distributed systems where time limits are set on process execution time, message delivery time and clock drift rate.
In an asynchronous distributed system, an overloaded server may respond too slowly, but we cannot say that it has a timing failure since no guarantee has been offered.

Security model: We identified the sharing of resources as a motivating factor for distributed systems. The security of a distributed system can be achieved by securing the processes and the channels used for their interactions and by protecting the objects that they encapsulate against unauthorized access. Protection is described in terms of objects, although the concepts apply equally well to resources of all types.

Protecting objects: Figure 2.12 shows a server that manages a collection of objects on behalf of some users.

Figure 2.12

The users can run client programs that send invocations to the server to perform operations on the objects.
The server carries out the operation specified in each invocation and sends the result to the client.
The server is responsible for verifying the identity of the principal behind each invocation and checking that they have sufficient access rights to perform the requested operation on the particular object invoked, rejecting those that do not.
The client may check the identity of the principal behind the server to ensure that the result comes from the required server.

Securing processes and their interactions:

Processes interact by sending messages.
The messages are exposed to attack because the network and the communication service that they use are open, to enable any pair of processes to interact.

The enemy: To model security threats, we postulate an enemy (sometimes also known as the adversary) that is capable of sending any message to any process and reading or copying any message between a pair of processes, as shown in Figure 2.13.

Figure 2.13 - The enemy

i. Servers: Since a server can receive invocations from many different clients, it cannot necessarily determine the identity of the principal behind any particular invocation. Even if a server requires the inclusion of the principal's identity in each invocation, an enemy might generate an invocation with a false identity. Without reliable knowledge of the sender's identity, a server cannot tell whether to perform the operation or to reject it.
ii. Clients: When a client receives the result of an invocation from a server, it cannot necessarily tell whether the source of the result message is from the intended server or from an enemy, perhaps 'spoofing' the mail server.

Cryptography is the science of keeping messages secure, and encryption is the process of scrambling a message in such a way as to hide its contents. Modern cryptography is based on encryption algorithms that use secret keys (large numbers that are difficult to guess) to transform data in a manner that can only be reversed with knowledge of the corresponding decryption key.
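The idea that a transformation can only be reversed with the key can be illustrated with a small sketch. The construction below (a SHA-256-derived keystream XORed with the message) and the names keystream, encrypt and decrypt are assumptions chosen for illustration; it is a toy and must not be used to protect real data.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the secret key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the key together with a block counter to get the next 32 bytes.
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, message: bytes) -> bytes:
    """Scramble the message by XORing it with the key-derived stream."""
    stream = keystream(key, len(message))
    return bytes(m ^ s for m, s in zip(message, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same stream undoes the scrambling, so the same key is needed."""
    return encrypt(key, ciphertext)
```

A receiver holding the same key recovers the message with decrypt(key, ciphertext); a process without the key sees only the scrambled bytes.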

NETWORKING AND INTERNETWORKING

Distributed systems use local area networks, wide area networks and internetworks for communication.

Introduction: The networks used in distributed systems are built from a variety of transmission media, including wire, cable, fibre and wireless channels; hardware devices, including routers, switches, bridges, hubs, repeaters and network interfaces; and software components.

Networking issues for distributed systems:

1) Performance: The network performance parameters that are of primary interest for our purposes are those affecting the speed with which individual messages can be transferred between two interconnected computers. These are the latency and the point-to-point data transfer rate.
2) Scalability: The potential future size of the Internet is commensurate with the population of the planet. It is realistic to expect it to include several billion nodes and hundreds of millions of active hosts.
3) Reliability: The reliability of most physical transmission media is very high. When errors occur they are usually due to failures in the software at the sender or receiver (for example, failure by the receiving computer to accept a packet) or buffer overflow rather than errors in the network.
4) Security: The first level of defense adopted by most organizations is to protect their networks and the computers attached to them with a firewall. A firewall creates a protection boundary between the organization's intranet and the rest of the Internet.
5) Mobility: Mobile devices such as laptop computers, PDAs and Internet-capable mobile phones are moved frequently between locations and reconnected at convenient network connection points or even used while on the move. The Internet mechanisms have been adapted and extended to support mobility, but the expected future growth in the use of mobile devices will demand further development.
6) Quality of service: Quality of service is the ability to meet deadlines when transmitting and processing streams of real-time multimedia data.
7) Multicasting: One-to-many communication can be simulated by sends to several destinations, but that is more costly than necessary and may not exhibit the fault-tolerance characteristics required by applications. For these reasons, many network technologies support the simultaneous transmission of messages to several recipients.
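The two performance parameters named in item 1 are commonly combined into a first-order estimate: message transmission time = latency + length / data transfer rate. The sketch below only evaluates that estimate. The function name and the sample figures are assumptions for illustration, not measurements.

```python
def message_transmission_time(latency_s: float, length_bits: float,
                              data_rate_bps: float) -> float:
    """Estimate the time to deliver one message between two computers.

    latency_s     -- delay before the first bit arrives, in seconds
    length_bits   -- size of the message, in bits
    data_rate_bps -- point-to-point data transfer rate, in bits per second
    """
    return latency_s + length_bits / data_rate_bps


# Hypothetical link: 5 ms latency, 1 Mbit/s rate, 1000-byte (8000-bit) message.
estimate = message_transmission_time(0.005, 8000, 1_000_000)
print(f"{estimate * 1000:.1f} ms")
```

For small messages the latency term dominates, which is why many short invocations can be slow even on a high-bandwidth network.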

Types of networks: Here we introduce the main types of network that are used to support distributed systems: personal area networks, local area networks, wide area networks, metropolitan area networks and the wireless variants of them. Internetworks such as the Internet are constructed from networks of all these types.

1) Personal area networks (PANs): PANs are a sub-category of local networks in which the various digital devices carried by a user are connected by a low-cost, low-energy network. Wired PANs are not of much significance because few users wish to be encumbered by a network of wires on their person, but wireless personal area networks (WPANs) are of increasing importance due to the number of personal devices such as mobile phones, PDAs, digital cameras, music players and so on that are now carried by many people.
2) Local area networks (LANs): LANs carry messages at relatively high speeds between computers connected by a single communication medium, such as twisted copper wire, coaxial cable or optical fibre. A segment is a section of cable that serves a department or a floor of a building and may have many computers attached.
3) Metropolitan area networks (MANs): This type of network is based on the high-bandwidth copper and fibre optic cabling recently installed in some towns and cities for the transmission of video, voice and other data over distances of up to 50 kilometers.
4) Wide area networks (WANs): WANs carry messages at lower speeds between nodes that are often in different organizations and may be separated by large distances. They may be located in different cities, countries or continents. The communication medium is a set of communication circuits linking a set of dedicated computers called routers.
5) Wireless local area networks (WLANs): WLANs are designed for use in place of wired LANs to provide connectivity for mobile devices or simply to remove the need for a wired infrastructure to connect computers within homes and office buildings to each other and the Internet.
6) Wireless metropolitan area networks (WMANs): The IEEE 802.16 WiMAX standard is targeted at this class of network.
7) Wireless wide area networks (WWANs): Most mobile phone networks are based on digital wireless network technologies such as the GSM (Global System for Mobile communication) standard, which is used in most countries of the world. Mobile phone networks are designed to operate over wide areas (typically entire countries or continents) through the use of cellular radio connections.

Internetworks: An internetwork is a communication subsystem in which several networks are linked together to provide common data communication facilities that overlay the technologies and protocols of the individual component networks and the methods used for their interconnection.

INTERPROCESS COMMUNICATION

Remote method invocation allows an object to invoke a method in an object in a remote process.
Examples of systems for remote invocation are CORBA and Java RMI.
In a similar way, a remote procedure call allows a client to call a procedure in a remote server.
Message-passing operations can be used to construct protocols to support particular process roles and communication patterns, for example remote method invocations.

The characteristics of interprocess communication: Message passing between a pair of processes can be supported by two message communication operations: send and receive, defined in terms of destinations and messages.

In order for one process to communicate with another, one process sends a message (a sequence of bytes) to a destination and another process at the destination receives the message.
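The send and receive operations can be sketched with UDP datagram sockets from Python's standard library. This is a minimal illustration under stated assumptions: the destination is taken to be a loopback (address, port) pair, both endpoints live in one process for brevity, and the helper names send, receive and demo are hypothetical.

```python
import socket


def send(sock: socket.socket, destination: tuple, message: bytes) -> None:
    """Send a message (a sequence of bytes) to a destination (host, port)."""
    sock.sendto(message, destination)


def receive(sock: socket.socket, max_bytes: int = 1024) -> bytes:
    """Receive the next message that arrives at this socket's destination."""
    message, _sender = sock.recvfrom(max_bytes)
    return message


def demo() -> bytes:
    # The receiving endpoint binds to a port chosen by the OS (port 0).
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(2.0)  # do not block forever if the datagram is lost
    destination = receiver.getsockname()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        send(sender, destination, b"hello")
        return receive(receiver)
    finally:
        sender.close()
        receiver.close()
```

In a real system the sender and receiver would be separate processes, usually on different computers, and the destination would be published to the sender in some way.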

Client-server communication: This form of communication is designed to support the roles and message exchanges in typical client-server interactions. In the normal case, request-reply communication is synchronous because the client process blocks until the reply arrives from the server. It can also be reliable because the reply from the server is effectively an acknowledgement to the client.

The request-reply protocol is based on a trio of communication primitives: doOperation, getRequest and sendReply, as shown in the above figure.

The doOperation method is used by clients to invoke remote operations. getRequest is used by a server process to acquire service requests. sendReply is used by the server to send the reply message to the client.
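The roles of the three primitives can be sketched with in-memory queues standing in for the network. The Channel class, the snake_case function names and the "add" operation are assumptions made for this sketch; a real implementation would exchange messages over UDP or TCP and handle lost messages.

```python
import queue
import threading


class Channel:
    """In-memory stand-in for the network between one client and one server."""

    def __init__(self):
        self.requests = queue.Queue()
        self.replies = queue.Queue()


def do_operation(channel, operation_id, arguments):
    """Client side: send a request, then block until the reply arrives."""
    channel.requests.put((operation_id, arguments))
    return channel.replies.get(timeout=2)


def get_request(channel):
    """Server side: acquire the next service request."""
    return channel.requests.get(timeout=2)


def send_reply(channel, reply):
    """Server side: send the reply message back to the client."""
    channel.replies.put(reply)


def serve_once(channel, operations):
    """Handle exactly one request using a table of available operations."""
    operation_id, arguments = get_request(channel)
    send_reply(channel, operations[operation_id](*arguments))


def demo():
    channel = Channel()
    operations = {"add": lambda x, y: x + y}
    server = threading.Thread(target=serve_once, args=(channel, operations))
    server.start()
    result = do_operation(channel, "add", (2, 3))  # client blocks here
    server.join()
    return result
```

The blocking get in do_operation is what makes the exchange synchronous from the client's point of view.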

RPC exchange protocols:

1) the request (R) protocol;
2) the request-reply (RR) protocol;
3) the request-reply-acknowledge reply (RRA) protocol.

Group communication: The pairwise exchange of messages is not the best model for communication from one process to a group of other processes, as for example when a service is implemented as a number of different processes in different computers.

DISTRIBUTED OBJECTS AND REMOTE INVOCATION

INTRODUCTION

This chapter is concerned with programming models for distributed applications - that is, those applications that are composed of cooperating programs running in several different processes. Such programs need to be able to invoke operations in other processes, often running in different computers.

The following models are developed based on the above concept.

The earliest and perhaps the best-known of these was the extension of the conventional procedure call model to the remote procedure call model, which allows client programs to call procedures in server programs running in separate processes and generally in different computers from the client.
In the 1990s, the object-based programming model was extended to allow objects in different processes to communicate with one another by means of remote method invocation (RMI). RMI is an extension of local method invocation that allows an object living in one process to invoke the methods of an object living in another process.
The event-based programming model allows objects to receive notification of the events at other objects in which they have registered interest.

COMMUNICATION BETWEEN DISTRIBUTED OBJECTS

The object-based model for a distributed system extends the model supported by object-oriented programming languages to make it apply to distributed objects.

The object model: A brief review of the relevant aspects of the object model, suitable for the reader with a basic knowledge of an object-oriented programming language, for example Java or C++.

Distributed objects: A presentation of object-based distributed systems, which argues that the object model is very appropriate for distributed systems.

The distributed object model: A discussion of the extensions to the object model necessary for it to support distributed objects.

Design issues: A set of arguments about the design alternatives.

Implementation: An explanation as to how a layer of middleware above the request-reply protocol may be designed to support RMI between application-level distributed objects.

Distributed garbage collection: A presentation of an algorithm for distributed garbage collection that is suitable for use with the RMI implementation.

The object model: An object-oriented program, for example in Java or C++, consists of a collection of interacting objects, each of which consists of a set of data and a set of methods. An object communicates with other objects by invoking their methods, generally passing arguments and receiving results.

Objects can encapsulate their data and the code of their methods. Some languages, for example Java and C++, allow programmers to define objects whose instance variables can be accessed directly. But for use in a distributed object system, an object's data should be accessible only via its methods.

Distributed objects: Distributed object systems may adopt the client-server architecture. In this case, objects are managed by servers and their clients invoke their methods using remote method invocation. In RMI, the client's request to invoke a method of an object is sent in a message to the server managing the object. The invocation is carried out by executing a method of the object at the server and the result is returned to the client in another message. To allow for chains of related invocations, objects in servers are allowed to become clients of objects in other servers.

The distributed object model: Each process contains a collection of objects, some of which can receive both local and remote invocations, whereas the other objects can receive only local invocations, as shown in the figure below.

Method invocations between objects in different processes, whether in the same computer or not, are known as remote method invocations. Method invocations between objects in the same process are local method invocations. We refer to objects that can receive remote invocations as remote objects. In the figure below, the objects B and F are remote objects.

Design issues for RMI: The previous section suggested that RMI is a natural extension of local method invocation. In this section, we discuss two design issues that arise in making this extension:

1) The choice of invocation semantics - although local invocations are executed exactly once, this cannot always be the case for remote method invocations.
2) The level of transparency that is desirable for RMI.

Implementation of RMI:

Several separate objects and modules are involved in achieving a remote method invocation. These are shown in the above figure, in which an application-level object A invokes a method in a remote application-level object B for which it holds a remote object reference.

Distributed garbage collection: The aim of a distributed garbage collector is to ensure that if a local or remote reference to an object is still held anywhere in a set of distributed objects, then the object itself will continue to exist, but as soon as no object any longer holds a reference to it, the object will be collected and the memory it uses recovered.
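One way to meet this aim is for the server to keep, for each remote object, the set of processes that currently hold a reference to it. The sketch below assumes exactly that and nothing more: the class RemoteObjectTable and its methods are hypothetical, only remote holders are tracked (local references are left to the local collector), and message loss or process failure is ignored.

```python
class RemoteObjectTable:
    """Tracks which client processes hold references to each remote object."""

    def __init__(self):
        self._objects = {}   # object id -> the object itself
        self._holders = {}   # object id -> set of processes holding a reference

    def export(self, object_id, obj):
        """Make an object available for remote invocation."""
        self._objects[object_id] = obj
        self._holders[object_id] = set()

    def add_ref(self, object_id, holder):
        """A process has obtained a remote reference to the object."""
        self._holders[object_id].add(holder)

    def remove_ref(self, object_id, holder):
        """A process has dropped its reference; collect the object if none remain."""
        holders = self._holders[object_id]
        holders.discard(holder)
        if not holders:
            del self._objects[object_id]
            del self._holders[object_id]

    def is_alive(self, object_id):
        return object_id in self._objects
```

The object survives while at least one holder remains and is dropped from the table as soon as the last holder removes its reference.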

REMOTE PROCEDURE CALL (RPC)

A remote procedure call is very similar to a remote method invocation in that a client program calls a procedure in another program running in a server process.
Servers may be clients of other servers to allow chains of RPCs.
As mentioned in the introduction to this chapter, a server process defines in its service interface the procedures that are available for calling remotely.
In effect, this sort of service is rather like a single remote object in that it has state and methods.
However, it lacks the ability to create new instances of objects and therefore does not support remote object references.

As shown in the above figure, the client that accesses a service includes one stub procedure for each procedure in the service interface. The role of a stub procedure is similar to that of a proxy method.
It behaves like a local procedure to the client, but instead of executing the call, it marshals the procedure identifier and the arguments into a request message, which it sends via its communication module to the server.
When the reply message arrives, it unmarshals the results. The server process contains a dispatcher together with one server stub procedure and one service procedure for each procedure in the service interface.
The dispatcher selects one of the server stub procedures according to the procedure identifier in the request message.
A server stub procedure is like a skeleton method in that it unmarshals the arguments in the request message, calls the corresponding service procedure and marshals the return values for the reply message.
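The division of work between client stub, dispatcher, server stub and service procedure can be sketched as follows. JSON is used for marshalling and a direct function call stands in for the communication module; both are assumptions made to keep the sketch self-contained, and the procedure identifiers 1 and 2 are arbitrary.

```python
import json


# Service procedures: the real work offered by the service interface.
def add(x, y):
    return x + y


def echo(text):
    return text


def make_server_stub(procedure):
    """Build a server stub: unmarshal arguments, call the procedure, marshal the result."""
    def server_stub(request):
        arguments = request["args"]
        result = procedure(*arguments)
        return json.dumps({"result": result}).encode()
    return server_stub


SERVER_STUBS = {1: make_server_stub(add), 2: make_server_stub(echo)}


def dispatcher(request_message: bytes) -> bytes:
    """Select a server stub according to the procedure identifier in the request."""
    request = json.loads(request_message.decode())
    return SERVER_STUBS[request["procedure_id"]](request)


def communication_module(request_message: bytes) -> bytes:
    """Stand-in for the network: hands the request straight to the dispatcher."""
    return dispatcher(request_message)


def make_client_stub(procedure_id):
    """Build a client stub that looks like a local procedure to the caller."""
    def client_stub(*args):
        request_message = json.dumps(
            {"procedure_id": procedure_id, "args": list(args)}
        ).encode()
        reply_message = communication_module(request_message)
        return json.loads(reply_message.decode())["result"]
    return client_stub


remote_add = make_client_stub(1)
remote_echo = make_client_stub(2)
```

The caller writes remote_add(2, 3) exactly as it would call a local function; the marshalling and message exchange are hidden inside the stub.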

EVENTS AND NOTIFICATIONS

The idea behind the use of events is that one object can react to a change occurring in another object.
Notifications of events are essentially asynchronous and determined by their receivers.
In particular, in interactive applications, the actions that the user performs on objects, for example by manipulating a button with the mouse or entering text in a text box via the keyboard, are seen as events that cause changes in the objects that maintain the state of the application. The objects that are responsible for displaying a view of the current state are notified whenever the state changes.
Distributed event-based systems extend the local event model by allowing multiple objects at different locations to be notified of events taking place at an object. They use the publish-subscribe paradigm, in which an object that generates events publishes the type of events that it will make available for observation by other objects. Objects that want to receive notifications from an object that has published its events subscribe to the types of events that are of interest to them. Objects that represent events are called notifications.
Notifications may be stored, sent in messages, queried and applied in a variety of orders to different things. When a publisher experiences an event, subscribers that expressed an interest in that type of event will receive notifications. Subscribing to a particular type of event is also called registering interest in that type of event.
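The publish-subscribe paradigm can be sketched within a single process. The Publisher class, its method names and the "clicked" event type are assumptions for illustration; in a distributed system the callbacks would be replaced by notification messages sent to remote subscribers.

```python
from collections import defaultdict


class Publisher:
    """An object of interest that publishes the event types it makes observable."""

    def __init__(self, event_types):
        self.event_types = set(event_types)
        self._subscribers = defaultdict(list)  # event type -> list of callbacks

    def subscribe(self, event_type, callback):
        """Register interest in one published type of event."""
        if event_type not in self.event_types:
            raise ValueError(f"event type not published: {event_type}")
        self._subscribers[event_type].append(callback)

    def publish(self, event_type, **details):
        """An event has occurred: build a notification and deliver it to subscribers."""
        notification = {"type": event_type, **details}
        for callback in self._subscribers[event_type]:
            callback(notification)


# Usage: a view object is notified whenever the button is clicked.
received = []
button = Publisher(["clicked"])
button.subscribe("clicked", received.append)
button.publish("clicked", x=10)
```

Here the notification is a plain dictionary, which matches the description of notifications as objects that can be stored, queried or sent in messages.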

CASE STUDY: JAVA RMI

Java RMI extends the Java object model to provide support for distributed objects in the Java language. In particular, it allows objects to invoke methods on remote objects using the same syntax as for local invocations. In addition, type checking applies equally to remote invocations as to local ones. However, an object making a remote invocation is aware that its target is remote because it must handle RemoteExceptions; and the implementor of a remote object is aware that it is remote because it must implement the Remote interface. Although the distributed object model is integrated into Java in a natural way, the semantics of parameter passing differ because invoker and target are remote from one another.

The programming of distributed applications in Java RMI should be relatively simple because it is a single-language system - remote interfaces are defined in the Java language.

RMI registry: The RMI registry is the binder for Java RMI. An instance of RMIregistry must run on every server computer that hosts remote objects.
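The role of a binder, a table that maps textual names to remote object references, can be modelled in a few lines. The sketch below is a plain Python model and not the Java RMI API; the class Registry, the name "ShapeList" and the string used as a reference are all hypothetical.

```python
class Registry:
    """A binder: maps textual names to remote object references."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, remote_ref):
        """Register a reference under a name that is not yet in use."""
        if name in self._bindings:
            raise ValueError(f"name already bound: {name}")
        self._bindings[name] = remote_ref

    def rebind(self, name, remote_ref):
        """Register a reference, replacing any existing binding for the name."""
        self._bindings[name] = remote_ref

    def lookup(self, name):
        """Return the reference a client needs in order to invoke the remote object."""
        return self._bindings[name]


# Usage: a server binds a name, a client looks it up.
registry = Registry()
registry.bind("ShapeList", "reference-to-remote-object-1")
reference = registry.lookup("ShapeList")
```

A server registers its remote objects at start-up, and clients look names up to obtain the references they need before making invocations.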

UNIT-II

OPERATING SYSTEM SUPPORT

The operating system facilitates the encapsulation and protection of resources inside servers; and it supports the invocation mechanisms required to access these resources, including communication and scheduling.

INTRODUCTION

We have learned that an important aspect of distributed systems is resource sharing.
Client applications invoke operations on resources that are often on another node or at least in another process.
Applications (in the form of clients) and services (in the form of resource managers) use the middleware layer for their interactions.
Middleware provides remote invocations between objects or processes at the nodes of a distributed system.
Below the middleware layer is the operating system (OS) layer, which is the subject of this chapter. We shall be examining the relationship between the two, and in particular how well the requirements of middleware can be met by the operating system.
The task of any operating system is to provide problem-oriented abstractions of the underlying physical resources - the processors, memory, communications, and storage media.

THE OPERATING SYSTEM (OS) LAYER

The above figure shows how the operating system layer at each of two nodes supports a common middleware layer in providing a distributed infrastructure for applications and services.

Our goal in this chapter is to examine the impact of particular OS mechanisms on middleware's ability to deliver distributed resource sharing to users.

We require at least the following of the kernels and server processes that manage resources:

1) Encapsulation: They should provide a useful service interface to their resources - that is, a set of operations that meet their clients' needs. Details such as management of memory and devices used to implement resources should be hidden from clients.
2) Protection: Resources require protection from illegitimate accesses - for example, files are protected from being read by users without read permissions, and device registers are protected from application processes.
3) Concurrent processing: Clients may share resources and access them concurrently. Resource managers are responsible for achieving concurrency transparency.

Core OS functionality:

The above figure shows the core OS functionality that we shall be concerned with: process and thread management, memory management, and communication between processes on the same computer (horizontal divisions in the figure denote dependencies). The kernel supplies much of this functionality - all of it in the case of some operating systems.

The core OS components are the following:

1) Process manager: Handles the creation of and operations upon processes. A process is a unit of resource management, including an address space and one or more threads.
2) Thread manager: Thread creation, synchronization and scheduling. Threads are schedulable activities attached to processes.
3) Communication manager: Communication between threads attached to different processes on the same computer. Some kernels also support communication between threads in remote processes.
4) Memory manager: Management of physical and virtual memory.
5) Supervisor: Dispatching of interrupts, system call traps and other exceptions; control of the memory management unit.

PROTECTION

To understand what we mean by an "illegitimate access" to a resource, consider a file. Let us suppose, for the sake of explanation, that open files have only two operations, read and write.

Protecting the file consists of two sub-problems. The first is to ensure that each of the file's two operations can be performed only by clients with the right to perform it. For example, Smith, who owns the file, has read and write rights to it. Jones may only perform the read operation. An illegitimate access here would be if Jones somehow managed to perform a write operation on the file.
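The first sub-problem, checking a client's right before performing an operation, can be sketched directly from the Smith and Jones example. The ProtectedFile class and the way rights are stored (a dictionary from user name to a set of operation names) are assumptions made for this sketch; it also takes the user's identity on trust, which a real system could not do.

```python
class ProtectedFile:
    """A file with two operations, each guarded by a per-user access right."""

    def __init__(self, contents: bytes, rights: dict):
        self._contents = contents
        self._rights = rights  # user name -> set of permitted operations

    def _check(self, user: str, operation: str) -> None:
        # Reject the operation unless this user holds the corresponding right.
        if operation not in self._rights.get(user, set()):
            raise PermissionError(f"{user} may not {operation} this file")

    def read(self, user: str) -> bytes:
        self._check(user, "read")
        return self._contents

    def write(self, user: str, data: bytes) -> None:
        self._check(user, "write")
        self._contents = data


# Smith owns the file and may read and write; Jones may only read.
shared = ProtectedFile(b"v1", {"Smith": {"read", "write"}, "Jones": {"read"}})
shared.write("Smith", b"v2")
print(shared.read("Jones"))
```

A call such as shared.write("Jones", b"x") raises PermissionError, which corresponds to rejecting the illegitimate access described above.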

Kernels and protection: The kernel is a program that is distinguished by the facts that it always runs and its code is executed with complete access privileges for the physical resources on its host computer. In particular, it can control the memory management unit and set the processor registers so that no other code may access the machine's physical resources except in acceptable ways.

PROCESSES AND THREADS

A process is a program under execution and a thread is a part of the process.
A process consists of an execution environment together with one or more threads.
An execution environment primarily consists of:
⮡ an address space;
⮡ thread synchronization and communication resources such as semaphores and communication interfaces (for example sockets);
⮡ higher-level resources such as open files and windows.

Threads can be created and destroyed dynamically as needed. The central aim of having multiple threads of execution is to maximize the degree of concurrent execution between operations, thus enabling the overlap of computation with input and output, and enabling concurrent processing on multiprocessors.

Processes are heavyweight tasks that require their own separate address spaces.
Interprocess communication is expensive and limited.
Context switching from one process to another is also costly.
Threads, on the other hand, are lightweight.
They share the same address space and cooperatively share the same heavyweight process. Interthread communication is inexpensive, and context switching from one thread to the next is low cost.
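Because threads share their process's address space, they can communicate simply by updating shared data, provided they synchronize. The sketch below shows several threads incrementing one shared counter under a lock; the function name run_workers and the counts used are arbitrary choices for illustration.

```python
import threading


def run_workers(num_threads: int, increments: int) -> int:
    """Start several threads that all update one variable in the shared address space."""
    counter = {"value": 0}
    lock = threading.Lock()  # synchronization resource shared by the threads

    def worker():
        for _ in range(increments):
            with lock:  # only one thread updates the counter at a time
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]
```

No message passing is needed: every thread sees the same counter object, which is what makes interthread communication inexpensive compared with communication between processes.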

COMMUNICATION AND INVOCATION

Here we concentrate on communication as part of the implementation of what we have called an invocation - a construct such as a remote method invocation, remote procedure call or event notification.

We shall cover operating system design issues and concepts by asking the following questions about the OS:

1) What communication primitives does it supply?
2) Which protocols does it support and how open is the communication implementation?
3) What steps are taken to make communication as efficient as possible?
4) What support is provided for high-latency and disconnected operation?

Communication primitives: There are three communication primitives: doOperation, getRequest and sendReply. All these primitives are used to establish communication between various clients and servers.

Protocols and openness: One of the main requirements of the operating system is to provide standard protocols that enable interworking between middleware implementations on different platforms.

Invocation performance: Invocation performance is a critical factor in distributed system design. The more designers separate functionality between address spaces, the more remote invocations are required. Clients and servers may make many millions of invocation-related operations in their lifetimes, so that small fractions of milliseconds count in invocation costs. Network technologies continue to improve, but invocation times have not decreased in proportion with increases in network bandwidth.

RPC and RMI implementations have been the subject of study because of the widespread acceptance of these mechanisms for general-purpose client-server processing. Much of the research has been carried out into invocations over the network and particularly into how invocation mechanisms can take advantage of high-performance networks.

OPERATING SYSTEM ARCHITECTURE

In this section, we examine the architecture of a kernel suitable for a distributed system. We adopt a first-principles approach of starting with the requirement of openness and examining the major kernel architectures that have been proposed. Ideally, the kernel would provide only the most basic mechanisms upon which the general resource management tasks at a node are carried out.

Types of kernels (monolithic kernels and microkernels): There are two key examples of kernel design: the so-called monolithic and microkernel approaches. Where these designs differ primarily is in the decision as to what functionality belongs in the kernel and what is to be left to server processes that can be dynamically loaded to run on top of it. Microkernels, however, have not been deployed widely.

The UNIX operating system kernel has been called monolithic.

By contrast, in the case of a microkernel design the kernel provides only the most basic abstractions, principally address spaces, threads and local interprocess communication; all other system services are provided by servers.

DISTRIBUTED FILE SYSTEMS

A distributed file system enables programs to store and access remote files exactly as they do local ones, allowing users to access files from any computer on a network.
File systems were originally developed for centralized computer systems and desktop computers as an operating system facility providing a convenient programming interface to disk storage.
They subsequently acquired features such as access control and file-locking mechanisms that made them useful for the sharing of data and programs.
Distributed file systems support the sharing of information in the form of files and hardware resources in the form of persistent storage throughout an intranet.
A well-designed file service provides access to files stored at a server with performance and reliability similar to, and in some cases better than, files stored on local disks.

File system modules:
File attribute record structure:

Files contain both data and attributes. The data consist of a sequence of data items (typically 8-bit
bytes), accessible by operations to read and write any portion of the sequence. The attributes are held
as a single record containing information such as the length of the file, timestamps, file type, owner’s
identity and access control lists. A typical attribute record structure is illustrated in Figure 12.3. The
shaded attributes are managed by the file system and are not normally updatable by user programs.

Characteristics of file systems:

1) File systems are responsible for the organization, storage, retrieval, naming, sharing and protection of files.
2) Files are stored on disks or other non-volatile storage media.
3) Files contain both data and attributes.
4) The data consist of a sequence of data items (typically 8-bit bytes) accessible by operations to read and write any portion of the sequence.
5) The attributes are held as a single record containing information such as the length of the file, timestamps, file type, owner's identity and access-control lists.
6) A typical attribute record structure is illustrated in the previous figure.
7) The shaded attributes are managed by the file system and are not normally updatable by user programs.
8) File systems are designed to store and manage large numbers of files with facilities for creating, naming and deleting files.
9) A directory is a file, often of a special type, that provides a mapping from text names to internal file identifiers; directories may include the names of other directories.

Distributed file system requirements: Many of the requirements and potential pitfalls in the design of distributed services were first observed in the early development of distributed file systems.

1) Transparency: The file service is usually the most heavily loaded service in an intranet, so its functionality and performance are critical. The design of the file service should support many of the transparency requirements for distributed systems. The following forms of transparency are partially or wholly addressed by current file services:

i. Access transparency: Client programs should be unaware of the distribution of files. A single set of operations is provided for access to local and remote files. Programs written to operate on local files are able to access remote files without modification.
ii. Location transparency: Client programs should see a uniform file name space. Files or groups of files may be relocated without changing their pathnames, and user programs see the same name space wherever they are executed.
iii. Mobility transparency: Neither client programs nor system administration tables in client nodes need to be changed when files are moved. This allows file mobility - files or, more commonly, sets or volumes of files may be moved, either by system administrators or automatically.
iv. Performance transparency: Client programs should continue to perform satisfactorily while the load on the service varies within a specified range.
v. Scaling transparency: The service can be expanded by incremental growth to deal with a wide range of loads and network sizes.

2) Concurrent file updates: Changes to a file by one client should not interfere with the operation of other clients simultaneously accessing or changing the same file. The need for concurrency control for access to shared data in many applications is widely accepted and techniques are known for its implementation, but they are costly.

3) File replication: In a file service that supports replication, a file may be represented by several copies of its contents at different locations. This has two benefits - it enables multiple servers to share the load of providing a service to clients accessing the same set of files, enhancing the scalability of the service, and it enhances fault tolerance by enabling clients to locate another server that holds a copy of the file when one has failed.

4) Hardware and operating system heterogeneity: The service interfaces should be defined so that client and server software can be implemented for different operating systems and computers. This requirement is an important aspect of openness.

5) Fault tolerance: The central role of the file service in distributed systems makes it essential that the service continue to operate in the face of client and server failures.

6) Consistency: This refers to a model for concurrent access to files in which the file contents seen by all of the processes accessing or updating a given file are those that they would see if only a single copy of the file contents existed.

7) Security: Virtually all file systems provide access control mechanisms based on the use of access control lists. In distributed file systems, there is a need to authenticate client requests so that access control at the server is based on correct user identities and to protect the contents of request and reply messages with digital signatures and (optionally) encryption of secret data.

8) Efficiency: A distributed file service should offer facilities that are of at least the same power and generality as those found in conventional file systems and should achieve a comparable level of performance.

FILE SERVICE ARCHITECTURE

An architecture that offers a clear separation of the main concerns in providing access to files is obtained by structuring the file service as three components - a flat file service, a directory service and a client module.

The division of responsibilities between the modules can be defined as follows:

1) Flat file service: The flat file service is concerned with implementing operations on the contents of files. Unique File Identifiers (UFIDs) are used to refer to files in all requests for flat file service operations. The division of responsibilities between the file service and the directory service is based upon the use of UFIDs.
2) Directory service: The directory service provides a mapping between text names for files and their UFIDs. Clients may obtain the UFID of a file by quoting its text name to the directory service.
3) Client module: A client module runs in each client computer, integrating and extending the operations of the flat file service and the directory service under a single application programming interface that is available to user-level programs in client computers.
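The three components and the role of UFIDs can be sketched in a single process. All class and method names below are assumptions chosen for this sketch, the UFIDs are simple counters rather than unforgeable identifiers, and both services run locally instead of being reached by remote invocation.

```python
import itertools


class FlatFileService:
    """Implements operations on file contents; files are named only by UFID."""

    def __init__(self):
        self._files = {}
        self._next_ufid = itertools.count(1)

    def create(self) -> int:
        ufid = next(self._next_ufid)
        self._files[ufid] = bytearray()
        return ufid

    def read(self, ufid: int, position: int, length: int) -> bytes:
        return bytes(self._files[ufid][position:position + length])

    def write(self, ufid: int, position: int, data: bytes) -> None:
        contents = self._files[ufid]
        if position > len(contents):
            # Pad with zero bytes so the data lands at the requested position.
            contents.extend(b"\0" * (position - len(contents)))
        contents[position:position + len(data)] = data


class DirectoryService:
    """Maps text names to UFIDs; knows nothing about file contents."""

    def __init__(self):
        self._names = {}

    def add_name(self, name: str, ufid: int) -> None:
        self._names[name] = ufid

    def lookup(self, name: str) -> int:
        return self._names[name]


class ClientModule:
    """Combines both services behind one name-based interface for user programs."""

    def __init__(self, files: FlatFileService, directory: DirectoryService):
        self._files = files
        self._directory = directory

    def write_file(self, name: str, data: bytes) -> None:
        try:
            ufid = self._directory.lookup(name)
        except KeyError:
            # Unknown name: create a new file and record its UFID.
            ufid = self._files.create()
            self._directory.add_name(name, ufid)
        self._files.write(ufid, 0, data)

    def read_file(self, name: str, length: int) -> bytes:
        return self._files.read(self._directory.lookup(name), 0, length)
```

A user program calls only the client module, for example client.write_file("notes.txt", b"hello"), and never handles UFIDs itself.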

CASE STUDY: SUN NETWORK FILE SYSTEM

The figure below shows the architecture of Sun NFS. It follows the abstract model defined in the preceding section. All implementations of NFS support the NFS protocol - a set of remote procedure calls that provide the means for clients to perform operations on a remote file store.

The NFS server module resides in the kernel on each computer that acts as an NFS server. Requests referring to files in a remote file system are translated by the client module to NFS protocol operations and then passed to the NFS server module at the computer holding the relevant file system.

The NFS client and server modules communicate using remote procedure calls. Sun's RPC system was developed for use in NFS. It can be configured to use either UDP or TCP, and the NFS protocol is compatible with both. A port mapper service is included to enable clients to bind to services in a given host by name. The RPC interface to the NFS server is open: any process can send requests to an NFS server; if the requests are valid and they include valid user credentials, they will be acted upon. The submission of signed user credentials can be required as an optional security feature, as can the encryption of data for privacy and integrity.

NFS adopts the UNIX mountable filesystem as the unit of file grouping defined in the preceding section.
(Terminology note: the single word filesystem refers to the set of files held in a storage device or partition, whereas the words file system refer to a software component that provides access to files.) The filesystem identifier field is a unique number that is allocated to each filesystem when it is created (and in the UNIX implementation is stored in the superblock of the file system). The i-node generation number is needed because in the conventional UNIX file system i-node numbers are reused after a file is removed.

*****
UNIT–III

PEER TO PEER SYSTEMS

INTRODUCTION

The demand for services in the Internet can be expected to grow to a scale that is limited only by the size of the world's population.
The goal of peer-to-peer systems is to enable the sharing of data and resources on a very large scale by eliminating any requirement for separately-managed servers and their associated infrastructure.
Peer-to-peer systems aim to support useful distributed services and applications using data and computing resources available in the personal computers and workstations that are present on the Internet and other networks in ever-increasing numbers.
Traditional client-server systems manage and provide access to resources such as files, web pages or other information objects located on a single server computer or a small cluster of tightly-coupled servers. With such centralized designs few decisions are required about the placement of the resources or the management of server hardware resources, but the scale of the service is limited by the server hardware capacity and network connectivity.
Peer-to-peer systems provide access to information resources located on computers throughout a network (whether it be the Internet or a corporate network).

NAPSTER AND ITS LEGACY

The first application in which a demand for a globally-scalable information storage and retrieval service emerged was the downloading of digital music files.
Both the need and the feasibility of a peer-to-peer solution were first demonstrated by the Napster file sharing system, which provided a means for users to share files.
Napster became very popular for music exchange soon after its launch in 1999. At its peak, several million users were registered and thousands were swapping music files simultaneously.

Napster's architecture included centralized indexes, but users supplied the files, which were stored and accessed on their personal computers.
Napster's method of operation is illustrated by the sequence of steps shown in the above figure.
Note that in step 5 clients are expected to add their own music files to the pool of shared resources by transmitting a link to the Napster indexing service for each available file.
Thus the motivation for Napster and the key to its success was to make a large, widely-distributed set of files available to users throughout the Internet.
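The combination of a centralized index with files held on users' computers can be sketched as follows. The classes IndexServer and Peer, the address strings and the file name are hypothetical, and peers are reached through a local dictionary rather than over the network.

```python
from collections import defaultdict


class IndexServer:
    """Centralized index: records which peers offer each file, not the files themselves."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, filename: str, peer_address: str) -> None:
        self._peers_by_file[filename].add(peer_address)

    def lookup(self, filename: str) -> list:
        return sorted(self._peers_by_file.get(filename, ()))


class Peer:
    """A user's computer: stores files locally and advertises them to the index."""

    def __init__(self, address: str, index: IndexServer):
        self.address = address
        self.index = index
        self.files = {}

    def share(self, filename: str, contents: bytes) -> None:
        self.files[filename] = contents
        self.index.register(filename, self.address)  # send a link to the index

    def download(self, filename: str, peers: dict) -> bytes:
        # Ask the index who offers the file, then fetch it directly from a peer.
        for address in self.index.lookup(filename):
            contents = peers[address].files[filename]
            self.share(filename, contents)  # add own copy to the shared pool
            return contents
        raise FileNotFoundError(filename)


# Usage: alice shares a file, bob finds it through the index and fetches it from alice.
index = IndexServer()
alice = Peer("alice:6699", index)
bob = Peer("bob:6699", index)
peers = {peer.address: peer for peer in (alice, bob)}
alice.share("song.mp3", b"data")
copy = bob.download("song.mp3", peers)
```

After the download the index lists both peers for the file, which mirrors the index update described for step 5.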

PEER TO PEER MIDDLEWARE

A key problem in the design of peer-to-peer applications is to provide a mechanism to enable clients to access data resources quickly and dependably wherever they are located throughout the network.
Napster maintained a unified index of available files for this purpose, giving the network addresses of their hosts.
Peer-to-peer middleware systems are designed specifically to meet the need for the automatic placement and subsequent location of the distributed objects managed by peer-to-peer systems and applications.

Functionalrequirements:Thefunctionofpeer-to-
peermiddlewareistosimplifytheconstructionofservicesthatareimplementedacrossmany
hostsinawidelydistributednetwork.Toachievethisitmustenableclientstolocateandcomm
unicatewithanyindividualresourcemadeavailabletoaserviceeventhoughtheresourcesare
widelydistributedamongstthehosts.

Non-functionalrequirements:Toperformeffectivelypeer-to-
peermiddlewaremustalsoaddressthefollowingnon-functionalrequirements.

Globalscalability:Oneoftheaimsofpeer-to-
peerapplicationsistoexploitthehardwareresourcesofverylargenumbersofhostsconnecte
dtotheInternet.Peer-
topeermiddlewaremustthereforebedesignedtosupportapplicationsthataccessmillionsof
objectsontensofthousandsorhundredsofthousandsofhosts.

Loadbalancing:Theperformanceofanysystemdesignedtoexploitalargenumberofcomp
utersdependsuponthebalanceddistributionofworkloadacrossthem.

Optimizationforlocalinteractionsbetweenneighbouringpeers:Thenetworkdistance
betweennodesthatinteracthasasubstantialimpactonthelatencyofindividualinteractionss
uchasclientrequestsforaccesstoresources.

Accommodatingtohighlydynamichostavailability:Mostpeer-to-
peersystemsareconstructedfromhostcomputersthatarefreetojoinorleavethesystematany
time.

ROUTING OVERLAYS

A distributed algorithm known as a routing overlay takes responsibility for locating nodes and objects.
The name denotes the fact that the middleware takes the form of a layer that is responsible for routing requests from any client to a host that holds the object to which the request is addressed.
The objects of interest may be placed and subsequently relocated to any node in the network without client involvement.
It is termed an overlay since it implements a routing mechanism in the application layer that is quite separate from any other routing mechanisms deployed at the network level, such as IP routing.

The routing overlay ensures that any node can access any object by routing each request through a sequence of nodes, exploiting knowledge at each of them to locate the destination object. Peer-to-peer systems usually store multiple replicas of objects to ensure availability.
In that case, the routing overlay maintains knowledge of the location of all the available replicas and delivers requests to the nearest 'live' node (i.e. one that has not failed) that has a copy of the relevant object.

The main task of a routing overlay is the following:
⮡ A client wishing to invoke an operation on an object submits a request including the object's GUID to the routing overlay, which routes the request to a node at which a replica of the object resides.

OVERLAY CASE STUDIES: PASTRY & TAPESTRY

The prefix routing approach is adopted by both Pastry and Tapestry. Pastry has a straightforward but effective design, which makes it a good first example for us to study in detail. Pastry is the message routing infrastructure deployed in several applications.

Pastry: All the nodes and objects that can be accessed through Pastry are assigned 128-bit GUIDs. For nodes, these are computed by applying a secure hash function (such as SHA) to the public key with which each node is provided. For objects such as files, the GUID is computed by applying a secure hash function to the object's name or to some part of the object's stored state. The resulting GUID has the usual properties of secure hash values.

In a network with N participating nodes, the Pastry routing algorithm will correctly route a message addressed to any GUID in O(log N) steps. If the GUID identifies a node that is currently active, the message is delivered to that node; otherwise the message is delivered to the active node whose GUID is numerically closest to it. Active nodes take responsibility for processing requests addressed to all objects in their numerical neighbourhood.
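
The delivery rule above (deliver to the active node whose GUID is numerically closest) can be sketched in a few lines. This is a minimal sketch under assumptions, not Pastry's implementation: it truncates a SHA-1 digest to obtain 128 bits, treats the GUID space as a ring, and scans a full list of nodes instead of performing the O(log N) prefix routing. The function names and the sample keys are hypothetical.

```python
import hashlib

GUID_BITS = 128
GUID_SPACE = 1 << GUID_BITS


def guid(data: bytes) -> int:
    """Derive a 128-bit GUID by truncating a SHA-1 digest (an assumption)."""
    digest = hashlib.sha1(data).digest()  # 160 bits
    return int.from_bytes(digest[:GUID_BITS // 8], "big")


def circular_distance(a: int, b: int) -> int:
    """Distance between two GUIDs when the GUID space is treated as a ring."""
    d = abs(a - b)
    return min(d, GUID_SPACE - d)


def responsible_node(target: int, active_nodes) -> int:
    """Return the active node whose GUID is numerically closest to `target`."""
    return min(active_nodes, key=lambda n: circular_distance(n, target))


# Hypothetical node keys and object name, for illustration only.
nodes = [guid(b"public-key-of-node-a"), guid(b"public-key-of-node-b")]
owner = responsible_node(guid(b"report.txt"), nodes)
```

A real overlay reaches the same node without any single host knowing every GUID; the linear scan here only shows which node the message ends up at.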

Tapestry: Tapestry implements a distributed hash table and routes messages to nodes based on GUIDs associated with resources, using prefix routing in a manner similar to Pastry. Nodes that hold resources use the publish(GUID) primitive to make them known to Tapestry, and the holders of resources remain responsible for storing them. Replicated resources are published with the same GUID by each node that holds a replica, resulting in multiple entries in the Tapestry routing structure.

This gives Tapestry applications additional flexibility: they can place replicas close (in network distance) to frequent users of resources in order to reduce latencies and minimize network loads, or to ensure tolerance of network and host failures.

In Tapestry, 160-bit identifiers are used to refer both to objects and to the nodes that perform routing actions. Identifiers are either NodeIds, which refer to computers that perform routing operations, or GUIDs, which refer to the objects.

APPLICATION CASE STUDIES: SQUIRREL & OCEANSTORE

Large-scale peer-to-peer systems are not yet a mainstream technology. Their widest deployment has been in applications for file downloading by end-users, in systems such as Napster, Freenet, Gnutella, Kazaa and BitTorrent. But those systems do not employ separate routing overlay layers, so evaluations of their performance are difficult to extrapolate to other applications.

The routing overlay layers described in the preceding section have been exploited in several application experiments, and the resulting applications have been extensively evaluated.

We have chosen two of them for further study: the Squirrel web caching service, based on Pastry, and the OceanStore file store.

Squirrel web cache: The authors of Pastry have developed the Squirrel peer-to-peer web caching service for use in local networks of personal computers. In medium and large local networks, web caching is typically performed using a dedicated server computer or cluster. The Squirrel system performs the same task by exploiting storage and computing resources already available on desktop computers in the local network. We first give a brief general description of the operation of a web caching service, then we outline the design of Squirrel and review its effectiveness.

Web caching: Web browsers generate HTTP GET requests for Internet objects like HTML pages, images, etc. These may be serviced from a browser cache on the client machine, from a proxy web cache (a service running on another computer in the same local network or on a nearby node in the Internet) or from the origin web server (the server whose domain name is included in the parameters of the GET request), depending on which contains a fresh copy of the object.

The local and proxy caches each contain a set of recently-retrieved objects organized for fast lookup by URL. Some objects are uncacheable because they are generated dynamically by the server in response to each request.

When a browser cache or proxy web cache receives a GET request there are three possibilities: the requested object is uncacheable, there is a cache miss, or the object is found in the cache. In the first two cases the request is forwarded to the next level towards the origin web server. When the requested object is found in a cache, the cached copy must be tested for freshness.

Squirrel: The Squirrel web caching service performs the same functions using a small part of the resources of each client computer on a local network. The SHA-1 secure hash function is applied to the URL of each cached object to produce a 128-bit Pastry GUID. In the simplest implementation of Squirrel, which proved to be the most effective one, the node whose GUID is numerically closest to the GUID of an object becomes that object's home node, responsible for holding any cached copy of the object.

Client nodes are configured to include a local Squirrel proxy process, which takes responsibility for both local and remote caching of web objects. If a fresh copy of a required object is not in the local cache, Squirrel routes a Get request to the object's home node. If the home node has a fresh copy, it directly responds to the client with a not-modified message or a fresh copy, as appropriate.

Evaluation of Squirrel: The evaluation compared the performance of a Squirrel web cache with a centralized one in three respects:

1) The reduction in total external bandwidth used: The total external bandwidth is inversely related to the hit ratio, since it is only cache misses that generate requests to external web servers.
2) The latency perceived by users for access to web objects: The use of a routing overlay results in several message transfers (routing hops) across the local network to transmit a request from a client to the host responsible for caching the relevant object (the home node).
3) The computational and storage load imposed on client nodes: The average number of cache requests served for other nodes by each node over the whole period of the evaluation was extremely low.

OceanStore file store: The developers of Tapestry have designed and built a prototype for a peer-to-peer file store. The OceanStore design aims to provide a very large scale, incrementally-scalable persistent storage facility for mutable data objects, with long-term persistence and reliability in an environment of constantly changing network and computing resources. The design includes provision for the replicated storage of both mutable and immutable data objects.

Three types of GUIDs are used, as summarized in the figure above. The first two are GUIDs of the type normally assigned to objects stored in Tapestry; they are computed from the contents of the relevant block using a secure hash function so that they can be used later to authenticate and verify the integrity of the contents. The third type of identifier used is the AGUID. AGUIDs refer (indirectly) to the entire stream of versions of an object, enabling clients to access the current version of the object or any previous version.

TIME AND GLOBAL STATES

INTRODUCTION

Time is an important and interesting issue in distributed systems for several reasons. First, time is a quantity we often want to measure accurately. In order to know at what time of day a particular event occurred at a particular computer, it is necessary to synchronize its clock with an authoritative external source of time. For example, an ‘e-commerce’ transaction involves events at a merchant's computer and at a bank computer. It is important for auditing purposes that those events are timestamped accurately.

CLOCKS, EVENTS AND PROCESS STATES

Clocks: We have seen how to order the events at a process but not how to timestamp them, that is, to assign to them a date and time of day. Computers each contain their own physical clock. These clocks are electronic devices that count oscillations occurring in a crystal at a definite frequency, and that typically divide this count and store the result in a counter register.

Clock skew and clock drift: Computer clocks, like any others, tend not to be in perfect agreement, as shown in the figure below. The instantaneous difference between the readings of any two clocks is called their skew. Also, the crystal-based clocks used in computers are, like any other clocks, subject to clock drift, which means that they count time at different rates, and so diverge. The underlying oscillators are subject to physical variations, with the consequence that their frequencies of oscillation differ. Moreover, even the same clock's frequency varies with temperature. Designs exist that attempt to compensate for this variation, but they cannot eliminate it. The difference in the oscillation period between two clocks might be extremely small, but the difference accumulated over many oscillations leads to an observable difference in the counters registered by two clocks, no matter how accurately they were initialized to the same value.
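
To get a feel for how a tiny drift accumulates, the arithmetic can be sketched as below. The constant-drift model and the drift rate of one microsecond per second are illustrative assumptions, not figures taken from these notes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def accumulated_skew(drift_rate_a, drift_rate_b, elapsed_seconds, initial_skew=0.0):
    """Skew between two clocks after `elapsed_seconds` of real time.

    Drift rates are in seconds of error per second of real time and are
    assumed constant, which real oscillators only approximate.
    """
    return initial_skew + (drift_rate_a - drift_rate_b) * elapsed_seconds


# One clock runs fast and the other slow by an assumed 1e-6 s/s.
skew_after_one_day = accumulated_skew(1e-6, -1e-6, SECONDS_PER_DAY)
```

Under these assumptions two clocks that started in perfect agreement differ by roughly 0.17 seconds after a day, which is why periodic resynchronization is needed.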

Coordinated Universal Time: Computer clocks can be synchronized to external sources of highly accurate time. Coordinated Universal Time, abbreviated as UTC (from the French equivalent), is an international standard for timekeeping. It is based on atomic time, but a so-called leap second is inserted (or, more rarely, deleted) occasionally to keep it in step with astronomical time. UTC signals are synchronized and broadcast regularly from land-based radio stations and satellites covering many parts of the world. For example, in the USA, the radio station WWV broadcasts time signals on several shortwave frequencies. Satellite sources include the Global Positioning System (GPS).

SYNCHRONIZING PHYSICAL CLOCKS

In order to know at what time of day events occur at the processes in our distributed system (for example, for accountancy purposes) it is necessary to synchronize the processes' clocks Ci with an authoritative external source of time. This is external synchronization. And if the clocks Ci are synchronized with one another to a known degree of accuracy, then we can measure the interval between two events occurring at different computers by appealing to their local clocks, even though they are not necessarily synchronized to an external source of time. This is internal synchronization.

Cristian's method for synchronizing clocks: Cristian suggested the use of a time server, connected to a device that receives signals from a source of UTC, to synchronize computers externally. Upon request, the server process S supplies the time according to its clock, as shown in the figure below.

Cristian observed that while there is no upper bound on message transmission delays in an asynchronous system, the round-trip times for messages exchanged between pairs of processes are often reasonably short: a small fraction of a second. He describes the algorithm as probabilistic: the method achieves synchronization only if the observed round-trip times between client and server are sufficiently short compared with the required accuracy.

A process p requests the time in a message mr, and receives the time value t in a message mt (t is inserted in mt at the last possible point before transmission from S's computer). Process p records the total round-trip time Tround taken to send the request mr and receive the reply mt. It can measure this time with reasonable accuracy if its rate of clock drift is small.
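
What p does with t and Tround is not spelled out above. In the usual statement of Cristian's method, p sets its clock to t + Tround/2, and if the minimum one-way transmission time min is known the accuracy is plus or minus (Tround/2 - min). The sketch below assumes that rule; the function name and the sample numbers are illustrative.

```python
def cristian_estimate(server_time, t_round, min_one_way_delay=0.0):
    """Return (estimated current time, accuracy bound) for Cristian's method.

    server_time       -- the value t carried in the reply
    t_round           -- round-trip time measured by the client, in seconds
    min_one_way_delay -- assumed minimum transmission time in each direction
    """
    if t_round < 2 * min_one_way_delay:
        raise ValueError("round trip shorter than twice the minimum delay")
    estimate = server_time + t_round / 2.0          # assume symmetric delays
    accuracy = t_round / 2.0 - min_one_way_delay    # plus or minus this much
    return estimate, accuracy


# Illustrative numbers: t = 100.0 s, Tround = 20 ms, min = 4 ms.
estimate, accuracy = cristian_estimate(100.0, 0.020, 0.004)
```

The shorter the round trip, the tighter the bound, which matches the remark that the method works only when round-trip times are short compared with the required accuracy.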

Cristian's method suffers from the problem associated with all services implemented by a single server: the single time server might fail and thus render synchronization temporarily impossible. Cristian suggested, for this reason, that time should be provided by a group of synchronized time servers, each with a receiver for UTC time signals. For example, a client could multicast its request to all servers and use only the first reply obtained.

The Berkeley algorithm: It is an internal synchronization method in which a coordinator computer is chosen to act as the master. Unlike in Cristian's protocol, this computer periodically polls the other computers whose clocks are to be synchronized, called slaves. The slaves send back their clock values to it. The master estimates their local clock times by observing the round-trip times (similarly to Cristian's technique), and it averages the values obtained (including its own clock's reading).

Instead of sending the updated current time back to the other computers, which would introduce further uncertainty due to the message transmission time, the master sends the amount by which each individual slave's clock requires adjustment. This can be a positive or negative value.

The algorithm eliminates readings from faulty clocks. Such clocks could have a significant adverse effect if an ordinary average was taken. The master takes a fault-tolerant average. That is, a subset of clocks is chosen that do not differ from one another by more than a specified amount, and the average is taken of readings from only these clocks.
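
The master's computation can be sketched as follows. This is a simplified illustration, assuming the readings have already been corrected for round-trip delay; picking the largest group of mutually close readings is one plausible reading of "a subset of clocks is chosen", and all names and numbers are hypothetical.

```python
def fault_tolerant_average(values, max_spread):
    """Average the largest group of readings that lie within max_spread of each other."""
    if not values:
        raise ValueError("no clock readings supplied")
    ordered = sorted(values)
    best = ordered[:1]
    start = 0
    for end in range(len(ordered)):
        # Shrink the window until its extremes differ by at most max_spread.
        while ordered[end] - ordered[start] > max_spread:
            start += 1
        if end - start + 1 > len(best):
            best = ordered[start:end + 1]
    return sum(best) / len(best)


def berkeley_adjustments(readings, max_spread):
    """Map each computer to the amount (positive or negative) to add to its clock."""
    target = fault_tolerant_average(readings.values(), max_spread)
    return {name: target - value for name, value in readings.items()}


# Readings in seconds; "s3" stands for a faulty clock.
readings = {"master": 180.0, "s1": 185.0, "s2": 175.0, "s3": 1000.0}
adjustments = berkeley_adjustments(readings, max_spread=20.0)
```

Note that the faulty clock is excluded from the average but still receives an adjustment, so it is pulled back towards the others.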

The Network Time Protocol:

Cristian's method and the Berkeley algorithm are intended primarily for use within intranets. The Network Time Protocol (NTP) defines an architecture for a time service and a protocol to distribute time information over the Internet.

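LOGICAL TIME AND LOGICAL CLOCKS

From the point of view of any single process, events are ordered uniquely by times shown on the local clock. However, as Lamport pointed out, since we cannot synchronize clocks perfectly across a distributed system, we cannot in general use physical time to find out the order of any arbitrary pair of events occurring within it.

In general, we can use a scheme that is similar to physical causality, but that applies in distributed systems, to order some of the events that occur at different processes. This ordering is based on two simple and intuitively obvious points:

1) If two events occurred at the same process, then they occurred in the order in which that process observes them.
2) Whenever a message is sent between processes, the event of sending the message occurred before the event of receiving the message.

Lamport called the partial ordering obtained by generalizing these two relationships the happened-before relation.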
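
The notes do not give the rules of Lamport's logical clock, so the sketch below follows the standard formulation as an assumption: each process increments a counter before every event, piggybacks the counter on each message it sends, and on receipt moves its counter past the received timestamp. The class and method names are illustrative.

```python
class LamportClock:
    """Logical clock for a single process (standard Lamport rules, assumed)."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        """Increment before each event at this process and timestamp the event."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.local_event()

    def receive(self, message_timestamp):
        """Take the larger of the two clocks, then count the receive event."""
        self.time = max(self.time, message_timestamp) + 1
        return self.time


p1, p2 = LamportClock(), LamportClock()
a = p1.local_event()   # event at p1
b = p1.send()          # p1 sends a message
c = p2.receive(b)      # p2 receives it
d = p2.local_event()   # later event at p2
```

If one event happened-before another, its timestamp is smaller. The converse does not hold: a smaller timestamp alone does not show that one event happened-before the other.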

GLOBAL STATES

In this and the next section we shall examine the problem of finding out whether a particular property is true of a distributed system as it executes. We begin by giving the examples of distributed garbage collection, deadlock detection, termination detection and debugging.

The 'snapshot' algorithm of Chandy and Lamport:

DISTRIBUTED DEBUGGING

We now examine the problem of recording a system's global state so that we may make useful statements about whether a transitory state (as opposed to a stable state) occurred in an actual execution. This is what we require, in general, when debugging a distributed system.
Observing consistent global states:

COORDINATION AND AGREEMENT

INTRODUCTION

Failure assumptions and failure detectors:

DISTRIBUTED MUTUAL EXCLUSION

Algorithms for mutual exclusion:

ELECTIONS

MULTICAST COMMUNICATION

Basic multicast:

Reliable multicast:

Ordered multicast:

CONSENSUS AND RELATED PROBLEMS

*****

UNIT–IV

TRANSACTIONS AND CONCURRENCY CONTROL

INTRODUCTION

TRANSACTIONS

NESTED TRANSACTIONS

LOCKS

Deadlocks:

OPTIMISTIC CONCURRENCY CONTROL

TIMESTAMP ORDERING

*****

UNIT–V

REPLICATION

INTRODUCTION

In this chapter, we study the replication of data: the maintenance of copies of data at multiple computers. Replication is a key to the effectiveness of distributed systems in that it can provide enhanced performance, high availability and fault tolerance. Replication is used widely. For example, the caching of resources from web servers in browsers and web proxy servers is a form of replication, since the data held in caches and at servers are replicas of one another. The DNS naming service maintains copies of name-to-attribute mappings for computers and is relied on for day-to-day access to services across the Internet.

Replication is a technique for enhancing services. The motivations for replication include:

1) Performance enhancement: The caching of data at clients and servers is by now familiar as a means of performance enhancement. For example, browsers and proxy servers cache copies of web resources to avoid the latency of fetching resources from the originating server. Furthermore, data are sometimes replicated transparently between several originating servers in the same domain. The workload is shared between the servers by binding all the server IP addresses to the site's DNS name, say www.aWebSite.org. A DNS lookup of www.aWebSite.org results in one of the several servers' IP addresses being returned, in a round-robin fashion. More sophisticated load-balancing strategies are required for more complex services based on data replicated between thousands of servers.
2) Increased availability: Users require services to be highly available. That is, the proportion of time for which a service is accessible with reasonable response times should be close to 100%. Apart from delays due to pessimistic concurrency control conflicts (due to data locking), the factors that are relevant to high availability are:
server failures;
network partitions and disconnected operation (communication disconnections that are often unplanned and are a side effect of user mobility).

To take the first of these, replication is a technique for automatically maintaining the availability of data despite server failures. If data are replicated at two or more failure-independent servers, then client software may be able to access data at an alternative server should the default server fail or become unreachable. That is, the percentage of time during which the service is available can be enhanced by replicating server data. If each of n servers has an independent probability p of failing or becoming unreachable, then the availability of an object stored at each of these servers is 1 - p^n, that is, one minus the probability that all n servers have failed or become unreachable.
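
As a quick numerical check of this reasoning: the object is unavailable only when all n independent servers are down at once, which happens with probability p^n, so the availability is 1 - p^n. The function name and the sample figures below are illustrative.

```python
def availability(p, n):
    """Probability that at least one of n failure-independent servers is reachable.

    p -- probability that any one server has failed or is unreachable
    n -- number of servers holding a replica of the object
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    if n < 1:
        raise ValueError("at least one server is required")
    return 1.0 - p ** n


one_server = availability(0.05, 1)    # a single server that is down 5% of the time
two_servers = availability(0.05, 2)   # the same data replicated at two such servers
```

With these figures, adding a second replica raises availability from 95% to 99.75%. The gain depends entirely on the failures being independent.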

SYSTEM MODEL & GROUP COMMUNICATION

The data in our system consist of a collection of items that we shall call objects. An ‘object’ could be a file, say, or a Java object. But each such logical object is implemented by a collection of physical copies called replicas. The replicas are physical objects, each stored at a single computer, with data and behaviour that are tied to some degree of consistency by the system's operation. The ‘replicas’ of a given object are not necessarily identical, at least not at any particular point in time.

Some replicas may have received updates that others have not received. In this section, we provide a general system model for managing replicas and then describe the role of group communication systems in achieving fault tolerance through replication, highlighting the importance of view-synchronous group communication.

System model: We assume an asynchronous system in which processes may fail only by crashing. Our default assumption is that network partitions may not occur, but we shall sometimes consider what happens if they do occur. Network partitions make it harder to build failure detectors, which we use to achieve reliable and totally ordered multicast.

For the sake of generality, we describe architectural components by their roles and do not mean to imply that they are necessarily implemented by distinct processes (or hardware). The model involves replicas held by distinct replica managers (see Figure 18.1), which are components that contain the replicas on a given computer and perform operations upon them directly. This general model may be applied in a client-server environment, in which case a replica manager is a server. We shall sometimes simply call them servers instead. Equally, it may be applied to an application, and application processes can in that case act as both clients and replica managers. For example, the user's laptop on a train may contain an application that acts as a replica manager for their diary.

We shall always require that a replica manager applies operations to its replicas recoverably. This allows us to assume that an operation at a replica manager does not leave inconsistent results if it fails part way through. Such a replica manager applies operations to its replicas atomically (indivisibly), so that its execution is equivalent to performing operations in some strict sequence.

Moreover, the state of its replicas is a deterministic function of their initial states and the sequence of operations that it applies to them. Other stimuli, such as the reading on a clock or an attached sensor, have no bearing on these state values. Without this assumption, consistency guarantees between replica managers that accept update operations independently could not be made. The system can only determine which operations to apply at all replica managers and in what order; it cannot reproduce non-deterministic effects. The assumption implies that it may not be possible, depending upon the threading architecture, for the servers to be multi-threaded.

Often each replica manager maintains a replica of every object, and we assume this is so unless we state otherwise. However, the replicas of different objects may be maintained by different sets of replica managers. For example, one object may be needed mostly by clients on one network and another by clients on another network. There is little to be gained by replicating them at managers on the other network.

In general, five phases are involved in the performance of a single request upon the replicated objects [Wiesmann et al. 2000]. The actions in each phase vary according to the type of system, as will become clear in the next two sections. For example, a service that supports disconnected operation behaves differently from one that provides a fault-tolerant service. The phases are as follows:

1) Request: The front end issues the request to one or more replica managers: either the front end communicates with a single replica manager, which in turn communicates with other replica managers; or the front end multicasts the request to the replica managers.
2) Coordination: The replica managers coordinate in preparation for executing the request consistently. They agree, if necessary at this stage, on whether the request is to be applied (it might not be applied at all if failures occur at this stage). They also decide on the ordering of this request relative to others.
i. FIFO ordering: If a front end issues request r and then request r′, any correct replica manager that handles r′ handles r before it.
ii. Causal ordering: If the issue of request r happened-before the issue of request r′, then any correct replica manager that handles r′ handles r before it.
iii. Total ordering: If a correct replica manager handles r before request r′, then any correct replica manager that handles r′ handles r before it.

FAULT TOLERANT SERVICES

In this section, we examine how to provide a service that is correct despite up to f process failures, by replicating data and functionality at replica managers. For the sake of simplicity, we assume that communication remains reliable and that no partitions occur. Each replica manager is assumed to behave according to a specification of the semantics of the objects it manages, when it has not crashed. For example, a specification of bank accounts would include an assurance that funds transferred between bank accounts can never disappear, and that only deposits and withdrawals affect the balance of any particular account.

Intuitively, a service based on replication is correct if it keeps responding despite failures and if clients cannot tell the difference between the service they obtain from an implementation with replicated data and one provided by a single correct replica manager. Care is needed in meeting these criteria. If precautions are not taken, then anomalies can arise when there are several replica managers, even bearing in mind that we are considering the effects of individual operations, not transactions.

Consider a naive replication system, in which a pair of replica managers at computers A and B each maintain replicas of two bank accounts, x and y. Clients read and update the accounts at their local replica manager but try another replica manager if the local one fails.

Replica managers propagate updates to one another in the background after responding to the clients. Both accounts initially have a balance of $0.

Client 1 updates the balance of x at its local replica manager B to be $1 and then attempts to update y's balance to be $2, but discovers that B has failed. Client 1 therefore applies the update at A instead. Now client 2 reads the balances at its local replica manager A. It finds first that y has $2 and then that x has $0: the update to bank account x from B has not arrived, since B failed. The situation is shown below, where the operations are labelled by the computer at which they first took place and lower operations happen later:

This execution does not match a common-sense specification for the behaviour of bank accounts: client 2 should have read a balance of $1 for x, given that it read the balance of $2 for y, since y's balance was updated after that of x. The anomalous behaviour in the replicated case could not have occurred if the bank accounts had been implemented by a single server. We can construct systems that manage replicated objects without the anomalous behaviour produced by the naive protocol in our example. First, we need to understand what counts as correct behaviour for a replicated system.

A replicated shared object service is said to be linearizable if for any execution there is some interleaving of the series of operations issued by all the clients that satisfies the following two criteria:

1) The interleaved sequence of operations meets the specification of a (single) correct copy of the objects.
2) The order of operations in the interleaving is consistent with the real times at which the operations occurred in the actual execution.

This definition captures the idea that for any set of client operations there is a virtual canonical execution (the interleaved operations that the definition refers to) against a virtual single image of the shared objects. And each client sees a view of the shared objects that is consistent with that single image: that is, the results of the client's operations make sense as they occur within the interleaving.

The service that gave rise to the execution of the bank account clients in the preceding example is not linearizable. Even ignoring the real time at which the operations took place, there is no interleaving of the two clients' operations that would satisfy any correct bank account specification: for auditing purposes, if one account update occurred after another, then the first update should be observed if the second has been observed.
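
The claim that no interleaving works can be checked by brute force for this small example. The sketch below is an illustration under assumptions: operations are encoded as hypothetical (kind, account, value) tuples, the specification is reduced to "a read returns the most recently written balance", and real time is ignored, so only each client's own order of operations is preserved.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def meets_spec(history):
    """Check a history against a single correct copy of accounts x and y."""
    balances = {"x": 0, "y": 0}
    for kind, account, value in history:
        if kind == "set":
            balances[account] = value
        elif balances[account] != value:  # a "get" must return the current balance
            return False
    return True


# The execution from the bank account example.
client1 = [("set", "x", 1), ("set", "y", 2)]
client2 = [("get", "y", 2), ("get", "x", 0)]
explainable = any(meets_spec(h) for h in interleavings(client1, client2))
```

For this execution the search finds no acceptable interleaving. If client 2 had instead read x first and then y, the search would succeed.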

Note that linearizability concerns only the interleaving of individual operations and is not intended to be transactional. A linearizable execution may break application-specific notions of consistency if concurrency control is not applied.

The real-time requirement in linearizability is desirable in an ideal world, because it captures our notion that clients should receive up-to-date information. But, equally, the presence of real time in the definition raises the issue of linearizability's practicality, because we cannot always synchronize clocks to the required degree of accuracy. A weaker correctness condition is sequential consistency, which captures an essential requirement concerning the order in which requests are processed without appealing to real time. The definition keeps the first criterion from the definition for linearizability but modifies the second.

A replicated shared object service is said to be sequentially consistent if for any execution there is some interleaving of the series of operations issued by all the clients that satisfies the following two criteria:

1) The interleaved sequence of operations meets the specification of a (single) correct copy of the objects.
2) The order of operations in the interleaving is consistent with the program order in which each individual client executed them.

Note that absolute time does not appear in this definition. Nor does any other total order on all operations. The only notion of ordering that is relevant is the order of events at each separate client: the program order. The interleaving of operations can shuffle the sequence of operations from a set of clients in any order, as long as each client's order is not violated and the result of each operation is consistent, in terms of the objects' specification, with the operations that preceded it. This is similar to shuffling together several packs of cards so that they are intermingled in such a way as to preserve the original order of each pack.

Passive (primary-backup) replication: In the passive or primary-backup model of replication for fault tolerance (Figure 18.3), there is at any one time a single primary replica manager and one or more secondary replica managers (‘backups’ or ‘slaves’). In the pure form of the model, front ends communicate only with the primary replica manager to obtain the service. The primary replica manager executes the operations and sends copies of the updated data to the backups. If the primary fails, one of the backups is promoted to act as the primary.

The sequence of events when a client requests an operation to be performed is as follows:

1) Request: The front end issues the request, containing a unique identifier, to the primary replica manager.
2) Coordination: The primary takes each request atomically, in the order in which it receives it. It checks the unique identifier, in case it has already executed the request, and if so it simply resends the response.
3) Execution: The primary executes the request and stores the response.
4) Agreement: If the request is an update, then the primary sends the updated state, the response and the unique identifier to all the backups. The backups send an acknowledgement.
5) Response: The primary responds to the front end, which hands the response back to the client.
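
The five steps above can be sketched as a single-process simulation. It assumes that method calls stand in for messages, that a deposit is the only operation, and that no failures occur; it does not model view-synchronous communication or promotion of a backup. The class and method names are hypothetical.

```python
class Backup:
    """Holds a copy of the primary's state and of the stored responses."""

    def __init__(self):
        self.balances = {}
        self.responses = {}

    def apply_update(self, request_id, balances, response):
        self.balances = dict(balances)
        self.responses[request_id] = response
        return True  # acknowledgement


class Primary:
    def __init__(self, backups):
        self.balances = {}
        self.responses = {}  # unique request identifier -> stored response
        self.backups = backups

    def deposit(self, request_id, account, amount):
        # Coordination: a repeated identifier means the request was already executed.
        if request_id in self.responses:
            return self.responses[request_id]
        # Execution: apply the operation and store the response.
        self.balances[account] = self.balances.get(account, 0) + amount
        response = self.balances[account]
        self.responses[request_id] = response
        # Agreement: send updated state, response and identifier to all backups.
        acks = [b.apply_update(request_id, self.balances, response)
                for b in self.backups]
        if not all(acks):
            raise RuntimeError("a backup did not acknowledge the update")
        # Response: returned to the front end.
        return response


backups = [Backup(), Backup()]
primary = Primary(backups)
first = primary.deposit("req-1", "x", 5)
retry = primary.deposit("req-1", "x", 5)  # the front end resends the same request
```

Because the response is stored under the request identifier, the resent request gets the original answer and the deposit is not applied twice.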

This system obviously implements linearizability if the primary is correct, since the primary sequences all the operations upon the shared objects. If the primary fails, then the system retains linearizability if a single backup becomes the new primary and if the new system configuration takes over exactly where the last left off. That is, if:

The primary is replaced by a unique backup (if two clients began using two backups, then the system could perform incorrectly).
The replica managers that survive agree on which operations had been performed at the point when the replacement primary takes over.

Both of these requirements are met if the replica managers (primary and backups) are organized as a group and if the primary uses view-synchronous group communication to send the updates to the backups. The first of the above two requirements is then easily satisfied. When the primary crashes, the communication system eventually delivers a new view to the surviving backups, one that excludes the old primary. The backup that replaces the primary can be chosen by any function of that view. For example, the backups can choose the first member in that view as the replacement. That backup can register itself as the primary with a name service that the clients consult when they suspect that the primary has failed (or when they require the service in the first place).

The second requirement is also satisfied, by the ordering property of view-synchrony and the use of stored identifiers to detect repeated requests. The view-synchronous semantics guarantee that either all the backups or none of them will deliver any given update before delivering the new view. Thus the new primary and the surviving backups all agree on whether any particular client's update has or has not been processed.

Active replication: In the active model of replication for fault tolerance (see Figure 18.4), the replica managers are state machines that play equivalent roles and are organized as a group. Front ends multicast their requests to the group of replica managers, and all the replica managers process the request independently but identically and reply. If any replica manager crashes, this need have no impact upon the performance of the service, since the remaining replica managers continue to respond in the normal way. We shall see that active replication can tolerate Byzantine failures, because the front end can collect and compare the replies it receives.

Under active replication, the sequence of events when a client requests an operation to be performed is as follows:

1) Request: The front end attaches a unique identifier to the request and multicasts it to the group of replica managers, using a totally ordered, reliable multicast primitive. The front end is assumed to fail by crashing at worst. It does not issue the next request until it has received a response.
2) Coordination: The group communication system delivers the request to every correct replica manager in the same (total) order.
3) Execution: Every replica manager executes the request. Since they are state machines and since requests are delivered in the same total order, correct replica managers all process the request identically. The response contains the client's unique request identifier.
4) Agreement: No agreement phase is needed, because of the multicast delivery semantics.
5) Response: Each replica manager sends its response to the front end. The number of replies that the front end collects depends upon the failure assumptions and the multicast algorithm. If, for example, the goal is to tolerate only crash failures and the multicast satisfies uniform agreement and ordering properties, then the front end passes the first response to arrive back to the client and discards the rest (it can distinguish these from responses to other requests by examining the identifier in the response).
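
A single-process sketch of these steps is given below. It is an illustration under assumptions: a plain loop stands in for the totally ordered, reliable multicast, only crash-free operation is modelled, and the names ReplicaManager, totally_ordered_multicast and front_end_response are hypothetical.

```python
class ReplicaManager:
    """Deterministic state machine: same requests in the same order give the same state."""

    def __init__(self):
        self.balances = {}

    def execute(self, request):
        request_id, account, amount = request
        self.balances[account] = self.balances.get(account, 0) + amount
        return request_id, self.balances[account]  # the reply carries the identifier


def totally_ordered_multicast(replicas, requests):
    """Stand-in for the multicast: every replica sees every request in one order.

    Returns, for each request, the list of replies from all replicas.
    """
    return [[rm.execute(request) for rm in replicas] for request in requests]


def front_end_response(replies, request_id):
    """Crash-failure case: pass back the first reply for this request, discard the rest."""
    for reply_id, result in replies:
        if reply_id == request_id:
            return result
    raise LookupError("no reply for request " + request_id)


replicas = [ReplicaManager() for _ in range(3)]
requests = [("r1", "x", 5), ("r2", "x", -2)]
replies = totally_ordered_multicast(replicas, requests)
answer = front_end_response(replies[1], "r2")
```

To mask Byzantine failures the front end would compare several replies instead of taking the first; that comparison is not shown here.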

This system achieves sequential consistency. All correct replica managers process the same sequence of requests. The reliability of the multicast ensures that every correct replica manager processes the same set of requests, and the total order ensures that they process them in the same order. Since they are state machines, they all end up with the same state as one another after each request. Each front end's requests are served in FIFO order (because the front end awaits a response before making the next request), which is the same as ‘program order’. This ensures sequential consistency.

If clients do not communicate with other clients while waiting for responses to their requests, then their requests are processed in happened-before order. If clients are multithreaded and can communicate with one another while awaiting responses from the service, then to guarantee request processing in happened-before order we would have to replace the multicast with one that is both causally and totally ordered.

The active replication system does not achieve linearizability. This is because the total order in which the replica managers process requests is not necessarily the same as the real-time order in which the clients made their requests. Schneider [1990] describes how, in a synchronous system with approximately synchronized clocks, the total order in which the replica managers process requests can be based on the order of physical timestamps that the front ends supply with their requests. This does not guarantee linearizability, because the timestamps are not perfectly accurate; but it approximates it.

*****
