How To Use The Bayes Net Toolbox
This documentation was last updated on 29 October 2007.
Click here for a French version of this documentation (last updated in 2005).
Installation
Creating your first Bayes net
  Creating a model by hand
  Loading a model from a file
  Creating a model using a GUI
  Graph visualization
Inference
  Computing marginal distributions
  Computing joint distributions
  Soft/virtual evidence
  Most probable explanation
Conditional Probability Distributions
  Tabular (multinomial) nodes
  Noisy-or nodes
  Other (noisy) deterministic nodes
  Softmax (multinomial logit) nodes
  Neural network nodes
  Root nodes
  Gaussian nodes
  Generalized linear model nodes
  Classification/regression tree nodes
  Other continuous distributions
  Summary of CPD types
Example models
  Gaussian mixture models
  PCA, ICA, and all that
  Mixtures of experts
  Hierarchical mixtures of experts
  QMR
  Conditional Gaussian models
  Other hybrid models
Parameter learning
  Loading data from a file
  Maximum likelihood parameter estimation from complete data
  Parameter priors
  (Sequential) Bayesian parameter updating from complete data
  Maximum likelihood parameter estimation with missing values (EM)
  Parameter tying
Structure learning
  Exhaustive search
  K2
  Hill climbing
  MCMC
  Active learning
  Structural EM
  Visualizing the learned graph structure
  Constraint-based methods
Inference engines
  Junction tree
  Variable elimination
  Global inference methods
  Quickscore
  Belief propagation
  Sampling (Monte Carlo)
  Summary of inference engines
Influence diagrams/decision making
DBNs, HMMs, Kalman filters and all that
http://www.cs.ubc.ca/~murphyk/Software/BNT/usage.html#basics (retrieved 9/21/2016)
Creating your first Bayes net

To define a Bayes net, you must specify the graph structure and then the parameters. We look at each in turn, using a simple example (adapted from Russell and Norvig, "Artificial Intelligence: a Modern Approach", Prentice Hall, 1995, p. 454).

Graph structure

Consider the following network.

To specify this directed acyclic graph (dag), we create an adjacency matrix:

N = 4;
dag = zeros(N,N);
C = 1; S = 2; R = 3; W = 4;
dag(C,[R S]) = 1;
dag(R,W) = 1;
dag(S,W) = 1;

We have numbered the nodes as follows: Cloudy = 1, Sprinkler = 2, Rain = 3, WetGrass = 4. The nodes must always be numbered in topological order, i.e., ancestors before descendants. For a more complicated graph, this is a little inconvenient: we will see how to get around this below.
In Matlab 6, you can use logical arrays instead of double arrays, which are 4 times smaller:

dag = false(N,N);
dag(C,[R S]) = true;
...

However, some graph functions (e.g. acyclic) do not work on logical arrays!

You can visualize the resulting graph structure using the methods discussed below. For details on GUIs, click here.
Creating the Bayes net shell

In addition to specifying the graph structure, we must specify the size and type of each node. If a node is discrete, its size is the number of possible values each node can take on; if a node is continuous, it can be a vector, and its size is the length of this vector. In this case, we will assume all nodes are discrete and binary.

discrete_nodes = 1:N;
node_sizes = 2*ones(1,N);

If the nodes were not binary, you could type e.g.,
node_sizes = [4 2 3 5];

meaning that Cloudy has 4 possible values, Sprinkler has 2 possible values, etc. Note that these are cardinal values, not ordinal, i.e., they are not ordered in any way, like 'low', 'medium', 'high'.

We are now ready to make the Bayes net:

bnet = mk_bnet(dag, node_sizes, 'discrete', discrete_nodes);

By default, all nodes are assumed to be discrete, so we can also just write

bnet = mk_bnet(dag, node_sizes);

You may also specify which nodes will be observed. If you don't know, or if this is not fixed in advance, just use the empty list (the default).

onodes = [];
bnet = mk_bnet(dag, node_sizes, 'discrete', discrete_nodes, 'observed', onodes);

Note that optional arguments are specified using a name/value syntax. This is common for many BNT functions. In general, to find out more about a function (e.g., which optional arguments it takes), please see its documentation string by typing

help mk_bnet

See also other useful Matlab tips.

It is possible to associate names with nodes, as follows:

bnet = mk_bnet(dag, node_sizes, 'names', {'cloudy','S','R','W'}, 'discrete', 1:4);

You can then refer to a node by its name:

C = bnet.names('cloudy'); % bnet.names is an associative array
bnet.CPD{C} = tabular_CPD(bnet, C, [0.5 0.5]);

This feature uses my own associative array class.
Parameters

A model consists of the graph structure and the parameters. The parameters are represented by CPD objects (CPD = Conditional Probability Distribution), which define the probability distribution of a node given its parents. (We will use the terms "node" and "random variable" interchangeably.) The simplest kind of CPD is a table (multi-dimensional array), which is suitable when all the nodes are discrete-valued. Note that the discrete values are not assumed to be ordered in any way; that is, they represent categorical quantities, like male and female, rather than ordinal quantities, like low, medium and high. (We will discuss CPDs in more detail below.)

Tabular CPDs, also called CPTs (conditional probability tables), are stored as multidimensional arrays, where the dimensions are arranged in the same order as the nodes, e.g., the CPT for node 4 (WetGrass) is indexed by Sprinkler (2), Rain (3) and then WetGrass (4) itself. Hence the child is always the last dimension. If a node has no parents, its CPT is a column vector representing its prior. Note that in Matlab (unlike C), arrays are indexed from 1, and are laid out in memory such that the first index toggles fastest, e.g., the CPT for node 4 (WetGrass) is as follows

S R  P(W=F)  P(W=T)
F F  1.0     0.0
T F  0.1     0.9
F T  0.1     0.9
T T  0.01    0.99

where we have used the convention that false==1, true==2. We can create this CPT in Matlab as follows

CPT = zeros(2,2,2);
CPT(1,1,1) = 1.0;
CPT(2,1,1) = 0.1;
...
Here is an easier way:
CPT = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], [2 2 2]);

In fact, we don't need to reshape the array, since the CPD constructor will do that for us. So we can just write

bnet.CPD{W} = tabular_CPD(bnet, W, 'CPT', [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);

The other nodes are created similarly (using the old syntax for optional parameters)

bnet.CPD{C} = tabular_CPD(bnet, C, [0.5 0.5]);
bnet.CPD{R} = tabular_CPD(bnet, R, [0.8 0.2 0.2 0.8]);
bnet.CPD{S} = tabular_CPD(bnet, S, [0.5 0.9 0.5 0.1]);
bnet.CPD{W} = tabular_CPD(bnet, W, [1 0.1 0.1 0.01 0 0.9 0.9 0.99]);
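With all the CPDs in place, a quick sanity check is to draw joint samples from the model. A minimal sketch, assuming BNT's sample_bnet function (which returns a cell array with one entry per node):

```matlab
% Draw a few joint samples from the sprinkler network.
% Each call returns an N x 1 cell array of node values (false==1, true==2).
nsamples = 5;
samples = cell(N, nsamples);
for s = 1:nsamples
  samples(:,s) = sample_bnet(bnet);
end
% e.g. samples{W,1} is the sampled value of WetGrass in the first sample
```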
Random Parameters

If we do not specify the CPT, random parameters will be created, i.e., each "row" of the CPT will be drawn from the uniform distribution. To ensure repeatable results, use

rand('state', seed);
randn('state', seed);

To control the degree of randomness (entropy), you can sample each row of the CPT from a Dirichlet(p,p,...) distribution. If p << 1, this encourages "deterministic" CPTs (one entry near 1, the rest near 0). If p = 1, each entry is drawn from U[0,1]. If p >> 1, the entries will all be near 1/k, where k is the arity of this node, i.e., each row will be nearly uniform. You can do this as follows, assuming this node is number i, and ns is the node_sizes.

k = ns(i);
ps = parents(dag, i);
psz = prod(ns(ps));
CPT = sample_dirichlet(p*ones(1,k), psz);
bnet.CPD{i} = tabular_CPD(bnet, i, 'CPT', CPT);
Loading a network from a file

If you already have a Bayes net represented in the XML-based Bayes Net Interchange Format (BNIF) (e.g., downloaded from the Bayes Net repository), you can convert it to BNT format using the BIF-BNT Java program written by Ken Shan. (This is not necessarily up to date.)

It is currently not possible to save/load a BNT matlab object to file, but this is easily fixed if you modify all the constructors for all the classes (see matlab documentation).

Creating a model using a GUI

Senthil Nachimuthu has started (Oct 07) an open source GUI for BNT called projeny using Java. This is a successor to BNJ.
Philippe LeRay has written (Sep 05) a BNT GUI in matlab.
LinkStrength, a package by Imme Ebert-Uphoff for visualizing the strength of dependencies between nodes.

Graph visualization

Click here for more information on graph visualization.
Inference

Having created the BN, we can now use it for inference. There are many different algorithms for doing inference in Bayes nets, which make different tradeoffs between speed, complexity, generality, and accuracy. BNT therefore offers a variety of different inference "engines". We will discuss these in more detail below. For now, we will use the junction tree engine, which is the mother of all exact inference algorithms. This can be created as follows.

engine = jtree_inf_engine(bnet);

The other engines have similar constructors, but might take additional, algorithm-specific parameters. All engines are used in the same way, once they have been created. We illustrate this in the following sections.
Computing marginal distributions
Suppose we want to compute the probability that the sprinkler was on given that the grass is wet. The evidence consists of the fact that W = 2. All the other nodes are hidden (unobserved). We can specify this as follows.

evidence = cell(1,N);
evidence{W} = 2;

We use a 1D cell array instead of a vector to cope with the fact that nodes can be vectors of different lengths. In addition, the value [] can be used to denote 'no evidence', instead of having to specify the observation pattern as a separate argument. (Click here for a quick tutorial on cell arrays in matlab.)

We are now ready to add the evidence to the engine.

[engine, loglik] = enter_evidence(engine, evidence);

The behavior of this function is algorithm-specific, and is discussed in more detail below. In the case of the jtree engine, enter_evidence implements a two-pass message-passing scheme. The first return argument contains the modified engine, which incorporates the evidence. The second return argument contains the log-likelihood of the evidence. (Not all engines are capable of computing the log-likelihood.)

Finally, we can compute p = P(S=2 | W=2) as follows.

marg = marginal_nodes(engine, S);
marg.T
ans =
    0.57024
    0.42976
p = marg.T(2);

We see that p = 0.4298.

Now let us add the evidence that it was raining, and see what difference it makes.

evidence{R} = 2;
[engine, loglik] = enter_evidence(engine, evidence);
marg = marginal_nodes(engine, S);
p = marg.T(2);

We find that p = P(S=2 | W=2, R=2) = 0.1945, which is lower than before, because the rain can "explain away" the fact that the grass is wet.
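Since the sprinkler network is tiny, you can sanity-check these numbers without any inference engine by brute-force enumeration. A minimal sketch in plain Matlab, using the CPTs defined above:

```matlab
% Enumerate all joint configurations and compute P(S=2|W=2) directly.
priorC = [0.5 0.5];
R_CPT = reshape([0.8 0.2 0.2 0.8], [2 2]);   % R_CPT(c,r) = P(R=r|C=c)
S_CPT = reshape([0.5 0.9 0.5 0.1], [2 2]);   % S_CPT(c,s) = P(S=s|C=c)
W_CPT = reshape([1 0.1 0.1 0.01 0 0.9 0.9 0.99], [2 2 2]); % W_CPT(s,r,w)
num = 0; den = 0;
for c=1:2, for s=1:2, for r=1:2
  p = priorC(c) * S_CPT(c,s) * R_CPT(c,r) * W_CPT(s,r,2);  % evidence W=2
  den = den + p;
  if s==2, num = num + p; end
end, end, end
num/den   % approx 0.4298, matching the jtree result
```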
You can plot a marginal distribution over a discrete variable as a bar chart using the built-in 'bar' function:

bar(marg.T)

This is what it looks like
Observed nodes

What happens if we ask for the marginal on an observed node, e.g. P(W | W=2)? An observed discrete node effectively only has 1 value (the observed one); all other values would result in 0 probability. For efficiency, BNT treats observed (discrete) nodes as if they were set to 1, as we see below:

evidence = cell(1,N);
evidence{W} = 2;
engine = enter_evidence(engine, evidence);
m = marginal_nodes(engine, W);
m.T
ans =
     1

This can get a little confusing, since we assigned W=2. So we can ask BNT to add the evidence back in by passing in an optional argument:

m = marginal_nodes(engine, W, 1);
m.T
ans =
     0
     1

This shows that P(W=1 | W=2) = 0 and P(W=2 | W=2) = 1.
Computing joint distributions

We can compute the joint probability on a set of nodes as in the following example.

evidence = cell(1,N);
[engine, ll] = enter_evidence(engine, evidence);
m = marginal_nodes(engine, [S R W]);

m is a structure. The 'T' field is a multidimensional array (in this case, 3-dimensional) that contains the joint probability distribution on the specified nodes.

>> m.T
ans(:,:,1) =
    0.2900    0.0410
    0.0210    0.0009
ans(:,:,2) =
         0    0.3690
    0.1890    0.0891

We see that P(S=1, R=1, W=2) = 0, since it is impossible for the grass to be wet if both the rain and sprinkler are off.

Let us now add some evidence to R.

evidence{R} = 2;
[engine, ll] = enter_evidence(engine, evidence);
m = marginal_nodes(engine, [S R W])
m =
    domain: [2 3 4]
         T: [2x1x2 double]
>> m.T
ans(:,:,1) =
    0.0820
    0.0018
ans(:,:,2) =
    0.7380
    0.1782

The joint T(i,j,k) = P(S=i, R=j, W=k | evidence) should have T(i,1,k) = 0 for all i,k, since R=1 is incompatible with the evidence that R=2. Instead of creating large tables with many 0s, BNT sets the effective size of observed (discrete) nodes to 1, as explained above. This is why m.T has size 2x1x2. To get a 2x2x2 table, type

m = marginal_nodes(engine, [S R W], 1)
m =
    domain: [2 3 4]
         T: [2x2x2 double]
>> m.T
ans(:,:,1) =
         0    0.0820
         0    0.0018
ans(:,:,2) =
         0    0.7380
         0    0.1782

Note: It is not always possible to compute the joint on arbitrary sets of nodes: it depends on which inference engine you use, as discussed in more detail below.
Soft/virtual evidence

Sometimes a node is not observed, but we have some distribution over its possible values; this is often called "soft" or "virtual" evidence. One can use this as follows

[engine, loglik] = enter_evidence(engine, evidence, 'soft', soft_evidence);

where soft_evidence{i} is either [] (if node i has no soft evidence) or is a vector representing the probability distribution over i's possible values. For example, if we don't know i's exact value, but we know its likelihood ratio is 60/40, we can write evidence{i} = [] and soft_evidence{i} = [0.6 0.4].

Currently only jtree_inf_engine supports this option. It assumes that all hidden nodes, and all nodes for which we have soft evidence, are discrete. For a longer example, see BNT/examples/static/softev1.m.
Most probable explanation

To compute the most probable explanation (MPE) of the evidence (i.e., the most probable assignment, or a mode of the joint), use

[mpe, ll] = calc_mpe(engine, evidence);

mpe{i} is the most likely value of node i. This calls enter_evidence with the 'maximize' flag set to 1, which causes the engine to do max-product instead of sum-product. The resulting max-marginals are then thresholded. If there is more than one maximum probability assignment, we must take care to break ties in a consistent manner (thresholding the max-marginals may give the wrong result). To force this behavior, type

[mpe, ll] = calc_mpe(engine, evidence, 1);

Note that computing the MPE is sometimes called abductive reasoning.

You can also use calc_mpe_bucket, written by Ron Zohar, which does a forwards max-product pass, and then a backwards traceback pass, which is how Viterbi is traditionally implemented.
Conditional Probability Distributions

A Conditional Probability Distribution (CPD) defines P(X(i) | X(Pa(i))), where X(i) is the i'th node, and X(Pa(i)) are the parents of node i. There are many ways to represent this distribution, which depend in part on whether X(i) and X(Pa(i)) are discrete, continuous, or a combination. We will discuss various representations below.

Tabular nodes

If the CPD is represented as a table (i.e., if it is a multinomial distribution), it has a number of parameters that is exponential in the number of parents. See the example above.
Noisy-or nodes

A noisy-OR node is like a regular logical OR gate except that sometimes the effects of parents that are on get inhibited. Let the prob. that parent i gets inhibited be q(i). Then a node, C, with 2 parents, A and B, has the following CPD, where we use F and T to represent off and on (1 and 2 in BNT).

A B  P(C=off)   P(C=on)
F F  1.0        0.0
T F  q(A)       1-q(A)
F T  q(B)       1-q(B)
T T  q(A)q(B)   1-q(A)q(B)

Thus we see that the causes get inhibited independently. It is common to associate a "leak" node with a noisy-or CPD, which is like a parent that is always on. This can account for all other unmodelled causes which might turn the node on.

The noisy-or distribution is similar to the logistic distribution. To see this, let the nodes, S(i), have values in {0,1}, and let q(i,j) be the prob. that j inhibits i. Then

Pr(S(i)=1 | parents(S(i))) = 1 - prod_{j} q(i,j)^S(j)

Now define w(i,j) = -ln q(i,j) and rho(x) = 1 - exp(-x). Then
Pr(S(i)=1 | parents(S(i))) = rho(sum_j w(i,j) S(j))

For a sigmoid node, we have

Pr(S(i)=1 | parents(S(i))) = sigma(sum_j w(i,j) S(j))

where sigma(x) = 1/(1+exp(-x)). Hence they differ in the choice of the activation function (although both are monotonically increasing). In addition, in the case of a noisy-or, the weights are constrained to be positive, since they derive from probabilities q(i,j). In both cases, the number of parameters is linear in the number of parents, unlike the case of a multinomial distribution, where the number of parameters is exponential in the number of parents. We will see an example of noisy-OR nodes below.
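The two-parent table above translates directly into a CPT. A minimal sketch in plain Matlab, with hypothetical inhibition probabilities qA and qB:

```matlab
% Build the 2x2x2 CPT for a noisy-OR node C with parents A and B,
% using the convention false==1, true==2.
qA = 0.1; qB = 0.2;            % hypothetical inhibition probabilities
CPT = zeros(2,2,2);
CPT(1,1,1) = 1;                % both parents off => C is off for sure
CPT(2,1,1) = qA;               % A on, but inhibited with prob qA
CPT(1,2,1) = qB;
CPT(2,2,1) = qA*qB;            % causes inhibited independently
CPT(:,:,2) = 1 - CPT(:,:,1);   % P(C=on) = 1 - P(C=off)
```

In BNT itself you would normally use the noisyor_CPD class (as in the QMR example below) rather than building the table by hand.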
Other (noisy) deterministic nodes

Deterministic CPDs for discrete random variables can be created using the deterministic_CPD class. It is also possible to 'flip' the output of the function with some probability, to simulate noise. The boolean_CPD class is just a special case of a deterministic CPD, where the parents and child are all binary.

Both of these classes are just "syntactic sugar" for the tabular_CPD class.
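Since these classes reduce to tables, a deterministic binary AND, for example, can be written out explicitly. A minimal sketch of the table such a CPD corresponds to, including the noise flip:

```matlab
% Tabular encoding of a deterministic AND node C with binary parents A, B
% (false==1, true==2).
CPT = zeros(2,2,2);
CPT(:,:,1) = [1 1; 1 0];  % P(C=off|A,B): off unless both parents are on
CPT(:,:,2) = [0 0; 0 1];  % P(C=on |A,B): on only if A and B are both on
% To simulate noise, 'flip' the output with some small probability:
noise = 0.01;
noisyCPT = (1-noise)*CPT + noise*CPT(:,:,[2 1]);  % swap on/off slices
```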
Softmax nodes

If we have a discrete node with a continuous parent, we can define its CPD using a softmax function (also known as the multinomial logit function). This acts like a soft thresholding operator, and is defined as follows:

Pr(Q=i | X=x) = exp(w(:,i)'*x + b(i)) / sum_j exp(w(:,j)'*x + b(j))

The parameters of a softmax node, w(:,i) and b(i), i=1..|Q|, have the following interpretation: w(:,i)-w(:,j) is the normal vector to the decision boundary between classes i and j, and b(i)-b(j) is its offset (bias). For example, suppose X is a 2-vector, and Q is binary. Then

w = [1 -1;
     0  0];
b = [0 0];

means class 1 are points in the 2D plane with positive x-coordinate, and class 2 are points in the 2D plane with negative x-coordinate. If w has large magnitude, the decision boundary is sharp, otherwise it is soft. In the special case that Q is binary (0/1), the softmax function reduces to the logistic (sigmoid) function.
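To make the decision-boundary interpretation concrete, we can evaluate the softmax by hand for the w and b above (the test point x is an arbitrary choice):

```matlab
w = [1 -1; 0 0]; b = [0 0];
x = [2; 0.5];                 % a point with positive x-coordinate
a = w'*x + b';                % per-class activations: [2; -2]
p = exp(a) / sum(exp(a));     % softmax probabilities
% p(1) > p(2): the point is assigned to class 1, as expected
```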
Fitting a softmax function can be done using the iteratively reweighted least squares (IRLS) algorithm. We use the implementation from Netlab. Note that since the softmax distribution is not in the exponential family, it does not have finite sufficient statistics, and hence we must store all the training data in uncompressed form. If this takes too much space, one should use online (stochastic) gradient descent (not implemented in BNT).

If a softmax node also has discrete parents, we use a different set of w/b parameters for each combination of parent values, as in the conditional linear Gaussian CPD. This feature was implemented by Pierpaolo Brutti. He is currently extending it so that discrete parents can be treated as if they were continuous, by adding indicator variables to the X vector.

We will see an example of softmax nodes below.
Neural network nodes

Pierpaolo Brutti has implemented the mlp_CPD class, which uses a multi-layer perceptron to implement a mapping from continuous parents to discrete children, similar to the softmax function. (If there are also discrete parents, it creates a mixture of MLPs.) It uses code from Netlab. This is work in progress.

Root nodes

A root node has no parents and no parameters; it can be used to model an observed, exogeneous input variable, i.e., one which is "outside" the model. This is useful for conditional density models. We will see an example of root nodes below.
Gaussian nodes
We now consider a distribution suitable for the continuous-valued nodes. Suppose the node is called Y, its continuous parents (if any) are called X, and its discrete parents (if any) are called Q. The distribution on Y is defined as follows:

no parents:                 Y ~ N(mu, Sigma)
cts parents:                Y|X=x ~ N(mu + W*x, Sigma)
discrete parents:           Y|Q=i ~ N(mu(:,i), Sigma(:,:,i))
cts and discrete parents:   Y|X=x,Q=i ~ N(mu(:,i) + W(:,:,i)*x, Sigma(:,:,i))

where N(mu, Sigma) denotes a Normal distribution with mean mu and covariance Sigma. Let |X|, |Y| and |Q| denote the sizes of X, Y and Q respectively. If there are no discrete parents, |Q|=1; if there is more than one, then |Q| = a vector of the sizes of each discrete parent. If there are no continuous parents, |X|=0; if there is more than one, then |X| = the sum of their sizes. Then mu is a |Y|*|Q| vector, Sigma is a |Y|*|Y|*|Q| positive semi-definite matrix, and W is a |Y|*|X|*|Q| regression (weight) matrix.
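As a small illustration of the conditional linear Gaussian case (the last line above), with hypothetical sizes |Y|=2, |X|=3 and a single discrete parent with |Q|=2:

```matlab
% Parameters for Y | X=x, Q=i ~ N(mu(:,i) + W(:,:,i)*x, Sigma(:,:,i))
mu = zeros(2,2);                 % one mean column per discrete state
W  = randn(2,3,2);               % one regression matrix per discrete state
Sigma = repmat(eye(2), [1 1 2]); % one covariance per discrete state
x = randn(3,1); i = 2;
m = mu(:,i) + W(:,:,i)*x;        % conditional mean of Y given X=x, Q=i
% Y is then distributed as N(m, Sigma(:,:,i))
```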
We can create a Gaussian node with random parameters as follows.

bnet.CPD{i} = gaussian_CPD(bnet, i);

We can specify the value of one or more of the parameters as in the following example, in which |Y|=2, and |Q|=1.

bnet.CPD{i} = gaussian_CPD(bnet, i, 'mean', [0; 0], 'weights', randn(Y,X), 'cov', eye(Y));

We will see an example of conditional linear Gaussian nodes below.

When learning Gaussians from data, it is helpful to ensure the data has a small magnitude (see e.g., KPMstats/standardize) to prevent numerical problems. Unless you have a lot of data, it is also a very good idea to use diagonal instead of full covariance matrices. (BNT does not currently support spherical covariances, although it would be easy to add, since KPMstats/clg_Mstep supports this option; you would just need to modify gaussian_CPD/update_ess to accumulate weighted inner products.)
Other continuous distributions

Currently BNT does not support any CPDs for continuous nodes other than the Gaussian. However, you can use a mixture of Gaussians to approximate other continuous distributions. We will see an example of this with the IFA model below.

Generalized linear model nodes

In the future, we may incorporate some of the functionality of glmlab into BNT.

Classification/regression tree nodes

We plan to add classification and regression trees to define CPDs for discrete and continuous nodes, respectively. Trees have many advantages: they are easy to interpret, they can do feature selection, they can handle discrete and continuous inputs, they do not make strong assumptions about the form of the distribution, the number of parameters can grow in a data-dependent way (i.e., they are semi-parametric), they can handle missing data, etc. However, they are not yet implemented.
Summary of CPD types

We list all the different types of CPDs supported by BNT. For each CPD, we specify if the child and parents can be discrete (D) or continuous (C) (Binary (B) nodes are a special case). We also specify which methods each class supports. If a method is inherited, the name of the parent class is mentioned. If a parent class calls a child method, this is mentioned.

The CPD_to_CPT method converts a CPD to a table; this requires that the child and all parents are discrete. The CPT might be exponentially big... convert_to_table evaluates a CPD with evidence, and represents the resulting potential as an array. This requires that the child is discrete, and any continuous parents are observed. convert_to_pot evaluates a CPD with evidence, and represents the resulting potential as a dpot, gpot, cgpot or upot, as requested. (d = discrete, g = Gaussian, cg = conditional Gaussian, u = utility).

When we sample a node, all the parents are observed. When we compute the (log) probability of a node, all the parents and the child are observed.

We also specify if the parameters are learnable. For learning with EM, we require the methods reset_ess, update_ess and maximize_params. For learning from fully observed data, we require the method learn_params. By default, all classes inherit this from generic_CPD, which simply calls update_ess N times, once for each data case, followed by maximize_params, i.e., it is like EM, without the E step. Some classes implement a batch formula, which is quicker.

Bayesian learning means computing a posterior over the parameters given fully observed data.
Pearl means we implement the methods compute_pi and compute_lambda_msg, used by pearl_inf_engine, which runs on directed graphs. belprop_inf_engine only needs convert_to_pot. The Pearl methods can exploit special properties of the CPDs for computing the messages efficiently, whereas belprop does not.

The only method implemented by generic_CPD is adjustable_CPD, which is not shown, since it is not very interesting.
[The summary table cannot be fully recovered from this copy. It lists each CPD class — boolean, deterministic, Gaussian, gmux (multiplexer), MLP (multi-layer perceptron), noisy-or, root, softmax, tabular, and the generic/virtual base class — together with the allowed child/parent types (C, D or C/D; root has no parents and no params), and, for each of the methods above (CPD_to_CPT, convert_to_table, sample, prob, learning), whether the class implements it (Y/N), inherits it (e.g. from discrete), or calls another method.]
Example models

Gaussian mixture models

Richard W. DeVaul has made a detailed tutorial on how to fit mixtures of Gaussians using BNT. Available here.

PCA, ICA, and all that

In Figure (a) below, we show how Factor Analysis can be thought of as a graphical model. Here, X has an N(0,I) prior, and Y|X=x ~ N(mu + Wx, Psi), where Psi is diagonal and W is called the "factor loading matrix". Since the noise on both X and Y is diagonal, the components of these vectors are uncorrelated, and hence can be represented as individual scalar nodes, as we show in (b). (This is useful if parts of the observations on the Y vector are occasionally missing.) We usually take k = |X| << |Y| = D, so the model tries to explain many observations using a low-dimensional subspace.

(a) (b) (c) (d)

We can create this model in BNT as follows.
ns = [k D];
dag = zeros(2,2);
dag(1,2) = 1;
bnet = mk_bnet(dag, ns, 'discrete', []);
bnet.CPD{1} = gaussian_CPD(bnet, 1, 'mean', zeros(k,1), 'cov', eye(k), ...
   'cov_type', 'diag', 'clamp_mean', 1, 'clamp_cov', 1);
bnet.CPD{2} = gaussian_CPD(bnet, 2, 'mean', zeros(D,1), 'cov', diag(Psi0), 'weights', W0, ...
   'cov_type', 'diag', 'clamp_mean', 1);

The root node is clamped to the N(0,I) distribution, so that we will not update these parameters during learning. The mean of the leaf node is clamped to 0, since we assume the data has been centered (had its mean subtracted off); this is just for simplicity. Finally, the covariance of the leaf node is constrained to be diagonal. W0 and Psi0 are the initial parameter guesses.

We can fit this model (i.e., estimate its parameters in a maximum likelihood (ML) sense) using EM, as we explain below. Not surprisingly, the ML estimates for mu and Psi turn out to be identical to the sample mean and variance, which can be computed directly as

mu_ML = mean(data);
Psi_ML = diag(cov(data));

Note that W can only be identified up to a rotation matrix, because of the spherical symmetry of the source.

If we restrict Psi to be spherical, i.e., Psi = sigma*I, there is a closed form solution for W as well, i.e., we do not need to use EM. In particular, W contains the first |X| eigenvectors of the sample covariance matrix, with scalings determined by the eigenvalues and sigma. Classical PCA can be obtained by taking the sigma -> 0 limit. For details, see

"EM algorithms for PCA and SPCA", Sam Roweis, NIPS 97. (Matlab software)
"Mixtures of probabilistic principal component analyzers", Tipping and Bishop, Neural Computation 11(2):443-482, 1999.
By adding a hidden discrete variable, we can create mixtures of FA models, as shown in (c). Now we can explain the data using a set of subspaces. We can create this model in BNT as follows.

ns = [M k D];
dag = zeros(3);
dag(1,3) = 1;
dag(2,3) = 1;
bnet = mk_bnet(dag, ns, 'discrete', 1);
bnet.CPD{1} = tabular_CPD(bnet, 1, Pi0);
bnet.CPD{2} = gaussian_CPD(bnet, 2, 'mean', zeros(k,1), 'cov', eye(k), 'cov_type', 'diag', ...
   'clamp_mean', 1, 'clamp_cov', 1);
bnet.CPD{3} = gaussian_CPD(bnet, 3, 'mean', Mu0', 'cov', repmat(diag(Psi0), [1 1 M]), ...
   'weights', W0, 'cov_type', 'diag', 'tied_cov', 1);

Notice how the covariance matrix for Y is the same for all values of Q; that is, the noise level in each subspace is assumed the same. However, we allow the offset, mu, to vary. For details, see

The EM Algorithm for Mixtures of Factor Analyzers, Ghahramani, Z. and Hinton, G.E. (1996), University of Toronto Technical Report CRG-TR-96-1. (Matlab software)
"Mixtures of probabilistic principal component analyzers", Tipping and Bishop, Neural Computation 11(2):443-482, 1999.

I have included Zoubin's specialized MFA code (with his permission) with the toolbox, so you can check that BNT gives the same results: see 'BNT/examples/static/mfa1.m'.

Independent Factor Analysis (IFA) generalizes FA by allowing a non-Gaussian prior on each component of X. (Note that we can approximate a non-Gaussian prior using a mixture of Gaussians.) This means that the likelihood function is no longer rotationally invariant, so we can uniquely identify W and the hidden sources X. IFA also allows a non-diagonal Psi (i.e. correlations between the components of Y). We recover classical Independent Components Analysis (ICA) in the Psi -> 0 limit, and by assuming that |X|=|Y|, so that the weight matrix W is square and invertible. For details, see

Independent Factor Analysis, H. Attias, Neural Computation 11:803-851, 1998.
Mixtures of experts

As an example of the use of the softmax function, we introduce the Mixture of Experts model. As before, circles denote continuous-valued nodes, squares denote discrete nodes, clear means hidden, and shaded means observed.
X is the observed input, Y is the output, and the Q nodes are hidden "gating" nodes, which select the appropriate set of parameters for Y. During training, Y is assumed observed, but for testing, the goal is to predict Y given X. Note that this is a conditional density model, so we don't associate any parameters with X. Hence X's CPD will be a root CPD, which is a way of modelling exogenous nodes. If the output is a continuous-valued quantity, we assume the "experts" are linear-regression units, and set Y's CPD to linear-Gaussian. If the output is discrete, we set Y's CPD to a softmax function. The Q CPDs will always be softmax functions.

As a concrete example, consider the mixture of experts model where X and Y are scalars, and Q is binary. This is just piecewise linear regression, where we have two line segments, i.e.,

We can create this model with random parameters as follows. (This code is bundled in BNT/examples/static/mixexp2.m.)

X = 1;
Q = 2;
Y = 3;
dag = zeros(3,3);
dag(X,[Q Y]) = 1;
dag(Q,Y) = 1;
ns = [1 2 1]; % make X and Y scalars, and have 2 experts
onodes = [1 3];
bnet = mk_bnet(dag, ns, 'discrete', 2, 'observed', onodes);

rand('state', 0);
randn('state', 0);
bnet.CPD{1} = root_CPD(bnet, 1);
bnet.CPD{2} = softmax_CPD(bnet, 2);
bnet.CPD{3} = gaussian_CPD(bnet, 3);

Now let us fit this model using EM. First we load the data (1000 training cases) and plot them.

data = load('/examples/static/Misc/mixexp_data.txt', '-ascii');
plot(data(:,1), data(:,2), '.');
This is what the model looks like before training. (Thanks to Thomas Hofman for writing this plotting routine.)

Now let's train the model, and plot the final performance. (We will discuss how to train models in more detail below.)

ncases = size(data, 1); % each row of data is a training case
cases = cell(3, ncases);
cases([1 3], :) = num2cell(data'); % each column of cases is a training case
engine = jtree_inf_engine(bnet);
max_iter = 20;
[bnet2, LLtrace] = learn_params_em(engine, cases, max_iter);

(We specify which nodes will be observed when we create the engine. Hence BNT knows that the hidden nodes are all discrete. For complex models, this can lead to a significant speedup.) Below we show what the model looks like after 16 iterations of EM (with 100 IRLS iterations per M step), when it converged using the default convergence tolerance (that the fractional change in the log-likelihood be less than 1e-3). Before learning, the log-likelihood was -322.927442; afterwards, it was -13.728778.
(See BNT/examples/static/mixexp2.m for details of the code.)

Hierarchical mixtures of experts

A hierarchical mixture of experts (HME) extends the mixture of experts model by having more than one hidden node. A two-level example is shown below, along with its more traditional representation as a neural network. This is like a (balanced) probabilistic decision tree of height 2.

Pierpaolo Brutti has written an extensive set of routines for HMEs, which are bundled with BNT: see the examples/static/HME directory. These routines allow you to choose the number of hidden (gating) layers, and the form of the experts (softmax or MLP). See the file hmemenu, which provides a demo. For example, the figure below shows the decision boundaries learned for a ternary classification problem, using a 2-level HME with softmax gates and softmax experts; the training set is on the left, the testing set on the right.
For more details, see the following:

Hierarchical mixtures of experts and the EM algorithm, M. I. Jordan and R. A. Jacobs. Neural Computation, 6, 181-214, 1994.
David Martin's matlab code for HME
Why the logistic function? A tutorial discussion on probabilities and neural networks. M. I. Jordan. MIT Computational Cognitive Science Report 9503, August 1995.
"Generalized Linear Models", McCullagh and Nelder, Chapman and Hall, 1983.
"Improved learning algorithms for mixtures of experts in multiclass classification". K. Chen, L. Xu, H. Chi. Neural Networks (1999) 12: 1229-1252.
Classification Using Hierarchical Mixtures of Experts, S. R. Waterhouse and A. J. Robinson. In Proc. IEEE Workshop on Neural Network for Signal Processing IV (1994), pp. 177-186.
Localized mixtures of experts, P. Moerland, 1998.
"Nonlinear gated experts for time series", A. S. Weigend and M. Mangeas, 1995.
QMR

Bayes nets originally arose out of an attempt to add probabilities to expert systems, and this is still the most common use for BNs. A famous example is QMR-DT, a decision-theoretic reformulation of the Quick Medical Reference (QMR) model.

Here, the top layer represents hidden disease nodes, and the bottom layer represents observed symptom nodes. The goal is to infer the posterior probability of each disease given all the symptoms (which can be present, absent or unknown). Each node in the top layer has a Bernoulli prior (with a low prior probability that the disease is present). Since each node in the bottom layer has a high fan-in, we use a noisy-OR parameterization; each disease has an independent chance of causing each symptom. The real QMR-DT model is copyright, but we can create a random QMR-like model as follows.

function bnet = mk_qmr_bnet(G, inhibit, leak, prior)
% MK_QMR_BNET Make a QMR model
% bnet = mk_qmr_bnet(G, inhibit, leak, prior)
%
% G(i,j) = 1 iff there is an arc from disease i to finding j
% inhibit(i,j) = inhibition probability on i->j arc
% leak(j) = inhibition prob. on leak->j arc
% prior(i) = prob. disease i is on

[Ndiseases Nfindings] = size(inhibit);
N=Ndiseases+Nfindings;
finding_node=Ndiseases+1:N;
ns=2*ones(1,N);
dag=zeros(N,N);
dag(1:Ndiseases,finding_node)=G;
bnet=mk_bnet(dag,ns,'observed',finding_node);
ford=1:Ndiseases
CPT=[1prior(d)prior(d)];
bnet.CPD{d}=tabular_CPD(bnet,d,CPT');
end
fori=1:Nfindings
fnode=finding_node(i);
ps=parents(G,i);
bnet.CPD{fnode}=noisyor_CPD(bnet,fnode,leak(i),inhibit(ps,i));
end
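The noisy-OR assumption used here is easy to state directly: a finding stays absent only if the leak and every active parent are all independently inhibited. A minimal Python sketch of this computation (the function and variable names are illustrative, not part of BNT):

```python
def noisy_or_off(leak_inhibit, inhibit, diseases_on):
    """P(finding = absent | disease states) under a noisy-OR CPD.

    leak_inhibit: inhibition probability on the leak arc.
    inhibit: dict mapping parent index -> inhibition probability.
    diseases_on: indices of the parents that are "on".
    The finding is absent iff every active cause is inhibited."""
    p_off = leak_inhibit
    for i in diseases_on:
        p_off *= inhibit[i]
    return p_off

# Two active diseases, each inhibited with prob 0.5, leak inhibited with prob 0.9:
# P(absent) = 0.9 * 0.5 * 0.5 = 0.225, so P(present) = 0.775.
```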
In the file BNT/examples/static/qmr1, we create a random bipartite graph G, with 5 diseases and 10 findings, and random parameters. (In general, to create a random dag, use 'mk_random_dag'.) We can visualize the resulting graph structure using the methods discussed below, with the following results:

Now let us put some random evidence on all the leaves except the very first and very last, and compute the disease posteriors.

pos = 2:floor(Nfindings/2);
neg = (pos(end)+1):(Nfindings-1);
onodes = myunion(pos, neg);
evidence = cell(1, N);
evidence(findings(pos)) = num2cell(repmat(2, 1, length(pos)));
evidence(findings(neg)) = num2cell(repmat(1, 1, length(neg)));

engine = jtree_inf_engine(bnet);
[engine, ll] = enter_evidence(engine, evidence);
post = zeros(1, Ndiseases);
for i=diseases(:)'
  m = marginal_nodes(engine, i);
  post(i) = m.T(2);
end

Junction tree can be quite slow on large QMR models. Fortunately, it is possible to exploit properties of the noisy-OR function to speed up exact inference using an algorithm called quickscore, discussed below.
Conditional Gaussian models

A conditional Gaussian model is one in which, conditioned on all the discrete nodes, the distribution over the remaining (continuous) nodes is multivariate Gaussian. This means we can have arcs from discrete (D) to continuous (C) nodes, but not vice versa. (We are allowed C->D arcs if the continuous nodes are observed, as in the mixture of experts model, since this distribution can be represented with a discrete potential.)

We now give an example of a CG model, from the paper "Propagation of Probabilities, Means and Variances in Mixed Graphical Association Models", Steffen Lauritzen, JASA 87(420):1098-1108, 1992 (reprinted in the book "Probabilistic Networks and Expert Systems", R. G. Cowell, A. P. Dawid, S. L. Lauritzen and D. J. Spiegelhalter, Springer, 1999.)
Specifying the graph

Consider the model of waste emissions from an incinerator plant shown below. We follow the standard convention that shaded nodes are observed, clear nodes are hidden. We also use the non-standard convention that square nodes are discrete (tabular) and round nodes are Gaussian.

We can create this model as follows.

F = 1; W = 2; E = 3; B = 4; C = 5; D = 6; Min = 7; Mout = 8; L = 9;
n = 9;
dag = zeros(n);
dag(F, E) = 1;
dag(W, [E Min D]) = 1;
dag(E, D) = 1;
dag(B, [C D]) = 1;
dag(D, [L Mout]) = 1;
dag(Min, Mout) = 1;

% node sizes - all cts nodes are scalar, all discrete nodes are binary
ns = ones(1, n);
dnodes = [F W B];
cnodes = mysetdiff(1:n, dnodes);
ns(dnodes) = 2;

bnet = mk_bnet(dag, ns, 'discrete', dnodes);

'dnodes' is a list of the discrete nodes; 'cnodes' is the continuous nodes. 'mysetdiff' is a faster version of the built-in 'setdiff'.
Specifying the parameters

The parameters of the discrete nodes can be specified as follows.

bnet.CPD{B} = tabular_CPD(bnet, B, 'CPT', [0.85 0.15]); % 1=stable, 2=unstable
bnet.CPD{F} = tabular_CPD(bnet, F, 'CPT', [0.95 0.05]); % 1=intact, 2=defect
bnet.CPD{W} = tabular_CPD(bnet, W, 'CPT', [2/7 5/7]);   % 1=industrial, 2=household

The parameters of the continuous nodes can be specified as follows.

bnet.CPD{E} = gaussian_CPD(bnet, E, 'mean', [-3.9 -0.4 -3.2 -0.5], ...
    'cov', [0.00002 0.0001 0.00002 0.0001]);
bnet.CPD{D} = gaussian_CPD(bnet, D, 'mean', [6.5 6.0 7.5 7.0], ...
    'cov', [0.03 0.04 0.1 0.1], 'weights', [1 1 1 1]);
bnet.CPD{C} = gaussian_CPD(bnet, C, 'mean', [-2 -1], 'cov', [0.1 0.3]);
bnet.CPD{L} = gaussian_CPD(bnet, L, 'mean', 3, 'cov', 0.25, 'weights', -0.5);
bnet.CPD{Min} = gaussian_CPD(bnet, Min, 'mean', [0.5 -0.5], 'cov', [0.01 0.005]);
bnet.CPD{Mout} = gaussian_CPD(bnet, Mout, 'mean', 0, 'cov', 0.002, 'weights', [1 1]);
Inference

First we compute the unconditional marginals.

engine = jtree_inf_engine(bnet);
evidence = cell(1, n);
[engine, ll] = enter_evidence(engine, evidence);
marg = marginal_nodes(engine, E);

'marg' is a structure that contains the fields 'mu' and 'Sigma', which contain the mean and (co)variance of the marginal on E. In this case, they are both scalars. Let us check they match the published figures (to 2 decimal places).

tol = 1e-2;
assert(approxeq(marg.mu, -3.25, tol));
assert(approxeq(sqrt(marg.Sigma), 0.709, tol));
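As a sanity check on the first assert, the unconditional mean of E is just a mixture mean over the four (F,W) configurations, weighting each conditional mean by the prior probability of that configuration. A quick Python sketch, assuming the parameter values listed above (with F varying fastest in the mean vector):

```python
def mixture_mean(weights, means):
    """Mean of a mixture: sum_k w_k * mu_k."""
    return sum(w * m for w, m in zip(weights, means))

# Priors: F = [0.95 0.05], W = [2/7 5/7]; E means indexed by (F,W).
w = [0.95 * 2/7, 0.05 * 2/7, 0.95 * 5/7, 0.05 * 5/7]
mu = [-3.9, -0.4, -3.2, -0.5]
print(round(mixture_mean(w, mu), 2))   # close to the published -3.25
```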
We can compute the other posteriors similarly. Now let us add some evidence.

evidence = cell(1, n);
evidence{W} = 1; % industrial
evidence{L} = 1.1;
evidence{C} = -0.9;
[engine, ll] = enter_evidence(engine, evidence);

Now we find

marg = marginal_nodes(engine, E);
assert(approxeq(marg.mu, -3.8983, tol));
assert(approxeq(sqrt(marg.Sigma), 0.0763, tol));
We can also compute the joint probability on a set of nodes. For example, P(D, Mout | evidence) is a 2D Gaussian:

marg = marginal_nodes(engine, [D Mout])
marg =
    domain: [6 8]
        mu: [2x1 double]
     Sigma: [2x2 double]
         T: 1.0000

The mean is

marg.mu
ans =
    3.6077
    4.1077

and the covariance matrix is

marg.Sigma
ans =
    0.1062    0.1062
    0.1062    0.1182

It is easy to visualize this posterior using standard Matlab plotting functions, e.g.,

gaussplot2d(marg.mu, marg.Sigma);

produces the following picture.

The T field indicates that the mixing weight of this Gaussian component is 1.0. If the joint contains discrete and continuous variables, the result will be a mixture of Gaussians, e.g.,

marg = marginal_nodes(engine, [F E])
    domain: [1 3]
        mu: [-3.9000 -0.4003]
     Sigma: [1x1x2 double]
         T: [0.9995 4.7373e-04]

The interpretation is Sigma(i,j,k) = Cov[E(i) E(j) | F=k]. In this case, E is a scalar, so i = j = 1; k specifies the mixture component.
We saw in the sprinkler network that BNT sets the effective size of observed discrete nodes to 1, since they only have one legal value. For continuous nodes, BNT sets their length to 0, since they have been reduced to a point. For example,

marg = marginal_nodes(engine, [B C])
    domain: [4 5]
        mu: []
     Sigma: []
         T: [0.0123 0.9877]

It is simple to post-process the output of marginal_nodes. For example, the file BNT/examples/static/cg1 sets the mu term of observed nodes to their observed value, and the Sigma term to 0 (since observed nodes have no variance).

Note that the implemented version of the junction tree is numerically unstable when using CG potentials (which is why, in the example above, we only required our answers to agree with the published ones to 2 decimal places.) This is why you might want to use stab_cond_gauss_inf_engine, implemented by Shan Huang. This is described in

"Stable Local Computation with Conditional Gaussian Distributions", S. Lauritzen and F. Jensen, Tech Report R-99-2014, Dept. Math. Sciences, Aalborg Univ., 1999.

However, even the numerically stable version can be computationally intractable if there are many hidden discrete nodes, because the number of mixture components grows exponentially, e.g., in a switching linear dynamical system. In general, one must resort to approximate inference techniques: see the discussion on inference engines below.
Other hybrid models

When we have C->D arcs, where C is hidden, we need to use approximate inference. One approach (not implemented in BNT) is described in

"A Variational Approximation for Bayesian Networks with Discrete and Continuous Latent Variables", K. Murphy, UAI 99.

Of course, one can always use sampling methods for approximate inference in such models.
Parameter Learning

The parameter estimation routines in BNT can be classified into 4 types, depending on whether the goal is to compute a full (Bayesian) posterior over the parameters or just a point estimate (e.g., Maximum Likelihood or Maximum A Posteriori), and whether all the variables are fully observed or there is missing data/hidden variables (partial observability).

        Full obs             Partial obs
Point   learn_params         learn_params_em
Bayes   bayes_update_params  not yet supported
Loading data from a file

To load numeric data from an ASCII text file called 'dat.txt', where each row is a case and columns are separated by white space, such as

011979 1626.5 0.0
021979 1367.0 0.0
...

you can use

data = load('dat.txt');

or

load dat.txt -ascii

In the latter case, the data is stored in a variable called 'dat' (the file name minus the extension). Alternatively, suppose the data is stored in a .csv file (has commas separating the columns, and contains a header line), such as

header info goes here
ORD,011979,1626.5,0.0
DSM,021979,1367.0,0.0
...
You can load this using

[a,b,c,d] = textread('dat.txt', '%s %d %f %f', 'delimiter', ',', 'headerlines', 1);

If your file is not in either of these formats, you can either use Perl to convert it to this format, or use the Matlab scanf command. Type 'help iofun' for more information on Matlab's file functions.

BNT learning routines require data to be stored in a cell array. data{i,m} is the value of node i in case (example) m, i.e., each column is a case. If node i is not observed in case m (missing value), set data{i,m} = []. (Not all the learning routines can cope with such missing values, however.) In the special case that all the nodes are observed and are scalar-valued (as opposed to vector-valued), the data can be stored in a matrix (as opposed to a cell array).

Suppose, as in the mixture of experts example, that we have 3 nodes in the graph: X(1) is the observed input, X(3) is the observed output, and X(2) is a hidden (gating) node. We can create the dataset as follows.

data = load('dat.txt');
ncases = size(data, 1);
cases = cell(3, ncases);
cases([1 3], :) = num2cell(data');

Notice how we transposed the data, to convert rows into columns. Also, cases{2,m} = [] for all m, since X(2) is always hidden.
Maximum likelihood parameter estimation from complete data

As an example, let's generate some data from the sprinkler network, randomize the parameters, and then try to recover the original model. First we create some training data using forwards sampling.

samples = cell(N, nsamples);
for i=1:nsamples
  samples(:,i) = sample_bnet(bnet);
end

samples{j,i} contains the value of the j'th node in case i. sample_bnet returns a cell array because, in general, each node might be a vector of different length. In this case, all nodes are discrete (and hence scalars), so we could have used a regular array instead (which can be quicker):

data = cell2num(samples);

Now we create a network with random parameters. (The initial values of bnet2 don't matter in this case, since we can find the globally optimal MLE independent of where we start.)

% Make a tabula rasa
bnet2 = mk_bnet(dag, node_sizes);
seed = 0;
rand('state', seed);
bnet2.CPD{C} = tabular_CPD(bnet2, C);
bnet2.CPD{R} = tabular_CPD(bnet2, R);
bnet2.CPD{S} = tabular_CPD(bnet2, S);
bnet2.CPD{W} = tabular_CPD(bnet2, W);

Finally, we find the maximum likelihood estimates of the parameters.

bnet3 = learn_params(bnet2, samples);

To view the learned parameters, we use a little Matlab hackery.

CPT3 = cell(1, N);
for i=1:N
  s = struct(bnet3.CPD{i});  % violate object privacy
  CPT3{i} = s.CPT;
end

Here are the parameters learned for node 4.

dispcpt(CPT3{4})
1 1 : 1.0000 0.0000
2 1 : 0.2000 0.8000
1 2 : 0.2273 0.7727
2 2 : 0.0000 1.0000

So we see that the learned parameters are fairly close to the "true" ones, which we display below.

dispcpt(CPT{4})
1 1 : 1.0000 0.0000
2 1 : 0.1000 0.9000
1 2 : 0.1000 0.9000
2 2 : 0.0100 0.9900

We can get better results by using a larger training set, or using informative priors (see below).
Parameter priors

Currently, only tabular CPDs can have priors on their parameters. The conjugate prior for a multinomial is the Dirichlet. (For binary random variables, the multinomial is the same as the Bernoulli, and the Dirichlet is the same as the Beta.)

The Dirichlet has a simple interpretation in terms of pseudo counts. If we let N_ijk = the num. times X_i=k and Pa_i=j occurs in the training set, where Pa_i are the parents of X_i, then the maximum likelihood (ML) estimate is T_ijk = N_ijk / N_ij (where N_ij = sum_k' N_ijk'), which will be 0 if N_ijk = 0. To prevent us from declaring that (X_i=k, Pa_i=j) is impossible just because this event was not seen in the training set, we can pretend we saw value k of X_i, for each value j of Pa_i, some number (alpha_ijk) of times in the past. The MAP (maximum a posteriori) estimate is then

T_ijk = (N_ijk + alpha_ijk) / (N_ij + alpha_ij)

and is never 0 if all alpha_ijk > 0. For example, consider the network A->B, where A is binary and B has 3 values. A uniform prior for B has the form

      B=1  B=2  B=3
A=1    1    1    1
A=2    1    1    1

which can be created using

tabular_CPD(bnet, i, 'prior_type', 'dirichlet', 'dirichlet_type', 'unif');

This prior does not satisfy the likelihood equivalence principle, which says that Markov equivalent models should have the same marginal likelihood. A prior that does satisfy this principle is shown below. Heckerman (1995) calls this the BDeu prior (likelihood equivalent uniform Bayesian Dirichlet).

      B=1  B=2  B=3
A=1   1/6  1/6  1/6
A=2   1/6  1/6  1/6

where we put N/(q*r) in each bin; N is the equivalent sample size, r = |A|, q = |B|. This can be created as follows

tabular_CPD(bnet, i, 'prior_type', 'dirichlet', 'dirichlet_type', 'BDeu');

Here, 1 is the equivalent sample size, and is the strength of the prior. You can change this using

tabular_CPD(bnet, i, 'prior_type', 'dirichlet', 'dirichlet_type', ...
    'BDeu', 'dirichlet_weight', 10);
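The pseudo-count arithmetic behind these priors can be sketched in a few lines of Python (illustrative names, not BNT code; the inputs are the per-state counts and Dirichlet hyperparameters for one parent configuration j):

```python
def map_cpt_row(counts, alpha):
    """MAP estimate for one row of a CPT:
    T_ijk = (N_ijk + alpha_ijk) / (N_ij + alpha_ij),
    where N_ij and alpha_ij are sums over the states k."""
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]

# With counts [0, 2, 2] and a uniform Dirichlet(1,1,1) prior,
# no state gets probability zero:
print(map_cpt_row([0, 2, 2], [1, 1, 1]))   # approximately [1/7, 3/7, 3/7]
```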
(Sequential) Bayesian parameter updating from complete data

If we use conjugate priors and have fully observed data, we can compute the posterior over the parameters in batch form as follows.

cases = sample_bnet(bnet, nsamples);
bnet = bayes_update_params(bnet, cases);
LL = log_marg_lik_complete(bnet, cases);

bnet.CPD{i}.prior contains the new Dirichlet pseudo counts, and bnet.CPD{i}.CPT is set to the mean of the posterior (the normalized counts). (Hence if the initial pseudo counts are 0, bayes_update_params and learn_params will give the same result.)

We can compute the same result sequentially (on-line) as follows.

LL = 0;
for m=1:nsamples
  LL = LL + log_marg_lik_complete(bnet, cases(:,m));
  bnet = bayes_update_params(bnet, cases(:,m));
end
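The batch/sequential equivalence is easiest to see in the scalar case: with a Beta prior on a Bernoulli parameter, each case contributes its predictive probability to the log marginal likelihood and one pseudo count to the posterior. A minimal Python sketch of one such update (illustrative, not BNT code):

```python
def bernoulli_update(alpha, beta, x):
    """One sequential Bayesian update for a Bernoulli parameter
    with a Beta(alpha, beta) prior. Returns the predictive
    probability of observation x and the updated pseudo counts."""
    p1 = alpha / (alpha + beta)          # predictive P(x = 1)
    pred = p1 if x == 1 else 1.0 - p1
    if x == 1:
        alpha += 1.0
    else:
        beta += 1.0
    return pred, alpha, beta

# Starting from a uniform Beta(1,1) prior and observing two heads:
p1, a, b = bernoulli_update(1.0, 1.0, 1)   # pred = 1/2
p2, a, b = bernoulli_update(a, b, 1)       # pred = 2/3
```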
The file BNT/examples/static/StructLearn/model_select1 has an example of sequential model selection which uses the same idea. We generate data from the model A->B and compute the posterior prob of all 3 dags on 2 nodes: (1) A  B, (2) A <- B, (3) A -> B. Models 2 and 3 are Markov equivalent, and therefore indistinguishable from observational data alone, so we expect their posteriors to be the same (assuming a prior which satisfies likelihood equivalence). If we use random parameters, the "true" model only gets a higher posterior after 2000 trials! However, if we make B a noisy NOT gate, the true model "wins" after 12 trials, as shown below (red = model 1, blue/green (superimposed) represents models 2/3).

The use of marginal likelihood for model selection is discussed in greater detail in the section on structure learning.
Maximum likelihood parameter estimation with missing values (EM)

Now we consider learning when some values are not observed. Let us randomly hide half the values generated from the water sprinkler example.

samples2 = samples;
hide = rand(N, nsamples) > 0.5;
[I,J] = find(hide);
for k=1:length(I)
  samples2{I(k), J(k)} = [];
end

samples2{i,l} is the value of node i in training case l, or [] if unobserved.

Now we will compute the MLEs using the EM algorithm. We need to use an inference algorithm to compute the expected sufficient statistics in the E step; the M (maximization) step is as above.

engine2 = jtree_inf_engine(bnet2);
max_iter = 10;
[bnet4, LLtrace] = learn_params_em(engine2, samples2, max_iter);

LLtrace(i) is the log-likelihood at iteration i. We can plot this as follows:

plot(LLtrace, 'x')

Let's display the results after 10 iterations of EM.

celldisp(CPT4)
CPT4{1} =
    0.6616
    0.3384
CPT4{2} =
    0.6510    0.3490
    0.8751    0.1249
CPT4{3} =
    0.8366    0.1634
    0.0197    0.9803
CPT4{4} =
(:,:,1) =
    0.8276    0.0546
    0.5452    0.1658
(:,:,2) =
    0.1724    0.9454
    0.4548    0.8342

We can get improved performance by using one or more of the following methods:

Increasing the size of the training set.
Decreasing the amount of hidden data.
Running EM for longer.
Using informative priors.
Initialising EM from multiple starting points.

Click here for a discussion of learning Gaussians, which can cause numerical problems.

For a more complete example of learning with EM, see the script BNT/examples/static/learn1.m.
Parameter tying

In networks with repeated structure (e.g., chains and grids), it is common to assume that the parameters are the same at every node. This is called parameter tying, and reduces the amount of data needed for learning.

When we have tied parameters, there is no longer a one-to-one correspondence between nodes and CPDs. Rather, each CPD specifies the parameters for a whole equivalence class of nodes. It is easiest to see this by example. Consider the following hidden Markov model (HMM).

When HMMs are used for semi-infinite processes like speech recognition, we assume the transition matrix P(H(t+1)|H(t)) is the same for all t; this is called a time-invariant or homogeneous Markov chain. Hence hidden nodes 2, 3, ..., T are all in the same equivalence class, say class Hclass. Similarly, the observation matrix P(O(t)|H(t)) is assumed to be the same for all t, so the observed nodes are all in the same equivalence class, say class Oclass. Finally, the prior term P(H(1)) is in a class all by itself, say class H1class. This is illustrated below, where we explicitly represent the parameters as random variables (dotted nodes).

In BNT, we cannot represent parameters as random variables (nodes). Instead, we "hide" the parameters inside one CPD for each equivalence class, and then specify that the other CPDs should share these parameters, as follows.

hnodes = 1:2:2*T;
onodes = 2:2:2*T;
H1class = 1; Hclass = 2; Oclass = 3;
eclass = ones(1, N);
eclass(hnodes(2:end)) = Hclass;
eclass(hnodes(1)) = H1class;
eclass(onodes) = Oclass;
% create dag and ns in the usual way
bnet = mk_bnet(dag, ns, 'discrete', dnodes, 'equiv_class', eclass);

Finally, we define the parameters for each equivalence class:

bnet.CPD{H1class} = tabular_CPD(bnet, hnodes(1)); % prior
bnet.CPD{Hclass}  = tabular_CPD(bnet, hnodes(2)); % transition matrix
if cts_obs
  bnet.CPD{Oclass} = gaussian_CPD(bnet, onodes(1));
else
  bnet.CPD{Oclass} = tabular_CPD(bnet, onodes(1));
end

In general, if bnet.CPD{e} = xxx_CPD(bnet, j), then j should be a member of e's equivalence class; that is, it is not always the case that e == j. You can use bnet.rep_of_eclass(e) to return the representative of equivalence class e. BNT will look up the parents of j to determine the size of the CPT to use. It assumes that this is the same for all members of the equivalence class. Click here for a more complex example of parameter tying.

Note: Normally one would define an HMM as a Dynamic Bayes Net (see the function BNT/examples/dynamic/mk_chmm.m). However, one can define an HMM as a static BN using the function BNT/examples/static/Models/mk_hmm_bnet.m.
Structure learning
Update (9/29/03): Philippe Leray is developing some additional structure learning code on top of BNT. Click here for details.

There are two very different approaches to structure learning: constraint-based and search-and-score. In the constraint-based approach, we start with a fully connected graph, and remove edges if certain conditional independencies are measured in the data. This has the disadvantage that repeated independence tests lose statistical power.

In the more popular search-and-score approach, we perform a search through the space of possible DAGs, and either return the best one found (a point estimate), or return a sample of the models found (an approximation to the Bayesian posterior).

The number of DAGs as a function of the number of nodes, G(n), is super-exponential in n, and is given by the following recurrence:

G(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n,k) * 2^(k(n-k)) * G(n-k),  with G(0) = 1.
The first few values are shown below.

n    G(n)
1    1
2    3
3    25
4    543
5    29,281
6    3,781,503
7    1.1 x 10^9
8    7.8 x 10^11
9    1.2 x 10^15
10   4.2 x 10^18
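This table can be reproduced from the standard recurrence for the number of labeled DAGs, which conditions on the set of "root" nodes with no incoming edges. A short Python sketch (memoization keeps it fast):

```python
from math import comb

def num_dags(n, _memo={0: 1}):
    """Number of DAGs on n labeled nodes, via the recurrence
    G(n) = sum_{k=1}^{n} (-1)^(k+1) * C(n,k) * 2^(k*(n-k)) * G(n-k),
    with G(0) = 1."""
    if n not in _memo:
        _memo[n] = sum((-1)**(k + 1) * comb(n, k) * 2**(k*(n - k)) * num_dags(n - k)
                       for k in range(1, n + 1))
    return _memo[n]

print([num_dags(n) for n in range(1, 6)])   # [1, 3, 25, 543, 29281]
```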
Since the number of DAGs is super-exponential in the number of nodes, we cannot exhaustively search the space, so we either use a local search algorithm (e.g., greedy hill climbing, perhaps with multiple restarts) or a global search algorithm (e.g., Markov Chain Monte Carlo).

If we know a total ordering on the nodes, finding the best structure amounts to picking the best set of parents for each node independently. This is what the K2 algorithm does. If the ordering is unknown, we can search over orderings, which is more efficient than searching over DAGs (Koller and Friedman, 2000).
In addition to the search procedure, we must specify the scoring function. There are two popular choices. The Bayesian score integrates out the parameters, i.e., it is the marginal likelihood of the model. The BIC (Bayesian Information Criterion) is defined as log P(D | theta_hat) - 0.5*d*log(N), where D is the data, theta_hat is the ML estimate of the parameters, d is the number of parameters, and N is the number of data cases. The BIC method has the advantage of not requiring a prior.
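As a concrete illustration of the trade-off BIC encodes, here is the score as a one-line function, plus the complexity penalty at work (a sketch, not the BNT implementation):

```python
from math import log

def bic_score(loglik, num_params, num_cases):
    """BIC = log P(D | theta_hat) - 0.5 * d * log(N)."""
    return loglik - 0.5 * num_params * log(num_cases)

# A richer model must raise the log-likelihood by at least
# 0.5 * (extra params) * log(N) to win; here it does not:
simple = bic_score(-120.0, 4, 100)
rich   = bic_score(-119.0, 10, 100)
assert simple > rich
```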
BIC can be derived as a large sample approximation to the marginal likelihood. (It is also equal to the Minimum Description Length of a model.) However, in practice, the sample size does not need to be very large for the approximation to be good. For example, in the figure below, we plot the ratio between the log marginal likelihood and the BIC score against data-set size; we see that the ratio rapidly approaches 1, especially for non-informative priors. (This plot was generated by the file BNT/examples/static/bic1.m. It uses the water sprinkler BN with BDeu Dirichlet priors with different equivalent sample sizes.)

As with parameter learning, handling missing data/hidden variables is much harder than the fully observed case. The structure learning routines in BNT can therefore be classified into 4 types, analogously to the parameter learning case.

        Full obs           Partial obs
Point   learn_struct_K2    not yet supported
Bayes   learn_struct_mcmc  not yet supported
Markov equivalence

If two DAGs encode the same conditional independencies, they are called Markov equivalent. The set of all DAGs can be partitioned into Markov equivalence classes. Graphs within the same class can have the direction of some of their arcs reversed without changing any of the CI relationships. Each class can be represented by a PDAG (partially directed acyclic graph) called an essential graph or pattern. This specifies which edges must be oriented in a certain direction, and which may be reversed.

When learning graph structure from observational data, the best one can hope to do is to identify the model up to Markov equivalence. To distinguish amongst graphs within the same equivalence class, one needs interventional data: see the discussion on active learning below.
Exhaustive search

The brute-force approach to structure learning is to enumerate all possible DAGs, and score each one. This provides a "gold standard" with which to compare other algorithms. We can do this as follows.

dags = mk_all_dags(N);
score = score_dags(data, ns, dags);

where data(i,m) is the value of node i in case m, and ns(i) is the size of node i. If the DAGs have a lot of families in common, we can cache the sufficient statistics, making this potentially more efficient than scoring the DAGs one at a time. (Caching is not currently implemented, however.)

By default, we use the Bayesian scoring metric, and assume CPDs are represented by tables with BDeu(1) priors. We can override these defaults as follows. If we want to use uniform priors, we can say

params = cell(1, N);
for i=1:N
  params{i} = {'prior', 'unif'};
end
score = score_dags(data, ns, dags, 'params', params);

params{i} is a cell array, containing optional arguments that are passed to the constructor for CPD i.

Now suppose we want to use different node types, e.g., suppose nodes 1 and 2 are Gaussian, and nodes 3 and 4 softmax (both these CPDs can support discrete and continuous parents, which is necessary since all other nodes will be considered as parents). The Bayesian scoring metric currently only works for tabular CPDs, so we will use BIC:

score = score_dags(data, ns, dags, 'discrete', [3 4], 'params', [], ...
    'type', {'gaussian', 'gaussian', 'softmax', 'softmax'}, 'scoring_fn', 'bic')

In practice, one can't enumerate all possible DAGs for N > 5, but one can evaluate any reasonably sized set of hypotheses in this way (e.g., nearest neighbors of your current best guess). Think of this as "computer assisted model refinement" as opposed to de novo learning.
K2

The K2 algorithm (Cooper and Herskovits, 1992) is a greedy search algorithm that works as follows. Initially each node has no parents. It then adds incrementally that parent whose addition most increases the score of the resulting structure. When the addition of no single parent can increase the score, it stops adding parents to the node. Since we are using a fixed ordering, we do not need to check for cycles, and can choose the parents for each node independently.
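Because the ordering fixes each node's candidate parents, the greedy step K2 performs per node can be sketched generically in Python (the scoring function is passed in as an argument; names are illustrative, not BNT code):

```python
def k2_parents(node, predecessors, score, max_fan_in):
    """Greedy K2 parent selection for a single node: starting from
    no parents, repeatedly add the one predecessor that most
    increases score(node, parent_set); stop when nothing helps
    or the fan-in bound is reached."""
    parents = set()
    best = score(node, parents)
    while len(parents) < max_fan_in:
        candidates = [p for p in predecessors if p not in parents]
        if not candidates:
            break
        s, p = max((score(node, parents | {p}), p) for p in candidates)
        if s <= best:
            break               # no single parent improves the score
        parents.add(p)
        best = s
    return parents
```

With a toy score that rewards matching a known parent set, the greedy loop recovers it exactly.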
The original paper used the Bayesian scoring metric with tabular CPDs and Dirichlet priors. BNT generalizes this to allow any kind of CPD, and either the Bayesian scoring metric or BIC, as in the example above. In addition, you can specify an optional upper bound on the number of parents for each node. The file BNT/examples/static/k2demo1.m gives an example of how to use K2. We use the water sprinkler network and sample 100 cases from it as before. Then we see how much data it takes to recover the generating structure:

order = [C S R W];
max_fan_in = 2;
sz = 5:5:100;
for i=1:length(sz)
  dag2 = learn_struct_K2(data(:, 1:sz(i)), node_sizes, order, 'max_fan_in', max_fan_in);
  correct(i) = isequal(dag, dag2);
end

Here are the results.
correct =
  Columns 1 through 12
    0  0  0  0  0  0  0  1  0  1  1  1
  Columns 13 through 20
    1  1  1  1  1  1  1  1

So we see it takes about sz(10) = 50 cases. (BIC behaves similarly, showing that the prior doesn't matter too much.) In general, we cannot hope to recover the "true" generating structure, only one that is in its Markov equivalence class.
Hill climbing

Hill climbing starts at a specific point in space, considers all nearest neighbors, and moves to the neighbor that has the highest score; if no neighbors have higher score than the current point (i.e., we have reached a local maximum), the algorithm stops. One can then restart in another part of the space.

A common definition of "neighbor" is all graphs that can be generated from the current graph by adding, deleting or reversing a single arc, subject to the acyclicity constraint. Other neighborhoods are possible: see "Optimal Structure Identification with Greedy Search", Max Chickering, JMLR 2002.
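The loop itself is generic; only the neighborhood and the score are BN-specific. A minimal Python sketch (illustrative, not BNT's implementation):

```python
def hill_climb(start, neighbors, score):
    """Greedy local search: move to the best-scoring neighbor until
    no neighbor beats the current point (a local maximum)."""
    current, s = start, score(current := start)
    while True:
        nbrs = list(neighbors(current))
        if not nbrs:
            return current
        best = max(nbrs, key=score)
        if score(best) <= s:
            return current      # local maximum; restart elsewhere if desired
        current, s = best, score(best)

# Toy example: maximize -(x-3)^2 over the integers, moving +/-1 at a time.
top = hill_climb(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3)**2)
print(top)   # 3
```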
MCMC

We can use a Markov Chain Monte Carlo (MCMC) algorithm called Metropolis-Hastings (MH) to search the space of all DAGs. The standard proposal distribution is to consider moving to all nearest neighbors in the sense defined above.

The function can be called as in the following example.

[sampled_graphs, accept_ratio] = learn_struct_mcmc(data, ns, 'nsamples', 100, 'burnin', 10);

We can convert our set of sampled graphs to a histogram (empirical posterior over all the DAGs) thus

all_dags = mk_all_dags(N);
mcmc_post = mcmc_sample_to_hist(sampled_graphs, all_dags);

To see how well this performs, let us compute the exact posterior exhaustively.

score = score_dags(data, ns, all_dags);
post = normalise(exp(score)); % assuming uniform structural prior

We plot the results below. (The data set was 100 samples drawn from a random 4 node bnet; see the file BNT/examples/static/mcmc1.)

subplot(2,1,1)
bar(post)
subplot(2,1,2)
bar(mcmc_post)
We can also plot the acceptance ratio versus number of MCMC steps, as a crude convergence diagnostic.

clf
plot(accept_ratio)

Even though the number of samples needed by MCMC is theoretically polynomial (not exponential) in the dimensionality of the search space, in practice it has been found that MCMC does not converge in reasonable time for graphs with more than about 10 nodes.
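The acceptance rule at the heart of MH is only a few lines. A simplified Python sketch, assuming a symmetric proposal for clarity (a real DAG sampler must also correct for unequal neighborhood sizes, which BNT handles internally):

```python
import math
import random

def mh_step(current, propose, log_score):
    """One Metropolis-Hastings step with a symmetric proposal:
    accept the candidate with probability min(1, exp(delta)),
    where delta is the log-score difference."""
    cand = propose(current)
    delta = log_score(cand) - log_score(current)
    if delta >= 0 or random.random() < math.exp(delta):
        return cand, True       # uphill (or lucky downhill) move
    return current, False       # rejected: stay put
```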
Active structure learning

As was mentioned above, one can only learn a DAG up to Markov equivalence, even given infinite data. If one is interested in learning the structure of a causal network, one needs interventional data. (By "intervention" we mean forcing a node to take on a specific value, thereby effectively severing its incoming arcs.)

Most of the scoring functions accept an optional argument that specifies whether a node was observed to have a certain value, or was forced to have that value: we set clamped(i,m) = 1 if node i was forced in training case m. e.g., see the file BNT/examples/static/cooper_yoo.
An interesting question is to decide which interventions to perform (c.f., design of experiments). For details, see the following tech report

"Active learning of causal Bayes net structure", Kevin Murphy, March 2001.

Structural EM

Computing the Bayesian score when there is partial observability is computationally challenging, because the parameter posterior becomes multimodal (the hidden nodes induce a mixture distribution). One therefore needs to use approximations such as BIC. Unfortunately, search algorithms are still expensive, because we need to run EM at each step to compute the MLE, which is needed to compute the score of each model. An alternative approach is to do the local search steps inside of the M step of EM, which is more efficient since the data has been "filled in"; this is called the structural EM algorithm (Friedman 1997), and provably converges to a local maximum of the BIC score.
Wei Hu has implemented SEM for discrete nodes. You can download his package from here. Please address all questions about this implementation of SEM to [email protected].
Visualizing the graph

Click here for more information on graph visualization.
Constraint-based methods

The IC algorithm (Pearl and Verma, 1991), and the faster, but otherwise equivalent, PC algorithm (Spirtes, Glymour, and Scheines 1993), computes many conditional independence tests, and combines these constraints into a PDAG to represent the whole Markov equivalence class.

IC*/FCI extend IC/PC to handle latent variables: see below. (IC stands for inductive causation; PC stands for Peter and Clark, the first names of Spirtes and Glymour; FCI stands for fast causal inference. What we, following Pearl (2000), call IC* was called IC in the original Pearl and Verma paper.) For details, see

"Causation, Prediction, and Search", Spirtes, Glymour and Scheines (SGS), 2001 (2nd edition), MIT Press.
"Causality: Models, Reasoning and Inference", J. Pearl, 2000, Cambridge University Press.

The PC algorithm takes as arguments a function f, the number of nodes N, the maximum fan in K, and additional arguments A which are passed to f. The function f(X,Y,S,A) returns 1 if X is conditionally independent of Y given S, and 0 otherwise. For example, suppose we cheat by passing in a CI "oracle" which has access to the true DAG; the oracle tests for d-separation in this DAG, i.e., f(X,Y,S) calls dsep(X,Y,S,dag). We can do this as follows.

pdag = learn_struct_pdag_pc('dsep', N, max_fan_in, dag);

pdag(i,j) = -1 if there is definitely an i->j arc, and pdag(i,j) = 1 if there is either an i->j or an i<-j arc.

Applied to the sprinkler network, this returns

pdag =
   0   1   1   0
   1   0   0  -1
   1   0   0  -1
   0   0   0   0
So as expected, we see that the V structure at the W node is uniquely identified, but the other arcs have ambiguous orientation.

We now give an example from p141 (1st edn) / p103 (2nd edn) of the SGS book. This example concerns the female orgasm. We are given a correlation matrix C between 7 measured factors (such as subjective experiences of coital and masturbatory experiences), derived from 281 samples, and want to learn a causal model of the data. We will not discuss the merits of this type of work here, but merely show how to reproduce the results in the SGS book. Their program, Tetrad, makes use of the Fisher Z test for conditional independence, so we do the same:

max_fan_in = 4;
nsamples = 281;
alpha = 0.05;
pdag = learn_struct_pdag_pc('cond_indep_fisher_z', n, max_fan_in, C, nsamples, alpha);

In this case, the CI test is
f(X,Y,S) = cond_indep_fisher_z(X, Y, S, C, nsamples, alpha)

The results match those of Fig 12a of SGS apart from two edge differences; presumably this is due to rounding error (although it could be a bug, either in BNT or in Tetrad). This example can be found in the file BNT/examples/static/pc2.m.

The IC* algorithm (Pearl and Verma, 1991), and the faster FCI algorithm (Spirtes, Glymour, and Scheines 1993), are like the IC/PC algorithm, except that they can detect the presence of latent variables. See the file learn_struct_pdag_ic_star written by Tamar Kushnir. The output is a matrix P, defined as follows (see Pearl (2000), p52 for details):

% P(i,j) = -1 if there is either a latent variable L such that i <-L-> j OR there is a directed edge from i->j.
% P(i,j) = -2 if there is a marked directed i-*>j edge.
% P(i,j) = P(j,i) = 1 if there is an undirected edge i--j.
% P(i,j) = P(j,i) = 2 if there is a latent variable L such that i <-L-> j.
Philippe Leray's structure learning package

Philippe Leray has written a structure learning package that uses BNT. It currently (June 2003) has the following features:

PC with Chi2 statistical test
MWST: Maximum Weighted Spanning Tree
Hill Climbing
Greedy Search
Structural EM
hist_ic: optimal Histogram based on IC information criterion
cpdag_to_dag
dag_to_cpdag
...
Inference engines

Up until now, we have used the junction tree algorithm for inference. However, sometimes this is too slow, or not even applicable. In general, there are many inference algorithms, each of which makes different tradeoffs between speed, accuracy, complexity and generality. Furthermore, there might be many implementations of the same algorithm; for instance, a general purpose, readable version, and a highly optimized, specialized one. To cope with this variety, we treat each inference algorithm as an object, which we call an inference engine.

An inference engine is an object that contains a bnet and supports the 'enter_evidence' and 'marginal_nodes' methods. The engine constructor takes the bnet as argument and may do some model-specific processing. When 'enter_evidence' is called, the engine may do some evidence-specific processing. Finally, when 'marginal_nodes' is called, the engine may do some query-specific processing.

The amount of work done when each stage is specified (structure, parameters, evidence, and query) depends on the engine. The cost of work done early in this sequence can be amortized. On the other hand, one can make better optimizations if one waits until later in the sequence. For example, the parameters might imply conditional independencies that are not evident in the graph structure, but can nevertheless be exploited; the evidence indicates which nodes are observed and hence can effectively be disconnected from the graph; and the query might indicate that large parts of the network are d-separated from the query nodes. (Since it is not the actual values of the evidence that matters, just which nodes are observed, many engines allow you to specify which nodes will be observed when they are constructed, i.e., before calling 'enter_evidence'. Some engines can still cope if the actual pattern of evidence is different, e.g., if there is missing data.)

Although being maximally lazy (i.e., only doing work when a query is issued) may seem desirable, this is not always the most efficient. For example, when learning using EM, we need to call marginal_nodes N times, where N is the number of nodes. Variable elimination would end up repeating a lot of work each time marginal_nodes is called, making it inefficient for learning. The junction tree algorithm, by contrast, uses dynamic programming to avoid this redundant computation; it calculates all marginals in two passes during 'enter_evidence', so calling 'marginal_nodes' takes constant time.

We will discuss some of the inference algorithms implemented in BNT below, and finish with a summary of all of them.
Variable elimination

The variable elimination algorithm, also known as bucket elimination or peeling, is one of the simplest inference algorithms. The basic idea is to "push sums inside of products"; this is explained in more detail here.
The principle of distributing sums over products can be generalized greatly to apply to any commutative semiring. This forms the basis of many common algorithms, such as Viterbi decoding and the Fast Fourier Transform. For details, see
R. McEliece and S. M. Aji, 2000. The Generalized Distributive Law. IEEE Trans. Inform. Theory, vol. 46, no. 2 (March 2000), pp. 325-343.
F. R. Kschischang, B. J. Frey and H. A. Loeliger, 2001. Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, February 2001.
Choosing an order in which to sum out the variables so as to minimize computational cost is known to be NP-hard. The implementation of this algorithm in var_elim_inf_engine makes no attempt to optimize this ordering (in contrast, say, to jtree_inf_engine, which uses a greedy search procedure to find a good ordering).

Note: unlike most algorithms, var_elim does all its computational work inside of marginal_nodes, not inside of enter_evidence.
Global inference methods

The simplest inference algorithm of all is to explicitly construct the joint distribution over all the nodes, and then to marginalize it. This is implemented in global_joint_inf_engine. Since the size of the joint is exponential in the number of discrete (hidden) nodes, this is not a very practical algorithm. It is included merely for pedagogical and debugging purposes.
Three specialized versions of this algorithm have also been implemented, corresponding to the cases where all the nodes are discrete (D), all are Gaussian (G), and some are discrete and some Gaussian (CG). They are called enumerative_inf_engine, gaussian_inf_engine, and cond_gauss_inf_engine respectively.

Note: unlike most algorithms, these global inference algorithms do all their computational work inside of marginal_nodes, not inside of enter_evidence.
Quickscore

The junction tree algorithm is quite slow on the QMR network, since the cliques are so big. One simple trick we can use is to notice that hidden leaves do not affect the posteriors on the roots, and hence do not need to be included in the network. A second trick is to notice that the negative findings can be "absorbed" into the prior: see the file BNT/examples/static/mk_minimal_qmr_bnet for details.

A much more significant speedup is obtained by exploiting special properties of the noisy-or node, as done by the quickscore algorithm. For details, see

Heckerman, "A tractable inference algorithm for diagnosing multiple diseases", UAI 89.
Rish and Dechter, "On the impact of causal independence", UCI tech report, 1998.

This has been implemented in BNT as a special purpose inference engine, which can be created and used as follows:

engine = quickscore_inf_engine(inhibit, leak, prior);
engine = enter_evidence(engine, pos, neg);
m = marginal_nodes(engine, i);
Belief propagation

Even using quickscore, exact inference takes time that is exponential in the number of positive findings. Hence for large networks we need to resort to approximate inference techniques. See for example

T. Jaakkola and M. Jordan, "Variational probabilistic inference and the QMR-DT network", JAIR 10, 1999.
K. Murphy, Y. Weiss and M. Jordan, "Loopy belief propagation for approximate inference: an empirical study", UAI 99.

The latter approximation entails applying Pearl's belief propagation algorithm to a model even if it has loops (hence the name loopy belief propagation). Pearl's algorithm, implemented as pearl_inf_engine, gives exact results when applied to singly connected graphs (a.k.a. polytrees, since the underlying undirected topology is a tree, but a node may have multiple parents). To apply this algorithm to a graph with loops, use pearl_inf_engine. This can use a centralized or distributed message passing protocol. You can use it as in the following example.

engine = pearl_inf_engine(bnet, 'max_iter', 30);
engine = enter_evidence(engine, evidence);
m = marginal_nodes(engine, i);

We found that this algorithm often converges, and when it does, it is often very accurate, but this depends on the precise setting of the parameter values of the network. (See the file BNT/examples/static/qmr1 to repeat the experiment for yourself.) Understanding when and why belief propagation converges/works is a topic of ongoing research.
pearl_inf_engine can exploit special structure in noisy-or and gmux nodes to compute messages efficiently.
belprop_inf_engine is like pearl, but uses potentials to represent messages. Hence it is slower.

belprop_fg_inf_engine is like belprop, but is designed for factor graphs.
Sampling

BNT now (Mar '02) has two sampling (Monte Carlo) inference algorithms:

likelihood_weighting_inf_engine, which does importance sampling and can handle any node type.
gibbs_sampling_inf_engine, written by Bhaskara Marthi. Currently this can only handle tabular CPDs. For a much faster and more powerful Gibbs sampling program, see BUGS.

Note: to generate samples from a network (which is not the same as inference!), use sample_bnet.
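To give a feel for what likelihood weighting does, here is a plain Python sketch, not BNT's likelihood_weighting_inf_engine, on a hypothetical two-node discrete net A -> B with made-up CPTs: sample the unobserved node from its prior, then weight each sample by the likelihood of the evidence.

```python
# Toy sketch of likelihood weighting for a net A -> B with evidence B=1
# (not BNT code; the network and CPTs are hypothetical).
import random

random.seed(0)
prior = [0.8, 0.2]              # P(A)
cpt = [[0.9, 0.1], [0.3, 0.7]]  # P(B | A)

weights = [0.0, 0.0]
for _ in range(100_000):
    # Sample the unobserved node A from its prior.
    a = 0 if random.random() < prior[0] else 1
    # Instead of sampling the observed node B, weight by P(B=1 | A=a).
    weights[a] += cpt[a][1]

# Normalizing the accumulated weights estimates the posterior P(A | B=1).
z = sum(weights)
posterior = [w / z for w in weights]
print(posterior)
```

The exact posterior here is P(A=0 | B=1) = 0.08 / 0.22, and the Monte Carlo estimate converges to it as the number of samples grows; with only a few hundred samples the answer is noticeably noisy, which is why sampling is classed as approximate below.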
Summary of inference engines

The inference engines differ in many ways. Here are some of the major "axes":

Works for all topologies or makes restrictions?
Works for all node types or makes restrictions?
Exact or approximate inference?

In terms of topology, most engines handle any kind of DAG. belprop_fg does approximate inference on factor graphs (FG), which can be used to represent directed, undirected, and mixed (chain) graphs. (In the future, we plan to support exact inference on chain graphs.) quickscore only works on QMR-like models.

In terms of node types: algorithms that use potentials can handle discrete (D), Gaussian (G) or conditional Gaussian (CG) models. Sampling algorithms can essentially handle any kind of node (distribution). Other algorithms make more restrictive assumptions in exchange for speed.

Finally, most algorithms are designed to give the exact answer. The belief propagation algorithms are exact if applied to trees, and in some other cases. Sampling is considered approximate, even though, in the limit of an infinite number of samples, it gives the exact answer.
Here is a summary of the properties of all the engines in BNT which work on static networks.
Name                  Exact?   Node type?   Topology
belprop               approx   D            DAG
belprop_fg            approx   D            factor graph
cond_gauss            exact    CG           DAG
enumerative           exact    D            DAG
gaussian              exact    G            DAG
gibbs                 approx   D            DAG
global_joint          exact    D,G,CG       DAG
jtree                 exact    D,G,CG       DAG
likelihood_weighting  approx   any          DAG
pearl                 approx   D,G          DAG
pearl                 exact    D,G          polytree
quickscore            exact    noisy-or     QMR
stab_cond_gauss       exact    CG           DAG
var_elim              exact    D,G,CG       DAG
Influence diagrams / decision making

BNT implements an exact algorithm for solving LIMIDs (limited memory influence diagrams), described in
S. L. Lauritzen and D. Nilsson. Representing and solving decision problems with limited information. Management Science, 47, 1238-1251, September 2001.

LIMIDs explicitly show all information arcs, rather than implicitly assuming no-forgetting. This allows them to model forgetful controllers.

See the examples in BNT/examples/limids for details.
DBNs, HMMs, Kalman filters and all that

Click here for documentation about how to use BNT for dynamical systems and sequence data.