Bayesian Belief Networks
Bayesian Belief Networks
are graphical models that represent the probabilistic relationships among a set of variables.
They provide a way to model complex systems where uncertainty is involved and are widely
used in fields such as artificial intelligence, decision analysis, and machine learning.
Key Components of Bayesian Belief Networks
1. Nodes: Each node represents a random variable that can take on different states or
values. These variables can be observable quantities, latent (hidden) variables, or
hypothesis-related variables.
2. Directed Edges: The edges between nodes are directed, indicating conditional
dependencies between the variables. A directed edge from node A to node B means
that A directly influences B.
3. Conditional Probability Tables (CPTs): For each node, there is a conditional
probability table that quantifies the relationship between the node and its parents. This
table specifies the probability distribution over the node’s possible states, given the
states of its parent nodes.
4. Directed Acyclic Graph (DAG): The structure of a Bayesian Belief Network is a
directed acyclic graph, meaning the graph contains no cycles: following the directed
edges can never lead back to the node you started from.
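To make these components concrete, here is a minimal sketch (not part of the original notes) of how a tiny two-node network and its CPTs could be represented in Python; the variables, structure, and probability values are hypothetical:

# A tiny hypothetical network: Rain -> WetGrass (both Boolean).
# Each entry maps a variable to its parents and its conditional
# probability table (CPT). CPT keys are tuples of parent values.
network = {
    "Rain": {
        "parents": [],                # no parents: the CPT is a prior
        "cpt": {(): 0.2},             # P(Rain=True) = 0.2 (made-up value)
    },
    "WetGrass": {
        "parents": ["Rain"],          # directed edge Rain -> WetGrass
        "cpt": {(True,): 0.9,         # P(WetGrass=True | Rain=True)
                (False,): 0.1},       # P(WetGrass=True | Rain=False)
    },
}

def prob(var, value, assignment):
    """P(var=value | parents as given in assignment), read from the CPT."""
    node = network[var]
    key = tuple(assignment[p] for p in node["parents"])
    p_true = node["cpt"][key]
    return p_true if value else 1.0 - p_true

print(prob("WetGrass", True, {"Rain": True}))   # 0.9

The same dictionary-of-CPTs layout extends to larger networks; prob() simply looks up the CPT row selected by the values of the node's parents.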
How Bayesian Belief Networks Work
Bayesian Belief Networks represent the joint probability distribution of a set of random
variables in a way that makes it feasible to manage and compute complex relationships. They
enable the calculation of the probability of a particular outcome given observed evidence,
using Bayes' Theorem.
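As a small illustration of that last point, with numbers invented for the example (they do not appear in the notes), the snippet below applies Bayes' Theorem to update belief in a hypothetical condition after observing a symptom:

# Hypothetical numbers: a condition with prior 1%, a symptom seen in
# 80% of patients with the condition and 10% of patients without it.
p_disease = 0.01                      # P(D)
p_symptom_given_disease = 0.80        # P(S | D)
p_symptom_given_no_disease = 0.10     # P(S | not D)

# Total probability of the evidence: P(S) = P(S|D)P(D) + P(S|~D)P(~D)
p_symptom = (p_symptom_given_disease * p_disease
             + p_symptom_given_no_disease * (1 - p_disease))

# Bayes' Theorem: P(D | S) = P(S | D) * P(D) / P(S)
p_disease_given_symptom = p_symptom_given_disease * p_disease / p_symptom
print(round(p_disease_given_symptom, 3))   # ~0.075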
1. Structure of the Network
The structure encodes which variables are directly dependent on others. A node is
conditionally independent of its non-descendants given its parents, simplifying the
computation of probabilities.
2. Inference in Bayesian Networks
Inference is the process of computing the probability of certain variables given
observed values of other variables. The main inference tasks are:
o Predictive Inference: Calculating the probability of a future event given
current evidence.
o Diagnostic Inference: Determining the probability of a cause given an
observed effect.
o Intercausal Inference: Updating beliefs about one cause after observing
another cause and their common effect.
3. Joint Probability Distribution
The joint probability distribution over all variables can be factored as a product of the
conditional probabilities defined in the CPTs. For example, given a set of variables
X1, X2, ..., Xn:
P(X1, X2, ..., Xn) = P(X1 | Parents(X1)) × P(X2 | Parents(X2)) × ... × P(Xn | Parents(Xn))
Example
Consider a Bayesian Belief Network used to diagnose a patient based on symptoms that may
suggest two possible conditions: Cold and Flu. Fever and Cough are two symptoms
influenced by these conditions.
Scenario:
We have a Bayesian Belief Network with four variables:
1. Cold (C): Whether the patient has a cold (Yes/No).
2. Flu (F): Whether the patient has the flu (Yes/No).
3. Fever (Fe): Whether the patient has a fever (Yes/No).
4. Cough (Co): Whether the patient has a cough (Yes/No).
The relationships are:
Both Cold and Flu can independently cause Cough and Fever.
These two symptoms provide evidence that helps diagnose the condition.
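Below is a minimal Python sketch of this scenario. The CPT values are invented purely for illustration (the notes give no numbers); the code factors the joint distribution as P(C) P(F) P(Fe | C, F) P(Co | C, F) and answers a diagnostic query, P(Flu = Yes | Fever = Yes, Cough = Yes), by summing out the unobserved variable Cold:

from itertools import product

# Hypothetical priors and CPTs; the structure matches the scenario:
# Cold and Flu are root causes, Fever and Cough depend on both.
P_cold = {True: 0.10, False: 0.90}
P_flu  = {True: 0.05, False: 0.95}

# P(Fever=True | Cold, Flu) and P(Cough=True | Cold, Flu), keyed by (cold, flu)
P_fever = {(True, True): 0.90, (True, False): 0.40,
           (False, True): 0.80, (False, False): 0.05}
P_cough = {(True, True): 0.95, (True, False): 0.70,
           (False, True): 0.60, (False, False): 0.10}

def joint(cold, flu, fever, cough):
    """Factored joint: P(C) * P(F) * P(Fe | C, F) * P(Co | C, F)."""
    p = P_cold[cold] * P_flu[flu]
    p *= P_fever[(cold, flu)] if fever else 1 - P_fever[(cold, flu)]
    p *= P_cough[(cold, flu)] if cough else 1 - P_cough[(cold, flu)]
    return p

# Diagnostic inference: P(Flu=Yes | Fever=Yes, Cough=Yes).
# Sum the joint over the hidden variable Cold, then normalize.
num = sum(joint(cold, True, True, True) for cold in (True, False))
den = sum(joint(cold, flu, True, True)
          for cold, flu in product((True, False), repeat=2))
print(round(num / den, 3))

This brute-force enumeration works for small networks; practical systems use more efficient inference algorithms, but they exploit the same factorization of the joint distribution.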
Expectation-Maximization (EM) Algorithm
The Expectation-Maximization (EM) Algorithm is an iterative method for finding maximum
likelihood estimates of parameters in probabilistic models, especially when the model involves
latent (hidden) variables or incomplete data. The EM algorithm is widely used in machine
learning and statistics for tasks such as clustering (e.g., Gaussian Mixture Models) and
handling missing data (imputation).
How the EM Algorithm Works
The EM algorithm consists of two main steps that are repeated iteratively until convergence:
1. Expectation Step (E-step): Using the current parameter estimates, compute the
expected value of the complete-data log-likelihood, where the expectation is taken over
the distribution of the latent variables given the observed data.
2. Maximization Step (M-step): Update the parameter estimates by maximizing the
expected log-likelihood computed in the E-step.
These steps are repeated until the algorithm converges, which typically means that changes in
the parameter estimates fall below a specified threshold or the likelihood function stops
improving significantly.
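As an illustrative sketch only, the loop below implements these two steps for a two-component, one-dimensional Gaussian mixture in NumPy; the data and initial values are synthetic and not taken from the notes:

import numpy as np

rng = np.random.default_rng(0)
# Synthetic one-dimensional data drawn from two Gaussians (hypothetical example).
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.5, 700)])

# Initial guesses for the mixing weights, means, and standard deviations.
weights = np.array([0.5, 0.5])
means = np.array([-1.0, 1.0])
stds = np.array([1.0, 1.0])

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

for iteration in range(200):
    # E-step: responsibility of each component for each data point,
    # computed under the current parameter estimates.
    likelihoods = np.stack([w * gaussian_pdf(data, m, s)
                            for w, m, s in zip(weights, means, stds)], axis=1)
    responsibilities = likelihoods / likelihoods.sum(axis=1, keepdims=True)

    # M-step: re-estimate the parameters to maximize the expected log-likelihood.
    counts = responsibilities.sum(axis=0)
    new_weights = counts / len(data)
    new_means = (responsibilities * data[:, None]).sum(axis=0) / counts
    new_stds = np.sqrt((responsibilities * (data[:, None] - new_means) ** 2).sum(axis=0) / counts)

    # Convergence: stop once the parameter estimates barely change.
    shift = max(np.abs(new_means - means).max(), np.abs(new_stds - stds).max())
    weights, means, stds = new_weights, new_means, new_stds
    if shift < 1e-6:
        break

print("weights:", weights.round(3), "means:", means.round(3), "stds:", stds.round(3))

Each pass through the loop alternates the E-step (computing responsibilities) and the M-step (re-estimating weights, means, and standard deviations), stopping once the estimates stop changing appreciably.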
Steps of the EM Algorithm
Let's break down the process in more detail: