Generative modelling
● Purpose: It models how data is generated by learning the joint probability distribution P(x, y), or just P(x) for unlabeled data. It aims to understand the underlying
data structure and generate new, realistic samples.
● How it Works: It learns both the features and their correlations, allowing it to generate new
data points similar to the training set. It tries to answer, "How likely is this input?"
● Examples:
○ GANs (Generative Adversarial Networks): Generate realistic images, videos, or
sounds by pitting two neural networks (Generator and Discriminator) against each
other.
○ VAEs (Variational Autoencoders): Learn efficient data representations to generate
new data samples with controlled variation.
● Applications: Image synthesis, text generation, data augmentation, and anomaly detection.
● Advantages: Can generate new, unseen data samples, useful for creative tasks and data
augmentation.
● Disadvantages: Often harder to train and prone to issues like mode collapse (repetitive
outputs).
Discriminative modelling
● Purpose: It models the decision boundary between classes by learning the conditional
probability P(y | x), focusing on distinguishing between different categories.
● How it Works: It learns to map inputs to their corresponding labels, directly optimizing for
classification accuracy. It tries to answer, "Which class does this input belong to?"
● Examples:
○ Logistic Regression: Classifies data into two categories by estimating probabilities.
○ SVM (Support Vector Machine): Finds the optimal hyperplane that separates
different classes with the maximum margin.
○ Neural Networks: Learn complex decision boundaries for tasks like image
classification and speech recognition.
● Applications: Image classification, spam detection, speech recognition, and fraud detection.
● Advantages: Usually simpler to train and more accurate for classification tasks.
● Disadvantages: Cannot generate new data samples and may require large labeled datasets.
Generative vs discriminative modelling
● A generative model learns the joint distribution P(x, y) (or just P(x)), so it can both classify and synthesize new samples; a discriminative model learns only P(y | x), so it can only separate classes.
● A generative model asks "How likely is this input?"; a discriminative model asks "Which class does this input belong to?".
● Discriminative models are usually simpler to train and more accurate for pure classification; generative models are harder to train but enable generation and data augmentation.
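A minimal sketch of this contrast, assuming scikit-learn is available. Gaussian Naive Bayes stands in here for a simple generative classifier (it models P(y) and P(x | y)), and logistic regression for the discriminative one (it models P(y | x) directly); neither model is named in the notes above, they are just convenient stand-ins.

```python
# Minimal sketch: a generative classifier (Gaussian Naive Bayes, models P(y) and P(x|y))
# vs. a discriminative one (logistic regression, models P(y|x) directly).
# Assumes scikit-learn is installed; the dataset here is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gen = GaussianNB().fit(X_tr, y_tr)            # learns class priors P(y) and per-class P(x|y)
disc = LogisticRegression().fit(X_tr, y_tr)   # learns the decision boundary P(y|x)

print("Generative (GaussianNB) accuracy:", gen.score(X_te, y_te))
print("Discriminative (LogReg) accuracy:", disc.score(X_te, y_te))

# Because the generative model captures P(x, y), it can also score how likely an input
# is under each class, while logistic regression only answers "which class is this?".
```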
GAN vs VAE
● GAN: Trains a generator against a discriminator in an adversarial game; it can produce very realistic samples but is harder to train and prone to mode collapse.
● VAE: Learns a latent representation with an encoder and decoder, optimizing reconstruction plus a regularization term; training is more stable and variation can be controlled, though samples are often less sharp.
Probabilistic models
GMM
A Gaussian Mixture Model (GMM) is a probabilistic model that assumes all the data points
are generated from a mixture of several Gaussian (normal) distributions with unknown
parameters.
Each component in the mixture represents a cluster, and the model assigns probabilities to
each point for belonging to a particular cluster.
Working of GMM
GMM works by modeling the data as a weighted sum of multiple Gaussian distributions,
each having its own mean and covariance.
🔁 Steps:
1. Initialization:
○ Start with initial guesses for each Gaussian's mean, covariance, and mixing weight (e.g., random values or K-Means centroids).
2. E-step (Expectation):
○ For each data point, calculate the probability of it belonging to each Gaussian using the current parameters (soft clustering).
3. M-step (Maximization):
○ Update each Gaussian's mean, covariance, and mixing weight using these soft assignments so as to maximize the likelihood.
4. Repeat the E and M steps until convergence (i.e., the parameters stop changing significantly).
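These steps can be reproduced with scikit-learn's GaussianMixture (assuming the library is installed); the E/M iterations run inside .fit(), and the soft assignments are exposed via predict_proba. The two-cluster data below is only illustrative.

```python
# Minimal GMM sketch with scikit-learn; the EM loop described above
# (E-step soft assignments, M-step parameter updates) runs inside .fit().
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data drawn from two Gaussians with different means.
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(4.0, 1.5, size=(200, 2))])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)

print("Means:\n", gmm.means_)            # one mean vector per Gaussian component
print("Weights:", gmm.weights_)          # mixing coefficients
print("Soft assignment of first point:", gmm.predict_proba(X[:1]))
```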
Advantages
● Soft Clustering: Unlike K-Means, GMM assigns probabilities, not hard labels —
better for overlapping clusters.
Limitations
● Number of Components Must Be Predefined: You need to specify the number of
Gaussians in advance.
Applications
● Image Segmentation: Separating objects based on pixel intensities.
HMM
A Hidden Markov Model (HMM) is a statistical model that describes systems that are influenced by
hidden (unobservable) states but produce observable outputs. It models the probabilistic
relationship between a sequence of hidden states and corresponding observations, helping to make
predictions or understand patterns when the underlying system is not directly visible. It involves:
● Hidden States: Unobservable internal states (e.g., weather conditions like Sunny or Rainy).
● Observations: Visible outputs influenced by hidden states (e.g., Dry or Wet ground).
Key Components: hidden states, observations, start (initial) probabilities, transition probabilities, and emission probabilities.
Example: Imagine you want to predict the weather (Sunny or Rainy) based on observed conditions (Dry or Wet). In this case:
● Hidden States: The actual weather (Sunny, Rainy) which you cannot observe directly.
● Observations: The ground condition (Dry, Wet) that you can see.
Using HMM:
● Start Probabilities: Initial chances of the weather being Sunny (60%) or Rainy (40%).
● Transition Probabilities: Chances of moving from one weather state to another. For example,
if it's Sunny today, there's a 70% chance it will be Sunny tomorrow and a 30% chance of Rain.
● Emission Probabilities: Chances of observing Dry or Wet conditions given the hidden state.
For example, if it's Sunny, there's a 90% chance the ground is Dry.
With this setup, HMM can predict the most likely sequence of weather conditions given a sequence
of observed ground states.
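A small NumPy sketch of this weather example using the Viterbi algorithm, which finds the most likely hidden-state sequence. The Sunny-row numbers come from the text; the Rainy-row transition and emission probabilities are not given above and are assumed purely for illustration.

```python
# Viterbi sketch for the weather example. Start, Sunny-transition and Sunny-emission
# values come from the text; the Rainy rows are illustrative assumptions.
import numpy as np

states = ["Sunny", "Rainy"]

start = np.array([0.6, 0.4])                 # P(Sunny), P(Rainy) on day 1
trans = np.array([[0.7, 0.3],                # Sunny -> Sunny/Rainy (from the text)
                  [0.4, 0.6]])               # Rainy -> Sunny/Rainy (assumed)
emit = np.array([[0.9, 0.1],                 # Sunny -> Dry/Wet (from the text)
                 [0.2, 0.8]])                # Rainy -> Dry/Wet (assumed)

observations = [0, 0, 1]                     # observed ground: Dry, Dry, Wet

# Viterbi: most likely hidden-state sequence given the observations.
v = start * emit[:, observations[0]]         # best path probabilities at time 0
back = []
for o in observations[1:]:
    scores = v[:, None] * trans * emit[:, o] # rows: previous state, cols: next state
    back.append(scores.argmax(axis=0))       # best predecessor for each state
    v = scores.max(axis=0)

# Backtrack from the best final state.
path = [int(v.argmax())]
for ptr in reversed(back):
    path.insert(0, int(ptr[path[0]]))

print("Most likely weather sequence:", [states[s] for s in path])
```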
Advantages
● Handles Sequential Data: Well suited to modeling sequences, with efficient algorithms (e.g., Viterbi, forward-backward) for inference and learning.
Disadvantages
● Assumption of Markov Property: Assumes that the current state only depends on the
previous state, which may not hold in complex scenarios.
● Parameter Estimation: Requires careful estimation of transition and emission probabilities.
● Hidden State Limitations: Number of hidden states must be predefined, which might
oversimplify complex systems.
MRF
A Markov Random Field (MRF), also known as an Undirected Graphical Model or Markov Network,
is a probabilistic model that uses an undirected graph to represent the dependencies between
random variables.
Key Features
● Undirected Edges: Unlike Bayesian networks, MRFs use undirected edges, indicating that the
relationship between connected nodes is mutual without a directional cause-and-effect flow.
● No Conditional Probability Distribution: Edges in an MRF show potential interactions but are
not associated with conditional probabilities.
● Local Interactions: Two nodes interact directly only if they are connected by an edge.
How It Works
1. Graph Structure: Nodes are variables (e.g., pixels), and edges show dependencies.
2. Potential Functions: These define how connected variables influence each other.
3. Joint Probability: Calculated using these functions to find the likelihood of a certain
configuration.
4. Inference: Used to predict unknown variables (e.g., labeling image regions).
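A toy sketch of steps 2 and 3, assuming a chain of four binary variables (e.g., pixel labels) and a hand-picked pairwise potential that rewards agreeing neighbors; the joint probability is the product of edge potentials normalized by the partition function Z.

```python
# Minimal MRF sketch: a chain of 4 binary variables with a pairwise potential that
# favors neighbors taking the same value (an illustrative choice, not from the notes).
import itertools
import numpy as np

def edge_potential(a, b):
    # Higher value when connected nodes agree.
    return 2.0 if a == b else 1.0

def unnormalized(config):
    # Product of potentials over the edges of the chain 0-1-2-3.
    return np.prod([edge_potential(config[i], config[i + 1]) for i in range(3)])

# Partition function Z: sum of unnormalized scores over all 2^4 configurations.
Z = sum(unnormalized(c) for c in itertools.product([0, 1], repeat=4))

config = (1, 1, 1, 0)
print("P(config) =", unnormalized(config) / Z)
```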
Applications
● Image Processing: Segmentation, denoising, and labeling image regions based on dependencies between neighboring pixels.
Advantages
● Models mutual (symmetric) dependencies, such as those between neighboring pixels, without assuming a causal direction.
Disadvantages
● Exact inference and computing the normalization constant (partition function) are generally intractable, so approximate methods are often required.
Bayesian network
A Bayesian Network (also known as a Belief Network) is a probabilistic graphical model that
represents a set of variables and their conditional dependencies using a directed acyclic graph
(DAG). It is based on Bayes' theorem and is used to model uncertainty in complex systems.
1. Nodes: Represent random variables, which can be observable quantities, latent variables, or
unknown parameters.
2. Edges: Directed edges between nodes represent conditional dependencies. If there's an edge
from node A to node B, then A directly influences B.
3. Conditional Probability Tables (CPTs): Each node has a CPT that quantifies the effect of the
parent nodes on the node.
Scenario:
● B: Burglary occurred
● E: Earthquake occurred
● A: Alarm went off
● J: John called to report the alarm
● M: Mary called to report the alarm
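A sketch of this scenario in Python. The joint distribution factorizes along the DAG as P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A); the CPT numbers below are illustrative (they are not given in the notes), and inference is done by brute-force enumeration.

```python
# Burglary-alarm network sketch. The joint distribution factorizes along the DAG:
#   P(B, E, A, J, M) = P(B) * P(E) * P(A | B, E) * P(J | A) * P(M | A)
# The CPT values below are illustrative assumptions.
import itertools

P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}      # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                          # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                          # P(M=true | A)

def joint(b, e, a, j, m):
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# Inference by enumeration: P(Burglary | John called, Mary called).
num = sum(joint(True, e, a, True, True)
          for e, a in itertools.product([True, False], repeat=2))
den = sum(joint(b, e, a, True, True)
          for b, e, a in itertools.product([True, False], repeat=3))
print("P(B | J, M) =", num / den)   # roughly 0.28 with these numbers
```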
Applications:
● Security Systems: To determine the likelihood of a break-in based on multiple sensor alerts.
● Medical Diagnosis: Inferring diseases from symptoms and test results.
● Fault Detection: In engineering systems based on observed failures.
Advantages:
● Compact Representation: Efficiently represents joint distributions.
● Causal Relationships: Clearly shows dependencies and causal structures.
● Flexible Inference: Can calculate probabilities given any evidence.
Challenges:
● Complexity: CPT sizes grow exponentially with the number of parent nodes, and learning the network structure from data can be computationally expensive.
EM algorithm
The Expectation-Maximization (EM) algorithm is an iterative method used to find
maximum likelihood estimates of parameters in probabilistic models when the data has
missing or hidden (latent) variables.
How EM Works
It alternates between two steps:
1. E-step (Expectation): Estimate the expected value of the latent variables (like cluster assignments) given the current parameters of the model.
○ Example in GMM: Compute the probability that each data point belongs to each Gaussian (soft assignment).
2. M-step (Maximization): Update the model parameters (like mean, variance, mixing coefficients) to maximize the expected log-likelihood found in the E-step.
○ Example in GMM: Update the means, covariances, and weights of each Gaussian using the soft assignments.
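A minimal NumPy sketch of these two steps for a two-component, one-dimensional Gaussian mixture; the synthetic data and the initial parameter guesses are arbitrary illustrations.

```python
# Minimal EM sketch for a two-component 1-D Gaussian mixture, mirroring the
# E-step / M-step description above. Data and initial values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1.5, 200)])

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

mu, var, pi = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

for _ in range(100):
    # E-step: responsibility of each component for each point (soft assignment).
    dens = pi * np.stack([gauss(x, mu[k], var[k]) for k in range(2)], axis=1)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate weights, means, and variances from the responsibilities.
    nk = resp.sum(axis=0)
    pi = nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk

print("weights:", pi, "means:", mu, "variances:", var)
```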
Advantages
● Handles missing or hidden data effectively.
Limitations
● May converge to a local maximum, not necessarily the global one.
● Sensitive to initialization.
Applications
● Clustering (e.g., Gaussian Mixture Models)
● Image restoration
● Bioinformatics
● Anomaly detection