
Bayesian Inference for the Gaussian

Last Updated : 27 Mar, 2025

Bayesian inference is a powerful statistical framework for updating beliefs about unknown parameters as new data is observed. For Gaussian (normally) distributed data, it lets us infer the mean and variance of the underlying distribution in a principled way. Bayesian inference is especially useful in real-world problems such as machine learning, finance, signal processing, and scientific modeling, where prior information can be incorporated into the estimation process.

This article explores Bayesian inference of Gaussian distributions, including posterior derivation, conjugate priors, parameter estimation, and applications.

Bayesian Inference Framework

Bayesian methodology is a probabilistic approach to inference that revises prior beliefs in light of observed data. It rests on Bayes' theorem, which combines a prior distribution (initial knowledge) with a likelihood function (new evidence) to produce a posterior distribution. This approach supports adaptive learning and the quantification of uncertainty, and is used in machine learning, finance, healthcare, and many other scientific fields.

Bayesian inference relies on Bayes' theorem, which states that:

p(\theta | x) = \frac{p(x | \theta) p(\theta)}{p(x)}

where:

  • p(θ∣x) is the posterior probability of the parameter θ given the observed data x.
  • p(x∣θ) is the likelihood of the data given the parameter.
  • p(θ) is the prior probability of θ, reflecting prior beliefs.
  • p(x) is the evidence (or marginal likelihood), ensuring the posterior normalizes to 1.
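
As a quick numerical illustration, the sketch below applies Bayes' theorem to two discrete candidate values of θ. The hypotheses, prior weights, and likelihood values are made up purely for demonstration.

```python
# Minimal sketch of Bayes' theorem with two discrete hypotheses.
# All numbers here are illustrative, not taken from a real problem.

# Prior beliefs p(theta) over two candidate parameter values
prior = {"theta_1": 0.5, "theta_2": 0.5}

# Likelihood p(x | theta) of the observed data under each hypothesis
likelihood = {"theta_1": 0.8, "theta_2": 0.3}

# Evidence p(x): total probability of the data over all hypotheses
evidence = sum(likelihood[t] * prior[t] for t in prior)

# Posterior p(theta | x) = p(x | theta) * p(theta) / p(x)
posterior = {t: likelihood[t] * prior[t] / evidence for t in prior}
print(posterior)  # {'theta_1': 0.727..., 'theta_2': 0.272...}
```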

Bayesian Inference for Gaussian Distribution

Consider a dataset X = \{x_1, x_2, ..., x_n\} drawn from a normal distribution:

x_i \sim \mathcal{N}(\mu, \sigma^2)

where the mean μ and variance σ² are unknown. We use Bayesian inference to estimate these parameters.

1. Estimating the Mean with Known Variance

Prior Distribution

A common choice for the prior distribution of μ (when the variance σ² is known) is a Gaussian prior:

\mu \sim \mathcal{N}(\mu_0, \tau^2)

where:

  • \mu_0 is the prior mean.
  • \tau^2 is the prior variance.

Likelihood Function

Given 𝑛 independent observations, the likelihood function is:

p(X | \mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma^2}} \exp \left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)

Posterior Distribution

Using Bayes’ theorem, the posterior for 𝜇 is also Gaussian:

p(\mu | X) = \mathcal{N} \left(\mu_n, \tau_n^2 \right)

where:

  • Updated mean: \mu_n = \frac{\frac{\mu_0}{\tau^2} + \frac{n \bar{x}}{\sigma^2}}{\frac{1}{\tau^2} + \frac{n}{\sigma^2}}
  • Updated variance: \tau_n^2 = \frac{1}{\frac{1}{\tau^2} + \frac{n}{\sigma^2}}

This shows how prior knowledge is updated with new data: the posterior mean is a precision-weighted average of the prior mean and the sample mean, and the posterior variance shrinks as more observations arrive.
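
The sketch below implements these update equations in Python. The function name, prior hyperparameters, and simulated data are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def posterior_mean_known_variance(x, sigma2, mu0, tau2):
    """Posterior N(mu_n, tau_n^2) for the mean of a Gaussian with known
    variance sigma2, given a Gaussian prior N(mu0, tau2) on the mean."""
    n = len(x)
    x_bar = np.mean(x)
    # Precisions add: 1 / tau_n^2 = 1 / tau^2 + n / sigma^2
    tau_n2 = 1.0 / (1.0 / tau2 + n / sigma2)
    # Posterior mean is a precision-weighted average of prior mean and sample mean
    mu_n = tau_n2 * (mu0 / tau2 + n * x_bar / sigma2)
    return mu_n, tau_n2

# Example with simulated data: true mean 2, known variance 1 (illustrative values)
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=50)
mu_n, tau_n2 = posterior_mean_known_variance(data, sigma2=1.0, mu0=0.0, tau2=10.0)
print(f"Posterior mean: {mu_n:.3f}, posterior variance: {tau_n2:.4f}")
```

With a vague prior (large τ²) or more data (larger n), the posterior mean moves toward the sample mean and the posterior variance becomes smaller.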

2. Estimating Variance with Known Mean

Prior Distribution

For unknown variance σ², the conjugate prior is the inverse-gamma distribution:

\sigma^2 \sim \text{Inv-Gamma}(\alpha_0, \beta_0)

where \alpha_0 and \beta_0 are hyperparameters that encode the prior belief.

Likelihood Function

The likelihood for variance is:

p(X | \sigma^2) \propto (\sigma^2)^{-n/2} \exp \left( -\frac{\sum (x_i - \mu)^2}{2\sigma^2} \right)

Posterior Distribution

The posterior for σ² is also inverse-gamma:

p(\sigma^2 | X) = \text{Inv-Gamma}(\alpha_n, \beta_n)

where:

  • Updated shape parameter: \alpha_n = \alpha_0 + \frac{n}{2}
  • Updated scale parameter: \beta_n = \beta_0 + \frac{1}{2} \sum (x_i - \mu)^2
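
A minimal sketch of this update, assuming illustrative prior hyperparameters and simulated data (the function name is hypothetical):

```python
import numpy as np

def posterior_variance_known_mean(x, mu, alpha0, beta0):
    """Posterior Inv-Gamma(alpha_n, beta_n) for the variance of a Gaussian
    with known mean mu, given an Inv-Gamma(alpha0, beta0) prior."""
    n = len(x)
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * np.sum((x - mu) ** 2)
    return alpha_n, beta_n

# Example: known mean 0, true variance 4 (prior hyperparameters are illustrative)
rng = np.random.default_rng(1)
data = rng.normal(loc=0.0, scale=2.0, size=100)
alpha_n, beta_n = posterior_variance_known_mean(data, mu=0.0, alpha0=2.0, beta0=2.0)
# For an inverse-gamma posterior, E[sigma^2 | X] = beta_n / (alpha_n - 1)
print(f"Posterior mean of sigma^2: {beta_n / (alpha_n - 1):.3f}")
```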

3. Joint Estimation of Mean and Variance

When both μ and σ² are unknown, we use a Normal-Inverse-Gamma (NIG) prior:

\mu, \sigma^2 \sim \text{NIG}(\mu_0, \lambda_0, \alpha_0, \beta_0)

where \lambda_0 is a precision parameter. The posterior remains a Normal-Inverse-Gamma:

p(\mu, \sigma^2 | X) = \text{NIG}(\mu_n, \lambda_n, \alpha_n, \beta_n)

where:

  • Updated mean: \mu_n = \frac{\lambda_0 \mu_0 + n \bar{x}}{\lambda_0 + n}
  • Updated precision: \lambda_n = \lambda_0 + n
  • Updated shape parameter: \alpha_n = \alpha_0 + \frac{n}{2}
  • Updated scale parameter: \beta_n = \beta_0 + \frac{1}{2} \sum (x_i - \bar{x})^2 + \frac{\lambda_0 n (\bar{x} - \mu_0)^2}{2(\lambda_0 + n)}
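
The sketch below applies these four update equations; again, the function name, prior hyperparameters, and simulated data are assumptions made for illustration.

```python
import numpy as np

def posterior_normal_inverse_gamma(x, mu0, lam0, alpha0, beta0):
    """Posterior NIG(mu_n, lambda_n, alpha_n, beta_n) for the mean and variance
    of a Gaussian, given a Normal-Inverse-Gamma prior."""
    n = len(x)
    x_bar = np.mean(x)
    mu_n = (lam0 * mu0 + n * x_bar) / (lam0 + n)
    lam_n = lam0 + n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0
              + 0.5 * np.sum((x - x_bar) ** 2)
              + lam0 * n * (x_bar - mu0) ** 2 / (2.0 * (lam0 + n)))
    return mu_n, lam_n, alpha_n, beta_n

# Example: true mean 5, true variance 9 (prior hyperparameters are illustrative)
rng = np.random.default_rng(2)
data = rng.normal(loc=5.0, scale=3.0, size=200)
mu_n, lam_n, alpha_n, beta_n = posterior_normal_inverse_gamma(
    data, mu0=0.0, lam0=1.0, alpha0=2.0, beta0=2.0)
print(f"Posterior mean of mu: {mu_n:.3f}")
print(f"Posterior mean of sigma^2: {beta_n / (alpha_n - 1):.3f}")
```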

Applications of Bayesian Inference for Gaussian Distributions

  1. Machine Learning: Bayesian linear regression places a Gaussian prior on the model parameters, which yields more stable predictions when data is limited, helps prevent overfitting, and provides uncertainty quantification.
  2. Finance: Bayesian inference is applied to estimate stock return distributions assuming Gaussianity, enabling improved risk estimation and portfolio optimization through updating beliefs with new financial information.
  3. Medical Science: In medicine, Bayesian models update patient parameters like heart rate or blood pressure dynamically based on prior knowledge, enhancing disease diagnosis and individualized treatment plans over time.
  4. Signal Processing: The Kalman filter, a Bayesian technique, recursively estimates latent states in dynamic systems (e.g., object tracking in radar systems) by representing noise and uncertainties as Gaussian distributions.
  5. Quality Control: Bayesian estimation aids in monitoring and controlling variability of manufacturing processes by updating the estimated parameters of a Gaussian-distributed quality measure continually, maintaining product consistency and reducing defects.


