Bayesian Inference for the Gaussian
Bayesian inference is a powerful statistical framework for revising beliefs about an unknown parameter as new data arrive. For Gaussian (Normal) distributed data, Bayesian inference lets us estimate the mean and variance of the underlying distribution in a principled way. It is especially useful in real-world problems such as machine learning, finance, signal processing, and scientific modeling, where prior information can be incorporated into the estimation process.
This article explores Bayesian inference for Gaussian distributions, including posterior derivation, conjugate priors, parameter estimation, and applications.
Bayesian Inference Framework
Bayesian methodology is a probabilistic approach to inference that revises prior beliefs in light of observed data. It rests on Bayes' theorem, which combines a prior distribution (initial knowledge) with a likelihood function (new evidence) to produce a posterior distribution. This approach supports adaptive learning and the quantification of uncertainty, and it is used in machine learning, finance, healthcare, and many other scientific fields.
Bayesian inference relies on Bayes' theorem, which states that:
p(\theta | x) = \frac{p(x | \theta) p(\theta)}{p(x)}
where:
- p(θ | x) is the posterior probability of the parameter θ given observed data x.
- p(x∣θ) is the likelihood of the data given the parameter.
- p(θ) is the prior probability of θ, reflecting prior beliefs.
- p(x) is the evidence (or marginal likelihood), ensuring the posterior normalizes to 1.
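To make the theorem concrete, here is a minimal Python sketch with made-up numbers: a discrete parameter with two candidate values for a coin's heads probability, updated after observing a single head.

```python
# Bayes' theorem for a discrete parameter: two candidate values for a
# coin's heads probability, updated after observing one head.
priors = {0.5: 0.5, 0.8: 0.5}        # p(theta): equal prior belief
likelihoods = {0.5: 0.5, 0.8: 0.8}   # p(x | theta): probability of heads

evidence = sum(likelihoods[t] * priors[t] for t in priors)        # p(x)
posterior = {t: likelihoods[t] * priors[t] / evidence for t in priors}
print(posterior)  # {0.5: 0.3846..., 0.8: 0.6153...}
```

The observed head shifts belief toward the biased coin, exactly as the formula prescribes.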
Bayesian Inference for Gaussian Distribution
Consider a dataset X = \{x_1, x_2, ..., x_n\} drawn from a normal distribution:
x_i \sim \mathcal{N}(\mu, \sigma^2)
where the mean μ and variance σ² are unknown. We use Bayesian inference to estimate these parameters.
1. Estimating the Mean with Known Variance
Prior Distribution
A common choice of prior distribution for μ (when the variance σ² is known) is a Gaussian prior:
\mu \sim \mathcal{N}(\mu_0, \tau^2)
where:
- \mu_0 is the prior mean.
- \tau^2 is the prior variance.
Likelihood Function
Given 𝑛 independent observations, the likelihood function is:
p(X | \mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi \sigma^2}} \exp \left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)
Posterior Distribution
Using Bayes’ theorem, the posterior for 𝜇 is also Gaussian:
p(\mu | X) = \mathcal{N} \left(\mu_n, \tau_n^2 \right)
where:
- Updated mean: \mu_n = \frac{\frac{\mu_0}{\tau^2} + \frac{n \bar{x}}{\sigma^2}}{\frac{1}{\tau^2} + \frac{n}{\sigma^2}}
- Updated variance: \tau_n^2 = \frac{1}{\frac{1}{\tau^2} + \frac{n}{\sigma^2}}
This shows how prior knowledge is updated with new data.
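These two formulas are straightforward to implement. Below is a minimal Python sketch (the function name posterior_mean_known_variance and the example numbers are illustrative, not from any standard library) that applies the update to simulated data.

```python
import numpy as np

def posterior_mean_known_variance(x, sigma2, mu0, tau2):
    """Conjugate update for the mean of a Gaussian with known variance,
    following the formulas above."""
    n = len(x)
    xbar = np.mean(x)
    tau_n2 = 1.0 / (1.0 / tau2 + n / sigma2)          # updated variance
    mu_n = tau_n2 * (mu0 / tau2 + n * xbar / sigma2)  # updated mean
    return mu_n, tau_n2

# Example: a vague prior N(0, 10^2) updated with 50 draws from N(5, 2^2)
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=50)
mu_n, tau_n2 = posterior_mean_known_variance(data, sigma2=4.0,
                                             mu0=0.0, tau2=100.0)
print(mu_n, tau_n2)  # posterior mean near the true value 5
```

With 50 observations, the posterior variance shrinks far below the prior variance, showing how the data quickly dominate a vague prior.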
2. Estimating Variance with Known Mean
Prior Distribution
For unknown variance σ², the conjugate prior is the inverse-gamma distribution:
\sigma^2 \sim \text{Inv-Gamma}(\alpha_0, \beta_0)
where α₀ and β₀ are hyperparameters controlling prior belief.
Likelihood Function
The likelihood for variance is:
p(X | \sigma^2) \propto (\sigma^2)^{-n/2} \exp \left( -\frac{\sum (x_i - \mu)^2}{2\sigma^2} \right)
Posterior Distribution
The posterior for σ² is also inverse-gamma:
p(\sigma^2 | X) = \text{Inv-Gamma}(\alpha_n, \beta_n)
where:
- Updated shape parameter: \alpha_n = \alpha_0 + \frac{n}{2}
- Updated scale parameter: \beta_n = \beta_0 + \frac{1}{2} \sum (x_i - \mu)^2
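A short Python sketch of this update (again with illustrative names and simulated data) shows how the shape and scale parameters grow with the data.

```python
import numpy as np

def posterior_variance_known_mean(x, mu, alpha0, beta0):
    """Conjugate inverse-gamma update for the variance when the
    mean is known, following the formulas above."""
    n = len(x)
    alpha_n = alpha0 + n / 2.0                    # updated shape
    beta_n = beta0 + 0.5 * np.sum((x - mu) ** 2)  # updated scale
    return alpha_n, beta_n

rng = np.random.default_rng(1)
data = rng.normal(loc=0.0, scale=3.0, size=200)   # true variance is 9
alpha_n, beta_n = posterior_variance_known_mean(data, mu=0.0,
                                                alpha0=2.0, beta0=2.0)

# The mean of an Inv-Gamma(alpha, beta) is beta / (alpha - 1) for alpha > 1
print(beta_n / (alpha_n - 1))                     # close to 9
```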
3. Joint Estimation of Mean and Variance
When both μ and σ² are unknown, we use a Normal-Inverse-Gamma (NIG) prior:
\mu, \sigma^2 \sim \text{NIG}(\mu_0, \lambda_0, \alpha_0, \beta_0)
where \lambda_0 is a precision parameter. The posterior remains a Normal-Inverse-Gamma:
p(\mu, \sigma^2 | X) = \text{NIG}(\mu_n, \lambda_n, \alpha_n, \beta_n)
where:
- Updated mean: \mu_n = \frac{\lambda_0 \mu_0 + n \bar{x}}{\lambda_0 + n}
- Updated precision: \lambda_n = \lambda_0 + n
- Updated shape parameter: \alpha_n = \alpha_0 + \frac{n}{2}
- Updated scale parameter: \beta_n = \beta_0 + \frac{1}{2} \sum (x_i - \bar{x})^2 + \frac{\lambda_0 n (\bar{x} - \mu_0)^2}{2(\lambda_0 + n)}
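The following Python sketch (illustrative, using simulated data) implements the four NIG update formulas above in one function.

```python
import numpy as np

def posterior_normal_inverse_gamma(x, mu0, lam0, alpha0, beta0):
    """Conjugate Normal-Inverse-Gamma update when both the mean and
    the variance are unknown, following the formulas above."""
    n = len(x)
    xbar = np.mean(x)
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)   # updated mean
    lam_n = lam0 + n                              # updated precision
    alpha_n = alpha0 + n / 2.0                    # updated shape
    beta_n = (beta0                               # updated scale
              + 0.5 * np.sum((x - xbar) ** 2)
              + lam0 * n * (xbar - mu0) ** 2 / (2.0 * (lam0 + n)))
    return mu_n, lam_n, alpha_n, beta_n

rng = np.random.default_rng(2)
data = rng.normal(loc=5.0, scale=2.0, size=100)   # true mean 5, variance 4
mu_n, lam_n, alpha_n, beta_n = posterior_normal_inverse_gamma(
    data, mu0=0.0, lam0=1.0, alpha0=2.0, beta0=2.0)
print(mu_n, beta_n / (alpha_n - 1))   # approx. 5 and approx. 4
```

The last term in the scale update penalizes disagreement between the prior mean and the sample mean, which is why the posterior widens when the prior is badly misplaced.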
Applications of Bayesian Inference for Gaussian Distributions
- Machine Learning: Bayesian linear regression places Gaussian priors on model parameters, yielding more stable predictions when data are limited; this helps prevent overfitting and provides uncertainty quantification.
- Finance: Bayesian inference is applied to estimate stock return distributions assuming Gaussianity, enabling improved risk estimation and portfolio optimization through updating beliefs with new financial information.
- Medical Science: In medicine, Bayesian models update patient parameters like heart rate or blood pressure dynamically based on prior knowledge, enhancing disease diagnosis and individualized treatment plans over time.
- Signal Processing: The Kalman filter, a Bayesian technique, recursively estimates latent states in dynamic systems (e.g., object tracking in radar systems) by representing noise and uncertainties as Gaussian distributions.
- Quality Control: Bayesian estimation helps monitor and control variability in manufacturing processes by continually updating the estimated parameters of a Gaussian-distributed quality measure, maintaining product consistency and reducing defects.