bayesstats summary — Bayesian summary statistics
Description
bayesstats summary calculates and reports posterior summary statistics for model parameters and
functions of model parameters using current Bayesian estimation results. Posterior summary statistics
include posterior means, posterior standard deviations, MCMC standard errors (MCSE), posterior
medians, and equal-tailed credible intervals or highest posterior density (HPD) credible intervals.
Quick start
Posterior summaries for all model parameters after a Bayesian regression model
bayesstats summary
Same as above, but only for parameters {y:x1} and {y:x2}
bayesstats summary {y:x1} {y:x2}
Same as above
bayesstats summary {y:x1 x2}
Posterior summaries for elements 1,1 and 2,1 of matrix parameter {S}
bayesstats summary {S_1_1 S_2_1}
Posterior summaries for all elements of matrix parameter {S}
bayesstats summary {S}
Posterior summaries with HPD instead of equal-tailed credible intervals and with credible level of
90%
bayesstats summary, hpd clevel(90)
Posterior summaries with MCSE calculated using batch means
bayesstats summary, batch(100)
Posterior summaries for functions of scalar model parameters
bayesstats summary ({y:x1}-{y:_cons}) (sd:sqrt({var}))
Posterior summaries for the log-likelihood and log-posterior functions
bayesstats summary _loglikelihood _logposterior
Posterior summaries for selected model parameters and functions of model parameters and for
log-likelihood and log-posterior functions using abbreviated syntax
bayesstats summary {var} ({y:x1}-{y:_cons}) _ll _lp
Posterior summaries of the simulated outcome
bayespredict {_ysim}, saving(predres)
bayesstats summary {_ysim} using predres
Posterior summaries of the mean across observations of the simulated outcome labeled as mymean
bayesstats summary (mymean: @mean({_ysim})) using predres
Menu
Statistics > Bayesian analysis > Summary statistics
Syntax
Syntax is presented under the following headings:
    Summary statistics for model parameters
    Summary statistics for predictions

Summary statistics for model parameters

Full syntax
    bayesstats summary [spec [spec ...]] [, options]

Summary statistics for predictions

Summary statistics for Mata functions of simulated outcomes, residuals, and more
    bayesstats summary (funcspec) [(funcspec) ...] using predfile [, options]

Full syntax
    bayesstats summary [predspec [predspec ...]] using predfile [, options]
predfile is the name of the dataset created by bayespredict that contains prediction results.
yspec is {ysimspec | residspec | muspec | label}.

ysimspec is {_ysim#} or {_ysim#[numlist]}, where {_ysim#} refers to all observations of the #th
simulated outcome and {_ysim#[numlist]} refers to the selected observations, numlist, of the #th
simulated outcome. {_ysim} is a synonym for {_ysim1}.

residspec is {_resid#} or {_resid#[numlist]}, where {_resid#} refers to all residuals of the
#th simulated outcome and {_resid#[numlist]} refers to the selected residuals, numlist, of the
#th simulated outcome. {_resid} is a synonym for {_resid1}.

muspec is {_mu#} or {_mu#[numlist]}, where {_mu#} refers to all expected values of the #th
outcome and {_mu#[numlist]} refers to the selected expected values, numlist, of the #th outcome.
{_mu} is a synonym for {_mu1}.

label is the name of the function simulated using bayespredict.

With large datasets, specifications {_ysim#}, {_resid#}, and {_mu#} may use a lot of time and
memory and should be avoided. See Generating and saving simulated outcomes in
[BAYES] bayespredict.

yexprspec is exprlabel: yexpr, where exprlabel is a valid Stata name and yexpr is a scalar expression
that may contain individual observations of simulated outcomes, {_ysim#[#]}; individual expected
outcome values, {_mu#[#]}; individual simulated residuals, {_resid#[#]}; and other scalar
predictions, {label}.
funcspec is label: @func(arg1[, arg2]), where label is a valid Stata name; func is an official or
user-defined Mata function that operates on column vectors and returns a real scalar; and arg1 and
arg2 are one of {_ysim[#]}, {_resid[#]}, or {_mu[#]}. arg2 is primarily for use with user-defined
Mata functions; see Defining test statistics using Mata functions in [BAYES] bayespredict.
predspec is one of yspec, (yexprspec), or (funcspec). See Different ways of specifying predictions
and their functions in [BAYES] Bayesian postestimation.
  options                  Description
  ---------------------------------------------------------------------------
  Main
    clevel(#)              set credible interval level; default is clevel(95)
    hpd                    display HPD credible intervals instead of the
                             default equal-tailed credible intervals
    batch(#)               specify length of block for batch-means
                             calculations; default is batch(0)
  * chains(_all | numlist) specify which chains to use for computation;
                             default is chains(_all)
  * sepchains              compute results separately for each chain
    showreffects[(reref)]  include all or a list reref of random-effects
                             parameters in the output; relevant after
                             multilevel models
    skip(#)                skip every # observations from the MCMC sample;
                             default is skip(0)
    nolegend               suppress table legend
    display_options        control spacing, line width, and base and empty
                             cells
  Advanced
    corrlag(#)             specify maximum autocorrelation lag; default varies
    corrtol(#)             specify autocorrelation tolerance; default is
                             corrtol(0.01)
  ---------------------------------------------------------------------------
  * Options chains() and sepchains are relevant only when option nchains() is
    used during Bayesian estimation.
  collect is allowed; see [U] 11.1.10 Prefix commands.
Options
Main
clevel(#) specifies the credible level, as a percentage, for equal-tailed and HPD credible intervals.
The default is clevel(95) or as set by [BAYES] set clevel.
hpd displays the HPD credible intervals instead of the default equal-tailed credible intervals.
batch(#) specifies the length of the block for calculating batch means and an MCSE using batch
means. The default is batch(0), which means no batch calculations. When batch() is not
specified, the MCSE is computed using effective sample sizes instead of batch means. batch()
may not be combined with corrlag() or corrtol().
chains(_all | numlist) specifies which chains from the MCMC sample to use for computation. The
default is chains(_all) or to use all simulated chains. Using multiple chains, provided the chains
have converged, generally improves MCMC summary statistics. Option chains() is relevant only
when option nchains() is used during Bayesian estimation.
sepchains specifies that the results be computed separately for each chain. The default is to compute
results using all chains as determined by option chains(). Option sepchains is relevant only
when option nchains() is used during Bayesian estimation.
showreffects and showreffects(reref ) are for use after multilevel models, and they specify that
the results for all or a list reref of random-effects parameters be provided in addition to other model
parameters. By default, all random-effects parameters are excluded from the results to conserve
computation time.
skip(#) specifies that every # observations from the MCMC sample not be used for computation.
The default is skip(0) or to use all observations in the MCMC sample. Option skip() can be
used to subsample or thin the chain. skip(#) is equivalent to a thinning interval of #+1. For
example, if you specify skip(1), corresponding to the thinning interval of 2, the command will
skip every other observation in the sample and will use only observations 1, 3, 5, and so on in the
computation. If you specify skip(2), corresponding to the thinning interval of 3, the command
will skip every 2 observations in the sample and will use only observations 1, 4, 7, and so on in
the computation. skip() does not thin the chain in the sense of physically removing observations
from the sample, as is done by, for example, bayesmh’s thinning() option. It only discards
selected observations from the computation and leaves the original sample unmodified.
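For intuition, the selection rule behind skip() can be sketched as follows. This is an illustrative Python sketch of the rule described above, not Stata code; the helper name is ours.

```python
# Sketch of the skip(#) selection rule: skip(k) keeps draws
# 1, k+2, 2k+3, ... (that is, a thinning interval of k+1) while
# leaving the stored MCMC sample itself unmodified.
def select_draws(n_draws, skip):
    """Return the 1-based indices of the draws used in the computation."""
    return list(range(1, n_draws + 1, skip + 1))

# skip(0): all draws; skip(1): 1, 3, 5, ...; skip(2): 1, 4, 7, ...
print(select_draws(9, 1))  # [1, 3, 5, 7, 9]
print(select_draws(9, 2))  # [1, 4, 7]
```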
nolegend suppresses the display of the table legend, which identifies the rows of the table with the
expressions they represent.
display_options: vsquish, noemptycells, baselevels, allbaselevels, nofvlabel,
fvwrap(#), fvwrapon(style), and nolstretch; see [R] Estimation options.
Advanced
corrlag(#) specifies the maximum autocorrelation lag used for calculating effective sample sizes. The
default is min{500, mcmcsize()/2}. The total autocorrelation is computed as the sum of all lag-k
autocorrelation values for k from 0 to either corrlag() or the index at which the autocorrelation
becomes less than corrtol() if the latter is less than corrlag(). Options corrlag() and
batch() may not be combined.
corrtol(#) specifies the autocorrelation tolerance used for calculating effective sample sizes. The
default is corrtol(0.01). For a given model parameter, if the absolute value of the lag-k
autocorrelation is less than corrtol(), then all autocorrelation lags beyond the k th lag are
discarded. Options corrtol() and batch() may not be combined.
Introduction
bayesstats summary reports posterior summary statistics for model parameters and their functions
using the current Bayesian estimation results. When typed without arguments, the command displays
results for all model parameters. Alternatively, you can specify a subset of model parameters following
the command name; see Different ways of specifying model parameters in [BAYES] Bayesian
postestimation. You can also obtain results for scalar functions of model parameters; see Specifying
functions of model parameters in [BAYES] Bayesian postestimation.
Sometimes, it may be useful to obtain posterior summaries of log-likelihood and log-posterior
functions. This can be done by specifying _loglikelihood and _logposterior (or the respective
synonyms _ll and _lp) following the command name.
You can also obtain the posterior summaries for prediction quantities when you specify the prediction
dataset in the using specification; see Different ways of specifying predictions and their functions in
[BAYES] Bayesian postestimation for how to specify prediction quantities with bayesstats summary.
bayesstats summary reports the following posterior summary statistics: posterior mean, posterior
standard deviation, MCMC standard error, posterior median, and equal-tailed credible intervals or, if
the hpd option is specified, HPD credible intervals. The default credible level is set to 95%, but you
can change this by specifying the clevel() option. Equal-tailed and HPD intervals may produce very
different results for asymmetric or highly skewed marginal posterior distributions. The HPD intervals
are preferable in this situation.
You should not confuse the term “HPD interval” with the term “HPD region”. A {100×(1−α)}% HPD
interval is defined such that it contains {100×(1−α)}% of the posterior density. A {100×(1−α)}%
HPD region also satisfies the condition that the density inside the region is never lower than that
outside the region. For multimodal univariate marginal posterior distributions, the HPD regions may
include unions of nonintersecting HPD intervals. For unimodal univariate marginal posterior
distributions, HPD regions are indeed simply HPD intervals. The bayesstats summary command thus
calculates HPD intervals assuming unimodal marginal posterior distributions (Chen and Shao 1999).
Some authors use the term “posterior intervals” instead of “credible intervals” and the term “central
posterior intervals” instead of “equal-tailed credible intervals” (for example, Gelman et al. [2014]).
Likelihood:
  mpg ~ normal({mpg:_cons},{var})

Priors:
  {mpg:_cons} ~ 1 (flat)
        {var} ~ jeffreys

             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+-----------------------------------------------------------------
mpg          |
       _cons |  21.29222    .6828864  .021906   21.27898    19.99152   22.61904
The posterior mean of {mpg:_cons} is 21.29 and of {var} is 34.8. They are close to their respective
frequentist analogs (the sample mean of mpg is 21.297, and the sample variance is 33.47), because
we used a noninformative prior. Posterior standard deviations are 0.68 for {mpg:_cons} and 5.92
for {var}, and they are comparable to frequentist standard errors under this noninformative prior.
The standard error estimates of the posterior means, MCSEs, are low. For example, MCSE is 0.022
for {mpg:_cons}. This means that the precision of our estimate is, up to one decimal point, 21.3,
provided that MCMC converged. The posterior means and medians of {mpg:_cons} are close, which
suggests that the posterior distribution for {mpg:_cons} may be symmetric. According to the credible
intervals, we are 95% certain that the posterior mean of {mpg:_cons} is roughly between 20 and
23 and that the posterior mean of {var} is roughly between 25 and 48. We can infer from this that
{mpg:_cons} is greater than, say, 15, and that {var} is greater than, say, 20, with a very high
probability. (We can use [BAYES] bayestest interval to compute the actual probabilities.)
The above is also equivalent to typing
. bayesstats summary {mpg:_cons} {var}
(output omitted )
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [90% cred. interval]
-------------+-----------------------------------------------------------------
mpg          |
       _cons |  21.29222    .6828864  .021906   21.27898    20.18807   22.44172

             |                                                     HPD
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+-----------------------------------------------------------------
mpg          |
       _cons |  21.29222    .6828864  .021906   21.27898    19.94985   22.54917
The posterior distribution of {mpg:_cons} is symmetric about the posterior mean; thus there is
little difference between the 95% equal-tailed credible interval from example 1 and this 95% HPD
credible interval for {mpg:_cons}. The 95% HPD interval for {var} has a smaller width than the
corresponding equal-tailed interval in example 1.
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+-----------------------------------------------------------------
mpg          |
       _cons |  21.29222    .6828864  .015315   21.27898    19.99152   22.61904
The batch-means MCSE estimates are somewhat smaller than those obtained by default using effective
sample sizes.
Use caution when choosing the batch size for the batch-means method. For example, if you use
the batch size of 1, you will obtain MCSE estimates under the assumption that the draws in the MCMC
sample are independent, which is not true.
             |                                                Equal-tailed
             |      Mean   Std. dev.     MCSE     Median  [95% cred. interval]
-------------+-----------------------------------------------------------------
mpg          |
       _cons |  21.29554    .6813796  .029517   21.27907    19.98813   22.58582
We chose to skip every 9 observations, which led to a significant reduction of the MCMC sample
size and thus increased the MCSEs of our estimates. In some cases, with larger MCMC sample sizes,
subsampling may decrease MCSEs because of the decreased autocorrelation in the reduced
MCMC sample.
Expressions can also be used for calculating posterior probabilities, although this can be done more
easily using bayestest interval (see [BAYES] bayestest interval). For illustration, let's verify
that the probability that {var} lies within the endpoints of the reported credible interval is
indeed 0.95.
. bayesstats summary (prob:{var}>24.913 & {var}<47.613)
Posterior summary statistics MCMC sample size = 10,000
prob : {var}>24.913 & {var}<47.613
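The computation behind such an expression is simply the Monte Carlo estimate of an interval probability: the posterior mean of the indicator {var}>24.913 & {var}<47.613 over the MCMC draws. A minimal Python sketch, using hypothetical normal draws in place of the manual's actual MCMC sample of {var}:

```python
import random

# Hypothetical posterior draws standing in for the MCMC sample of {var};
# the mean 34.8 and sd 5.92 mirror the summaries quoted in the text.
random.seed(1)
draws = [random.gauss(34.8, 5.92) for _ in range(10_000)]

# P(24.913 < var < 47.613) estimated as the mean of the indicator
prob = sum(24.913 < d < 47.613 for d in draws) / len(draws)
print(round(prob, 2))
```

With the manual's actual (non-normal) posterior draws, this fraction is 0.95 by construction of the credible interval.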
We can now summarize the prediction results by using bayesstats summary. We specify the
prediction quantity we wish to summarize, the simulated outcome {_ysim} in our example, and the
prediction dataset, mpgreps.dta, which contains the prediction quantity, in the using specification.
bayesstats summary reports posterior summaries for all simulated outcomes in the prediction dataset,
mpgreps.dta. Estimated posterior means and standard deviations are similar to the corresponding
observed values for mpg, 21.30 and 5.79, respectively.
We can specifically examine the first observation of the replicated sample, {_ysim_1}, and
compare it with the observed value, mpg[1], of 22.
. bayesstats summary ({_ysim_1}>=‘=mpg[1]’) using mpgreps
Posterior summary statistics MCMC sample size = 10,000
expr1 : _ysim1_1>=22
We find that 45% of the replicates of mpg[1] are greater than 22. The reported probability of
0.45 is known as the posterior predictive p-value and is used for goodness-of-fit checking; see
[BAYES] bayesstats ppvalues.
Stored results
bayesstats summary stores the following in r():
Scalars
r(mcmcsize) MCMC sample size used in the computation
r(clevel) credible interval level
r(hpd) 1 if hpd is specified, 0 otherwise
r(batch) batch length for batch-means calculations
r(skip) number of MCMC observations to skip in the computation; every r(skip) observations
are skipped
r(corrlag) maximum autocorrelation lag
r(corrtol) autocorrelation tolerance
r(nchains) number of chains used in the computation
Macros
r(names) names of model parameters and expressions
r(expr_#) #th expression
r(exprnames) expression labels
r(chains) chains used in the computation, if chains() is specified
12 bayesstats summary — Bayesian summary statistics
Matrices
r(summary) matrix with posterior summary statistics for parameters in r(names)
r(summary_chain#) matrix with posterior summary statistics for chain #, if sepchains is specified
Methods and formulas

Most of the summary statistics employed in Bayesian analysis are based on the marginal posterior
distributions of individual model parameters or functions of model parameters.

Let θ be a scalar model parameter and {θ_t}, t = 1, ..., T, be an MCMC chain of size T drawn from
the marginal posterior distribution of θ. For a function g(θ), substitute {θ_t} with {g(θ_t)} in the
formulas below. If θ is a covariance matrix model parameter, the formulas below are applied to each
element of the lower-diagonal portion of θ.
Point estimates

Marginal posterior moments are approximated using Monte Carlo integration applied to the
simulated samples {θ_t}. The sample posterior mean and sample posterior standard deviation are
defined as follows:

    \hat{\theta} = \frac{1}{T} \sum_{t=1}^{T} \theta_t, \qquad
    \hat{s}^2 = \frac{1}{T-1} \sum_{t=1}^{T} (\theta_t - \hat{\theta})^2

where \hat{\theta} and \hat{s}^2 are sample estimators of the population posterior mean E(θ_t) and
posterior variance Var(θ_t).
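The point estimates above can be sketched directly in Python; this is an illustrative computation on a toy chain, not Stata code, and the variable names are ours.

```python
import statistics

# Sketch of the point estimates: theta_hat is the sample mean of the draws,
# and s2_hat is the sample posterior variance with 1/(T-1) normalization.
chain = [2.0, 2.5, 1.5, 3.0, 2.0]  # toy MCMC draws

theta_hat = sum(chain) / len(chain)
s2_hat = sum((t - theta_hat) ** 2 for t in chain) / (len(chain) - 1)

print(theta_hat)          # 2.2
print(round(s2_hat, 3))   # 0.325

# Cross-check against the standard-library estimator
assert abs(s2_hat - statistics.variance(chain)) < 1e-9
```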
With multiple chains, the posterior mean and standard deviation are estimated using the combined
sample of all chains, or of those requested in the chains() option, as follows. Let {θ_{jt}},
t = 1, ..., T, be the jth Markov chain, j = 1, ..., M, with sample mean \hat{\theta}_j and variance
\hat{s}_j^2. The overall sample posterior mean is

    \hat{\theta} = \frac{1}{MT} \sum_{j=1}^{M} \sum_{t=1}^{T} \theta_{jt}

and equals the average of the sample means of the individual chains. Let B and W be the respective
between-chains and within-chain variances,

    B = \frac{T}{M-1} \sum_{j=1}^{M} (\hat{\theta}_j - \hat{\theta})^2, \qquad
    W = \frac{1}{M} \sum_{j=1}^{M} \hat{s}_j^2

The pooled estimate of the posterior variance is then

    \hat{s}^2 = \frac{T-1}{T} W + \frac{1}{T} B    (1)
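The between-chains/within-chain decomposition and the pooled variance (1) can be sketched as follows. This is an illustrative Python helper with toy chains; the function name is ours.

```python
# Sketch of (1): pooled posterior variance from M chains of length T.
# B is the between-chains variance of the chain means; W is the average
# within-chain variance.
def pooled_variance(chains):
    M, T = len(chains), len(chains[0])
    means = [sum(c) / T for c in chains]
    s2 = [sum((x - m) ** 2 for x in c) / (T - 1) for c, m in zip(chains, means)]
    grand = sum(means) / M
    B = T / (M - 1) * sum((m - grand) ** 2 for m in means)
    W = sum(s2) / M
    return (T - 1) / T * W + B / T

chains = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]
print(round(pooled_variance(chains), 6))
```

When the chains agree (B near 0), the pooled estimate is slightly below the average within-chain variance; chains with differing means inflate it through B.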
When the chains are strongly stationary, \hat{s}^2 is an unbiased estimator of the marginal posterior
variance of θ (Gelman et al. 2014, sec. 11.4).

The precision of the sample posterior mean is evaluated by its standard error, also known as the
Monte Carlo standard error (MCSE). Note that MCSE cannot be estimated using the classical formula
for the standard error, \hat{s}/\sqrt{T}, because of the dependence between the θ_t's. Let

    \sigma^2 = \mathrm{Var}(\theta_t) + 2 \sum_{k=1}^{\infty} \mathrm{Cov}(\theta_t, \theta_{t+k})

Then, \sqrt{T} \times MCSE approaches \sigma asymptotically in T.
bayesstats summary provides two different approaches for estimating MCSE. Both approaches
try to adjust for the existing autocorrelation in the MCMC sample. The first one uses the so-called
effective sample size (ESS), and the second one uses batch means (Roberts 1996; Jones et al. 2006).
The ESS-based estimator for MCSE, the default in bayesstats summary, is given by

    \mathrm{MCSE}(\hat{\theta}) = \hat{s}/\sqrt{\mathrm{ESS}}

ESS is defined as

    \mathrm{ESS} = T \Big/ \Big(1 + 2 \sum_{k=1}^{\mathit{max\_lags}} \rho_k\Big)

where \rho_k is the lag-k autocorrelation, and max_lags is the maximum number of lags less than or
equal to \rho_{lag} such that for all k = 1, ..., max_lags, |\rho_k| > \rho_{tol}, where \rho_{lag}
and \rho_{tol} are specified in options corrlag() and corrtol() with the respective default values
of 500 and 0.01. \rho_k is estimated as \gamma_k/\gamma_0, where

    \gamma_k = \frac{1}{T} \sum_{t=1}^{T-k} (\theta_t - \hat{\theta})(\theta_{t+k} - \hat{\theta})

is the lag-k empirical autocovariance.
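The ESS-based MCSE can be sketched as follows. This is an illustrative Python helper following the formulas above, not Stata's implementation; the function name and the cap min(corrlag, T/2), mirroring the default described for corrlag(), are our assumptions.

```python
import math

# Sketch of the ESS-based MCSE: lag-k autocorrelations are summed until
# they fall below corrtol (default 0.01) or the corrlag cap is reached.
def ess_mcse(chain, corrlag=500, corrtol=0.01):
    T = len(chain)
    mean = sum(chain) / T
    gamma0 = sum((x - mean) ** 2 for x in chain) / T
    rho_sum = 0.0
    for k in range(1, min(corrlag, T // 2) + 1):
        gamma_k = sum((chain[t] - mean) * (chain[t + k] - mean)
                      for t in range(T - k)) / T
        rho_k = gamma_k / gamma0
        if abs(rho_k) < corrtol:
            break               # discard all lags beyond the kth
        rho_sum += rho_k
    ess = T / (1 + 2 * rho_sum)
    s = math.sqrt(sum((x - mean) ** 2 for x in chain) / (T - 1))
    return ess, s / math.sqrt(ess)

# A strongly autocorrelated chain (each value repeated 10 times) has an
# ESS far below its nominal size T = 1,000.
chain = [float(i) for i in range(100) for _ in range(10)]
ess, mcse = ess_mcse(chain)
```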
With multiple chains, the overall ESS is given by the sum of the effective sample sizes of the
individual chains. The MCSE is then calculated using the formula

    \mathrm{MCSE}(\hat{\theta}) = \hat{s} \Big/ \sqrt{\sum_{j=1}^{M} \mathrm{ESS}_j}

where \hat{s} is computed using (1) and ESS_j is the effective sample size of the jth chain.
The batch-means estimator of MCSE is obtained as follows. For a given batch length b, the
initial MCMC chain is split into m batches of size b,

    \{\theta_{j_0+1}, \ldots, \theta_{j_0+b}\},\ \{\theta_{j_0+b+1}, \ldots, \theta_{j_0+2b}\},\
    \ldots,\ \{\theta_{T-b+1}, \ldots, \theta_T\}

where j_0 = T - m \times b, and m batch means \hat{\mu}_1, ..., \hat{\mu}_m are calculated as the
sample means of each batch. m is chosen as the largest integer such that m \times b \le T. If b is
not a divisor of T, the first T - m \times b observations of the sample are not used in the
batch-means computation. The batch-means estimator of the posterior variance,
\hat{s}^2_{\mathrm{batch}}, is based on the assumption that the \hat{\mu}_j's are much less
correlated than the original sample draws.

The batch-means estimator of the posterior mean is

    \hat{\theta}_{\mathrm{batch}} = \frac{1}{m} \sum_{j=1}^{m} \hat{\mu}_j
We have \hat{\theta}_{\mathrm{batch}} = \hat{\theta} whenever m \times b = T. Under the assumption
that the batch means are uncorrelated,

    \hat{s}^2_{\mathrm{batch}} = \frac{1}{m-1} \sum_{j=1}^{m}
        (\hat{\mu}_j - \hat{\theta}_{\mathrm{batch}})^2

can be used as an estimator of \sigma^2/b. This fact justifies the batch-means estimator of MCSE
given by

    \mathrm{MCSE}_{\mathrm{batch}}(\hat{\theta}) = \frac{\hat{s}_{\mathrm{batch}}}{\sqrt{m}}

The accuracy of the batch-means estimator depends on the choice of the batch length b. The higher
the autocorrelation in the original MCMC sample, the larger the batch length b should be, provided
that the number of batches m does not become too small; \sqrt{T} is typically used as the maximum
value for b. The batch length is commonly determined by inspecting the autocorrelation plot for θ.
Under certain assumptions, Flegal and Jones (2010) establish that an asymptotically optimal batch
size is of order T^{1/3}.
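The batch-means construction above can be sketched as follows. This is an illustrative Python helper, not Stata's implementation; the function name is ours, and it requires at least two batches.

```python
import math

# Sketch of the batch-means MCSE: split the chain into m = floor(T/b)
# batches of length b (dropping the first T - m*b draws, as described),
# and scale the spread of the batch means by sqrt(m).
def batch_mcse(chain, b):
    T = len(chain)
    m = T // b                      # largest m with m*b <= T
    start = T - m * b               # leading draws left out of the computation
    means = [sum(chain[start + j*b : start + (j+1)*b]) / b for j in range(m)]
    theta_batch = sum(means) / m
    s2_batch = sum((mu - theta_batch) ** 2 for mu in means) / (m - 1)
    return math.sqrt(s2_batch / m)
```

For a chain whose batches all share the same mean, the estimate collapses to zero, which is why very small batch lengths understate the MCSE when the draws are autocorrelated.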
With multiple chains, the batch-means estimator is calculated using the combined sample of all
chains or of those that are requested in the chains() option.
Credible intervals
Let θ_(1), ..., θ_(T) be an MCMC sample ordered from smallest to largest, and let (1 − α) be a
credible level. Then, a {100 × (1 − α)}% equal-tailed credibleible interval is

    (\theta_{([T\alpha/2])},\ \theta_{([T(1-\alpha/2)])})

where θ_(j) denotes the jth order statistic and [·] in the subscripts denotes rounding to the
nearest integer.
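Both interval types can be sketched from the ordered draws. This is an illustrative Python helper; the function name and the exact index rounding are our assumptions (Stata's rounding may differ), and the HPD search follows the unimodal shortest-interval idea of Chen and Shao (1999).

```python
# Sketch of both interval types from the sorted draws: the equal-tailed
# interval takes the alpha/2 and 1-alpha/2 empirical quantiles; the HPD
# interval is the shortest interval containing a (1-alpha) share of draws.
def cred_intervals(draws, alpha=0.05):
    s = sorted(draws)
    T = len(s)
    equal_tailed = (s[int(T * alpha / 2)], s[int(T * (1 - alpha / 2)) - 1])
    keep = int((1 - alpha) * T)     # number of draws inside the HPD interval
    width, j = min((s[j + keep - 1] - s[j], j) for j in range(T - keep + 1))
    return equal_tailed, (s[j], s[j + keep - 1])

draws = [float(i) for i in range(100)]
et, hpd = cred_intervals(draws, alpha=0.1)
```

For skewed draws, the shortest-interval search shifts the HPD interval toward the mode, which is why HPD intervals are narrower than equal-tailed intervals for asymmetric posteriors.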
References
Brooks, S. P., and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal
of Computational and Graphical Statistics 7: 434–455. https://doi.org/10.1080/10618600.1998.10474787.
Chen, M.-H., and Q.-M. Shao. 1999. Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of
Computational and Graphical Statistics 8: 69–92. https://doi.org/10.2307/1390921.
Flegal, J. M., and G. L. Jones. 2010. Batch means and spectral variance estimators in Markov chain Monte Carlo.
Annals of Statistics 38: 1034–1070. https://doi.org/10.1214/09-AOS735.
Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. 2014. Bayesian Data Analysis.
3rd ed. Boca Raton, FL: Chapman and Hall/CRC.
Jones, G. L., M. Haran, B. S. Caffo, and R. Neath. 2006. Fixed-width output analysis for Markov chain Monte Carlo.
Journal of the American Statistical Association 101: 1537–1547. https://doi.org/10.1198/016214506000000492.
Roberts, G. O. 1996. Markov chain concepts related to sampling algorithms. In Markov Chain Monte Carlo in Practice,
ed. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter, 45–57. Boca Raton, FL: Chapman and Hall.
Also see
[BAYES] bayes — Bayesian regression models using the bayes prefix+
[BAYES] bayesmh — Bayesian models using Metropolis–Hastings algorithm+
[BAYES] bayesselect — Bayesian variable selection for linear regression+
[BAYES] Bayesian estimation — Bayesian estimation commands
[BAYES] Bayesian postestimation — Postestimation tools after Bayesian estimation
[BAYES] bayesgraph — Graphical summaries and convergence diagnostics
[BAYES] bayespredict — Bayesian predictions
[BAYES] bayesstats ess — Effective sample sizes and related statistics
[BAYES] bayesstats ppvalues — Bayesian predictive p-values and other predictive summaries
[BAYES] bayestest interval — Interval hypothesis testing
Stata, Stata Press, and Mata are registered trademarks of StataCorp LLC. Stata and
Stata Press are registered trademarks with the World Intellectual Property Organization
of the United Nations. StataNow and NetCourseNow are trademarks of StataCorp
LLC. Other brand and product names are registered trademarks or trademarks of their
respective companies. Copyright © 1985–2023 StataCorp LLC, College Station, TX,
USA. All rights reserved.
For suggested citations, see the FAQ on citing Stata documentation.