Keywords

Bayes, statistics

Abstract

This article shows how to compute confidence intervals for the mean, standard deviation, and variance using Bayesian methods. The method is implemented in SciPy as scipy.stats.bayes_mvs. After reviewing some classical estimators for the mean, variance, and standard deviation, and showing that unbiased estimates are not usually desirable, a Bayesian perspective is employed to determine what is known about the mean, variance, and standard deviation given only that the data set in fact has a common mean and variance. The maximum-entropy principle is used to argue that the likelihood function in this situation should be the same as if the data were independent and identically distributed Gaussian. A non-informative prior is derived for the mean and variance, and Bayes' rule is used to compute the posterior probability density function (PDF) of \left(\mu,\sigma\right) as well as \left(\mu,\sigma^{2}\right) in terms of the sufficient statistics \bar{x}=\frac{1}{n}\sum_{i}x_{i} and C=\frac{1}{n}\sum_{i}\left(x_{i}-\bar{x}\right)^{2}. From the joint distribution, the marginals are determined. It is shown that \left(\frac{\mu-\bar{x}}{\sqrt{C}}\right)\sqrt{n-1} is distributed as Student-t with n-1 degrees of freedom, \sigma\sqrt{\frac{2}{nC}} is distributed as generalized-gamma with c=-2 and a=\frac{n-1}{2}, and \sigma^{2}\frac{2}{nC} is distributed as inverted-gamma with a=\frac{n-1}{2}. The recommendation is to report the mean of each of these distributions as the estimate (or the peak if n is too small for the mean to be defined), together with a confidence interval surrounding the median.
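As a minimal sketch of the result stated above, the following Python snippet builds the three marginal posteriors directly from \bar{x} and C and compares them with the output of scipy.stats.bayes_mvs. The sample values, the variable names (mean_post, std_post, var_post), and the 90% level are illustrative assumptions, not taken from the article; the location and scale arguments are obtained by solving each scaled quantity in the abstract for \mu, \sigma, and \sigma^{2}.

```python
import numpy as np
from scipy import stats

# Illustrative sample (an assumption for this sketch, not data from the article).
data = np.array([6.0, 9.0, 12.0, 7.0, 8.0, 8.0, 13.0])

n = data.size
xbar = data.mean()   # sufficient statistic x-bar
C = data.var()       # ddof=0, i.e. (1/n) * sum((x_i - xbar)**2)

# (mu - xbar) / sqrt(C) * sqrt(n - 1) ~ Student-t with n - 1 degrees of freedom
mean_post = stats.t(n - 1, loc=xbar, scale=np.sqrt(C / (n - 1)))

# sigma * sqrt(2 / (n C)) ~ generalized gamma with a = (n - 1) / 2, c = -2
std_post = stats.gengamma((n - 1) / 2, -2, scale=np.sqrt(n * C / 2))

# sigma^2 * 2 / (n C) ~ inverted gamma with a = (n - 1) / 2
var_post = stats.invgamma((n - 1) / 2, scale=n * C / 2)

alpha = 0.90  # probability mass inside each reported interval

# Equal-tailed intervals (centered on the median) from the hand-built posteriors.
mean_interval = mean_post.interval(alpha)
var_interval = var_post.interval(alpha)
std_interval = std_post.interval(alpha)

# The packaged routine returns (estimate, (lower, upper)) for mean, variance, std.
mean_res, var_res, std_res = stats.bayes_mvs(data, alpha=alpha)

for name, post, interval, res in [
    ("mean", mean_post, mean_interval, mean_res),
    ("variance", var_post, var_interval, var_res),
    ("std dev", std_post, std_interval, std_res),
]:
    print(name, "direct:", post.mean(), interval)
    print(name, "bayes_mvs:", res.statistic, res.minmax)
```

Each distribution's mean serves as the point estimate and interval(alpha) gives the equal-tailed interval around the median, which mirrors the reporting convention suggested in the abstract. With n = 7 all three posterior means are defined; for very small n the mean of the variance posterior does not exist, which is the case where the abstract suggests reporting the peak instead.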