Comment on "Optimal prior for Bayesian inference in a constrained parameter space" by S. Hannestad and T. Tram, arXiv:1710.08899

Robert D. Cousins

arXiv:1902.07667·astro-ph.CO·February 21, 2019


TL;DR

This paper clarifies that the Jeffreys prior remains unchanged when applied to constrained parameter spaces, countering previous claims that it differs from the unconstrained case.

Contribution

It provides a correction to prior assertions, demonstrating that the Jeffreys prior is invariant under parameter constraints.

Findings

01 The Jeffreys prior is the same for constrained and unconstrained spaces

02 Counter to previous claims, the prior remains unchanged under constraints

03 Clarifies the correct application of the Jeffreys prior in constrained Bayesian inference

Abstract

The Jeffreys prior for a constrained part of a parameter space is the same as that for the unconstrained space, contrary to the assertions of Hannestad and Tram.


Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Markov Chains and Monte Carlo Methods

Full text

Comment on “Optimal prior for Bayesian inference in a constrained parameter space” by S. Hannestad and T. Tram, arXiv:1710.08899

Robert D. Cousins

Dept. of Physics and Astronomy, University of California, Los Angeles, California 90095 USA

(February 20, 2019)

Abstract: The Jeffreys prior for a constrained part of a parameter space is the same as that for the unconstrained space, contrary to the assertions of Hannestad and Tram.

An arXiv post by Hannestad and Tram [1] states,

Under the assumption that the likelihood is a Gaussian distribution, the Jeffreys prior is a constant, i.e. flat. However, if one parameter is constrained by physical considerations, the Gaussian approximation fails and the flat prior is no longer the Jeffreys prior…In this paper we compute the correct Jeffreys prior for a multivariate normal distribution constrained in one dimension…

When this was pointed out to me last week, I thought it odd, since I have long been teaching (including recently in Section 5.3 of Ref. [2]) that such a constraint does not modify the Jeffreys prior, based on my memory of Jim Berger’s book [3]. Since this specific result of Ref. [1] is being cited without criticism [4, 5, 6, 7, 8], it is worth opening Berger’s book and finding on page 89:

One important feature of the Jeffreys noninformative prior is that it is not affected by a restriction on the parameter space. Thus, if it is known in Example 5 that $\theta>0$, the Jeffreys noninformative prior is still $\pi(\theta)=1$ (on $(0,\infty)$, of course). This is important, because one of the situations in which noninformative priors prove to be extremely useful is when dealing with restricted parameter spaces (see Chapter 4). In such situations we will, therefore, simply assume that the noninformative prior is that which is inherited from the unrestricted parameter space.

(Example 5 is for a general location parameter, which includes the Gaussian case of Ref. [1].) As the Jeffreys prior is problematic with more than one parameter, “reference priors” were developed by Bernardo and Berger. According to Ref. [9], in general they share this important feature:

Another common type of partial information is constraints on the parameter space. This is typically easily handled, however, in that reference priors for a constrained space are almost always just the unconstrained reference prior times the indicator function on the constrained space.

It appears that Hannestad and Tram have fallen into a trap that I discuss in Section 6.9 of Ref. [2], namely the “Famous confusion re Gaussian $p(x|\mu)$ where $\mu$ is mass $\geq 0$.” That is, they view the sampled value $x$ (called $q$ in their paper) as an estimate of $\mu$ and thus consider only $x\geq 0$. In truth, there is nothing anomalous about a sampled $x<0$, and the computation of the expectation value over $x$ should not be restricted to $x\geq 0$. Once this is understood (that restrictions in the parameter space do not restrict the sample space just because one may like to think of $x$ as the estimate or “measured value” of the parameter), the quoted passage from Berger’s book becomes obvious. If one still has trouble distinguishing the sampled value of $x$ from the point estimate of $\mu$, Section 6.4 of Ref. [2] has an instructive example. As noted at the end of Section 6.8 of Ref. [2], the confusion is also avoided if $x$ and $\mu$ have different units or dimensions.
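The point about the sample space can be checked numerically. The sketch below is an illustration only, not a reproduction of the calculation in Ref. [1]. It assumes a one-dimensional Gaussian $p(x|\mu)$ with known $\sigma=1$ and evaluates the Fisher information $I(\mu)=E[(\partial \ln p/\partial\mu)^2]$ in two ways: integrating over all $x$, and (as a hypothetical mistaken variant) integrating only over $x\geq 0$. The function names, grid size, and integration range are choices made for this example.

```python
import numpy as np


def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def fisher_information(mu, sigma=1.0, x_min=None, n=200_001, half_width=12.0):
    """Riemann-sum estimate of E[(d ln p / d mu)^2] for p(x|mu) Gaussian.

    If x_min is given, the integral over x is cut off below x_min. That cut
    is the hypothetical mistake being illustrated: it restricts the SAMPLE
    space because of a constraint that applies only to the PARAMETER.
    """
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n)
    dx = x[1] - x[0]
    score = (x - mu) / sigma**2               # d ln p(x|mu) / d mu
    integrand = score**2 * gaussian_pdf(x, mu, sigma)
    if x_min is not None:
        integrand = np.where(x >= x_min, integrand, 0.0)
    return float(np.sum(integrand) * dx)


# Parameter values inside the physically allowed region mu >= 0.
mus = [0.0, 0.5, 1.0, 3.0]

# Expectation over the full sample space: independent of mu.
full = [fisher_information(mu) for mu in mus]

# Expectation wrongly restricted to x >= 0: depends on mu.
cut = [fisher_information(mu, x_min=0.0) for mu in mus]

for mu, i_full, i_cut in zip(mus, full, cut):
    # The Jeffreys prior is proportional to sqrt(I(mu)).
    print(f"mu={mu:3.1f}  sqrt(I) full={np.sqrt(i_full):.4f}  x>=0 only={np.sqrt(i_cut):.4f}")
```

With the full sample space, $I(\mu)=1/\sigma^2$ for every $\mu$, so the Jeffreys prior is flat on $\mu\geq 0$ exactly as in the unconstrained case. Only the variant that discards $x<0$ produces a $\mu$-dependent result.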

In any case, it would seem that this result of Hannestad and Tram is contradicted by well-established professional statistics literature.

Acknowledgments

I thank Maurizio Pierini for pointing me to Ref. [1]. This work was partially supported by the U.S. Department of Energy under Award Number DE–SC0009937.

Bibliography


[1] Steen Hannestad and Thomas Tram, “Optimal prior for Bayesian inference in a constrained parameter space,” arXiv:1710.08899 [astro-ph.CO].
[2] Robert D. Cousins, “Lectures on Statistics in Theory: Prelude to Statistics in Practice,” arXiv:1807.05996 [physics.data-an].
[3] James O. Berger, Statistical Decision Theory and Bayesian Analysis, Springer Series in Statistics, 2nd edition (New York: Springer, 1985). See page 89.
[4] S. Casas, M. Pauly and J. Rubio, “Higgs-dilaton cosmology: An inflation–dark-energy connection and forecasts for future galaxy surveys,” Phys. Rev. D 97 043520 (2018) doi:10.1103/PhysRevD.97.043520 [arXiv:1712.04956 [astro-ph.CO]].
[5] S. Gariazzo, M. Archidiacono, P. F. de Salas, O. Mena, C. A. Ternes and M. Tortola, “Neutrino masses and their ordering: Global Data, Priors and Models,” JCAP 1803 011 (2018) doi:10.1088/1475-7516/2018/03/011 [arXiv:1801.04946 [hep-ph]].
[6] P. F. de Salas, S. Gariazzo, M. Laveder, S. Pastor, O. Pisanti and N. Truong, “Cosmological bounds on neutrino statistics,” JCAP 1803 050 (2018) doi:10.1088/1475-7516/2018/03/050 [arXiv:1802.04639 [astro-ph.CO]].
[7] R. Reischke, A. Spurio Mancini, B. M. Schäfer and P. M. Merkel, “Investigating scalar-tensor-gravity with statistics of the cosmic large-scale structure,” Mon. Not. Roy. Astron. Soc. 482 3274 (2019) doi:10.1093/mnras/sty2919 [arXiv:1804.02441 [astro-ph.CO]].
[8] J. A. D. Diacoumis and Y. Y. Y. Wong, “Prior dependence of cosmological constraints on dark matter-radiation interactions,” arXiv:1811.11408 [astro-ph.CO].