Bayesian Hierarchical Models and the Maximum Entropy Principle

Brendon J. Brewer

arXiv:2603.10252·stat.ML·May 1, 2026

TL;DR

This paper shows that when the prior given the hyperparameters in a Bayesian hierarchical model is a maximum entropy distribution, the resulting marginal prior also has a maximum entropy property, which clarifies what information the model assumes.

Contribution

The paper demonstrates that when the prior given the hyperparameters is a canonical distribution (a maximum entropy distribution with moment constraints), the dependent marginal prior also has a maximum entropy property, under a different constraint.

Findings

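01 The marginal prior for the parameters, obtained by integrating over the hyperparameters, inherits a maximum entropy property.

02 The constraint behind that property is on the marginal distribution of some function of the unknown quantities.

03 The results clarify what information is being assumed when a hierarchical Bayesian model is assigned.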

Abstract

Bayesian hierarchical models are frequently used in practical data analysis contexts. One interpretation of these models is that they provide an indirect way of assigning a prior for unknown parameters, through the introduction of hyperparameters. The resulting marginal prior for the parameters (integrating over the hyperparameters) is usually dependent, so that learning one parameter provides some information about the others. In this contribution, I will demonstrate that, when the prior given the hyperparameters is a canonical distribution (a maximum entropy distribution with moment constraints), the dependent marginal prior also has a maximum entropy property, with a different constraint. This constraint is on the marginal distribution of some function of the unknown quantities. The results shed light on what information is actually being assumed when we assign a hierarchical model.
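The claim in the abstract can be checked numerically on a small discrete example. The sketch below is an illustration under assumed choices, none of which come from the paper: the unknowns are n binary values, the base measure q(x) is uniform, the constrained function is f(x) = sum(x), and the hyperprior over theta is an arbitrary four-point distribution. It builds the marginal prior by summing canonical distributions over theta, then compares it with the distribution that maximizes entropy relative to q(x) subject to f(x) having a prescribed marginal distribution, which takes the form q(x) P(f(x)) / Q(f(x)).

```python
import itertools
import numpy as np

# Hypothetical toy setup (not from the paper): x is a vector of n binary values,
# the base measure q(x) is uniform, and the constrained function is f(x) = sum(x).
n = 4
xs = np.array(list(itertools.product([0, 1], repeat=n)))  # all 2**n states
f = xs.sum(axis=1)
q = np.full(len(xs), 1.0 / len(xs))

# Assumed discrete hyperprior over the hyperparameter theta.
thetas = np.array([-1.0, 0.0, 0.5, 2.0])
p_theta = np.array([0.1, 0.4, 0.3, 0.2])


def canonical(theta):
    """Canonical distribution p(x | theta), proportional to q(x) * exp(-theta * f(x))."""
    w = q * np.exp(-theta * f)
    return w / w.sum()


# Marginal prior: sum the hyperparameter out.
p_marginal = sum(pt * canonical(t) for pt, t in zip(p_theta, thetas))

# Distribution of F = f(x) under the marginal prior and under q alone.
values = np.arange(n + 1)
P_F = np.array([p_marginal[f == v].sum() for v in values])
Q_F = np.array([q[f == v].sum() for v in values])

# Maximum relative entropy distribution subject to "F has distribution P_F":
# keep q's conditional distribution given F and impose the required marginal for F.
p_maxent = q * P_F[f] / Q_F[f]
matches = np.allclose(p_marginal, p_maxent)

# Dependence check: the joint probability of x1 = 1 and x2 = 1
# differs from the product of the two single-variable probabilities.
p1 = p_marginal[xs[:, 0] == 1].sum()
p2 = p_marginal[xs[:, 1] == 1].sum()
p12 = p_marginal[(xs[:, 0] == 1) & (xs[:, 1] == 1)].sum()

print("marginal prior equals maxent solution:", matches)
print("p(x1=1, x2=1) - p(x1=1) p(x2=1) =", p12 - p1 * p2)
```

The agreement holds here because the marginal prior depends on x only through q(x) and f(x), so it already has the form of the constrained maximum entropy solution. The second printed value is nonzero, which illustrates the dependence between parameters that the abstract describes.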
