Bayesian Hierarchical Mixtures of Experts
Christopher M. Bishop, Markus Svensen

TL;DR
This paper introduces a fully Bayesian approach to Hierarchical Mixture of Experts models using variational inference, enabling better regularization and model selection for regression tasks.
Contribution
It develops a novel Bayesian variational inference framework for HME models, improving over previous ad-hoc methods and allowing for principled model complexity control.
Findings
The Bayesian HME model effectively prevents overfitting.
The variational lower bound guides model order selection.
Application to robot arm data demonstrates practical utility.
Abstract
The Hierarchical Mixture of Experts (HME) is a well-known tree-based model for regression and classification, based on soft probabilistic splits. In its original formulation it was trained by maximum likelihood, and is therefore prone to over-fitting. Furthermore, the maximum likelihood framework offers no natural metric for optimizing the complexity and structure of the tree. Previous attempts to provide a Bayesian treatment of the HME model have relied either on ad-hoc local Gaussian approximations or have dealt with related models representing the joint distribution of both input and output variables. In this paper we describe a fully Bayesian treatment of the HME model based on variational inference. By combining local and global variational methods we obtain a rigorous lower bound on the marginal probability of the data under the model. This bound is optimized during the training…
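To make the "soft probabilistic splits" concrete, here is a minimal sketch of the predictive mean of a small HME for regression. It is an illustration only, not the paper's implementation: the depth-2 binary tree, the logistic-sigmoid gates, the linear experts, and the names `hme_predict`, `gate_weights`, and `expert_weights` are all assumptions, and the code uses fixed point-estimate weights rather than the variational posterior the paper describes.

```python
import numpy as np


def sigmoid(a):
    """Logistic sigmoid, used here as the soft split at each gating node."""
    return 1.0 / (1.0 + np.exp(-a))


def hme_predict(x, gate_weights, expert_weights):
    """Predictive mean of a depth-2 binary HME with linear experts.

    Assumed (hypothetical) layout:
      x              : (D,) input vector, including any bias feature
      gate_weights   : (3, D) rows for the root, left-child, right-child gates
      expert_weights : (4, D) one linear expert per leaf
    Returns the leaf mixing coefficients and the mixture mean.
    """
    g_root = sigmoid(gate_weights[0] @ x)   # probability of branching left at the root
    g_left = sigmoid(gate_weights[1] @ x)   # split inside the left subtree
    g_right = sigmoid(gate_weights[2] @ x)  # split inside the right subtree

    # Each leaf's weight is the product of branch probabilities along its path.
    mixing = np.array([
        g_root * g_left,
        g_root * (1.0 - g_left),
        (1.0 - g_root) * g_right,
        (1.0 - g_root) * (1.0 - g_right),
    ])

    expert_means = expert_weights @ x       # linear prediction from each expert
    return mixing, float(mixing @ expert_means)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, 0.5])                # bias feature plus one input
    gates = rng.normal(size=(3, 2))
    experts = rng.normal(size=(4, 2))
    mixing, mean = hme_predict(x, gates, experts)
    print("mixing coefficients:", mixing, "sum:", mixing.sum())
    print("predictive mean:", mean)
```

Because every gate outputs a probability rather than a hard decision, the mixing coefficients always sum to one and every expert contributes to each prediction, which is what distinguishes the HME from a hard decision tree.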
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification
