Convergence Rates for Mixture-of-Experts

Eduardo F. Mendes; Wenxin Jiang

arXiv:1110.2058 · math.ST · November 2, 2011

TL;DR

This paper analyzes the convergence rates of mixture-of-experts models with polynomial regression experts, providing theoretical insights into optimal choices of the number of experts and their complexity for better model performance.

Contribution

It offers a theoretical study of the convergence rates of mixture-of-experts (ME) models, showing how the number of experts and their polynomial degree affect how fast the estimated density approaches the true one.

Findings

01 The convergence rate depends on both the number of experts and the complexity (polynomial degree) of each expert.

02 Certain combinations of the number of experts and the polynomial degree optimize convergence.

03 The results inform the choice of the number of experts and the balance between many simple experts and a few complex ones.

Abstract

In the mixture-of-experts (ME) model, where a number of submodels (experts) are combined, there have been two longstanding problems: (i) how many experts should be chosen, given the size of the training data? (ii) given the total number of parameters, is it better to use a few very complex experts, or to combine many simple experts? In this paper, we try to provide some insights into these problems through a theoretical study of an ME structure where $m$ experts are mixed, with each expert being related to a polynomial regression model of order $k$. We study the convergence rate of the maximum likelihood estimator (MLE), in terms of how fast the Kullback-Leibler divergence between the estimated density and the true density converges to zero as the sample size $n$ increases. The convergence rate is found to depend on both $m$ and $k$, and certain choices of $m$ and $k$ are found to…
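To make the structure in the abstract concrete, here is a minimal sketch of an ME conditional density with $m$ experts of polynomial order $k$. It is an illustration only, not the paper's exact specification: the Gaussian experts with a shared `sigma`, the softmax gate that is linear in a scalar input `x`, and every function and parameter name are assumptions introduced here. The helper `expert_parameter_count` only counts expert coefficients, to show how question (ii) trades `m` against `k` under a fixed budget.

```python
import math


def poly(coeffs, x):
    """Evaluate sum_j coeffs[j] * x**j using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def softmax(scores):
    """Numerically stable softmax; returns weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def me_density(y, x, gate_params, expert_coeffs, sigma=1.0):
    """Conditional density p(y | x) of a mixture of m experts.

    gate_params:   m (intercept, slope) pairs for an assumed linear softmax gate
    expert_coeffs: m coefficient lists, each of length k + 1 (order-k polynomial mean)
    sigma:         assumed common standard deviation of the Gaussian experts
    """
    weights = softmax([a + b * x for a, b in gate_params])
    density = 0.0
    for w, coeffs in zip(weights, expert_coeffs):
        z = (y - poly(coeffs, x)) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return density


def expert_parameter_count(m, k):
    """Number of polynomial coefficients across m experts of order k."""
    return m * (k + 1)
```

Under this counting, two experts of order 5 and six experts of order 1 both use 12 coefficients, which is the kind of "few complex versus many simple" comparison the abstract raises.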

Taxonomy

Topics: Bayesian Methods and Mixture Models · Statistical Methods and Inference · Statistical Methods and Bayesian Inference