Improving Minimax Estimation Rates for Contaminated Mixture of Multinomial Logistic Experts via Expert Heterogeneity
Fanqi Yan, Dung Le, Trang Pham, Huy Nguyen, Nhat Ho

TL;DR
This paper establishes the first theoretical convergence analysis for contaminated mixture of multinomial logistic experts in classification, demonstrating that expert heterogeneity improves estimation efficiency and achieves minimax optimal rates.
Contribution
It provides the first convergence analysis for contaminated MoE in classification and shows expert heterogeneity enhances estimation speed and sample efficiency.
Findings
Heterogeneous experts lead to faster convergence rates.
The estimation rates are shown to be minimax optimal.
The analysis provides the first theoretical foundation for contaminated MoE in classification.
Abstract
Contaminated mixture of experts (MoE) is motivated by transfer learning methods where a pre-trained model, acting as a frozen expert, is integrated with an adapter model, functioning as a trainable expert, in order to learn a new task. Despite recent efforts to analyze the convergence behavior of parameter estimation in this model, there are still two unresolved problems in the literature. First, the contaminated MoE model has been studied solely in regression settings, while its theoretical foundation in classification settings remains absent. Second, previous works on MoE models for classification capture pointwise convergence rates for parameter estimation without any guarantee of minimax optimality. In this work, we close these gaps by performing, for the first time, the convergence analysis of a contaminated mixture of multinomial logistic experts with homogeneous and heterogeneous…
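To make the model in the abstract concrete, the following is a minimal sketch of how a frozen expert and a trainable adapter expert could be combined into class probabilities. It is an illustration under stated assumptions, not the paper's exact formulation: the constant mixing weight `mix`, the function names, and the linear-softmax parameterization of each expert are all hypothetical choices made here.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expert_probs(x, weights, biases):
    # Multinomial logistic expert (assumed form):
    # the score for class k is <weights[k], x> + biases[k].
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return softmax(scores)

def contaminated_moe_probs(x, frozen, adapter, mix):
    # Assumed mixture: (1 - mix) * frozen expert + mix * adapter expert.
    # `frozen` holds fixed (pre-trained) parameters; only `adapter` and
    # `mix` would be estimated from data in this sketch.
    p_frozen = expert_probs(x, *frozen)
    p_adapter = expert_probs(x, *adapter)
    return [(1 - mix) * a + mix * b for a, b in zip(p_frozen, p_adapter)]

# Hypothetical parameters: 2 input features, 3 classes.
frozen = ([[0.5, 0.1], [-0.3, 0.2], [0.0, 0.0]], [0.0, 0.1, -0.1])
adapter = ([[1.0, -1.0], [0.2, 0.4], [-0.5, 0.3]], [0.2, 0.0, -0.2])
probs = contaminated_moe_probs([1.0, -2.0], frozen, adapter, mix=0.3)
```

With `mix=0` the output reduces to the frozen expert alone, and with `mix=1` to the adapter alone. In this sketch, the homogeneous case would correspond to both experts sharing the same functional form, as they do above.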
Taxonomy
Topics: Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing · Machine Learning and Algorithms
