Beyond Mixtures and Products for Ensemble Aggregation: A Likelihood Perspective on Generalized Means
Raphaël Razafindralambo, Rémy Sun, Frédéric Precioso, Damien Garreau, Pierre-Alexandre Mattei

TL;DR
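This paper studies density aggregation in machine learning from a log-likelihood perspective. It unifies linear and geometric pooling as special cases of the normalized generalized mean of order r, and identifies r in [0, 1] as the only range that ensures systematic improvements over the individual distributions, with supporting empirical results.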
Contribution
It introduces a likelihood-based framework for generalized mean aggregation, clarifies when different pooling methods are effective, and provides theoretical and empirical validation.
Findings
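Linear pooling (r = 1) and geometric pooling (r = 0) both lie in the range of orders shown to ensure systematic improvements over individual distributions.
Aggregation with a generalized-mean order r outside [0, 1] may not yield consistent gains.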
Empirical results confirm theoretical insights on image and text benchmarks.
Abstract
Density aggregation is a central problem in machine learning, for instance when combining predictions from a Deep Ensemble. The choice of aggregation remains an open question, with two commonly proposed approaches being linear pooling (probability averaging) and geometric pooling (logit averaging). In this work, we address this question by studying the normalized generalized mean of order r through the lens of log-likelihood, the standard evaluation criterion in machine learning. This provides a unifying aggregation formalism and shows different optimal configurations for different situations. We show that the regime r ∈ [0, 1] is the only range ensuring systematic improvements relative to individual distributions, thereby providing a principled justification for the reliability and widespread practical use of linear (r = 1) and geometric…
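To make the aggregation rule in the abstract concrete, here is a minimal sketch of a normalized generalized mean over the class probabilities of several ensemble members. It is an illustration under stated assumptions, not the authors' implementation: the function name `generalized_mean_pool`, the uniform weighting of members, and the toy probabilities are all hypothetical, and member probabilities are assumed strictly positive. The case r = 0 is handled as the usual limit of the power mean, which is the geometric mean.

```python
import numpy as np


def generalized_mean_pool(probs, r):
    """Normalized generalized (power) mean of order r across ensemble members.

    probs: array-like of shape (n_members, n_classes); each row is a strictly
           positive probability vector (assumption of this sketch).
    r:     order of the mean; r = 1 is linear pooling, r = 0 geometric pooling.
    """
    probs = np.asarray(probs, dtype=float)
    if r == 0:
        # Limit r -> 0 of the power mean: exponentiated average log-probability.
        pooled = np.exp(np.log(probs).mean(axis=0))
    else:
        # Uniformly weighted power mean, computed class by class.
        pooled = np.mean(probs ** r, axis=0) ** (1.0 / r)
    # Renormalize so the aggregate is again a probability vector.
    return pooled / pooled.sum()


# Toy example: two members, three classes (made-up numbers).
members = np.array([[0.7, 0.2, 0.1],
                    [0.5, 0.3, 0.2]])

for order in (-2.0, 0.0, 0.5, 1.0, 3.0):
    aggregate = generalized_mean_pool(members, order)
    # Log-likelihood of the aggregate if class 0 were the true label.
    print(order, aggregate, np.log(aggregate[0]))
```

With r = 1 the function reduces to plain probability averaging, and with r = 0 to a renormalized geometric mean, matching the two pooling rules named in the abstract. Comparing the printed log-likelihoods with those of the individual members is one informal way to explore how the choice of r behaves on a given example.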
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
