Beyond Mixtures and Products for Ensemble Aggregation: A Likelihood Perspective on Generalized Means

Raphaël Razafindralambo; Rémy Sun; Frédéric Precioso; Damien Garreau; Pierre-Alexandre Mattei

arXiv:2603.04204·stat.ML·March 5, 2026

Open Access

TL;DR

This paper studies density aggregation in machine learning from a likelihood perspective. It unifies common pooling methods as normalized generalized means of order $r$ and identifies $r \in [0, 1]$ as the only range that ensures systematic improvements over individual distributions, with supporting empirical results.

Contribution

It introduces a likelihood-based framework for generalized mean aggregation, clarifies when different pooling methods are effective, and provides theoretical and empirical validation.

Findings

01

Linear ($r = 1$) and geometric ($r = 0$) pooling both fall in the range $r \in [0, 1]$, which ensures systematic improvements over individual distributions.

02

Aggregation with r outside [0,1] may not yield consistent gains.

03

Empirical results confirm theoretical insights on image and text benchmarks.
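
The two endpoint cases in the first finding can be checked by hand. The sketch below assumes that "systematic improvement" means doing at least as well as the weighted average of the members' log-likelihoods; that reading, and the symbols $p_i$, $w_i$, $q_r$ and $Z$, are introduced here for illustration and are not taken from the paper.

```latex
% Members p_1, ..., p_n with weights w_i >= 0, \sum_i w_i = 1.

% r = 1 (linear pooling): Jensen's inequality for the concave logarithm.
\log q_1(y) = \log \sum_{i=1}^{n} w_i \, p_i(y)
  \;\ge\; \sum_{i=1}^{n} w_i \log p_i(y)

% r = 0 (geometric pooling): the normalizer Z is at most 1 by weighted AM-GM.
q_0(y) = \frac{1}{Z} \prod_{i=1}^{n} p_i(y)^{w_i}, \qquad
Z = \sum_{y'} \prod_{i=1}^{n} p_i(y')^{w_i}
  \;\le\; \sum_{y'} \sum_{i=1}^{n} w_i \, p_i(y') = 1

% Hence -\log Z >= 0 and the pooled log-likelihood is no worse than the average.
\log q_0(y) = \sum_{i=1}^{n} w_i \log p_i(y) - \log Z
  \;\ge\; \sum_{i=1}^{n} w_i \log p_i(y)
```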

Abstract

Density aggregation is a central problem in machine learning, for instance when combining predictions from a Deep Ensemble. The choice of aggregation remains an open question with two commonly proposed approaches being linear pooling (probability averaging) and geometric pooling (logit averaging). In this work, we address this question by studying the normalized generalized mean of order $r \in \mathbb{R} \cup \{-\infty, +\infty\}$ through the lens of log-likelihood, the standard evaluation criterion in machine learning. This provides a unifying aggregation formalism and shows different optimal configurations for different situations. We show that the regime $r \in [0, 1]$ is the only range ensuring systematic improvements relative to individual distributions, thereby providing a principled justification for the reliability and widespread practical use of linear ($r = 1$) and geometric…
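
To make the normalized generalized mean concrete, here is a minimal Python sketch for discrete class probabilities. It assumes uniform member weights and strictly positive probabilities. The function name `generalized_mean_pool` and the toy `members` data are hypothetical, chosen for illustration, and do not come from the paper.

```python
import math


def generalized_mean_pool(probs, r):
    """Normalized generalized mean of order r across ensemble members.

    probs: list of member distributions, each a list of strictly positive
           class probabilities summing to 1 (uniform member weights assumed).
    r:     order of the mean; 0 gives the geometric mean, +/-inf give max/min.
    """
    n_members = len(probs)
    n_classes = len(probs[0])
    pooled = []
    for k in range(n_classes):
        column = [p[k] for p in probs]
        if r == 0:
            # Limit r -> 0: geometric mean, i.e. exp of the average log-probability.
            value = math.exp(sum(math.log(v) for v in column) / n_members)
        elif r == math.inf:
            value = max(column)
        elif r == -math.inf:
            value = min(column)
        else:
            value = (sum(v ** r for v in column) / n_members) ** (1.0 / r)
        pooled.append(value)
    # Renormalize so the pooled scores form a probability distribution.
    total = sum(pooled)
    return [v / total for v in pooled]


# Toy two-member ensemble over three classes (made-up numbers).
members = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
linear = generalized_mean_pool(members, 1)      # probability averaging
geometric = generalized_mean_pool(members, 0)   # normalized geometric mean
```

With $r = 1$ the normalization step is a no-op, because an average of distributions already sums to one. For every other order it is what turns the pooled scores back into a distribution.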

Taxonomy

Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques