The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE
Andrei Chernov, Oleg Novitskij

TL;DR
This paper empirically studies how reducing symmetries in neural networks affects the performance of deep ensembles and Mixture of Experts (MoE), and introduces a new architecture, the Mixture of Interpolated Experts (MoIE). Deep ensembles built on asymmetric networks show significant gains as ensemble size increases, while the results for MoE and MoIE are inconclusive.
Contribution
It provides the first empirical analysis of how symmetry reduction affects deep ensembles and MoE architectures, and introduces the Mixture of Interpolated Experts (MoIE) to explore deeper linear mode connectivity.
Findings
Deep ensembles with asymmetric networks outperform symmetric ones as ensemble size grows.
Symmetry reduction benefits are clear for deep ensembles but inconclusive for MoE and MoIE.
MoIE is introduced as a way to explore deeper linear mode connectivity.
Abstract
Recent studies have shown that reducing symmetries in neural networks enhances linear mode connectivity between networks without requiring parameter space alignment, leading to improved performance in linearly interpolated neural networks. However, in practical applications, neural network interpolation is rarely used; instead, ensembles of networks are more common. In this paper, we empirically investigate the impact of reducing symmetries on the performance of deep ensembles and Mixture of Experts (MoE) across five datasets. Additionally, to explore deeper linear mode connectivity, we introduce the Mixture of Interpolated Experts (MoIE). Our results show that deep ensembles built on asymmetric neural networks achieve significantly better performance as ensemble size increases compared to their symmetric counterparts. In contrast, our experiments do not provide conclusive evidence on…
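The abstract contrasts two ways of combining trained networks: linear interpolation of their parameters and ensembling of their predictions. The following is a minimal NumPy sketch of that distinction, not the authors' implementation. The two-layer MLP, the helper names (`init_params`, `mlp_forward`, `interpolate_params`, `ensemble_predict`), and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

def init_params(rng, d_in=4, d_hidden=8, n_classes=3):
    """Random parameters for a toy two-layer MLP (hypothetical stand-in for a trained net)."""
    return [
        rng.normal(size=(d_in, d_hidden)),
        np.zeros(d_hidden),
        rng.normal(size=(d_hidden, n_classes)),
        np.zeros(n_classes),
    ]

def mlp_forward(params, x):
    """Forward pass returning class probabilities via a numerically stable softmax."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def interpolate_params(params_a, params_b, alpha):
    """Linear interpolation in parameter space: (1 - alpha) * A + alpha * B."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(params_a, params_b)]

def ensemble_predict(members, x):
    """Deep-ensemble style prediction: average the members' output probabilities."""
    return np.mean([mlp_forward(p, x) for p in members], axis=0)

rng = np.random.default_rng(0)
net_a = init_params(rng)
net_b = init_params(rng)
x = rng.normal(size=(5, 4))

# One network whose weights are the midpoint of A and B.
interp_probs = mlp_forward(interpolate_params(net_a, net_b, 0.5), x)
# Two separate networks whose predictions are averaged.
ens_probs = ensemble_predict([net_a, net_b], x)
```

Interpolation yields a single network and only works well when the two solutions are linearly mode connected, whereas the ensemble keeps every member and averages in output space, which is why the two can behave differently.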
Taxonomy
Methods: Mixture of Experts · Deep Ensembles
