Disentangling Model Multiplicity in Deep Learning
Ari Heljakka, Martin Trapp, Juho Kannala, Arno Solin

TL;DR
This paper investigates model multiplicity in deep learning: models with similar training performance can differ both in their internal representations and in their predictions, which affects how well they generalize.
Contribution
The paper introduces a conceptual and experimental setup for analyzing representational multiplicity by measuring activation similarity with SVCCA, and examines how training methods influence this multiplicity and how it relates to predictive multiplicity.
Findings
Training methods systematically affect representational multiplicity.
Representational multiplicity correlates with predictive multiplicity in test predictions.
Measuring multiplicity can help understand and predict model generalization.
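To make the notion of predictive multiplicity in the findings above concrete, here is a minimal sketch that scores a set of models by how much their test predictions vary. The function names, the array layout, and the choice of averaging the per-class variance are assumptions made for illustration; the paper's exact estimator is not given in this summary.

```python
import numpy as np

def predictive_multiplicity(probs):
    """Mean variance of predicted probabilities across models.

    probs: array of shape (n_models, n_samples, n_classes).
    NOTE: averaging over samples and classes is an illustrative assumption,
    not necessarily the estimator used in the paper.
    """
    probs = np.asarray(probs, dtype=float)
    # Variance is taken over the model axis, then averaged over everything else.
    return float(probs.var(axis=0).mean())

def disagreement_rate(probs):
    """Fraction of samples on which at least one model's top-1 label
    differs from the first model's top-1 label."""
    labels = np.asarray(probs).argmax(axis=-1)  # (n_models, n_samples)
    return float((labels != labels[0]).any(axis=0).mean())

# Two hypothetical models, two samples, two classes.
example = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # model 1
    [[1.0, 0.0], [1.0, 0.0]],  # model 2 disagrees on the second sample
])
pm = predictive_multiplicity(example)
dr = disagreement_rate(example)
```

Both scores are zero when all models predict identically and grow as the models diverge, so they can be computed separately on i.i.d. and out-of-distribution test sets and compared.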
Abstract
Model multiplicity is a well-known but poorly understood phenomenon that undermines the generalisation guarantees of machine learning models. It appears when two models with similar training-time performance differ in their predictions and real-world performance characteristics. This observed 'predictive' multiplicity (PM) also implies elusive differences in the internals of the models, their 'representational' multiplicity (RM). We introduce a conceptual and experimental setup for analysing RM by measuring activation similarity via singular vector canonical correlation analysis (SVCCA). We show that certain differences in training methods systematically result in larger RM than others and evaluate RM and PM over a finite sample as predictors for generalizability. We further correlate RM with PM measured by the variance in i.i.d. and out-of-distribution test predictions in four standard…
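The SVCCA measurement mentioned in the abstract can be sketched in a few lines of NumPy. This is a generic illustration of the technique (SVD to keep the dominant activation directions, then canonical correlation analysis between the two reduced subspaces), not the authors' implementation; the function name, the `var_kept` threshold, and the use of the mean canonical correlation as the final score are assumptions.

```python
import numpy as np

def svcca_similarity(acts_a, acts_b, var_kept=0.99):
    """Mean canonical correlation between two activation matrices.

    acts_a, acts_b: arrays of shape (n_samples, n_neurons), recorded from two
    models on the same inputs. var_kept is a hypothetical default threshold.
    """
    def reduce(acts):
        # Center each neuron, then keep the top singular directions that
        # explain at least var_kept of the total variance.
        centered = acts - acts.mean(axis=0, keepdims=True)
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
        k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
        return u[:, :k] * s[:k]

    xa, xb = reduce(np.asarray(acts_a, float)), reduce(np.asarray(acts_b, float))
    # CCA via orthonormal bases: the canonical correlations are the
    # singular values of Qa^T Qb.
    qa, _ = np.linalg.qr(xa)
    qb, _ = np.linalg.qr(xb)
    rho = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())

rng = np.random.default_rng(0)
a = rng.normal(size=(500, 10))
mix = rng.normal(size=(10, 10))          # random invertible linear map
same_subspace = svcca_similarity(a, a @ mix, var_kept=1.0)
unrelated = svcca_similarity(a, rng.normal(size=(500, 10)))
```

Because CCA is invariant to invertible linear transformations, two layers that encode the same information in different bases score close to 1, while unrelated activations score much lower. One possible reading of representational multiplicity is then one minus this similarity, averaged over pairs of trained models.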
Taxonomy
Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Face and Expression Recognition
Methods: Test
