Manifold Metric: A Loss Landscape Approach for Predicting Model Performance
Pranshu Malviya, Jerry Huang, Aristide Baratin, Quentin Fournier, Sarath Chandar

TL;DR
This paper introduces a novel manifold metric based on loss landscape geometry to predict model performance and guide model expansion, reducing the need for extensive training.
Contribution
The paper proposes a new loss-landscape-based metric for evaluating the impact of model expansion; the metric outperforms existing methods at predicting performance gains.
Findings
Strong correlation between the metric and performance improvements
Metric outperforms baseline methods across different expansion types
First step towards geometry-driven model expansion strategies
Abstract
Determining the optimal model for a given task often requires training multiple models from scratch, which becomes impractical as dataset and model sizes grow. A more efficient alternative is to expand smaller pre-trained models, but this approach is underutilized due to a limited understanding of its impact on the training dynamics. Existing methods for quantifying this impact have notable limitations, including high computational cost. To address this, we introduce a new perspective based on the loss landscape, which has been shown to contain a manifold of linearly connected minima. Specifically, we propose a metric that estimates the size of this manifold to study the impact of model expansion. Our experiments reveal a strong correlation between performance gains and our manifold metric, enabling more informed model comparison and offering a first step toward a geometry-driven approach for…
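To make the phrase "linearly connected minima" concrete, here is a minimal sketch of a loss-barrier check along the straight line between two parameter vectors. It is not the paper's manifold metric, whose definition is not given in this summary; the function names (`loss_barrier`, `valley_loss`, `ring_loss`) and the toy two-parameter losses are hypothetical, chosen only for illustration. The barrier used here is one common definition: the largest gap between the loss on the interpolation path and the linear interpolation of the endpoint losses.

```python
import numpy as np


def loss_barrier(loss_fn, w_a, w_b, num_points=11):
    """Largest excess loss along the straight line from w_a to w_b.

    A barrier near zero suggests the two minima are linearly connected.
    (Hypothetical helper; not the metric proposed in the paper.)
    """
    ts = np.linspace(0.0, 1.0, num_points)
    # Loss evaluated at each interpolated parameter vector.
    path_losses = np.array([loss_fn((1 - t) * w_a + t * w_b) for t in ts])
    # Linear interpolation of the two endpoint losses.
    baseline = (1 - ts) * loss_fn(w_a) + ts * loss_fn(w_b)
    return float(np.max(path_losses - baseline))


def valley_loss(w):
    # Toy loss with a flat valley of minima along the axis w[1] = 0.
    return float(w[1] ** 2)


def ring_loss(w):
    # Toy loss whose minima form the unit circle; the straight line
    # between opposite points leaves the set of minima.
    return float((w[0] ** 2 + w[1] ** 2 - 1.0) ** 2)


w_a = np.array([-1.0, 0.0])
w_b = np.array([1.0, 0.0])

valley_barrier = loss_barrier(valley_loss, w_a, w_b)
ring_barrier = loss_barrier(ring_loss, w_a, w_b)
print(f"valley barrier: {valley_barrier:.3f}")
print(f"ring barrier:   {ring_barrier:.3f}")
```

In this toy setting both endpoints are minima of both losses, but only the valley loss keeps the whole segment at zero loss. For a real network, `loss_fn` would evaluate the model on held-out data with the interpolated weights, which is an assumption about usage rather than something this summary specifies.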
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
