Making Foundation Models Probabilistic via Singular Value Ensembles
Mehmet Ozgur Turkoglu, Dominik J. Mühlematter, Alexander Becker, Konrad Schindler, Helge Aasen

TL;DR
This paper introduces Singular Value Ensemble (SVE), a parameter-efficient implicit ensemble method that enhances uncertainty quantification in foundation models by modulating singular values, achieving ensemble-like performance with minimal additional parameters.
Contribution
SVE uses the singular vectors of a pretrained model's weight matrices to create a diverse ensemble by training only the singular values, which reduces computational cost and parameter overhead relative to training an ensemble of independent models.
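The idea described above can be illustrated with a minimal NumPy sketch for a single linear layer. This is not the authors' implementation. The function names (`make_sve_layer`, `sve_forward`, `softmax`), the random perturbation used to make members differ, and the choice to average softmax outputs are all assumptions made for illustration. The sketch only shows the mechanics: the singular vectors are shared and fixed, and each ensemble member owns one vector of singular values.

```python
import numpy as np


def make_sve_layer(weight, num_members, rng, init_scale=0.01):
    """Factor a pretrained weight and create one singular-value vector per member.

    weight has shape (d_out, d_in). The perturbation controlled by init_scale
    is an assumption used here to make members differ; in practice the
    singular values would be trained.
    """
    # Thin SVD: weight == U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(weight, full_matrices=False)
    noise = rng.standard_normal((num_members, s.size))
    member_s = s[None, :] * (1.0 + init_scale * noise)
    return U, member_s, Vt


def sve_forward(x, U, member_s, Vt):
    """Apply every member to x of shape (batch, d_in).

    Returns an array of shape (members, batch, d_out).
    """
    z = x @ Vt.T                              # project onto right singular vectors
    z = z[None, :, :] * member_s[:, None, :]  # scale by each member's singular values
    return z @ U.T                            # map back through left singular vectors


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((10, 32))   # stand-in for a pretrained weight
    x = rng.standard_normal((4, 32))    # a batch of 4 inputs
    U, member_s, Vt = make_sve_layer(W, num_members=5, rng=rng)
    member_probs = softmax(sve_forward(x, U, member_s, Vt))
    mean_probs = member_probs.mean(axis=0)    # ensemble prediction
    spread = member_probs.var(axis=0)         # disagreement between members
    print(mean_probs.shape, spread.shape)
```

With `init_scale=0.0` every member reproduces the original layer exactly, which is a convenient sanity check that the factorization is wired correctly.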
Findings
SVE achieves uncertainty calibration comparable to deep ensembles.
SVE improves model calibration while maintaining accuracy.
SVE requires less than 1% additional parameters over the base model.
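A rough back-of-the-envelope check shows why the parameter overhead can stay this small. The sketch below is an illustration under stated assumptions, not the paper's accounting: it assumes each ensemble member adds one singular value per rank of every adapted weight matrix, counts all members as overhead, and ignores biases, heads, and any other per-member parameters. The layer shapes and member count are hypothetical.

```python
def sve_overhead(layer_shapes, num_members):
    """Fraction of extra parameters relative to the adapted base weights.

    layer_shapes: iterable of (rows, cols) for each adapted weight matrix.
    Assumes each member stores min(rows, cols) singular values per matrix.
    """
    base = sum(rows * cols for rows, cols in layer_shapes)
    extra = num_members * sum(min(rows, cols) for rows, cols in layer_shapes)
    return extra / base


if __name__ == "__main__":
    # Hypothetical model: 12 square weight matrices of size 768 x 768, 4 members.
    shapes = [(768, 768)] * 12
    ratio = sve_overhead(shapes, num_members=4)
    print(f"{ratio:.4%}")
```

For square `d x d` matrices the ratio reduces to `num_members / d`, so under these assumptions the overhead shrinks as the layers get wider.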
Abstract
Foundation models have become a dominant paradigm in machine learning, achieving remarkable performance across diverse tasks through large-scale pretraining. However, these models often yield overconfident, uncalibrated predictions. The standard approach to quantifying epistemic uncertainty, training an ensemble of independent models, incurs prohibitive computational costs that scale linearly with ensemble size, making it impractical for large foundation models. We propose Singular Value Ensemble (SVE), a parameter-efficient implicit ensemble method that builds on a simple but powerful core assumption: that the singular vectors of the weight matrices constitute meaningful subspaces of the model's knowledge. Pretrained foundation models encode rich, transferable information in their weight matrices. If the singular vectors are indeed meaningful (orthogonal) "knowledge…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks
