VIKING: Deep variational inference with stochastic projections

Samuel G. Fadel; Hrittik Roy; Nicholas Krämer; Yevgen Zainchkovskyy; Stas Syrota; Alejandro Valverde Mahou; Carl Henrik Ek; Søren Hauberg

arXiv:2510.23684 · stat.ML · October 29, 2025

TL;DR

This paper introduces VIKING, a variational inference method for deep neural networks whose approximate posterior explicitly reflects overparametrization. The authors report improved uncertainty estimates and state-of-the-art performance.

Contribution

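The paper proposes a variational family built on two independent linear subspaces of the parameter space, which represent functional changes inside and outside the support of the training data. This enables scalable approximate Bayesian inference that reflects neural network reparametrizations.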

Findings

1. Achieves state-of-the-art results across various tasks and datasets.

2. Provides scalable routines for maximizing the ELBO and sampling from the approximate posterior.

3. Demonstrates improved uncertainty calibration in deep neural networks.

Abstract

Variational mean field approximations tend to struggle with contemporary overparametrized deep neural networks. Where a Bayesian treatment is usually associated with high-quality predictions and uncertainties, the practical reality has been the opposite, with unstable training, poor predictive power, and subpar calibration. Building upon recent work on reparametrizations of neural networks, we propose a simple variational family that considers two independent linear subspaces of the parameter space. These represent functional changes inside and outside the support of training data. This allows us to build a fully correlated approximate posterior that reflects the overparametrization and is tuned through easy-to-interpret hyperparameters. We develop scalable numerical routines that maximize the associated evidence lower bound (ELBO) and sample from the approximate posterior. Empirically, we observe…
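To make the two-subspace idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a linearized model whose Jacobian on the training data is a small dense matrix. Under that assumption, one plausible reading of the abstract is that the row space of the Jacobian holds parameter directions that change predictions on the training data, while its kernel holds directions that leave those predictions unchanged to first order. The names `orthonormal_basis`, `split`, `sample`, `sigma_row`, and `sigma_ker` are hypothetical, and the dense Gram-Schmidt step stands in for whatever scalable routines the paper develops.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(rows, tol=1e-12):
    """Gram-Schmidt on the rows of a toy Jacobian.

    Returns an orthonormal basis of the row space (assumed small and dense).
    """
    basis = []
    for r in rows:
        w = list(r)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip rows that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def split(v, basis):
    """Split v into its row-space component and its kernel component."""
    in_row = [0.0] * len(v)
    for q in basis:
        c = dot(v, q)
        in_row = [x + c * qi for x, qi in zip(in_row, q)]
    in_ker = [vi - xi for vi, xi in zip(v, in_row)]
    return in_row, in_ker


def sample(mean, jacobian, sigma_row, sigma_ker, rng):
    """Draw one parameter sample with a separate scale for each subspace.

    sigma_row and sigma_ker are hypothetical hyperparameters for this sketch.
    """
    basis = orthonormal_basis(jacobian)
    eps = [rng.gauss(0.0, 1.0) for _ in mean]
    data_part, null_part = split(eps, basis)
    theta = [
        m + sigma_row * a + sigma_ker * b
        for m, a, b in zip(mean, data_part, null_part)
    ]
    return theta, data_part, null_part


rng = random.Random(0)
jacobian = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, -1.0]]  # 2 outputs, 4 parameters
mean = [0.5, -0.3, 0.1, 0.2]
theta, data_part, null_part = sample(mean, jacobian, 0.1, 1.0, rng)

# Kernel perturbations should not move the linearized training predictions.
kernel_effect = [dot(row, null_part) for row in jacobian]
```

In this toy setup `kernel_effect` is zero up to rounding error, so `sigma_ker` can be large without changing the linearized fit on the training data, while `sigma_row` controls how much that fit is perturbed. The projection couples all coordinates, which is why the sampled parameters are correlated even though the noise `eps` is independent per coordinate.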
