Last layer state space model for representation learning and uncertainty quantification
Max Cohen (TSP), Maurice Charbit, Sylvain Le Corff (TSP)

TL;DR
This paper introduces a two-step approach combining representation learning with a state space model for uncertainty quantification in neural networks, enabling confidence interval estimation without retraining the entire model.
Contribution
It proposes a method for adding a state space layer on top of an already trained neural network for uncertainty estimation, separating representation learning from generative modeling.
Findings
Effective uncertainty quantification on a benchmark dataset (hourly Electricity Transformer Oil temperature)
Provides confidence intervals for predictions
Handles noisy data with unknown variables
Abstract
As sequential neural architectures become deeper and more complex, uncertainty estimation becomes increasingly challenging. Efforts to quantify uncertainty often rely on specific training procedures and incur additional computational costs due to the dimensionality of such models. In this paper, we propose to decompose a classification or regression task into two steps: a representation learning stage to learn low-dimensional states, and a state space model for uncertainty estimation. This approach makes it possible to separate representation learning from the design of generative models. We demonstrate how predictive distributions can be estimated on top of an existing, trained neural network by adding a state space-based last layer whose parameters are estimated with Sequential Monte Carlo methods. We apply our proposed methodology to the hourly estimation of Electricity Transformer Oil temperature,…
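As a rough illustration of how a Sequential Monte Carlo method can turn a state space model into predictive intervals, the sketch below runs a bootstrap particle filter on a toy one-dimensional linear Gaussian model. It is not the paper's model or estimation procedure: the model form, the fixed parameters `a`, `b`, `sigma_x`, `sigma_y`, the input `u` (a stand-in for features produced by a pre-trained network), and the function names `simulate` and `bootstrap_filter_intervals` are all assumptions made for the example. In particular, the parameters are taken as known here, whereas the abstract describes estimating them.

```python
import numpy as np


def simulate(T=200, a=0.9, b=1.0, sigma_x=0.3, sigma_y=0.5, seed=1):
    """Draw observations y and inputs u from the toy model (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    u = 0.1 * np.sin(np.arange(T) / 10.0)  # stand-in for network features
    y = np.empty(T)
    x = 0.0
    for t in range(T):
        x = a * x + u[t] + sigma_x * rng.normal()  # latent state transition
        y[t] = b * x + sigma_y * rng.normal()      # noisy observation
    return y, u


def bootstrap_filter_intervals(y, u, a=0.9, b=1.0, sigma_x=0.3, sigma_y=0.5,
                               n_particles=500, seed=0):
    """One-step-ahead predictive mean and 95% interval for each y[t].

    Assumed toy model (not taken from the paper):
        x_t = a * x_{t-1} + u_t + sigma_x * eps_t
        y_t = b * x_t + sigma_y * eta_t
    """
    rng = np.random.default_rng(seed)
    T = len(y)
    particles = rng.normal(0.0, 1.0, size=n_particles)  # initial state guess
    mean, lower, upper = np.empty(T), np.empty(T), np.empty(T)
    for t in range(T):
        # Propagate every particle through the state transition.
        particles = a * particles + u[t] + sigma_x * rng.normal(size=n_particles)
        # Sample from the predictive distribution of y_t before seeing it.
        y_pred = b * particles + sigma_y * rng.normal(size=n_particles)
        mean[t] = y_pred.mean()
        lower[t], upper[t] = np.quantile(y_pred, [0.025, 0.975])
        # Weight particles by the Gaussian likelihood of the observed y_t.
        log_w = -0.5 * ((y[t] - b * particles) / sigma_y) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Multinomial resampling to focus particles on likely states.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return mean, lower, upper


y, u = simulate()
mean, lower, upper = bootstrap_filter_intervals(y, u)
coverage = np.mean((y >= lower) & (y <= upper))
print(f"empirical coverage of the 95% intervals: {coverage:.2f}")
```

Because the data are simulated from the same model the filter assumes, the empirical coverage should sit close to the nominal 95%; on real data, coverage would depend on how well the state space model and its estimated parameters fit.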
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Energy Load and Power Forecasting
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Linear Layer · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Residual Connection
