Last layer state space model for representation learning and uncertainty quantification

Max Cohen (TSP); Maurice Charbit; Sylvain Le Corff (TSP)

arXiv:2307.01566 · stat.ML · July 6, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a two-step approach combining representation learning with a state space model for uncertainty quantification in neural networks, enabling confidence interval estimation without retraining the entire model.

Contribution

The paper proposes adding a state space last layer on top of an already-trained neural network to estimate uncertainty, which separates representation learning from generative modeling.

Findings

01. Effective uncertainty quantification on a benchmark dataset

02. Provides confidence intervals for predictions

03. Handles noisy data with unknown variables

Abstract

As sequential neural architectures become deeper and more complex, uncertainty estimation becomes increasingly challenging. Efforts to quantify uncertainty often rely on specific training procedures and incur additional computational costs due to the dimensionality of such models. In this paper, we propose to decompose a classification or regression task into two steps: a representation learning stage to learn low-dimensional states, and a state space model for uncertainty estimation. This approach makes it possible to separate representation learning from the design of generative models. We demonstrate how predictive distributions can be estimated on top of an existing, trained neural network by adding a state space-based last layer whose parameters are estimated with Sequential Monte Carlo methods. We apply our proposed methodology to the hourly estimation of Electricity Transformer Oil temperature,…
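To make the idea of a state space last layer with Sequential Monte Carlo more concrete, here is a minimal sketch of a bootstrap particle filter that produces one-step-ahead predictive intervals. It is not the authors' model or code. It assumes a toy one-dimensional linear Gaussian state space model, and the input sequence `u` merely stands in for features that a frozen, pre-trained network might emit. The names `simulate` and `predictive_intervals` and the constants `A`, `B`, `SIGMA_X`, `SIGMA_Y` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy parameters; none of these values come from the paper.
A, B, SIGMA_X, SIGMA_Y = 0.8, 0.5, 0.3, 0.5


def simulate(u, seed=0):
    """Draw observations from the assumed toy model:
    x_t = A * x_{t-1} + B * u_t + N(0, SIGMA_X^2),  y_t = x_t + N(0, SIGMA_Y^2)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    y = np.empty(len(u))
    for t in range(len(u)):
        x = A * x + B * u[t] + rng.normal(0.0, SIGMA_X)
        y[t] = x + rng.normal(0.0, SIGMA_Y)
    return y


def predictive_intervals(y, u, n_particles=500, level=0.95, seed=1):
    """Bootstrap particle filter returning, for each t, the mean and a central
    interval of the one-step-ahead predictive distribution of y_t."""
    rng = np.random.default_rng(seed)
    alpha = (1.0 - level) / 2.0
    particles = rng.normal(0.0, 1.0, size=n_particles)  # assumed initial state prior
    mean = np.empty(len(y))
    lower = np.empty(len(y))
    upper = np.empty(len(y))
    for t in range(len(y)):
        # Propagate equally weighted particles through the state transition.
        particles = A * particles + B * u[t] + rng.normal(0.0, SIGMA_X, size=n_particles)
        # Sample the predictive distribution of y_t before observing it.
        y_pred = particles + rng.normal(0.0, SIGMA_Y, size=n_particles)
        mean[t] = y_pred.mean()
        lower[t], upper[t] = np.quantile(y_pred, [alpha, 1.0 - alpha])
        # Weight particles by the Gaussian likelihood of the actual observation.
        log_w = -0.5 * ((y[t] - particles) / SIGMA_Y) ** 2
        w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
        w /= w.sum()
        # Multinomial resampling back to equally weighted particles.
        particles = particles[rng.choice(n_particles, size=n_particles, p=w)]
    return mean, lower, upper


# Usage: simulated data, then empirical coverage of the 95% intervals.
u = np.sin(np.arange(200) / 10.0)
y = simulate(u)
mean, lower, upper = predictive_intervals(y, u)
print("empirical coverage:", np.mean((y >= lower) & (y <= upper)))
```

In this sketch the model parameters are fixed by hand. The abstract states that the last-layer parameters are themselves estimated with Sequential Monte Carlo methods, which this example does not attempt.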

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Energy Load and Power Forecasting

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Linear Layer · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Residual Connection