Bayesian Attention Networks for Data Compression

Michael Tetelman

arXiv:2103.15319 · cs.LG · March 30, 2021

Open Access

TL;DR

This paper introduces Bayesian Attention Networks for lossless data compression. An attention factor relates each training sample to the prediction sample, so the solution depends on the context of the prediction sample, and a latent space makes the approach practical.

Contribution

The paper proposes the Bayesian Attention Network framework, which uses a latent space to make context-aware data compression based on sample correlations practical.

Findings

01. The attention factor is defined by sample correlation functions.

02. The latent space maps prediction samples to context.

03. The approach enables context-dependent, efficient compression.

Abstract

The lossless data compression algorithm based on Bayesian Attention Networks is derived from first principles. Bayesian Attention Networks are defined by introducing an attention factor for each training sample loss as a function of two sample inputs, one from the training sample and one from the prediction sample. Using a sharpened Jensen's inequality, we show that the attention factor is completely defined by a correlation function of the two samples w.r.t. the model weights. Because of the attention factor, the solution for a prediction sample is mostly defined by a few training samples that are correlated with the prediction sample. Finding a specific solution per prediction sample couples together the training and the prediction. To make the approach practical, we introduce a latent space, map each prediction sample to it, and learn all possible solutions as a function of the latent space along…
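
The idea that a few correlated training samples dominate the solution for a prediction sample can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not give the form of the correlation function, so the sketch assumes a stand-in attention factor (a softmax over negative squared input distance). All names here (`attention_factors`, `predict_bit_probability`, `code_length_bits`, `temperature`, `eps`) are hypothetical. It estimates the probability of the next bit from attention-weighted training samples and reports the ideal lossless code length, -log2 p.

```python
import math


def attention_factors(train_inputs, query, temperature=1.0):
    """Hypothetical attention factor per training sample.

    ASSUMPTION: softmax over negative squared distance to the query.
    This is a stand-in for the paper's correlation function w.r.t. the
    model weights, whose form is not given in the abstract.
    """
    scores = [
        -sum((a - b) ** 2 for a, b in zip(x, query)) / temperature
        for x in train_inputs
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_bit_probability(train_inputs, train_bits, query,
                            temperature=1.0, eps=1e-3):
    """Attention-weighted estimate of P(bit = 1) for the query context."""
    weights = attention_factors(train_inputs, query, temperature)
    p = sum(w * b for w, b in zip(weights, train_bits))
    # Clamp so that neither outcome gets probability 0 (infinite code length).
    return min(max(p, eps), 1.0 - eps)


def code_length_bits(p_one, bit):
    """Ideal lossless code length: -log2 of the probability of the observed bit."""
    return -math.log2(p_one if bit == 1 else 1.0 - p_one)


# Toy data: two clusters of contexts, each associated with a different bit.
train_inputs = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
train_bits = [0, 0, 1, 1]
query = (5.0, 5.1)  # prediction sample close to the second cluster

weights = attention_factors(train_inputs, query)
p_one = predict_bit_probability(train_inputs, train_bits, query)

print("attention factors:", [round(w, 4) for w in weights])
print("bits to encode a 1:", code_length_bits(p_one, 1))
print("bits to encode a 0:", code_length_bits(p_one, 0))
```

In this toy setting almost all of the attention mass falls on the two training samples near the query, so the predicted bit is cheap to encode and the unexpected bit is expensive. A smaller `temperature` concentrates the weights on fewer training samples.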

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Bayesian Modeling and Causal Inference · Neural Networks and Applications