Probabilistic Learning and Generation in Deep Sequence Models
Wenlong Chen

TL;DR
This paper explores how deep sequence models can incorporate probabilistic reasoning and uncertainty quantification, improving their interpretability and robustness through novel Bayesian inference methods and self-supervised learning techniques.
Contribution
It introduces new Bayesian inference methods tailored for Transformers and Gaussian processes, and investigates self-supervised learning for sequential latent variables in generative models.
Findings
Developed a Transformer-specific Bayesian inference method based on attention-Gaussian process similarity.
Constructed interdomain inducing points using HiPPOs for online learning and memory retention.
Enhanced sequential generative models with self-supervision for latent states.
Abstract
Despite the exceptional predictive performance of deep sequence models (DSMs), the main concern about their deployment centers on their lack of uncertainty awareness. In contrast, probabilistic models quantify the uncertainty associated with unobserved variables using the rules of probability. Notably, Bayesian methods leverage Bayes' rule to express our beliefs about unobserved variables in a principled way. Since exact Bayesian inference is computationally infeasible at scale, approximate inference is required in practice. Two major bottlenecks of Bayesian methods, especially when applied to deep neural networks, are prior specification and approximation quality. In Chapters 3 and 4, we investigate how the architectures of DSMs themselves can inform the design of priors or approximations in probabilistic models. We first develop an approximate Bayesian inference method tailored to the…
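For readers less familiar with the terminology, the Bayesian update and one common form of approximate inference referred to in the abstract can be written out as follows. This is a generic textbook sketch, not the thesis's specific method: the symbols $\theta$ (unobserved variables), $\mathcal{D}$ (observed data), and $q$ (an approximating distribution) are notation assumed here for illustration, and variational inference is only one of several approximate inference families.

```latex
% Bayes' rule: posterior belief over unobserved variables \theta given data \mathcal{D}
p(\theta \mid \mathcal{D})
  = \frac{p(\mathcal{D} \mid \theta)\, p(\theta)}{p(\mathcal{D})},
\qquad
p(\mathcal{D}) = \int p(\mathcal{D} \mid \theta)\, p(\theta)\, \mathrm{d}\theta .

% Variational inference: replace the intractable posterior with q(\theta)
% by maximizing a lower bound on the log marginal likelihood (the ELBO)
\log p(\mathcal{D})
  \;\ge\;
  \mathbb{E}_{q(\theta)}\!\left[ \log p(\mathcal{D} \mid \theta) \right]
  - \mathrm{KL}\!\left( q(\theta) \,\|\, p(\theta) \right).
```

The two bottlenecks named in the abstract map onto these terms: prior specification concerns the choice of $p(\theta)$, and approximation quality concerns how closely $q(\theta)$ can match $p(\theta \mid \mathcal{D})$.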
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Machine Learning in Healthcare
