Attentive Gaussian processes for probabilistic time-series generation
Kuilin Chen, Chi-Guhn Lee

TL;DR
This paper introduces Attentive-GP, an efficient attention-based Gaussian process model for probabilistic time-series generation that improves training scalability and uncertainty estimation over recurrent networks.
Contribution
It combines attention mechanisms with Gaussian processes, develops a block-wise training algorithm for scalability, and demonstrates broad applicability to hybrid models.
Findings
Improved training efficiency over recurrent models.
Scalable block-wise training (mini-batch for the attention network, full-batch for the GP) with a convergence proof.
Comparable or better quality in sequence generation.
Abstract
Sequence transduction has mostly been done with recurrent networks, which are computationally demanding and often severely underestimate uncertainty. We propose a computationally efficient attention-based network combined with Gaussian process (GP) regression to generate real-valued sequences, which we call the Attentive-GP. The proposed model not only improves training efficiency by dispensing with recurrence and convolutions but also learns a factorized generative distribution with a Bayesian representation. However, the presence of the GP precludes the commonly used mini-batch approach to training the attention network. Therefore, we develop a block-wise training algorithm that allows mini-batch training of the network while the GP is trained on the full batch, resulting in a scalable training method. The algorithm has been proved to converge and shows comparable, if not better,…
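The general idea of placing a GP regressor on top of attention-derived features can be illustrated with a small NumPy sketch. This is not the paper's architecture or its block-wise training algorithm: the attention weights here are random and fixed rather than learned, and the window length, token layout, RBF kernel, noise level, and all function names (`attention_features`, `rbf_kernel`, `gp_posterior`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_features(window, w_q, w_k, w_v):
    """Map a window of past values (W,) to one feature vector via
    single-head scaled dot-product self-attention and mean pooling."""
    n = window.shape[0]
    # Hypothetical token layout: (value, normalized position) per time step.
    tokens = np.stack([window, np.linspace(0.0, 1.0, n)], axis=1)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return (weights @ v).mean(axis=0)

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(f_train, y_train, f_test, noise=1e-2):
    """Exact GP regression: predictive mean and variance at f_test."""
    K = rbf_kernel(f_train, f_train) + noise * np.eye(len(f_train))
    K_s = rbf_kernel(f_test, f_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(rbf_kernel(f_test, f_test)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Toy data: predict the next value of a sine wave from the previous W values.
W, d = 8, 4
series = np.sin(np.linspace(0.0, 8.0 * np.pi, 120))
windows = np.array([series[i:i + W] for i in range(len(series) - W)])
targets = series[W:]

# Random, fixed projection matrices (a trained model would learn these).
w_q, w_k, w_v = (rng.normal(size=(2, d)) for _ in range(3))
feats = np.array([attention_features(w, w_q, w_k, w_v) for w in windows])

n_train = 90
feats_train, feats_test = feats[:n_train], feats[n_train:]
mean, var = gp_posterior(feats_train, targets[:n_train], feats_test)
```

The GP step factorizes the full `n_train × n_train` kernel matrix, which is why the abstract notes that the GP cannot simply be trained on mini-batches the way the attention network can.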
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Neural Networks and Applications
Methods: Gaussian Process
