Amortized Context Vector Inference for Sequence-to-Sequence Networks
Kyriacos Tolias, Ioannis Kourouklides, Sotirios Chatzis

TL;DR
This paper introduces a novel approach that treats context vectors in sequence-to-sequence models as latent variables, using amortized variational inference to improve generalization and performance in tasks like summarization, captioning, and translation.
Contribution
The paper applies amortized variational inference to the context vectors of soft-attention models: instead of deterministic weighted sums, the context vectors become latent variables whose approximate posteriors are finite mixture models.
Findings
Improved performance over state-of-the-art methods in document summarization.
Enhanced results in video captioning benchmarks.
Better generalization in machine translation tasks.
Abstract
Neural attention (NA) has become a key component of sequence-to-sequence models that yield state-of-the-art performance in tasks as hard as abstractive document summarization (ADS) and video captioning (VC). NA mechanisms perform inference of context vectors; these constitute weighted sums of deterministic input sequence encodings, adaptively sourced over long temporal horizons. Inspired by recent work in the field of amortized variational inference (AVI), in this work we consider treating the context vectors generated by soft-attention (SA) models as latent variables, with approximate finite mixture model posteriors inferred via AVI. We posit that this formulation may yield stronger generalization capacity, in line with the outcomes of existing applications of AVI to deep networks. To illustrate our method, we implement it and experimentally evaluate it considering challenging ADS,…
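The contrast the abstract draws, between a deterministic context vector and a context vector sampled from a finite mixture posterior, can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the dot-product scoring, the Gaussian mixture components, the reuse of the attention weights as mixture weights, and the names `W_mu` and `W_logvar` (hypothetical linear inference maps) are all choices made here for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def soft_attention_context(encodings, query):
    # Deterministic soft attention: the context vector is a weighted
    # sum of the encoder outputs. Dot-product scoring is an assumption.
    weights = softmax(encodings @ query)      # shape (T,)
    return weights, weights @ encodings       # context has shape (D,)

def sample_mixture_context(encodings, weights, W_mu, W_logvar, rng):
    # Latent-variable view (assumed form): the posterior over the context
    # vector is a finite mixture with one Gaussian component per input
    # position, mixed with the attention weights. Component means and
    # log-variances are amortized, i.e. computed from the encodings by
    # shared maps (here hypothetical linear maps W_mu and W_logvar).
    means = encodings @ W_mu                  # shape (T, D)
    log_vars = encodings @ W_logvar           # shape (T, D)
    k = rng.choice(len(weights), p=weights)   # pick a mixture component
    eps = rng.standard_normal(means.shape[1])
    # Reparameterized Gaussian draw from the selected component.
    return means[k] + np.exp(0.5 * log_vars[k]) * eps

rng = np.random.default_rng(0)
T, D = 5, 4                                   # sequence length, encoding size
encodings = rng.standard_normal((T, D))
query = rng.standard_normal(D)
W_mu = np.eye(D)
W_logvar = 0.01 * rng.standard_normal((D, D))

weights, context = soft_attention_context(encodings, query)
sample = sample_mixture_context(encodings, weights, W_mu, W_logvar, rng)
```

In a trained model the sampled vector would replace `context` as the decoder input during training, with a KL term against a prior added to the loss; both are omitted here to keep the sketch short.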
Taxonomy
Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Video Analysis and Summarization
