Bidirectional Awareness Induction in Autoregressive Seq2Seq Models
Jia Cheng Hu, Roberto Cavicchioli, Alessandro Capotondi

TL;DR
This paper introduces Bidirectional Awareness Induction (BAI), a training method that enhances autoregressive seq2seq models by enabling bidirectional learning through Pivots, improving performance across multiple architectures and tasks.
Contribution
The paper presents BAI, a novel training approach that allows bidirectional learning in autoregressive models without architectural changes, applicable to various architectures and pre-trained models.
Findings
Up to 2.4 CIDEr improvement in Image-Captioning
Up to 4.96 BLEU increase in Neural Machine Translation
Up to 1.16 ROUGE boost in Text Summarization
Abstract
Autoregressive Sequence-To-Sequence models are the foundation of many Deep Learning achievements in major research fields such as Vision and Natural Language Processing. Despite that, they still present significant limitations. For instance, when errors occur in the early steps of the prediction, the whole output is severely affected. This reliance on previously predicted tokens, together with the inherent computational unfriendliness of sequential algorithms, has motivated researchers to explore different architectures and methods in the search for bidirectional approaches. In this work, we introduce the Bidirectional Awareness Induction (BAI), a training method that leverages a subset of elements in the network, the Pivots, to perform bidirectional learning without breaking the autoregressive constraints. To showcase its flexibility, we apply the method to three architectures, the Transformer,…
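For reference, the autoregressive constraint mentioned in the abstract is commonly enforced in Transformer decoders with a causal (lower-triangular) mask, so that each step can condition only on earlier positions. The sketch below illustrates that constraint alone; it is not an implementation of BAI or of the Pivots, and the function names are hypothetical rather than taken from the paper.

```python
def causal_mask(n):
    """Build an n x n mask; mask[i][j] is True when step i may attend to position j.

    Illustrative only: this shows the standard autoregressive constraint,
    not the BAI training method itself.
    """
    return [[j <= i for j in range(n)] for i in range(n)]


def visible_positions(mask, i):
    """Return the positions that step i is allowed to condition on."""
    return [j for j, allowed in enumerate(mask[i]) if allowed]


mask = causal_mask(4)
for step in range(4):
    # Each step sees itself and everything before it, never later positions.
    print(step, visible_positions(mask, step))
```

Because every step depends only on what came before, a mistake at an early position is carried into all later steps, which is the limitation the abstract describes.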
Taxonomy
Topics: Cell Image Analysis Techniques · Neural Networks and Applications · Cellular Automata and Applications
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Adam · Layer Normalization · Weight Decay · Position-Wise Feed-Forward Layer · Dense Connections · Residual Connection
