Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe

TL;DR
This paper introduces a novel end-to-end framework that learns searchable hidden intermediates in sequence models, leveraging decomposed sub-tasks and beam search to improve performance in complex tasks like speech translation.
Contribution
It proposes a framework that exploits compositionality to learn and improve hidden intermediates, enabling better search and adaptation in end-to-end sequence models.
Findings
Outperforms the previous state of the art by around +6 and +3 BLEU on the two Fisher-CallHome test sets.
Achieves improvements of around +3 and +4 BLEU on the MuST-C English-German and English-French test sets.
Demonstrates the effectiveness of searchable hidden intermediates in complex sequence tasks.
Abstract
End-to-end approaches for sequence tasks are becoming increasingly popular. Yet for complex sequence tasks, like speech translation, systems that cascade several models trained on sub-tasks have shown to be superior, suggesting that the compositionality of cascaded systems simplifies learning and enables sophisticated search capabilities. In this work, we present an end-to-end framework that exploits compositionality to learn searchable hidden representations at intermediate stages of a sequence model using decomposed sub-tasks. These hidden intermediates can be improved using beam search to enhance the overall performance and can also incorporate external models at intermediate stages of the network to re-score or adapt towards out-of-domain data. One instance of the proposed framework is a Multi-Decoder model for speech translation that extracts the searchable hidden intermediates…
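A minimal sketch of the idea in the abstract, not the authors' implementation: a first sub-task decoder is explored with beam search, and the hidden states along the best hypothesis are passed on to a second stage. Every name here (toy_step, beam_search_with_hiddens, second_stage) and the toy probabilities are hypothetical, and mean pooling is only a stand-in for however the second sub-network actually consumes the intermediates.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical step function: maps a token prefix to
# (log-probabilities over the vocabulary, hidden state for this step).
StepFn = Callable[[Sequence[int]], Tuple[List[float], List[float]]]


def beam_search_with_hiddens(
    step_fn: StepFn, eos: int, beam_size: int = 2, max_len: int = 10
) -> Tuple[List[int], List[List[float]]]:
    """Beam search that also keeps the hidden state produced at each step."""
    beams = [(0.0, [], [])]  # (cumulative log-prob, tokens, hidden states)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, tokens, hiddens in beams:
            log_probs, hidden = step_fn(tokens)
            for tok, lp in enumerate(log_probs):
                candidates.append((score + lp, tokens + [tok], hiddens + [hidden]))
        # Keep the top-scoring extensions (no length normalization in this toy).
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for cand in candidates[:beam_size]:
            (finished if cand[1][-1] == eos else beams).append(cand)
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses at max_len
    _, best_tokens, best_hiddens = max(finished, key=lambda c: c[0])
    return best_tokens, best_hiddens


def toy_step(prefix: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Toy first-stage decoder over a 3-token vocabulary (token 2 is EOS)."""
    t = len(prefix)
    probs = [0.6, 0.3, 0.1] if t < 2 else [0.1, 0.1, 0.8]
    hidden = [float(t), float(sum(prefix))]  # made-up hidden representation
    return [math.log(p) for p in probs], hidden


def second_stage(hiddens: List[List[float]]) -> List[float]:
    """Stand-in for a second sub-network: mean-pools the intermediates."""
    dim = len(hiddens[0])
    return [sum(h[i] for h in hiddens) / len(hiddens) for i in range(dim)]


tokens, hiddens = beam_search_with_hiddens(toy_step, eos=2)
context = second_stage(hiddens)
```

Because the intermediate is an explicit hypothesis with a score, this is also the point where an external model could in principle re-score the candidates before the best one is selected; the sketch omits that step.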
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
