ReadOnce Transformers: Reusable Representations of Text for Transformers
Shih-Ting Lin, Ashish Sabharwal, Tushar Khot

TL;DR
ReadOnce Transformers create reusable, compressed text representations that enable faster training and evaluation, and can handle longer documents without new pre-training, benefiting multiple downstream tasks.
Contribution
The paper introduces ReadOnce Transformers, a novel method for generating reusable, task-independent text representations that improve efficiency and scalability in transformer models.
Findings
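2x-5x speedup in training and evaluation compared to standard text-to-text models
Compression lets existing language models handle longer documents without designing new pre-trained models
Evaluated on multi-hop QA, abstractive QA, and long-document summarization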
Abstract
We present ReadOnce Transformers, an approach to convert a transformer-based model into one that can build an information-capturing, task-independent, and compressed representation of text. The resulting representation is reusable across different examples and tasks, thereby requiring a document shared across many examples or tasks to only be "read once". This leads to faster training and evaluation of models. Additionally, we extend standard text-to-text transformer models to Representation+Text-to-text models, and evaluate on multiple downstream tasks: multi-hop QA, abstractive QA, and long-document summarization. Our one-time computed representation results in a 2x-5x speedup compared to standard text-to-text models, while the compression also allows existing language models to handle longer documents without the need for designing new pre-trained models.
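The "read once" idea in the abstract can be illustrated with a toy sketch: encode a document a single time, compress the result into fewer vectors, and reuse that cached representation for every question about the document. This is not the authors' implementation. The class `ReadOnceCache`, the CRC-based stand-in encoder, and the mean-pooling compression are all hypothetical choices made only to show the caching pattern.

```python
import zlib
from typing import Dict, List, Tuple

Vector = List[float]


class ReadOnceCache:
    """Toy illustration (hypothetical): encode each document once, then reuse it."""

    def __init__(self, window: int = 4, dim: int = 4) -> None:
        self.window = window      # tokens pooled into one compressed vector
        self.dim = dim            # size of each toy vector (at most 4 here)
        self.encoder_calls = 0    # counts how often a document is actually "read"
        self._cache: Dict[str, List[Vector]] = {}

    def _encode(self, tokens: List[str]) -> List[Vector]:
        # Stand-in for a transformer encoder: a deterministic vector per token.
        self.encoder_calls += 1
        return [
            [((zlib.crc32(tok.encode("utf-8")) >> (8 * i)) % 256) / 255.0
             for i in range(self.dim)]
            for tok in tokens
        ]

    def _compress(self, vectors: List[Vector]) -> List[Vector]:
        # Assumed compression: mean-pool each window of token vectors into one.
        pooled = []
        for start in range(0, len(vectors), self.window):
            chunk = vectors[start:start + self.window]
            pooled.append([sum(col) / len(chunk) for col in zip(*chunk)])
        return pooled

    def represent(self, doc_id: str, text: str) -> List[Vector]:
        # Encode and compress only on the first request for this document.
        if doc_id not in self._cache:
            self._cache[doc_id] = self._compress(self._encode(text.split()))
        return self._cache[doc_id]


cache = ReadOnceCache(window=4)
doc = ("ReadOnce Transformers build a compressed representation "
       "that is reused across questions")
questions = ["What is built?", "What is reused?", "Across what?"]

# Each model input pairs the cached representation with the question text,
# mirroring the Representation+Text-to-text setup at a very high level.
inputs: List[Tuple[List[Vector], str]] = [
    (cache.represent("doc-1", doc), q) for q in questions
]
```

In this sketch the 11-token document is reduced to 3 pooled vectors and the encoder runs once for all three questions; any speedup in a real system would depend on the actual encoder and compression used.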
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
