ReadOnce Transformers: Reusable Representations of Text for Transformers

Shih-Ting Lin; Ashish Sabharwal; Tushar Khot

arXiv:2010.12854·cs.CL·August 5, 2021

TL;DR

ReadOnce Transformers create reusable, compressed text representations that enable faster training and evaluation, and can handle longer documents without new pre-training, benefiting multiple downstream tasks.

Contribution

The paper introduces ReadOnce Transformers, a novel method for generating reusable, task-independent text representations that improve efficiency and scalability in transformer models.

Findings

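01  2x-5x speedup in training and evaluation compared to standard text-to-text models

02  Compression lets existing language models handle longer documents without new pre-trained models

03  Evaluated on multiple downstream tasks: multi-hop QA, abstractive QA, and long-document summarization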

Abstract

We present ReadOnce Transformers, an approach to convert a transformer-based model into one that can build an information-capturing, task-independent, and compressed representation of text. The resulting representation is reusable across different examples and tasks, thereby requiring a document shared across many examples or tasks to only be "read once". This leads to faster training and evaluation of models. Additionally, we extend standard text-to-text transformer models to Representation+Text-to-text models, and evaluate on multiple downstream tasks: multi-hop QA, abstractive QA, and long-document summarization. Our one-time computed representation results in a 2x-5x speedup compared to standard text-to-text models, while the compression also allows existing language models to handle longer documents without the need for designing new pre-trained models.
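The "read once" idea in the abstract can be illustrated with a minimal caching sketch. This is not the authors' implementation: `toy_encode` is a hypothetical stand-in for a trained document encoder, `RepresentationCache` is an illustrative name, and the window size of 4 is an arbitrary assumption. The sketch only shows the two properties the abstract names, a compressed representation (fewer vectors than tokens) and reuse across several examples that share a document.

```python
from typing import Callable, Dict, List

Vector = List[float]


def toy_encode(text: str, window: int = 4) -> List[Vector]:
    """Hypothetical encoder: emits one small vector per window of tokens.

    A real system would use a transformer here; this toy version only
    mimics the compression (many tokens -> few vectors).
    """
    tokens = text.split()
    reps: List[Vector] = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        avg_len = sum(len(t) for t in chunk) / len(chunk)
        reps.append([float(len(chunk)), avg_len])
    return reps


class RepresentationCache:
    """Encodes each document at most once and reuses the result."""

    def __init__(self, encoder: Callable[[str], List[Vector]]) -> None:
        self.encoder = encoder
        self.store: Dict[str, List[Vector]] = {}
        self.encoder_calls = 0  # counts how often a document is actually read

    def get(self, doc_id: str, text: str) -> List[Vector]:
        if doc_id not in self.store:
            self.store[doc_id] = self.encoder(text)
            self.encoder_calls += 1
        return self.store[doc_id]


if __name__ == "__main__":
    cache = RepresentationCache(toy_encode)
    document = "a b c d e f g h i j"  # 10 tokens
    questions = ["q1", "q2", "q3"]    # three examples sharing one document

    for question in questions:
        rep = cache.get("doc-1", document)
        # A downstream model would consume (rep, question) here.

    print(len(document.split()), "tokens ->", len(rep), "vectors")
    print("encoder calls:", cache.encoder_calls)
```

Under these assumptions the document is encoded a single time no matter how many questions refer to it, which is the source of the speedup the abstract describes for documents shared across many examples.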

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications