Characterizing the Efficiency vs. Accuracy Trade-off for Long-Context NLP Models

Phyllis Ang; Bhuwan Dhingra; Lisa Wu Wills

arXiv:2204.07288·cs.CL·April 18, 2022

TL;DR

This paper systematically analyzes the trade-offs between accuracy, speed, and energy consumption in long-sequence NLP models, revealing how model size and sequence length impact efficiency and performance across tasks.

Contribution

It provides a comparative study of Longformer-Encoder-Decoder and Big Bird, highlighting how hyperparameters affect the efficiency-accuracy trade-off in long-text NLP tasks.

Findings

1. LED outperforms Big Bird in accuracy and energy efficiency.

2. Increasing model size is more energy-efficient than increasing sequence length for summarization.

3. Smaller models are both more accurate and efficient in question answering.

Abstract

With many real-world applications of Natural Language Processing (NLP) involving long texts, there has been a rise in NLP benchmarks that measure the accuracy of models that can handle longer input sequences. However, these benchmarks do not consider the trade-offs between accuracy, speed, and power consumption as input sizes or model sizes are varied. In this work, we perform a systematic study of this accuracy vs. efficiency trade-off on two widely used long-sequence models, Longformer-Encoder-Decoder (LED) and Big Bird, during fine-tuning and inference on four datasets from the SCROLLS benchmark. To study how this trade-off differs across hyperparameter settings, we compare the models across four sequence lengths (1024, 2048, 3072, 4096) and two model sizes (base and large) under a fixed resource budget. We find that LED consistently achieves better accuracy at lower energy…
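
The comparison described in the abstract can be read as a small configuration grid. The sketch below enumerates it using only the values the abstract states. It assumes a full cross product of model, size, and sequence length, which the abstract suggests but does not spell out. The names `build_grid`, `NUM_DATASETS`, and the dictionary keys are hypothetical, chosen for illustration, and are not taken from the authors' repository. The four SCROLLS datasets are not named in the text above, so they appear only as a count.

```python
from itertools import product

# Values stated in the abstract.
MODELS = ("LED", "Big Bird")
MODEL_SIZES = ("base", "large")
SEQUENCE_LENGTHS = (1024, 2048, 3072, 4096)

# The abstract mentions four SCROLLS datasets without naming them,
# so only the count is recorded here (assumption: every dataset
# is run with every configuration).
NUM_DATASETS = 4


def build_grid():
    """Return every (model, size, sequence length) combination."""
    return [
        {"model": model, "size": size, "max_seq_len": seq_len}
        for model, size, seq_len in product(MODELS, MODEL_SIZES, SEQUENCE_LENGTHS)
    ]


grid = build_grid()
total_runs = len(grid) * NUM_DATASETS

print(f"{len(grid)} configurations per dataset, {total_runs} runs in total")
```

Under these assumptions the grid has 2 × 2 × 4 = 16 configurations per dataset. Whether every cell was actually run under the fixed resource budget is something only the full paper can confirm.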

Code & Models

Repositories

phyllisayk/nlp-efficiency-tradeoff
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies