Consistent Accelerated Inference via Confident Adaptive Transformers

Tal Schuster; Adam Fisch; Tommi Jaakkola; Regina Barzilay

arXiv:2104.08803·cs.CL·September 10, 2021

TL;DR

This paper introduces Confident Adaptive Transformers (CATs), a method that accelerates inference in large Transformers by stopping computation early on each input while guaranteeing, with high confidence, a specifiable degree of consistency with the original model's output.

Contribution

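The paper trains additional prediction heads on top of intermediate layers and uses a meta consistency classifier to decide when to stop computing for each input. The stopping rule is calibrated with an extension of conformal prediction, so that early predictions agree with the original model at a specifiable rate with high confidence.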

Findings

01 Effective acceleration on multiple tasks

02 High confidence in output consistency

03 Maintains performance while reducing computation

Abstract

We develop a novel approach for confidently accelerating inference in the large and expensive multilayer Transformers that are now ubiquitous in natural language processing (NLP). Amortized or approximate computational methods increase efficiency, but can come with unpredictable performance costs. In this work, we present CATs -- Confident Adaptive Transformers -- in which we simultaneously increase computational efficiency, while guaranteeing a specifiable degree of consistency with the original model with high confidence. Our method trains additional prediction heads on top of intermediate layers, and dynamically decides when to stop allocating computational effort to each input using a meta consistency classifier. To calibrate our early prediction stopping rule, we formulate a unique extension of conformal prediction. We demonstrate the effectiveness of this approach on four…
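The early-stopping idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the data layout, and the calibration procedure are assumptions made for illustration. In particular, the threshold below is chosen by a plain empirical error-rate check on a held-out set, which is a simplified stand-in for the paper's conformal prediction extension and carries no formal guarantee. Each example is assumed to come with one score per intermediate layer (as a hypothetical meta consistency classifier might produce) and a flag saying whether that layer's prediction matches the full model's.

```python
import math
from typing import List, Sequence, Tuple

# One calibration example: (per-layer scores, per-layer "matches full model" flags).
# Both sequences cover the intermediate layers only; hypothetical data layout.
Example = Tuple[Sequence[float], Sequence[bool]]


def exit_layer(scores: Sequence[float], threshold: float) -> int:
    """Return the first intermediate layer whose score reaches the threshold.

    Returns len(scores) when no layer qualifies, meaning the full model is run.
    """
    for k, score in enumerate(scores):
        if score >= threshold:
            return k
    return len(scores)


def inconsistency_rate(calib: List[Example], threshold: float) -> float:
    """Fraction of calibration examples that exit early with a prediction
    that disagrees with the full model."""
    errors = 0
    for scores, consistent in calib:
        k = exit_layer(scores, threshold)
        # Running the full model (k == len(scores)) is consistent by definition.
        if k < len(scores) and not consistent[k]:
            errors += 1
    return errors / len(calib)


def calibrate_threshold(calib: List[Example], epsilon: float) -> float:
    """Lower the threshold step by step while the empirical inconsistency
    rate stays at or below epsilon (assumed tolerance).

    Starts from +inf (never exit early) and stops at the first candidate
    that violates the tolerance, so the result is deliberately conservative.
    """
    candidates = sorted({s for scores, _ in calib for s in scores}, reverse=True)
    best = math.inf
    for t in candidates:
        if inconsistency_rate(calib, t) > epsilon:
            break
        best = t
    return best


if __name__ == "__main__":
    calib = [
        ([0.9, 0.95], [True, True]),
        ([0.2, 0.8], [False, True]),
        ([0.6, 0.7], [False, True]),
        ([0.3, 0.4], [False, False]),
    ]
    threshold = calibrate_threshold(calib, epsilon=0.25)
    print("threshold:", threshold)
    print("exit layer:", exit_layer([0.1, 0.85], threshold))
```

The scan stops at the first violating candidate because, with a first-exit rule, the error rate is not guaranteed to be monotone in the threshold; stopping early trades some speed for a simpler and more cautious choice.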

Code & Models

Repositories

TalSchuster/CATs (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms