Standing on the Shoulders of Giant Frozen Language Models
Yoav Levine, Itay Dalmedigos, Ori Ram, Yoel Zeldes, Daniel Jannai, Dor Muhlgay, Yoni Osin, Opher Lieber, Barak Lenz, Shai Shalev-Shwartz, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham

TL;DR
This paper introduces techniques for leveraging large frozen language models that achieve performance comparable to fine-tuning without sacrificing the model's versatility across tasks.
Contribution
The paper presents three novel methods for getting more out of frozen language models: input-dependent prompt tuning, frozen readers, and recursive LMs.
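As a rough illustration of the first method, the toy sketch below shows one way to read "input-dependent prompt tuning": a small trainable module produces soft prompt vectors conditioned on the input, and those vectors are prepended to the input embeddings that a frozen model would consume. This is an assumption-laden sketch, not the paper's implementation. No real language model is involved, and every name here (`PromptGenerator`, `build_frozen_lm_input`, `DIM`, `PROMPT_LEN`, the four-word vocabulary) is hypothetical.

```python
import random

random.seed(0)
DIM = 4         # embedding width (hypothetical)
PROMPT_LEN = 2  # number of soft prompt vectors (hypothetical)

# Stand-in for the frozen model's embedding table. It is stored as tuples
# and never updated, mirroring the idea that frozen weights stay untouched.
FROZEN_EMBEDDINGS = {
    tok: tuple(random.uniform(-1, 1) for _ in range(DIM))
    for tok in ["the", "cat", "sat", "dog"]
}


class PromptGenerator:
    """Small trainable module (assumed design): maps a pooled input
    embedding to PROMPT_LEN soft prompt vectors via one DIM x DIM
    matrix per prompt position."""

    def __init__(self):
        self.weights = [
            [[random.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(DIM)]
            for _ in range(PROMPT_LEN)
        ]

    def __call__(self, pooled):
        # One matrix-vector product per prompt position.
        return [
            [sum(row[j] * pooled[j] for j in range(DIM)) for row in matrix]
            for matrix in self.weights
        ]


def embed(tokens):
    """Look up frozen embeddings for each token."""
    return [list(FROZEN_EMBEDDINGS[t]) for t in tokens]


def mean_pool(vectors):
    """Average a list of vectors into a single DIM-sized summary."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(DIM)]


def build_frozen_lm_input(tokens, generator):
    """Prepend input-conditioned soft prompts to the frozen token embeddings."""
    token_vecs = embed(tokens)
    prompt_vecs = generator(mean_pool(token_vecs))
    return prompt_vecs + token_vecs


if __name__ == "__main__":
    gen = PromptGenerator()
    seq = build_frozen_lm_input(["the", "cat", "sat"], gen)
    print(len(seq), "vectors would be passed to the frozen model")
```

In a real setup only the generator's weights would receive gradient updates, while the language model's own parameters stay fixed. With ordinary prompt tuning, by contrast, the same learned prompt is reused for every input.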
Findings
Some methods outperform fine-tuning in certain domains.
Frozen models can match fine-tuning performance with new techniques.
The proposed methods are computationally efficient relative to model size.
Abstract
Huge pretrained language models (LMs) have demonstrated surprisingly good zero-shot capabilities on a wide variety of tasks. This gives rise to the appealing vision of a single, versatile model with a wide range of functionalities across disparate applications. However, current leading techniques for leveraging a "frozen" LM -- i.e., leaving its weights untouched -- still often underperform fine-tuning approaches which modify these weights in a task-dependent way. Those, in turn, suffer forgetfulness and compromise versatility, suggesting a tradeoff between performance and versatility. The main message of this paper is that current frozen-model techniques such as prompt tuning are only the tip of the iceberg, and more powerful methods for leveraging frozen LMs can do just as well as fine tuning in challenging domains without sacrificing the underlying model's versatility. To demonstrate…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
