Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation
Yoel Zeldes, Dan Padnos, Or Sharir, and Barak Peleg

TL;DR
Auxiliary Tuning is an efficient method for adapting pre-trained language models to new tasks. It adds an auxiliary model that shifts the pre-trained model's output distribution, enabling effective conditional text generation with fewer training resources.
Contribution
The paper introduces Auxiliary Tuning, a flexible and resource-efficient approach for task adaptation that combines a pre-trained model with an auxiliary model at the logits level.
Findings
Achieves similar performance to training from scratch on various tasks
Requires significantly fewer training resources than training from scratch
Effectively conditions text generation on additional inputs like keywords
Abstract
We introduce a simple and efficient method, called Auxiliary Tuning, for adapting a pre-trained Language Model to a novel task; we demonstrate this approach on the task of conditional text generation. Our approach supplements the original pre-trained model with an auxiliary model that shifts the output distribution according to the target task. The auxiliary model is trained by adding its logits to the pre-trained model logits and maximizing the likelihood of the target task output. Our method imposes no constraints on the auxiliary architecture. In particular, the auxiliary model can ingest additional input relevant to the target task, independently from the pre-trained model's input. Furthermore, mixing the models at the logits level provides a natural probabilistic interpretation of the method. Our method achieved similar results to training from scratch for several different tasks,…
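The logit-level mixing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, array shapes, and random stand-in logits are assumptions made for illustration, and no real models are involved. It shows that adding the two sets of logits before the softmax is equivalent to multiplying the two models' distributions and renormalizing, which is one way to read the "natural probabilistic interpretation", and it computes the negative log-likelihood that training of the auxiliary model would minimize.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last (vocabulary) axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def combined_log_probs(pretrained_logits, auxiliary_logits):
    """Mix the two models by summing their logits, then normalize."""
    return log_softmax(pretrained_logits + auxiliary_logits)


def negative_log_likelihood(pretrained_logits, auxiliary_logits, target_ids):
    """Mean NLL of the target tokens under the combined distribution."""
    log_probs = combined_log_probs(pretrained_logits, auxiliary_logits)
    picked = np.take_along_axis(log_probs, target_ids[..., None], axis=-1)
    return -picked.mean()


# Hypothetical stand-ins for model outputs: 4 positions, vocabulary of 10.
rng = np.random.default_rng(0)
pretrained_logits = rng.normal(size=(4, 10))
auxiliary_logits = rng.normal(size=(4, 10))
target_ids = np.array([1, 3, 5, 7])

# Distribution obtained by adding logits.
mixed = np.exp(combined_log_probs(pretrained_logits, auxiliary_logits))

# Same distribution obtained as a renormalized product of the two softmaxes.
product = np.exp(log_softmax(pretrained_logits)) * np.exp(log_softmax(auxiliary_logits))
product /= product.sum(axis=-1, keepdims=True)

loss = negative_log_likelihood(pretrained_logits, auxiliary_logits, target_ids)
```

In this sketch, an auxiliary model that outputs all-zero logits leaves the pre-trained distribution unchanged, so the auxiliary model only has to learn the shift toward the target task.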
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
