Fine-Tuned In-Context Learners for Efficient Adaptation
Jorg Bornschein, Clare Lyle, Yazhe Li, Amal Rannen-Triki, Xu Owen He, Razvan Pascanu

TL;DR
This paper introduces a unified method that combines in-context learning with fine-tuning for large language models, improving sample efficiency and performance across downstream tasks.
Contribution
It proposes a novel fine-tuning approach that incorporates in-context examples, bridging prompt-based and traditional fine-tuning methods, and introduces prequential evaluation for hyperparameter tuning.
Findings
The unified approach outperforms both traditional fine-tuning and in-context learning
Prequential evaluation effectively guides hyperparameter selection in low-data regimes
The method achieves consistent performance gains across multiple downstream tasks
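To make the prequential idea in the findings concrete, here is a minimal sketch of a block-wise prequential (predictive sequential) loss: each block of examples is scored by a model fitted only on the examples that precede it. The function name `prequential_loss`, the `fit` and `loss` callables, and the fixed `block_size` schedule are illustrative assumptions, not details taken from the paper.

```python
from typing import Callable, Sequence, TypeVar

Ex = TypeVar("Ex")      # one training example (hypothetical type)
Model = TypeVar("Model")  # whatever `fit` returns


def prequential_loss(
    data: Sequence[Ex],
    fit: Callable[[Sequence[Ex]], Model],
    loss: Callable[[Model, Ex], float],
    block_size: int,
) -> float:
    """Average loss when each block is predicted from earlier data only.

    Assumption: the first block is used only for fitting and is never scored.
    Other variants score it with the untrained (or pretrained) model instead.
    """
    if block_size <= 0 or len(data) <= block_size:
        raise ValueError("need more than one block of data")
    total, count = 0.0, 0
    for start in range(block_size, len(data), block_size):
        model = fit(data[:start])                 # train on the prefix only
        block = data[start:start + block_size]    # held-out next block
        total += sum(loss(model, ex) for ex in block)
        count += len(block)
    return total / count
```

Under this sketch, a hyperparameter setting would be chosen by calling `prequential_loss` once per candidate (with `fit` closed over that setting) and keeping the one with the lowest value. No separate validation split is needed, which is the appeal when data is scarce.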
Abstract
When adapting large language models (LLMs) to a specific downstream task, two primary approaches are commonly employed: (1) prompt engineering, often with in-context few-shot learning, leveraging the model's inherent generalization abilities, and (2) fine-tuning on task-specific data, directly optimizing the model's parameters. While prompt-based methods excel in few-shot scenarios, their effectiveness often plateaus as more data becomes available. Conversely, fine-tuning scales well with data but may underperform when training examples are scarce. We investigate a unified approach that bridges these two paradigms by incorporating in-context learning directly into the fine-tuning process. Specifically, we fine-tune the model on task-specific data augmented with in-context examples, mimicking the structure of k-shot prompts. This approach, while requiring per-task fine-tuning, combines…
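As a rough illustration of "task-specific data augmented with in-context examples," the sketch below builds one fine-tuning sequence shaped like a k-shot prompt: k examples drawn from the training pool, followed by the target input whose output the model must produce. The `Input:`/`Output:` template, the function name `build_kshot_example`, and the random sampling of context examples are assumptions made for illustration; the abstract does not specify them.

```python
import random
from typing import List, Tuple

Example = Tuple[str, str]  # (input text, target text); hypothetical format


def build_kshot_example(
    pool: List[Example], target_idx: int, k: int, rng: random.Random
) -> Tuple[str, str]:
    """Return (prompt, completion) for one fine-tuning step.

    The prompt holds up to k context examples plus the target input;
    the completion is the target output. The target itself is excluded
    from the context so its answer never leaks into the prompt.
    """
    candidates = [ex for i, ex in enumerate(pool) if i != target_idx]
    context = rng.sample(candidates, min(k, len(candidates)))
    parts = [f"Input: {x}\nOutput: {y}" for x, y in context]
    target_in, target_out = pool[target_idx]
    parts.append(f"Input: {target_in}\nOutput:")
    return "\n\n".join(parts), " " + target_out
```

A training loop would call this for each target example and fine-tune on the resulting pairs. Whether the loss covers only the completion or the whole sequence is a design choice the excerpt above leaves open.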
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
