In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning
Xiaochuang Han

TL;DR
This paper shows that in-context learning alone, with no change to model weights, substantially improves the alignment of vanilla Llama-2 models, making them competitive with models fine-tuned for alignment.
Contribution
It introduces in-context alignment: retrieving alignment demonstrations into the prompt at inference time so that a pretrained model follows chat-style instructions about as well as models aligned through fine-tuning.
Findings
In-context alignment with Llama-2 yields a 7x higher win-rate against OpenAI's text-davinci-003 than direct prompting.
An average of 9 retrieved demonstration examples suffices for effective alignment.
Vanilla Llama-2 matches strong fine-tuned alignment baselines.
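To make the win-rate comparison concrete, here is a minimal sketch of how a win-rate against a reference model and the fold increase between two prompting setups could be computed. The function names, the judgment labels, and the convention of counting a tie as half a win are assumptions for illustration; the summary does not state how the note scores comparisons.

```python
def win_rate(judgments):
    """Fraction of pairwise comparisons won against the reference model.

    `judgments` is a list of "win", "loss", or "tie" labels, one per test
    instruction. Counting a tie as half a win is an assumed convention.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    points = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    return sum(points[j] for j in judgments) / len(judgments)


def fold_increase(new_rate, base_rate):
    """How many times larger new_rate is than base_rate."""
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    return new_rate / base_rate
```

Under this sketch, a "7x increase" means that `fold_increase(in_context_rate, direct_rate)` is about 7, where both rates are measured against the same reference model on the same instructions.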
Abstract
In this note, we explore inference-time alignment through in-context learning. We consider a vanilla pretrained language model Llama-2 before any fine-tuning and retrieve an average of 9 demonstration alignment examples when the model is prompted to follow chat-style instructions. Compared to direct prompting, the in-context alignment without changing model weights leads to a 7x increase in win-rate w.r.t. the text-davinci-003 model from OpenAI, making the vanilla language model comparable to strong baselines with alignment fine-tuning.
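The procedure in the abstract (retrieve demonstrations, then prompt the unmodified model) can be sketched as follows. This is an illustrative sketch, not the author's implementation: the bag-of-words cosine retriever, the `Instruction:` / `Response:` template, and the names `retrieve` and `build_prompt` are all assumptions, since the summary does not specify the retriever or the prompt format.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a real retrieval encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k):
    """Return the k demonstrations whose instructions best match the query."""
    q = bag_of_words(query)
    ranked = sorted(
        pool,
        key=lambda d: cosine(q, bag_of_words(d["instruction"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, demos):
    """Concatenate demonstrations, then the new instruction left open for
    the vanilla model to complete. The template is hypothetical."""
    parts = [
        f"Instruction: {d['instruction']}\nResponse: {d['response']}\n"
        for d in demos
    ]
    parts.append(f"Instruction: {query}\nResponse:")
    return "\n".join(parts)
```

The resulting string would be passed as-is to the pretrained model's text-completion interface. The model's weights are never updated; only the prompt changes per query.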
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
