Aligning Language Models with Demonstrated Feedback
Omar Shaikh, Michelle S. Lam, Joey Hejna, Yijia Shao, Hyundong Cho, Michael S. Bernstein, Diyi Yang

TL;DR
This paper introduces DITTO, a method that aligns language models to specific tasks using a small number of demonstrations as feedback, outperforming traditional methods in style and task alignment.
Contribution
The paper proposes DITTO, a novel online imitation learning approach that efficiently aligns LLM outputs to user demonstrations with minimal data.
Findings
DITTO outperforms few-shot prompting, supervised fine-tuning, and self-play methods by an average of 19 percentage points in win rate.
The method effectively learns fine-grained style and task alignment across diverse domains.
User study confirms DITTO's superior customization capabilities for LLMs.
Abstract
Language models are aligned to emulate the collective voice of many, resulting in outputs that align with no one in particular. Steering LLMs away from generic output is possible through supervised finetuning or RLHF, but requires prohibitively large datasets for new ad-hoc tasks. We argue that it is instead possible to align an LLM to a specific setting by leveraging a very small number (< 10) of demonstrations as feedback. Our method, Demonstration ITerated Task Optimization (DITTO), directly aligns language model outputs to a user's demonstrated behaviors. Derived using ideas from online imitation learning, DITTO cheaply generates online comparison data by treating users' demonstrations as preferred over output from the LLM and its intermediate checkpoints. Concretely, DITTO operates by having an LLM generate examples that are presumed to be inferior to expert demonstrations. The…
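The comparison-generation step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Comparison` type, the `build_comparisons` helper, and the stub checkpoints are hypothetical names introduced here, and a real setup would sample from an actual LLM and its saved checkpoints before passing the pairs to a preference-optimization step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signature: (prompt, seed) -> generated text.
SampleFn = Callable[[str, int], str]


@dataclass(frozen=True)
class Comparison:
    prompt: str
    chosen: str    # the user's demonstration
    rejected: str  # a model sample presumed to be inferior


def build_comparisons(
    demonstrations: Dict[str, str],
    checkpoints: List[SampleFn],
    samples_per_checkpoint: int = 2,
) -> List[Comparison]:
    """Pair every demonstration with samples from every checkpoint.

    Each demonstration is treated as preferred over any output of the
    model or of its intermediate checkpoints for the same prompt.
    """
    pairs: List[Comparison] = []
    for prompt, demo in demonstrations.items():
        for sample in checkpoints:
            for seed in range(samples_per_checkpoint):
                pairs.append(Comparison(prompt, demo, sample(prompt, seed)))
    return pairs


def make_stub_checkpoint(name: str) -> SampleFn:
    """Stand-in for a real model checkpoint (assumption for this sketch)."""
    return lambda prompt, seed: f"[{name} sample {seed}] generic reply to: {prompt}"


# Two demonstrations and two checkpoints (initial model plus one later one).
demonstrations = {
    "Write an email declining a meeting.": "Can't make it Thursday. Friday at 10?",
    "Summarize the release notes.": "Three fixes, one breaking change. Details below.",
}
checkpoints = [make_stub_checkpoint("init"), make_stub_checkpoint("iter-1")]

pairs = build_comparisons(demonstrations, checkpoints)
print(len(pairs), "comparisons generated")
```

Because the demonstrations are reused against every checkpoint, a handful of them yields many comparisons; the number grows with the number of checkpoints and samples rather than with extra user effort.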
Peer Reviews
Decision·ICLR 2025 Poster
- The paper proposes a data-efficient training method that enables LLMs to follow expert demonstrations. The Reinforcement Learning from Human Feedback (RLHF) data can be continuously generated by simply comparing expert demonstrations with the intermodel's responses. This approach can also be seen as a blend of Reinforcement Learning from AI Feedback (RLAIF) and RLHF, making it a reasonable and effective method.
- The authors demonstrate the performance improvements of DITTO-trained models usin
- The authors did not investigate potential side effects, such as performance degradation on other benchmark datasets, after training with DITTO. Since the LLM is fine-tuned exclusively on targeted demonstrations, there’s a risk of significant performance drops in broader tasks. It is essential to preserve the LLM's original knowledge and abilities while adjusting its output to align with a specific style and preference.
- They also overlook the computational inefficiency of iterative training in
The paper introduces DITTO, a novel method designed to guide LLMs toward specific settings for effective customization, achieving sample efficiency with fewer than 10 demonstrations. DITTO outperforms strong baselines, including SFT and GPT-4 with few-shot prompting. Additionally, a detailed user study further reinforces the reliability of DITTO.
1. The static experiments in Section 4.1 are not particularly convincing. Have you considered testing additional baselines or employing other automatic evaluation methods, such as calculating sentence embedding similarity to compare styles?
2. Have you evaluated DITTO on more benchmarks or tested its generalization ability? I noticed that only three authors were used for validation or testing. Can the DITTO method generalize to tasks beyond writing?
- DITTO introduces a new approach to user-specific alignment by using a small set of demonstrations to generate online comparison data. This is innovative and practical for settings where data collection is costly.
- The paper provides a strong theoretical justification for DITTO, grounding it in online imitation learning. The derivation explains why DITTO can outperform traditional methods like SFT in low-data scenarios.
- The paper completes various experiments, demonstrating DITTO’s effective
- Limited exploration is done into how DITTO scales to broader and more diverse tasks that may require a more generalized alignment. This is seen in how the experiments primarily focus on a small number of demonstrations.
- DITTO’s approach heavily relies on the quality of user-provided demonstrations. If demonstrations are unclear or poorly constructed, the alignment could suffer. This could limit DITTO’s real-world applicability when high-quality demonstrations are not readily available.
- The
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: ALIGN
