AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Yann Dubois; Xuechen Li; Rohan Taori; Tianyi Zhang; Ishaan Gulrajani; Jimmy Ba; Carlos Guestrin; Percy Liang; Tatsunori B. Hashimoto

arXiv:2305.14387 · cs.LG · January 9, 2024 · 54 cites

TL;DR

AlpacaFarm is a low-cost simulation framework for developing, evaluating, and benchmarking methods that train large language models from human feedback.

Contribution

The paper introduces AlpacaFarm, a simulation environment that reduces data collection costs, provides trustworthy automatic evaluation, and offers reference implementations for feedback-based learning methods.

Findings

01. Methods that use a reward model substantially improve over supervised fine-tuning.

02. The reference PPO implementation yields a +10% improvement in win-rate against Davinci003.

03. Model rankings in AlpacaFarm align with real human feedback.
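
One way to make the ranking-agreement finding concrete is a rank correlation between the scores each method earns under simulated feedback and under human feedback. The sketch below uses Spearman correlation as an illustrative choice; this page does not state which statistic the paper reports. The function names and the toy win-rates are hypothetical and are not results from the paper.

```python
from typing import List, Sequence


def ranks(scores: Sequence[float]) -> List[int]:
    """Return rank positions (1 = highest score). Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation for equal-length, tie-free score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    # Sum of squared differences between the two rankings.
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical win-rates for four methods (illustration only).
simulated_win_rates = [0.45, 0.52, 0.38, 0.60]
human_win_rates = [0.47, 0.50, 0.41, 0.55]

print(spearman(simulated_win_rates, human_win_rates))
```

A value near 1 means the simulator orders the methods the same way human annotators do, even when the absolute win-rates differ.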

Abstract

Large language models (LLMs) such as ChatGPT have seen widespread adoption due to their strong instruction-following abilities. Developing these LLMs involves a complex yet poorly understood workflow requiring training with human feedback. Replicating and understanding this instruction-following requires tackling three major challenges: the high cost of data collection, the lack of trustworthy evaluation, and the absence of reference method implementations. We address these challenges with AlpacaFarm, a simulator that enables research and development for learning from feedback at a low cost. First, we design LLM prompts to simulate human feedback that are 50x cheaper than crowdworkers and display high agreement with humans. Second, we propose an automatic evaluation and validate it against human instructions obtained on real-world interactions. Third, we contribute reference…
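
The pairwise feedback and win-rate mentioned above can be illustrated with a minimal sketch. It assumes each comparison yields a binary label (1 if the annotator, human or simulated, prefers the candidate model's output, 0 if it prefers the baseline's) and that ties are excluded beforehand. The function name and the toy labels are hypothetical; this is not the AlpacaFarm API.

```python
from typing import Sequence


def win_rate(preferences: Sequence[int]) -> float:
    """Fraction of pairwise comparisons won by the candidate model.

    Each entry is 1 if the candidate's output was preferred over the
    baseline's output for the same instruction, and 0 otherwise.
    """
    if not preferences:
        raise ValueError("need at least one comparison")
    if any(p not in (0, 1) for p in preferences):
        raise ValueError("preferences must be 0 or 1")
    return sum(preferences) / len(preferences)


# Hypothetical toy labels (illustration only, not data from the paper).
toy_preferences = [1, 0, 1, 1, 0, 1, 0, 1]
print(f"win-rate: {win_rate(toy_preferences):.3f}")
```

Under this definition, a win-rate of 0.5 against a baseline means the two models are preferred equally often.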

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education

Methods: Entropy Regularization · Proximal Policy Optimization