Aligning Large Language Models through Synthetic Feedback
Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo

TL;DR
This paper introduces ALMoST, a method for aligning large language models with synthetic feedback. It reduces reliance on human annotations and proprietary models, and it outperforms recent open-source models such as Alpaca and Dolly-v2 on alignment benchmarks.
Contribution
The paper presents a new framework for LLM alignment using synthetic feedback for reward modeling, eliminating the need for extensive human data or proprietary models.
Findings
ALMoST outperforms recent open-source models in alignment benchmarks.
Synthetic feedback effectively guides LLM alignment without human annotations.
Human evaluations favor ALMoST over Alpaca and Dolly-v2.
Abstract
Aligning large language models (LLMs) to human values has become increasingly important as it enables sophisticated steering of LLMs. However, it requires significant human demonstrations and feedback or distillation from proprietary LLMs such as ChatGPT. In this work, we propose a novel alignment learning framework with synthetic feedback not dependent on extensive human annotations and proprietary LLMs. First, we perform reward modeling (RM) with synthetic feedback by contrasting responses from vanilla LLMs with various sizes and prompts. Then, we use the RM to simulate high-quality demonstrations to train a supervised policy and further optimize the model with reinforcement learning. Our resulting model, Aligned Language Model with Synthetic Training dataset (ALMoST), outperforms recent open-sourced models, which are trained on the outputs of InstructGPT or human-annotated…
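The reward-modeling step described in the abstract can be illustrated with a small sketch. The abstract only states that responses from vanilla LLMs of various sizes and prompts are contrasted. The ranking rule below (a response from a larger model given more in-context demonstrations is preferred) and the pairwise logistic loss are assumptions made for illustration, and `GenConfig`, `build_synthetic_pairs`, and `pairwise_loss` are hypothetical names rather than the authors' code.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class GenConfig:
    """Hypothetical generator setup: model size (billions) and number of demonstrations."""
    model_size_b: float
    num_shots: int


def dominates(a: GenConfig, b: GenConfig) -> bool:
    # Assumption: a setup is "better" if it is at least as large and uses
    # at least as many demonstrations, and is not the identical setup.
    return a != b and a.model_size_b >= b.model_size_b and a.num_shots >= b.num_shots


def build_synthetic_pairs(responses: dict[GenConfig, str]) -> list[tuple[str, str]]:
    """Return (chosen, rejected) response pairs without any human labels."""
    pairs = []
    for a, b in combinations(responses, 2):
        if dominates(a, b):
            pairs.append((responses[a], responses[b]))
        elif dominates(b, a):
            pairs.append((responses[b], responses[a]))
        # Incomparable setups (bigger model but fewer shots) are skipped.
    return pairs


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise logistic loss: -log(sigmoid(r_chosen - r_rejected))."""
    return math.log1p(math.exp(-(reward_chosen - reward_rejected)))


if __name__ == "__main__":
    # Placeholder responses to one prompt from four hypothetical setups.
    responses = {
        GenConfig(13, 3): "resp_13b_3shot",
        GenConfig(13, 1): "resp_13b_1shot",
        GenConfig(1, 3): "resp_1b_3shot",
        GenConfig(1, 1): "resp_1b_1shot",
    }
    for chosen, rejected in build_synthetic_pairs(responses):
        print(f"{chosen} > {rejected}")
```

A reward model trained on such pairs would be pushed to score the chosen response above the rejected one. The loss shrinks as the reward margin grows in the chosen response's favor.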
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Multi-Head Attention · Attention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Adam
