Aligning Language Models Using Follow-up Likelihood as Reward Signal
Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li

TL;DR
This paper introduces a novel reward signal based on follow-up utterance likelihood to improve language model alignment without human annotations, achieving competitive performance on preference benchmarks.
Contribution
The paper proposes Follow-up Likelihood as Reward (FLR), a reward-modeling method that scores a response by the likelihood of the user's follow-up utterance, and demonstrates its effectiveness in aligning language models.
Findings
FLR matches strong reward models trained on large-scale human or GPT-4 annotated data on 8 pairwise-preference and 4 rating-based benchmarks.
Mining preference data from model generations boosts helpfulness.
Fine-tuning models with natural language feedback enhances FLR performance.
Abstract
In natural human-to-human conversations, participants often receive feedback signals from one another based on their follow-up reactions. These reactions can include verbal responses, facial expressions, changes in emotional state, and other non-verbal cues. Similarly, in human-machine interactions, the machine can leverage the user's follow-up utterances as feedback signals to assess whether it has appropriately addressed the user's request. Therefore, we propose using the likelihood of follow-up utterances as rewards to differentiate preferred responses from less favored ones, without relying on human or commercial LLM-based preference annotations. Our proposed reward mechanism, "Follow-up Likelihood as Reward" (FLR), matches the performance of strong reward models trained on large-scale human or GPT-4 annotated data on 8 pairwise-preference and 4 rating-based benchmarks. Building…
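The core idea in the abstract, scoring a response by how likely a follow-up utterance is after it, can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt template, the choice of positive follow-ups, the averaging over follow-ups, and the names `follow_up_reward`, `prefer`, and `toy_log_likelihood` are all assumptions made for illustration. In practice the `log_likelihood` callable would wrap a real language model; here a toy stand-in keeps the example self-contained.

```python
from typing import Callable, Sequence

# Hypothetical interface: returns log p(continuation | prefix) under some language model.
LogLikelihoodFn = Callable[[str, str], float]


def follow_up_reward(
    log_likelihood: LogLikelihoodFn,
    instruction: str,
    response: str,
    follow_ups: Sequence[str],
) -> float:
    """Average log-likelihood of the follow-ups given (instruction, response).

    Assumption: the follow-ups express satisfaction, so a higher value
    suggests a better response. The dialogue template is also an assumption.
    """
    if not follow_ups:
        raise ValueError("at least one follow-up utterance is required")
    prefix = f"User: {instruction}\nAssistant: {response}\nUser: "
    total = sum(log_likelihood(prefix, f) for f in follow_ups)
    return total / len(follow_ups)


def prefer(
    log_likelihood: LogLikelihoodFn,
    instruction: str,
    response_a: str,
    response_b: str,
    follow_ups: Sequence[str],
) -> str:
    """Return "a" or "b", whichever response earns the higher reward (ties go to "a")."""
    reward_a = follow_up_reward(log_likelihood, instruction, response_a, follow_ups)
    reward_b = follow_up_reward(log_likelihood, instruction, response_b, follow_ups)
    return "a" if reward_a >= reward_b else "b"


def toy_log_likelihood(prefix: str, continuation: str) -> float:
    """Toy stand-in for a language model, for demonstration only.

    It pretends a satisfied follow-up is more likely when the dialogue
    so far contains the correct answer ("Paris").
    """
    return -1.0 if "Paris" in prefix else -5.0


instruction = "What is the capital of France?"
good = "The capital of France is Paris."
bad = "I am not sure."
follow_ups = ["Thanks, that helps!", "Great, that answers my question."]

choice = prefer(toy_log_likelihood, instruction, good, bad, follow_ups)
```

With the toy scorer, `choice` is `"a"`, because the satisfied follow-ups are assigned a higher log-likelihood after the response that contains the answer. Swapping in a scorer backed by an actual language model changes only the `log_likelihood` argument.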
Taxonomy
Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections
