Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

Zhengxuan Wu; Yuhao Zhang; Peng Qi; Yumo Xu; Rujun Han; Yian Zhang; Jifan Chen; Bonan Min; Zhiheng Huang

arXiv:2407.21417·cs.CL·August 1, 2024

TL;DR

This paper investigates the trade-off between instruction following and faithfulness in language models and proposes ReSet, a rejection-sampling method that outperforms vanilla multi-task learning while using less data.

Contribution

It introduces ReSet (Rejection Sampling for Continued Self-instruction Tuning), a rejection sampling-based approach that outperforms vanilla multi-task learning while using less data.

Findings

01

ReSet outperforms vanilla multi-task learning.

02

Training on a smaller amount of high-quality data yields better results.

03

Identifies a trade-off between instruction following and faithfulness as training objectives.

Abstract

Modern language models (LMs) need to follow human instructions while being faithful; yet, they often fail to achieve both. Here, we provide concrete evidence of a trade-off between instruction following (i.e., follow open-ended instructions) and faithfulness (i.e., ground responses in given context) when training LMs with these objectives. For instance, fine-tuning LLaMA-7B on instruction following datasets renders it less faithful. Conversely, instruction-tuned Vicuna-7B shows degraded performance at following instructions when further optimized on tasks that require contextual grounding. One common remedy is multi-task learning (MTL) with data mixing, yet it remains far from achieving a synergic outcome. We propose a simple yet effective method that relies on Rejection Sampling for Continued Self-instruction Tuning (ReSet), which significantly outperforms vanilla MTL. Surprisingly, we…
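The rejection-sampling idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`select_best`, `build_dataset`), the toy scorers, the `min_score` threshold, and the choice to combine scores with a minimum are all assumptions made for illustration. The sketch samples several candidate responses per prompt, scores each one on both objectives, and keeps only the best candidate that clears the threshold as a fine-tuning example.

```python
import re
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[str, str], float]  # (prompt, response) -> score in [0, 1]


def select_best(prompt: str, candidates: List[str],
                scorers: List[Scorer], min_score: float) -> Optional[str]:
    """Return the highest-scoring candidate, or None if every one is rejected."""
    best, best_score = None, float("-inf")
    for cand in candidates:
        # Assumption: take the minimum so a response must do well on every objective.
        combined = min(scorer(prompt, cand) for scorer in scorers)
        if combined >= min_score and combined > best_score:
            best, best_score = cand, combined
    return best


def build_dataset(prompts: List[str],
                  generate: Callable[[str, int], List[str]],
                  scorers: List[Scorer],
                  num_samples: int = 3,
                  min_score: float = 0.8) -> List[Tuple[str, str]]:
    """Sample candidates per prompt and keep only the accepted (prompt, response) pairs."""
    dataset = []
    for prompt in prompts:
        kept = select_best(prompt, generate(prompt, num_samples), scorers, min_score)
        if kept is not None:
            dataset.append((prompt, kept))
    return dataset


# Hypothetical stand-ins for real judges; a real pipeline would use learned or LM-based scorers.
def toy_instruction_score(prompt: str, response: str) -> float:
    # Proxy: reward responses that end as a complete sentence.
    return 1.0 if response.endswith(".") else 0.0


def toy_faithfulness_score(prompt: str, response: str) -> float:
    # Proxy: fraction of response words that also appear in the prompt (context).
    context = set(re.findall(r"\w+", prompt.lower()))
    tokens = re.findall(r"\w+", response.lower())
    return sum(t in context for t in tokens) / len(tokens) if tokens else 0.0


PROMPTS = [
    "Context: the sky is blue. Question: what color is the sky?",
    "Context: cats purr. Question: what do cats do?",
]
# Hypothetical fixed samples standing in for model generations.
CANDIDATES = {
    PROMPTS[0]: ["The sky is blue.", "The sky is green.", "blue"],
    PROMPTS[1]: ["Dogs bark loudly."],
}


def toy_generate(prompt: str, k: int) -> List[str]:
    return CANDIDATES[prompt][:k]


dataset = build_dataset(PROMPTS, toy_generate,
                        [toy_instruction_score, toy_faithfulness_score])
print(dataset)
```

In this toy run the second prompt contributes nothing because its only candidate is rejected, which mirrors the general property of rejection sampling: the resulting training set is smaller than the prompt pool but filtered for quality on both objectives.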

Code & Models

Repositories

frankaging/dancing-in-chains (Official)

Videos

Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models· underline

Taxonomy

Topics: Natural Language Processing Techniques