Does Instruction Tuning Make LLMs More Consistent?

Constanza Fierro; Jiaang Li; Anders Søgaard

arXiv:2404.15206·cs.CL·October 4, 2024

TL;DR

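This paper investigates how instruction tuning affects the consistency of large language models, i.e., their sensitivity to small perturbations in the input. Comparing 10 instruction-tuned LLaMA models to the original LLaMA-7b, it finds that consistency improves almost across the board, in both representations and predictions.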

Contribution

The paper compares 10 instruction-tuned LLaMA models against the original LLaMA-7b and shows that instruction tuning improves consistency in most cases. It explains the improvements through mechanistic analyses of factual recall.

Findings

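01

Instruction-tuned LLaMA models are, almost across the board, more consistent than the original LLaMA-7b.

02

The gains appear both in the models' representations and in their predictions on zero-shot and downstream tasks.

03

Mechanistic analyses of factual recall are used to explain the improved consistency.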

Abstract

The purpose of instruction tuning is to enable zero-shot performance, but instruction tuning has also been shown to improve chain-of-thought reasoning and value alignment (Si et al., 2023). Here we consider the impact on consistency, i.e., the sensitivity of language models to small perturbations in the input. We compare 10 instruction-tuned LLaMA models to the original LLaMA-7b model and show that almost across the board they become more consistent, both in terms of their representations and their predictions in zero-shot and downstream tasks. We explain these improvements through mechanistic analyses of factual recall.
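
To make the notion of prediction consistency concrete, the following is a minimal sketch of one common way to score it: the fraction of paraphrase pairs for which a model gives the same answer. This is an illustrative assumption, not necessarily the metric used in the paper. The function name `pairwise_consistency` and the example predictions are hypothetical.

```python
from itertools import combinations


def pairwise_consistency(predictions):
    """Return the fraction of prediction pairs that are identical.

    `predictions` holds one model answer per paraphrase of the same query.
    With fewer than two predictions there is nothing to disagree on,
    so the score is defined as 1.0 (an assumption of this sketch).
    """
    pairs = list(combinations(predictions, 2))
    if not pairs:
        return 1.0
    agreeing = sum(a == b for a, b in pairs)
    return agreeing / len(pairs)


# Hypothetical answers to four paraphrases of one factual question.
raw_predictions = ["Paris", "Paris", "paris", "Lyon"]

# Normalize case and whitespace so surface differences do not count
# as disagreements (a design choice of this sketch).
normalized = [p.strip().lower() for p in raw_predictions]

score = pairwise_consistency(normalized)
print(f"pairwise consistency: {score:.2f}")
```

A score of 1.0 means the answer did not change under any paraphrase; lower values indicate higher sensitivity to the perturbations. Averaging this score over many queries gives a single number per model that can be compared before and after instruction tuning.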

Taxonomy

Topics: Natural Language Processing Techniques · Artificial Intelligence in Law · Text Readability and Simplification

Methods: LLaMA