Instruction-tuning Aligns LLMs to the Human Brain
Khai Loong Aw, Syrielle Montariol, Badr AlKhamissi, Martin Schrimpf, Antoine Bosselut

TL;DR
This study investigates how instruction-tuning large language models affects their similarity to human brain activity and behavior, revealing increased brain alignment but unchanged behavioral alignment, with model size and world knowledge as key factors.
Contribution
The paper provides empirical evidence that instruction-tuning enhances brain-like representations in LLMs and identifies key factors influencing this alignment, such as model size and world knowledge.
Findings
Instruction-tuning increases brain alignment by approximately 6%.
Model size strongly correlates with brain alignment (r=0.95).
Performance on world knowledge tasks correlates with brain alignment (r=0.81).
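The r values above are correlation coefficients. As a minimal sketch, assuming they are Pearson correlations (the summary does not name the statistic), the computation looks like the following. The model sizes and alignment scores in the usage lines are made-up placeholders for illustration, not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, for illustration only (NOT taken from the paper):
# parameter counts in billions and an arbitrary alignment score per model.
sizes_billion = [0.1, 0.8, 3.0, 11.0, 33.0]
alignment = [0.20, 0.24, 0.27, 0.31, 0.33]

# Model size is often compared on a log scale; that choice is an assumption here.
log_sizes = [math.log10(s) for s in sizes_billion]
r = pearson_r(log_sizes, alignment)
print(f"r = {r:.2f}")
```

A value near 1 means larger models tend to score higher on the alignment measure; it does not by itself say that size causes the alignment.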
Abstract
Instruction-tuning is a widely adopted finetuning method that enables large language models (LLMs) to generate output that more closely resembles human responses. However, no studies have shown that instruction-tuning actually teaches LLMs to process language in a similar manner as humans. We investigate the effect of instruction-tuning on aligning LLM and human language processing mechanisms in two ways: (1) brain alignment, the similarity of LLM internal representations to neural activity in the human language system, and (2) behavioral alignment, the similarity of LLM and human behavior on a reading task. We assess 25 vanilla and instruction-tuned LLMs on three datasets involving humans reading naturalistic stories and sentences, and find that instruction-tuning generally enhances brain alignment (~6%), but has no similar effect on behavioral alignment. To identify factors underlying…
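The abstract defines brain alignment as the similarity of LLM internal representations to neural activity, but the excerpt here does not say how that similarity is measured. One common approach in this area is linear predictivity: fit a regularized linear map from model activations to recorded responses on some stimuli, then correlate predicted and actual responses on held-out stimuli. The sketch below assumes that approach; the function name, the ridge penalty `alpha`, the simple train/test split, and the synthetic data are all illustrative assumptions, not the authors' pipeline.

```python
import numpy as np


def linear_predictivity(features, responses, n_train, alpha=1.0):
    """Mean held-out Pearson r between ridge-predicted and actual responses.

    features:  (n_stimuli, n_dims) model activations, one row per stimulus
    responses: (n_stimuli, n_units) neural responses for the same stimuli
    n_train:   number of leading rows used to fit the linear map
    alpha:     ridge penalty (hypothetical default, not from the paper)
    """
    X_train, X_test = features[:n_train], features[n_train:]
    Y_train, Y_test = responses[:n_train], responses[n_train:]

    # Center with training statistics so the fit needs no intercept term.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    Xc = X_train - x_mean
    Yc = Y_train - y_mean

    # Closed-form ridge regression: (X^T X + alpha I) W = X^T Y.
    n_dims = Xc.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_dims), Xc.T @ Yc)
    Y_pred = (X_test - x_mean) @ W + y_mean

    # Pearson correlation per response unit on the held-out stimuli.
    P = Y_pred - Y_pred.mean(axis=0)
    T = Y_test - Y_test.mean(axis=0)
    r = (P * T).sum(axis=0) / np.sqrt((P ** 2).sum(axis=0) * (T ** 2).sum(axis=0))
    return float(np.mean(r))


# Synthetic demonstration (random data, NOT from the paper).
rng = np.random.default_rng(0)
activations = rng.normal(size=(200, 10))
true_map = rng.normal(size=(10, 5))
related = activations @ true_map + 0.1 * rng.normal(size=(200, 5))
unrelated = rng.normal(size=(200, 5))

score_related = linear_predictivity(activations, related, n_train=100)
score_unrelated = linear_predictivity(activations, unrelated, n_train=100)
print(score_related, score_unrelated)
```

Responses that are a noisy linear function of the activations score close to 1, while unrelated responses score near 0. Published pipelines typically add cross-validation and a noise-ceiling normalization, both omitted here for brevity.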
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Materials Science
