Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function

Keyon Vafa; Ashesh Rambachan; Sendhil Mullainathan

arXiv:2406.01382 · cs.CL · June 4, 2024 · 5 cites

Open Access · 1 Repo

TL;DR

This paper studies how people's beliefs about LLM performance shape deployment decisions. It models and predicts how humans generalize from observed LLM behavior, and shows that more capable models can be misaligned with human expectations, especially in high-stakes settings.

Contribution

The paper introduces a dataset of 19K examples of human generalization across 79 tasks from MMLU and BIG-Bench, shows that these generalization patterns can be predicted with NLP methods, and evaluates how well LLMs align with human expectations.

Findings

01 Humans generalize in consistent, structured ways.

02 More capable models can perform worse on the tasks people expect them to handle.

03 Measuring alignment with the human generalization function clarifies LLM deployment risks.

Abstract

What makes large language models (LLMs) impressive is also what makes them hard to evaluate: their diversity of uses. To evaluate these models, we must understand the purposes they will be used for. We consider a setting where these deployment decisions are made by people, and in particular, people's beliefs about where an LLM will perform well. We model such beliefs as the consequence of a human generalization function: having seen what an LLM gets right or wrong, people generalize to where else it might succeed. We collect a dataset of 19K examples of how humans make generalizations across 79 tasks from the MMLU and BIG-Bench benchmarks. We show that the human generalization function can be predicted using NLP methods: people have consistent structured ways to generalize. We then evaluate LLM alignment with the human generalization function. Our results show that -- especially for…
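To make the idea of a human generalization function concrete, the sketch below treats it as a belief update. A person holds a prior belief that the LLM will answer a target question correctly, sees whether the LLM got a different question right or wrong, and forms a new belief. Everything here is a hypothetical toy for illustration: the `Observation` record, the `toy_generalization` and `belief_shift` helpers, and the similarity-weighted update rule are assumptions, not the paper's dataset schema or its method.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One hypothetical record: what a person saw before judging a target question."""
    llm_correct: bool   # did the LLM answer the observed question correctly?
    similarity: float   # assumed relatedness of observed and target questions, in [0, 1]
    prior: float        # belief in [0, 1] that the LLM gets the target question right


def toy_generalization(obs: Observation, strength: float = 0.5) -> float:
    """Toy update rule (an assumption): move the belief toward 1 after a success
    and toward 0 after a failure, scaled by how related the two questions are."""
    target = 1.0 if obs.llm_correct else 0.0
    weight = strength * obs.similarity
    return (1.0 - weight) * obs.prior + weight * target


def belief_shift(obs: Observation, strength: float = 0.5) -> float:
    """Signed change in belief caused by the observation."""
    return toy_generalization(obs, strength) - obs.prior


if __name__ == "__main__":
    related_success = Observation(llm_correct=True, similarity=1.0, prior=0.5)
    unrelated_failure = Observation(llm_correct=False, similarity=0.0, prior=0.6)
    print(belief_shift(related_success))    # belief rises after a related success
    print(belief_shift(unrelated_failure))  # no change when the questions are unrelated
```

In this toy, alignment would mean that the model's actual accuracy on the target question tracks the updated belief. A model is misaligned when people's beliefs rise after a success but the model still fails on the questions they then choose to rely on it for.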

Code & Models

Repositories

keyonvafa/human-generalization-llms
PyTorch · Official

Taxonomy

Topics: Computational and Text Analysis Methods · Computational Physics and Python Applications · Topic Modeling