Bayesian Statistical Modeling with Predictors from LLMs

Michael Franke; Polina Tsvilodub; Fausto Carcassi

arXiv:2406.09012 · cs.CL · June 14, 2024

Open Access

TL;DR

This paper uses Bayesian statistical models to evaluate how human-like LLM predictions are in multiple-choice decision tasks. It finds that LLMs do not capture item-level variance in the human data, although some methods of deriving condition-level predictions approximate aggregate behavior adequately.

Contribution

It introduces Bayesian statistical modeling to assess LLMs' alignment with human data and explores methods to derive meaningful distributional predictions from LLMs.

Findings

01

LLMs do not capture variance at the individual item level.

02

Some methods of deriving condition-level predictions fit human data adequately.

03

Assessment of LLM performance depends on methodological choices.
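
The following is a minimal sketch, not the authors' model, of how LLM-derived scores could enter a Bayesian model as predictors of forced-choice data. The softmax link, the scaling parameter `alpha`, the uniform grid prior, the function names, and all numbers are illustrative assumptions that do not come from the paper. The sketch computes a grid-approximated posterior over `alpha` from per-item human choice counts.

```python
import math

def softmax(scores, alpha):
    """Map per-option scores to choice probabilities; alpha scales sharpness."""
    scaled = [alpha * s for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def multinomial_loglik(counts, probs):
    """Log-likelihood of choice counts, up to the constant multinomial coefficient."""
    return sum(c * math.log(p) for c, p in zip(counts, probs))

def grid_posterior(llm_scores, human_counts, alphas):
    """Posterior over alpha on a grid, assuming a uniform prior over grid points."""
    logliks = []
    for alpha in alphas:
        ll = 0.0
        for scores, counts in zip(llm_scores, human_counts):
            ll += multinomial_loglik(counts, softmax(scores, alpha))
        logliks.append(ll)
    m = max(logliks)
    weights = [math.exp(ll - m) for ll in logliks]
    z = sum(weights)
    return [w / z for w in weights]

# Made-up illustration data (assumption): 3 items, 3 response options each.
llm_scores = [[-1.0, -2.0, -3.0], [-1.5, -1.0, -4.0], [-0.5, -2.5, -2.0]]
human_counts = [[20, 8, 2], [10, 18, 2], [22, 3, 5]]
alphas = [0.25 * k for k in range(1, 21)]  # grid from 0.25 to 5.0

posterior = grid_posterior(llm_scores, human_counts, alphas)
map_alpha = alphas[posterior.index(max(posterior))]
print("MAP alpha on the grid:", map_alpha)
```

Here each item gets its own predicted distribution, which corresponds to an item-level prediction. Averaging the predicted probabilities over all items in a condition before computing the likelihood would give one possible condition-level variant.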

Abstract

State-of-the-art large language models (LLMs) have shown impressive performance on a variety of benchmark tasks and are increasingly used as components in larger applications, where LLM-based predictions serve as proxies for human judgements or decisions. This raises questions about the human-likeness of LLM-derived information, alignment with human intuition, and whether LLMs could possibly be considered (parts of) explanatory models of (aspects of) human cognition or language use. To shed more light on these issues, we here investigate the human-likeness of LLMs' predictions for multiple-choice decision tasks from the perspective of Bayesian statistical modeling. Using human data from a forced-choice experiment on pragmatic language use, we find that LLMs do not capture the variance in the human data at the item level. We suggest different ways of deriving full distributional…

Taxonomy

Topics: Machine Learning and Data Classification