Comparing the Framing Effect in Humans and LLMs on Naturally Occurring Texts
Gili Lior, Liron Nacchace, Gabriel Stanovsky

TL;DR
This study compares how humans and large language models respond to framing effects in naturally occurring texts, revealing similarities and differences in susceptibility and raising questions about model development goals.
Contribution
Introduces WildFrame, a new dataset for evaluating LLM responses to framing in real-world texts, and compares model behavior to human responses on this dataset.
Findings
All models respond to framing with some correlation to human behavior.
Both humans and models are influenced more by positive reframing than by negative reframing.
GPT models show the least alignment with human framing responses.
Abstract
Humans are influenced by how information is presented, a phenomenon known as the framing effect. Prior work suggests that LLMs may also be susceptible to framing, but it relied on synthetic data and did not compare against human behavior. To address this gap, we introduce WildFrame, a dataset for evaluating LLM responses to positive and negative framing in naturally occurring sentences, alongside human responses on the same data. WildFrame consists of 1,000 real-world texts selected to convey a clear sentiment; we then reframe each text in either a positive or negative light and collect human sentiment annotations. Evaluating eleven LLMs on WildFrame, we find that all models respond to reframing in a human-like manner, and that both humans and models are influenced more by positive than negative reframing. Notably, GPT models are the least correlated with human behavior…
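The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the label strings, and the choice of Pearson correlation are all assumptions made for illustration, and the toy values are invented rather than taken from WildFrame. The sketch computes how often a sentiment label changes after reframing, then correlates per-text human and model responses.

```python
from math import sqrt


def flip_rate(base_labels, reframed_labels):
    """Fraction of texts whose sentiment label changes after reframing.

    Both arguments are equal-length sequences of labels
    (hypothetical values such as "positive" / "negative").
    """
    if len(base_labels) != len(reframed_labels):
        raise ValueError("label sequences must have the same length")
    if not base_labels:
        raise ValueError("label sequences must not be empty")
    flips = sum(b != r for b, r in zip(base_labels, reframed_labels))
    return flips / len(base_labels)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sqrt(var_x) * sqrt(var_y))


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    base = ["positive", "negative", "positive", "negative"]
    reframed = ["negative", "negative", "positive", "positive"]
    print("flip rate:", flip_rate(base, reframed))

    # Hypothetical per-text scores: share of human annotators whose label
    # flipped, and a model's probability of flipping on the same text.
    human_shift = [0.8, 0.1, 0.3, 0.6]
    model_shift = [0.9, 0.2, 0.2, 0.7]
    print("human-model correlation:", pearson(human_shift, model_shift))
```

A higher correlation in this sketch would mean the model's label changes track the human annotators' changes on the same texts, which is the sense in which the abstract calls model behavior human-like.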
