Can Large Language Models Capture Dissenting Human Voices?

Noah Lee; Na Min An; James Thorne

arXiv:2305.13788 · cs.CL · October 30, 2023 · 1 citation

TL;DR

This paper investigates whether large language models can accurately reflect human disagreement in natural language inference tasks, revealing their limited ability to do so and raising questions about their true understanding and representativeness.

Contribution

It introduces two methods for evaluating how well LLMs align with human disagreement and demonstrates that LLMs fall short of capturing the full spectrum of human opinions.

Findings

01. LLMs show limited performance on NLI tasks.

02. LLMs fail to capture the distribution of human disagreement.

03. Performance drops further on highly disagreement-prone data samples.
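
One way to make "highly disagreement-prone" concrete is to score each sample by the entropy of its human label counts and then split the data at a cutoff. The sketch below is an illustration only: the use of normalized entropy, the function names, and the 0.5 threshold are assumptions, not details taken from the paper.

```python
import math


def normalized_entropy(label_counts):
    """Entropy of the annotator label counts, scaled to [0, 1].

    0 means all annotators agree; 1 means votes are spread evenly.
    """
    if len(label_counts) < 2:
        raise ValueError("need at least two label classes")
    total = sum(label_counts)
    if total == 0:
        raise ValueError("no annotations for this sample")
    probs = [count / total for count in label_counts]
    entropy = sum(-p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(len(label_counts))


def split_by_disagreement(samples, threshold=0.5):
    """Partition samples into (low, high) disagreement groups.

    `samples` maps a sample id to its label counts; `threshold` is a
    hypothetical cutoff chosen for illustration.
    """
    low, high = [], []
    for sample_id, counts in samples.items():
        bucket = high if normalized_entropy(counts) >= threshold else low
        bucket.append(sample_id)
    return low, high
```

Model accuracy and alignment could then be reported separately for the two groups to check whether the gap widens on the high-disagreement side.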

Abstract

Large language models (LLMs) have shown impressive achievements in solving a broad range of tasks. Augmented by instruction fine-tuning, LLMs have also been shown to generalize in zero-shot settings. However, whether LLMs closely align with the human disagreement distribution has not been well-studied, especially within the scope of natural language inference (NLI). In this paper, we evaluate the performance and alignment of LLM distribution with humans using two different techniques to estimate the multinomial distribution: Monte Carlo Estimation (MCE) and Log Probability Estimation (LPE). As a result, we show LLMs exhibit limited ability in solving NLI tasks and simultaneously fail to capture human disagreement distribution. The inference and human alignment performances plunge even further on data samples with high human disagreement levels, raising concerns about their…
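
The two estimation techniques named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: Monte Carlo Estimation is taken to mean counting labels over repeated sampled generations, Log Probability Estimation is taken to mean normalizing the model's log probabilities for the label tokens, and Jensen-Shannon divergence is used as one common way to compare a model distribution with the human one (this listing does not state which measure the paper uses). All function names and the three-label set are hypothetical.

```python
import math
from collections import Counter

# Assumed three-way NLI label set.
LABELS = ("entailment", "neutral", "contradiction")


def monte_carlo_estimate(sampled_labels, labels=LABELS):
    """Estimate the label distribution from repeated sampled answers."""
    counts = Counter(label for label in sampled_labels if label in labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid labels among the samples")
    return [counts[label] / total for label in labels]


def log_prob_estimate(label_logprobs, labels=LABELS):
    """Turn per-label log probabilities into a distribution via softmax."""
    scores = [label_logprobs[label] for label in labels]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    normalizer = sum(exps)
    return [value / normalizer for value in exps]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)
```

In use, either estimate would be compared against the normalized human annotation counts for the same premise-hypothesis pair, for example `js_divergence(monte_carlo_estimate(samples), human_distribution)`, where lower values indicate closer alignment.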

Code & Models

Repositories

xfactlab/emnlp2023-llm-disagreement (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: fail · ALIGN