HLB: Benchmarking LLMs' Humanlikeness in Language Use
Xufeng Duan, Bei Xiao, Xuemei Tang, Zhenguang G. Cai

TL;DR
This paper introduces HLB, a benchmark that uses 10 psycholinguistic experiments to evaluate how closely 20 large language models mimic human language use across linguistic levels (sound, word, syntax, semantics, and discourse). It finds fine-grained differences between model and human responses and shows that gains on traditional performance metrics do not necessarily make a model more humanlike.
Contribution
The paper presents the first systematic framework for assessing LLMs' humanlikeness in language use through psycholinguistic experiments and a novel response distribution comparison method.
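As a rough illustration of what comparing response distributions can look like, here is a minimal sketch that scores a model against human responses on one experimental item. It assumes responses have already been coded into discrete categories and uses Jensen-Shannon divergence as the distance. That choice, and the names `to_distribution`, `js_divergence`, and `humanlikeness`, are assumptions made for this example and are not confirmed by the summary above as the paper's actual method.

```python
import math
from collections import Counter


def to_distribution(responses, categories):
    """Turn a list of coded responses into probabilities over `categories`."""
    counts = Counter(responses)
    total = sum(counts[c] for c in categories)
    if total == 0:
        raise ValueError("no responses to build a distribution from")
    return [counts[c] / total for c in categories]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), which lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def humanlikeness(human_responses, model_responses):
    """Hypothetical score: 1 minus the JS divergence of the two distributions."""
    categories = sorted(set(human_responses) | set(model_responses))
    p = to_distribution(human_responses, categories)
    q = to_distribution(model_responses, categories)
    return 1.0 - js_divergence(p, q)


if __name__ == "__main__":
    # Toy data (invented for illustration): coded responses to one item.
    human = ["A"] * 70 + ["B"] * 30
    model = ["A"] * 95 + ["B"] * 5
    print(round(humanlikeness(human, model), 3))
```

Under this sketch, a score of 1 means the model's response distribution matches the human one exactly, and a score of 0 means the two never produce the same response category.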
Findings
LLMs show fine-grained differences from human responses across linguistic levels
Improvements in traditional metrics do not necessarily increase humanlikeness
Some LLMs' responses become less humanlike despite better performance metrics
Abstract
As synthetic data becomes increasingly prevalent in training language models, particularly through generated dialogue, concerns have emerged that these models may deviate from authentic human language patterns, potentially losing the richness and creativity inherent in human communication. This highlights the critical need to assess the humanlikeness of language models in real-world language use. In this paper, we present a comprehensive humanlikeness benchmark (HLB) evaluating 20 large language models (LLMs) using 10 psycholinguistic experiments designed to probe core linguistic aspects, including sound, word, syntax, semantics, and discourse (see https://huggingface.co/spaces/XufengDuan/HumanLikeness). To anchor these comparisons, we collected responses from over 2,000 human participants and compared them to outputs from the LLMs in these experiments. For rigorous evaluation, we…
Taxonomy
Topics: Taxation and Legal Issues · Legal Language and Interpretation · Library Science and Information Systems
