Humans overrely on overconfident language models, across languages
Neil Rathi, Dan Jurafsky, Kaitlyn Zhou

TL;DR
This study reveals that large language models exhibit overconfidence across multiple languages, leading to high risks of user overreliance, and emphasizes the need for culturally aware safety evaluations.
Contribution
It provides a comprehensive analysis of multilingual epistemic marker usage and human reliance behaviors, highlighting calibration challenges in LLMs across five languages.
Findings
LLMs are overconfident across languages, often generating unwarranted certainty.
Human reliance on model outputs varies significantly across languages.
Models generate more uncertainty markers in Japanese and more certainty markers in German and Mandarin.
Abstract
As large language models (LLMs) are deployed globally, it is crucial that their responses are calibrated across languages to accurately convey uncertainty and limitations. Prior work shows that LLMs are linguistically overconfident in English, leading users to overrely on confident generations. However, the usage and interpretation of epistemic markers (e.g., 'I think it's') differ sharply across languages. Here, we study the risks of multilingual linguistic (mis)calibration, overconfidence, and overreliance across five languages to evaluate LLM safety in a global context. Our work finds that overreliance risks are high across languages. We first analyze the distribution of LLM-generated epistemic markers and observe that LLMs are overconfident across languages, frequently generating strengtheners even as part of incorrect responses. Model generations are, however, sensitive to…
Taxonomy
Topics: Language and cultural evolution · Natural Language Processing Techniques · Topic Modeling
