On the Limitations of Large Language Models (LLMs): False Attribution

Tosin Adewumi; Nudrat Habib; Lama Alkhaled; Elisa Barney

arXiv:2404.04631 · cs.CL · July 18, 2025 · 1 citation

Open Access

TL;DR

This paper introduces the Simple Hallucination Index (SHI) to measure false attribution in large language models, evaluates three open models on author prediction tasks, and analyzes their hallucination tendencies and correlations with data frequency.

Contribution

The work presents a new hallucination metric (SHI), empirically evaluates state-of-the-art LLMs on author attribution, and provides insights into their false attribution limitations.

Findings

1. On average, Mixtral 8x7B achieves the highest prediction accuracy and the lowest SHI of the three models evaluated.

2. Mixtral 8x7B nonetheless exhibits high hallucination levels for some books.

3. Prediction accuracy correlates with Wikipedia mention frequencies.

Abstract

In this work, we introduce a new hallucination metric, the Simple Hallucination Index (SHI), and provide insight into one important limitation of the parametric knowledge of large language models (LLMs): false attribution. Automatic author attribution for relatively small chunks of text is an important but challenging NLP task. We empirically evaluate 3 open state-of-the-art (SotA) LLMs in a zero-shot setting (Gemma-7B, Mixtral 8x7B, and LLaMA-2-13B). We acquired the 10 most popular books of a month, according to Project Gutenberg, divided each one into equal chunks of 400 words, and prompted each LLM to predict the author. We then randomly sampled 162 chunks per book for human evaluation, based on an error margin of 7% and a confidence level of 95%. The average results show that Mixtral 8x7B has the highest prediction accuracy, the lowest SHI, and a Pearson's…
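The chunking and sampling steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, whitespace tokenization is an assumption, and the sample-size calculation uses Cochran's formula with an optional finite population correction, which the excerpt does not confirm as the method behind the figure of 162. Without the correction, a 7% margin at 95% confidence gives 196, so the reported figure presumably depends on details (such as the number of chunks per book) that are not given here.

```python
import math
import random
from statistics import NormalDist


def chunk_words(text, size=400):
    """Split text into consecutive chunks of `size` whitespace-separated words.

    The final chunk may be shorter than `size`. Whether the authors kept or
    dropped such a remainder is not stated, so keeping it is an assumption.
    """
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def sample_size(margin, confidence, population=None, p=0.5):
    """Cochran's sample-size formula (assumed here, not confirmed by the source).

    `p=0.5` is the most conservative proportion. If `population` is given,
    the finite population correction is applied.
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z * z * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


def sample_chunks(chunks, margin=0.07, confidence=0.95, seed=0):
    """Draw a simple random sample of chunks for human evaluation."""
    n = min(len(chunks), sample_size(margin, confidence, population=len(chunks)))
    return random.Random(seed).sample(chunks, n)
```

For example, `sample_chunks(chunk_words(book_text))` would return a reproducible random subset of 400-word chunks for one book, where `book_text` is a placeholder for the plain text of a Project Gutenberg title.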

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques