Diagnosing Failures in Large Language Models' Answers: Integrating Error Attribution into Evaluation Framework
Zishan Xu, Shuyi Xie, Qingsong Lv, Shupei Xiao, Linlin Song, Sui Wenjuan, Fan Lin

TL;DR
This paper introduces a comprehensive framework and dataset for diagnosing and attributing errors in Large Language Models' answers, enabling systematic failure analysis and improved evaluation.
Contribution
It contributes three components: the Misattribution Framework (6 primary and 15 secondary error categories), the AttriData dataset for error attribution, and MisAttributionLLM, a judge model fine-tuned on AttriData.
Findings
The framework effectively categorizes errors in LLM answers.
The AttriData dataset supports detailed error attribution.
MisAttributionLLM demonstrates high accuracy and robustness.
Abstract
With the widespread application of Large Language Models (LLMs) in various tasks, mainstream LLM platforms generate massive numbers of user-model interactions daily. To efficiently analyze model performance and diagnose failures in model answers, it is essential to develop an automated framework that systematically categorizes and attributes errors. However, existing evaluation models lack error attribution capability. In this work, we establish a comprehensive Misattribution Framework with 6 primary and 15 secondary categories to facilitate in-depth analysis. Based on this framework, we present AttriData, a dataset specifically designed for error attribution, encompassing misattribution, along with the corresponding scores and feedback. We also propose MisAttributionLLM, a model fine-tuned on AttriData, which is the first general-purpose judge model capable of simultaneously…
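The abstract describes AttriData entries as carrying a misattribution label together with a score and feedback, organized under primary and secondary categories. A minimal sketch of how such a record might be represented and tallied is given here. The field names, the score scale, and the category labels are all hypothetical placeholders, since the summary does not list the actual schema or category names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AttributionRecord:
    """One judged answer. All field names are assumptions, not the AttriData schema."""
    question: str
    answer: str
    score: int                       # hypothetical numeric quality score
    feedback: str                    # free-text explanation from the judge
    primary: Optional[str] = None    # primary error category; None if no error
    secondary: Optional[str] = None  # secondary error category; None if no error


def tally_primary(records: Iterable[AttributionRecord]) -> Counter:
    """Count how often each primary error category occurs, skipping error-free answers."""
    return Counter(r.primary for r in records if r.primary is not None)


# Placeholder labels only; the real framework defines 6 primary and 15 secondary categories.
records = [
    AttributionRecord("q1", "a1", 2, "wrong fact", "placeholder_primary_1", "placeholder_secondary_1"),
    AttributionRecord("q2", "a2", 5, "correct"),
    AttributionRecord("q3", "a3", 1, "ignored the request", "placeholder_primary_2", "placeholder_secondary_4"),
    AttributionRecord("q4", "a4", 2, "wrong date", "placeholder_primary_1", "placeholder_secondary_2"),
]
counts = tally_primary(records)
```

Aggregating by primary category in this way is one simple route to the kind of systematic failure analysis the abstract motivates; a real pipeline would take the labels from the judge model's output rather than from hand-written records.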
