PredictaBoard: Benchmarking LLM Score Predictability
Lorenzo Pacchiardi, Konstantinos Voudouris, Ben Slater, Fernando Martínez-Plumed, José Hernández-Orallo, Lexin Zhou, Wout Schellaert

TL;DR
PredictaBoard introduces a benchmarking framework to evaluate how well assessors can predict LLM errors, aiming to improve the safety and reliability of large language models by focusing on their predictability.
Contribution
This paper presents a novel collaborative benchmarking framework, PredictaBoard, for evaluating the ability of score predictors to anticipate LLM errors on specific tasks.
Findings
Baseline assessors show varying prediction accuracy
PredictaBoard reveals the importance of predictability in LLM safety
Framework encourages development of more reliable LLM assessors
Abstract
Despite possessing impressive skills, Large Language Models (LLMs) often fail unpredictably, demonstrating inconsistent success in even basic common-sense reasoning tasks. This unpredictability poses a significant challenge to ensuring their safe deployment, as identifying and operating within a reliable "safe zone" is essential for mitigating risks. To address this, we present PredictaBoard, a novel collaborative benchmarking framework designed to evaluate the ability of score predictors (referred to as assessors) to anticipate LLM errors on specific task instances (i.e., prompts) from existing datasets. PredictaBoard evaluates pairs of LLMs and assessors by considering the rejection rate at different error tolerances. As such, PredictaBoard stimulates research into developing better assessors and into making LLMs more predictable, rather than only raising their average performance. We conduct…
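The abstract names the metric (rejection rate at a given error tolerance) but does not define it here. As an illustration only, the sketch below assumes one plausible reading: the assessor outputs a predicted probability of success for each prompt, prompts are accepted from most to least confident, and the rejection rate is the smallest fraction of prompts that must be declined so that the LLM's error rate on the accepted prompts stays within the tolerance. The function name `min_rejection_rate` and its parameters are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def min_rejection_rate(
    pred_success: Sequence[float],
    correct: Sequence[int],
    tolerance: float,
) -> float:
    """Smallest fraction of prompts rejected so that the error rate
    among accepted prompts is <= tolerance.

    Assumed reading of the metric, not the paper's definition.
    pred_success[i]: assessor's predicted probability that the LLM
                     answers prompt i correctly.
    correct[i]:      1 if the LLM was actually correct, else 0.
    """
    n = len(correct)
    if n == 0 or len(pred_success) != n:
        raise ValueError("inputs must be non-empty and of equal length")

    # Accept prompts in descending order of predicted success.
    # Tied scores are accepted one at a time in their original order,
    # which a strict score threshold would not allow.
    order = sorted(range(n), key=lambda i: pred_success[i], reverse=True)

    best_accepted = 0  # largest accepted prefix that meets the tolerance
    errors = 0
    for k, i in enumerate(order, start=1):
        errors += 1 - correct[i]
        if errors / k <= tolerance:
            best_accepted = k

    return 1 - best_accepted / n


# Toy data (made up for illustration): four prompts, one LLM failure.
scores = [0.9, 0.8, 0.6, 0.3]
outcomes = [1, 1, 0, 1]
print(min_rejection_rate(scores, outcomes, tolerance=0.0))
print(min_rejection_rate(scores, outcomes, tolerance=0.25))
```

Under this reading, a better assessor ranks the failing prompts lower, so fewer prompts need to be rejected to reach the same tolerance; sweeping `tolerance` over a range would trace out a curve for each LLM-assessor pair.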
Taxonomy
Topics: Natural Language Processing Techniques
