ProbeLLM: Automating Principled Diagnosis of LLM Failures
Yue Huang, Zhengzhe Jiang, Yuchen Ma, Yu Jiang, Xiangqi Wang, Yujun Zhou, Yuexing Hao, Kehan Guo, Pin-Yu Chen, Stefan Feuerriegel, Xiangliang Zhang

TL;DR
ProbeLLM introduces a hierarchical, automated framework for systematically diagnosing and understanding the failure modes of large language models, moving beyond isolated cases to structured weaknesses.
Contribution
It presents a novel, benchmark-agnostic probing method using hierarchical Monte Carlo Tree Search to discover and interpret failure modes of LLMs.
Findings
ProbeLLM reveals broader and more detailed failure landscapes than previous methods.
It supports a shift from case-centric failure discovery to principled weakness discovery.
It yields more reliable and interpretable failure analysis across diverse benchmarks.
Abstract
Understanding how and why large language models (LLMs) fail is becoming a central challenge as models rapidly evolve and static evaluations fall behind. While automated probing has been enabled by dynamic test generation, existing approaches often discover isolated failure cases, lack principled control over exploration, and provide limited insight into the underlying structure of model weaknesses. We propose ProbeLLM, a benchmark-agnostic automated probing framework that elevates weakness discovery from individual failures to structured failure modes. ProbeLLM formulates probing as a hierarchical Monte Carlo Tree Search, explicitly allocating limited probing budgets between global exploration of new failure regions and local refinement of recurring error patterns. By restricting probing to verifiable test cases and leveraging tool-augmented generation and verification, ProbeLLM grounds…
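The budget trade-off described in the abstract, between exploring new failure regions and refining regions that already produce errors, can be illustrated with a UCB1 selection rule, which is commonly used inside Monte Carlo Tree Search. The sketch below is a flat, single-level simplification and not the paper's hierarchical algorithm. The names `ucb1`, `allocate_budget`, and `failure_rates`, the exploration constant `c`, and the simulated target model are all assumptions made for illustration.

```python
import math
import random


def ucb1(failures, visits, total_visits, c=1.4):
    """UCB1 score: observed failure rate plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # unvisited regions are always tried first
    return failures / visits + c * math.sqrt(math.log(total_visits) / visits)


def allocate_budget(failure_rates, budget, seed=0):
    """Spend `budget` probes across regions; return (visits, failures).

    `failure_rates` is a hypothetical stand-in for a real target model:
    a probe in region r is counted as a failure with probability
    failure_rates[r].
    """
    rng = random.Random(seed)
    visits = {r: 0 for r in failure_rates}
    failures = {r: 0 for r in failure_rates}
    for t in range(1, budget + 1):
        # Pick the region with the highest UCB1 score at step t.
        region = max(visits, key=lambda r: ucb1(failures[r], visits[r], t))
        visits[region] += 1
        if rng.random() < failure_rates[region]:
            failures[region] += 1
    return visits, failures


if __name__ == "__main__":
    # Hypothetical regions and failure rates, chosen only for the demo.
    rates = {"arithmetic": 0.6, "date reasoning": 0.3, "unit conversion": 0.05}
    visits, failures = allocate_budget(rates, budget=200)
    for region in rates:
        print(region, visits[region], failures[region])
```

Under this rule, regions with a high observed failure rate attract most of the budget (local refinement), while the bonus term keeps rarely visited regions from being abandoned entirely (global exploration).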
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Software Engineering Research
