ProbeLLM: Automating Principled Diagnosis of LLM Failures

Yue Huang; Zhengzhe Jiang; Yuchen Ma; Yu Jiang; Xiangqi Wang; Yujun Zhou; Yuexing Hao; Kehan Guo; Pin-Yu Chen; Stefan Feuerriegel; Xiangliang Zhang

arXiv:2602.12966 · cs.CL · February 16, 2026

TL;DR

ProbeLLM introduces a hierarchical, automated framework for systematically diagnosing and understanding the failure modes of large language models, moving beyond isolated cases to structured weaknesses.

Contribution

The paper presents a benchmark-agnostic probing method that uses hierarchical Monte Carlo Tree Search to discover and interpret LLM failure modes.

Findings

01. Reveals broader and more detailed failure landscapes than previous methods.

02. Supports a shift from case-centric to principled weakness discovery.

03. Achieves more reliable and interpretable failure analysis across diverse benchmarks.

Abstract

Understanding how and why large language models (LLMs) fail is becoming a central challenge as models rapidly evolve and static evaluations fall behind. While automated probing has been enabled by dynamic test generation, existing approaches often discover isolated failure cases, lack principled control over exploration, and provide limited insight into the underlying structure of model weaknesses. We propose ProbeLLM, a benchmark-agnostic automated probing framework that elevates weakness discovery from individual failures to structured failure modes. ProbeLLM formulates probing as a hierarchical Monte Carlo Tree Search, explicitly allocating limited probing budgets between global exploration of new failure regions and local refinement of recurring error patterns. By restricting probing to verifiable test cases and leveraging tool-augmented generation and verification, ProbeLLM grounds…
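The abstract describes probing as a tree search that splits a limited budget between exploring new failure regions and refining recurring error patterns, but this page does not give the algorithm itself. As a rough illustration of that trade-off only, the following is a minimal sketch that assumes standard UCB1 selection over a two-level tree (regions, then sub-patterns). The names Node, failure_rate, probe, and run_probing are hypothetical and do not come from the paper, and the probe function is a toy simulator standing in for real test generation and verification.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical probing-tree node: a failure region or a refined sub-pattern."""
    name: str
    failure_rate: float  # hidden ground truth, used only by the toy simulator
    children: list = field(default_factory=list)
    visits: int = 0
    failures: int = 0


def ucb1(child, parent_visits, c=1.4):
    """Standard UCB1 score: observed failure rate plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = child.failures / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(parent, c=1.4):
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1(ch, parent.visits, c))


def probe(node, rng):
    """Toy stand-in for generating and verifying one test case.

    Returns True when the simulated model fails the test.
    """
    return rng.random() < node.failure_rate


def run_probing(root, budget, seed=0):
    """Spend `budget` probes, descending from the root to a leaf each time."""
    rng = random.Random(seed)
    for _ in range(budget):
        path = [root]
        node = root
        while node.children:
            node = select_child(node)
            path.append(node)
        failed = probe(node, rng)
        # Back up the outcome along the visited path.
        for visited in path:
            visited.visits += 1
            visited.failures += int(failed)
    return root


if __name__ == "__main__":
    # Assumed toy landscape: one region hides a frequent failure pattern.
    arithmetic = Node("arithmetic", 0.0, [Node("carry", 0.8), Node("fractions", 0.1)])
    dates = Node("dates", 0.0, [Node("leap years", 0.05), Node("time zones", 0.05)])
    root = run_probing(Node("root", 0.0, [arithmetic, dates]), budget=300)
    for region in root.children:
        for leaf in region.children:
            print(f"{region.name}/{leaf.name}: {leaf.visits} probes, {leaf.failures} failures")
```

In this sketch the exploration constant c controls the balance: a larger value spreads probes across regions (global exploration), while a smaller value concentrates them on leaves that have already produced failures (local refinement).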

Taxonomy

Topics: Topic Modeling · Text Readability and Simplification · Software Engineering Research