To Call or Not to Call: Diagnosing Intrinsic Over-Calling Bias in LLM Agents

Wei Shi; Ziheng Peng; Sihang Li; Xiting Wang; Xiang Wang; Mengnan Du; Na Zou

arXiv:2605.18882 · cs.LG · May 20, 2026

TL;DR

This paper traces over-calling in LLM agents (invoking tools when none is needed) to an intrinsic bias in the call/no-call decision, identifies its mechanistic basis, and proposes a causal correction that mitigates the bias and improves overall accuracy.

Contribution

The paper gives a mechanistic account of over-calling bias in LLM agents and develops a causal correction technique based on Sparse Autoencoder (SAE) feature analysis.

Findings

01 Over-calling bias is linked to an activation-independent offset in the call/no-call decision mapping.

02 The bias can be estimated and countered using SAE-based features.

03 Applying the correction improves overall accuracy with minimal impact on call accuracy.

Abstract

LLM agents exhibit a consistent tendency to over-call, invoking tools even in situations where none is needed. On the When2Call benchmark, six models from three families show high call accuracy but much lower no-call accuracy, leaving overall accuracy in the 55%-70% range. We trace this to an Intrinsic Bias Hypothesis (IBH): the call/no-call decision mapping carries an activation-independent call offset, so the model favors call even at activation parity. Using Sparse Autoencoders (SAEs), we recover behavior-aligned feature bases for the call/no_call decision, reduce them to a signed activation margin, and estimate the offset directly. Across all six models, the model is decision-neutral only when no_call activation outweighs call activation, consistent with IBH. We then causally test IBH with Adaptive Margin-Calibrated Steering (AMCS), a closed-form counter-bias shift along SAE decoder…
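The offset estimate described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the paper's estimator: the margin is taken to be call activation minus no_call activation, the decision is modeled as a one-dimensional logistic function of that margin, the data is synthetic, and the helper name `fit_logistic_1d` is hypothetical. Under these assumptions, a positive fitted offset means the decision-neutral point sits at a negative margin, that is, where no_call activation outweighs call activation.

```python
import numpy as np

def fit_logistic_1d(margin, called, steps=50):
    """Fit P(call) = sigmoid(slope * margin + offset) with Newton's method.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    X = np.column_stack([margin, np.ones_like(margin)])
    w = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - called)
        # Small ridge term keeps the Hessian invertible.
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(2)
        w = w - np.linalg.solve(hess, grad)
    slope, offset = w
    return slope, offset

# Synthetic data (assumption): signed margin = call activation - no_call activation.
rng = np.random.default_rng(0)
margin = rng.normal(0.0, 1.0, size=5000)
true_slope, true_offset = 2.0, 1.0  # a positive offset favors "call"
p_call = 1.0 / (1.0 + np.exp(-(true_slope * margin + true_offset)))
called = (rng.random(margin.size) < p_call).astype(float)

slope, offset = fit_logistic_1d(margin, called)

# Margin at which P(call) = 0.5, i.e. where the modeled decision is neutral.
neutral_margin = -offset / slope

print(f"estimated offset: {offset:.3f}")
print(f"decision-neutral margin: {neutral_margin:.3f}")
```

With a positive offset, the neutral margin comes out negative, which mirrors the pattern the abstract reports: the model is decision-neutral only once no_call activation exceeds call activation.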

Code & Models

Repositories

SKURA502/agent-sae (GitHub)

Models

SKwra/toolcalling-sae (Hugging Face model)
