Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents

Wenxuan Ding; Nicholas Tomlin; Greg Durrett

arXiv:2602.16699·cs.CL·May 19, 2026

TL;DR

This paper introduces Calibrate-Then-Act (CTA), a framework that lets LLM agents reason explicitly about cost-uncertainty tradeoffs, leading to closer-to-optimal decisions in environment-interaction tasks.

Contribution

The paper formalizes cost-aware decision-making in LLM agents and proposes a method that improves their performance by inferring priors over the latent environment state before acting.

Findings

01 CTA improves decision strategies in synthetic, QA, and file reading tasks.

02 Explicit cost-benefit reasoning leads to more environment-sensitive agent behavior.

03 Agents using CTA outperform baseline approaches in tested scenarios.

Abstract

LLM agents are deployed in environments where they must interact to acquire information. In these scenarios, the agent must reason about inherent cost-uncertainty tradeoffs in how to act, such as when to stop exploring and commit to an answer. For instance, on a programming task, an agent might run the code it generates, or it might generate tests for that code snippet; the cost of writing and running a test is nonzero, but typically lower than the cost of running buggy code. In this work, we show that we can induce LLM agents to explicitly reason about balancing these cost-uncertainty tradeoffs, then act more optimally in their environments. We formalize multiple tasks, including retrieval-augmented QA and a file reading coding task, as sequential decision-making problems under uncertainty. Each problem has latent environment state that impacts the agent's performance. We introduce a…
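
The cost-uncertainty tradeoff described in the abstract can be illustrated with a one-step lookahead. The sketch below is not the paper's method; it is a minimal example under stated assumptions: the agent holds a calibrated probability `p_correct` that its current answer is right, committing pays `reward` on success and nothing otherwise, and a single exploration step costs `explore_cost` and fully reveals the latent state. The names `Decision` and `choose_action` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str            # "commit" or "explore"
    expected_value: float  # expected net reward of the chosen action


def choose_action(p_correct: float, reward: float, explore_cost: float) -> Decision:
    """One-step lookahead between committing now and exploring once.

    Illustrative assumptions (not taken from the paper):
      - committing pays `reward` with probability `p_correct`, else 0;
      - exploring costs `explore_cost` and fully reveals the latent state,
        so the agent then commits correctly and earns `reward`.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be in [0, 1]")

    value_commit = p_correct * reward
    value_explore = reward - explore_cost

    # Explore only when its net value strictly beats committing now,
    # i.e. when explore_cost < (1 - p_correct) * reward.
    if value_explore > value_commit:
        return Decision("explore", value_explore)
    return Decision("commit", value_commit)


if __name__ == "__main__":
    # Confident agent: exploring is not worth its cost.
    print(choose_action(p_correct=0.9, reward=1.0, explore_cost=0.2))
    # Uncertain agent: the same exploration cost is now worth paying.
    print(choose_action(p_correct=0.5, reward=1.0, explore_cost=0.2))
```

Under these assumptions the rule reduces to exploring only when `explore_cost < (1 - p_correct) * reward`, so the same action can be worthwhile in one environment and wasteful in another depending on the agent's prior.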

Code & Models

Repositories

wenwen-d/env-explorer (GitHub)

Taxonomy

Topics: Topic Modeling · Software Engineering Research · Logic, Reasoning, and Knowledge