Towards Trustworthy Report Generation: A Deep Research Agent with Progressive Confidence Estimation and Calibration

Yi Yuan; Xuhong Wang; Shanzhe Lei

arXiv:2604.05952·cs.AI·April 8, 2026

TL;DR

This paper introduces a deep research agent that enhances report trustworthiness by integrating progressive confidence estimation and calibration, grounded in verifiable evidence, to improve transparency and user trust.

Contribution

The paper presents a novel system that combines confidence calibration with multi-hop reasoning to produce more trustworthy, transparent research reports in open-ended scenarios.

Findings

01. Significantly improves interpretability of generated reports.

02. Increases user trust through better confidence calibration.

03. Enhances grounding in verifiable evidence.

Abstract

As agent-based systems continue to evolve, deep research agents are capable of automatically generating research-style reports across diverse domains. While these agents promise to streamline information synthesis and knowledge exploration, existing evaluation frameworks, typically based on subjective dimensions, fail to capture a critical aspect of report quality: trustworthiness. In open-ended research scenarios where ground-truth answers are unavailable, current evaluation methods cannot effectively measure the epistemic confidence of generated content, making calibration difficult and leaving users susceptible to misleading or hallucinated information. To address this limitation, we propose a novel deep research agent that incorporates progressive confidence estimation and calibration within the report generation pipeline. Our system leverages a deliberative search model, featuring…
