Towards Trustworthy Report Generation: A Deep Research Agent with Progressive Confidence Estimation and Calibration
Yi Yuan, Xuhong Wang, Shanzhe Lei

TL;DR
This paper introduces a deep research agent that enhances report trustworthiness by integrating progressive confidence estimation and calibration, grounded in verifiable evidence, to improve transparency and user trust.
Contribution
It presents a novel system combining confidence calibration with multi-hop reasoning to produce more trustworthy, transparent research reports in open-ended scenarios.
Findings
Significantly improves interpretability of generated reports.
Increases user trust through better confidence calibration.
Enhances grounding in verifiable evidence.
Abstract
As agent-based systems continue to evolve, deep research agents are capable of automatically generating research-style reports across diverse domains. While these agents promise to streamline information synthesis and knowledge exploration, existing evaluation frameworks, which are typically based on subjective dimensions, fail to capture a critical aspect of report quality: trustworthiness. In open-ended research scenarios where ground-truth answers are unavailable, current evaluation methods cannot effectively measure the epistemic confidence of generated content, making calibration difficult and leaving users susceptible to misleading or hallucinated information. To address this limitation, we propose a novel deep research agent that incorporates progressive confidence estimation and calibration within the report generation pipeline. Our system leverages a deliberative search model, featuring…
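To make the term "calibration" concrete, here is a minimal sketch of expected calibration error (ECE), a standard way to compare stated confidence against observed accuracy. It is not the paper's method: the abstract targets settings without ground truth, whereas this sketch assumes a small set of claims that have already been labeled correct or incorrect. The function name and parameters are hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Illustrative ECE: weighted mean gap between confidence and accuracy.

    Assumption: each claim has a confidence in [0, 1] and a known
    correctness label. This is a generic metric, not the paper's pipeline.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group claims into equal-width confidence bins; 1.0 falls in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's gap by the share of claims it holds.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


# Four claims stated at 90% confidence, three of which are correct.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
```

A value near zero means stated confidence tracks observed accuracy; larger values indicate over- or under-confidence.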
