Learning to Seek Evidence: A Verifiable Reasoning Agent with Causal Faithfulness Analysis
Yuhang Huang, Zekai Lin, Fan Zhong, Lei Liu

TL;DR
This paper introduces an interactive, reinforcement learning-based reasoning agent that seeks external visual evidence to produce verifiable explanations, significantly improving diagnostic accuracy and faithfulness in high-stakes AI applications.
Contribution
It presents a novel approach combining evidence-seeking behavior with causal faithfulness analysis to enhance verifiability and trustworthiness of AI explanations.
Findings
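Evidence-seeking reduces the Brier score by 18% compared to a non-interactive baseline
Causal intervention (masking the evidence the agent chooses) degrades performance, confirming the importance of that evidence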
Improved calibration and generalization
Abstract
Explanations for AI models in high-stakes domains like medicine often lack verifiability, which can hinder trust. To address this, we propose an interactive agent that produces explanations through an auditable sequence of actions. The agent learns a policy to strategically seek external visual evidence to support its diagnostic reasoning. This policy is optimized using reinforcement learning, resulting in a model that is both efficient and generalizable. Our experiments show that this action-based reasoning process significantly improves calibrated accuracy, reducing the Brier score by 18% compared to a non-interactive baseline. To validate the faithfulness of the agent's explanations, we introduce a causal intervention method. By masking the visual evidence the agent chooses to use, we observe a measurable degradation in its performance (change in Brier score: +0.029), confirming that the…
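The two quantities quoted in the abstract can be illustrated with a minimal sketch. It assumes a binary diagnostic label and reads the 18% figure as a relative reduction; the abstract does not state either, and the function names and toy numbers below are hypothetical and are not the authors' code or data.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional drop in Brier score relative to the baseline (assumed reading of '18%')."""
    return (baseline - improved) / baseline


def masking_delta(brier_with_evidence: float, brier_masked: float) -> float:
    """Change in Brier score when the chosen evidence is masked; positive means worse."""
    return brier_masked - brier_with_evidence


# Toy predictions (hypothetical, for illustration only).
labels = [1, 0, 1, 0]
baseline_probs = [0.6, 0.4, 0.6, 0.4]     # non-interactive model
agent_probs = [0.8, 0.2, 0.8, 0.2]        # agent with evidence
masked_probs = [0.7, 0.3, 0.7, 0.3]       # agent with its evidence masked

baseline = brier_score(baseline_probs, labels)
agent = brier_score(agent_probs, labels)
masked = brier_score(masked_probs, labels)

print(f"relative reduction: {relative_reduction(baseline, agent):.2%}")
print(f"masking delta: {masking_delta(agent, masked):+.3f}")
```

A lower Brier score indicates better-calibrated, more accurate probabilities, so a positive masking delta corresponds to the degradation the abstract describes.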
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning
