Learning to Trust: Bayesian Adaptation to Varying Suggester Reliability in Sequential Decision Making
Dylan M. Asmar, Mykel J. Kochenderfer

TL;DR
This paper presents a Bayesian framework enabling autonomous agents to adaptively trust external suggestions in uncertain, sequential decision tasks by learning suggester reliability and strategically requesting advice.
Contribution
It introduces a Bayesian approach that models and adapts to varying suggester reliability and incorporates an explicit 'ask' action for strategic suggestion requests.
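To make the idea of adapting to suggester reliability concrete, here is a minimal sketch of one Bayes update over a joint belief on (environment state, suggester type). It assumes a softmax ("noisy rational") suggester over known Q-values, with a rationality coefficient per type; the function names, the toy Q-table, and the two types are hypothetical and are not taken from the paper. For brevity it handles a single suggestion and omits state transitions and environment observations.

```python
import math

def suggestion_likelihood(q_row, rationality):
    """P(suggested action | state, type) under an assumed softmax suggester."""
    m = max(q_row)  # subtract the max for numerical stability
    weights = [math.exp(rationality * (q - m)) for q in q_row]
    total = sum(weights)
    return [w / total for w in weights]

def update_joint_belief(belief, q_table, rationalities, suggested_action):
    """Bayes update of belief[s][k] = P(state s, type k) after one suggestion."""
    posterior = []
    for s, row in enumerate(belief):
        new_row = []
        for k, prior in enumerate(row):
            lik = suggestion_likelihood(q_table[s], rationalities[k])[suggested_action]
            new_row.append(prior * lik)
        posterior.append(new_row)
    norm = sum(sum(r) for r in posterior)
    return [[p / norm for p in r] for r in posterior]

def type_marginal(belief):
    """Marginal probability of each suggester type."""
    return [sum(row[k] for row in belief) for k in range(len(belief[0]))]

# Hypothetical toy problem: 2 states, 2 actions, 2 suggester types.
Q_TABLE = [[1.0, 0.0],      # state 0: action 0 is better
           [0.0, 1.0]]      # state 1: action 1 is better
RATIONALITIES = [0.0, 5.0]  # type 0 suggests uniformly at random, type 1 is near-optimal

# The agent is fairly sure it is in state 0 and undecided about the type.
belief = [[0.45, 0.45],
          [0.05, 0.05]]

# The suggester recommends action 1, which looks wrong for the likely state.
posterior = update_joint_belief(belief, Q_TABLE, RATIONALITIES, suggested_action=1)
print("type marginal before:", type_marginal(belief))
print("type marginal after: ", type_marginal(posterior))
```

In this toy case a suggestion that conflicts with the agent's own state estimate shifts probability toward the unreliable type, while also nudging the state belief toward the state where the suggestion would have been good.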
Findings
Robust performance across different suggester qualities
Effective adaptation to changing reliability
Strategic suggestion requesting improves decision quality
Abstract
Autonomous agents operating in sequential decision-making tasks under uncertainty can benefit from external action suggestions, which provide valuable guidance but inherently vary in reliability. Existing methods for incorporating such advice typically assume static and known suggester quality parameters, limiting practical deployment. We introduce a framework that dynamically learns and adapts to varying suggester reliability in partially observable environments. First, we integrate suggester quality directly into the agent's belief representation, enabling agents to infer and adjust their reliance on suggestions through Bayesian inference over suggester types. Second, we introduce an explicit "ask" action allowing agents to strategically request suggestions at critical moments, balancing informational gains against acquisition costs. Experimental evaluation demonstrates robust…
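The trade-off between informational gain and acquisition cost can be illustrated with a one-step value-of-information check. The paper treats "ask" as an action inside the agent's decision problem; the sketch below swaps in a much simpler myopic heuristic purely for illustration. The softmax suggester, the toy Q-table, the rationality values, and the cost of 0.05 are all assumptions, not values from the paper.

```python
import math

def softmax_suggestion(q_row, rationality):
    """Assumed suggester model: P(suggested action | state, type)."""
    m = max(q_row)
    w = [math.exp(rationality * (q - m)) for q in q_row]
    z = sum(w)
    return [x / z for x in w]

def best_value(joint, q_table):
    """Expected Q-value of the best single action under the state marginal."""
    state_prob = [sum(row) for row in joint]
    n_actions = len(q_table[0])
    return max(sum(state_prob[s] * q_table[s][a] for s in range(len(joint)))
               for a in range(n_actions))

def myopic_ask_gain(joint, q_table, rationalities, ask_cost):
    """Expected one-step gain from asking once, net of the asking cost."""
    value_after = 0.0
    for sug in range(len(q_table[0])):
        # Unnormalized posterior over (state, type) if this suggestion arrives.
        unnorm = [[p * softmax_suggestion(q_table[s], rationalities[k])[sug]
                   for k, p in enumerate(row)]
                  for s, row in enumerate(joint)]
        p_sug = sum(sum(r) for r in unnorm)
        if p_sug > 0.0:
            post = [[x / p_sug for x in r] for r in unnorm]
            value_after += p_sug * best_value(post, q_table)
    return value_after - ask_cost - best_value(joint, q_table)

# Hypothetical toy problem: 2 states, 2 actions, 2 suggester types.
Q_TABLE = [[1.0, 0.0],
           [0.0, 1.0]]
RATIONALITIES = [0.0, 5.0]  # random type and near-optimal type
ASK_COST = 0.05             # assumed cost of requesting a suggestion

uncertain = [[0.25, 0.25], [0.25, 0.25]]  # no idea which state we are in
confident = [[0.49, 0.49], [0.01, 0.01]]  # almost sure of state 0

gain_uncertain = myopic_ask_gain(uncertain, Q_TABLE, RATIONALITIES, ASK_COST)
gain_confident = myopic_ask_gain(confident, Q_TABLE, RATIONALITIES, ASK_COST)
print("ask when uncertain:", gain_uncertain > 0)
print("ask when confident:", gain_confident > 0)
```

Under these assumptions, asking pays off when the state belief is spread out and does not when the agent is already nearly certain, since no suggestion would change its chosen action. A full planner would also account for what the agent learns about the suggester type for future steps, which this one-step check ignores.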
Peer Reviews
Decision: Submitted to ICLR 2026
The paper is well written and easy to follow. The model is well explained and accompanied by helpful intuitions.
My main concern is the contribution of the paper, which should be viewed more as a "conceptual" work. As noted above, the model is newly proposed and well explained, but I find it hard to apply in a real-world scenario, for the following reasons:
- Solving such a model requires knowing many parameters, such as the transition matrix and the noisy rational suggester model.
- In general, the POMDP framework makes the model inapplicable to a real-world scenario with a moderate state space
- Novel Motivation: (1) Addresses a real and growing need in human-AI teaming: adapting trust to variable advice quality. (2) Aligned with the trend of interactive assistance and trust calibration.
- Formulation: POMDP and MOMDP are modeled in the scenarios: (1) Present a solid use of the MOMDP structure to efficiently manage the expanded state space introduced by modeling suggester reliability as a latent variable. (2) The Bayesian update mechanism for jointly inferring environment state and
- Human study missing: Despite the human-trust motivation, no human-in-the-loop experiments are conducted. Although the paper acknowledges this, such a study remains important for this work.
- Scalability: Tag and RockSample are standard but small. What about (1) larger POMDP domains, (2) higher-dimensional latent human models, (3) multiple suggesters or groups of helpers?
- Reliance on **known** Q-values: The ask suggestion model uses pre-solved Q values. What if (1) Q is inaccurate, (2) Q value needs to b
The method section (Section 3) is easy to follow, and the proposed contributions/components are introduced clearly with motivations. The paper also includes experiments in a setting where the proposed suggester model is misspecified (Section 5.4).
The contributions/components (i)–(iii) listed in the summary box are somewhat orthogonal, especially (i)–(ii) relative to (iii). Without comprehensive empirical experiments demonstrating a significant performance improvement over justified baselines, the overall contribution looks like a sum of incremental components. Further, proper ablation studies are critical in this case to understand the strengths and weaknesses of the individual components (Tables 1–2 may touch on this, but it is di
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Human-Automation Interaction and Safety · Decision-Making and Behavioral Economics
