Learning to Trust Experience: A Monitor-Trust-Regulator Framework for Learning under Unobservable Feedback Reliability
Zhipeng Zhang, Zhenjie Yao, Kai Li, Lei Yang

TL;DR
This paper introduces a Monitor-Trust-Regulator framework for autonomous learning systems to assess and adapt to unobservable feedback reliability, improving epistemic accuracy and robustness.
Contribution
It formalizes a modular introspective regulation approach that lets systems infer experience credibility without external labels, improving learning under unreliable feedback.
Findings
Self-diagnosis improves epistemic identifiability in the EIUR setting (Epistemic Identifiability under Unobservable Reliability).
In reinforcement learning, it enables calibrated skepticism and reward recovery.
In supervised learning, it reveals a dissociation between performance and epistemic recovery.
Abstract
Learning under unobservable feedback reliability poses a distinct challenge beyond optimization robustness: a system must decide whether to learn from an experience, not only how to learn stably. We study this setting as Epistemic Identifiability under Unobservable Reliability (EIUR), where each experience has a latent credibility, reliable and unreliable feedback can be locally indistinguishable, and data are generated in a closed loop by the learner's own evolving beliefs and actions. In EIUR, standard robust learning can converge stably yet form high-confidence, systematically wrong beliefs. We propose metacognitive regulation as a practical response: a second, introspective control loop that infers experience credibility from endogenous evidence in the learner's internal dynamics. We formalize this as a modular Monitor-Trust-Regulator (MTR) decomposition and instantiate it with…
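The abstract is truncated before it names the authors' instantiation, so the following is only a minimal sketch of what a modular monitor, trust, and regulator split could look like, not the paper's method. Every name here (MTRLearner, err_scale, base_lr, scale_lr) is hypothetical, and the choice of prediction-error surprise as the endogenous signal is an assumption. Note that a per-sample surprise heuristic like this cannot resolve feedback that is locally indistinguishable, which is the hard case EIUR describes; the sketch only illustrates the three-part shape.

```python
import math
from dataclasses import dataclass


@dataclass
class MTRLearner:
    """Toy scalar estimator with a hypothetical monitor-trust-regulator loop."""

    value: float = 0.0      # current belief about the expected feedback
    err_scale: float = 1.0  # running scale of absolute prediction errors
    base_lr: float = 0.5    # learning rate applied when trust is 1.0
    scale_lr: float = 0.1   # adaptation rate of the running error scale

    def monitor(self, feedback: float) -> float:
        """Monitor: emit an internal signal (assumed here: error relative to its usual scale)."""
        return abs(feedback - self.value) / (self.err_scale + 1e-8)

    def trust(self, surprise: float) -> float:
        """Trust: map the monitor signal to a credibility estimate in [0, 1]."""
        return math.exp(-max(0.0, surprise - 1.0))

    def regulate(self, feedback: float) -> float:
        """Regulator: scale the belief update by inferred credibility and return that trust."""
        credibility = self.trust(self.monitor(feedback))
        err = feedback - self.value
        self.value += self.base_lr * credibility * err
        self.err_scale += self.scale_lr * (abs(err) - self.err_scale)
        return credibility


learner = MTRLearner()
trust_typical = learner.regulate(1.0)   # error within the usual scale: full trust
trust_outlier = learner.regulate(10.0)  # error far outside the usual scale: low trust
```

In this sketch the second, outlying experience barely moves the belief because the regulator multiplies the learning rate by the inferred trust; no external reliability label is consulted at any point.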
Taxonomy
Topics: Game Theory and Applications · Opinion Dynamics and Social Influence · Advanced Bandit Algorithms Research
