Learning What Matters Now: A Dual-Critic Context-Aware RL Framework for Priority-Driven Information Gain
Dimitris Panagopoulos, Adolfo Perrusquia, Weisi Guo

TL;DR
This paper introduces CA-MIQ, a dual-critic reinforcement learning framework that enables autonomous systems to adaptively prioritize information gathering in high-stakes search-and-rescue missions, significantly improving mission success after priority shifts.
Contribution
The paper presents a context-aware RL framework that pairs a priority-shift detector with two critics (extrinsic and intrinsic), so the agent can re-explore and then re-focus when mission priorities change.
Findings
Nearly four times higher success rate after a single priority shift
Over three times better performance in multiple-shift scenarios
Achieves 100% recovery in adaptive information gathering tasks
Abstract
Autonomous systems operating in high-stakes search-and-rescue (SAR) missions must continuously gather mission-critical information while flexibly adapting to shifting operational priorities. We propose CA-MIQ (Context-Aware Max-Information Q-learning), a lightweight dual-critic reinforcement learning (RL) framework that dynamically adjusts its exploration strategy whenever mission priorities change. CA-MIQ pairs a standard extrinsic critic for task reward with an intrinsic critic that fuses state-novelty, information-location awareness, and real-time priority alignment. A built-in shift detector triggers transient exploration boosts and selective critic resets, allowing the agent to re-focus after a priority revision. In a simulated SAR grid-world, where experiments specifically test adaptation to changes in the priority order of information types the agent is expected to focus on,…
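The mechanism described in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the class name `DualCriticAgent`, the hyperparameters, the count-based novelty bonus, the rank-based alignment score, and the rule that the shift detector simply compares the new priority order with the stored one are all assumptions made for illustration. Information-location awareness is approximated here by granting the alignment bonus only when a cell is reported to hold information (`info_type` is not `None`).

```python
import random
from collections import defaultdict


class DualCriticAgent:
    """Illustrative dual-critic Q-learning agent (hypothetical, not the paper's code)."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9, beta=1.0,
                 eps_base=0.1, eps_boost=0.5, boost_steps=20, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.beta = alpha, gamma, beta
        self.eps_base, self.eps_boost = eps_base, eps_boost
        self.boost_steps = boost_steps
        self.boost_left = 0                      # remaining boosted-exploration steps
        self.priorities = ()                     # current priority order of info types
        self.visits = defaultdict(int)           # state visit counts for novelty
        self.q_ext = defaultdict(lambda: [0.0] * n_actions)  # extrinsic critic
        self.q_int = defaultdict(lambda: [0.0] * n_actions)  # intrinsic critic
        self.rng = random.Random(seed)

    def observe_priorities(self, priorities):
        """Assumed shift detector: flag any change in the priority order."""
        new = tuple(priorities)
        shifted = bool(self.priorities) and new != self.priorities
        self.priorities = new
        if shifted:
            self.q_int.clear()                   # selective reset: intrinsic critic only
            self.boost_left = self.boost_steps   # transient exploration boost
        return shifted

    def epsilon(self):
        return self.eps_boost if self.boost_left > 0 else self.eps_base

    def intrinsic_reward(self, state, info_type=None):
        # Count-based novelty: decays as the state is visited more often.
        novelty = 1.0 / (1 + self.visits[state]) ** 0.5
        # Priority alignment: higher-ranked info types earn a larger bonus.
        alignment = 0.0
        if info_type in self.priorities:
            rank = self.priorities.index(info_type)
            alignment = (len(self.priorities) - rank) / len(self.priorities)
        return novelty + alignment

    def act(self, state):
        eps = self.epsilon()
        if self.boost_left > 0:
            self.boost_left -= 1
        if self.rng.random() < eps:
            return self.rng.randrange(self.n_actions)
        # Greedy action on the weighted sum of both critics.
        combined = [e + self.beta * i
                    for e, i in zip(self.q_ext[state], self.q_int[state])]
        return max(range(self.n_actions), key=combined.__getitem__)

    def update(self, state, action, r_ext, next_state, info_type=None):
        r_int = self.intrinsic_reward(next_state, info_type)
        self.visits[next_state] += 1
        # Each critic gets its own one-step Q-learning update.
        for q, r in ((self.q_ext, r_ext), (self.q_int, r_int)):
            target = r + self.gamma * max(q[next_state])
            q[state][action] += self.alpha * (target - q[state][action])
```

In this sketch, calling `observe_priorities` with a reordered list clears only the intrinsic critic and raises epsilon for `boost_steps` action selections, while the extrinsic critic keeps what it has learned about task reward.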
Taxonomy
Topics: Reinforcement Learning in Robotics · Age of Information Optimization · Advanced Neural Network Applications
