Learning When to Switch: Adaptive Policy Selection via Reinforcement Learning

Chris Tava

arXiv:2512.06250 · cs.LG · December 9, 2025

TL;DR

This paper presents a reinforcement learning method that lets an autonomous agent learn when to switch between navigation strategies, improving efficiency and robustness in maze navigation without hand-crafted, domain-specific heuristics.

Contribution

It introduces a Q-learning-based approach in which the agent learns effective thresholds for switching between an exploration policy and a goal-directed policy at runtime.
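
For reference, the standard tabular Q-learning update that such an approach builds on can be written as below. The state encoding shown (a coverage bucket c and a distance-to-goal bucket d) and the two action labels are assumptions made for illustration; the summary here does not give the paper's exact formulation, learning rate, or reward.

```latex
% Standard Q-learning update (textbook form).
% Assumed for illustration: s = (c, d), with c a discretized coverage
% percentage and d a discretized distance to goal;
% a \in \{\text{explore}, \text{converge}\} selects which policy to run.
Q(s, a) \leftarrow Q(s, a)
  + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

Here \alpha is the learning rate, \gamma the discount factor, r the reward observed after taking action a in state s, and s' the resulting state.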

Findings

01. Adaptive switching outperforms fixed thresholds and single-strategy agents.

02. Performance improvements scale with maze complexity, up to 55% in larger mazes.

03. The learned policy generalizes to unseen maze configurations within each size class.

Abstract

Autonomous agents often require multiple strategies to solve complex tasks, but determining when to switch between strategies remains challenging. This research introduces a reinforcement learning technique to learn switching thresholds between two orthogonal navigation policies. Using maze navigation as a case study, this work demonstrates how an agent can dynamically transition between systematic exploration (coverage) and goal-directed pathfinding (convergence) to improve task performance. Unlike fixed-threshold approaches, the agent uses Q-learning to adapt switching behavior based on coverage percentage and distance to goal, requiring only minimal domain knowledge: maze dimensions and target location. The agent does not require prior knowledge of wall positions, optimal threshold values, or hand-crafted heuristics; instead, it discovers effective switching strategies dynamically…
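
The switching mechanism described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the names `discretize`, `SwitchingAgent`, and the action labels are hypothetical, and the bin count, hyperparameters, and reward signal are assumptions, since the abstract does not specify them.

```python
import random
from collections import defaultdict

# Hypothetical labels for the two navigation policies (coverage vs. convergence).
ACTIONS = ("explore", "converge")


def discretize(coverage, distance, max_distance, bins=5):
    """Map (coverage fraction in [0, 1], distance to goal) to a discrete state.

    The bucketing scheme is an assumption; the abstract only says the agent
    conditions on coverage percentage and distance to goal.
    """
    c = min(int(coverage * bins), bins - 1)
    d = min(int(distance / max_distance * bins), bins - 1)
    return (c, d)


class SwitchingAgent:
    """Tabular Q-learning over which policy to run next (illustrative only)."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        # Epsilon-greedy: occasionally pick a random policy, else the best known.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, done):
        # Standard Q-learning target; terminal states contribute no future value.
        best_next = 0.0 if done else max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In a training loop, the agent would call `choose` on the current discretized state, run the selected policy for one or more steps in the maze, and then call `update` with whatever reward the environment defines. A learned "threshold" then corresponds to the boundary in the (coverage, distance) grid where the greedy action flips from one policy to the other.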

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning