CADENT: Gated Hybrid Distillation for Sample-Efficient Transfer in Reinforcement Learning
Mahyar Alinejad, Yue Wang, George Atia

TL;DR
CADENT introduces an experience-gated distillation framework that combines strategic automaton knowledge with tactical policies, significantly improving sample efficiency and adaptability in reinforcement learning tasks.
Contribution
CADENT unifies automaton-based distillation and policy distillation under a dynamic, experience-gated trust mechanism to improve transfer learning in RL.
Findings
Achieves 40-60% better sample efficiency than baselines.
Maintains superior asymptotic performance across environments.
Effective in both sparse-reward and continuous control tasks.
Abstract
Transfer learning promises to reduce the high sample complexity of deep reinforcement learning (RL), yet existing methods struggle with domain shift between source and target environments. Policy distillation provides powerful tactical guidance but fails to transfer long-term strategic knowledge, while automaton-based methods capture task structure but lack fine-grained action guidance. This paper introduces Context-Aware Distillation with Experience-gated Transfer (CADENT), a framework that unifies strategic automaton-based knowledge with tactical policy-level knowledge into a coherent guidance signal. CADENT's key innovation is an experience-gated trust mechanism that dynamically weighs teacher guidance against the student's own experience at the state-action level, enabling graceful adaptation to target domain specifics. Across challenging environments, from sparse-reward grid worlds…
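To make the experience-gating idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the gating rule, so everything here is an assumption for illustration: the exponential visit-count decay, the names ExperienceGate, hybrid_teacher_signal, and gated_guidance, the mixing weight, and the choice to blend scalar value estimates. It only shows the general shape of weighing teacher guidance against the student's own estimate per state-action pair.

```python
import math
from collections import defaultdict


class ExperienceGate:
    """Hypothetical gate: trust in the teacher decays as the student
    accumulates its own visits to a given (state, action) pair."""

    def __init__(self, decay=0.1):
        self.decay = decay              # assumed decay rate, not from the paper
        self.visits = defaultdict(int)  # visit counts per (state, action)

    def update(self, state, action):
        # Record one more student experience of this state-action pair.
        self.visits[(state, action)] += 1

    def trust(self, state, action):
        # Trust is 1.0 for unvisited pairs and shrinks toward 0 with visits.
        n = self.visits.get((state, action), 0)
        return math.exp(-self.decay * n)


def hybrid_teacher_signal(strategic_value, tactical_value, mix=0.5):
    # Assumed convex mix of an automaton-derived (strategic) estimate
    # and a policy-derived (tactical) estimate.
    return mix * strategic_value + (1.0 - mix) * tactical_value


def gated_guidance(gate, state, action, student_value, teacher_value):
    # Blend teacher guidance with the student's own estimate,
    # weighted by the per-(state, action) trust.
    w = gate.trust(state, action)
    return w * teacher_value + (1.0 - w) * student_value


if __name__ == "__main__":
    gate = ExperienceGate(decay=0.5)
    teacher = hybrid_teacher_signal(strategic_value=1.0, tactical_value=3.0)
    for step in range(4):
        blended = gated_guidance(gate, "s0", "a0", 0.5, teacher)
        print(step, round(gate.trust("s0", "a0"), 3), round(blended, 3))
        gate.update("s0", "a0")
```

Under these assumptions, an unvisited state-action pair follows the teacher entirely, and the student's own estimate takes over as visits accumulate, which is one simple way to get the graceful adaptation to the target domain that the abstract describes.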
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
