Enhancing Hierarchical Reinforcement Learning through Change Point Detection in Time Series
Hemanath Arumugam, Falong Fan, Bo Liu

TL;DR
This paper presents a novel hierarchical reinforcement learning approach that uses a self-supervised change point detection module to automatically discover meaningful subgoals and improve policy learning in complex tasks.
Contribution
It introduces a Transformer-based change point detection module integrated into HRL, enabling autonomous subgoal discovery and enhanced policy training without external supervision.
Findings
Accelerated convergence in tasks like Four-Rooms and Pinball.
Higher cumulative rewards compared to baseline methods.
Improved option specialization and interpretability.
Abstract
Hierarchical Reinforcement Learning (HRL) enhances the scalability of decision-making in long-horizon tasks by introducing temporal abstraction through options, policies that span multiple timesteps. Despite its theoretical appeal, the practical implementation of HRL suffers from the challenge of autonomously discovering semantically meaningful subgoals and learning optimal option termination boundaries. This paper introduces a novel architecture that integrates a self-supervised, Transformer-based Change Point Detection (CPD) module into the Option-Critic framework, enabling adaptive segmentation of state trajectories and the discovery of options. The CPD module is trained using heuristic pseudo-labels derived from intrinsic signals to infer latent shifts in environment dynamics without external supervision. These inferred change-points are leveraged in three critical ways: (i) to serve…
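To make the idea of heuristic pseudo-labels concrete, here is a minimal sketch in plain Python. The abstract does not say which intrinsic signals the authors use, so everything specific here is an assumption for illustration: the signal is taken to be the Euclidean displacement between consecutive states, a timestep is flagged when that displacement exceeds a multiple of the trajectory's median displacement, and the names `pseudo_label_change_points`, `segment`, and `threshold_factor` are hypothetical. It is not the paper's labeling rule, and the Transformer that would be trained on such labels is omitted.

```python
import math
from statistics import median


def pseudo_label_change_points(states, threshold_factor=3.0):
    """Return one 0/1 label per timestep; 1 marks a hypothesized change point.

    Assumption (not from the paper): the intrinsic signal is the Euclidean
    displacement between consecutive states, and a step is flagged when its
    displacement exceeds threshold_factor times the median displacement.
    """
    if len(states) < 2:
        return [0] * len(states)
    # Displacement between each state and its successor.
    steps = [math.dist(a, b) for a, b in zip(states, states[1:])]
    baseline = median(steps)
    labels = [0]  # the first timestep has no predecessor, so it is never flagged
    for d in steps:
        labels.append(1 if d > threshold_factor * baseline else 0)
    return labels


def segment(states, labels):
    """Split a trajectory at flagged timesteps into candidate option spans."""
    segments, current = [], []
    for state, flag in zip(states, labels):
        if flag and current:
            segments.append(current)
            current = []
        current.append(state)
    if current:
        segments.append(current)
    return segments


# Toy trajectory with one abrupt jump between the third and fourth states.
trajectory = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
labels = pseudo_label_change_points(trajectory)
spans = segment(trajectory, labels)
```

In a pipeline like the one the abstract describes, labels of this kind would supervise the CPD module, and the resulting segments would suggest where options begin and end.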
