Learning Hidden Subgoals under Temporal Ordering Constraints in Reinforcement Learning

Duo Xu; Faramarz Fekri

arXiv:2411.01425 · cs.LG · November 5, 2024

Open Access

TL;DR

This paper introduces LSTOC, a reinforcement learning algorithm that learns hidden subgoals and their temporal orderings using contrastive learning, enhancing efficiency and generalization in complex tasks with temporal constraints.

Contribution

LSTOC is the first method to simultaneously learn hidden subgoals and their temporal orderings in RL using a contrastive learning approach and subgoal trees.

Findings

01. LSTOC outperforms baseline methods in various environments.

02. It improves sample efficiency in discovering subgoals.

03. It generalizes well to unseen tasks.

Abstract

In real-world applications, the success of a task often depends on multiple key steps that are far apart in time and must be achieved in a fixed order. For example, the key steps listed in a cooking recipe should be achieved one by one in the right order. These key steps can be regarded as subgoals of the task, and their orderings are described as temporal ordering constraints. However, in many real-world problems, subgoals or key states are hidden in the state space and their temporal ordering constraints are also unknown, which makes it challenging for previous RL algorithms to solve this kind of task. To address this issue, in this work we propose a novel RL algorithm for learning hidden subgoals under temporal ordering constraints (LSTOC). We propose a new contrastive learning objective which can…
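To make the notion of a temporal ordering constraint concrete, here is a minimal Python sketch. It is not the LSTOC algorithm and is not taken from the paper. It assumes the subgoals and their order are already known (the setting the paper treats as hidden) and that states can be compared for equality. The function name `satisfies_ordering` and the recipe labels are hypothetical, chosen only for illustration.

```python
from typing import Hashable, Iterable, Sequence


def satisfies_ordering(
    trajectory: Iterable[Hashable], subgoals: Sequence[Hashable]
) -> bool:
    """Return True if the trajectory reaches every subgoal in the given order.

    Illustrative assumption: a subgoal only counts once all earlier
    subgoals have been reached, and unrelated states may occur in between.
    """
    next_idx = 0  # index of the subgoal we are currently waiting for
    for state in trajectory:
        if next_idx < len(subgoals) and state == subgoals[next_idx]:
            next_idx += 1
    return next_idx == len(subgoals)


# Hypothetical recipe: the three key steps must occur in this order.
recipe = ["chop", "boil", "serve"]
in_order = ["idle", "chop", "idle", "boil", "serve"]
out_of_order = ["boil", "chop", "serve"]

print(satisfies_ordering(in_order, recipe))
print(satisfies_ordering(out_of_order, recipe))
```

The first trajectory satisfies the constraint; the second does not, because "boil" occurs before "chop" and is never reached again afterwards. In the setting described in the abstract, neither the subgoal list nor its ordering is given to the agent, which is what makes the problem hard.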

Taxonomy

Topics: Evolutionary Algorithms and Applications

Methods: Contrastive Learning