Dynamic Decision Frequency with Continuous Options

Amirmohammad Karimi; Jun Jin; Jun Luo; A. Rupam Mahmood; Martin Jagersand; Samuele Tosatto

arXiv:2212.04407·cs.LG·October 26, 2023

TL;DR

This paper introduces CTCO, a reinforcement learning framework allowing agents to adaptively choose options of variable durations, enabling flexible interaction frequencies and improving performance in continuous control and real-world tasks.

Contribution

The paper proposes a novel continuous-time, continuous-options framework that decouples decision frequency from environment interaction, enhancing adaptability and exploration in RL agents.

Findings

01. Performance is unaffected by environment interaction frequency.

02. CTCO outperforms classical RL in simulated tasks.

03. Exploration is effective in a real-world robotic arm task.

Abstract

In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals. The duration between decisions becomes a crucial hyperparameter, as setting it too short may increase the problem's difficulty by requiring the agent to make numerous decisions to achieve its goal while setting it too long can result in the agent losing control over the system. However, physical systems do not necessarily require a constant control frequency, and for learning agents, it is often preferable to operate with a low frequency when possible and a high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), where the agent chooses options as sub-policies of variable durations. These options are time-continuous and can interact with the system at any desired frequency providing a smooth change of actions. We demonstrate the…
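
The idea of options with variable durations that can be queried at any interaction frequency can be illustrated with a toy sketch. This is not the authors' implementation: the `Option` class, the linear interpolation of actions, the hand-written `choose_option` rule, and the one-dimensional integrator system are all assumptions made only for illustration. In CTCO the choice of option is learned, whereas here it is a fixed stand-in.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """Hypothetical time-continuous option: an action trajectory plus a duration."""
    start: float     # action value when the option begins
    end: float       # action value when the option finishes
    duration: float  # seconds until the next decision is made

    def action(self, t: float) -> float:
        """Action at t seconds into the option (linear interpolation, clamped)."""
        frac = min(max(t / self.duration, 0.0), 1.0)
        return self.start + frac * (self.end - self.start)


def choose_option(x: float) -> Option:
    """Stand-in for a learned high-level policy (assumption, not from the paper).

    It pushes the state toward zero and picks a duration between 0.1 s and 1 s.
    """
    duration = min(max(abs(x), 0.1), 1.0)
    return Option(start=-x, end=0.0, duration=duration)


def run_episode(dt: float, horizon: float = 2.0, x0: float = 1.0):
    """Simulate x' = a with Euler steps of size dt; return (decisions, steps, x)."""
    x, t = x0, 0.0
    decisions = steps = 0
    while t < horizon:
        option = choose_option(x)  # one decision per option, not per step
        decisions += 1
        elapsed = 0.0
        while elapsed < option.duration and t < horizon:
            x += option.action(elapsed) * dt  # query the option at the current time
            elapsed += dt
            t += dt
            steps += 1
    return decisions, steps, x


if __name__ == "__main__":
    for dt in (0.05, 0.01):
        decisions, steps, x = run_episode(dt)
        print(f"dt={dt}: decisions={decisions}, env steps={steps}, final x={x:.3f}")
```

In this sketch the interaction step `dt` only sets how often the option is sampled, while the number of decisions is governed by the option durations, which is the decoupling the abstract describes.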

Code & Models

Repositories

amir-karimi96/continuous-time-continuous-option-policy-gradient
PyTorch · Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI)