Dynamic Decision Frequency with Continuous Options
Amirmohammad Karimi, Jun Jin, Jun Luo, A. Rupam Mahmood, Martin Jagersand, Samuele Tosatto

TL;DR
This paper introduces CTCO, a reinforcement learning framework allowing agents to adaptively choose options of variable durations, enabling flexible interaction frequencies and improving performance in continuous control and real-world tasks.
Contribution
The paper proposes a novel continuous-time, continuous-options framework that decouples decision frequency from environment interaction, enhancing adaptability and exploration in RL agents.
Findings
Performance is unaffected by environment interaction frequency.
CTCO outperforms classical RL in simulated tasks.
CTCO explores effectively in a real-world robotic arm task.
Abstract
In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals. The duration between decisions becomes a crucial hyperparameter: setting it too short may increase the problem's difficulty by requiring the agent to make numerous decisions to achieve its goal, while setting it too long can result in the agent losing control over the system. However, physical systems do not necessarily require a constant control frequency, and for learning agents, it is often preferable to operate with a low frequency when possible and a high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), where the agent chooses options as sub-policies of variable durations. These options are time-continuous and can interact with the system at any desired frequency, providing a smooth change of actions. We demonstrate the…
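The idea of a time-continuous option of variable duration can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `Option`, the helper `rollout`, the normalized radial-basis parameterization, and the `width` value are all assumptions chosen for illustration. The sketch only shows that an option defined over continuous time can be queried at any interaction frequency and still trace the same action curve.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Option:
    """Hypothetical open-loop option: a smooth action curve over [0, duration]."""
    duration: float        # seconds the option stays active (chosen by the agent)
    weights: List[float]   # one weight per basis function (1-D action for brevity)
    width: float = 0.25    # basis width in normalized time (assumed value)

    def action(self, t: float) -> float:
        """Action at time t since the option started, for any query time."""
        # Map elapsed time to a phase in [0, 1] so duration only rescales the curve.
        phase = min(max(t / self.duration, 0.0), 1.0)
        n = len(self.weights)
        centers = [i / (n - 1) for i in range(n)] if n > 1 else [0.5]
        # Normalized Gaussian basis functions give a smooth blend of the weights.
        acts = [math.exp(-((phase - c) / self.width) ** 2) for c in centers]
        total = sum(acts)
        return sum(w * a for w, a in zip(self.weights, acts)) / total


def rollout(option: Option, dt: float) -> List[float]:
    """Query the option every dt seconds until it expires."""
    steps = int(round(option.duration / dt))
    return [option.action(k * dt) for k in range(steps)]


# One high-level decision (one option) executed at two interaction frequencies.
option = Option(duration=1.0, weights=[0.0, 1.0, -0.5])
coarse = rollout(option, dt=0.1)   # 10 low-level interactions
fine = rollout(option, dt=0.01)    # 100 low-level interactions
```

In this sketch the agent makes a single decision (the option's `duration` and `weights`) regardless of `dt`; changing `dt` only changes how densely the same curve is sampled, which is the sense in which decision frequency is decoupled from interaction frequency.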
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI)
