MOSEAC: Streamlined Variable Time Step Reinforcement Learning
Dong Wang, Giovanni Beltrame

TL;DR
MOSEAC introduces an adaptive reward scheme for variable time step reinforcement learning, simplifying hyperparameter tuning and reducing computational costs while maintaining high performance in simulation tasks.
Contribution
The paper proposes MOSEAC, a method that automatically tunes the control loop frequency in variable time step reinforcement learning (VTS-RL) using a single hyperparameter, easing deployment and improving efficiency.
Findings
Achieves high task performance with fewer time steps.
Reduces energy consumption in simulated environments.
Simplifies hyperparameter tuning process.
Abstract
Traditional reinforcement learning (RL) methods typically employ a fixed control loop, where each cycle corresponds to an action. This rigidity poses challenges in practical applications, as the optimal control frequency is task-dependent. A suboptimal choice can lead to high computational demands and reduced exploration efficiency. Variable Time Step Reinforcement Learning (VTS-RL) addresses these issues by using adaptive frequencies for the control loop, executing actions only when necessary. This approach, rooted in reactive programming principles, reduces computational load and extends the action space by including action durations. However, VTS-RL's implementation is often complicated by the need to tune multiple hyperparameters that govern exploration in the multi-objective action-duration space (i.e., balancing task performance and number of time steps to achieve a goal). To…
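The idea of extending the action space with action durations can be illustrated with a toy sketch. The example below is not MOSEAC and involves no learning: it uses a hypothetical 1-D point that must reach a goal, and two hand-written policies (the names `rollout`, `fixed_step_policy`, and `variable_step_policy` are assumptions made for illustration). It only shows how letting each action carry its own duration can cut the number of control decisions for the same task.

```python
import math

MAX_SPEED = 1.0  # assumed speed limit of the toy system


def rollout(policy, goal=10.0, tol=0.05, max_decisions=1000):
    """Drive a 1-D point from 0 toward `goal`.

    Returns (number of control decisions, elapsed time).
    """
    x, t, decisions = 0.0, 0.0, 0
    while abs(goal - x) > tol and decisions < max_decisions:
        # The action is a pair: a velocity command and how long to hold it.
        velocity, duration = policy(goal - x)
        x += velocity * duration
        t += duration
        decisions += 1
    return decisions, t


def fixed_step_policy(error, dt=0.1):
    # Fixed control loop: one decision every `dt` seconds.
    speed = min(MAX_SPEED, abs(error) / dt)
    return math.copysign(speed, error), dt


def variable_step_policy(error, min_dt=0.1, max_dt=2.0):
    # Hand-written stand-in for a learned policy: hold the action longer
    # when the goal is far away, shorter when it is close.
    dt = min(max_dt, max(min_dt, abs(error) / MAX_SPEED))
    speed = min(MAX_SPEED, abs(error) / dt)
    return math.copysign(speed, error), dt


fixed_decisions, fixed_time = rollout(fixed_step_policy)
var_decisions, var_time = rollout(variable_step_policy)
print("fixed step:   ", fixed_decisions, "decisions")
print("variable step:", var_decisions, "decisions")
```

Both policies reach the goal in about the same elapsed time, but the variable-step policy needs far fewer decisions. In VTS-RL the duration is chosen by the learned policy rather than by a hand-written rule, which is where the exploration and reward-balancing difficulties described in the abstract arise.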
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Control Systems Optimization
