Deep Reinforcement Learning with Adjustments
Hamed Khorasgani, Haiyan Wang, Chetan Gupta, and Susumu Serita

TL;DR
This paper introduces a new Q-learning algorithm for continuous actions that combines the strengths of traditional control and deep reinforcement learning, enabling flexible adjustment for short-term and long-term goals.
Contribution
A novel Q-learning method for continuous actions that balances complex policy learning with easy short-term adjustments, bridging RL and control strategies.
Findings
Achieves long-term and short-term goals without complex reward functions
Provides a practical approximation applicable to pre-trained RL algorithms
Demonstrates effectiveness through case studies
Abstract
Deep reinforcement learning (RL) algorithms can learn complex policies that optimize an agent's operation over time, and in recent years they have shown promising results on complicated problems. However, their application to real-world physical systems remains limited. Despite these advances, industry often prefers traditional control strategies, which are simple, computationally efficient, and easy to adjust. In this paper, we first propose a new Q-learning algorithm for continuous action spaces that bridges control and RL algorithms and brings the best of both worlds. Our method can learn complex policies to achieve long-term goals and, at the same time, can be easily adjusted to address short-term requirements without retraining. Next, we present an approximation of our algorithm which can be applied to address short-term…
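The idea of adjusting a learned policy for a short-term requirement without retraining can be illustrated with a small sketch. This is not the authors' algorithm, whose details are not given in this summary. It is a generic illustration under stated assumptions: `toy_q` is a hypothetical stand-in for a trained Q-function, the continuous action space is approximated by a fixed grid of candidates, and the short-term requirement is expressed as an extra term added to the Q-value at action-selection time.

```python
from typing import Callable, Optional, Sequence


def toy_q(state: float, action: float) -> float:
    """Hypothetical stand-in for a trained Q-function (assumption).

    The value peaks when action == 0.8 * state.
    """
    return -(action - 0.8 * state) ** 2


def select_action(
    q: Callable[[float, float], float],
    state: float,
    candidates: Sequence[float],
    adjustment: Optional[Callable[[float], float]] = None,
) -> float:
    """Return the candidate action that maximizes q(state, a) + adjustment(a).

    The Q-function itself is never modified; only the selection rule changes.
    """
    def score(a: float) -> float:
        bonus = adjustment(a) if adjustment is not None else 0.0
        return q(state, a) + bonus

    return max(candidates, key=score)


# Discretize the continuous action range [-2, 2] into a grid (assumption).
candidates = [i / 100 for i in range(-200, 201)]

# Long-term policy: greedy with respect to the learned Q-function only.
long_term = select_action(toy_q, 1.0, candidates)

# Short-term requirement (hypothetical): prefer actions close to zero.
adjusted = select_action(toy_q, 1.0, candidates, adjustment=lambda a: -1.0 * a ** 2)
```

On this grid the unadjusted choice is 0.8 and the adjusted choice is 0.4: the quadratic penalty pulls the selected action toward zero while the Q-function stays untouched, which is the sense in which no retraining is needed in this toy setting.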
Taxonomy
Topics: Reinforcement Learning in Robotics · Supply Chain and Inventory Management · Scheduling and Optimization Algorithms
Methods: Q-Learning
