Reinforcement learning with distance-based incentive/penalty (DIP) updates for highly constrained industrial control systems
Hyungjun Park, Daiki Min, Jong-hyun Ryu, Dong Gu Choi

TL;DR
This paper introduces a reinforcement learning algorithm with distance-based incentive and penalty (DIP) updates, designed for highly constrained industrial control systems that require both discrete and continuous actions.
Contribution
The paper presents a new RL algorithm with distance-based Q-value updates and shadow price-weighted penalties, addressing constraints in industrial control tasks.
Findings
Demonstrates superior performance in microgrid control tasks
Effectively manages discrete and continuous actions within constraints
Outperforms existing RL methods in constrained environments
Abstract
Typical reinforcement learning (RL) methods show limited applicability for real-world industrial control problems because industrial systems involve various constraints and simultaneously require continuous and discrete control. To overcome these challenges, we devise a novel RL algorithm that enables an agent to handle a highly constrained action space. This algorithm has two main features. First, we devise two distance-based Q-value update schemes, incentive update and penalty update, in a distance-based incentive/penalty update technique to enable the agent to decide discrete and continuous actions in the feasible region and to update the value of these types of actions. Second, we propose a method for defining the penalty cost as a shadow price-weighted penalty. This approach affords two advantages compared to previous methods to efficiently induce the agent to not select an…
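To make the idea of a distance-based penalty concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a tabular Q-function, a one-dimensional action with a feasible interval `[low, high]`, and a fixed scalar `shadow_price`; all function names and parameters are hypothetical. It illustrates only the penalty side (an infeasible action receives a target that worsens with its distance from the feasible region) and does not attempt to reproduce the paper's incentive update or its actual shadow-price computation.

```python
import numpy as np

def distance_to_feasible(action, low, high):
    """Distance from a scalar action to the interval [low, high] (0 if inside)."""
    return max(low - action, 0.0, action - high)

def penalty_style_update(q, state, action_idx, actions, reward, next_value,
                         low, high, shadow_price, alpha=0.1, gamma=0.95):
    """One tabular Q update with a distance-scaled penalty (illustrative only).

    Assumption: an infeasible action is never executed, so it gets no
    environment reward; its target is -shadow_price * distance instead.
    """
    dist = distance_to_feasible(actions[action_idx], low, high)
    if dist > 0.0:
        # Penalty target: cost grows linearly with the size of the violation.
        target = -shadow_price * dist
    else:
        # Standard one-step Q-learning target for a feasible action.
        target = reward + gamma * next_value
    q[state, action_idx] += alpha * (target - q[state, action_idx])
    return q

# Toy setup: one state, three candidate actions, feasible interval [0.5, 2.0].
actions = np.array([0.0, 1.0, 3.0])
q = np.zeros((1, 3))
for idx in range(3):
    q = penalty_style_update(q, 0, idx, actions, reward=1.0, next_value=0.0,
                             low=0.5, high=2.0, shadow_price=2.0)
```

In this toy setup the action farther outside the interval (3.0) ends up with a lower Q-value than the one closer to it (0.0), while the feasible action (1.0) is updated from the ordinary reward.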
Taxonomy
Topics: Smart Grid Energy Management · Reinforcement Learning in Robotics · Microgrid Control and Optimization
