Reinforcement Learning for Control of Valves

Rajesh Siraskar

arXiv:2012.14668·cs.LG·February 5, 2021

TL;DR

This study evaluates reinforcement learning, specifically DDPG, for controlling nonlinear valves and compares its performance to PID controllers, highlighting advantages in signal tracking and challenges in hyperparameter tuning.

Contribution

It introduces 'Graded Learning', a simplified curriculum approach, to improve RL convergence in complex nonlinear control systems.
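The source describes Graded Learning only as a simplified curriculum approach, so the following is a minimal sketch of the general curriculum idea rather than the paper's procedure: train on an easier task first, then carry the trained agent into progressively harder ones. The stage names, the `train_stage` callable, and the dictionary "agent" are all hypothetical placeholders.

```python
from typing import Callable, List, Sequence


def graded_training(agent: dict, stages: Sequence[str],
                    train_stage: Callable[[dict, str], dict]) -> dict:
    """Train on each stage in order, reusing the agent from the previous stage."""
    for stage in stages:
        # The agent returned by one stage is the starting point for the next,
        # instead of restarting from scratch on the hardest task.
        agent = train_stage(agent, stage)
    return agent


def toy_train_stage(agent: dict, stage: str) -> dict:
    """Hypothetical stand-in for a real RL training run (e.g. DDPG) on one stage."""
    seen: List[str] = agent["stages_seen"] + [stage]
    return {"stages_seen": seen}


# Hypothetical stage names, ordered from easiest to hardest.
STAGES = ["easy", "medium", "hard"]
final_agent = graded_training({"stages_seen": []}, STAGES, toy_train_stage)
print(final_agent["stages_seen"])
```

In a real setup, `train_stage` would run the RL algorithm on an environment configured for that stage and return the updated agent; only the ordering and hand-over logic are illustrated here.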

Findings

1. RL excels in signal tracking speed and accuracy.
2. PID offers better disturbance rejection and valve longevity.
3. Graded Learning accelerates RL training for nonlinear systems.

Abstract

This paper studies reinforcement learning (RL) as an optimal-control strategy for nonlinear valves and evaluates it against the PID (proportional-integral-derivative) strategy using a unified framework. RL is an autonomous learning mechanism that learns by interacting with its environment. It is gaining attention in control systems as a means of building optimal controllers for challenging dynamic and nonlinear processes. Published RL research often uses open-source tools (Python and OpenAI Gym environments). We use MATLAB's recently launched (R2019a) Reinforcement Learning Toolbox to develop the valve controller, trained using the DDPG (Deep Deterministic Policy Gradient) algorithm, and Simulink to simulate the nonlinear valve and create the experimental test bench for evaluation. Simulink allows industrial engineers to quickly adapt and…
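For readers unfamiliar with the PID baseline named in the abstract, here is a minimal sketch of a discrete PID loop. Everything specific in it is an assumption for illustration: the gains, the time step, and the first-order linear plant are hypothetical stand-ins, not the paper's nonlinear valve model or its MATLAB/Simulink implementation.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains here are illustrative only)."""
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt                  # accumulate the I term
        derivative = (error - self.prev_error) / self.dt  # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(setpoint: float = 1.0, steps: int = 2000,
             dt: float = 0.01, tau: float = 0.5) -> float:
    """Track a constant setpoint on a toy plant y' = (u - y) / tau (assumed, linear)."""
    pid = PID(kp=2.0, ki=1.0, kd=0.0, dt=dt)
    y = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, y)
        y += dt * (u - y) / tau  # forward-Euler update of the plant output
    return y


final_output = simulate()
print(f"final output: {final_output:.4f}")
```

The integral term is what removes the steady-state offset on this toy plant; a nonlinear valve would typically need retuned gains or additional handling that this sketch does not attempt.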

Taxonomy

Methods: Batch Normalization · Adam · Weight Decay · Experience Replay · Convolution · Dense Connections · Deep Deterministic Policy Gradient