A Novel Entropy-Maximizing TD3-based Reinforcement Learning for Automatic PID Tuning

Myisha A. Chowdhury; Qiugang Lu

arXiv:2210.02381·eess.SY·October 6, 2022

Open Access

TL;DR

This paper introduces an entropy-maximizing TD3 reinforcement learning method to improve automatic PID tuning, enhancing sample efficiency and global optimality in complex systems.

Contribution

It proposes a novel EMTD3 algorithm combining stochastic exploration and deterministic exploitation for better PID parameter tuning.

Findings

01 Improved sample efficiency over traditional methods
02 Faster convergence to optimal PID parameters
03 Effective in tuning second-order systems

Abstract

Proportional-integral-derivative (PID) controllers have been widely used in the process industry. However, the control performance of a PID controller depends strongly on its tuning parameters. Conventional PID tuning methods require extensive knowledge of the system model, which is not always available, especially for complex dynamical systems. In contrast, reinforcement learning-based PID tuning has gained popularity because it can treat PID tuning as a black-box problem and deliver the optimal PID parameters without requiring explicit process models. In this paper, we present a novel entropy-maximizing twin-delayed deep deterministic policy gradient (EMTD3) method for automating PID tuning. In the proposed method, an entropy-maximizing stochastic actor is employed at the beginning to encourage exploration of the action space. Then a deterministic actor is…
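The black-box view of PID tuning described in the abstract can be illustrated with a small sketch: a tuner proposes gains, the closed loop is simulated, and a scalar score comes back. Everything specific here is an assumption for illustration and not taken from the paper: the second-order plant with hypothetical `wn` and `zeta`, the forward-Euler discretization, the integral-of-squared-error (ISE) cost, and the function names `simulate_pid` and `reward`.

```python
def simulate_pid(kp, ki, kd, setpoint=1.0, dt=0.01, steps=2000,
                 wn=1.0, zeta=0.5):
    """Close the loop around an assumed plant
    y'' + 2*zeta*wn*y' + wn^2*y = wn^2*u
    and return the integral of squared error (lower is better)."""
    y, dy = 0.0, 0.0            # plant output and its time derivative
    integral = 0.0              # running integral of the tracking error
    prev_error = setpoint - y   # avoids a derivative spike on the first step
    ise = 0.0
    for _ in range(steps):
        error = setpoint - y
        integral += error * dt
        derivative = (error - prev_error) / dt
        # Standard parallel-form PID control law.
        u = kp * error + ki * integral + kd * derivative
        prev_error = error
        # Forward-Euler step of the assumed plant dynamics.
        ddy = wn * wn * (u - y) - 2.0 * zeta * wn * dy
        y += dy * dt
        dy += ddy * dt
        ise += error * error * dt
    return ise


def reward(gains):
    """Black-box score an RL tuner could maximize: negative ISE."""
    kp, ki, kd = gains
    return -simulate_pid(kp, ki, kd)
```

A tuner only ever calls `reward((kp, ki, kd))`; it never sees the plant equations, which is what makes the problem black-box. For example, `reward((2.0, 1.0, 0.5))` scores higher than `reward((0.1, 0.0, 0.0))` on this assumed plant, because the second set of gains leaves a large steady-state error.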

Taxonomy

Topics: Extremum Seeking Control Systems · Advanced Control Systems Optimization

Methods: Experience Replay · Clipped Double Q-learning · Target Policy Smoothing · Adam · Dense Connections · Twin Delayed Deep Deterministic
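Two of the listed methods, clipped double Q-learning and target policy smoothing, together define how standard TD3 forms its critic target. The sketch below shows that computation for a single transition with a scalar action. It illustrates generic TD3 rather than the EMTD3 variant, whose details are not given on this page. The function name `td3_target`, the callables passed in, and the default values are assumptions chosen for illustration.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Critic target for one transition (scalar action), standard TD3 style."""
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = target_actor(next_state) + noise
    # Keep the smoothed action inside the valid action range.
    next_action = max(action_low, min(action_high, next_action))
    # Clipped double Q-learning: use the smaller of the two target critics
    # to counteract overestimation of action values.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Do not bootstrap past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_next
```

In a full agent the three callables would be target networks and the transition would come from the experience replay buffer; here they can be any Python functions, for instance `td3_target(1.0, 0.0, False, lambda s: 0.0, lambda s, a: 1.0, lambda s, a: 2.0)`.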