Model Free Deep Deterministic Policy Gradient Controller for Setpoint Tracking of Non-minimum Phase Systems
Fatemeh Tavakkoli, Pouria Sarhadi, Benoit Clement, Wasif Naeem

TL;DR
This paper evaluates model-free Deep Reinforcement Learning (DRL), specifically Deep Deterministic Policy Gradient (DDPG) controllers, for setpoint tracking in non-minimum phase systems, and compares them with a classical state feedback controller across multiple performance criteria.
Contribution
It introduces a comprehensive performance comparison framework for DRL and classical controllers in control tasks, focusing on robustness and practical applicability.
Findings
DDPG controllers show promising robustness under challenging conditions
Classical controllers still outperform DDPG on some criteria
The paper provides a new evaluation framework for DRL in control systems
Abstract
Deep Reinforcement Learning (DRL) techniques have received significant attention in control and decision-making algorithms. Most applications involve complex decision-making systems, justified by the algorithms' computational power and cost. While model-based versions are emerging, model-free DRL approaches are intriguing for their independence from models, yet they remain relatively less explored in terms of performance, particularly in applied control. This study conducts a thorough performance analysis comparing the data-driven DRL paradigm with a classical state feedback controller, both designed based on the same cost (reward) function of the linear quadratic regulator (LQR) problem. Twelve additional performance criteria are introduced to assess the controllers' performance, independent of the LQR problem for which they are designed. Two Deep Deterministic Policy Gradient…
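The shared LQR cost (reward) described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the discrete-time formulation, the second-order plant matrices `A`, `B`, `C` (chosen so the transfer function has a zero at z = 1.5, outside the unit circle, which makes it non-minimum phase), the weights `Q` and `R`, and the helper names `lqr_reward` and `dlqr_gain`. The sketch shows how the same quadratic cost can serve as the reward for a learning agent and as the design objective for a classical state feedback gain.

```python
import numpy as np

# Hypothetical discrete-time plant in controllable canonical form:
# G(z) = (z - 1.5) / (z^2 - 1.2 z + 0.35), poles at 0.5 and 0.7,
# zero at 1.5 (outside the unit circle, hence non-minimum phase).
A = np.array([[0.0, 1.0],
              [-0.35, 1.2]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[-1.5, 1.0]])

# Assumed LQR weights (illustrative values only).
Q = np.eye(2)
R = np.array([[0.1]])


def lqr_reward(x, u, Q, R):
    """Reward as the negative LQR stage cost: r = -(x^T Q x + u^T R u)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return -float(x @ Q @ x + u @ R @ u)


def dlqr_gain(A, B, Q, R, iters=500):
    """State feedback gain K (u = -K x) from iterating the discrete Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P = Q + A^T P A - A^T P B K
        P = Q + A.T @ P @ (A - B @ K)
    return K


K = dlqr_gain(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

In this sketch a DDPG agent would receive `lqr_reward(x, u, Q, R)` at each step, while the classical controller applies `u = -K @ x`; both are then scored against the same quadratic objective.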
Taxonomy
Topics: Mechanical Circulatory Support Devices · Cardiovascular Function and Risk Factors · Fuel Cells and Related Materials
Methods: Experience Replay · Weight Decay · Convolution · Dense Connections · Batch Normalization · Adam · Deep Deterministic Policy Gradient
