Design of twin delayed deep deterministic policy gradient RL based adaptive controller for DC motor speed regulation considering uncertainties
P. Gaur, V. P. Singh, T. Varshney, P. Sanjeevikumar, A. Mathur

TL;DR
This paper proposes a reinforcement learning-based controller for regulating DC motor speed under uncertain conditions and compares it with other control methods.
Contribution
A novel twin delayed deep deterministic policy gradient RL controller is designed and evaluated for DC motor speed regulation in uncertain environments.
Findings
The TD3 RL controller demonstrated robustness and adaptability in handling dynamic uncertainties.
Error indices like ISE, ITAE, IAE, and ITSE were used to compare controller performance.
The proposed controller outperformed benchmark controllers in tracking precision and error minimization.
Abstract
Reinforcement learning offers efficient solutions for optimizing complex decision-making tasks through continuous state-action-reward cycle with real-time adaptability. This work presents twin delayed deep deterministic (TD3) policy gradient RL based adaptive speed controller for the DC motor model while considering the impact of various uncertainties from dynamic environment into account. Various benchmark controller techniques are also utilized for similar objective in order to perform comparative analysis. Responses of each controller are plotted for both constant and variable desired speeds to evaluate their efficacy, robustness, and adaptability to uncertainties. Values of various types of error indices, including integral of squared error (ISE), integral of time-weighted absolute error (ITAE), integral of absolute error (IAE), integral of time-weighted squared error (ITSE), and…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10- —University Of South-Eastern Norway
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdaptive Dynamic Programming Control · Sensorless Control of Electric Motors · Reinforcement Learning in Robotics
Introduction
Reinforcement learning (RL) is a type of machine learning technique where an algorithm learns to make sequential decisions by optimizing its actions in order to maximize a cumulative reward^1,2^. RL has caught attention of researchers of various domains due to it’s capabilities of improving efficiency, save energy, and make more accurate predictions in various industrial application^3,4^. It has proven invaluable potential in real-time decision-making and optimization tasks. Moreover, RL is transforming healthcare sector, where it aids in medical image analysis, radiology reports, real-time health monitoring, personalized treatment plans, and drug discovery^5,6^. The electrical vehicle domain is also witnessing technological advancement due to implementation of RL in load forecasting in charging station, improved battery life, vehicle-to-grid (V2G), and real-time controls^7–9^. RL assisted automated trading and portfolio management are upgrading financial sector^10,11^. The power sector is also going through noteworthy up-gradation due to integration of RL techniques in load-generation balance, optimal resource allocation, and unit commitment tasks^12–14^. Additionally, RL is widely applied in areas such as military^15^, cyber security^16^, aviation^17,18^, cryptocurrency^19,20^, smart agriculture^21,22^, etc.
RL has evolved from fundamental concepts to advance algorithms over the years in order to address complex decision-making problems. The concept of optimality, which laid the framework for Markov decision processes (MDPs) and dynamic programming, was introduced by Bellman^23^. Watkins developed Q-learning^24^, a model-free approach for learning state-action values. Mnih et al.^25^ revolutionized RL with DQN, enabling high-dimensional state space handling by fusing Q-learning and deep neural networks. To extend RL to continuous actions, DPG^26^ was introduced, followed by DDPG^27^, which used an actor-critic framework for improved stability. Twin delayed deep deterministic (TD3) policy gradient algorithm^28^ refined DDPG by addressing overestimation bias and enhancing exploration, making it more efficient for continuous control tasks.
Following the exploration of RL techniques in control systems engineering in various domains, it is essential to understand the fundamental dynamics and control challenges associated with DC motors. DC motors remain a widely used machine in diverse industrial fields such as aviation^29^, electrical vehicle^30^, robotics^31^, irrigation^32,33^, etc. The performance of DC motor is impacted by various external factors such as presence of moisture, dirt, and dust^34^. Performance variation is also observed for the DC motor as the operating temperature changes^35^. Both unbalanced power supply, and failed ventilation system cause degradation in efficacy of the DC motor^34^. The speed control is crucial for ensuring precise performance, energy efficiency, and stability under dynamic operating conditions^36,37^.
Various types of controller designing techniques have been utilized in past research works e.g. AMIGO method^38^, Cohen-Coon (CC) method^39^, Ho’s method^40^, and Zigler-Nichols (ZN) method^41^. These controller techniques^38–41^ may get stuck in sub-optimal solutions, exhibit limited adaptability to disturbances, and struggle with uncertainties in environments. Metaheuristic algorithms like moth flame optimization (MFO)^42^, cuckoo optimization algorithm (COA)^43^, grey wolf optimization (GWO)^44^, whale optimization algorithm (WOA)^45^, and ant colony optimization (ACO)^46^ algorithm are also implemented for designing controller in order to address various industrial objectives. Still such controllers algorithms^42–46^ suffer from computational complexity, slow convergence speed, and performance degradation due to uncertainties in environments.
As the efficiency of the DC motor system is influenced by a wide range of factors^34,35^, need for efficient controller designing is highly desirable, which is adaptive in nature and performs efficiently while considering various uncertainties in the system environment. Although RL is widely applied for various real life applications by reseachers^5–13,15–22^, yet its application in the DC motor speed regulation application remains relatively underexplored. This research gap highlights the need for further investigation into how RL can be utilized for DC motor system to achieve speed regulation in efficient manner while considering various type of uncertainties.
This article proposed TD3 RL based adaptive speed controller in order to achieve efficient speed regulation for DC motor system while considering various uncertainties from environment. Transfer function (TF) based DC motor model is utilized in this work to design and implement the TD3 RL for similar objective. Further, that DC motor’s TF is approximated into first-order plus dead time (FOPDT) model to calculate gains for various benchmark controllers to perform comparative analysis against the proposed adaptive speed controller. Responses of all designed controllers are plotted for both constant and variable desired speeds to test each controller’s adaptability, transition performance, and stability in dynamic operating conditions. Effectiveness and robustness of each controller are evaluated by calculating key error indices. Later, all these error indices are tabulated and compared to evaluate performance of each controller among themselves.
This research work’s subsequent sections are organized as follows: Section “TD3 policy gradient algorithm” presents the key improvements introduced in TD3 RL compared to DDPG RL, along with the algorithm of TD3 RL. Section “System description” explains TD3 algorithm’s structure implementation for speed regulation for the DC motor model while considering various uncertainties from dynamic environment. This section also explains working and significance of each key components utilized in overall control strategy. In Section “Simulation results and discussion”, the results are plotted for both constant and variable desired speeds. Key error indices are also calculated, and tabulated for comparative analysis in this section. Lastly, Section “Conclusion and future work” encapsulates the article with key outcomes and suggesting pathways for extended research.
TD3 policy gradient algorithm
TD3 algorithm is one of the recent advancements in RL techniques as it improves DDPG by using methods that increase stability, lower overestimation, and assure robust learning. These updates make TD3’s performance more effective in continuous action spaces by addressing issues such as unstable updates, overfitting, and Q-value estimate errors. TD3 RL is applied by researchers in various industrial applications such as aviation^47^, 5G-communication^48^, robotics^49^, electrical vehicle^50^, hybrid power system^51^, etc. Three major updates in TD3 compared to DDPG are written below.
- TD3 algorithm utilized twin critic networks in order to estimate Q-value. The lesser of the two values is chosen during target updates. This method will ensure safer policy update and enhance stability of learning throughout training process by reducing the possibility of overestimation and ensures that Q-value estimations stay closer to their actual values.
- The actor-network undergoes updates less often in TD3 RL compared to the critic network by continuously shifting Q-value parameters in order to avoid instability. This approach ensures more reliable and steady improvements in the actor network over time by delaying policy updates, which avoids repeated adjustments with regard to an unmodified critic.
- When calculating the Q-value in TD3, noise drawn from a truncated normal distribution is added to the target actions to prevent deterministic policies from overfitting to narrow peaks in value estimations. This lowers the target’s variance and improves robustness, particularly in stochastic situations with possible failure instances. After the above three key reform, the target values ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$y_1$$\end{document} , and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$y_2$$\end{document} ) for the twin critic networks are computed as per (1).
where, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Q'_{\psi _1}$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Q'_{\psi _2}$$\end{document} are the target critic networks, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_t$$\end{document} is reward, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\gamma$$\end{document} is discount factor, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s'_{t+1}$$\end{document} is state, and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${\tilde{a}}_t$$\end{document} is the noise-added target action. This noise is sampled from a clipped normal distribution and is added to the output of the target actor network as:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} {\tilde{a}}_t \leftarrow \eta _{\phi '}(s'_t) + \epsilon , \quad \epsilon \sim \operatorname {clip}\!\left( {\mathcal {N}}(0, {\tilde{\sigma }}), -c, c\right) \end{aligned}$$\end{document}Next, to reduce overestimation bias, the minimum of the two calculated target Q-values is utilized as the final target Q-value:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} y \leftarrow r_t + \gamma \min _{i=1,2} Q'_{\psi _i}(s'_{t+1}, {\tilde{a}}_t) \end{aligned}$$\end{document}The update of the critic networks involves minimizing the mean squared error (MSE) between the computed target value and the current Q-values. Later, the deterministic policy gradient (DPG) update is utilized to modify the actor-network, as shown by:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} \nabla _\phi J(\phi ) = N^{-1} \sum \nabla _{a_t} Q_{\psi _1}(s_t, a_t)\Big |_{a_t = \eta _\phi (s_t)} \nabla _\phi \eta _\phi (s_t) \end{aligned}$$\end{document}Finally, soft updates are performed for both target-critic networks and target-actor network using Polyak averaging to ensure stable learning dynamics across iterations. The algorithm of TD3 RL is discussed in Algorithm 1.
Algorithm 1TD3 RL.
Interpretations of various parameters utilized in above TD3 RL algorithm are mention in the Table 1.Table 1. Parameters of TD3 RL algorithm.S.N.ParametersDescriptions1 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Q_{\psi _1}, Q_{\psi _2}$$\end{document} Twin critic networks, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\psi _1$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\psi _2$$\end{document} are their respective weights2 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta _{\phi }$$\end{document} Actor network, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\phi$$\end{document} is its respective weight3 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\psi _1', \psi _2'$$\end{document} Weights of target critic networks4 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\phi '$$\end{document} Weight of target actor network5 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document} Replay buffer6TTotal number of episodes7Sample \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$(s_t, a_t, r_t, s'_{t+1})$$\end{document} Transition group in which \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_t$$\end{document} , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$a_t$$\end{document} , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s'_{t+1}$$\end{document} , and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_t$$\end{document} are the state, action, next state, and reward, respectively8 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\gamma$$\end{document} Discount factor9 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\nabla _\varphi J(\varphi )$$\end{document} A gradient of the actor-network’s loss function10 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\psi _i'$$\end{document} Updated value of the critic network’s weight parameter11 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\phi '$$\end{document} Adjusted weight parameter’s value for critic-network12 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\tau$$\end{document} Adjusted weight parameter’s value for actor-network
System description
This section describes the design and implementation of TD3 RL based adaptive controller for DC motor speed regulation. Overall structural implementation of TD3 RL agent for speed regulation is depicted in Fig. 1. The key components, depicted in Fig. 1, are DC motor, setpoint filter, TD3 RL agent, observation, reward and uncertainties. DC motors are categorized based on methods to supply current to field winding. It can be either self-excited or externally-excited types. In this work, an externally-excited type DC motor is utilized to design of the adaptive speed controller.Fig. 1TD3 algorithm’s structure implementation.
The TF of the DC motor^52^ in an open-loop configuration is given as
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} D_{tf}(s) = \frac{\omega (s)}{V_a(s)} = \frac{K}{(L_a s + R_a)(Js + B) + K_b K} \end{aligned}$$\end{document}where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\omega$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$V_a$$\end{document} are the angular speed of motor shaft (rad/s) and the armature voltage (V), respectively. Both the representation and the values of various other parameters, as mentioned in (5), are tabulated in Table 2^53–55^. After utilizing values form Table 2, following TF of DC motor is obtained:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} D_{tf}(s) = \frac{\omega (s)}{V_a(s)} = \frac{0.015}{0.00108s^2+0.0061s+0.00163} \end{aligned}$$\end{document}The block diagram of the DC motor model is illustrated in Fig. 2^52^.Fig. 2DC motor model.Table 2. Parameters of DC motor.S. N.ParameterSymbolValue1Motor torque constantK0.015 N.m/A2Armature inductance \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$L_a$$\end{document} 2.7 H3Armature resistance \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$R_a$$\end{document} 0.4 \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\Omega$$\end{document} 4Inertia torque of motorJ0.0004 kg.m^2^5Motor friction constantB0.0022 N.m.s/rad6Electromotive force constant \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_b$$\end{document} 0.05 V.s/rad
Sharp variation in desired speed demands quick adjustments from DC motor, which might cause sudden excessive stress on mechanical parts. Such events are highly undesirable because they might degrade overall lifetime and result in frequent maintenance of the motor. That’s why a setpoint filter is utilized in overall speed control strategy in order to ensuring smooth transitions, reducing overshoot, and preventing aggressive control actions^56^. First order setpoint filter, which is the applied to reference speed in Fig. 1, is given as:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} F(s) = \frac{1}{1+\alpha _fs} \end{aligned}$$\end{document}In (7), \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha _f$$\end{document} is coefficient of set-point filter. This time constant helps in minimizing overshoots during speed variations.
TD3 RL agent is utilized in the overall structural implementation of the adaptive speed controller for DC motor system. It’s various hyperparameter and their specifications are tabulated in Table 3. The observation block is utilized in the overall speed control strategy, because it processes the state information ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_t$$\end{document} ) received from the environment and feed that to TD3 RL agent, assisting it to choose right action ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$a_t$$\end{document} ) for that state. It takes system states such as speed error ( \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$e_r(t)$$\end{document} ) as an input in proposed adaptive controller. This block takes \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$e_r(t)$$\end{document} and it’s discrete-time integrator as inputs to further process these to TD3 RL agent.
Reward block provides feedback to the TD3 RL agent by evaluating how well \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$a_t$$\end{document} contributes to achieving the desired system performance. It also penalizes \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$e_r(t)$$\end{document} to encourage accurate tracking to desired speed. It is a critic component which ensures that TD3 agent adapts effectively to various types of uncertainties and dynamic environment. Output from reward block is given as:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} r(t) = -\left( \lambda \, e_r(t)^2 + \beta \, u(t)^2 \right) \end{aligned}$$\end{document}In (8), \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta$$\end{document} are error penalty factor and action penalty factor, respectively. Value of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\lambda$$\end{document} is considered as 1 whereas the value of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta$$\end{document} as 0.01. Controlled signal, u(t), represents the output of TD3 RL agent. Numerous external factors, such as operating temperature, unbalanced power supply, ventilation system, environmental conditions^34,35^, etc. impact the efficacy of the DC motor. Hence, robustness of the proposed adaptive controller is tested by utilizing the uncertainties block in overall speed regulation strategy. This block adds random noise into overall speed regulation system as depicted in Fig. 3. This uncertainties are added by utilizing band-limited white noise block which has noise power of 0.01, a sample time of 0.1 seconds, and a seed value of 23341. It is added to controller’s output which is taken as input to the DC motor model as depicted in Fig. 1.Table 3TD3 hyperparameters.S. N.HyperparametersValue1Mini batch size1282Gradient decay factor0.93Policy update frequency24Actor-network’s learning rate \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$10^{-2}$$\end{document} 5Experience buffer length \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$10^6$$\end{document} 6Critic-network’s learning rate \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$10^{-2}$$\end{document} 7Target smooth factor0.0058Target update frequency29Discount factor0.9910TD3 activation functionReLU11Gradient threshold methodl2 norm
Fig. 3. Uncertainties considered due to multiple influencing factors.
Comparative analysis against benchmark controllers
TD3 RL-based adaptive speed controller’s performance is evaluated against benchmark controllers from the literature. Key controllers techniques used in comparative analysis include Klein et al.^57^, Fertik and Sharpe^58^, McMillan^59^, and Edgar et al.^60^. These benchmark controllers utilized FOPDT model of system in order to design desired controller for any specific purpose. FOPDT model’s generalized expression is mentioned as
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} D_{fopdt} (s) = \frac{K_m}{sT_m+1}e^{-s\tau _d} \end{aligned}$$\end{document}where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_m$$\end{document} is gain, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_m$$\end{document} is time-constant, and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\tau _d$$\end{document} is delay time of approximated process. The FOPDT model of any TF based model can be obtained by utilizing different approximation techniques such as Maclaurin expansion^61^, Skogestad half rule method^62^, relay feedback^63^, and the process reaction curve method^64^, etc. This work utilized the process reaction curve method in order to obtain the approximated FOPDT model of the DC motor’s TF system as mentioned in (10).
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} D_{fopdt} (s) = \frac{9.1731}{4.2074s+1}e^{-0.1321s} \end{aligned}$$\end{document}Figure 4 presents a comparative analysis of DC motor TF model and its approximated FOPDT model. The step and impulse responses of DC motor TF model and its FOPDT model are given in Fig. 4a and b, respectively. Both responses (Fig. 4a and b) illustrate that the approximated FOPDT model closely follows the original DC motor system’s transient behavior with minimal deviation. Additionally, the Bode plot (given in Fig. 4c) highlights the frequency response characteristics, where the magnitude and phase plots indicate a reasonable approximation of the original system. Finally, the Nyquist plot, as presented in Fig. 4d, demonstrates the stability and frequency domain similarity of the approximated model to the actual DC motor. These results validate that the obtained approximated FOPDT model is maintaining essential system behavior of DC motor system dynamics, making it suitable for further utilization in this work in order to design and implementation of benchmark controller techniques.Fig. 4. Comparison between the DC motor model and its approximated FOPDT model. (a–d) represent step response, impulse response, Bode plot, and Nyquist plot, respectively.
This work utilized proportional and integral (PI) controller for speed regulation of DC motor system. Standard PI controller’s structure is given as:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} C_{speed}(s) = K_p ( 1 + \frac{1}{sT_i}) \end{aligned}$$\end{document}In (11), \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_p$$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_i$$\end{document} are controller’s gains which can be derived utilizing benchmark controller^57–60^ techniques as indicated in Table 4.Table 4. Formulae for controller parameters for different techniques.S. N.Controller technique \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_p$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_i$$\end{document} 1Klein et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{0.28 T_m}{K_m (\tau _d + 0.1 T_m)}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.53 T_m$$\end{document} 2Fertik and Sharpe \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{0.56}{K_m}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.65 T_m$$\end{document} 3McMillan \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{K_m}{3}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\tau _d$$\end{document} 4Edgar et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\frac{0.4}{K_m}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0.5 T_m$$\end{document}
Values of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_m$$\end{document} , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\tau _d$$\end{document} , and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_m$$\end{document} are obtained as 9.1731, 0.1321, and 4.2074, respectively by comparing (9) and (10). These values are put in Table 4 for calculating the controller gains. After putting these values, Table 5 is obtained.Table 5. Values for controller parameters for different techniques.S.N.Controller technique \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$K_p$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_i$$\end{document} 1Klein et al.0.23232.232Fertik and Sharpe0.0612.7353McMillan3.05770.13214Edgar et al.0.04360.06605
Simulation results and discussion
Response of proposed TD3 RL based adaptive controller is observed along with of all benchmark controllers for two different test cases for tracking desired speed while considering various uncertainties from environment into account. First test case is considered for 45 kmph speed which remains constant for overall simulation time. Later, the desired speed is considered as variable for evaluating controller’s adaptability, transition performance, and stability in dynamic operating conditions for second test case.
Test case 1: Desired speed remains constant
The training convergence profile of the TD3RL agent is illustrated in Fig. 5. The training graph demonstrated the stable learning process over 300 episodes. The episode reward demonstrating consistent improvement until convergence which confirm the successful learning and effective control policy.
Responses of all controllers which are mentioned in Table 5 are plotted in Fig. 6 along with TD3 RL based controller for tracking constant desired speed which is 45 kmph. Values of various types of error indices, including integral of squared error (ISE), integral of time-weighted absolute error (ITAE), integral of absolute error (IAE), integral of time-weighted squared error (ITSE), and their respective time-weighted variants like integral of time-squared absolute error (IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{2}$$\end{document} AE) and integral of time-squared squared error (IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{2}$$\end{document} SE), are utilized to evaluate the effectiveness of all controllers. Formulae for these key error indices are mentioned in Table 6, where \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$e_r(t)$$\end{document} is error signal and T is overall time.Fig. 5. Convergence curves of TD3RL agent’s training.Fig. 6. The DC motor response corresponding to constant desired speed. (a–d) represent different approaches compared with TD3 RL based adaptive controller.Table 6. Key performance indices and their mathematical formulations.S.N.Performance indexFormula1ISE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T e_r^2(t) \, dt$$\end{document} 2IAE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T |e_r(t)| \, dt$$\end{document} 3ITAE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T t |e_r(t)| \, dt$$\end{document} 4ITSE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T t e_r^2(t) \, dt$$\end{document} 5IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} AE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T t^2 |e_r(t)| \, dt$$\end{document} 6IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} SE \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\int _0^T t^2 e_r^2(t) \, dt$$\end{document} Table 7. Key error indices for various controller techniques for constant desired speed.S.N.MethodISEIAEITAEITSEIT^2^AEIT^2^SE1Klein et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.7{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.1{\times }10^2$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.7{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.7{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$4.3{\times }10^7$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^7$$\end{document} 2Fertik & Sharpe \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5.7{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$4.2{\times }10^2$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$4.7{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.5{\times }10^7$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.2{\times }10^7$$\end{document} 3McMillan \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.0{\times }10^{301}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.6{\times }10^{154}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.5{\times }10^{154}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5.7{\times }10^{304}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.4{\times }10^{157}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5.3{\times }10^{307}$$\end{document} 4Edgar et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.2{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.0{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.6{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.2{\times }10^8$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$8.0{\times }10^7$$\end{document} 5TD3 RL \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf {2.9{\times }10^1}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{9.9{\times }10^1}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{4.1{\times }10^4}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{6.0{\times }10^3}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{2.6{\times }10^7}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{3.7{\times }10^6}$$\end{document}
The results from Fig. 6 indicate that the TD3 RL based adaptive controller exhibits superior tracking capability for DC motor model with fast convergence and minimal overshoot to the desired speed. In contrast, the Klein et al. and Fertik and Sharpe controllers demonstrate a slower response, with noticeable overshoot and settling time. The McMillan controller suffers from extreme instability, as reflected in its high oscillatory response. Similarly, the Edgar et al. controller exhibits excessive oscillations before stabilizing, making it inadequate for precise speed regulation.
The quantitative analysis in terms of error indices is presented in Table 7. This table further confirms effectiveness of the TD3 RL controller against rest of benchmark controller techniques. It achieves the lowest ISE value of 28.6, a significant improvement over Klein et al. (i.e. 1724), Fertik and Sharpe (i.e. 5713), and Edgar et al. (i.e. 1.174e+04). McMillan controller’s extremely high error values indicate an unstable response across all metrics, highlighting its unsuitability for precise speed control. The IAE value of 99.31 for TD3 RL is the lowest, reflecting better sustained error minimization. Similarly, for ITAE and ITSE metrics, TD3 RL achieves 4.14e+04 and 5977, significantly outperforming the others. Additionally, the significant reductions are also observed for the higher-order error indices (IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} AE and IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} SE) for proposed adaptive controller. That further confirms the robustness and efficacy of TD3 RL based speed controller for DC motor system.
Test case 2: Desired speed is considered as variable
In real world application scenarios, the speed of DC motor rarely remains constant for entire operation period. Designed speed controller may perform well for constant speed but it might fail to adjust the desired speed in dynamic operating conditions. That’s why it become necessary to test the performance of all controllers for variable desired speeds in order to evaluate their efficacy while ensuring smooth transitions between speeds. This test case is considered to verify the stability, robustness, and tracking accuracy of all DC motor speed controllers. Responses of all controllers, which are tested earlier, are plotted in Fig. 7 for tracking variable desired speed.Fig. 7. The DC motor response corresponding to variable desired speed. (a–d) represent different approaches compared with TD3 RL based adaptive controller.
As observed from Fig. 7, the TD3 RL controller successfully tracks the sharp changes in desired speed with minimal steady-state error and rapid adaptation. In contrast, the Klein et al. and the Fertik and Sharpe controllers exhibit slower transient responses, leading to noticeable tracking delays in desired speed for DC motor system. The McMillan controller again demonstrates excessive oscillations, failing to provide stable control over varying speed demands. Similar excessive oscillations are also observed for the Edgar et al. controller before settling to desired speed. The unstable behavior observed in both the Edgar et al. and McMillan controller techniques highlights their limitations in handling dynamic speed variations effectively.Table 8. Key error indices for various controller techniques for variable desired speed.S.N.MethodISEIAEITAEITSEIT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${^2}$$\end{document} AEIT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${^2}$$\end{document} SE1Klein et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.2{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3.7{\times }10^2$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.3{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.9{\times }10^6$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.4{\times }10^7$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$7.8{\times }10^8$$\end{document} 2Fertik & Sharpe \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.1{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$3.4{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.5{\times }10^6$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.6{\times }10^8$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.7{\times }10^9$$\end{document} 3McMillan \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^{257}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.8{\times }10^{128}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5.4{\times }10^{131}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$5.4{\times }10^{131}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$4.3{\times }10^{134}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$6.6{\times }10^{262}$$\end{document} 4Edgar et al. \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.2{\times }10^4$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.0{\times }10^3$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.0{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2.6{\times }10^5$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1.2{\times }10^8$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$8.0{\times }10^7$$\end{document} 5TD3 RL \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{6.8{\times }10^1}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{1.2{\times }10^2}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{4.2{\times }10^4}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{2.2{\times }10^4}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{2.1{\times }10^7}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\bf{9.3{\times }10^6}$$\end{document}
The comparison among error indices, presented in Table 8, further validate the superiority of the TD3 RL controller compared to the rest of benchmark controller techniques. It maintains the lowest ISE, IAE, ITAE, and ITSE values, indicating enhanced tracking accuracy and adaptability. The ISE value for TD3 RL based controller is 67.58. It is far better than next best value of 6219 (Klein et al.), indicating exceptional tracking precision. McMillan again exhibits an unstable response with excessively high values for all types of performance indices. The IAE value of 116.2 is the lowest for proposed controller, showing consistent error minimization. Further, in the ITAE and ITSE metrics, TD3 RL records 4.243e+04 and 2.179e+04, which are far better than rest of control strategies. The significant reduction is again observed in higher-order error indices (IT²AE and IT²SE) which further supports that the TD3 RL based adaptive controller’s outperformed to minimize fluctuations and energy consumption during speed transitions. The improved performance under dynamic speed variations demonstrates the capability of the TD3 RL-based controller to handle real-world operating scenarios for DC motor system more effectively than benchmark controller techniques.
Analysis of control effort and performance trade-offs
The evaluation of integrated absolute control effort for each controller technique is critical parameter to quantifies the total control energy expended by any controller. It is also practical consideration for measuring the system efficiency and overall energy consumption. The control effort \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$J_{effort}$$\end{document} is defined as:
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} {J_{effort} = \int _{0}^{T} |u(t)| dt} \end{aligned}$$\end{document}where u(t) is the control signal and T is the simulation time.
Both constant and variable desired speed cases are taken into consideration for calculation of the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$J_{effort}$$\end{document} . The results are demonstrated in Table 9 reveal a critical performance trade-off.Table 9. Control-effort metric for both test cases.S.N.MethodTest case-1Test case-21Klein et al.502.236792Fertik and Sharpe483.5****36443McMillan \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$7.805 \times 10^{20}$$\end{document} \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$4.524 \times 10^{149}$$\end{document} 4Edgar et al.61641765TD3 RL533.33832
The TD3 RL controller achieves a control effort of 533.3 and 3832 in test case 1 and 2, respectively. It’s performance is highly competitive and lies within a narrow margin of the best-performing classical controllers by this metric. This marginal difference in energy consumption must be acknowledge in the context of overall performance.
While the Fertik and Sharpe method shows a slightly lower control effort, this minor advantage comes at a significant cost to desired speed tracking precision, as interpreted by its higher error indices reported in Tables 7 and 8. Whereas, the proposed TD3RL method presents a superior balance, delivering near-optimal control efforts while simultaneously achieving the lowest tracking errors indices. It avoids the undesirable extreme control effort of the McMillan method and also provides a more stable solution than the Edgar et al. technique. Therefore, the TD3 RL controller emerges as the most balanced performing method after considering trade-off between energy consumption and desired speed tracking accuracy.
Impact of setpoint filter on response of TD3RL agent
The setpoint filter plays critical role in determining how aggressively the desired speed trajectory is followed by the TD3 agent. Three test cases have been performed to evaluate the impact of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha _f$$\end{document} in which it’s value is taken as 0.5, 1.5, and 4.5, respectively. The response from all these test cases are plotted in Figs. 8 and 9 for constant and variable desired speed, respectively.Fig. 8. Response due to various setpoint filter on DC motor system for constant desired speed.Fig. 9. Response due to various setpoint filter on DC motor system for variable desired speed.
The \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha _f$$\end{document} of 1.5 is identified as the optimal value. It provides a fast response with minimal overshoot and, most critically. Comparatively, it eliminates the undesired oscillations observed as with the 0.5 value of \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha _f$$\end{document} , while avoiding the excessive overshoot of the 4.5 value.
Conclusion and future work
This study proposed the design and implementation of TD3 RL based adaptive speed controller for the DC motor model while considering various types of uncertainties in order to test it’s performance in dynamic operating scenario. Both qualitative and quantitative results indicate that TD3 RL based controller consistently outperformed rest of benchmark controller techniques across all error indices for both constant and variable desired speed test cases, indicating its higher adaptability, efficacy, and robustness. Proposed adaptive controller exhibits least values for both IAE and ISE, for both test cases, which indicates exceptional tracking precision and error minimization qualities. Further, the lowest values for higher-order error indices (IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} AE and IT \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^2$$\end{document} SE) suggest that the proposed TD3 RL based adaptive speed controller maintains performance despite uncertainties in dynamic environment for both test cases. Thus, this study highlighted the potential of RL in the DC motor speed regulation in efficient manner. The presented research work is achieved via software-based validation in MATLAB Simulink environment. Future work will focus on bridging this gap through hardware-in-loop (HIL) validation on a real DC motor setup under parameter variation and a comparative analysis with other RL algorithms, such as SAC, PPO, and A3C, to detailed comparative analysis. Multiple safety protocol should be followed for transition from simulation to practice such as defining ramp rate for desired speed variation and setting up HIL setup for continuous feedback on TD3 agent’s training. Additionally, a supervisory safety layer should be deployed for monitoring for fault conditions such as over-current, excessive speed deviation, or abnormal winding temperatures. Emergency-stop mechanisms would be critical in order to override the RL agent’s commands for prevention of damage to the DC motor system in case of any contingency occurs.
The performance of TD3RL algorithm is sensitivity to hyperparameter selection, particularly to replay buffer, discount factor and total number of episodes for training period. Hence, the values of hyper-parameters used in this research work may not be generalize to other industrial systems without extensive retraining of agent. Additionally, the memory requirements grow substantially with replay buffer size and complexity of the operating system. The implementation of TD3RL algorithm framework to complex industrial objectives is future scope of this work. The TD3RL can be utilized in autonomous drone navigation as it has robust capability for handling of aerodynamic disturbances and providing smooth operating control. Similarly, in load frequency control for smart grids, the trained TD3 RL can enables continuous adjustments to maintain grid frequency while taking fluctuating power demands and non-linear grid dynamics into the account. Additionally, TD3 can be utilized for the speed control of electric vehicles by efficiently handling external driving uncertainties like road gradient, temperature, speed limit, rate of traffic flow, etc.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Thakur, P. & Talwandi, N. S. Deep reinforcement learning in healthcare and bio-medical applications. In 2024 IEEE International Conference on Computing, Power and Communication Technologies (IC 2PCT), vol. 5, 742–747. 10.1109/IC 2PCT 60090.2024.10486549 (2024).
- 2Hu, Y.-J. & Lin, S.-J. Deep reinforcement learning for optimizing finance portfolio management. In 2019 Amity International Conference on Artificial Intelligence (AICAI), 14–20. 10.1109/AICAI.2019.8701368 (2019).
- 3Barbalho, P. et al. Reinforcement learning solutions for microgrid control and management: A survey. IEEE Access (2025).
- 4Silva, C., Andrade, P., Ribeiro, B.,& Santos, F. B Adaptive reinforcement learning for task scheduling in aircraft maintenance. Sci. Rep.13, 16605 (2023).10.1038/s 41598-023-41169-3PMC 1054782937789033 · doi ↗ · pubmed ↗
- 5Boubin, J. et al. Marble: Multi-agent reinforcement learning at the edge for digital agriculture. In 2022 IEEE/ACM 7th Symposium on Edge Computing (SEC), 68–81 (IEEE, 2022).
- 6Han, B., Ren, Z., Wu, Z., Zhou, Y. & Peng, J. Off-policy reinforcement learning with delayed rewards. In International conference on machine learning, 8280–8303 (PMLR, 2022).
- 7Mnih, V. Playing atari with deep reinforcement learning. Preprint at ar Xiv:1312.5602 (2013).
- 8Silver, D. et al. Deterministic policy gradient algorithms. In International conference on machine learning, 387–395 (PMLR, 2014).
