Broad Critic Deep Actor Reinforcement Learning for Continuous Control
Shiron Thalagala, Pak Kin Wong, Xiaozheng Wang, and Tianang Sun

TL;DR
This paper introduces a hybrid actor-critic reinforcement learning framework combining broad learning systems with deep neural networks to improve training efficiency and accuracy in continuous control tasks.
Contribution
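The paper presents a hybrid actor-critic architecture that pairs a broad learning system (BLS) critic with a deep neural network (DNN) actor. The design is meant to be integrated into existing actor-critic algorithms to improve their training efficiency.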
Findings
BLS-augmented algorithms outperform original versions in training speed.
The hybrid framework improves accuracy in continuous control tasks.
Enhanced algorithms are suitable for real-time control scenarios.
Abstract
In the domain of continuous control, deep reinforcement learning (DRL) demonstrates promising results. However, the dependence of DRL on deep neural networks (DNNs) results in the demand for extensive data and increased computational cost. To address this issue, a novel hybrid actor-critic reinforcement learning (RL) framework is introduced. The proposed framework integrates the broad learning system (BLS) with DNN, aiming to merge the strengths of both distinct architectural paradigms. Specifically, the critic network employs BLS for rapid value estimation via ridge regression, while the actor network retains the DNN structure to optimize policy gradients. This hybrid design is generalizable and can enhance existing actor-critic algorithms. To demonstrate its versatility, the proposed framework is integrated into three widely used actor-critic algorithms -- deep deterministic policy…
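The abstract describes the critic as a BLS whose value estimates come from ridge regression. A minimal sketch of that idea follows. It is not the authors' implementation: the class name `BroadCritic`, the node counts, the `tanh` enhancement layer, and the ridge strength are all assumptions chosen for illustration. Only the output weights are fitted, in closed form, as W = (AᵀA + λI)⁻¹AᵀY, where A stacks the feature and enhancement nodes.

```python
import numpy as np


class BroadCritic:
    """Hypothetical BLS-style value estimator (illustrative names and sizes)."""

    def __init__(self, in_dim, n_feature=20, n_enhance=40, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden weights stay fixed; only the output layer is fitted.
        self.Wf = rng.standard_normal((in_dim, n_feature))
        self.bf = rng.standard_normal(n_feature)
        self.We = rng.standard_normal((n_feature, n_enhance))
        self.be = rng.standard_normal(n_enhance)
        self.ridge = ridge
        self.W_out = np.zeros((n_feature + n_enhance, 1))

    def _expand(self, x):
        z = x @ self.Wf + self.bf            # feature nodes (linear map)
        h = np.tanh(z @ self.We + self.be)   # enhancement nodes (nonlinear)
        return np.hstack([z, h])             # the "broad" layer A

    def fit(self, x, targets):
        # Closed-form ridge regression for the output weights.
        a = self._expand(x)
        gram = a.T @ a + self.ridge * np.eye(a.shape[1])
        self.W_out = np.linalg.solve(gram, a.T @ targets.reshape(-1, 1))

    def predict(self, x):
        return (self._expand(x) @ self.W_out).ravel()
```

In an actor-critic loop, `x` would plausibly be a batch of concatenated state-action vectors and `targets` the bootstrapped value targets, while the DNN actor is still updated by policy gradients. Because the fit is a single linear solve rather than many gradient steps, this is one way the critic update could become cheaper, which is the efficiency argument the abstract makes.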
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural Networks and Reservoir Computing
Methods: Dense Connections · Experience Replay · Convolution · Batch Normalization · Adam · Weight Decay · Deep Deterministic Policy Gradient
