Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms
Sheila Schoepp, Mehran Taghian, Shotaro Miwa, Yoshihiro Mitsuka, Shadan Golestan, Osmar Zaïane

TL;DR
This paper explores how reinforcement learning algorithms PPO and SAC can improve hardware fault tolerance in machines, demonstrating rapid adaptation in simulated environments with various faults, and analyzing knowledge transfer methods.
Contribution
It introduces the application of PPO and SAC algorithms for hardware fault tolerance, including an ablation study on knowledge transfer in continual learning settings.
Findings
PPO adapts fastest with retained knowledge.
SAC performs best when knowledge is discarded.
Reinforcement learning enables rapid fault adaptation within minutes.
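The retained-versus-discarded comparison in the findings can be pictured with a small sketch. It assumes, for illustration only, that an agent's "knowledge" splits into learned parameters and stored experience. The names `AgentState` and `transfer_knowledge` and the toy values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AgentState:
    """Hypothetical container for what an agent has learned so far."""
    params: List[float]  # stand-in for policy/value network weights
    buffer: List[Tuple] = field(default_factory=list)  # stand-in for stored experience


def transfer_knowledge(
    pre_fault: AgentState,
    keep_params: bool,
    keep_buffer: bool,
    init_params: List[float],
) -> AgentState:
    """Build the state an agent starts from once a fault appears."""
    # Retain the learned weights, or fall back to a fresh initialization.
    params = list(pre_fault.params) if keep_params else list(init_params)
    # Retain the stored experience, or start with empty storage.
    buffer = list(pre_fault.buffer) if keep_buffer else []
    return AgentState(params=params, buffer=buffer)


# Enumerate the four retain/discard combinations (toy values, assumed).
pre_fault = AgentState(params=[0.8, -0.3], buffer=[("s0", "a0", 1.0, "s1")])
fresh = [0.0, 0.0]
ablation = {
    (keep_params, keep_buffer): transfer_knowledge(pre_fault, keep_params, keep_buffer, fresh)
    for keep_params in (True, False)
    for keep_buffer in (True, False)
}
```

Each entry in `ablation` is a starting point for post-fault training, so the same algorithm can be compared across all four settings.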
Abstract
Industry is rapidly moving towards fully autonomous and interconnected systems that can detect and adapt to changing conditions, including machine hardware faults. Traditional methods for adding hardware fault tolerance to machines involve duplicating components and algorithmically reconfiguring a machine's processes when a fault occurs. The growing interest in reinforcement learning-based robotic control offers a new perspective on achieving hardware fault tolerance. However, limited research has explored the potential of these approaches for hardware fault tolerance in machines. This paper investigates the potential of two state-of-the-art reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), to enhance hardware fault tolerance in machines. We assess the performance of these algorithms in two OpenAI Gym simulated environments,…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advancements in Semiconductor Devices and Circuit Design · Blockchain Technology Applications and Security
Methods: 1x1 Convolution · Global Average Pooling · Average Pooling · Entropy Regularization · Convolution · Proximal Policy Optimization · Dilated Convolution · Switchable Atrous Convolution
