Enhancing Hardware Fault Tolerance in Machines with Reinforcement Learning Policy Gradient Algorithms

Sheila Schoepp; Mehran Taghian; Shotaro Miwa; Yoshihiro Mitsuka; Shadan Golestan; Osmar Zaïane

arXiv:2407.15283·cs.LG·July 23, 2024·2 cites

Open Access

TL;DR

This paper explores how reinforcement learning algorithms PPO and SAC can improve hardware fault tolerance in machines, demonstrating rapid adaptation in simulated environments with various faults, and analyzing knowledge transfer methods.

Contribution

The paper applies the PPO and SAC algorithms to hardware fault tolerance and includes an ablation study on knowledge transfer in continual learning settings.

Findings

01. PPO adapts fastest with retained knowledge.

02. SAC performs best when knowledge is discarded.

03. Reinforcement learning enables rapid fault adaptation within minutes.

Abstract

Industry is rapidly moving towards fully autonomous and interconnected systems that can detect and adapt to changing conditions, including machine hardware faults. Traditional methods for adding hardware fault tolerance to machines involve duplicating components and algorithmically reconfiguring a machine's processes when a fault occurs. The growing interest in reinforcement learning-based robotic control offers a new perspective on achieving hardware fault tolerance, yet limited research has explored the potential of these approaches for hardware fault tolerance in machines. This paper investigates the potential of two state-of-the-art reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC), to enhance hardware fault tolerance in machines. We assess the performance of these algorithms in two OpenAI Gym simulated environments,…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advancements in Semiconductor Devices and Circuit Design · Blockchain Technology Applications and Security

Methods: 1x1 Convolution · Global Average Pooling · Average Pooling · Entropy Regularization · Convolution · Proximal Policy Optimization · Dilated Convolution · Switchable Atrous Convolution