Fault Tolerant Reconfigurable ML Multiprocessor
Tangrui Li, Justin Y. Shi, Matteo Spatola, Hongzheng Wang

TL;DR
This paper explores a reconfigurable, fault-tolerant multiprocessor architecture inspired by von Neumann design, demonstrating its adaptability for neural network training and discussing integration with MLIR compilers for diverse hardware support.
Contribution
It introduces a novel reconfigurable multiprocessor architecture tailored for neural network workflows, emphasizing fault tolerance and adaptability.
Findings
Feasibility of the architecture for NN training workflows
Enhanced robustness through reconfiguration
Potential integration with MLIR for hardware diversity
Abstract
This paper reports three computational experiments for a von Neumann-inspired, reconfigurable, fault-tolerant multiprocessor for neural network (NN) training workflows. The experiments are intended to demonstrate the feasibility of the proposed architecture for non-regular workflows in terms of robustness and adaptability. A potential integration with MLIR compilers is also discussed as a way to incorporate diverse accelerator hardware into existing practical applications.
Taxonomy
Topics: Radiation Effects in Electronics · Embedded Systems Design Techniques · Parallel Computing and Optimization Techniques
