PyGFI: Analyzing and Enhancing Robustness of Graph Neural Networks Against Hardware Errors
Ruixuan Wang, Fred Lin, Daniel Moore, Sriram Sankar, Xun Jiao

TL;DR
This paper presents a large-scale empirical study of how hardware faults affect the accuracy of graph neural networks (GNNs). It finds that resilience varies widely across models and datasets, and it proposes low-cost mitigation strategies to improve robustness.
Contribution
The authors present what they describe as the first large-scale empirical analysis of GNN resilience to hardware faults. They develop a customized fault injection tool on top of PyTorch and explore error mitigation techniques.
Findings
GNN error resilience varies greatly across models and datasets
Fault injection reveals significant impact of hardware errors on GNN accuracy
Proposed low-cost mitigation improves GNN robustness
Abstract
Graph neural networks (GNNs) have recently emerged as a promising paradigm for learning from graph-structured data and have demonstrated wide success across various domains such as recommendation systems, social networks, and electronic design automation (EDA). Like other deep learning (DL) methods, GNNs are being deployed in sophisticated modern hardware systems, as well as dedicated accelerators. However, despite the popularity of GNNs and the recent efforts to bring GNNs to hardware, the fault tolerance and resilience of GNNs have generally been overlooked. Inspired by the inherent algorithmic resilience of DL methods, this paper conducts, for the first time, a large-scale empirical study of GNN resilience, aiming to understand the relationship between hardware faults and GNN accuracy. By developing a customized fault injection tool on top of PyTorch, we perform extensive…
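The abstract mentions a fault injection tool built on PyTorch, but its interface is not shown here. As a rough illustration of the general idea behind bit-flip fault injection, the following is a minimal sketch in plain Python. It is not PyGFI's API: the function names `flip_bit_float32` and `inject_faults`, the single-bit-flip fault model, and the use of float32 weights are all assumptions made for this example.

```python
import random
import struct


def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit of `value` in its IEEE 754 float32 encoding.

    Bit 0 is the least significant mantissa bit; bit 31 is the sign bit.
    (Hypothetical helper, not part of any real library.)
    """
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    as_int ^= 1 << bit
    # Reinterpret the modified integer as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int))
    return flipped


def inject_faults(weights, n_faults, rng):
    """Return a copy of `weights` with `n_faults` random single-bit flips.

    Assumed fault model: each fault picks a random entry and a random bit.
    The input list is left unchanged.
    """
    faulty = list(weights)
    for _ in range(n_faults):
        idx = rng.randrange(len(faulty))
        bit = rng.randrange(32)
        faulty[idx] = flip_bit_float32(faulty[idx], bit)
    return faulty


weights = [0.5, -1.25, 2.0, 0.125]
print(flip_bit_float32(1.0, 31))  # sign bit flipped: -1.0
print(flip_bit_float32(1.0, 30))  # top exponent bit flipped: inf
faulty = inject_faults(weights, n_faults=1, rng=random.Random(0))
print(faulty)
```

The two printed single-flip cases hint at why bit position matters: a flip in the sign bit only negates the value, while a flip in a high exponent bit can turn an ordinary weight into infinity. In a study like the one described, faulty weights of this kind would be loaded into a model and its accuracy compared against the fault-free baseline.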
Taxonomy
Topics: Radiation Effects in Electronics · Machine Learning in Materials Science · Advanced Memory and Neural Computing
