On-Chip Sensors Data Collection and Analysis for SoC Health Management
Konstantin Shibin, Maksim Jenihhin, Artur Jutman, Sergei Devadze, Anton Tsertov

TL;DR
This paper introduces a Health Map structure and algorithms for analyzing on-chip sensor data, enabling effective hardware health management and fault mitigation in modern SoCs.
Contribution
It proposes a novel Health Map framework and data aggregation algorithms to improve on-chip sensor data analysis for hardware health management.
Findings
Health Map effectively captures fault and resource information.
Algorithms enable accurate data aggregation and classification.
Supports hierarchical health management for increased system reliability.
Abstract
Data produced by on-chip sensors in modern SoCs contains a large amount of information, such as occurring faults, aging status, accumulated radiation dose, performance characteristics, and environmental and other operational parameters. Such information provides insight into the overall health of a system's hardware as well as the operability of individual modules. This makes it possible to mitigate faults and avoid using faulty units, thus enabling hardware health management. However, raw data from embedded sensors cannot be used directly to perform health management tasks. In most cases, information about faults that have occurred needs to be analyzed taking into account the history of previously reported fault events and other collected statistics. For this purpose, we propose a special structure called Health Map (HM) that holds the information about functional resources, occurring faults and maps…
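The idea of analyzing fault reports against a per-resource history can be illustrated with a minimal Python sketch. This is not the paper's Health Map or its algorithms, which are not shown in this summary. The class layout, the sensor and resource names, the three status labels, and the window and threshold values are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FaultEvent:
    timestamp: int   # time of the report, in arbitrary units
    sensor_id: str   # hypothetical name of the reporting on-chip sensor


@dataclass
class HealthMap:
    # Hypothetical mapping: which functional resource each sensor monitors.
    sensor_to_resource: dict
    # Fault history per resource, filled in as events arrive.
    history: dict = field(default_factory=lambda: defaultdict(list))

    def report(self, event):
        """Record a raw fault event against the resource it maps to."""
        resource = self.sensor_to_resource[event.sensor_id]
        self.history[resource].append(event)

    def classify(self, resource, now, window=100, threshold=3):
        """Classify a resource from its recent fault history.

        The window, threshold, and labels are illustrative assumptions.
        """
        events = self.history.get(resource, [])
        recent = [e for e in events if now - e.timestamp <= window]
        if not recent:
            return "healthy"
        if len(recent) < threshold:
            return "suspect"
        return "faulty"


# Usage: three reports on one memory bank, one on a core.
hm = HealthMap({"ecc0": "mem_bank0", "timing1": "core1"})
for t in (10, 20, 30):
    hm.report(FaultEvent(t, "ecc0"))
hm.report(FaultEvent(25, "timing1"))

print(hm.classify("mem_bank0", now=40))
print(hm.classify("core1", now=40))
```

A single event leaves a resource merely "suspect" in this sketch, while repeated events inside the window mark it "faulty". That reflects the abstract's point that a raw sensor report is only meaningful when weighed against previously reported events.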
Taxonomy
Topics: Radiation Effects in Electronics · VLSI and Analog Circuit Testing · Fault Detection and Control Systems
