A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms
Cristina Silvano, Daniele Ielmini, Fabrizio Ferrandi, Leandro Fiorin, Serena Curzel, Luca Benini, Francesco Conti, Angelo Garofalo, Cristian Zambelli, Enrico Calore, Sebastiano Fabio Schifano, Maurizio Palesi, Giuseppe Ascia, Davide Patti, Nicola Petra, Davide De Caro

TL;DR
This survey comprehensively reviews recent hardware accelerators for deep learning in high-performance computing, covering GPU, FPGA, ASIC, neuromorphic, quantum, and emerging memory technologies.
Contribution
It provides a detailed classification and analysis of the latest DL accelerators across diverse hardware platforms and emerging paradigms, highlighting recent advancements.
Findings
DL accelerators for HPC now span GPU- and TPU-based platforms, FPGA- and ASIC-based designs, Neural Processing Units, and open hardware RISC-V-based accelerators.
Emerging technologies like quantum and photonics are gaining interest.
Specialized accelerators outperform general-purpose hardware in specific tasks.
Abstract
Recent trends in deep learning (DL) have made hardware accelerators essential for various high-performance computing (HPC) applications, including image classification, computer vision, and speech recognition. This survey summarizes and classifies the most recent developments in DL accelerators, focusing on their role in meeting the performance demands of HPC applications. We explore cutting-edge approaches to DL acceleration, covering not only GPU- and TPU-based platforms but also specialized hardware such as FPGA- and ASIC-based accelerators, Neural Processing Units, open hardware RISC-V-based accelerators, and co-processors. This survey also describes accelerators leveraging emerging memory technologies and computing paradigms, including 3D-stacked Processor-In-Memory, non-volatile memories like Resistive RAM and Phase Change Memories used for in-memory computing, as well as…
Taxonomy
Topics: Advanced Memory and Neural Computing · Parallel Computing and Optimization Techniques · CCD and CMOS Imaging Sensors
