Benchmarking Edge AI Platforms for High-Performance ML Inference

Rakshith Jayanth; Neelesh Gupta; Viktor Prasanna

arXiv:2409.14803 · cs.AI · September 24, 2024

Open Access

TL;DR

This paper benchmarks various edge AI hardware platforms, revealing the strengths and trade-offs of CPU, GPU, and NPU solutions for neural network inference in terms of latency, throughput, and power efficiency.

Contribution

It provides a comprehensive comparison of edge AI hardware, highlighting the performance advantages of NPUs and GPUs for specific neural network tasks, guiding deployment choices.

Findings

01 The NPU performs best on matrix-vector multiplication and on some neural network tasks

02 The GPU excels at matrix multiplication and LSTM networks

03 The NPU offers a good balance of latency, throughput, and power consumption

Abstract

Edge computing's growing prominence, due to its ability to reduce communication latency and enable real-time processing, is promoting the rise of high-performance, heterogeneous System-on-Chip solutions. While current approaches often involve scaling down modern hardware, the performance characteristics of neural network workloads on these platforms can vary significantly, especially when it comes to parallel processing, which is a critical consideration for edge deployments. To address this, we conduct a comprehensive study comparing the latency and throughput of various linear algebra and neural network inference tasks across CPU-only, CPU/GPU, and CPU/NPU integrated solutions. We find that the Neural Processing Unit (NPU) excels in matrix-vector multiplication (58.6% faster) and some neural network tasks (3.2× faster for video classification and large language models). GPU…
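As a rough illustration of the kind of measurement the abstract describes, the sketch below times a matrix-vector product and reports latency and throughput. It is a minimal, CPU-only example in pure Python and is not the authors' benchmarking harness. The function names (`matvec`, `benchmark`, `speedup`), the warm-up count, and the number of runs are all assumptions chosen for illustration.

```python
import statistics
import time


def matvec(matrix, vector):
    """Naive matrix-vector product on plain Python lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def benchmark(fn, *args, warmup=3, runs=20):
    """Return (median latency in seconds, throughput in calls per second).

    Hypothetical helper: warm-up calls are discarded so that one-off
    start-up costs do not skew the timed samples.
    """
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    latency = statistics.median(samples)
    throughput = runs / sum(samples)
    return latency, throughput


def speedup(baseline_latency, candidate_latency):
    """Ratio of baseline latency to candidate latency (2.0 means twice as fast)."""
    return baseline_latency / candidate_latency


if __name__ == "__main__":
    n = 64  # assumed problem size, picked only to keep the demo quick
    matrix = [[float(i + j) for j in range(n)] for i in range(n)]
    vector = [1.0] * n
    latency, throughput = benchmark(matvec, matrix, vector)
    print(f"median latency: {latency * 1e6:.1f} us")
    print(f"throughput: {throughput:.1f} calls/s")
```

Comparing platforms would mean running the same workload on each target through its own runtime and passing the resulting latencies to `speedup`. The median is used here because it is less sensitive to occasional slow runs than the mean.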


Taxonomy

Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory