Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation

Tushar Prasanna Swaminathan; Christopher Silver; Thangarajah Akilan

arXiv:2406.17749 · cs.AR · June 26, 2024 · 2 cites

TL;DR

This paper empirically evaluates optimized deep learning models on the NVIDIA Jetson Nano, reporting an average inference speed improvement of 16.11% over non-optimized counterparts and highlighting the importance of hardware-aware optimization for resource-constrained AI deployment.

Contribution

It provides a comprehensive analysis of model optimization effects on embedded devices, emphasizing hardware-specific tuning for improved inference speed and energy efficiency.

Findings

01. On average, optimized models run inference 16.11% faster than their non-optimized counterparts.

02. Hardware-aware optimization enhances deployment efficiency.

03. Model optimization reduces energy consumption and carbon footprint.

Abstract

The proliferation of complex deep learning (DL) models has revolutionized various applications, including computer vision-based solutions, prompting their integration into real-time systems. However, the resource-intensive nature of these models poses challenges for deployment on low-computational power and low-memory devices, like embedded and edge devices. This work empirically investigates the optimization of such complex DL models to analyze their functionality on an embedded device, particularly on the NVIDIA Jetson Nano. It evaluates the effectiveness of the optimized models in terms of their inference speed for image classification and video action detection. The experimental results reveal that, on average, optimized models exhibit a 16.11% speed improvement over their non-optimized counterparts. This not only emphasizes the critical need to consider hardware constraints and…
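The averaged speed-improvement figure can be illustrated with a minimal sketch. It assumes that "speed improvement" means the relative reduction in per-inference latency and that the average is an unweighted mean across models; the abstract does not confirm either definition. The function names, model names, and latency values below are hypothetical placeholders, not measurements from the paper.

```python
from statistics import mean


def speedup_percent(baseline_ms: float, optimized_ms: float) -> float:
    """Relative latency reduction, as a percentage of the baseline latency.

    Assumed definition; the paper may compute its figure differently.
    """
    if baseline_ms <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline_ms - optimized_ms) / baseline_ms * 100.0


def average_speedup(latencies: dict[str, tuple[float, float]]) -> float:
    """Unweighted mean of per-model speedups.

    `latencies` maps a model name to (baseline_ms, optimized_ms).
    """
    return mean(speedup_percent(b, o) for b, o in latencies.values())


# Hypothetical placeholder latencies in milliseconds, NOT results from the paper.
hypothetical_latencies = {
    "model_a": (100.0, 75.0),
    "model_b": (64.0, 56.0),
}

if __name__ == "__main__":
    for name, (base, opt) in hypothetical_latencies.items():
        print(f"{name}: {speedup_percent(base, opt):.2f}% faster")
    print(f"average: {average_speedup(hypothetical_latencies):.2f}%")
```

An unweighted mean treats every model equally regardless of its absolute latency; a throughput-based or time-weighted average would yield a different number from the same measurements.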

Code & Models

Repositories

xtotodilex/deep-learning-model-optimization
none · Official

Taxonomy

Topics: Quantum-Dot Cellular Automata · Molecular Communication and Nanonetworks · Advanced Data and IoT Technologies

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings