Scheduling Techniques of AI Models on Modern Heterogeneous Edge GPU -- A Critical Review
Ashiyana Abdul Majeed, Mahmoud Meribout

TL;DR
This paper critically reviews scheduling techniques for AI models on modern heterogeneous edge GPUs, focusing on their methodologies, performance, and potential for improving resource utilization in resource-constrained edge devices.
Contribution
It provides a comprehensive analysis of existing DNN schedulers on NVIDIA Jetson devices, highlighting current research and future development directions.
Findings
Evaluates various scheduler methodologies and their effectiveness.
Identifies gaps in current scheduling frameworks for edge AI devices.
Suggests future research areas for optimizing AI model execution on heterogeneous hardware.
Abstract
In recent years, the development of specialized edge computing devices has significantly increased, driven by the growing demand for AI models. These devices, such as the NVIDIA Jetson series, must efficiently handle increased data processing and storage requirements. However, despite these advancements, there remains a lack of frameworks that automate the optimal execution of deep neural networks (DNNs). Therefore, efforts have been made to create schedulers that can manage complex data processing needs while ensuring the efficient utilization of all available accelerators within these devices, including the CPU, GPU, deep learning accelerator (DLA), programmable vision accelerator (PVA), and video image compositor (VIC). Such schedulers would maximize the performance of edge computing systems, which is crucial in resource-constrained environments. This paper aims to…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques · Cloud Computing and Resource Management
