SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference

Ziyang Zhang; Jie Liu; Luca Mottola

arXiv:2511.19457 · cs.DC · November 26, 2025

TL;DR

SparOA is a hybrid CPU-GPU inference framework that optimizes DNN performance on edge devices by leveraging sparsity and operator characteristics, using reinforcement learning for dynamic scheduling.

Contribution

The paper introduces SparOA, a novel hybrid inference framework that combines sparsity, operator-awareness, and reinforcement learning for optimized edge DNN inference.

Findings

01. Achieves 1.22-1.31x speedup over baselines.

02. Outperforms CPU-only by up to 50.7x.

03. Reduces energy consumption by 7-16%.

Abstract

The resource demands of deep neural network (DNN) models introduce significant performance challenges, especially when deployed on resource-constrained edge devices. Existing solutions such as model compression often sacrifice accuracy, while specialized hardware remains costly and inflexible. Hybrid inference methods, however, typically overlook how operator characteristics impact performance. In this work, we present SparOA, a CPU-GPU hybrid inference framework that leverages both sparsity and computational intensity to optimize operator scheduling. SparOA addresses these challenges through three key components: (1) a threshold predictor that accurately determines optimal sparsity and computational intensity thresholds; (2) a reinforcement learning-based scheduler that dynamically optimizes resource allocation based on real-time hardware states; and (3) a hybrid inference…
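As a rough illustration of threshold-based operator placement, the sketch below routes each operator to the CPU or the GPU by comparing its sparsity and computational intensity against two thresholds. It is a minimal sketch and not the paper's method. The `Operator` fields, the `place` and `schedule` helpers, the fixed threshold values, and the rule that sparse, low-intensity operators go to the CPU are all assumptions made for illustration. In SparOA the thresholds are predicted and the scheduler reacts to real-time hardware state, neither of which is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator description (not taken from the paper)."""
    name: str
    sparsity: float   # fraction of zero-valued elements, in [0, 1]
    intensity: float  # computational intensity, e.g. FLOPs per byte moved


def place(op: Operator, sparsity_thr: float, intensity_thr: float) -> str:
    # Assumed rule: sparse, low-intensity operators run on the CPU;
    # everything else runs on the GPU.
    if op.sparsity >= sparsity_thr and op.intensity < intensity_thr:
        return "cpu"
    return "gpu"


def schedule(ops, sparsity_thr=0.5, intensity_thr=10.0):
    # Fixed thresholds stand in for the predicted ones (assumption).
    return {op.name: place(op, sparsity_thr, intensity_thr) for op in ops}


ops = [
    Operator("relu_1", sparsity=0.8, intensity=0.25),
    Operator("conv_1", sparsity=0.1, intensity=40.0),
    Operator("conv_2", sparsity=0.7, intensity=35.0),
]
plan = schedule(ops)
print(plan)
```

A static rule like this ignores device load. A learned scheduler of the kind the abstract describes would presumably replace `place` with a policy that also takes current CPU and GPU state as input.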

Taxonomy

Topics: Advanced Neural Network Applications · Big Data and Digital Economy · IoT and Edge/Fog Computing