# A Hierarchical Dispatcher for Scheduling Multiple Deep Neural Networks (DNNs) on Edge Devices

**Authors:** Hyung Kook Jun, Taeho Kim, Sang Cheol Kim, Young Ik Eom

PMC · DOI: 10.3390/s25072243 · 2025-04-02

## TL;DR

This paper introduces a hierarchical dispatcher architecture for scheduling multiple deep neural networks (DNNs) on edge devices with heterogeneous processing units (PUs).

## Contribution

A hierarchical dispatcher architecture that separates the scheduling policy from the dispatch mechanism, using high-level and low-level dispatchers to provide scalable and flexible DNN scheduling on edge devices.

## Key findings

- With appropriate scheduling policies integrated, the hierarchical dispatcher achieves an average performance improvement of 51.6%.
- The architecture supports both homogeneous and heterogeneous processing unit environments.
- Case studies demonstrate the practicality of the approach on real edge devices.

## Abstract

This paper presents a hierarchical dispatcher architecture designed to efficiently schedule the execution of multiple deep neural networks (DNNs) on edge devices with heterogeneous processing units (PUs). The proposed architecture is applicable to systems where PUs are either integrated on a single edge device or distributed across multiple devices. We separate the dispatcher and scheduling policy. The dispatcher in our framework acts as a mechanism for allocating, executing, and managing subgraphs of DNNs across various PUs, and the scheduling policy generates optimized scheduling sequences. We formalize a hierarchical structure consisting of high-level and low-level dispatchers, which together provide scalable and flexible scheduling support for diverse DNN workloads. The high-level dispatcher oversees the partitioning and distribution of subgraphs, while the low-level dispatcher handles the execution and coordination of subgraphs on allocated PUs. This separation of responsibilities allows the architecture to efficiently manage workloads in both homogeneous and heterogeneous environments. Through case studies on edge devices, we demonstrate the practicality of the proposed architecture. By integrating appropriate scheduling policies, our approach achieves an average performance improvement of 51.6%, providing a scalable and adaptable solution for deploying deep learning models on heterogeneous edge systems.
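
The separation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`HighLevelDispatcher`, `LowLevelDispatcher`, `Subgraph`), the even-split partitioning, the abstract `cost` field, and the round-robin policy are all hypothetical stand-ins chosen for illustration. The only point it shows is that the policy produces an ordered list of assignments while the dispatchers carry them out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Subgraph:
    """One partition of a DNN (hypothetical representation)."""
    model: str
    index: int    # position within the model's chain of subgraphs
    cost: float   # assumed abstract execution cost, not a measured value


# A scheduling policy maps subgraphs and PU names to an ordered list of
# (subgraph, pu_name) assignments. It contains no execution logic.
Policy = Callable[[List[Subgraph], List[str]], List[Tuple[Subgraph, str]]]


def round_robin_policy(subgraphs: List[Subgraph],
                       pus: List[str]) -> List[Tuple[Subgraph, str]]:
    """Placeholder policy: keep the input order and rotate over the PUs."""
    return [(sg, pus[i % len(pus)]) for i, sg in enumerate(subgraphs)]


@dataclass
class LowLevelDispatcher:
    """Handles execution of subgraphs on the single PU it manages."""
    pu_name: str
    log: List[str] = field(default_factory=list)

    def execute(self, sg: Subgraph) -> str:
        # A real system would invoke a runtime on the PU; this only records it.
        entry = f"{sg.model}[{sg.index}]@{self.pu_name}"
        self.log.append(entry)
        return entry


class HighLevelDispatcher:
    """Partitions models and distributes subgraphs to low-level dispatchers."""

    def __init__(self, workers: Dict[str, LowLevelDispatcher], policy: Policy):
        self.workers = workers
        self.policy = policy  # swappable without touching dispatch code

    def partition(self, model: str, n_parts: int,
                  total_cost: float) -> List[Subgraph]:
        # Even split is an assumption made for this sketch.
        return [Subgraph(model, i, total_cost / n_parts) for i in range(n_parts)]

    def dispatch(self, subgraphs: List[Subgraph]) -> List[str]:
        schedule = self.policy(subgraphs, list(self.workers))
        return [self.workers[pu].execute(sg) for sg, pu in schedule]


# Usage: two hypothetical PUs and two hypothetical models.
workers = {name: LowLevelDispatcher(name) for name in ("cpu", "gpu")}
hld = HighLevelDispatcher(workers, round_robin_policy)
subgraphs = hld.partition("model_a", 2, 10.0) + hld.partition("model_b", 1, 4.0)
trace = hld.dispatch(subgraphs)
print(trace)
```

Because the policy is passed in as a plain function, a different scheduling strategy can be substituted without changing either dispatcher class, which mirrors the mechanism/policy split the abstract describes.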

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11991227/full.md

---
Source: https://tomesphere.com/paper/PMC11991227