DiviML: A Module-based Heuristic for Mapping Neural Networks onto Heterogeneous Platforms
Yassine Ghannane, Mohamed S. Abdelfattah

TL;DR
DiviML is a module-based heuristic framework for mapping deep neural networks onto heterogeneous hardware. It lowers latency and raises throughput relative to naive single-device deployment, scales to large networks, and provides a theoretical lower bound for judging solution quality.
Contribution
We develop a general, scalable framework for heterogeneous DNN compilation that combines exact and heuristic methods, including a novel lower bound for solution quality assessment.
Findings
Achieves more than 3x lower latency and up to 2.9x higher throughput than naively running the DNN on the fastest GPU.
The modularity-based heuristic finds schedules up to 395x faster than the exact MILP solver, with minimal loss in solution quality.
Extensible to large language models across multiple heterogeneous servers.
Abstract
Datacenters are increasingly becoming heterogeneous, and are starting to include specialized hardware for networking, video processing, and especially deep learning. To leverage the heterogeneous compute capability of modern datacenters, we develop an approach for compiler-level partitioning of deep neural networks (DNNs) onto multiple interconnected hardware devices. We present a general framework for heterogeneous DNN compilation, offering automatic partitioning and device mapping. Our scheduler integrates both an exact solver, through a mixed integer linear programming (MILP) formulation, and a modularity-based heuristic for scalability. Furthermore, we propose a theoretical lower bound formula for the optimal solution, which enables the assessment of the heuristic solutions' quality. We evaluate our scheduler in optimizing both conventional DNNs and randomly-wired neural networks,…
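The device-mapping problem described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's MILP formulation, its modularity-based heuristic, or its lower-bound formula. It is a brute-force search over a four-layer toy graph, paired with a simple critical-path bound, to show how a bound lets one judge a mapping's quality. The graph, the device names, the per-device timings, and the transfer cost are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical toy DNN graph: node -> predecessors (keys are in topological order).
PREDS = {"conv1": [], "conv2": ["conv1"], "conv3": ["conv1"], "fc": ["conv2", "conv3"]}

# Hypothetical per-device execution times in milliseconds.
EXEC = {
    "conv1": {"cpu": 8.0, "gpu": 2.0},
    "conv2": {"cpu": 6.0, "gpu": 3.0},
    "conv3": {"cpu": 3.0, "gpu": 2.0},
    "fc": {"cpu": 2.0, "gpu": 1.0},
}
TRANSFER = 0.5  # hypothetical cost (ms) when an edge crosses devices
DEVICES = ["cpu", "gpu"]


def latency(mapping):
    """End-to-end latency of a mapping; each device runs one node at a time,
    and nodes are issued in the fixed topological order of PREDS."""
    device_free = {}  # device -> time at which it becomes idle
    finish = {}       # node -> finish time
    for node, preds in PREDS.items():
        dev = mapping[node]
        # Inputs arrive after each predecessor finishes, plus a transfer
        # delay if the predecessor ran on a different device.
        ready = max(
            (finish[p] + (TRANSFER if mapping[p] != dev else 0.0) for p in preds),
            default=0.0,
        )
        start = max(ready, device_free.get(dev, 0.0))
        finish[node] = start + EXEC[node][dev]
        device_free[dev] = finish[node]
    return max(finish.values())


def best_mapping():
    """Exhaustively try every node-to-device assignment (exponential; toy sizes only)."""
    nodes = list(PREDS)
    best = None
    for combo in product(DEVICES, repeat=len(nodes)):
        mapping = dict(zip(nodes, combo))
        lat = latency(mapping)
        if best is None or lat < best[0]:
            best = (lat, mapping)
    return best


def critical_path_bound():
    """Lower bound: longest path with every node on its fastest device,
    ignoring transfers and device contention."""
    longest = {}
    for node, preds in PREDS.items():
        longest[node] = min(EXEC[node].values()) + max(
            (longest[p] for p in preds), default=0.0
        )
    return max(longest.values())


if __name__ == "__main__":
    gpu_only = {n: "gpu" for n in PREDS}
    print("GPU-only latency:", latency(gpu_only))
    print("Best mapping:", best_mapping())
    print("Critical-path lower bound:", critical_path_bound())
```

With these made-up numbers, running everything on the GPU takes 8.0 ms, while offloading conv3 to the CPU lets it overlap with conv2 and brings the latency down to 7.0 ms. The critical-path bound is 6.0 ms, so no mapping can be more than 1.0 ms better than the one found. Exhaustive search grows exponentially with the number of nodes, which is the scalability problem that motivates a heuristic in the first place.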
Taxonomy
Topics: Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices · Brain Tumor Detection and Classification
