Energy-Efficient Model Compression and Splitting for Collaborative Inference Over Time-Varying Channels
Mounssif Krouka, Anis Elgabli, Chaouki Ben Issaid, Mehdi Bennis

TL;DR
This paper introduces an energy-efficient model compression and dynamic splitting technique for collaborative DNN inference that adapts to changing channel conditions, reducing energy use and emissions while maintaining accuracy.
Contribution
It proposes a novel time-varying model split approach that optimizes energy efficiency in edge-cloud inference under varying channel conditions.
Findings
Significant reduction in energy consumption and CO2 emissions.
Robust inference performance across different channel conditions.
Effective adaptation to time-varying channels with minimal accuracy loss.
Abstract
Today's intelligent applications can achieve high accuracy using machine learning (ML) techniques, such as deep neural networks (DNNs). Traditionally, in a remote DNN inference problem, an edge device transmits raw data to a remote node that performs the inference task. However, this may incur high transmission energy costs and put data privacy at risk. In this paper, we propose a technique to reduce the total energy bill at the edge device by utilizing model compression and a time-varying model split between the edge and remote nodes. The time-varying representation accounts for time-varying channels and can significantly reduce the total energy at the edge device while maintaining high accuracy (low loss). We implement our approach in an image classification task using the MNIST dataset, and the system environment is simulated as a trajectory navigation scenario to emulate…
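To make the idea of a channel-dependent split concrete, here is a minimal sketch, not the authors' algorithm. It assumes a hypothetical per-layer energy table, a Shannon-capacity rate model, and a fixed transmit power, and it ignores model compression entirely. The names `shannon_rate`, `best_split`, and all numeric values are illustrative assumptions. The edge runs the first `k` layers and transmits the resulting activation; the sketch picks the `k` that minimizes edge compute energy plus transmission energy for the current channel.

```python
import math


def shannon_rate(bandwidth_hz, snr_linear):
    """Achievable rate in bits/s under an assumed Shannon-capacity model."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def best_split(compute_energy_j, output_bits, input_bits,
               tx_power_w, bandwidth_hz, snr_linear):
    """Return (k, energy_j): the number of layers to run at the edge and
    the resulting edge energy, for one channel realization.

    compute_energy_j[i] is the assumed edge energy of layer i (joules).
    output_bits[i] is the size of layer i's output (bits).
    k = 0 means the raw input is transmitted and nothing runs at the edge.
    """
    rate = shannon_rate(bandwidth_hz, snr_linear)
    best_k, best_energy = None, math.inf
    cumulative_compute = 0.0
    for k in range(len(compute_energy_j) + 1):
        if k > 0:
            cumulative_compute += compute_energy_j[k - 1]
        bits = input_bits if k == 0 else output_bits[k - 1]
        # Transmission energy = power * airtime = power * bits / rate.
        energy = cumulative_compute + tx_power_w * bits / rate
        if energy < best_energy:
            best_k, best_energy = k, energy
    return best_k, best_energy


# Hypothetical three-layer model on a 28x28 8-bit image (MNIST-sized input).
layer_energy = [0.001, 0.001, 0.001]   # joules per layer, assumed
layer_output_bits = [4000, 1000, 80]   # activation sizes, assumed
image_bits = 28 * 28 * 8

for snr in (1000.0, 0.001):            # a good and a poor channel
    k, e = best_split(layer_energy, layer_output_bits, image_bits,
                      tx_power_w=0.1, bandwidth_hz=1e6, snr_linear=snr)
    print(f"SNR={snr}: run {k} layer(s) at the edge, energy={e:.6f} J")
```

Under these assumed numbers, a strong channel favors sending the raw input, while a weak channel favors running more layers locally so that fewer bits are transmitted. Re-evaluating the split as the channel changes is the intuition behind a time-varying split.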
