Edge-Cloud Cooperation for DNN Inference via Reinforcement Learning and Supervised Learning

Tinghao Zhang; Zhijun Li; Yongrui Chen; Kwok-Yan Lam; Jun Zhao

arXiv:2210.05182 · cs.LG · October 12, 2022

TL;DR

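This paper presents an edge-cloud cooperation framework for DNN inference in IoT systems. It uses reinforcement learning to compress a heavyweight model into a lightweight edge model and supervised learning to decide, per sample, whether to run inference on the edge or offload it to the cloud. The authors report latency reductions of up to 78.8% together with higher accuracy.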

Contribution

It introduces a novel combination of RL-based DNN compression and SL-based offloading for efficient edge-cloud inference, with real hardware validation.

Findings

01. Lightweight models are up to 87.6% smaller than baseline models.

02. Offloading decisions are correct in most cases using the proposed strategy.

03. Inference latency is reduced by up to 78.8% with higher accuracy.

Abstract

Deep Neural Networks (DNNs) have been widely applied in Internet of Things (IoT) systems for various tasks such as image classification and object detection. However, heavyweight DNN models can hardly be deployed on edge devices due to limited computational resources. In this paper, an edge-cloud cooperation framework is proposed to improve inference accuracy while maintaining low inference latency. To this end, we deploy a lightweight model on the edge and a heavyweight model on the cloud. A reinforcement learning (RL)-based DNN compression approach is used to generate the lightweight model suitable for the edge from the heavyweight model. Moreover, a supervised learning (SL)-based offloading strategy is applied to determine whether the sample should be processed on the edge or on the cloud. Our method is implemented on real hardware and tested on multiple datasets. The experimental…
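
The routing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the offloading decision is computed, so everything below is an assumption made for illustration: the function names (`route_sample`, `toy_edge`, `toy_cloud`, `toy_score`), the use of a scalar offload score, and the 0.5 threshold are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Sample = Sequence[float]


@dataclass
class Route:
    target: str  # "edge" or "cloud"
    label: int   # predicted class from whichever model ran


def route_sample(
    sample: Sample,
    edge_model: Callable[[Sample], int],
    cloud_model: Callable[[Sample], int],
    offload_score: Callable[[Sample], float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> Route:
    """Run the sample on the edge unless the offload score says otherwise.

    `offload_score` stands in for a trained supervised classifier that
    estimates how likely the lightweight edge model is to get this
    sample wrong (an assumption about the decision signal).
    """
    if offload_score(sample) >= threshold:
        return Route("cloud", cloud_model(sample))
    return Route("edge", edge_model(sample))


# Toy stand-ins so the sketch runs without any ML framework.
def toy_edge(x: Sample) -> int:
    # "Lightweight" model: looks at one feature only.
    return int(x[0] > 0.5)


def toy_cloud(x: Sample) -> int:
    # "Heavyweight" model: uses both features.
    return int(x[0] + x[1] > 1.0)


def toy_score(x: Sample) -> float:
    # Score is high when x[0] lies near the edge model's decision boundary.
    return 1.0 - 2.0 * abs(x[0] - 0.5)


easy = route_sample([0.9, 0.0], toy_edge, toy_cloud, toy_score)
hard = route_sample([0.55, 0.1], toy_edge, toy_cloud, toy_score)
print(easy.target, hard.target)
```

In this toy setup the first sample sits far from the edge model's decision boundary and stays on the edge, while the second sits close to it and is offloaded. Raising `threshold` keeps more samples on the edge (lower latency), and lowering it sends more to the cloud (potentially higher accuracy), which is the trade-off the framework targets.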
