Mobile-Cloud Inference for Collaborative Intelligence

Mateen Ulhaq

arXiv:2306.13982 · cs.LG · June 27, 2023

Open Access

TL;DR

This paper explores shared mobile-cloud inference, in which part of the model runs on the mobile device. Compared with cloud-only inference, this reduces latency, energy use, and network bandwidth while improving privacy; compressing the feature tensor before transmission yields further gains.

Contribution

It introduces a collaborative inference framework that balances local and cloud processing, improving efficiency and privacy over traditional cloud-only methods.

Findings

01. Reduces inference latency and energy consumption

02. Decreases network bandwidth usage

03. Enhances privacy by keeping raw data on device

Abstract

As AI applications for mobile devices become more prevalent, there is an increasing need for faster execution and lower energy consumption for deep learning model inference. Historically, the models run on mobile devices have been smaller and simpler in comparison to large state-of-the-art research models, which can only run on the cloud. However, cloud-only inference has drawbacks such as increased network bandwidth consumption and higher latency. In addition, cloud-only inference requires the input data (images, audio) to be fully transferred to the cloud, creating concerns about potential privacy breaches. There is an alternative approach: shared mobile-cloud inference. Partial inference is performed on the mobile device to reduce the dimensionality of the input data and arrive at a compact feature tensor, which is a latent space representation of the input signal. The feature…
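The split described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three stand-in layers, the split point `SPLIT`, and the uniform 8-bit quantization used as a placeholder for feature tensor compression. The mobile side runs the first layers and sends a compact payload; the cloud side decodes it and finishes the inference.

```python
import struct

# Hypothetical toy "layers": each maps a list of floats to a list of floats.
def avg_pool2(x):
    """Halve the length by averaging adjacent pairs (dimensionality reduction)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def relu(x):
    return [max(0.0, v) for v in x]

def total(x):
    """Stand-in for a model head: reduce the features to a single score."""
    return [sum(x)]

LAYERS = [avg_pool2, relu, avg_pool2, total]
SPLIT = 3  # assumed split point: LAYERS[:SPLIT] on mobile, the rest in the cloud

def run(layers, x):
    for layer in layers:
        x = layer(x)
    return x

def quantize(features):
    """Uniform 8-bit quantization (an assumed placeholder codec).

    Payload layout: float32 minimum, float32 step size, then one byte per value.
    """
    lo, hi = min(features), max(features)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero step for constant tensors
    codes = bytes(round((v - lo) / scale) for v in features)
    return struct.pack("<ff", lo, scale) + codes

def dequantize(payload):
    lo, scale = struct.unpack("<ff", payload[:8])
    return [lo + c * scale for c in payload[8:]]

def mobile_side(x):
    """Partial inference on the device, then compression of the feature tensor."""
    return quantize(run(LAYERS[:SPLIT], x))

def cloud_side(payload):
    """Decode the feature tensor and run the remaining layers."""
    return run(LAYERS[SPLIT:], dequantize(payload))

signal = [0.5, 1.5, -2.0, -4.0, 3.0, 1.0, 2.0, 6.0]
payload = mobile_side(signal)
result = cloud_side(payload)
reference = run(LAYERS, signal)  # unsplit inference, for comparison

print(len(payload), "bytes sent instead of", 4 * len(signal), "bytes of float32 input")
print("split result:", result, "reference:", reference)
```

In this sketch the raw input never leaves the device; only the quantized feature tensor is transmitted. The split result differs from the unsplit reference only by quantization error, which is the trade-off any lossy feature codec would have to manage.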

Taxonomy

Topics: IoT and Edge/Fog Computing · Age of Information Optimization · Context-Aware Activity Recognition Systems