SLIDE: Simultaneous Model Downloading and Inference at the Wireless Network Edge

Guanqiao Qu; Tao Li; Qian Chen; Xianhao Chen; Sheng Zhou

arXiv:2512.20946 · cs.NI · April 21, 2026

TL;DR

SLIDE lets users at the wireless network edge start inference on already-downloaded layers while the remaining layers of the model are still being received, and it jointly optimizes model provisioning, bandwidth, and computing resources to maximize task throughput.

Contribution

The paper introduces SLIDE, a framework that overlaps model downloading with inference, together with an efficient algorithm that jointly optimizes model provisioning, spectrum bandwidth, and computing resources in multi-user downlink systems.

Findings

01 SLIDE significantly improves task throughput compared to traditional downloading-and-inference (DAI) schemes.

02 The recursive dependency model captures how layer-wise downloading and inference jointly determine latency.

03 The proposed algorithm finds an optimal resource allocation with polynomial complexity.

Abstract

To support on-device inference, next-generation mobile networks are expected to provide real-time model downloading services to mobile users. However, powerful AI models typically have large model sizes, resulting in excessive end-to-end (E2E) downloading-and-inference (DAI) latency. To address this issue, we propose a simultaneous model downloading and inference (SLIDE) framework, which allows users to perform inference with downloaded layers while simultaneously receiving the remaining layers of the model. To this end, we formulate a task throughput maximization problem by jointly optimizing model provisioning, spectrum bandwidth allocation, and computing resource allocation for multi-user downlink systems. Unlike traditional DAI frameworks, SLIDE introduces recursive dependencies across layers, where inference latency depends recursively on the downloading bandwidth and computing…
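
The layer-wise overlap described in the abstract can be illustrated with a small latency calculation. This is a minimal sketch under simplifying assumptions that are not taken from the paper: a single user, a constant downlink rate, a constant compute speed, layers delivered in order, and inference on a layer starting only once that layer is fully received and the previous layer has finished. The function names and the numbers are hypothetical; the paper's actual model also optimizes bandwidth and computing allocation across users.

```python
from itertools import accumulate


def sequential_dai_latency(layer_bits, layer_flops, rate_bps, flops_per_s):
    """Baseline: download the whole model first, then run inference."""
    download_time = sum(layer_bits) / rate_bps
    compute_time = sum(layer_flops) / flops_per_s
    return download_time + compute_time


def overlapped_latency(layer_bits, layer_flops, rate_bps, flops_per_s):
    """Overlap downloading and inference layer by layer (assumed model).

    Layer l can start computing at max(arrival of layer l, finish of layer l-1),
    so each layer's finish time depends recursively on the previous one.
    """
    # Cumulative time at which each layer has been fully received.
    arrivals = accumulate(bits / rate_bps for bits in layer_bits)
    finish = 0.0
    for arrival, flops in zip(arrivals, layer_flops):
        start = max(arrival, finish)
        finish = start + flops / flops_per_s
    return finish


if __name__ == "__main__":
    # Hypothetical 4-layer model: 1 s to download and 0.5 s to compute per layer.
    bits = [8e6] * 4
    flops = [1e9] * 4
    print(sequential_dai_latency(bits, flops, rate_bps=8e6, flops_per_s=2e9))
    print(overlapped_latency(bits, flops, rate_bps=8e6, flops_per_s=2e9))
```

In this toy setting the download is the bottleneck, so the overlapped schedule hides all computation except the last layer's behind the download. When computation is the bottleneck instead, only the first layer's download time remains exposed.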
