TL;DR
This paper introduces a progressive transmission framework for deep learning models that enables approximate inference during model delivery, improving user experience over slow networks without increasing transmission time or model size.
Contribution
It presents a novel method for dividing and transmitting deep models progressively, allowing for early approximate inference on user devices.
Findings
Efficiently transmits models without increasing total transmission time.
Provides acceptable intermediate outputs during transmission.
Enhances user experience in slow network conditions.
Abstract
Modern image files are usually transmitted progressively, providing a preview before the entire image is downloaded, to improve user experience over slow network connections. In this paper, with a similar goal, we propose a progressive transmission framework for deep learning models, targeting the scenario where pre-trained deep learning models are transmitted from servers and executed on user devices (e.g., web browsers or mobile devices). Our progressive transmission allows inference with approximate models in the middle of file delivery, and quickly provides acceptable intermediate outputs. On the server side, a deep learning model is divided and progressively transmitted to the user devices. Then, the divided pieces are progressively concatenated to construct approximate models on user devices. Experiments show that our method is computationally efficient without…
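The divide-then-concatenate idea in the abstract can be illustrated with a small sketch. It assumes one plausible division scheme: weights are quantized to a fixed bit width and sent in chunks of bits, most significant first. The summary above does not confirm that the paper divides models this way, and every name here (`BITS`, `CHUNK_BITS`, `quantize`, `split_into_pieces`, `reconstruct`) is hypothetical.

```python
# Minimal sketch of progressive weight transmission (assumed scheme, not the paper's code).
BITS = 16        # assumed total quantization width per weight
CHUNK_BITS = 4   # assumed number of bits carried by each transmitted piece


def quantize(weights, lo, hi):
    """Map floats in [lo, hi] to unsigned integers of width BITS."""
    top = (1 << BITS) - 1
    scale = top / (hi - lo)
    return [min(top, max(0, round((w - lo) * scale))) for w in weights]


def split_into_pieces(quantized):
    """Server side: cut every integer into bit chunks, most significant first."""
    mask = (1 << CHUNK_BITS) - 1
    pieces = []
    for k in range(BITS // CHUNK_BITS):
        shift = BITS - CHUNK_BITS * (k + 1)
        pieces.append([(v >> shift) & mask for v in quantized])
    return pieces


def reconstruct(received, lo, hi):
    """Client side: concatenate the pieces received so far into approximate weights."""
    acc = [0] * len(received[0])
    for piece in received:
        acc = [(a << CHUNK_BITS) | p for a, p in zip(acc, piece)]
    missing = BITS - CHUNK_BITS * len(received)
    half = (1 << missing) >> 1  # midpoint of the still-unknown low bits (0 when complete)
    scale = (hi - lo) / ((1 << BITS) - 1)
    return [lo + ((a << missing) + half) * scale for a in acc]


weights = [-0.8, -0.1, 0.0, 0.33, 0.9]
pieces = split_into_pieces(quantize(weights, -1.0, 1.0))
for k in range(1, len(pieces) + 1):
    approx = reconstruct(pieces[:k], -1.0, 1.0)
    worst = max(abs(a - w) for a, w in zip(approx, weights))
    print(f"pieces received: {k}, worst weight error: {worst:.6f}")
```

In this sketch the pieces together carry exactly `BITS` bits per weight, so the total payload matches that of sending the quantized model in one shot, while each prefix of pieces already yields a usable approximation.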
