Progressive Semantic Communication for Efficient Edge-Cloud Vision-Language Models

Cyril Shih-Huan Hsu; Wig Yuan-Cheng Cheng; Chrysa Papagianni

arXiv:2604.26508 · cs.LG · April 30, 2026

TL;DR

This paper introduces a progressive semantic communication framework for edge-cloud vision-language models. The framework enables adaptive, bandwidth-efficient inference on resource-constrained devices while preserving semantic fidelity.

Contribution

The paper proposes an adaptive compression scheme based on a Meta AutoEncoder. The scheme allows flexible, plug-and-play deployment of VLMs without fine-tuning and optimizes communication under bandwidth constraints.
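The summary does not say how a compression level is chosen at run time. As a rough sketch only, one plausible policy is to send the richest representation whose upload time fits a latency budget. Everything named here (`Level`, `pick_level`, the token counts, the latent width `dim`, and the 8-bit quantization) is a hypothetical illustration, not taken from the paper or its repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """One hypothetical compression level: how many latent tokens it sends."""
    name: str
    num_tokens: int


def payload_bits(num_tokens: int, dim: int, bits_per_value: int) -> int:
    """Size of the transmitted representation in bits."""
    return num_tokens * dim * bits_per_value


def transmit_seconds(bits: int, uplink_bps: float) -> float:
    """Ideal upload time, ignoring protocol overhead and round trips."""
    return bits / uplink_bps


def pick_level(levels, dim, bits_per_value, uplink_bps, budget_s):
    """Return the richest level that fits the budget, else the smallest one."""
    ordered = sorted(levels, key=lambda lv: lv.num_tokens)
    best = ordered[0]  # fallback when nothing fits
    for lv in ordered:
        bits = payload_bits(lv.num_tokens, dim, bits_per_value)
        if transmit_seconds(bits, uplink_bps) <= budget_s:
            best = lv
    return best


# Assumed numbers: 64-dim latents, 8 bits per value, 1 Mbps uplink, 50 ms budget.
levels = [Level("coarse", 16), Level("medium", 64), Level("fine", 256)]
chosen = pick_level(levels, dim=64, bits_per_value=8,
                    uplink_bps=1_000_000, budget_s=0.05)
print(chosen.name)  # prints "medium"
```

Under these assumed numbers the 64-token level needs 32,768 bits, about 33 ms at 1 Mbps, while the 256-token level would need about 131 ms and is skipped.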

Findings

01 Significant latency reduction at a 1 Mbps uplink compared with baseline methods.

02 High semantic consistency is maintained even at high compression levels.

03 An end-to-end system is demonstrated on embedded platforms.

Abstract

Deploying Vision-Language Models (VLMs) on edge devices remains challenging due to their substantial computational and memory demands, which exceed the capabilities of resource-constrained embedded platforms. Conversely, fully offloading inference to the cloud is often impractical in bandwidth-limited environments, where transmitting raw visual data introduces substantial latency overhead. While recent edge-cloud collaborative architectures attempt to partition VLM workloads across devices, they typically rely on transmitting fixed-size representations, lacking adaptability to dynamic network conditions and failing to fully exploit semantic redundancy. In this paper, we propose a progressive semantic communication framework for edge-cloud VLM inference, using a Meta AutoEncoder that compresses visual tokens into adaptive, progressively refinable representations, enabling plug-and-play…
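To make "progressively refinable" concrete, here is a minimal sketch of the general idea: an encoding ordered from coarse to fine, so that any received prefix can be decoded and each extra piece can only improve the reconstruction. It deliberately swaps in a fixed orthonormal DCT for the learned Meta AutoEncoder, so it illustrates the property, not the paper's method. The feature values and all function names are made up for illustration.

```python
import math


def dct_basis(n: int):
    """Orthonormal DCT-II basis; rows run from coarse to fine detail."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows


def encode(x, basis):
    """Edge side: project the feature vector onto every basis vector."""
    return [sum(b * v for b, v in zip(row, x)) for row in basis]


def decode_prefix(coeffs_prefix, basis, n):
    """Cloud side: reconstruct from however many coefficients have arrived."""
    out = [0.0] * n
    for c, row in zip(coeffs_prefix, basis):
        for i in range(n):
            out[i] += c * row[i]
    return out


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Made-up stand-in for one row of visual features.
features = [0.9, 1.1, 1.0, 1.4, 2.0, 2.2, 1.9, 2.1]
basis = dct_basis(len(features))
code = encode(features, basis)

# Reconstruction error after receiving the first m coefficients.
errors = [mse(features, decode_prefix(code[:m], basis, len(features)))
          for m in range(1, len(code) + 1)]
for m, err in enumerate(errors, start=1):
    print(f"prefix length {m}: mse = {err:.6f}")
```

Because the basis is orthonormal, the error never increases as the prefix grows and reaches zero (up to floating-point rounding) once all coefficients arrive. A learned progressive codec would aim for the same ordering property on semantic features.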

Code & Models

Repositories

open-ep/ProSemComVLM (GitHub)
