Semantic Communications with World Models
Peiwen Jiang, Jiajia Guo, Chao-Kai Wen, Shi Jin, Jun Zhang

TL;DR
This paper introduces a world foundation model (WFM)-aided semantic video transmission framework that reduces bandwidth usage by predicting future frames and transmitting data only when those predictions become unreliable, which helps it cope with low bandwidth and varying channel conditions.
Contribution
It proposes a novel framework combining world foundation models with feedback and segmentation techniques for efficient semantic video transmission under challenging conditions.
Findings
Significantly reduces transmission overhead.
Maintains task performance across diverse scenarios.
Supports effective proactive scheduling in mobile environments.
Abstract
Semantic communication is a promising technique for emerging wireless applications, which reduces transmission overhead by transmitting only task-relevant features instead of raw data. However, existing methods struggle under extremely low bandwidth and varying channel conditions, where corrupted or missing semantics lead to severe reconstruction errors. To resolve this difficulty, we propose a world foundation model (WFM)-aided semantic video transmission framework that leverages the predictive capability of WFMs to generate future frames based on the current frame and textual guidance. This design allows transmissions to be omitted when predictions remain reliable, thereby saving bandwidth. The WFM's predictions preserve the key semantics, yet minor prediction errors tend to amplify over time. To mitigate this issue, a lightweight depth-based feedback module is introduced to…
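The idea of omitting transmissions while predictions remain reliable can be illustrated with a minimal sketch. It is not the authors' implementation: the `hold_last_frame` predictor is a trivial stand-in for a WFM, the mean-absolute-error check and the `threshold` value are hypothetical choices (the paper's depth-based feedback module is not modeled), and frames are reduced to short lists of floats. The sketch also assumes that the sender can run the same predictor as the receiver.

```python
from typing import Callable, List, Tuple

Frame = List[float]  # toy frame: a flat list of pixel values


def mean_abs_error(a: Frame, b: Frame) -> float:
    """Average absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def schedule_transmissions(
    frames: List[Frame],
    predict: Callable[[Frame], Frame],
    threshold: float,
) -> Tuple[List[bool], List[Frame]]:
    """Decide per frame whether to transmit or rely on the prediction.

    Returns the send/skip flags and the frames the receiver ends up with.
    """
    state = list(frames[0])  # the first frame is always transmitted
    sent = [True]
    received = [state]
    for actual in frames[1:]:
        predicted = predict(state)
        if mean_abs_error(predicted, actual) <= threshold:
            # Prediction is close enough: skip transmission, keep predicting.
            sent.append(False)
            state = predicted
        else:
            # Prediction drifted too far: transmit and resynchronize.
            sent.append(True)
            state = list(actual)
        received.append(state)
    return sent, received


def hold_last_frame(frame: Frame) -> Frame:
    """Hypothetical stand-in predictor: assume the scene does not change."""
    return list(frame)


frames = [[0.0, 0.0], [0.05, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 1.05]]
sent, received = schedule_transmissions(frames, hold_last_frame, threshold=0.06)
print(sent)
```

With these toy values only the first and fourth frames are transmitted. The second and third frames drift slowly away from the receiver's state without triggering a transmission, which is a small-scale picture of the error accumulation the abstract describes.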
