Semantic Communications with World Models

Peiwen Jiang; Jiajia Guo; Chao-Kai Wen; Shi Jin; Jun Zhang

arXiv:2510.24785·eess.IV·October 30, 2025

TL;DR

This paper introduces a world foundation model (WFM)-aided semantic video transmission framework. The WFM predicts future frames at the receiver, so the sender transmits only when predictions become unreliable, which reduces bandwidth usage under low-bandwidth and time-varying channel conditions.

Contribution

It proposes a novel framework combining world foundation models with feedback and segmentation techniques for efficient semantic video transmission under challenging conditions.

Findings

01 Significantly reduces transmission overhead.

02 Maintains task performance across diverse scenarios.

03 Effective proactive scheduling in mobile environments.

Abstract

Semantic communication is a promising technique for emerging wireless applications, which reduces transmission overhead by transmitting only task-relevant features instead of raw data. However, existing methods struggle under extremely low bandwidth and varying channel conditions, where corrupted or missing semantics lead to severe reconstruction errors. To resolve this difficulty, we propose a world foundation model (WFM)-aided semantic video transmission framework that leverages the predictive capability of WFMs to generate future frames based on the current frame and textual guidance. This design allows transmissions to be omitted when predictions remain reliable, thereby saving bandwidth. The WFM's predictions preserve the key semantics, yet minor prediction errors tend to amplify over time. To mitigate this issue, a lightweight depth-based feedback module is introduced to…
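
The idea of omitting transmissions while predictions remain reliable can be illustrated with a toy sketch. This is not the paper's method: the predictor below is a trivial stand-in for a WFM (it assumes the next frame equals the last one), frames are short lists of floats rather than images, and the error metric, the threshold, and the function names are all hypothetical choices made for illustration. The abstract's depth-based feedback module is not modeled here.

```python
from typing import Callable, List, Tuple

Frame = List[float]  # toy "frame": a flat list of pixel-like values


def mean_abs_error(a: Frame, b: Frame) -> float:
    """Average absolute difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def persistence_predictor(last: Frame) -> Frame:
    """Hypothetical stand-in for a WFM: predict that nothing changes."""
    return list(last)


def schedule_transmissions(
    frames: List[Frame],
    predict: Callable[[Frame], Frame],
    threshold: float,
) -> Tuple[List[bool], List[Frame]]:
    """Decide, frame by frame, whether to transmit or rely on prediction.

    Assumes the sender can run the same predictor as the receiver, so it
    knows what the receiver would reconstruct without a transmission.
    """
    sent = [True]                      # the first frame is always transmitted
    receiver = [list(frames[0])]       # what the receiver currently holds
    for actual in frames[1:]:
        predicted = predict(receiver[-1])
        if mean_abs_error(predicted, actual) > threshold:
            sent.append(True)          # prediction unreliable: transmit
            receiver.append(list(actual))
        else:
            sent.append(False)         # prediction good enough: skip
            receiver.append(predicted)
    return sent, receiver


frames = [[0.0, 0.0], [0.05, 0.0], [0.1, 0.0], [0.5, 0.5], [0.5, 0.55]]
sent, received = schedule_transmissions(frames, persistence_predictor, 0.1)
print(sent)  # which frames were actually transmitted
```

In this toy run only two of the five frames are transmitted. The receiver's copy drifts slightly from the true frames between transmissions, which mirrors the abstract's remark that small prediction errors build up over time and need a correction mechanism.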
