Video TokenCom: Textual Intent-Guided Multi-Rate Video Token Communications with UEP-Based Adaptive Source-Channel Coding
Jingxuan Men, Mahdi Boloursaz Mashhadi, Ning Wang, Yi Ma, Mike Nilsson, Rahim Tafazolli

TL;DR
This paper introduces Video TokenCom, a semantic-aware multi-rate video communication framework guided by textual intent. It uses unequal error protection (UEP)-based adaptive source-channel coding to improve semantic fidelity under bandwidth constraints.
Contribution
It proposes a novel framework integrating textual intent with video tokenization and UEP-based adaptive coding, enhancing semantic fidelity in wireless video transmission.
Findings
Outperforms conventional and semantic baselines in perceptual quality.
Effective rate savings through differential encoding of tokens.
Robust performance across various SNR conditions.
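To illustrate how differential encoding of tokens can save rate, here is a minimal sketch. It assumes each frame is a flat list of discrete token indices and that only tokens that changed since the previous frame are sent. The function names `diff_encode` and `diff_decode` and the toy token values are hypothetical; the paper's actual differential scheme is not described in this summary.

```python
def diff_encode(prev_tokens, curr_tokens):
    """Return (position, new_token) pairs for tokens that differ from the previous frame."""
    if len(prev_tokens) != len(curr_tokens):
        raise ValueError("frames must contain the same number of tokens")
    return [(i, c) for i, (p, c) in enumerate(zip(prev_tokens, curr_tokens)) if p != c]


def diff_decode(prev_tokens, updates):
    """Rebuild the current frame by applying the updates to the previous frame."""
    out = list(prev_tokens)
    for i, tok in updates:
        out[i] = tok
    return out


# Toy frames (assumed values): only two of eight tokens change.
prev_frame = [12, 7, 7, 91, 3, 3, 44, 5]
curr_frame = [12, 7, 8, 91, 3, 3, 40, 5]

updates = diff_encode(prev_frame, curr_frame)
reconstructed = diff_decode(prev_frame, updates)
```

In this toy case the sender transmits two position-token pairs instead of eight tokens. The saving depends on how much of the scene is static, and the position indices add overhead that a real scheme would have to account for.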
Abstract
Token Communication (TokenCom) is a new paradigm, motivated by the recent success of Large AI Models (LAMs) and Multimodal Large Language Models (MLLMs), where tokens serve as unified units of communication and computation, enabling efficient semantic- and goal-oriented information exchange in future wireless networks. In this paper, we propose a novel Video TokenCom framework for textual intent-guided multi-rate video communication with Unequal Error Protection (UEP)-based source-channel coding adaptation. The proposed framework integrates user-intended textual descriptions with discrete video tokenization and unequal error protection to enhance semantic fidelity under restrictive bandwidth constraints. First, discrete video tokens are extracted through a pretrained video tokenizer, while text-conditioned vision-language modeling and optical-flow propagation are jointly used to…
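The idea of unequal error protection can be sketched with a toy example. It assumes each token carries a boolean flag marking it as relevant to the user's textual intent, and it uses a simple repetition code with majority-vote decoding as a stand-in for the channel code. The repetition code, the function names, and the repetition counts are all assumptions for illustration and are not the coding scheme used in the paper.

```python
from collections import Counter


def uep_encode(tokens, important, strong_reps=3, weak_reps=1):
    """Repeat each token; intent-relevant tokens get more copies (stronger protection)."""
    return [[tok] * (strong_reps if imp else weak_reps)
            for tok, imp in zip(tokens, important)]


def uep_decode(received):
    """Recover each token by majority vote over its received copies."""
    return [Counter(copies).most_common(1)[0][0] for copies in received]


# Toy data (assumed values): tokens 0 and 2 are intent-relevant.
tokens = [5, 9, 2]
important = [True, False, True]

coded = uep_encode(tokens, important)

# Simulate channel errors: one corrupted copy in a strongly protected token
# and one in the weakly protected token.
coded[0][1] = 0
coded[1][0] = 0

decoded = uep_decode(coded)
```

The strongly protected token survives a single corrupted copy, while the weakly protected token does not. This is the trade-off UEP makes: protection overhead is spent where errors would hurt semantic fidelity most.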
Taxonomy
Topics: Wireless Signal Modulation Classification · Wireless Communication Security Techniques · Video Coding and Compression Technologies
