MOC-RVQ: Multilevel Codebook-Assisted Digital Generative Semantic Communication

Yingbin Zhou; Yaping Sun; Guanying Chen; Xiaodong Xu; Hao Chen; Binhong Huang; Shuguang Cui; Ping Zhang

arXiv:2401.01272·cs.CV·October 1, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces MOC-RVQ, a multilevel generative semantic communication system that combines a multi-head octonary codebook (MOC) and residual vector quantization (RVQ) with a Swin Transformer-based noise reduction block (NRB) for bandwidth-efficient, high-quality image transmission.

Contribution

The paper proposes a multilevel generative semantic communication framework trained in two stages. The first stage learns a multi-head octonary codebook (MOC) with residual vector quantization (RVQ); the second adds a Swin Transformer-based noise reduction block (NRB) to improve image transmission.

Findings

01 · Outperforms conventional methods like BPG and JPEG in image quality.

02 · Achieves comparable performance to analog JSCC with significantly lower bandwidth.

03 · Requires only one-sixth of the channel bandwidth ratio for high-quality transmission.

Abstract

Vector quantization-based image semantic communication systems have successfully boosted transmission efficiency, but face conflicting requirements between codebook design and digital constellation modulation: traditional codebooks need wide index ranges, while modulation favors few discrete states. To address this, we propose a multilevel generative semantic communication system with a two-stage training framework. In the first stage, we train a high-quality codebook, using a multi-head octonary codebook (MOC) to compress the index range. A residual vector quantization (RVQ) mechanism is also integrated for effective multilevel communication. In the second stage, a noise reduction block (NRB) based on Swin Transformer is introduced, coupled with the multilevel codebook from the first stage, serving as a high-quality semantic knowledge base (SKB) for…
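
The two first-stage ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "multi-head" means splitting a feature vector into equal sub-vectors, that "octonary" means 8 codewords per codebook, and that each RVQ level quantizes the residual left by the previous level. The codebooks are random rather than trained, and every name and constant below (`rvq_encode`, `moc_rvq_encode`, `N_HEADS`, and so on) is hypothetical.

```python
import random

K = 8          # "octonary": 8 codewords per codebook, so every index fits in 3 bits
N_HEADS = 4    # assumed number of heads (sub-vectors)
N_LEVELS = 3   # assumed number of residual quantization levels
HEAD_DIM = 2   # assumed dimension of each head

def nearest(codebook, vec):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )

def rvq_encode(vec, level_codebooks):
    """Quantize vec level by level; each level encodes the remaining residual."""
    residual = list(vec)
    indices = []
    for codebook in level_codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices

def rvq_decode(indices, level_codebooks):
    """Sum the selected codewords of the levels that were received."""
    out = [0.0] * len(level_codebooks[0][0])
    for idx, codebook in zip(indices, level_codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

def moc_rvq_encode(vec, codebooks):
    """Split vec into heads and run RVQ on each head independently."""
    return [
        rvq_encode(vec[h * HEAD_DIM:(h + 1) * HEAD_DIM], codebooks[h])
        for h in range(len(codebooks))
    ]

def moc_rvq_decode(indices, codebooks):
    """Decode every head and concatenate the results."""
    out = []
    for head_indices, level_codebooks in zip(indices, codebooks):
        out.extend(rvq_decode(head_indices, level_codebooks))
    return out

# Random stand-in codebooks: codebooks[head][level] is a list of K codewords.
# Deeper levels use a smaller scale because they model smaller residuals.
rng = random.Random(0)
codebooks = [
    [
        [[rng.gauss(0.0, 0.5 ** level) for _ in range(HEAD_DIM)] for _ in range(K)]
        for level in range(N_LEVELS)
    ]
    for _ in range(N_HEADS)
]

vec = [rng.gauss(0.0, 1.0) for _ in range(N_HEADS * HEAD_DIM)]
indices = moc_rvq_encode(vec, codebooks)   # N_HEADS lists of N_LEVELS indices
recon = moc_rvq_decode(indices, codebooks)
```

Every index lies in the range 0 to 7, which is the narrow index range the abstract says modulation favors. Because `rvq_decode` zips over the indices it is given, passing only the first level's index for a head yields a coarser reconstruction of that head, which is one way to read "multilevel" here.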

Code & Models

Repositories

albert2x/moc_rvq · PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications · Fractal and DNA sequence analysis

Methods: Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Label Smoothing · Adam · Dropout · Linear Layer · Stochastic Depth · Absolute Position Encodings · Layer Normalization