VideoQA-SC: Adaptive Semantic Communication for Video Question Answering
Jiangyuan Guo, Wei Chen, Yuxuan Sun, Jialong Xu, Bo Ai

TL;DR
This paper introduces VideoQA-SC, a semantic communication system for video question answering that transmits task-relevant video semantics directly over noisy channels, bypassing video reconstruction at the receiver to improve bandwidth efficiency and robustness.
Contribution
The paper proposes an end-to-end adaptive semantic communication framework tailored to VideoQA tasks. It combines a spatiotemporal semantic encoder with a deep joint source-channel coding (DJSCC) scheme to improve answer accuracy over noisy channels.
Findings
Outperforms traditional SC systems in accuracy and bandwidth efficiency.
Achieves 5.17% higher answer accuracy at low SNR.
Saves 99.5% bandwidth compared to existing systems.
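To make "low SNR" concrete, the following is a minimal sketch of how a feature vector degrades when sent over an additive white Gaussian noise (AWGN) channel at different signal-to-noise ratios. It is not the paper's learned DJSCC codec: the random vector is only a stand-in for encoder output, the channel is real-valued AWGN without fading, and the names `normalize_power`, `awgn_channel`, and `mean_squared_error` are hypothetical helpers introduced for illustration.

```python
import math
import random


def normalize_power(symbols):
    """Scale symbols so that their average power equals 1."""
    power = sum(s * s for s in symbols) / len(symbols)
    scale = 1.0 / math.sqrt(power)
    return [s * scale for s in symbols]


def awgn_channel(symbols, snr_db, rng):
    """Add real Gaussian noise to unit-power symbols.

    With signal power 1, the noise variance is 10 ** (-snr_db / 10).
    """
    noise_std = math.sqrt(10 ** (-snr_db / 10))
    return [s + rng.gauss(0.0, noise_std) for s in symbols]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
# Assumption: a random vector standing in for the semantic encoder output.
features = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
tx = normalize_power(features)

for snr_db in (0, 10, 20):
    rx = awgn_channel(tx, snr_db, rng)
    print(f"SNR {snr_db:>2} dB -> MSE {mean_squared_error(tx, rx):.4f}")
```

In this simplified model the distortion at the receiver equals the noise variance, so each 10 dB drop in SNR raises it by a factor of ten. That is why a system which must deliver task-relevant semantics at low SNR needs coding that tolerates heavy corruption of the transmitted symbols.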
Abstract
Although semantic communication (SC) has shown its potential in efficiently transmitting multimodal data such as text, speech, and images, SC for videos has focused primarily on pixel-level reconstruction. However, these SC systems may be suboptimal for downstream intelligent tasks. Moreover, SC systems without pixel-level video reconstruction offer higher bandwidth efficiency and real-time performance across various intelligent tasks. The difficulty in designing such a system lies in extracting compact, task-related semantic representations and delivering them accurately over noisy channels. In this paper, we propose an end-to-end SC system, named VideoQA-SC, for video question answering (VideoQA) tasks. Our goal is to accomplish VideoQA tasks directly based on video semantics over noisy or fading wireless channels, bypassing the need for video reconstruction at…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
