E2E-WAVE: End-to-End Learned Waveform Generation for Underwater Video Multicasting
Khizar Anjum, Tingcong Jiang, Dario Pompili

TL;DR
E2E-WAVE is an end-to-end learned waveform system for underwater video multicasting that embeds semantic similarity into physical layer waveforms, enabling high-quality real-time video transmission over challenging channels.
Contribution
It introduces an end-to-end system that combines semantic-aware waveform generation with fully differentiable OFDM for underwater video multicast, outperforming the strongest FEC-protected baseline.
Findings
Achieves +5 dB (19.26%) PSNR and +0.10 (14.28%) SSIM over the strongest FEC-protected baseline in the less challenging NOF1 underwater channel.
Delivers real-time 16 FPS video at 128x128 resolution over 2.3 kbps channels.
Generalizes to unseen underwater environments without retraining.
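To put the rate figure in perspective, the per-frame bit budget can be checked with simple arithmetic. The sketch below assumes uncompressed 24-bit RGB frames and treats 2.3 kbps as the usable payload rate; neither assumption is stated in the summary, and how the result relates to the 1024x tokenization figure depends on details not given here.

```python
# Back-of-the-envelope bit budget for the reported operating point.
# Assumptions (not from the paper): 24-bit RGB source frames, and
# 2.3 kbps counted as payload rate with no protocol overhead.

CHANNEL_BPS = 2300       # 2.3 kbps channel, from the summary
FPS = 16                 # frame rate, from the summary
WIDTH = HEIGHT = 128     # resolution, from the summary
BITS_PER_PIXEL = 24      # assumed: 8 bits per RGB channel


def frame_budget_bits(channel_bps, fps):
    """Bits the channel can carry per video frame."""
    return channel_bps / fps


def raw_frame_bits(width, height, bits_per_pixel):
    """Bits in one uncompressed frame."""
    return width * height * bits_per_pixel


budget_bits = frame_budget_bits(CHANNEL_BPS, FPS)
raw_bits = raw_frame_bits(WIDTH, HEIGHT, BITS_PER_PIXEL)
# Overall reduction the link would require relative to raw video.
reduction = raw_bits / budget_bits

print(f"budget per frame: {budget_bits} bits")
print(f"raw frame size:   {raw_bits} bits")
print(f"implied reduction: about {reduction:.0f}x")
```

Under these assumptions each frame has well under 200 bits available, which is why the system transmits a small number of discrete tokens rather than pixel data.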
Abstract
We present E2E-WAVE, the first end-to-end learned waveform generation system for underwater video multicasting. Acoustic channels exhibit 20--46% bit error rates where forward error correction becomes counterproductive -- LDPC increases rather than decreases errors beyond its decoding threshold. E2E-WAVE addresses this by embedding semantic similarity directly into physical layer waveforms: when decoding errors are unavoidable, the system preferentially selects semantically similar tokens rather than arbitrary corruption. Combining VideoGPT tokenization (1024x compression) with a trainable waveform bank and fully differentiable OFDM transmission, E2E-WAVE achieves +5 dB (19.26%) PSNR and +0.10 (14.28%) SSIM over the strongest FEC-protected baseline in the less challenging underwater channel (NOF1) while delivering real-time 16 FPS video at 128x128 resolution over 2.3 kbps channels --…
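The idea that decoding errors should land on semantically similar tokens can be illustrated with a toy example. The sketch below is not the paper's method: it uses a hand-built 16-point circular constellation in place of a learned waveform bank, assumes tokens sit on a ring where index distance stands in for semantic distance, and uses plain additive Gaussian noise instead of an underwater channel model. All names and parameters are hypothetical. It compares an ordered bank (neighbouring tokens get neighbouring waveforms) against a scrambled bank under the same minimum-distance decoder.

```python
import math
import random

N_TOKENS = 16  # toy vocabulary size (hypothetical)


def constellation(n):
    """n unit-energy 2-D points on a circle, standing in for a waveform bank."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]


def ring_distance(a, b, n=N_TOKENS):
    """Toy semantic distance: tokens are assumed to lie on a ring."""
    d = abs(a - b) % n
    return min(d, n - d)


def nearest_token(rx, bank):
    """Minimum-distance decoding: token whose waveform is closest to rx."""
    return min(range(len(bank)),
               key=lambda t: (rx[0] - bank[t][0]) ** 2 + (rx[1] - bank[t][1]) ** 2)


def simulate(bank, noise_std, trials, rng):
    """Return (mean semantic distance of wrong decodes, token error rate)."""
    total_dist, errors = 0, 0
    for _ in range(trials):
        sent = rng.randrange(len(bank))
        rx = (bank[sent][0] + rng.gauss(0, noise_std),
              bank[sent][1] + rng.gauss(0, noise_std))
        got = nearest_token(rx, bank)
        if got != sent:
            errors += 1
            total_dist += ring_distance(sent, got)
    mean_dist = total_dist / errors if errors else 0.0
    return mean_dist, errors / trials


points = constellation(N_TOKENS)
# Ordered bank: token k uses point k, so similar tokens have similar waveforms.
ordered_bank = points
# Scrambled bank: same points, but token k uses point 7k mod 16 (a permutation).
scrambled_bank = [points[(7 * k) % N_TOKENS] for k in range(N_TOKENS)]

ord_dist, ord_err = simulate(ordered_bank, 0.3, 5000, random.Random(1))
scr_dist, scr_err = simulate(scrambled_bank, 0.3, 5000, random.Random(1))

print(f"ordered:   error rate {ord_err:.2f}, mean error distance {ord_dist:.2f}")
print(f"scrambled: error rate {scr_err:.2f}, mean error distance {scr_dist:.2f}")
```

Both banks use the same signal points, so their raw error rates are statistically similar. What differs is where the errors go: in the ordered bank a wrong decode usually lands on an immediate neighbour, while in the scrambled bank it lands on a token far away on the ring. E2E-WAVE learns its waveform bank end to end rather than fixing the mapping by hand as done here.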
