Real-Time and Scalable Zak-OTFS Receiver Processing on GPUs
Junyao Zheng, Chung-Hsuan Tung, Yuncheng Yao, Nishant Mehrotra, Sandesh Mattu, Zhenzhou Qi, Danyang Zhuo, Robert Calderbank, Tingjun Chen

TL;DR
This paper introduces a GPU-based Zak-OTFS receiver architecture that leverages channel sparsity and optimized matrix operations for real-time, high-throughput processing in high-mobility communication scenarios.
Contribution
It presents a scalable, low-latency Zak-OTFS receiver design on GPUs that exploits DD-domain sparsity and structured matrices for efficient processing.
Findings
Achieves up to 906.52 Mbps throughput with large DD grid sizes.
Meets the 99.9th-percentile real-time processing deadline.
Demonstrates robustness and scalability across multiple GPU platforms.
Abstract
Orthogonal time frequency space (OTFS) modulation offers superior robustness to high-mobility channels compared to conventional orthogonal frequency-division multiplexing (OFDM) waveforms. However, its explicit delay-Doppler (DD) domain representation incurs substantial signal processing complexity, especially as the DD-domain grid size grows. To address this challenge, we present a scalable, real-time Zak-OTFS receiver architecture on GPUs through hardware-algorithm co-design that exploits DD-domain channel sparsity. Our design leverages compact matrix operations for key processing stages, a branchless iterative equalizer, and a structured sparse representation of the DD-domain channel matrix to significantly reduce computational and memory overhead. These optimizations enable low-latency processing that consistently meets the 99.9th-percentile real-time processing deadline. The…
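To make the sparsity argument concrete, the following is a minimal sketch of why a DD-domain channel with few paths need not be stored as a dense matrix. It is not the paper's implementation: it assumes integer delay and Doppler shifts, models each path as a plain 2D cyclic shift, and omits the quasi-periodic phase terms of a real Zak-OTFS input-output relation. The names `apply_sparse_dd_channel`, `dense_vs_sparse_entries`, `paths`, `M`, and `N` are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: a simplified DD-domain channel with integer shifts.
# Every name below is hypothetical and not taken from the paper.

def apply_sparse_dd_channel(x, paths, M, N):
    """Apply a P-path channel to an M x N DD-domain grid x (list of lists).

    paths: list of (delay_shift, doppler_shift, gain) triples.
    Assumption: integer shifts and plain cyclic shifts, with no
    quasi-periodic phase terms.
    """
    y = [[0j] * N for _ in range(M)]
    for k_p, l_p, h_p in paths:
        for k in range(M):
            src_row = x[(k - k_p) % M]   # cyclic shift along the delay axis
            out_row = y[k]
            for l in range(N):
                # cyclic shift along the Doppler axis, scaled by the path gain
                out_row[l] += h_p * src_row[(l - l_p) % N]
    return y


def dense_vs_sparse_entries(M, N, P):
    """Stored values: full (MN x MN) matrix vs. P (delay, Doppler, gain) triples."""
    return (M * N) ** 2, 3 * P


M, N = 4, 3
x = [[0j] * N for _ in range(M)]
x[0][0] = 1 + 0j                          # a single pilot symbol at the origin
paths = [(0, 0, 1.0 + 0j), (1, 2, 0.5j)]  # two paths: direct and one shifted
y = apply_sparse_dd_channel(x, paths, M, N)
```

Under these assumptions, the pilot reappears once per path, at the grid position given by that path's shifts and scaled by its gain. The work scales with the number of paths times the grid size, and the stored channel description scales with the number of paths alone, whereas a dense channel matrix grows with the square of the grid size.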
