The Radio-Frequency Transformer for Signal Separation

Egor Lifar; Semyon Savkin; Rachana Madhukara; Tejas Jayashankar; Yury Polyanskiy; Gregory W. Wornell

arXiv:2603.09201·cs.LG·March 11, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces a data-driven, transformer-based method for RF signal separation that learns a discrete tokenizer and reports a 122x reduction in bit-error rate over prior methods, with potential applications beyond radio-frequency data.

Contribution

It presents a novel transformer-based signal separator with a learned tokenizer that outperforms traditional methods and generalizes to unseen interference types.

Findings

01. 122x reduction in bit-error rate over prior methods
02. Zero-shot generalization to unseen mixtures
03. Effective on real and synthetic RF data
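
For readers unfamiliar with the headline metric, bit-error rate (BER) for a QPSK signal can be sketched as below. This is a minimal illustration and not the paper's evaluation code: the Gray-coded bit-to-symbol mapping and the function names (`qpsk_modulate`, `qpsk_demodulate`, `bit_error_rate`) are assumptions made for the example.

```python
import math

def qpsk_modulate(bits):
    """Map bit pairs to unit-energy QPSK symbols (assumed Gray mapping)."""
    symbols = []
    for i in range(0, len(bits), 2):
        # Bit 0 -> +1, bit 1 -> -1; first bit sets the real part, second the imaginary part.
        real = 1 - 2 * bits[i]
        imag = 1 - 2 * bits[i + 1]
        symbols.append(complex(real, imag) / math.sqrt(2))
    return symbols

def qpsk_demodulate(symbols):
    """Hard-decision demodulation: read each bit from the sign of one axis."""
    bits = []
    for s in symbols:
        bits.append(1 if s.real < 0 else 0)
        bits.append(1 if s.imag < 0 else 0)
    return bits

def bit_error_rate(sent_bits, received_bits):
    """Fraction of bit positions where the decoded bit differs from the sent bit."""
    errors = sum(a != b for a, b in zip(sent_bits, received_bits))
    return errors / len(sent_bits)

# Usage: a noiseless round trip recovers every bit, so the BER is zero.
bits = [0, 0, 0, 1, 1, 0, 1, 1]
recovered = qpsk_demodulate(qpsk_modulate(bits))
print(bit_error_rate(bits, recovered))
```

A reduction factor such as the one quoted above is the ratio of two such rates, the baseline BER divided by the new method's BER, measured under the same conditions.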

Abstract

We study a problem of signal separation: estimating a signal of interest (SOI) contaminated by an unknown non-Gaussian background/interference. Given training data consisting of examples of the SOI and the interference, we show how to build a fully data-driven signal separator. To that end, we learn a good discrete tokenizer for the SOI and then train an end-to-end transformer on a cross-entropy loss. Training with a cross-entropy loss shows substantial improvements over training with the conventional mean-squared error (MSE). Our tokenizer is a modification of Google's SoundStream, which incorporates additional transformer layers and switches from VQVAE to finite-scalar quantization (FSQ). Across real and synthetic mixtures from the MIT RF Challenge dataset, our method achieves competitive performance, including a 122x reduction in bit-error rate (BER) over prior state-of-the-art techniques for separating a QPSK…
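
The finite-scalar quantization step mentioned in the abstract can be illustrated with a small sketch. It shows the general FSQ idea only: each latent coordinate is bounded with tanh and rounded to a small set of integers, and the resulting integer vector is read as a single token index. The level counts (`[5, 5, 5]`), the restriction to odd level counts, and all function names are assumptions for illustration; the paper's tokenizer configuration and its training-time gradient handling are not reproduced here.

```python
import math

def fsq_quantize(latent, levels):
    """Bound each coordinate with tanh, then round to one of levels[i] integers.

    Assumes every entry of `levels` is odd, so the codes are the integers
    in [-(L - 1) / 2, (L - 1) / 2].
    """
    codes = []
    for value, num_levels in zip(latent, levels):
        half = (num_levels - 1) / 2
        codes.append(round(half * math.tanh(value)))
    return codes

def codes_to_token(codes, levels):
    """Read the per-coordinate codes as digits of a mixed-radix token index."""
    token = 0
    for code, num_levels in zip(codes, levels):
        half = (num_levels - 1) // 2
        token = token * num_levels + (code + half)
    return token

def token_to_codes(token, levels):
    """Invert codes_to_token."""
    codes = []
    for num_levels in reversed(levels):
        half = (num_levels - 1) // 2
        codes.append(token % num_levels - half)
        token //= num_levels
    return codes[::-1]

# Usage: three coordinates with 5 levels each give an implicit codebook of 5**3 = 125 tokens.
levels = [5, 5, 5]
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
token = codes_to_token(codes, levels)
print(codes, token)
```

Unlike a VQ-VAE codebook, nothing here is learned: the codebook is the fixed grid implied by `levels`, which is the design difference the abstract alludes to.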

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

The paper tackles a practically relevant and challenging problem -- signal separation under RF interference. The use of cross-entropy loss on quantized token sequences, rather than waveform-level MSE, is technically appropriate for a discrete latent representation and improves compatibility with communication metrics such as BER. The writing and experimental presentation are clear and organized, helping reproducibility.
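
The contrast this review draws between a token-level cross-entropy loss and a waveform-level MSE can be sketched as follows. This is an illustrative sketch only, not the authors' training code: the function names and the toy inputs are assumptions.

```python
import math

def token_cross_entropy(logits, target):
    """Negative log-probability of the target token under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]

def sequence_cross_entropy(logit_rows, targets):
    """Average token-level cross-entropy over a sequence of predictions."""
    losses = [token_cross_entropy(row, t) for row, t in zip(logit_rows, targets)]
    return sum(losses) / len(losses)

def mean_squared_error(estimate, reference):
    """Waveform-level loss: average squared sample difference."""
    return sum((a - b) ** 2 for a, b in zip(estimate, reference)) / len(reference)

# Usage: a uniform guess over 4 tokens costs log(4); a confident correct guess costs less.
print(token_cross_entropy([0.0, 0.0, 0.0, 0.0], 2))
print(token_cross_entropy([5.0, 0.0, 0.0, 0.0], 0))
print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))
```

Cross-entropy scores a distribution over discrete tokens, so it penalizes choosing the wrong token rather than small sample-level deviations, which is closer to how bit errors are counted.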

Weaknesses

The model is a relatively straightforward adaptation of existing architectures (SoundStream tokenizer + transformer) to an RF dataset. Architectural, theoretical, or algorithmic innovations are limited. Evaluation is restricted to the MIT RF Challenge dataset. The paper provides no quantitative evidence on why each design choice (tokenization depth, transformer depth, etc.) matters, making the results difficult to interpret. Assertions of robustness to unseen interference are based on synthet

Reviewer 02 · Rating 4 · Confidence 4

Strengths

1. The paper steps outside the well-trodden domain of audio source separation and applies modern sequence-to-sequence modeling to the more constrained problem of RF signal separation. The domain is very intriguing and the success is measured by the unforgiving metric of Bit Error Rate (BER), not perceptual audio quality. By successfully adapting these advanced architectures to this field, the authors bridge a critical gap between mainstream deep learning and a specialized engineering domain with

Weaknesses

The paper has some weaknesses, which I will list in decreasing order of significance. 1. The authors should consider situating their contributions with proper citations of the previous methods in the field. The first known published method to perform separation of any signal in some latent continuous domain for neural networks was proposed in [D] and, more similar to the contribution of this paper, the first method to formulate the separation problem to a classification-like probl

Reviewer 03 · Rating 6 · Confidence 3

Strengths

The method showcases very good empirical results on common benchmarks in the wireless communications domain, surpassing current state-of-the-art models of the latest ICASSP challenge on wireless signal source separation. I also appreciate the effort put into making the model real-time, since this is the typical real-world use-case. Finally, the experiments on additive Gaussian interference and the ablations are well received, showcasing improvement over classic baselines such as matched filteri

Weaknesses

My thought on reading this paper is that it would rank low on novelty, since variants of the proposed method (conditional generation via autoregressive transformers for signal processing) are becoming more ubiquitous, for example in audio source separation [1] and accompaniment generation [2]. Also, [3] should be mentioned as being the first method to perform source separation in a quantised autoencoder domain via autoregressive transformers. Nevertheless in the application domain of wireless communica

Taxonomy

Topics: Pulsars and Gravitational Waves Research · Sparse and Compressive Sensing Techniques · Speech and Audio Processing