SONIC: Spectral Oriented Neural Invariant Convolutions
Gijs Joppe Moens, Regina Beets-Tan, Eduardo H. P. Pooch

TL;DR
SONIC introduces a spectral convolutional approach that captures global context and orientation selectivity, improving robustness and efficiency over traditional CNNs and ViTs across various vision tasks.
Contribution
It proposes a novel spectral parameterisation for convolutions using shared, orientation-selective components, enabling global receptive fields with fewer parameters.
Findings
Enhanced robustness to geometric transformations and noise
Matches or exceeds performance of existing architectures
Uses significantly fewer parameters
Abstract
Convolutional Neural Networks (CNNs) rely on fixed-size kernels scanning local patches, which limits their ability to capture global context or long-range dependencies without very deep architectures. Vision Transformers (ViTs), in turn, provide global connectivity but lack spatial inductive bias, depend on explicit positional encodings, and remain tied to the initial patch size. Bridging these limitations requires a representation that is both structured and global. We introduce SONIC (Spectral Oriented Neural Invariant Convolutions), a continuous spectral parameterisation that models convolutional operators using a small set of shared, orientation-selective components. These components define smooth responses across the full frequency domain, yielding global receptive fields and filters that adapt naturally across resolutions. Across synthetic benchmarks, large-scale image…
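To make the idea of a continuous spectral parameterisation concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names (`oriented_response`, `spectral_filter`), the Gaussian band-pass form, and the per-component parameters (`weight`, `theta`, `scale`, `damping`) are all assumptions chosen for illustration. It shows only the general mechanism the abstract describes: a small set of oriented components defines a smooth response over the whole frequency plane, and that same response can be sampled on the FFT grid of any input resolution.

```python
import numpy as np

def oriented_response(fx, fy, theta, scale, damping):
    """Hypothetical smooth frequency response of one oriented component.

    A Gaussian band-pass centred at radial frequency `scale` along the
    direction `theta`, with width `damping`. It is even in (fx, fy), so
    the corresponding spatial kernel is real-valued.
    """
    # Rotate frequency coordinates so that u runs along the chosen direction.
    u = fx * np.cos(theta) + fy * np.sin(theta)
    v = -fx * np.sin(theta) + fy * np.cos(theta)
    return np.exp(-((np.abs(u) - scale) ** 2 + v ** 2) / (2.0 * damping ** 2))

def spectral_filter(image, components):
    """Apply a sum of oriented components as a global (circular) convolution.

    `components` is a list of (weight, theta, scale, damping) tuples.
    Frequencies are measured in cycles per unit length on the domain
    [0, 1) x [0, 1), so the same parameters apply at any grid size.
    """
    n_y, n_x = image.shape
    fy = np.fft.fftfreq(n_y, d=1.0 / n_y)[:, None]  # column of vertical frequencies
    fx = np.fft.fftfreq(n_x, d=1.0 / n_x)[None, :]  # row of horizontal frequencies
    response = sum(w * oriented_response(fx, fy, th, s, d)
                   for w, th, s, d in components)
    # Pointwise multiplication in the frequency domain is a convolution
    # whose kernel spans the entire image.
    return np.fft.ifft2(np.fft.fft2(image) * response).real

def make_wave(n, cycles=3):
    """Test image: a horizontal cosine with `cycles` periods across the domain."""
    x = np.arange(n) / n
    return np.tile(np.cos(2.0 * np.pi * cycles * x), (n, 1))

# Two illustrative components; the values are arbitrary.
components = [(1.0, 0.0, 3.0, 1.0), (0.5, np.pi / 4, 5.0, 2.0)]
low = spectral_filter(make_wave(32), components)
high = spectral_filter(make_wave(64), components)
```

In this sketch the filter is described by four numbers per component regardless of image size, and for a band-limited input the 64×64 output, subsampled by two, agrees with the 32×32 output. That is the sense in which a continuous frequency response adapts across resolutions; how SONIC itself parameterises and learns its components is described in the paper, not here.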
Peer Reviews
Decision·ICLR 2026 Poster
- The paper overall is easy to follow.
- My main concern is the unclear comparisons with the existing solutions. In Section 2 (line 149), the authors mentioned some limitations of previous solutions like GFNet and FNO. However, there is no further study to show how the proposed method overcomes these limitations and why these limitations are important. Besides, there is also no direct comparison with these methods in the experiments. For example, there are two limitations of GFNet mentioned: "the FFT grid is tied to the input resolut
The paper is well written, very well motivated and easy to follow. The presented idea is novel and offers an interesting concept, bringing core elements of state-space systems into the frequency domain. The fact that the learnable filters are formulated in a continuous parameterization is particularly noteworthy. This opens the door to solving many sampling-related problems in current network designs (aliasing causing low robustness, limitation to fixed input sizes, low robustness
There are several aspects in which the paper could be improved: 1) the paper mentions several previous approaches to Fourier-domain feature extraction (page 3 bottom), but does not compare to these methods in the experiments or in terms of computational complexity; 2) the authors do not discuss or compare to [1], another Fourier-domain approach to efficient large-kernel implementations; 3) the experiments comparing to CNNs do not report the kernel size used (also not in the appendix). H
1. The paper presents a spectral framework for multidimensional signals that offers global receptive fields, complete convolutional capability, and built-in resolution invariance. It provides a lightweight and flexible foundation for building scalable, adaptable vision models. 2. The paper presents comprehensive empirical validation across both synthetic and real-world settings.
1. The innovations mentioned in the abstract and contributions are mainly about unifying existing convolution kernels, spectral filtering, and state-space kernels under one spectral framework. However, the idea of parameterizing operators in the frequency or linear domain already exists in models such as S4ND, GFNet, FNO, and Mamba. The so-called directional modes only add a few interpretable parameters (e.g., direction, scale, damping) to the frequency response function, but in essence, it is s
Taxonomy
Topics: Face Recognition and Perception · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
