How Big Should a Wireless Foundation Model Be?
Wei-Lun Cheng, Wanjiun Liao

TL;DR
This paper establishes that the optimal size of wireless foundation models is limited by physical channel constraints, with diminishing returns beyond a certain scale, and highlights the effectiveness of test-time training for adaptation.
Contribution
It introduces a physics-grounded scaling framework for wireless AI models based on channel intrinsic dimensionality, challenging the notion that larger models always yield better performance.
Findings
Channel intrinsic dimensionality spans 5-35, orders of magnitude below the ~1,000-dimensional semantic space of language.
Scaling gains diminish rapidly beyond approximately 30 million parameters.
Test-time training significantly improves the performance of smaller models.
Abstract
Wireless foundation models are rapidly emerging as a key enabler of AI-native communication systems, yet a fundamental question remains unanswered: how large should these models be? We present a principled, physics-grounded answer, showing that the intrinsic dimensionality (dNL, the nonlinear manifold dimension of the channel) acts as the fundamental bottleneck, defining the scaling ceiling once a data-sufficient regime is reached. This dimensionality is not a design choice but a physical constraint: Maxwell's equations, finite scatterers, and antenna aperture inherently constrain wireless propagation environments to a limited number of degrees of freedom -- spanning 5-35 across both real-world OTA measurements and 3GPP-standardized channel models we evaluate -- orders of magnitude below the ~1,000-dimensional semantic space of language. As a consequence, we propose a scaling framework…
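To make the notion of intrinsic dimensionality concrete, here is a minimal sketch of how a nonlinear manifold dimension can be estimated from samples. It is an illustration only: the abstract does not say which estimator the authors use for dNL, so the maximum-likelihood variant of the TwoNN nearest-neighbor estimator is swapped in here as an assumption. The function names `twonn_intrinsic_dim` and `synthetic_manifold`, the toy data generator, and all sizes are hypothetical and stand in for real channel measurements.

```python
import numpy as np


def twonn_intrinsic_dim(x):
    """Maximum-likelihood TwoNN estimate of intrinsic dimension.

    x: array of shape (n_samples, ambient_dim).
    Assumption: this estimator is chosen for illustration; it is not
    confirmed to be the one used in the paper.
    """
    # Squared pairwise Euclidean distances via the Gram-matrix identity.
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)          # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)      # exclude each point's distance to itself

    # Distances to the first and second nearest neighbors of every point.
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]

    # Drop exact duplicates (r1 == 0), for which the ratio is undefined.
    keep = r1 > 0
    mu = r2[keep] / r1[keep]

    # Under the TwoNN model, mu is Pareto-distributed with shape d,
    # whose maximum-likelihood estimate is n / sum(log(mu)).
    return mu.size / np.sum(np.log(mu))


def synthetic_manifold(n, latent_dim, ambient_dim, rng):
    """Toy data (hypothetical): a latent_dim-dimensional manifold embedded
    nonlinearly in ambient_dim dimensions. Not a channel model."""
    z = rng.standard_normal((n, latent_dim))
    a = rng.standard_normal((latent_dim, ambient_dim)) / np.sqrt(latent_dim)
    return np.tanh(z @ a)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = synthetic_manifold(2000, latent_dim=5, ambient_dim=64, rng=rng)
    print(f"ambient dimension: {x.shape[1]}")
    print(f"estimated intrinsic dimension: {twonn_intrinsic_dim(x):.2f}")
```

On this toy data the estimate should land near the latent dimension rather than the ambient one, which is the sense in which a channel observed through many antenna or subcarrier coefficients can still have only a handful of degrees of freedom.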
