WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications
Xin Li, Mengbing Liu, Li Wei, Jiancheng An, Mérouane Debbah, Chau Yuen

TL;DR
WirelessMathBench is a new benchmark designed to evaluate LLMs on complex mathematical modeling tasks specific to wireless communications, revealing current models' limitations in domain-specific reasoning.
Contribution
We introduce WirelessMathBench, a comprehensive benchmark with 587 questions from research papers to assess LLMs' capabilities in wireless communication mathematics.
Findings
LLMs perform poorly on complex equation reconstruction tasks.
DeepSeek-R1 achieves only 38.05% average accuracy on the benchmark.
Current LLMs struggle with domain-specific mathematical reasoning.
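The findings above report a single "average accuracy" figure. As a minimal sketch of how such a figure can be computed from per-question results, the Python below shows two common conventions: a macro average (the mean of per-task accuracies) and a micro average (the accuracy over all questions pooled). Which convention the paper uses is not stated here, so both are shown. The task names, function names, and toy data are hypothetical illustrations, not values from the benchmark.

```python
from collections import defaultdict


def accuracy_by_task(results):
    """Return {task_type: accuracy} from (task_type, is_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for task_type, is_correct in results:
        totals[task_type] += 1
        correct[task_type] += int(bool(is_correct))
    return {task: correct[task] / totals[task] for task in totals}


def macro_average(per_task):
    """Mean of per-task accuracies; every task type counts equally."""
    return sum(per_task.values()) / len(per_task)


def micro_average(results):
    """Accuracy over all questions pooled; larger task types weigh more."""
    results = list(results)
    return sum(int(bool(ok)) for _, ok in results) / len(results)


# Hypothetical toy results, NOT data from the paper.
toy_results = [
    ("multiple_choice", True),
    ("multiple_choice", True),
    ("multiple_choice", False),
    ("multiple_choice", True),
    ("partial_completion", True),
    ("partial_completion", False),
    ("full_completion", False),
    ("full_completion", False),
]

per_task = accuracy_by_task(toy_results)
macro = macro_average(per_task)
micro = micro_average(toy_results)
```

The two averages differ whenever task types have unequal sizes or difficulty, so a reported average is easier to interpret alongside the per-task breakdown.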
Abstract
Large Language Models (LLMs) have achieved impressive results across a broad array of tasks, yet their capacity for complex, domain-specific mathematical reasoning, particularly in wireless communications, remains underexplored. In this work, we introduce WirelessMathBench, a novel benchmark specifically designed to evaluate LLMs on mathematical modeling challenges in wireless communications engineering. Our benchmark consists of 587 meticulously curated questions sourced from 40 state-of-the-art research papers, encompassing a diverse spectrum of tasks ranging from basic multiple-choice questions to complex equation completion tasks, including both partial and full completions, all of which rigorously adhere to physical and dimensional constraints. Through extensive experimentation with leading LLMs, we observe that while many models excel in basic recall tasks, their performance…
Taxonomy
Topics: Cooperative Communication and Network Coding · IPv6, Mobility, Handover, Networks, Security · Wireless Networks and Protocols
Methods: Fast Attention Via Positive Orthogonal Random Features · Performer
