MERLIN: Building Low-SNR Robust Multimodal LLMs for Electromagnetic Signals
Junyu Shen, Zhendong She, Chenghanyu Zhang, Yuchuang Sun, Luqing Luo, Dingwei Tan, Zonghao Guo, Bo Guo, Zehua Han, Wupeng Xie, Yaxin Mu, Peng Zhang, Peipei Li, Fengxiang Wang, Yangang Sun, Maosong Sun

TL;DR
This paper introduces MERLIN, a framework for building multimodal large language models in the electromagnetic domain. It addresses data scarcity, the lack of benchmarks, and low-SNR robustness, and is accompanied by the EM-100k dataset and the EM-Bench benchmark.
Contribution
The paper presents the EM-100k dataset, the EM-Bench benchmark, and the MERLIN training framework, which together advance EM signal-to-text modeling and improve robustness in low-SNR conditions.
Findings
MERLIN achieves state-of-the-art performance on EM-Bench.
MERLIN demonstrates remarkable robustness in low-SNR environments.
The EM-100k dataset enables large-scale EM signal-text pretraining.
Abstract
The paradigm of Multimodal Large Language Models (MLLMs) offers a promising blueprint for advancing the electromagnetic (EM) domain. However, prevailing approaches often deviate from the native MLLM paradigm, instead using task-specific or pipelined architectures that lead to fundamental limitations in model performance and generalization. Fully realizing the MLLM potential in the EM domain requires overcoming three main challenges: (1) Data. The scarcity of high-quality datasets that pair EM signals with descriptive text annotations for MLLM pre-training; (2) Benchmark. The absence of comprehensive benchmarks to systematically evaluate and compare the performance of models on EM signal-to-text tasks; (3) Model. A fragility in low Signal-to-Noise Ratio (SNR) environments, where critical signal features can be obscured, leading to significant performance degradation. To…
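To make the low-SNR setting concrete, the sketch below corrupts a toy signal with additive white Gaussian noise at a chosen SNR, using the standard definition SNR(dB) = 10·log10(P_signal / P_noise). It is only an illustration of what "low SNR" means numerically: the function names `add_awgn` and `measured_snr_db`, the sinusoidal test signal, and the -5 dB setting are all assumptions made here, and nothing in it reflects how MERLIN, EM-100k, or EM-Bench actually generate or evaluate noisy signals.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to the target SNR in dB.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = rng or random.Random(0)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # Invert SNR(dB) = 10 * log10(P_signal / P_noise) to get the noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Toy real-valued sinusoid standing in for an EM signal (an assumption).
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(10000)]

# At -5 dB the noise carries roughly three times the power of the signal.
noisy = add_awgn(clean, snr_db=-5.0, rng=random.Random(42))
print(round(measured_snr_db(clean, noisy), 2))
```

At negative SNR values the noise power exceeds the signal power, which is the regime in which signal features become hard to recover. The measured SNR only approximates the target because the noise is a finite random sample.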
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Multimodal Machine Learning Applications
