NeuroLex: A Lightweight Domain Language Model for EEG Report Understanding and Generation

Kang Yin; Hye-Bin Shin

arXiv:2511.12851·cs.CL·November 18, 2025

Open Access

TL;DR

NeuroLex is a specialized lightweight language model trained exclusively on EEG reports, improving EEG report understanding and generation by capturing domain-specific language and reasoning patterns.

Contribution

The paper introduces NeuroLex, a domain-adaptive language model tailored to EEG reports that can serve as a standalone text model or as a decoder backbone for multimodal EEG-language systems.

Findings

01. Lower perplexity than general models
02. Higher accuracy in report extraction and summarization
03. Improved robustness to negation and hallucination

Abstract

Clinical electroencephalogram (EEG) reports encode domain-specific linguistic conventions that general-purpose language models (LMs) fail to capture. We introduce NeuroLex, a lightweight domain-adaptive language model trained purely on EEG report text from the Harvard Electroencephalography Database. Unlike existing biomedical LMs, NeuroLex is tailored to the linguistic and diagnostic characteristics of EEG reporting, enabling it to serve as both an independent textual model and a decoder backbone for multimodal EEG-language systems. Using span-corruption pretraining and instruction-style fine-tuning on report polishing, paragraph summarization, and terminology question answering, NeuroLex learns the syntax and reasoning patterns characteristic of EEG interpretation. Comprehensive evaluations show that it achieves lower perplexity, higher extraction and summarization accuracy, better…
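The span-corruption objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common T5-style formulation, in which each masked span is replaced by a sentinel token in the input and the target lists the sentinels followed by the tokens they hide. The function names, the `<extra_id_N>` sentinel naming (borrowed from the T5 tokenizer convention), the default noise settings, and the example sentence are all assumptions for illustration; the abstract does not state the exact configuration NeuroLex uses.

```python
import random


def sample_spans(n_tokens, noise_density=0.15, span_len=3, seed=0):
    """Sample sorted, non-overlapping (start, end) spans, end exclusive.

    Hypothetical helper: each position starts a span with probability
    noise_density / span_len, so roughly noise_density of tokens are masked.
    """
    rng = random.Random(seed)
    start_prob = noise_density / span_len
    spans, i = [], 0
    while i < n_tokens:
        if rng.random() < start_prob:
            end = min(i + span_len, n_tokens)
            spans.append((i, end))
            i = end + 1  # keep at least one unmasked token between spans
        else:
            i += 1
    return spans


def span_corrupt(tokens, spans):
    """Build (inputs, targets) for one span-corruption training example."""
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # copy the unmasked prefix
        inputs.append(sentinel)              # one sentinel stands in for the span
        targets.append(sentinel)             # target names the sentinel...
        targets.extend(tokens[start:end])    # ...then the tokens it replaced
        cursor = end
    inputs.extend(tokens[cursor:])           # copy the unmasked tail
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return inputs, targets


# Invented report-like sentence, not taken from any dataset.
example_tokens = "the background shows a posterior dominant rhythm of 9 Hz".split()
example_inputs, example_targets = span_corrupt(example_tokens, [(1, 2), (4, 6)])
```

With the explicit spans above, the model would see the sentence with "background" and "posterior dominant" replaced by sentinels and would be trained to emit those missing tokens. Passing the output of `sample_spans(len(example_tokens))` instead of fixed spans gives a randomized variant.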

Taxonomy

Topics: EEG and Brain-Computer Interfaces · Topic Modeling · Machine Learning in Healthcare