CLEAN-MI: A Scalable and Efficient Pipeline for Constructing High-Quality Neurodata in Motor Imagery Paradigm
Dingkun Liu, Zhu Chen, Dongrui Wu

TL;DR
CLEAN-MI is a comprehensive pipeline that enhances the quality and standardization of large-scale EEG datasets for motor imagery BCIs, improving model training and classification accuracy.
Contribution
The paper introduces CLEAN-MI, a scalable pipeline that combines frequency band filtering, channel template selection, subject screening, and marginal distribution alignment to improve neurodata quality in MI-based BCIs.
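To make the filtering and selection steps concrete, here is a minimal sketch of channel template selection and band-pass filtering. It is not the authors' implementation. The template `["C3", "Cz", "C4"]`, the 8-30 Hz band, the function names, and the FFT-masking filter (swapped in for a conventional IIR/FIR filter to keep the example NumPy-only) are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical channel template; the summary does not list the actual channels.
TEMPLATE = ["C3", "Cz", "C4"]


def select_channels(trials, channel_names, template=TEMPLATE):
    """Keep only the template channels, in template order.

    trials: array of shape (n_trials, n_channels, n_samples).
    channel_names: names of the recording's channels, in array order.
    """
    index = {name: i for i, name in enumerate(channel_names)}
    missing = [ch for ch in template if ch not in index]
    if missing:
        raise ValueError(f"recording lacks template channels: {missing}")
    return trials[:, [index[ch] for ch in template], :]


def bandpass_fft(trials, fs, low=8.0, high=30.0):
    """Zero all FFT bins outside [low, high] Hz along the time axis.

    The 8-30 Hz default (mu and beta rhythms) is an assumed band,
    not a value taken from the paper.
    """
    n = trials.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(trials, axis=-1)
    spectrum[..., (freqs < low) | (freqs > high)] = 0.0
    return np.fft.irfft(spectrum, n=n, axis=-1)
```

Applying `select_channels` first gives every dataset the same channel layout, so recordings from different devices can be stacked into one array before filtering.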
Findings
Improved classification performance on multiple datasets.
Enhanced data quality and consistency across sources.
Systematic filtering reduces noise and heterogeneity.
Abstract
The construction of large-scale, high-quality datasets is a fundamental prerequisite for developing robust and generalizable foundation models in motor imagery (MI)-based brain-computer interfaces (BCIs). However, EEG signals collected from different subjects and devices are often plagued by low signal-to-noise ratio, heterogeneity in electrode configurations, and substantial inter-subject variability, posing significant challenges for effective model training. In this paper, we propose CLEAN-MI, a scalable and systematic data construction pipeline for constructing large-scale, efficient, and accurate neurodata in the MI paradigm. CLEAN-MI integrates frequency band filtering, channel template selection, subject screening, and marginal distribution alignment to systematically filter out irrelevant or low-quality data and standardize multi-source EEG datasets. We demonstrate the…
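The abstract names marginal distribution alignment without specifying a method in the text shown here. One common choice for EEG is Euclidean alignment, which whitens each subject's trials by the inverse square root of that subject's mean spatial covariance. The sketch below assumes that technique; the function name and array layout are illustrative, not taken from the paper.

```python
import numpy as np


def euclidean_align(trials):
    """Align one subject's trials so their mean spatial covariance is identity.

    trials: array of shape (n_trials, n_channels, n_samples).
    Assumes the mean covariance is positive definite.
    """
    n_samples = trials.shape[-1]
    # Per-trial spatial covariance, shape (n_trials, n_channels, n_channels).
    covs = trials @ trials.transpose(0, 2, 1) / n_samples
    ref = covs.mean(axis=0)
    # Inverse matrix square root via eigendecomposition of the symmetric mean.
    eigvals, eigvecs = np.linalg.eigh(ref)
    inv_sqrt = (eigvecs / np.sqrt(eigvals)) @ eigvecs.T
    # Broadcasts the (n_channels, n_channels) transform over all trials.
    return inv_sqrt @ trials
```

Running this per subject (or per session) before pooling gives all subjects the same mean covariance, which is one way to reduce the inter-subject variability the abstract describes.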
Taxonomy
Topics: Medical Imaging and Analysis · Neural Networks and Applications · Hand Gesture Recognition Systems
