Fusing Pretrained ViTs with TCNet for Enhanced EEG Regression
Eric Modesitt, Haicheng Yin, Williams Huang Wang, Brian Lu

TL;DR
This paper combines pretrained Vision Transformers (ViTs) with Temporal Convolutional Networks (TCNet) for EEG regression, reducing RMSE on the EEGEyeNet dataset from 55.4 to 51.8 and reporting up to 4.32x faster processing.
Contribution
It presents a new integrated model that leverages ViTs and TCNet for enhanced EEG analysis, optimizing patch construction for better speed-accuracy tradeoffs.
Findings
RMSE reduced from 55.4 to 51.8 on the EEGEyeNet dataset
Model achieves up to 4.32x faster processing speed
Sets a new benchmark in EEG regression accuracy
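As a quick check on the first finding, the relative size of the RMSE improvement can be computed directly. The sketch below is illustrative only: the helper names `rmse` and `relative_reduction` are hypothetical, and only the two RMSE values (55.4 and 51.8) come from the findings above.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # RMSE values as reported in the findings above.
    print(f"{relative_reduction(55.4, 51.8):.1%}")
```

With the reported values this works out to roughly a 6.5% relative reduction in RMSE.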
Abstract
The task of Electroencephalogram (EEG) analysis is paramount to the development of Brain-Computer Interfaces (BCIs). However, developing robust, useful BCIs depends heavily on the speed and accuracy with which BCIs can understand neural dynamics. In response to that goal, this paper details the integration of pre-trained Vision Transformers (ViTs) with Temporal Convolutional Networks (TCNet) to enhance the precision of EEG regression. The core of this approach lies in harnessing the sequential data processing strengths of ViTs along with the superior feature extraction capabilities of TCNet to significantly improve EEG analysis accuracy. In addition, we analyze how to construct optimal patches for the attention mechanism, balancing the tradeoff between speed and accuracy. Our results showcase a substantial improvement in regression…
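To make the patch-construction tradeoff concrete, here is a minimal sketch of splitting a channels-by-samples EEG array into non-overlapping patches (tokens). It is not the authors' implementation: the function names, the toy 4 x 8 signal, and the patch shapes are all assumptions chosen for illustration. It relies only on the general fact that standard self-attention compares every token with every other token, so larger patches mean fewer tokens and less attention work, at the cost of coarser resolution.

```python
def split_into_patches(signal, patch_rows, patch_cols):
    """Split a 2D list (channels x samples) into flattened, non-overlapping patches."""
    n_rows, n_cols = len(signal), len(signal[0])
    if n_rows % patch_rows or n_cols % patch_cols:
        raise ValueError("patch size must evenly divide the signal shape")
    patches = []
    for r in range(0, n_rows, patch_rows):
        for c in range(0, n_cols, patch_cols):
            # Flatten one patch in row-major order.
            patch = [signal[r + i][c + j]
                     for i in range(patch_rows)
                     for j in range(patch_cols)]
            patches.append(patch)
    return patches


def attention_pairs(num_tokens):
    """Number of token-to-token comparisons in standard self-attention."""
    return num_tokens ** 2


if __name__ == "__main__":
    # Hypothetical toy signal: 4 channels x 8 time samples.
    toy = [[ch * 10 + t for t in range(8)] for ch in range(4)]
    for shape in [(1, 2), (2, 4), (4, 8)]:
        tokens = split_into_patches(toy, *shape)
        print(shape, len(tokens), attention_pairs(len(tokens)))
```

In this toy setting, doubling both patch dimensions cuts the token count by four and the number of attention comparisons by sixteen, which is the kind of speed-versus-resolution balance the abstract refers to.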
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Neural Networks and Applications
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
