Attention Isn't All You Need for Emotion Recognition: Domain Features Outperform Transformers on the EAV Dataset
Anmol Guragain

TL;DR
This study shows that in small-scale multimodal emotion recognition, simple domain-specific features and modifications outperform complex attention mechanisms, which tend to overfit.
Contribution
The paper demonstrates that domain knowledge and simple feature engineering outperform sophisticated attention mechanisms on small emotion recognition datasets.
Findings
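Sophisticated attention mechanisms (M2) score 5 to 13 percentage points below baselines on this small dataset, owing to overfitting and destruction of pretrained features.
Domain-specific features improve accuracy: delta MFCCs lift the audio CNN from 61.9% to 65.56% (+3.66pp), and frequency-domain EEG features reach 67.62% (+7.62pp over the paper baseline).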
Pretraining and domain adaptation enhance model performance.
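As an illustration of what a frequency-domain EEG feature can look like, here is a minimal sketch that computes log band power per channel from a plain FFT periodogram. The listing does not say which frequency-domain features the paper uses, so the band edges in `BANDS`, the function name `band_powers`, and the periodogram estimator are all assumptions chosen for illustration, not the author's pipeline.

```python
import numpy as np

# Hypothetical band edges in Hz; the paper's actual choice is not stated here.
BANDS = {
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(eeg: np.ndarray, fs: float) -> np.ndarray:
    """Return log mean power per band for each channel.

    eeg: array of shape (n_channels, n_samples)
    fs:  sampling rate in Hz
    Output shape: (n_channels, len(BANDS))
    """
    n_samples = eeg.shape[1]
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # Simple periodogram estimate of the power spectral density.
    psd = np.abs(np.fft.rfft(eeg, axis=1)) ** 2 / (fs * n_samples)

    feats = []
    for low, high in BANDS.values():
        mask = (freqs >= low) & (freqs < high)
        feats.append(psd[:, mask].mean(axis=1))
    # Small constant avoids log(0) for silent bands.
    return np.log(np.stack(feats, axis=1) + 1e-12)


# Usage: two channels carrying a 10 Hz sine, sampled at 200 Hz for 2 s.
t = np.arange(400) / 200.0
signal = np.tile(np.sin(2 * np.pi * 10.0 * t), (2, 1))
features = band_powers(signal, fs=200.0)
```

A compact vector like this (one value per channel and band) is the kind of input a small CNN or linear classifier can fit without many training examples, which is the general argument the findings make for domain features on small datasets.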
Abstract
We present a systematic study of multimodal emotion recognition using the EAV dataset, investigating whether complex attention mechanisms improve performance on small datasets. We implement three model categories: baseline transformers (M1), novel factorized attention mechanisms (M2), and improved CNN baselines (M3). Our experiments show that sophisticated attention mechanisms consistently underperform on small datasets. M2 models scored 5 to 13 percentage points below baselines due to overfitting and destruction of pretrained features. In contrast, simple domain-appropriate modifications proved effective: adding delta MFCCs to the audio CNN improved accuracy from 61.9% to 65.56% (+3.66pp), while frequency-domain features for EEG achieved 67.62% (+7.62pp over the paper baseline). Our vision transformer baseline (M1) reached 75.30%, exceeding the paper's ViViT result (74.5%) through…
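The delta MFCCs mentioned in the abstract are first-order time derivatives of the MFCC frames. A minimal sketch of the standard regression-based delta follows, assuming the MFCCs are already computed as an array of shape (n_coeffs, n_frames). The function names `delta_features` and `stack_with_deltas` and the window half-width of 2 are illustrative assumptions; the paper's exact settings are not given in this listing.

```python
import numpy as np


def delta_features(feats: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based delta over time.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)

    feats: array of shape (n_coeffs, n_frames)
    width: half-width of the regression window (assumed value, not from the paper)
    """
    n_frames = feats.shape[1]
    # Repeat the first and last frames so every frame has neighbours.
    padded = np.pad(feats, ((0, 0), (width, width)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))

    delta = np.zeros_like(feats, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]
        behind = padded[:, width - n : width - n + n_frames]
        delta += n * (ahead - behind)
    return delta / denom


def stack_with_deltas(mfcc: np.ndarray) -> np.ndarray:
    """Concatenate static MFCCs with their deltas along the coefficient axis."""
    return np.concatenate([mfcc, delta_features(mfcc)], axis=0)


# Usage: a linear ramp has a constant slope of 1 away from the edges.
ramp = np.tile(np.arange(10.0), (3, 1))
ramp_delta = delta_features(ramp)
stacked = stack_with_deltas(ramp)
```

Stacking the deltas doubles the number of input channels to the audio model while adding no learned parameters to the feature extraction itself, which is why it counts as a simple, domain-appropriate modification.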
Taxonomy
Topics: Emotion and Mood Recognition · EEG and Brain-Computer Interfaces · Sentiment Analysis and Opinion Mining
