Cross-Temporal Attention Fusion (CTAF) for Multimodal Physiological Signals in Self-Supervised Learning
Arian Khorasani, Théophile Demazure

TL;DR
This paper introduces Cross-Temporal Attention Fusion (CTAF), a self-supervised method for aligning and fusing asynchronous multimodal physiological signals such as EEG and peripheral physiology. It improves cross-modal alignment while remaining competitive with the baseline on affect classification using few labels.
Contribution
It presents a time-aware fusion mechanism with an alignment-driven self-supervised objective specifically designed for asynchronous EEG and physiological signals, along with an evaluation protocol for alignment quality.
Findings
Higher cosine margins for matched pairs
Better cross-modal token retrieval within one second
Competitive accuracy and macro-F1 with few labels
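The first two findings refer to alignment quality rather than classification. As a rough illustration of what such measurements could look like, the sketch below computes a cosine margin (mean similarity of matched pairs minus mean similarity of mismatched pairs) and a retrieval hit rate under a one-second tolerance. The function names, the input layout, and the exact metric definitions are assumptions made for this example; the paper's own evaluation protocol may differ.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cosine_margin(eeg_emb, phys_emb):
    """Assumed definition: mean cosine of matched pairs (row i with row i)
    minus mean cosine of all mismatched pairs (row i with row j != i)."""
    sim = l2_normalize(eeg_emb) @ l2_normalize(phys_emb).T
    n = sim.shape[0]
    matched = np.trace(sim) / n
    mismatched = (sim.sum() - np.trace(sim)) / (n * (n - 1))
    return float(matched - mismatched)

def retrieval_within_tolerance(eeg_tok, eeg_t, phys_tok, phys_t, tol_s=1.0):
    """Assumed definition: for each EEG token, retrieve the most similar
    peripheral token and count a hit if their timestamps (in seconds)
    differ by at most tol_s."""
    sim = l2_normalize(eeg_tok) @ l2_normalize(phys_tok).T
    nearest = sim.argmax(axis=1)
    return float(np.mean(np.abs(phys_t[nearest] - eeg_t) <= tol_s))
```

Both functions take plain arrays: embeddings of shape (n, d) and timestamps of shape (n,). A larger margin and a higher hit rate would both indicate tighter cross-modal correspondence under these assumed definitions.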
Abstract
We study multimodal affect modeling when EEG and peripheral physiology are asynchronous, which most fusion methods ignore or handle with costly warping. We propose Cross-Temporal Attention Fusion (CTAF), a self-supervised module that learns soft bidirectional alignments between modalities and builds a robust clip embedding using time-aware cross attention, a lightweight fusion gate, and alignment-regularized contrastive objectives with optional weak supervision. On the K-EmoCon dataset, under leave-one-out cross-validation evaluation, CTAF yields higher cosine margins for matched pairs and better cross-modal token retrieval within one second, and it is competitive with the baseline on three-bin accuracy and macro-F1 while using few labels. Our contributions are a time-aware fusion mechanism that directly models correspondence, an alignment-driven self-supervised objective tailored to…
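To make the ingredients named in the abstract more concrete, here is a minimal sketch of cross attention with a temporal bias, applied in both directions, followed by a scalar fusion gate. Everything specific in it is an assumption for illustration: the linear time penalty `|t_q - t_k| / tau`, the absence of learned projections, mean pooling, and the gate form are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def time_aware_cross_attention(q_tok, q_t, kv_tok, kv_t, tau=1.0):
    """Query tokens attend to tokens of the other modality.
    Assumed time-awareness: logits are penalized by the timestamp gap."""
    d = q_tok.shape[-1]
    logits = q_tok @ kv_tok.T / np.sqrt(d)
    logits = logits - np.abs(q_t[:, None] - kv_t[None, :]) / tau
    weights = softmax(logits, axis=-1)       # soft alignment, one row per query
    return weights @ kv_tok, weights

def gated_fusion(eeg_ctx, phys_ctx, w_gate, b_gate=0.0):
    """Assumed gate: a sigmoid over the pooled, concatenated contexts
    mixes the two modality summaries into one clip embedding."""
    eeg_vec = eeg_ctx.mean(axis=0)
    phys_vec = phys_ctx.mean(axis=0)
    z = np.concatenate([eeg_vec, phys_vec]) @ w_gate + b_gate
    g = 1.0 / (1.0 + np.exp(-z))
    return g * eeg_vec + (1.0 - g) * phys_vec

# Toy usage with random tokens sampled at different rates (hypothetical shapes).
rng = np.random.default_rng(0)
eeg, eeg_t = rng.normal(size=(8, 16)), np.linspace(0.0, 4.0, 8)
phys, phys_t = rng.normal(size=(5, 16)), np.linspace(0.0, 4.0, 5)
eeg_ctx, align_eeg_to_phys = time_aware_cross_attention(eeg, eeg_t, phys, phys_t)
phys_ctx, align_phys_to_eeg = time_aware_cross_attention(phys, phys_t, eeg, eeg_t)
clip_embedding = gated_fusion(eeg_ctx, phys_ctx, w_gate=np.zeros(32))
```

The two attention maps play the role of soft bidirectional alignments: neither modality is resampled or warped onto the other's time grid, and the differing token counts (8 versus 5 here) are handled by the attention weights alone.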
Taxonomy
Topics: EEG and Brain-Computer Interfaces · Emotion and Mood Recognition · Time Series Analysis and Forecasting
