# ICTD: Combination of Improved CNN–Transformer and Enhanced Deep Canonical Correlation Analysis for Eye-Movement Emotion Classification

**Authors:** Cong Zhang, Xisheng Li, Jiannan Chi, Ming Cao, Qingfeng Gu, Jiahui Liu

PMC · DOI: 10.3390/brainsci16030330 · 2026-03-19

## TL;DR

This paper introduces ICTD, a new method combining CNNs and transformers with enhanced deep canonical correlation analysis to improve emotion classification using eye-movement data.

## Contribution

The paper introduces an improved CNN-transformer model with an enhanced deep canonical correlation analysis method and an incremental feature feedforward network for emotion classification.

## Key findings

- ICTD achieves 81.8% and 85.2% accuracy for three-category arousal and valence classification, respectively, on the eSEE-d dataset.
- The method reaches 91.2% accuracy for four-category emotion classification on the SEED-IV dataset.
- ICTD obtains 85.1% accuracy for five-category emotion classification on the SEED-V dataset.

## Abstract

What are the main findings?

This paper proposes a deep canonical correlation analysis method based on cosine similarity. It non-linearly transforms the feature vectors of different modalities into more strongly correlated feature vectors, improving the accuracy of emotion classification.

This paper proposes an incremental feature feedforward network (IFFN) that performs feature enhancement and simplification and replaces the feedforward network (FFN) in the original transformer module.
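To make the first finding concrete, here is a minimal sketch of a cosine-similarity objective between two non-linearly projected views. It is not the paper's formulation: the one-layer `tanh` projection, the names `view_a`, `view_b`, `w_a`, `w_b`, and the `1 - cosine` loss are all assumptions chosen for illustration, and classical DCCA optimizes a cross-covariance criterion rather than this per-sample score.

```python
import math
import random


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def project(x, weights):
    """Hypothetical one-layer non-linear map: tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine_correlation_loss(views_a, views_b, weights_a, weights_b):
    """Mean of (1 - cosine similarity) over paired samples of two views.

    Lower values mean the projected views point in more similar directions.
    """
    total = 0.0
    for xa, xb in zip(views_a, views_b):
        za = project(xa, weights_a)
        zb = project(xb, weights_b)
        total += 1.0 - cosine_similarity(za, zb)
    return total / len(views_a)


# Synthetic data only: 5 paired samples, view A has 4 features, view B has 3.
random.seed(0)
view_a = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
view_b = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
# Both views are projected into a shared 2-dimensional space.
w_a = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
w_b = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

loss = cosine_correlation_loss(view_a, view_b, w_a, w_b)
print(f"cosine correlation loss: {loss:.3f}")
```

In a trained model the projection weights would be learned by minimizing such a loss; here they are random, so the printed value only shows that the objective is well defined and bounded between 0 and 2.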

What are the implications of the main findings?

Cosine similarity depends on the direction of vectors rather than their magnitude, is less affected by outliers, and does not require the data to satisfy specific distribution assumptions. These properties suit eye-movement input data and make cosine similarity an appropriate basis for eye-movement-based emotion classification.
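The direction-versus-magnitude property can be checked with a small example. The sketch below compares cosine similarity with Euclidean distance on a vector and a scaled copy of it; the three feature values are made up for illustration and do not come from any of the datasets.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical feature vector and the same pattern recorded at 10x the scale.
features = [0.2, 0.5, 0.1]
scaled = [10 * x for x in features]

cos_sim = cosine_similarity(features, scaled)
distance = euclidean_distance(features, scaled)
print(f"cosine similarity:  {cos_sim:.6f}")
print(f"euclidean distance: {distance:.3f}")
```

Scaling leaves the cosine similarity at 1 (up to floating-point rounding) while the Euclidean distance grows with the scale factor, which is the sense in which cosine similarity ignores magnitude.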

Existing studies mostly rely on the statistical characteristics of the original data for emotion analysis. Moreover, they do not distinguish primary from secondary emotional features during computation, so the role of key features cannot be highlighted. Assigning higher weights to key features reduces the influence of indiscriminately input features and raises the relative importance of the key ones. This design addresses the lack of prioritization in feature importance and significantly improves the ability of eye-movement features to characterize emotional states.
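The idea of weighting key features more heavily can be sketched as follows. This is a generic softmax reweighting, not the paper's IFFN, whose internals are not described here; the feature names, feature values, and importance scores are hypothetical, and in a trained network the scores would be learned rather than set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reweight(features, importance_scores):
    """Scale each feature by its softmax importance.

    Weights are multiplied by the feature count so that uniform scores
    leave the features unchanged.
    """
    weights = softmax(importance_scores)
    n = len(features)
    return [n * w * f for w, f in zip(weights, features)]


# Hypothetical eye-movement features and hand-set importance scores.
names = ["pupil_diameter", "fixation_duration", "saccade_amplitude", "blink_rate"]
features = [1.0, 1.0, 1.0, 1.0]
scores = [2.0, 0.5, 0.5, 0.0]

weighted = reweight(features, scores)
for name, value in zip(names, weighted):
    print(f"{name}: {value:.3f}")
```

With equal input values, the feature given the highest score ends up with the largest output, while uniform scores would leave every feature as it was.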

Background/Objectives: Emotion classification based on eye-movement features has become a widely adopted approach due to the simplicity of data acquisition and the strong association between ocular responses and emotional states. However, existing emotion recognition methods still face several challenges, including the relatively weak correlation between eye-movement features and emotional labels and the fact that key features are not prominently represented.

Methods: To address the above limitations, this study proposes an improved CNN-transformer combined with an enhanced deep canonical correlation analysis network (ICTD). The proposed method first preprocesses and reconstructs raw eye-movement signals to extract informative features. Subsequently, convolutional neural networks (CNNs) and transformer architectures are employed to capture local and global features, respectively. In addition, an incremental feature feedforward network is incorporated to enhance the transformer, enabling the model to assign higher importance to salient feature information. Finally, the extracted representations are processed through deep canonical correlation analysis based on cosine similarity to generate classification outcomes.

Results: Experiments conducted on the SEED-IV, SEED-V, and eSEE-d datasets demonstrate that the proposed ICTD framework consistently outperforms baseline approaches and attains the best classification results among the compared methods: (1) on the eSEE-d dataset, three-category arousal and valence classification reaches 81.8% and 85.2%, respectively; (2) on the SEED-IV dataset, four-category emotion classification reaches 91.2%; and (3) on the SEED-V dataset, five-category emotion classification reaches 85.1%.

Conclusions: The proposed ICTD framework effectively improves feature representation and classification performance, showing strong potential for practical emotion recognition and physiological signal analysis.
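The split between local and global feature extraction described in the Methods can be illustrated with a toy example. The sketch below is an assumption-laden simplification rather than the ICTD architecture: it uses one fixed smoothing kernel instead of learned CNN filters, single-head self-attention with identity projections instead of a full transformer block, and it feeds the convolution output into attention, an ordering the abstract does not specify.

```python
import math


def conv1d(signal, kernel):
    """Valid 1D cross-correlation: each output sees only len(kernel) neighbours."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head dot-product self-attention with identity projections.

    Every output token is a weighted average over all input tokens,
    so each position can draw on the whole sequence.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        outputs.append([
            sum(weights[t] * tokens[t][c] for t in range(len(tokens)))
            for c in range(dim)
        ])
    return outputs


# Synthetic 1D signal standing in for a preprocessed eye-movement trace.
signal = [0.1, 0.4, 0.9, 0.3, 0.2, 0.8, 0.7, 0.1]
kernel = [0.25, 0.5, 0.25]  # hypothetical smoothing filter

local_features = conv1d(signal, kernel)      # local context: 3 samples each
tokens = [[value] for value in local_features]
global_features = self_attention(tokens)     # global context: all positions

print("local: ", [round(v, 3) for v in local_features])
print("global:", [round(row[0], 3) for row in global_features])
```

Each convolution output depends on three neighbouring samples only, whereas each attention output mixes information from every position, which is the local-versus-global distinction the Methods refer to.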

## Full-text entities

- **Diseases:** tenderness (MESH:D063806), injury to (MESH:D014947)
- **Chemicals:** HA (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13025195/full.md

---
Source: https://tomesphere.com/paper/PMC13025195