Batch Transformer Architecture: Case of Synthetic Image Generation for Emotion Expression Facial Recognition

Stanislav Selitskiy

arXiv:2511.11754 · cs.CV · November 18, 2025

TL;DR

This paper introduces a novel Batch Transformer architecture that emphasizes important feature dimensions to improve synthetic image generation for facial recognition, especially under makeup and occlusion conditions.

Contribution

It proposes a new Transformer variation that selectively attends to key features, reducing bottleneck size and enhancing synthetic face image generation for emotion expression recognition.

Findings

01. Improved synthetic image quality for facial recognition tasks.

02. Enhanced variability in limited datasets with makeup and occlusion.

03. Reduced computational bottleneck in Transformer architectures.

Abstract

A novel Transformer variant is proposed in the implicit sparse style. Unlike "traditional" Transformers, which attend to sequence or batch entities across their entire dimensionality, the proposed Batch Transformer attends to the "important" dimensions (primary components). This selection of "important" dimensions, or features, allows a significant reduction of the bottleneck size in encoder-decoder ANN architectures. The architecture is tested on synthetic image generation for the face recognition task using a makeup and occlusion data set, where it increases the variability of the limited original data.
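
The idea of attending to feature dimensions rather than to whole entities can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `dimension_attention`, the use of raw dot-product scores without learned projections, and the top-`k` selection rule are all assumptions made here for illustration only.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dimension_attention(x, k):
    """Hypothetical sketch: attention across feature dimensions of a batch.

    x: array of shape (batch, dims).
    k: number of "important" dimensions to keep (assumed selection rule).
    Returns (reduced, kept, weights).
    """
    batch, dims = x.shape
    # Treat each feature dimension as a token described by its values
    # across the batch, instead of treating each sample as a token.
    tokens = x.T                                   # (dims, batch)
    scores = tokens @ tokens.T / np.sqrt(batch)    # (dims, dims)
    weights = softmax(scores, axis=-1)             # each row sums to 1
    attended = (weights @ tokens).T                # (batch, dims)
    # Assumption: a dimension's importance is the total attention it receives.
    importance = weights.sum(axis=0)               # (dims,)
    kept = np.sort(np.argsort(importance)[-k:])    # indices of the top-k dimensions
    return attended[:, kept], kept, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                       # 8 samples, 16 features
reduced, kept, weights = dimension_attention(x, k=4)
print(reduced.shape)                               # (8, 4)
```

In this sketch the bottleneck shrinks from 16 to 4 features per sample because only the selected dimensions are passed on; how the paper scores and selects dimensions may differ.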

Taxonomy

Topics: Emotion and Mood Recognition · Face and Expression Recognition · Face recognition and analysis