Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition
Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, and Georgios Tzimiropoulos

TL;DR
This paper introduces a probabilistic, neural process-based approach for emotion and facial expression recognition that models temporal context uncertainty and improves accuracy across multiple datasets.
Contribution
It proposes a novel method combining neural processes with task-specific temporal context modeling and selection, addressing limitations of existing recurrent and self-attention models.
Findings
Consistent improvement over strong baselines
Outperforms state-of-the-art methods on multiple datasets
Effective modeling of temporal context uncertainty
Abstract
Temporal context is key to the recognition of expressions of emotion. Existing methods that rely on recurrent or self-attention models to enforce temporal consistency work at the feature level, ignore task-specific temporal dependencies, and fail to model context uncertainty. To alleviate these issues, we build upon the framework of Neural Processes to propose a method for apparent emotion recognition with three key novel components: (a) probabilistic contextual representation with a global latent variable model; (b) temporal context modelling using task-specific predictions in addition to features; and (c) smart temporal context selection. We validate our approach on four databases, two for Valence and Arousal estimation (SEWA and AffWild2), and two for Action Unit intensity estimation (DISFA and BP4D). Results show a consistent improvement over a series of strong baselines as…
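Components (a) and (b) of the abstract can be illustrated with a toy, neural-process-style forward pass. This is only a sketch under stated assumptions, not the authors' architecture: the weights are random and untrained, the layer sizes are arbitrary, and every name (`encode_context`, `decode_targets`, `w_enc`, and so on) is hypothetical. Context frames contribute both their features and their task predictions, these are pooled into the mean and standard deviation of a single global latent variable, and sampling that latent several times gives a spread of predictions for the target frames. Component (c), context selection, is not modelled here; the context frames are simply given.

```python
import math
import random
import statistics

def matvec(vec, mat):
    """Row vector times matrix, with the matrix stored as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]

def softplus(x):
    """Numerically stable log(1 + exp(x)); used to keep the std positive."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def rand_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def encode_context(ctx_feats, ctx_preds, w_enc, w_mu, w_sigma):
    """Map a set of context frames to the mean and std of a global latent z."""
    # Pair each frame's features with its task prediction (list concatenation).
    hidden = [[math.tanh(h) for h in matvec(f + p, w_enc)]
              for f, p in zip(ctx_feats, ctx_preds)]
    # Mean pooling makes the summary independent of the order of context frames.
    pooled = [sum(col) / len(hidden) for col in zip(*hidden)]
    mu = matvec(pooled, w_mu)
    sigma = [0.1 + 0.9 * softplus(s) for s in matvec(pooled, w_sigma)]
    return mu, sigma

def decode_targets(tgt_feats, z, w_dec):
    """Predict task outputs for target frames, conditioned on one latent sample."""
    return [[math.tanh(o) for o in matvec(f + z, w_dec)] for f in tgt_feats]

rng = random.Random(0)
# Hypothetical sizes; d_out = 2 stands in for valence and arousal.
d_feat, d_out, d_hid, d_z = 8, 2, 16, 4
n_ctx, n_tgt, n_samples = 5, 10, 20

w_enc = rand_matrix(rng, d_feat + d_out, d_hid)
w_mu = rand_matrix(rng, d_hid, d_z)
w_sigma = rand_matrix(rng, d_hid, d_z)
w_dec = rand_matrix(rng, d_feat + d_z, d_out)

ctx_feats = rand_matrix(rng, n_ctx, d_feat)
ctx_preds = [[math.tanh(v) for v in row] for row in rand_matrix(rng, n_ctx, d_out)]
tgt_feats = rand_matrix(rng, n_tgt, d_feat)

mu, sigma = encode_context(ctx_feats, ctx_preds, w_enc, w_mu, w_sigma)

# Each latent sample yields one coherent set of predictions for all target frames.
samples = []
for _ in range(n_samples):
    z = [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
    samples.append(decode_targets(tgt_feats, z, w_dec))

# Per-frame, per-output mean and spread across latent samples.
pred_mean = [[statistics.fmean(s[t][k] for s in samples) for k in range(d_out)]
             for t in range(n_tgt)]
pred_std = [[statistics.pstdev([s[t][k] for s in samples]) for k in range(d_out)]
            for t in range(n_tgt)]
```

In this sketch the spread in `pred_std` comes entirely from the uncertainty in the global latent, which is one simple way to read "context uncertainty"; a trained model would learn the encoder and decoder weights rather than draw them at random.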
