Audio-Visual Compound Expression Recognition Method based on Late Modality Fusion and Rule-based Decision

Elena Ryumina; Maxim Markitantov; Dmitry Ryumin; Heysem Kaya; Alexey Karpov

arXiv:2403.12687·cs.CV·April 1, 2024·1 cites

Open Access

TL;DR

This paper introduces a zero-shot audio-visual method for compound expression recognition that fuses modalities at the emotion probability level and uses rule-based decisions, achieving a 22.01% F1-score.

Contribution

The paper presents a novel zero-shot approach combining modality fusion and rule-based decision-making for compound expression recognition without task-specific training.

Findings

01. Achieved an F1-score of 22.01% on the C-EXPR-DB test subset.

02. Demonstrated potential as a basis for tools that annotate audio-visual emotional data.

03. Evaluated the method in multi-corpus training and cross-corpus validation setups.

Abstract

This paper presents the results of the SUN team for the Compound Expressions Recognition Challenge of the 6th ABAW Competition. We propose a novel audio-visual method for compound expression recognition. Our method relies on emotion recognition models that fuse modalities at the emotion probability level, while decisions regarding the prediction of compound expressions are based on predefined rules. Notably, our method does not use any training data specific to the target task, so the problem is treated as a zero-shot classification task. The method is evaluated in multi-corpus training and cross-corpus validation setups. The proposed method achieves an F1-score of 22.01% on the C-EXPR-DB test subset. Our findings from the challenge demonstrate that the proposed method can potentially form a basis for developing intelligent tools for annotating audio-visual data in the…
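
To make the two stages in the abstract concrete, here is a minimal Python sketch of probability-level (late) fusion followed by a rule-based compound decision. It is an illustration under stated assumptions, not the authors' implementation: the basic emotion labels, the compound class list, the equal modality weight `w_audio`, and the "sum the two constituent probabilities" rule are all hypothetical choices, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Tuple

# Basic emotion labels (assumed; the abstract does not list them).
EMOTIONS: List[str] = [
    "neutral", "happy", "sad", "surprised", "fearful", "disgusted", "angry",
]

# Hypothetical compound classes, each defined by two basic emotions.
COMPOUNDS: Dict[str, Tuple[str, str]] = {
    "fearfully surprised": ("fearful", "surprised"),
    "happily surprised": ("happy", "surprised"),
    "sadly surprised": ("sad", "surprised"),
    "disgustedly surprised": ("disgusted", "surprised"),
    "angrily surprised": ("angry", "surprised"),
    "sadly fearful": ("sad", "fearful"),
    "sadly angry": ("sad", "angry"),
}


def late_fusion(audio: Sequence[float], video: Sequence[float],
                w_audio: float = 0.5) -> List[float]:
    """Fuse per-modality emotion probabilities with a weighted average.

    The weight w_audio is an assumed hyperparameter, not a value from the paper.
    """
    if len(audio) != len(EMOTIONS) or len(video) != len(EMOTIONS):
        raise ValueError("each modality needs one probability per emotion")
    fused = [w_audio * a + (1.0 - w_audio) * v for a, v in zip(audio, video)]
    total = sum(fused)
    return [p / total for p in fused]  # renormalize so the result sums to 1


def compound_rule(probs: Sequence[float]) -> str:
    """Assumed rule: score each compound class by the summed probability
    of its two constituent emotions and return the highest-scoring class."""
    by_name = dict(zip(EMOTIONS, probs))
    scores = {name: by_name[a] + by_name[b] for name, (a, b) in COMPOUNDS.items()}
    return max(scores, key=scores.get)


# Made-up probabilities standing in for the outputs of pretrained models.
audio = [0.05, 0.10, 0.40, 0.05, 0.10, 0.05, 0.25]
video = [0.10, 0.05, 0.30, 0.10, 0.05, 0.05, 0.35]
prediction = compound_rule(late_fusion(audio, video))
print(prediction)  # prints: sadly angry
```

Because the compound decision only combines the outputs of models trained on basic emotions, no compound-labelled training data is needed, which is what makes the setting zero-shot.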

Taxonomy

Topics: Simulation and Modeling Applications