FAF: A novel multimodal emotion recognition approach integrating face, body and text
Zhongyu Fang, Aoyun He, Qihui Yu, Baopeng Gao, Weiping Ding, Tong Zhang, Lei Ma

TL;DR
This paper introduces a new multimodal emotion recognition approach that combines face, body, and text data, utilizing a large dataset and a novel framework to improve accuracy and provide an online prediction platform.
Contribution
The paper presents a new large multimodal emotion dataset and a 'Feature After Feature' framework for improved emotion recognition accuracy.
Findings
Achieved about 83.75% five-class classification accuracy with the proposed multimodal fusion method.
Improved recognition performance by 1.83%, 9.38%, and 21.62% over individual modalities.
Effectively utilized complementarity between modalities to enhance emotion recognition.
Abstract
Multimodal emotion analysis achieves better emotion recognition because it draws on more comprehensive emotional cues and on multimodal emotion datasets. In this paper, we develop a large multimodal emotion dataset, named the "HED" dataset, to facilitate the emotion recognition task, and accordingly propose a multimodal emotion recognition method. To improve recognition accuracy, a "Feature After Feature" framework is used to explore crucial emotional information from the aligned face, body and text samples. We employ various benchmarks to evaluate the "HED" dataset and compare their performance with that of our method. The results show that the five-class classification accuracy of the proposed multimodal fusion method is about 83.75%, and the performance is improved by 1.83%, 9.38%, and 21.62%, respectively, compared with that of the individual modalities. The complementarity between each channel is effectively…
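The abstract does not spell out how the "Feature After Feature" framework combines the three modalities, so the following is only a minimal sketch of generic feature-level fusion, not the authors' method. It assumes that face, body and text samples have already been encoded as fixed-length feature vectors, fuses them by plain concatenation, and scores five emotion classes with a linear layer followed by a softmax. The function names, feature sizes and random weights are all hypothetical and chosen purely for illustration.

```python
import math
import random

NUM_CLASSES = 5  # the abstract reports five-class accuracy


def fuse_features(face, body, text):
    """Fuse aligned per-modality features by concatenation.

    Assumption: simple concatenation stands in for the fusion step;
    the actual FAF design is not described in the abstract.
    """
    return list(face) + list(body) + list(text)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Linear classifier over the fused vector, returning class probabilities."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical feature vectors for one aligned sample (sizes are arbitrary).
rng = random.Random(0)
face = [rng.gauss(0, 1) for _ in range(8)]
body = [rng.gauss(0, 1) for _ in range(6)]
text = [rng.gauss(0, 1) for _ in range(4)]

fused = fuse_features(face, body, text)

# Untrained, randomly initialised parameters; a real model would learn these.
weights = [[rng.gauss(0, 0.1) for _ in fused] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probs = classify(fused, weights, biases)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Concatenation is the simplest way to let one classifier see cues from every channel at once, which is the kind of complementarity the abstract refers to; more elaborate schemes would weight or sequence the modalities differently.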
Taxonomy
Topics: Emotion and Mood Recognition
