SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

Zebang Cheng; Shuyuan Tu; Dawei Huang; Minghan Li; Xiaojiang Peng; Zhi-Qi Cheng; Alexander G. Hauptmann

arXiv:2408.10500·cs.MM·August 23, 2024

TL;DR

This paper introduces a multimodal emotion recognition system that combines Emotion-LLaMA with Conv-Attention, achieving state-of-the-art results in the MER2024 Challenge by improving annotation quality and multimodal fusion.

Contribution

The paper presents Conv-Attention, a novel hybrid framework for multimodal fusion, and leverages Emotion-LLaMA for high-quality annotation, advancing emotion recognition performance.

Findings

01. Achieved a weighted average F-score of 85.30% in the MER-NOISE track, ahead of the second- and third-place teams by 1.47% and 1.65%, respectively.

02. Improved average accuracy and recall by 8.52% over GPT-4V in the MER-OV track.

03. Secured the top score among large multimodal models in the MER-OV track.

Abstract

This paper presents our winning approach for the MER-NOISE and MER-OV tracks of the MER2024 Challenge on multimodal emotion recognition. Our system leverages the advanced emotional understanding capabilities of Emotion-LLaMA to generate high-quality annotations for unlabeled samples, addressing the challenge of limited labeled data. To enhance multimodal fusion while mitigating modality-specific noise, we introduce Conv-Attention, a lightweight and efficient hybrid framework. Extensive experimentation validates the effectiveness of our approach. In the MER-NOISE track, our system achieves a state-of-the-art weighted average F-score of 85.30%, surpassing the second- and third-place teams by 1.47% and 1.65%, respectively. For the MER-OV track, our utilization of Emotion-LLaMA for open-vocabulary annotation yields an 8.52% improvement in average accuracy and recall compared to GPT-4V,…
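For readers unfamiliar with the MER-NOISE metric, the following is a minimal sketch of a support-weighted average F-score, that is, per-class F1 averaged with weights proportional to each class's share of the ground-truth labels. It assumes the standard definition (the same one scikit-learn uses for `average="weighted"`); the challenge's own evaluation script is not reproduced on this page, and the function name `weighted_f1` and the emotion labels in the usage example are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)  # number of ground-truth samples per class
    total = len(y_true)
    score = 0.0

    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true samples of this class that were missed

        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)

        # Each class contributes in proportion to its ground-truth frequency.
        score += (n / total) * f1

    return score


# Hypothetical labels, for illustration only.
y_true = ["happy", "happy", "sad", "angry"]
y_pred = ["happy", "sad", "sad", "angry"]
print(f"{weighted_f1(y_true, y_pred):.4f}")
```

Because the weights follow class frequency, frequent emotion classes dominate the score, which is why it is usually reported alongside or instead of plain accuracy on imbalanced emotion datasets.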

Code & Models

Repositories

zebangcheng/emotion-llama · PyTorch · Official
