A Unified Framework for Emotion Recognition and Sentiment Analysis via Expert-Guided Multimodal Fusion with Large Language Models

Jiaqi Qiao; Xiujuan Xu; Xinran Li; Yu Liu

arXiv:2601.07565 · cs.CL · January 13, 2026

Open Access

TL;DR

This paper introduces EGMF, a unified multimodal framework leveraging expert-guided fusion and large language models to improve emotion recognition and sentiment analysis across multiple languages.

Contribution

The paper proposes a novel expert-guided multimodal fusion approach integrated with LLMs, enabling unified classification and regression for emotion and sentiment tasks.
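One way to picture "unified classification and regression" through text generation is a single parser that turns the model's generated string into either a label or a number. The sketch below is illustrative only: the label set, the score range of -3 to 3, and the function name `parse_output` are assumptions, not details taken from the paper.

```python
import re

# Hypothetical label set; the paper's actual emotion categories are not listed here.
EMOTION_LABELS = ("neutral", "joy", "sadness", "anger", "surprise", "fear", "disgust")

# Assumed sentiment range, following a common convention for sentiment-intensity datasets.
SCORE_MIN, SCORE_MAX = -3.0, 3.0


def parse_output(task, generated):
    """Map generated text to a label (emotion) or a clamped float (sentiment)."""
    text = generated.strip().lower()
    if task == "emotion":
        # Return the first known label (in EMOTION_LABELS order) found as a whole word.
        for label in EMOTION_LABELS:
            if re.search(rf"\b{label}\b", text):
                return label
        return None
    if task == "sentiment":
        # Take the first signed decimal number in the text.
        match = re.search(r"[-+]?\d+(?:\.\d+)?", text)
        if match is None:
            return None
        score = float(match.group())
        return max(SCORE_MIN, min(SCORE_MAX, score))
    raise ValueError(f"unknown task: {task}")
```

With this shape, the same generative model serves both tasks and only the prompt and the parsing rule differ, for example `parse_output("emotion", "The speaker feels joy.")` versus `parse_output("sentiment", "score: -1.8")`.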

Findings

01. Outperforms state-of-the-art on multiple bilingual benchmarks

02. Demonstrates strong cross-lingual robustness

03. Efficient fine-tuning with LoRA
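For readers unfamiliar with LoRA, the standard formulation keeps a pretrained weight matrix W frozen and learns a low-rank update, so a layer computes W x + (alpha / r) * B (A x). The sketch below shows that forward pass in plain Python; the matrix values, the helper names, and the choice of rank are illustrative assumptions, and nothing here reflects the paper's actual configuration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Standard LoRA forward pass: W x + (alpha / r) * B (A x).

    W: frozen weights, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)  # the rank is the number of rows in A
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]
```

Only A and B are updated during fine-tuning, which is why the number of trainable parameters is small relative to W when r is much smaller than the layer dimensions. With B initialized to zeros, the layer starts out identical to the frozen model.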

Abstract

Multimodal emotion understanding requires effective integration of text, audio, and visual modalities for both discrete emotion recognition and continuous sentiment analysis. We present EGMF, a unified framework combining expert-guided multimodal fusion with large language models. Our approach features three specialized expert networks--a fine-grained local expert for subtle emotional nuances, a semantic correlation expert for cross-modal relationships, and a global context expert for long-range dependencies--adaptively integrated through hierarchical dynamic gating for context-aware feature selection. Enhanced multimodal representations are integrated with LLMs via pseudo token injection and prompt-based conditioning, enabling a single generative framework to handle both classification and regression through natural language generation. We employ LoRA fine-tuning for computational…
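The gating idea in the abstract can be sketched as a softmax-weighted mixture of expert outputs. This is a deliberately simplified, single-level gate, not the hierarchical dynamic gating the paper describes, and the function names and toy vectors are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(expert_outputs, gate_scores):
    """Blend expert feature vectors using softmax gate weights.

    expert_outputs: one equal-length feature vector per expert
                    (e.g. local, semantic-correlation, global-context).
    gate_scores:    one raw score per expert; in a real model these would be
                    produced by a learned gating network from the input.
    """
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    fused = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    return fused, weights
```

Because the weights depend on the gate scores, which would themselves depend on the input, the mixture can lean on different experts for different utterances. The fused vector would then be projected into the LLM's embedding space before being injected as pseudo tokens, a step this sketch omits.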

Taxonomy

Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining · Multimodal Machine Learning Applications