TL;DR
AffectAgent is a multi-agent framework that improves multimodal emotion recognition through collaborative reasoning among agents, dynamic modality balancing, and retrieval-augmented fusion. It reports better results than existing methods on MER-UniBench.
Contribution
The paper introduces AffectAgent, a novel multi-agent retrieval-augmented approach with specialized agents and adaptive fusion techniques for enhanced emotion understanding.
Findings
Achieves superior performance on MER-UniBench.
Effectively balances modalities with MB-MoE.
Improves semantic completion with RAAF.
Abstract
LLM-based multimodal emotion recognition relies on static parametric memory and often hallucinates when interpreting nuanced affective states. In this paper, given that single-round retrieval-augmented generation is highly susceptible to modal ambiguity and therefore struggles to capture complex affective dependencies across modalities, we introduce AffectAgent, an affect-oriented multi-agent retrieval-augmented generation framework that leverages collaborative decision-making among agents for fine-grained affective understanding. Specifically, AffectAgent comprises three jointly optimized specialized agents, namely a query planner, an evidence filter, and an emotion generator, which collaboratively perform analytical reasoning to retrieve cross-modal samples, assess evidence, and generate predictions. These agents are optimized end-to-end using Multi-Agent Proximal Policy Optimization…
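The three-agent flow described in the abstract (plan a query, retrieve cross-modal samples, filter the evidence, generate a prediction) can be illustrated with a toy sketch. Everything here is an assumption for illustration only: the names `Sample`, `plan_query`, `retrieve`, `filter_evidence`, `generate_emotion`, and `recognize`, the cosine-similarity retrieval, the `min_score` threshold, and the majority vote are hypothetical stand-ins, not the paper's method. The actual agents are LLM-based and jointly trained with Multi-Agent Proximal Policy Optimization, which this sketch does not model.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt


@dataclass
class Sample:
    """A stored example in a hypothetical retrieval memory."""
    modality: str
    embedding: list[float]
    label: str


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def plan_query(inputs):
    """Toy query planner: average the per-modality embeddings into one query vector."""
    vectors = list(inputs.values())
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def retrieve(query, memory, k=3):
    """Return the k (score, sample) pairs most similar to the query."""
    scored = [(cosine(query, s.embedding), s) for s in memory]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def filter_evidence(scored, min_score=0.5):
    """Toy evidence filter: keep samples whose similarity meets the threshold."""
    return [sample for score, sample in scored if score >= min_score]


def generate_emotion(evidence, fallback="neutral"):
    """Toy emotion generator: majority vote over the labels of kept evidence."""
    if not evidence:
        return fallback
    return Counter(s.label for s in evidence).most_common(1)[0][0]


def recognize(inputs, memory):
    """Chain the three toy agents: plan, retrieve and filter, then generate."""
    query = plan_query(inputs)
    return generate_emotion(filter_evidence(retrieve(query, memory)))
```

Calling `recognize({"audio": [1.0, 0.0], "video": [0.8, 0.2]}, memory)` with a small list of `Sample` objects as `memory` returns the label that dominates the retained evidence, or the fallback label when no sample passes the filter. Keeping the filter as its own stage mirrors the abstract's separation of retrieval from evidence assessment.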
