MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition
Yang Yang, Xunde Dong, Yupeng Qiang

TL;DR
The paper introduces MSE-Adapter, a lightweight plugin that enables large language models to perform multimodal sentiment analysis and emotion recognition efficiently, without sacrificing their original capabilities.
Contribution
It proposes a novel lightweight plugin with a Text-Guide-Mixer module that aligns non-textual and textual modalities, enhancing multimodal analysis with minimal additional parameters.
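As a rough illustration of what "text-guided" alignment can mean, the sketch below projects a non-text feature vector into the text feature space and gates it element-wise with the text feature. This is a generic gating pattern under stated assumptions, not the paper's Text-Guide-Mixer implementation: the function names, dimensions, and weights are all hypothetical, and the summary above does not specify the module's internals.

```python
# Illustrative only: a generic text-guided gating step, NOT the paper's module.
# All names, dimensions, and weights here are hypothetical.

def linear(x, weight):
    """Multiply vector x (length n) by a weight matrix given as m rows of length n."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weight]


def text_guided_mix(text_vec, other_vec, proj):
    """Project a non-text feature into the text space, then gate it by the text feature."""
    projected = linear(other_vec, proj)
    if len(projected) != len(text_vec):
        raise ValueError("projection must map into the text feature dimension")
    # Element-wise product: the text feature scales each projected component.
    return [t * p for t, p in zip(text_vec, projected)]


# Toy inputs: a 3-dimensional text feature and a 2-dimensional audio feature.
text_vec = [1.0, 0.5, -1.0]
audio_vec = [2.0, 4.0]
proj = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical 3x2 projection

mixed = text_guided_mix(text_vec, audio_vec, proj)
```

In a setup like this, only the small projection would need training, which is one way an adapter can stay lightweight while the language model itself is left unchanged.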
Findings
Effective on multiple datasets in English and Chinese
Uses only about 2.6-2.8 million trainable parameters
Compatible with open-source LLMs like LLaMA2 and ChatGLM3
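To put the parameter count in perspective, the arithmetic below computes what share of all parameters would be trainable if the adapter were attached to a frozen backbone. The 2.6M and 2.8M figures come from the findings above; the 7-billion-parameter backbone size is an assumption chosen for illustration, since the summary does not state which model sizes were used.

```python
# Back-of-the-envelope share of trainable parameters.
# ADAPTER_RANGE comes from the summary; BACKBONE_PARAMS is a hypothetical size.

def trainable_fraction(adapter_params, backbone_params):
    """Fraction of all parameters that are trainable when the backbone is frozen."""
    return adapter_params / (adapter_params + backbone_params)


ADAPTER_RANGE = (2_600_000, 2_800_000)  # "about 2.6-2.8 million trainable parameters"
BACKBONE_PARAMS = 7_000_000_000         # assumed backbone size, not from the source

for n in ADAPTER_RANGE:
    share = trainable_fraction(n, BACKBONE_PARAMS)
    print(f"{n:,} trainable parameters -> {share:.4%} of the total")
```

Under this assumption the trainable share stays well below a tenth of a percent, which is consistent with the "lightweight" framing.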
Abstract
Current Multimodal Sentiment Analysis (MSA) and Emotion Recognition in Conversations (ERC) methods based on pre-trained language models exhibit two primary limitations: 1) Once trained for MSA and ERC tasks, these pre-trained language models lose their original generalized capabilities. 2) They demand considerable computational resources. As the size of pre-trained language models continues to grow, training larger multimodal sentiment analysis models using previous approaches could result in unnecessary computational cost. In response to this challenge, we propose the Multimodal Sentiment Analysis and Emotion Recognition Adapter (MSE-Adapter), a lightweight and adaptable plugin. This plugin enables a large language model (LLM) to carry out MSA or ERC tasks with minimal computational overhead (introducing only approximately 2.6M to 2.8M trainable…
Taxonomy
Topics: Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
Methods: ALIGN
