Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation

Jialong Mai; Xiaofen Xing; Yawei Li; Weidong Chen; Zhipeng Li; Jingyuan Xing; Xiangmin Xu

arXiv:2507.09076·cs.CL·September 25, 2025

TL;DR

This paper introduces a Dynamic Parameter Memory mechanism that enhances speech large language models with temporary LoRA modules, enabling effective processing of long audio sequences for emotion recognition in conversation, surpassing existing methods.

Contribution

The paper proposes a novel DPM mechanism that encodes sentence-level emotions into temporary LoRA modules, allowing SLLMs to handle unlimited-length audio sequences for ERC.
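The idea of a temporary LoRA module acting as memory can be illustrated with a toy sketch. The low-rank form (frozen weight plus a delta `B @ A`, with `B` initialized to zero) is standard LoRA. Everything else here is an assumption made for illustration: the class name `TemporaryLoRA`, the `memorize` method, and in particular the update rule (one gradient step on a squared error, applied to `B` only) are hypothetical and are not taken from the paper.

```python
import random


def matvec(M, x):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


class TemporaryLoRA:
    """Frozen weight W plus a low-rank delta B @ A kept for one conversation."""

    def __init__(self, W, rank=2, seed=0):
        self.W = W  # frozen base weight, never updated
        self.rank = rank
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Discard the memory: start a fresh adapter for a new conversation."""
        d_out, d_in = len(self.W), len(self.W[0])
        # Standard LoRA init: A small random, B zero, so the delta is zero.
        self.A = [[self.rng.gauss(0.0, 0.1) for _ in range(d_in)]
                  for _ in range(self.rank)]
        self.B = [[0.0] * self.rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + d for b, d in zip(base, delta)]

    def memorize(self, x, target, lr=0.5):
        """HYPOTHETICAL update: one SGD step on 0.5*||y - target||^2 w.r.t. B."""
        h = matvec(self.A, x)
        err = [y - t for y, t in zip(self.forward(x), target)]
        for i, e in enumerate(err):
            for j, hj in enumerate(h):
                self.B[i][j] -= lr * e * hj


W = [[1.0, 0.0], [0.0, 1.0]]
layer = TemporaryLoRA(W)
x = [1.0, 2.0]

before = layer.forward(x)             # adapter is empty: output equals W x
layer.memorize(x, target=[0.0, 0.0])  # write one "sentence" into the adapter
after = layer.forward(x)              # output now reflects the stored update
layer.reset()                         # conversation ends: memory is dropped
restored = layer.forward(x)
```

The point of the sketch is only the lifecycle: the base weights stay fixed, each processed sentence changes the adapter, and the adapter is thrown away once the conversation ends, so no single forward pass needs the whole history in its context window.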

Findings

1. DPM significantly improves emotion recognition accuracy on IEMOCAP.
2. The method achieves state-of-the-art performance in long-sequence emotion recognition.
3. DPM effectively memorizes contextual information across conversation turns.

Abstract

Recent research has focused on applying speech large language models (SLLMs) to improve speech emotion recognition (SER). However, the inherently high frame rate of the speech modality severely limits the signal processing and understanding capabilities of SLLMs. For example, an SLLM with a 4K context window can process only about 80 seconds of audio at a 50 Hz feature sampling rate before reaching its capacity limit. Input token compression methods used in SLLMs overlook the continuity and inertia of emotions across multiple conversation turns. This paper proposes a Dynamic Parameter Memory (DPM) mechanism with contextual semantics and sentence-level emotion encoding, enabling SLLMs to process unlimited-length audio with limited context windows. Specifically, DPM progressively encodes sentence-level information and emotions into a temporary LoRA module during inference to effectively "memorize" the…
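The 80-second figure in the abstract follows from dividing the context window by the feature rate. The sketch below reproduces that arithmetic, assuming one token per feature frame; the function name `max_audio_seconds` and the `reserved_tokens` parameter (tokens set aside for a prompt) are illustrative and do not come from the paper.

```python
def max_audio_seconds(context_tokens: int, frame_rate_hz: float,
                      reserved_tokens: int = 0) -> float:
    """Seconds of audio that fit in a context window at one token per frame."""
    if frame_rate_hz <= 0:
        raise ValueError("frame_rate_hz must be positive")
    usable = max(context_tokens - reserved_tokens, 0)
    return usable / frame_rate_hz


# "4K" read as 4096 tokens gives roughly 82 s; read as 4000 it gives 80 s.
print(max_audio_seconds(4096, 50))  # 81.92
print(max_audio_seconds(4000, 50))  # 80.0
```

Any tokens reserved for instructions or text context reduce the audio budget further, which is why a multi-turn conversation of several minutes does not fit in a single window at this frame rate.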

Peer Reviews

Decision·ICLR 2026 Conference Withdrawn Submission

Reviewer 01·Rating 6·Confidence 3

Strengths

1. The manuscript presents a novel inference method (DPM) that addresses long-sequence limits in LLMs. Methods of this kind are usually complex, and this one can be very helpful for emotion recognition in long conversations.
2. The method maintains contextual emotion continuity across dialogue turns, and its treatment of context is in-depth.
3. The paper demonstrates SOTA performance (e.g., 79.34% WF1), which is a strong point overall.
4. Elegant use of temporary LoRA for

Weaknesses

1. The evaluation looks fairly comprehensive but is limited to two datasets; it lacks real-world or multilingual validation.
2. The metrics are good, but there are no explicit latency or computational cost benchmarks.
3. The method depends heavily on sentence segmentation quality.
4. The analysis of failure or misclassification cases is limited.

Reviewer 02·Rating 2·Confidence 4

Strengths

1. The primary novelty of this work lies in its innovative application of LoRA not as a static fine-tuning method, but as a dynamic, temporal memory for extending the effective context of an SLLM. Instead of conventional approaches like input compression or sliding windows, which risk losing historical information, the authors propose to progressively encode the evolving conversational context directly into the LoRA parameters during inference. This reconceptualization of LoRA is really interest

Weaknesses

The paper has some weaknesses, which I list in roughly decreasing order of significance in the hope that this helps the authors fix them and improve the paper. 1. A primary concern is the marginal performance improvement when contextualized against the immense increase in model complexity. The reported 10-15% gain in weighted and unweighted accuracy over a four-emotion task on IEMOCAP is unimpressive when compared to results from over seven years

Reviewer 03·Rating 4·Confidence 5

Strengths

The problem statement and motivation behind the work are well introduced. The overall presentation of the paper and the visuals are clear.

Weaknesses

For a more comprehensive evaluation, I would suggest evaluating on more conversational-style datasets in addition to IEMOCAP and MELD. These could cover speakers of other languages and more varied conversational styles. In addition, a synthetic dataset could be used to show generalizability. I would also like to see the model's performance on individual discrete emotions, rather than only total accuracy or macro F1. I know the main scope of the work is ERC, but for S

Code & Models

Repositories

yongaifadian1/Dynamic-Parameter-Memory_DPM
PyTorch·Official

Taxonomy

Topics·Emotion and Mood Recognition