From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models

Na Liu; Liangyu Chen; Xiaoyu Tian; Wei Zou; Kaijiang Chen; Ming Cui

arXiv:2401.02777 · cs.CL · January 31, 2024 · 6 cites

Open Access

TL;DR

This paper presents RAISE, a memory-augmented architecture that improves the integration of large language models into conversational agents, enhancing context management and adaptability in multi-turn dialogues.

Contribution

It introduces RAISE, a novel framework with a dual-memory system that extends ReAct, enabling more controllable and context-aware conversational agents.

Findings

01 RAISE appears to improve agent controllability and adaptability in complex, multi-turn dialogues.

02 Preliminary evaluations in a real estate sales context suggest advantages over traditional agents.

03 The approach shows potential for broader applications in conversational AI.

Abstract

This paper introduces RAISE (Reasoning and Acting through Scratchpad and Examples), an advanced architecture enhancing the integration of Large Language Models (LLMs) like GPT-4 into conversational agents. RAISE, an enhancement of the ReAct framework, incorporates a dual-component memory system, mirroring human short-term and long-term memory, to maintain context and continuity in conversations. It entails a comprehensive agent construction scenario, including phases like Conversation Selection, Scene Extraction, CoT Completion, and Scene Augmentation, leading to the LLMs Training phase. This approach appears to enhance agent controllability and adaptability in complex, multi-turn dialogues. Our preliminary evaluations in a real estate sales context suggest that RAISE has some advantages over traditional agents, indicating its potential for broader applications. This work contributes to…
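The dual-component memory described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class names `Scratchpad` and `ExamplePool`, the function `build_prompt`, the word-overlap retrieval score, and the prompt layout are all hypothetical. The only idea taken from the abstract is that a short-term store (the scratchpad) and a long-term store (examples) both feed the context given to the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Scratchpad:
    """Short-term memory: notes gathered in the current conversation (hypothetical)."""
    notes: list[str] = field(default_factory=list)

    def add(self, note: str) -> None:
        self.notes.append(note)


@dataclass
class ExamplePool:
    """Long-term memory: stored example dialogues (hypothetical)."""
    examples: list[str] = field(default_factory=list)

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Toy relevance score (an assumption): count of lowercase words
        # shared between the query and each stored example.
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(e.lower().split())), e) for e in self.examples]
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        # Keep at most k examples that share at least one word with the query.
        return [example for score, example in ranked[:k] if score > 0]


def build_prompt(user_turn: str, scratchpad: Scratchpad, pool: ExamplePool) -> str:
    """Assemble one LLM prompt from both memories (the layout is an assumption)."""
    lines = ["# Examples"]
    lines += pool.retrieve(user_turn)
    lines += ["# Scratchpad"]
    lines += scratchpad.notes
    lines += ["# User", user_turn]
    return "\n".join(lines)


if __name__ == "__main__":
    pad = Scratchpad()
    pad.add("Client budget: under 500k")
    pool = ExamplePool([
        "Buyer asks about the price of a two-bedroom flat",
        "Buyer asks about school districts nearby",
    ])
    print(build_prompt("price of the flat", pad, pool))
```

In this sketch the scratchpad grows turn by turn within one conversation, while the example pool stays fixed across conversations; a real system would likely replace the word-overlap score with an embedding-based retriever.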

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Dropout · Adam · Layer Normalization · Residual Connection · Absolute Position Encodings · Dense Connections · Position-Wise Feed-Forward Layer · Byte Pair Encoding