ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents

Zhigen Li; Jianxiang Peng; Yanmeng Wang; Yong Cao; Tianhao Shen; Minghui Zhang; Linxi Su; Shang Wu; Yihang Wu; Yuqian Wang; Ye Wang; Wei Hu; Jianfeng Li; Shaojun Wang; Jing Xiao; Deyi Xiong

arXiv:2407.03884·cs.CL·February 25, 2025·1 cites

TL;DR

This paper introduces ChatSOP, a framework that uses SOP-guided MCTS planning to improve the controllability of LLM-based dialogue agents, leading to more focused and effective conversations.

Contribution

It presents a novel SOP-guided MCTS framework, a curated SOP-annotated dialogue dataset, and a new method combining Chain of Thought reasoning with supervised fine-tuning for SOP prediction.

Findings

01

Achieved a 27.95% improvement in action accuracy over the GPT-3.5 baseline.

02

Demonstrated effectiveness on multiple models, including open-source variants.

03

Validated the approach with a publicly available dataset and code.

Abstract

Dialogue agents powered by Large Language Models (LLMs) show superior performance in various tasks. Despite the better user understanding and human-like responses, their lack of controllability remains a key challenge, often leading to unfocused conversations or task failure. To address this, we introduce Standard Operating Procedure (SOP) to regulate dialogue flow. Specifically, we propose ChatSOP, a novel SOP-guided Monte Carlo Tree Search (MCTS) planning framework designed to enhance the controllability of LLM-driven dialogue agents. To enable this, we curate a dataset comprising SOP-annotated multi-scenario dialogues, generated using a semi-automated role-playing system with GPT-4o and validated through strict manual quality control. Additionally, we propose a novel method that integrates Chain of Thought reasoning with supervised fine-tuning for SOP prediction and utilizes…
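As a rough illustration of the idea of SOP-guided MCTS planning, the sketch below runs a generic Monte Carlo Tree Search over dialogue actions, where a toy SOP graph restricts which action may follow which. It is not the paper's implementation: the `SOP` table, the action names, the `plan_next_action` helper, the random rollouts, and the reward (shorter SOP-legal paths to the goal score higher) are all assumptions made for illustration. In a real system an LLM would presumably propose and score actions instead.

```python
import math
import random

# Hypothetical SOP: each dialogue action maps to the actions allowed next.
SOP = {
    "greet": ["ask_order_id", "chitchat"],
    "chitchat": ["chitchat", "ask_order_id"],
    "ask_order_id": ["check_status"],
    "check_status": ["offer_refund", "close"],
    "offer_refund": ["close"],
    "close": [],
}
GOAL = "close"  # assumed terminal action that completes the task


class Node:
    def __init__(self, action, parent=None):
        self.action = action
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def untried(self):
        # SOP-legal next actions that have no child node yet.
        tried = {child.action for child in self.children}
        return [a for a in SOP[self.action] if a not in tried]


def ucb1(child, parent_visits, c=1.4):
    # Standard UCB1: mean reward plus an exploration bonus.
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def rollout(action, depth, rng, max_depth):
    # Random SOP-legal walk; reaching GOAL in fewer turns scores higher.
    while action != GOAL and depth < max_depth:
        action = rng.choice(SOP[action])
        depth += 1
    return 1.0 / (1 + depth) if action == GOAL else 0.0


def plan_next_action(current, iterations=500, max_depth=6, seed=0):
    rng = random.Random(seed)
    root = Node(current)
    for _ in range(iterations):
        node, depth = root, 0
        # Selection: descend through fully expanded nodes by UCB1.
        while not node.untried() and node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
            depth += 1
        # Expansion: add one SOP-legal child if the depth budget allows.
        options = node.untried()
        if options and depth < max_depth:
            child = Node(rng.choice(options), parent=node)
            node.children.append(child)
            node, depth = child, depth + 1
        # Simulation, then backpropagation up to the root.
        reward = rollout(node.action, depth, rng, max_depth)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Recommend the most-visited child of the root.
    return max(root.children, key=lambda ch: ch.visits).action


if __name__ == "__main__":
    print(plan_next_action("greet"))
```

Because every expansion and rollout step draws only from `SOP[action]`, the search never considers an action the procedure forbids, which is one simple way to read "SOP-guided" in this toy setting.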

Code & Models

Repositories

pca-anonymous/pca · Official

Videos

ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents · Underline

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems

Methods: Attention Is All You Need · Adam · Layer Normalization · GPT-3 · Cosine Annealing · Weight Decay · Linear Warmup With Cosine Annealing · Linear Layer