ChatShopBuddy: Towards Reliable Conversational Shopping Agents via Reinforcement Learning

Yiruo Cheng; Kelong Mao; Tianhao Li; Jiejun Tan; Ji-Rong Wen; Zhicheng Dou

arXiv:2603.06065·cs.IR·March 9, 2026

Open Access

TL;DR

This paper introduces ChatShopBuddy, a reinforcement learning framework for conversational shopping agents that optimizes multiple objectives, using a new benchmark, hierarchical reward modeling, and dynamic policy optimization to improve response quality and efficiency.

Contribution

It presents a comprehensive methodology including a new benchmark, hierarchical reward modeling, and dynamic policy optimization for training reliable conversational shopping agents.

Findings

01 RL-trained ChatShopBuddy outperforms larger models in stability.

02 The hierarchical reward model effectively captures complex quality requirements.

03 Dynamic contrastive policy optimization balances response quality and operational efficiency.

Abstract

Conversational shopping agents represent a critical consumer-facing application of Large Language Model (LLM)-powered agents, yet how to effectively apply post-training Reinforcement Learning (RL) to optimize such agents remains underexplored. This work investigates RL-based optimization for shopping agents in real-world scenarios, where agents must simultaneously satisfy multiple interdependent objectives spanning objective metrics (product correctness), subjective qualities (persuasiveness), outcome rewards (final response quality), and process rewards (tool efficiency). We present a complete methodology to address this challenge. Specifically, we first construct SmartShopBench, a benchmark that captures diverse shopping intents with a hierarchical evaluation that decomposes complex quality requirements into measurable levels. Building on this evaluation framework, we design…
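To make the four kinds of objectives named in the abstract concrete, the following is a minimal illustrative sketch of how a level-gated scalar reward could combine them. It is not the paper's reward model. The `Scores` fields, the function `hierarchical_reward`, the gating order, the equal quality weights, and the parameters `max_tool_calls` and `efficiency_weight` are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    """Hypothetical per-response signals (names are illustrative assumptions)."""
    product_correct: bool    # objective metric: did the agent recommend a valid product?
    persuasiveness: float    # subjective quality, assumed to lie in [0, 1]
    response_quality: float  # outcome reward, assumed to lie in [0, 1]
    tool_calls: int          # process signal: number of tool invocations used


def hierarchical_reward(s: Scores, max_tool_calls: int = 8,
                        efficiency_weight: float = 0.2) -> float:
    """Combine the signals level by level (an assumed scheme, not the authors')."""
    # Level 1 (gate): an incorrect product earns nothing, however persuasive.
    if not s.product_correct:
        return 0.0
    # Level 2: average the subjective and outcome qualities.
    quality = 0.5 * s.persuasiveness + 0.5 * s.response_quality
    # Level 3: fewer tool calls give a higher efficiency score, floored at 0.
    efficiency = max(0.0, 1.0 - s.tool_calls / max_tool_calls)
    return (1.0 - efficiency_weight) * quality + efficiency_weight * efficiency


if __name__ == "__main__":
    print(hierarchical_reward(Scores(True, 0.5, 0.5, 4)))
    print(hierarchical_reward(Scores(False, 1.0, 1.0, 0)))
```

Gating on correctness first reflects the idea that lower levels must be satisfied before higher-level qualities count. A weighted sum of all four signals would be a simpler alternative, but it would let persuasiveness compensate for a wrong product.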

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · AI in Service Interactions