Shopping Companion: A Memory-Augmented LLM Agent for Real-World E-Commerce Tasks
Zijian Yu, Kejun Xiao, Huaipeng Zhao, Tao Luo, Xiaoyi Zeng

TL;DR
This paper introduces a new benchmark and a unified memory-augmented LLM framework for e-commerce shopping tasks, improving long-term preference understanding and task performance in real-world scenarios.
Contribution
It presents a novel benchmark for long-term preference-aware shopping tasks and a unified framework that jointly handles memory retrieval and shopping assistance.
Findings
A lightweight LLM trained with our method outperforms baselines.
State-of-the-art models achieve under 70% success on the benchmark.
Our approach effectively captures user preferences over long-term interactions.
Abstract
In e-commerce, LLM agents show promise for shopping tasks such as recommendations, budgeting, and bundle deals, where accurately capturing user preferences from long-term conversations is critical. However, two challenges hinder realizing this potential: (1) the absence of benchmarks for evaluating long-term preference-aware shopping tasks, and (2) the lack of end-to-end optimization due to existing designs that treat preference identification and shopping assistance as separate components. In this paper, we introduce a novel benchmark with a long-term memory setup, spanning two shopping tasks over 1.2 million real-world products, and propose Shopping Companion, a unified framework that jointly tackles memory retrieval and shopping assistance while supporting user intervention. To train such capabilities, we develop a dual-reward reinforcement learning strategy with tool-wise rewards to…
Taxonomy
Topics: Recommender Systems and Techniques · Multimodal Machine Learning Applications · Topic Modeling
