Less is More: Benchmarking LLM Based Recommendation Agents
Kargi Chauhan, Mahalakshmi Venkateswarlu

TL;DR
This paper systematically benchmarks LLM-based recommendation agents, revealing that longer user histories do not improve recommendation quality and that shorter contexts can significantly reduce inference costs.
Contribution
It challenges the assumption that longer user histories improve recommendations, providing empirical evidence that shorter contexts are equally effective and more cost-efficient.
Findings
No significant quality improvement with increased context length.
Using shorter contexts reduces inference costs by approximately 88%.
Model-specific latency behaviors inform deployment strategies.
Abstract
Large Language Models (LLMs) are increasingly deployed for personalized product recommendations, with practitioners commonly assuming that longer user purchase histories lead to better predictions. We challenge this assumption through a systematic benchmark of four state-of-the-art LLMs (GPT-4o-mini, DeepSeek-V3, Qwen2.5-72B, and Gemini 2.5 Flash) across context lengths ranging from 5 to 50 items using the REGEN dataset. Surprisingly, our experiments with 50 users in a within-subject design reveal no significant quality improvement with increased context length. Quality scores remain flat across all conditions (0.17–0.23). Our findings have significant practical implications: practitioners can reduce inference costs by approximately 88% by using short contexts (5–10 items) instead of longer histories (50 items), without sacrificing recommendation quality. We also analyze latency patterns…
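The practical recommendation in the abstract, keeping only a short recent history in the prompt, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the oldest-first ordering of the history, and the simple linear token model (a fixed prompt overhead plus a constant number of tokens per item).

```python
from typing import List, Sequence


def truncate_history(history: Sequence[str], k: int) -> List[str]:
    """Keep the k most recent items.

    Assumes `history` is ordered oldest-first (an assumption of this sketch).
    """
    if k <= 0:
        return []
    return list(history[-k:])


def prompt_token_estimate(n_items: int, tokens_per_item: float,
                          fixed_overhead: float) -> float:
    """Hypothetical linear cost model: overhead plus per-item tokens."""
    return fixed_overhead + n_items * tokens_per_item


def relative_saving(short_items: int, long_items: int,
                    tokens_per_item: float, fixed_overhead: float) -> float:
    """Fraction of prompt tokens saved by using the shorter history."""
    short = prompt_token_estimate(short_items, tokens_per_item, fixed_overhead)
    long = prompt_token_estimate(long_items, tokens_per_item, fixed_overhead)
    return 1.0 - short / long


if __name__ == "__main__":
    # Placeholder item IDs, purely illustrative.
    purchases = [f"item_{i}" for i in range(50)]
    print(truncate_history(purchases, 5))
    # tokens_per_item and fixed_overhead are made-up values, not measurements.
    print(relative_saving(5, 50, tokens_per_item=10, fixed_overhead=0))
```

Under this purely linear model with no fixed overhead, going from 50 items to 5 removes 90% of the history tokens. The paper's figure of approximately 88% is its own reported result and is not reproduced by this sketch; actual savings depend on the prompt template, the tokenizer, and output-token costs.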
Taxonomy
Topics: Recommender Systems and Techniques · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
