IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce
Wenxuan Ding, Weiqi Wang, Sze Heng Douglas Kwok, Minghao Liu, Tianqing Fang, Jiaxin Bai, Xin Liu, Changlong Yu, Zheng Li, Chen Luo, Qingyu Yin, Bing Yin, Junxian He, Yangqiu Song

TL;DR
IntentionQA is a comprehensive benchmark designed to evaluate how well language models understand and reason about purchase intentions in e-commerce, highlighting current models' limitations in real-world scenarios.
Contribution
This paper introduces IntentionQA, a novel large-scale benchmark with a double-task multiple-choice format to assess purchase intention comprehension in language models.
Findings
Models struggle with understanding product-intention relationships.
Current models perform far below human levels in complex reasoning tasks.
IntentionQA demonstrates the need for improved models in e-commerce understanding.
Abstract
Enhancing Language Models' (LMs) ability to understand purchase intentions in E-commerce scenarios is crucial for their effective assistance in various downstream tasks. However, previous approaches that distill intentions from LMs often fail to generate meaningful and human-centric intentions applicable in real-world E-commerce contexts. This raises concerns about the true comprehension and utilization of purchase intentions by LMs. In this paper, we present IntentionQA, a double-task multiple-choice question answering benchmark to evaluate LMs' comprehension of purchase intentions in E-commerce. Specifically, LMs are tasked to infer intentions based on purchased products and utilize them to predict additional purchases. IntentionQA consists of 4,360 carefully curated problems across three difficulty levels, constructed using an automated pipeline to ensure scalability on large…
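As a rough illustration of the double-task multiple-choice setup described in the abstract, the sketch below scores a model separately for each task and difficulty level. It is not the authors' evaluation code: the `Problem` fields, the task labels `infer_intention` and `predict_purchase`, the difficulty names, and the toy questions are all assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Problem:
    """One multiple-choice item (hypothetical schema, not the released format)."""
    task: str                 # assumed labels: "infer_intention" or "predict_purchase"
    difficulty: str           # assumed labels: "easy", "medium", "hard"
    question: str
    options: tuple[str, ...]
    answer: int               # index of the gold option


def accuracy_by_group(
    problems: Iterable[Problem],
    predict: Callable[[Problem], int],
) -> dict[tuple[str, str], float]:
    """Return accuracy keyed by (task, difficulty)."""
    totals: dict[tuple[str, str], int] = {}
    correct: dict[tuple[str, str], int] = {}
    for p in problems:
        key = (p.task, p.difficulty)
        totals[key] = totals.get(key, 0) + 1
        if predict(p) == p.answer:
            correct[key] = correct.get(key, 0) + 1
    return {key: correct.get(key, 0) / n for key, n in totals.items()}


# Toy items invented for illustration only; they are not from the benchmark.
toy_problems = [
    Problem("infer_intention", "easy",
            "A customer bought a tent and a sleeping bag. Most likely intention?",
            ("go camping", "repaint a room", "bake bread", "fix a bicycle"), 0),
    Problem("infer_intention", "easy",
            "A customer bought flour and yeast. Most likely intention?",
            ("go camping", "bake bread", "repaint a room", "fix a bicycle"), 1),
    Problem("predict_purchase", "hard",
            "Intention: go camping. Which extra product fits best?",
            ("camping stove", "paint roller", "bread tin", "bike pump"), 0),
]


def always_first(problem: Problem) -> int:
    """Trivial baseline that always picks the first option."""
    return 0


scores = accuracy_by_group(toy_problems, always_first)
for (task, difficulty), acc in sorted(scores.items()):
    print(f"{task:17s} {difficulty:6s} accuracy={acc:.2f}")
```

In practice `predict` would wrap a language-model call that maps the question and options to a chosen index. Reporting accuracy per task and difficulty, rather than one pooled number, keeps the two abilities (inferring an intention, then using it) visible separately.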
Taxonomy
Topics: Personal Information Management and User Behavior · Recommender Systems and Techniques · Technology Adoption and User Behaviour
