Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering

Yifei Yuan; Yang Deng; Anders Søgaard; Mohammad Aliannejadi

arXiv:2409.16025·cs.CL·September 25, 2024

TL;DR

This paper introduces MCPQA, a multilingual cross-market product question answering task, along with a benchmark dataset of over 7 million questions, demonstrating that leveraging cross-market data improves answer quality and question ranking.

Contribution

The paper presents a large-scale multilingual dataset for cross-market product question answering and benchmarks various models, highlighting the benefits of cross-market information integration.

Findings

01 Cross-market data improves answer accuracy.

02 LLMs outperform traditional models in this task.

03 Multilingual dataset enables broader applicability.

Abstract

Users post numerous product-related questions on e-commerce platforms, and the answers affect their purchase decisions. Product-related question answering (PQA) entails utilizing product-related resources to provide precise responses to users. We propose a novel task of Multilingual Cross-market Product-based Question Answering (MCPQA) and define the task as providing answers to product-related questions in a main marketplace by utilizing information from another resource-rich auxiliary marketplace in a multilingual context. We introduce a large-scale dataset comprising over 7 million questions from 17 marketplaces across 11 languages. We then perform automatic translation on the Electronics category of our dataset and name the result McMarket. We focus on two subtasks: review-based answer generation and product-related question ranking. For each subtask, we label a subset of McMarket using an LLM and…
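
As an illustration of the product-related question ranking subtask, the following is a minimal sketch that pools candidate questions from a main and an auxiliary marketplace and ranks them against a user query. It is not the authors' method: the Jaccard token-overlap scorer is a simple stand-in for a learned ranker, and the function names, the `main`/`aux` labels, and the sample questions are all hypothetical. It also assumes the text has already been translated into one language.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap between two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_questions(query, main_pool, aux_pool, top_k=3):
    """Rank candidate questions from both marketplaces against the query.

    Returns a list of (score, market, question) tuples, best first.
    The scorer is a hypothetical stand-in, not the paper's ranker.
    """
    query_tokens = tokenize(query)
    candidates = [(q, "main") for q in main_pool] + [(q, "aux") for q in aux_pool]
    scored = [(jaccard(query_tokens, tokenize(q)), market, q) for q, market in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical sample data: a sparse main marketplace and a richer auxiliary one.
main_pool = ["Does this camera come with a charger?"]
aux_pool = [
    "Is the battery included with this camera?",
    "What is the warranty period for the lens?",
]

ranked = rank_questions("Is a battery included with the camera?", main_pool, aux_pool, top_k=2)
for score, market, question in ranked:
    print(f"{score:.2f} [{market}] {question}")
```

In this toy setup the best match comes from the auxiliary pool, which is the situation the task definition targets: the main marketplace has too few questions of its own, so related questions from a resource-rich marketplace fill the gap.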

Code & Models

Repositories

yfyuan01/mcpqa
PyTorch · Official

Videos

Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering · Underline

Taxonomy

Topics: Topic Modeling · Speech and Dialogue Systems · Natural Language Processing Techniques

Methods: Focus