xPQA: Cross-Lingual Product Question Answering across 12 Languages
Xiaoyu Shen, Akari Asai, Bill Byrne, Adrià de Gispert

TL;DR
This paper introduces xPQA, a large-scale cross-lingual product question answering dataset covering 12 languages. It evaluates translation-based and multilingual-model approaches for candidate ranking and answer generation, and highlights both the importance of in-domain data and the remaining challenges in cross-lingual performance.
Contribution
The paper presents xPQA, a new large-scale dataset for cross-lingual PQA, and provides a comprehensive evaluation of different approaches, revealing key insights into their effectiveness and limitations.
Findings
In-domain data is crucial for effective cross-lingual ranking.
Runtime translation works better for candidate ranking, while multilingual models work better for answer generation.
Offline translation improves performance mainly for non-Latin script languages.
Abstract
Product Question Answering (PQA) systems are key in e-commerce applications to provide responses to customers' questions as they shop for products. While existing work on PQA focuses mainly on English, in practice there is a need to support multiple customer languages while leveraging product information available in English. To study this practical industrial task, we present xPQA, a large-scale annotated cross-lingual PQA dataset in 12 languages across 9 branches, and report results in (1) candidate ranking, to select the best English candidate containing the information to answer a non-English question; and (2) answer generation, to generate a natural-sounding non-English answer based on the selected English candidate. We evaluate various approaches involving machine translation at runtime or offline, leveraging multilingual pre-trained LMs, and including or excluding xPQA training…
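The two-step task described in the abstract (rank English candidates for a non-English question, then produce an answer in the question's language) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not the paper's method: `token_overlap` is a toy relevance score standing in for a trained ranker, and `to_english` / `from_english` are hypothetical callables standing in for a machine translation system and the answer generation step.

```python
from typing import Callable, List, Tuple


def token_overlap(question: str, candidate: str) -> float:
    """Toy relevance score: fraction of question tokens that appear in the candidate."""
    q_tokens = set(question.lower().split())
    c_tokens = set(candidate.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & c_tokens) / len(q_tokens)


def rank_candidates(question_en: str, candidates_en: List[str]) -> List[Tuple[str, float]]:
    """Step 1 (candidate ranking): sort English candidates by the toy score, best first."""
    scored = [(c, token_overlap(question_en, c)) for c in candidates_en]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def answer_question(
    question: str,
    candidates_en: List[str],
    to_english: Callable[[str], str],    # hypothetical translation stand-in
    from_english: Callable[[str], str],  # hypothetical generation stand-in
) -> str:
    """Runtime-translation pipeline: translate the question, rank, answer in the source language."""
    if not candidates_en:
        raise ValueError("at least one English candidate is required")
    question_en = to_english(question)
    best_candidate, _ = rank_candidates(question_en, candidates_en)[0]
    # Step 2 (answer generation): here simply mapped back through the stand-in callable.
    return from_english(best_candidate)
```

A real system would replace the overlap score with a learned ranker and the two callables with translation or multilingual generation models; the sketch only fixes the order of the steps.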
Taxonomy
Topics: Topic Modeling · Expert finding and Q&A systems · Natural Language Processing Techniques
Methods: Test
