WixQA: A Multi-Dataset Benchmark for Enterprise Retrieval-Augmented Generation
Dvir Cohen, Lin Burg, Sviatoslav Pykhnivskyi, Hagit Gur, Stanislav Kovynov, Olga Atzmon, Gilad Barkan

TL;DR
WixQA introduces a comprehensive benchmark suite with three datasets grounded in real enterprise knowledge bases, enabling evaluation of retrieval and generation in enterprise question answering systems.
Contribution
This paper presents WixQA, a novel multi-dataset benchmark for enterprise RAG systems, including real, simulated, and synthetic QA datasets grounded in a knowledge base, along with baseline results.
Findings
WixQA provides diverse datasets for enterprise QA evaluation.
Baseline RAG system performances are established on WixQA.
The benchmark facilitates holistic assessment of retrieval and generation components.
Abstract
Retrieval-Augmented Generation (RAG) is a cornerstone of modern question answering (QA) systems, enabling grounded answers based on external knowledge. Although recent progress has been driven by open-domain datasets, enterprise QA systems need datasets that mirror the concrete, domain-specific issues users raise in day-to-day support scenarios. Critically, evaluating end-to-end RAG systems requires benchmarks comprising not only question-answer pairs but also the specific knowledge base (KB) snapshot from which answers were derived. To address this need, we introduce WixQA, a benchmark suite featuring QA datasets precisely grounded in the released KB corpus, enabling holistic evaluation of retrieval and generation components. WixQA includes three distinct QA datasets derived from Wix.com customer support interactions and grounded in a snapshot of the public Wix Help Center KB: (i)…
Taxonomy
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection
