Orca: A Few-shot Benchmark for Chinese Conversational Machine Reading Comprehension

Nuo Chen; Hongguang Li; Junqing He; Yinan Bao; Xinshi Lin; Qi Yang; Jianfeng Liu; Ruyi Gan; Jiaxing Zhang; Baoyuan Wang; Jia Li

arXiv:2302.13619·cs.CL·October 16, 2023·1 cites

TL;DR

Orca is a new Chinese conversational machine reading comprehension benchmark with diverse, real-world topics, designed to evaluate models' understanding in realistic scenarios through natural responses and zero-shot/few-shot settings.

Contribution

This paper introduces the first Chinese CMRC benchmark with diverse domains, natural answer annotations, and zero-shot/few-shot evaluation settings, addressing limitations of previous static-passage datasets.

Findings

01. Existing models struggle with the new benchmark.

02. The dataset covers 33 real-world domains.

03. Baseline models show significant room for improvement.

Abstract

The conversational machine reading comprehension (CMRC) task aims to answer questions in conversations and has been a hot research topic in recent years because of its wide applications. However, existing CMRC benchmarks, in which each conversation is assigned a static passage, are inconsistent with real scenarios. Thus, a model's comprehension ability in real scenarios is hard to evaluate reasonably. To this end, we propose the first Chinese CMRC benchmark, Orca, and further provide zero-shot/few-shot settings to evaluate a model's ability to generalize across diverse domains. We collect 831 hot-topic-driven conversations with 4,742 turns in total. Each turn of a conversation is assigned a response-related passage, aiming to evaluate a model's comprehension ability more reasonably. The topics of conversations are collected from social media platforms and cover 33 domains, trying…
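To make the per-turn passage design concrete, here is a minimal Python sketch of how such a conversation might be represented. The class names, field names, and placeholder strings are assumptions made for illustration and do not reflect the actual Orca data format; only the counts of 831 conversations and 4,742 turns come from the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    # Hypothetical fields: each turn carries its own passage,
    # in contrast to one static passage per conversation.
    question: str
    passage: str
    response: str


@dataclass
class Conversation:
    # Hypothetical fields for the topic and its domain label.
    topic: str
    domain: str
    turns: List[Turn]


def average_turns(num_conversations: int, num_turns: int) -> float:
    """Mean number of turns per conversation."""
    return num_turns / num_conversations


# Placeholder content only; not taken from the dataset.
example = Conversation(
    topic="placeholder topic",
    domain="placeholder domain",
    turns=[
        Turn("first question", "passage for turn 1", "first response"),
        Turn("follow-up question", "passage for turn 2", "second response"),
    ],
)

# Counts reported in the abstract: 831 conversations, 4,742 turns.
avg = round(average_turns(831, 4742), 2)
print(avg)  # 5.71
```

With the reported counts, a conversation has about 5.71 turns on average, and each of those turns would be paired with its own passage.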

Code & Models

Repositories

nuochenpku/orca
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification