CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension
Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu

TL;DR
This paper introduces CJRC, a human-annotated Chinese judicial reading comprehension dataset of approximately 10K documents and almost 50K questions, designed to support legal element extraction with machine reading comprehension models.
Contribution
It provides a new benchmark dataset for Chinese legal reading comprehension, built from judgment documents and annotated by law experts, to enable research on legal element extraction.
Findings
The two baseline models, based on BERT and BiDAF, perform below human annotators.
This gap leaves considerable room for improvement in machine reading comprehension on judicial text.
CJRC supports the development of legal reading comprehension technologies.
Abstract
We present a Chinese judicial reading comprehension (CJRC) dataset which contains approximately 10K documents and almost 50K questions with answers. The documents come from judgment documents and the questions are annotated by law experts. The CJRC dataset can help researchers extract elements by reading comprehension technology. Element extraction is an important task in the legal field. However, it is difficult to predefine the element types completely due to the diversity of document types and causes of action. By contrast, machine reading comprehension technology can quickly extract elements by answering various questions from the long document. We build two strong baseline models based on BERT and BiDAF. The experimental results show that there is enough space for improvement compared to human annotators.
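As an illustration of how element extraction can be framed as extractive question answering, the sketch below picks the answer span whose start and end scores sum highest, which is a common decoding step for BERT-style reading comprehension models. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `max_len` limit, the example tokens, and all scores are hypothetical, and the dataset's actual format is not shown here.

```python
from typing import List, Tuple


def best_span(start_scores: List[float],
              end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int]:
    """Return (start, end) token indices that maximise
    start_scores[start] + end_scores[end], with start <= end
    and a span of at most max_len tokens (hypothetical limit)."""
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions at or after the start position.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


def extract_element(tokens: List[str],
                    start_scores: List[float],
                    end_scores: List[float],
                    max_len: int = 30) -> str:
    """Join the tokens of the best-scoring span into an answer string.
    Tokens are joined without spaces, which suits Chinese text."""
    s, e = best_span(start_scores, end_scores, max_len)
    return "".join(tokens[s:e + 1])


# Made-up example: a question such as "What is the loan amount?" asked
# against a tokenised sentence from a hypothetical judgment document.
tokens = ["原告", "于", "2015年", "3月", "借给", "被告", "5万元"]
start_scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0]  # invented model scores
end_scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0]    # invented model scores
answer = extract_element(tokens, start_scores, end_scores)
```

Because each question selects its own span, new element types can be queried by writing a new question rather than by extending a predefined label set, which is the advantage the abstract describes.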
Taxonomy
Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax
