STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals
Weihang Su, Yiran Hu, Anzhe Xie, Qingyao Ai, Zibing Que, Ning Zheng, Yun Liu, Weixing Shen, Yiqun Liu

TL;DR
The paper introduces STARD, a Chinese statute retrieval dataset based on real non-professional queries, highlighting the challenges and gaps in current retrieval methods for practical legal applications.
Contribution
It provides a new dataset capturing real-world non-professional legal queries and evaluates existing retrieval methods, revealing their limitations on such complex queries.
Findings
Existing retrieval methods perform poorly on real non-professional queries.
The best retrieval approach achieves only 90.7% Recall@100.
The dataset highlights the need for improved retrieval techniques for practical legal use.
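For readers unfamiliar with the metric quoted above, Recall@k is the share of a query's relevant statutory articles that appear among the top k retrieved results. The following is a minimal sketch, assuming the score is averaged per query; the function names and the toy article IDs are hypothetical and are not taken from the paper or its code.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of one query's relevant articles found in its top-k results."""
    if not relevant_ids:
        raise ValueError("query has no relevant articles")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average the per-query Recall@k over all queries (assumed averaging scheme)."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    if not scores:
        raise ValueError("no queries to evaluate")
    return sum(scores) / len(scores)


# Toy example with made-up article IDs: (ranked results, relevant set) per query.
toy_runs = [
    (["a1", "a7", "a3"], {"a1", "a3"}),  # only a1 is in the top 2
    (["b2", "b9"], {"b2"}),              # b2 is in the top 2
]
toy_score = mean_recall_at_k(toy_runs, k=2)
```

Under this definition, a Recall@100 of 90.7% means that, on average, about 9% of each query's relevant articles are missing from the top 100 candidates.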
Abstract
Statute retrieval aims to find relevant statutory articles for specific queries. This process is the basis of a wide range of legal applications such as legal advice, automated judicial decisions, legal document drafting, etc. Existing statute retrieval benchmarks focus on formal and professional queries from sources like bar exams and legal case documents, thereby neglecting non-professional queries from the general public, which often lack precise legal terminology and references. To address this gap, we introduce the STAtute Retrieval Dataset (STARD), a Chinese dataset comprising 1,543 query cases collected from real-world legal consultations and 55,348 candidate statutory articles. Unlike existing statute retrieval datasets, which primarily focus on professional legal queries, STARD captures the complexity and diversity of real queries from the general public. Through a…
Taxonomy
Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations · Artificial Intelligence Applications
