What Are the Facts? Automated Extraction of Court-Established Facts from Criminal-Court Opinions
Klára Bendová, Tomáš Knap, Jan Černý, Vojtěch Pour, Jaromir Savelka, Ivana Kvapilíková, Jakub Drápal

TL;DR
This study explores automated extraction of descriptions of criminal behavior from Slovak court verdicts. Advanced regular expressions and large language models both far outperform a simple regex baseline and come close to human annotation.
Contribution
It demonstrates the feasibility of extracting detailed criminal behavior descriptions from court decisions using advanced regex and LLMs, outperforming baseline methods and approaching human annotation accuracy.
Findings
Advanced regex achieved 97% accuracy in extraction.
LLMs achieved 98.75% accuracy, matching human annotations in 91.75% of cases.
Combining advanced regex with LLMs matched human annotations in 92% of cases, far above the baseline.
Abstract
Criminal justice administrative data contain only a limited amount of information about the committed offense. However, there is an unused source of extensive information in continental European courts' decisions: descriptions of criminal behaviors in verdicts by which offenders are found guilty. In this paper, we study the feasibility of extracting these descriptions from publicly available court decisions from Slovakia. We use two different approaches for retrieval: regular expressions and large language models (LLMs). Our baseline was a simple method employing regular expressions to identify typical words occurring before and after the description. The advanced regular expression approach further focused on "sparing" and its normalization (insertion of spaces between individual letters), typical for delineating the description. The LLM approach involved prompting the Gemini Flash 2.0…
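The regex pipeline described in the abstract (normalize letter-spaced text, then take what lies between typical opening and closing phrases) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the marker phrases in `START` and `END`, the function names, and the sample sentence are all assumptions chosen for the example, and the three-letter threshold is a guess meant to avoid merging ordinary one-letter words.

```python
import re
from typing import Optional

# A single letter (Unicode-aware, excludes digits and underscore).
LETTER = r"[^\W\d_]"

# A run of 3+ single letters separated by single spaces, e.g. "v i n n ý".
# The threshold of three letters is an assumption for this sketch.
SPACED_RUN = re.compile(rf"(?<!\w)(?:{LETTER} ){{2,}}{LETTER}(?!\w)")

# Hypothetical marker phrases; the paper's actual patterns are not given here.
START = re.compile(r"vinn[ýá]\s*,\s*ž\s?e\b")
END = re.compile(r"\bčím\s+spáchal\b")


def collapse_letter_spacing(text: str) -> str:
    """Join letter-spaced words, e.g. "v i n n ý" -> "vinný"."""
    return SPACED_RUN.sub(lambda m: m.group(0).replace(" ", ""), text)


def extract_description(text: str) -> Optional[str]:
    """Return the text between the start and end markers, or None."""
    normalized = collapse_letter_spacing(text)
    start = START.search(normalized)
    if start is None:
        return None
    end = END.search(normalized, start.end())
    if end is None:
        return None
    return normalized[start.end():end.start()].strip(" ,\n")


# Invented sample; two spaces separate the letter-spaced words "č í m" and "s p á c h a l".
SAMPLE = (
    "Obžalovaný je v i n n ý , ž e dňa 5. mája 2020 odcudzil bicykel, "
    "č í m  s p á c h a l prečin krádeže."
)

if __name__ == "__main__":
    print(extract_description(SAMPLE))
```

Normalizing before matching keeps the marker patterns short, since they do not need to allow for a space between every pair of letters. A real corpus would likely need more marker variants and a fallback when either marker is missing.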
Taxonomy
Topics: Artificial Intelligence in Law · Jury Decision Making Processes · Topic Modeling
