Ready Jurist One: Benchmarking Language Agents for Legal Intelligence in Dynamic Environments
Zheng Jia, Shengbin Yue, Wei Chen, Siyuan Wang, Yidong Liu, Zejun Li, Yun Song, Zhongyu Wei

TL;DR
This paper introduces J1-ENVS, an interactive legal environment, and J1-EVAL, an evaluation framework, to assess LLM agents' legal knowledge and procedural compliance in dynamic scenarios drawn from Chinese legal practice. Experiments on 17 LLM agents show that even GPT-4o scores below 60% overall.
Contribution
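It presents J1-ENVS, described as the first interactive and dynamic legal environment for LLM-based agents (six scenarios across three levels of environmental complexity), and J1-EVAL, a fine-grained framework that assesses both task performance and procedural compliance.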
Findings
Many models demonstrate solid legal knowledge but struggle with procedural execution in dynamic settings.
Even the strongest model tested, GPT-4o, scores below 60% overall.
Dynamic legal intelligence remains a significant challenge.
Abstract
The gap between static benchmarks and the dynamic nature of real-world legal practice poses a key barrier to advancing legal intelligence. To this end, we introduce J1-ENVS, the first interactive and dynamic legal environment tailored for LLM-based agents. Guided by legal experts, it comprises six representative scenarios from Chinese legal practices across three levels of environmental complexity. We further introduce J1-EVAL, a fine-grained evaluation framework, designed to assess both task performance and procedural compliance across varying levels of legal proficiency. Extensive experiments on 17 LLM agents reveal that, while many models demonstrate solid legal knowledge, they struggle with procedural execution in dynamic settings. Even the SOTA model, GPT-4o, falls short of 60% overall performance. These findings highlight persistent challenges in achieving dynamic legal…
Taxonomy
Topics: Artificial Intelligence in Law · Multi-Agent Systems and Negotiation
