LexGenius: An Expert-Level Benchmark for Large Language Models in Legal General Intelligence

Wenjin Liu, Haoran Luo, Xin Feng, Xiang Ji, Lijuan Zhou, Rui Mao, Jiapu Wang, Shirui Pan, and Erik Cambria

arXiv:2512.04578·cs.CL·April 17, 2026

TL;DR

LexGenius is an expert-level Chinese legal benchmark that evaluates large language models' legal understanding, reasoning, and decision-making. It reveals significant gaps between LLMs and human legal experts.

Contribution

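The paper introduces an expert-level Chinese legal benchmark organized by a Dimension-Task-Ability framework (seven dimensions, eleven tasks, twenty abilities). Its questions are checked through combined manual and LLM review so that it can reliably assess legal general intelligence (GI) in LLMs.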

Findings

01 Significant disparities in legal abilities among LLMs.

02 Even top LLMs lag behind human legal professionals.

03 LexGenius effectively evaluates legal intelligence in LLMs.

Abstract

Legal general intelligence (GI) refers to artificial intelligence (AI) that encompasses legal understanding, reasoning, and decision-making, simulating the expertise of legal experts across domains. However, existing benchmarks are result-oriented and fail to systematically evaluate the legal intelligence of large language models (LLMs), hindering the development of legal GI. To address this, we propose LexGenius, an expert-level Chinese legal benchmark for evaluating legal GI in LLMs. It follows a Dimension-Task-Ability framework, covering seven dimensions, eleven tasks, and twenty abilities. We use recent legal cases and exam questions to create multiple-choice questions, with a combination of manual and LLM reviews to reduce data leakage risks, and ensure accuracy and reliability through multiple rounds of checks. We evaluate 12 state-of-the-art LLMs using LexGenius and conduct an…

Code & Models

Repositories

QwenQKing/LexGenius
github

Models

🤗
QwenQKing/LexGenius
model · ♡ 2

Datasets

QwenQKing/LexGenius
dataset · 95 dl
