ReLeVAnT: Relevance Lexical Vectors for Accurate Legal Text Classification

Ishaan Gakhar; Harsh Nandwani

arXiv:2604.22292 · cs.CL · April 27, 2026

TL;DR

ReLeVAnT is a framework for binary classification of legal documents. It combines n-gram lexical features, contrastive score matching, and a shallow neural network to classify documents accurately while reducing reliance on metadata and extensive computation.

Contribution

It introduces a discriminative lexical feature-based approach for legal text classification that achieves near state-of-the-art performance with minimal computational resources.

Findings

01 Achieved 99.3% accuracy on the LexGLUE dataset.

02 Attained a 98.7% F1 score in legal document classification.

03 Utilized one-time keyword extraction for efficient classification.

Abstract

The classification of legal documents from an unstructured data corpus has several crucial applications in downstream tasks. Documents relevant to court filings are key in use cases such as drafting motions, memos, and outlines, as well as in tasks like docket summarisation, retrieval systems, and training data curation. Current methods classify based on provided metadata, LLM-extracted metadata, or multimodal methods. These methods depend on structured data, metadata, and extensive computational power. This work instead approaches the task by leveraging the features that discriminate between document classes. The authors propose ReLeVAnT, a framework for legal document binary classification. ReLeVAnT utilises n-gram processing, contrastive score matching, and a shallow neural network as the primary drivers for discriminative classification. It leverages one-time keyword…
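
The abstract names the ingredients (n-gram processing, contrastive scoring, one-time keyword extraction), but this page does not give the scoring formula. The following is only a minimal sketch of how such a pipeline might look, assuming a smoothed log-odds ratio of document frequencies as the contrastive score. That choice, and every function name and parameter below (`ngrams`, `contrastive_scores`, `select_keywords`, `featurize`, `smoothing`), is a hypothetical illustration and not the authors' method. The resulting feature vector is the kind of input a shallow neural network could consume.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (whitespace tokenisation)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def contrastive_scores(pos_docs, neg_docs, n_max=2, smoothing=1.0):
    """Assumed scoring rule: smoothed log-odds of document frequency.

    Positive scores mark n-grams typical of the relevant class,
    negative scores mark n-grams typical of the other class.
    """
    pos_df = Counter(g for d in pos_docs for g in set(ngrams(d, n_max)))
    neg_df = Counter(g for d in neg_docs for g in set(ngrams(d, n_max)))
    scores = {}
    for gram in set(pos_df) | set(neg_df):
        p = (pos_df[gram] + smoothing) / (len(pos_docs) + 2 * smoothing)
        q = (neg_df[gram] + smoothing) / (len(neg_docs) + 2 * smoothing)
        scores[gram] = math.log(p / q)
    return scores


def select_keywords(scores, k):
    """One-time extraction: keep the k n-grams with the largest |score|."""
    return sorted(scores, key=lambda g: (-abs(scores[g]), g))[:k]


def featurize(doc, keywords, n_max=2):
    """Map a document to a fixed-length vector of keyword counts."""
    counts = Counter(ngrams(doc, n_max))
    return [float(counts[g]) for g in keywords]


# Toy corpus, invented purely for illustration.
relevant = ["motion to dismiss the complaint",
            "memorandum in support of motion to dismiss"]
irrelevant = ["invoice for office supplies",
              "lunch menu for the office"]

scores = contrastive_scores(relevant, irrelevant)
keywords = select_keywords(scores, k=5)          # computed once, then reused
vector = featurize("motion to dismiss filed today", keywords)
```

Because the keyword list is computed once from labelled examples, classifying a new document in this sketch only requires counting a fixed set of n-grams, which is where the low computational cost would come from.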
