Seeing Justice Clearly: Handwritten Legal Document Translation with OCR and Vision-Language Models

Shubham Kumar Nigam; Parjanya Aditya Shukla; Noel Shallum; Arnab Bhattacharya

arXiv:2512.18004 · cs.CV · December 23, 2025

TL;DR

This paper compares traditional OCR plus machine translation pipelines with vision-language models for translating handwritten Marathi legal documents, aiming to improve digitization and accessibility of legal records in low-resource settings.

Contribution

It introduces and evaluates a unified vision-language model approach for direct translation of handwritten legal documents, demonstrating potential advantages over traditional pipelines.

Findings

01. Vision-language models can directly translate handwritten images effectively.

02. Traditional OCR-MT pipelines are less efficient for low-resource languages.

03. End-to-end models show promise for legal document digitization in India.

Abstract

Handwritten text recognition (HTR) and machine translation continue to pose significant challenges, particularly for low-resource languages like Marathi, which lack large digitized corpora and exhibit high variability in handwriting styles. The conventional approach to address this involves a two-stage pipeline: an OCR system extracts text from handwritten images, which is then translated into the target language using a machine translation model. In this work, we explore and compare the performance of traditional OCR-MT pipelines with Vision Large Language Models that aim to unify these stages and directly translate handwritten text images in a single, end-to-end step. Our motivation is grounded in the urgent need for scalable, accurate translation systems to digitize legal records such as FIRs, charge sheets, and witness statements in India's district and high courts. We evaluate both…
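
The two designs contrasted in the abstract can be sketched structurally as follows. This is a minimal illustration, not the authors' implementation: the type aliases, function names, and the instruction string are all hypothetical, and the recognition, translation, and vision-language models are passed in as plain callables so that no particular library API is assumed.

```python
from typing import Callable

# Hypothetical stage signatures; none of these names come from the paper.
OcrFn = Callable[[bytes], str]        # handwritten image -> source-language text
TranslateFn = Callable[[str], str]    # source-language text -> target-language text
VlmFn = Callable[[bytes, str], str]   # image + instruction -> target-language text


def two_stage_translate(image: bytes, ocr: OcrFn, translate: TranslateFn) -> str:
    """Conventional pipeline: recognize the text first, then translate it.

    Any recognition error in the first stage is passed on to the translator.
    """
    recognized_text = ocr(image)
    return translate(recognized_text)


def end_to_end_translate(
    image: bytes,
    vlm: VlmFn,
    instruction: str = "Translate the handwritten text in this image.",
) -> str:
    """Unified approach: a single model maps the image directly to a translation."""
    return vlm(image, instruction)
```

With this framing, swapping OCR engines, translation models, or vision-language models only changes which callables are passed in, which makes the two approaches straightforward to compare on the same set of images.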

Taxonomy

Topics: Handwritten Text Recognition Techniques · Topic Modeling · Multimodal Machine Learning Applications