Learning from Litigation: Graphs and LLMs for Retrieval and Reasoning in eDiscovery
Sounak Lahiri, Sumit Pai, Tim Weninger, Sanmitra Bhattacharya

TL;DR
This paper introduces DISCOG, a system combining knowledge graphs and large language models to improve document retrieval and reasoning in eDiscovery, significantly reducing review costs and enhancing accuracy.
Contribution
The paper presents a novel integration of knowledge graphs with LLMs for legal document review, addressing challenges with legal entities and citations.
Findings
DISCOG outperforms baseline models in F1-score, precision, and recall.
In real-world deployments, it has reduced litigation-related document review costs by approximately 98%.
It is effective on both balanced and imbalanced datasets.
Abstract
Electronic Discovery (eDiscovery) requires identifying relevant documents from vast collections for legal production requests. While artificial intelligence (AI) and natural language processing (NLP) have improved document review efficiency, current methods still struggle with legal entities, citations, and complex legal artifacts. To address these challenges, we introduce DISCOvery Graph (DISCOG), an emerging system that integrates knowledge graphs for enhanced document ranking and classification, augmented by LLM-driven reasoning. DISCOG outperforms strong baselines in F1-score, precision, and recall across both balanced and imbalanced datasets. In real-world deployments, it has reduced litigation-related document review costs by approximately 98%, demonstrating significant business impact.
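The abstract reports precision, recall, and F1-score on both balanced and imbalanced datasets. The sketch below illustrates how these three metrics are typically computed for a binary responsive/non-responsive review task, and why they are more informative than accuracy when responsive documents are rare. It is not taken from the paper: the function name, the label encoding (1 = responsive, 0 = non-responsive), and the toy labels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def review_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute metrics for the positive (responsive = 1) class.

    Hypothetical helper for illustration; not part of DISCOG.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # responsive, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-responsive, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # responsive, missed
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when nothing is flagged or nothing is responsive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return Metrics(precision, recall, f1, accuracy)


# Toy imbalanced collection: 2 responsive documents out of 10 (made-up labels).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# A classifier that flags nothing still scores 80% accuracy but has zero F1.
flag_nothing = review_metrics(y_true, [0] * 10)

# A classifier that finds one responsive document and raises one false alarm.
some_signal = review_metrics(y_true, [1, 0, 1, 0, 0, 0, 0, 0, 0, 0])
```

On this toy data both classifiers reach the same accuracy, while only the second has a non-zero F1-score, which is one reason imbalanced review sets are usually evaluated with precision, recall, and F1 rather than accuracy alone.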
Taxonomy
Topics: Artificial Intelligence in Law
