A Comparative Study of Retrieval Methods in Azure AI Search
Qiang Mao, Han Qin, Robert Neary, Charles Wang, Fusheng Wei, Jianping Zhang, Nathaniel Huber-Fliflet

TL;DR
This paper compares various retrieval methods within Azure AI Search's RAG framework to determine their effectiveness for legal document review, focusing on accuracy, relevance, and consistency of AI responses.
Contribution
It provides a comprehensive evaluation of keyword, semantic, vector, hybrid, and hybrid-semantic retrieval strategies for legal eDiscovery tasks.
Findings
Semantic and hybrid methods outperform keyword retrieval in accuracy.
Vector and hybrid-semantic methods provide more relevant responses.
Hybrid approaches offer a balance of accuracy and consistency.
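As a rough illustration of how a hybrid method can combine keyword and vector results, the sketch below fuses two ranked lists with Reciprocal Rank Fusion (RRF). This summary does not say how the study configured hybrid retrieval, so treating RRF as the fusion step is an assumption, as are the function name, the constant k=60, and the document IDs. It is not the authors' pipeline or the service's implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of document IDs into one ranking.

    rankings: list of lists of document IDs, each ordered best-first.
    k: smoothing constant; 60 is a commonly used value (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword query and a vector query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents returned by both lists (here doc_a and doc_c) accumulate score from each, so they rise above documents found by only one method. That is one plausible reason a hybrid approach could balance the strengths of keyword and vector retrieval.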
Abstract
Attorneys are increasingly interested in moving beyond keyword and semantic search to find key information more efficiently during document review. Large language models (LLMs) are now seen as tools that let attorneys ask natural-language questions of their data during document review and receive accurate, concise answers. This study evaluates retrieval strategies within Microsoft Azure's Retrieval-Augmented Generation (RAG) framework to identify effective approaches for Early Case Assessment (ECA) in eDiscovery. During ECA, legal teams analyze data at the outset of a matter to gain a general understanding of it and to determine key facts and risks before beginning full-scale review. In this paper, we compare the performance of Azure AI Search's keyword, semantic, vector, hybrid, and hybrid-semantic retrieval methods. We then present the…
Taxonomy
Topics: Artificial Intelligence in Law · Computational and Text Analysis Methods · Topic Modeling
