1-800-SHARED-TASKS at RegNLP: Lexical Reranking of Semantic Retrieval (LeSeR) for Regulatory Question Answering
Jebish Purbey, Drishti Sharma, Siddhant Gupta, Khawaja Murad, Siddartha Pullakhandam, Ram Mohan Rao Kadiyala

TL;DR
This paper describes a system for regulatory question answering that combines several embedding models (Stella, BGE, CDE, and Mpnet) with a lexical reranking method, LeSeR, to improve document retrieval accuracy in regulatory NLP tasks.
Contribution
Introduction of LeSeR, a lexical reranking approach combined with multiple embedding models, enhancing retrieval performance in regulatory question answering.
Findings
Achieved recall@10 of 0.8201 and map@10 of 0.6655 in retrieval tasks.
Demonstrated the effectiveness of combining embedding models with reranking.
Provided insights into NLP applications for regulatory domains.
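The two reported metrics can be illustrated with a minimal sketch. It assumes one common convention for average precision at k (normalizing by the smaller of k and the number of relevant documents); the shared task's official scorer may define it differently. The function names and document IDs are hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average of precision values at each rank where a relevant document occurs.

    Assumes document IDs in `ranked` are unique. Normalizes by min(|relevant|, k),
    which is one common convention and an assumption here.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def mean_over_queries(
    metric: Callable[[Sequence[str], Set[str], int], float],
    runs: List[Tuple[Sequence[str], Set[str]]],
    k: int = 10,
) -> float:
    """Average a per-query metric over (ranking, relevant set) pairs."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Hypothetical single-query example.
ranked_ids = ["d3", "d1", "d7", "d2"]
relevant_ids = {"d1", "d2"}
r10 = recall_at_k(ranked_ids, relevant_ids, k=10)
ap10 = average_precision_at_k(ranked_ids, relevant_ids, k=10)
```

Averaging `average_precision_at_k` over all queries with `mean_over_queries` gives map@10; averaging `recall_at_k` the same way gives recall@10.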
Abstract
This paper presents the system description of our entry for the COLING 2025 RegNLP RIRAG (Regulatory Information Retrieval and Answer Generation) challenge, focusing on leveraging advanced information retrieval and answer generation techniques in regulatory domains. We experimented with a combination of embedding models, including Stella, BGE, CDE, and Mpnet, and leveraged fine-tuning and reranking for retrieving relevant documents in top ranks. We utilized a novel approach, LeSeR, which achieved competitive results with a recall@10 of 0.8201 and map@10 of 0.6655 for retrievals. This work highlights the transformative potential of natural language processing techniques in regulatory applications, offering insights into their capabilities for implementing a retrieval augmented generation system while identifying areas for future improvement in robustness and domain adaptation.
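The abstract does not spell out how LeSeR works, so the following is only a sketch of one plausible reading of its name: retrieve candidates by embedding similarity, then reorder them with a lexical score. It assumes BM25 as the lexical scorer and uses toy two-dimensional vectors in place of real embeddings from models such as Stella or BGE; all function names, parameters, and data are hypothetical and not taken from the authors' implementation.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_top_n(query_vec: Sequence[float],
                   doc_vecs: Dict[str, Sequence[float]], n: int) -> List[str]:
    """Stage 1: rank documents by cosine similarity and keep the top n."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:n]


def bm25_scores(query: str, docs: Dict[str, str],
                k1: float = 1.5, b: float = 0.75) -> Dict[str, float]:
    """BM25 computed over the given documents only (here, the candidate pool)."""
    if not docs:
        return {}
    tokens = {d: tokenize(t) for d, t in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs or 1.0
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    query_terms = set(tokenize(query))
    scores: Dict[str, float] = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[doc_id] = score
    return scores


def retrieve_then_rerank(query: str, query_vec: Sequence[float],
                         docs: Dict[str, str], doc_vecs: Dict[str, Sequence[float]],
                         n: int = 20, k: int = 10) -> List[str]:
    """Stage 2: reorder the semantic candidates by lexical score, keep the top k.

    sorted() is stable, so candidates with equal lexical scores keep their
    semantic order.
    """
    candidates = semantic_top_n(query_vec, doc_vecs, n)
    lexical = bm25_scores(query, {d: docs[d] for d in candidates})
    return sorted(candidates, key=lambda d: lexical[d], reverse=True)[:k]


# Hypothetical toy corpus and embeddings.
docs = {
    "a": "capital requirements for banks",
    "b": "reporting obligations for authorised firms",
    "c": "weather forecast today",
}
doc_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
result = retrieve_then_rerank("reporting obligations", [1.0, 0.0], docs, doc_vecs, n=2, k=2)
```

In this toy run the semantic stage prefers document `a`, while the lexical stage promotes `b` because it shares the query's terms. Computing BM25 statistics over the candidate pool rather than the full corpus is a simplification made to keep the sketch short.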
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
