Ancient Wisdom, Modern Tools: Exploring Retrieval-Augmented LLMs for Ancient Indian Philosophy

Priyanka Mandikal

arXiv:2408.11903·cs.CL·August 26, 2024

TL;DR

This paper investigates retrieval-augmented generation (RAG) for answering questions about Advaita Vedanta. Using a new dataset and hybrid retrieval, it reports that RAG responses are more factual and comprehensive than those of a standard LLM.

Contribution

The paper introduces VedantaNY-10M, a dataset curated from public discourses on the ancient Indian philosophy of Advaita Vedanta, and benchmarks a RAG model with hybrid retrieval against a standard, non-RAG LLM in this specialized knowledge domain.

Findings

01 RAG models outperform standard LLMs in factual accuracy

02 Hybrid retrieval improves response quality further

03 Human evaluations favor RAG-generated answers
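
The hybrid retrieval mentioned in finding 02 is not specified on this page. As a rough illustration only, the sketch below shows one widely used way to combine two retrievers, reciprocal rank fusion (RRF), under the assumption that "hybrid" means merging a keyword-based ranking with an embedding-based ranking. The passage ids, the function name `reciprocal_rank_fusion`, and the constant `k=60` are hypothetical choices for this example and are not taken from the paper or its repository.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage ids into a single ranking.

    Each passage receives 1 / (k + rank) from every list it appears in;
    passages ranked highly by several retrievers accumulate the most score.
    This is an illustrative stand-in, not the paper's retrieval method.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by id for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))


# Hypothetical rankings from two retrievers over the same passage pool.
keyword_ranking = ["p3", "p1", "p7"]      # e.g. from a sparse keyword index
embedding_ranking = ["p1", "p4", "p3"]    # e.g. from a dense embedding index

fused = reciprocal_rank_fusion([keyword_ranking, embedding_ranking])
print(fused)
```

In a RAG setup, the top few fused passages would then be placed in the prompt given to the generator. RRF needs only ranks, not raw scores, so the two retrievers do not have to produce comparable score scales.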

Abstract

LLMs have revolutionized the landscape of information retrieval and knowledge dissemination. However, their application in specialized areas is often hindered by factual inaccuracies and hallucinations, especially in long-tail knowledge distributions. We explore the potential of retrieval-augmented generation (RAG) models for long-form question answering (LFQA) in a specialized knowledge domain. We present VedantaNY-10M, a dataset curated from extensive public discourses on the ancient Indian philosophy of Advaita Vedanta. We develop and benchmark a RAG model against a standard, non-RAG LLM, focusing on transcription, retrieval, and generation performance. Human evaluations by computational linguists and domain experts show that the RAG model significantly outperforms the standard model in producing factual and comprehensive responses having fewer hallucinations. In addition, a…

Code & Models

Repositories

priyankamandikal/vedantany-10m (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · WordPiece · Residual Connection · Multi-Head Attention · Linear Warmup With Linear Decay · Attention Dropout · Adam · Layer Normalization