RAGTurk: Best Practices for Retrieval Augmented Generation in Turkish
Süha Kağan Köse, Mehmet Can Baytekin, Burak Aktaş, Bilge Kaan Görür, Evren Ayberk Munis, Deniz Yılmaz, Muhammed Yusuf Kartal, Çağrı Toraman

TL;DR
This paper develops and benchmarks best practices for Retrieval-Augmented Generation in Turkish, a morphologically rich language, demonstrating effective configurations and highlighting the impact of module stacking on performance.
Contribution
It introduces a comprehensive Turkish RAG dataset and benchmarks various pipeline configurations, providing new insights for RAG in morphologically complex languages.
Findings
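HyDE achieves the highest accuracy (85%), considerably above the baseline (78.70%).
A Pareto-optimal configuration using cross-encoder reranking and context augmentation reaches 84.60% at much lower cost.
Over-stacking generative modules can degrade performance by distorting morphological cues.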
Abstract
Retrieval-Augmented Generation (RAG) enhances LLM factuality, yet design guidance remains English-centric, limiting insights for morphologically rich languages like Turkish. We address this by constructing a comprehensive Turkish RAG dataset derived from Turkish Wikipedia and CulturaX, comprising question-answer pairs and relevant passage chunks. We benchmark seven stages of the RAG pipeline, from query transformation and reranking to answer refinement, without task-specific fine-tuning. Our results show that complex methods such as HyDE maximize accuracy (85%), considerably above the baseline (78.70%). A Pareto-optimal configuration using cross-encoder reranking and context augmentation achieves comparable performance (84.60%) at much lower cost. We further demonstrate that over-stacking generative modules can degrade performance by distorting morphological cues,…
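To make the query-transformation idea concrete, here is a minimal sketch of HyDE-style retrieval: instead of embedding the question itself, the retriever embeds a hypothetical answer passage and ranks the stored passages against it. This is not the paper's implementation. The function names (`char_ngrams`, `hyde_retrieve`, `fake_generate`), the character n-gram "embedding", and the three example passages are all assumptions made for illustration; a real pipeline would presumably call an LLM and a dense encoder in their place.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Toy embedding (assumption): character n-gram counts, standing in for a dense encoder."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_retrieve(
    query: str,
    passages: List[str],
    generate: Callable[[str], str],
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Rank passages by similarity to a generated hypothetical answer, not to the raw query."""
    hypothetical = generate(query)           # step 1: draft a hypothetical answer passage
    query_vec = char_ngrams(hypothetical)    # step 2: embed the draft instead of the question
    scored = [(cosine(query_vec, char_ngrams(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # step 3: highest similarity first
    return scored[:top_k]


def fake_generate(query: str) -> str:
    """Hypothetical stand-in for an LLM call; returns a fixed draft answer."""
    return "Türkiye'nin başkenti Ankara'dır."


# Illustrative passages (made up for this sketch, not taken from the paper's dataset).
passages = [
    "Ankara, Türkiye'nin başkentidir.",
    "İstanbul, Türkiye'nin en kalabalık şehridir.",
    "Van Gölü, Türkiye'nin en büyük gölüdür.",
]

results = hyde_retrieve("Türkiye'nin başkenti neresidir?", passages, fake_generate, top_k=1)
print(results[0][1])
```

The draft answer shares surface forms such as "başkenti" and "Ankara" with the relevant passage, which is why it can match better than the question alone. The same mechanism also suggests how a generative step that alters word forms could hurt retrieval in a morphologically rich language.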
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Information Retrieval and Search Behavior
