IterKey: Iterative Keyword Generation with LLMs for Enhanced Retrieval Augmented Generation
Kazuki Hayashi, Hidetaka Kamigaito, Shinya Kouda, Taro Watanabe

TL;DR
IterKey is an LLM-driven iterative keyword generation framework that improves retrieval-augmented generation through sparse retrieval, balancing accuracy and interpretability across four question-answering tasks.
Contribution
It introduces a BM25-based iterative keyword refinement method that uses LLMs to generate and refine retrieval keywords, improving RAG performance while keeping retrieval interpretable.
Findings
Achieves 5% to 20% accuracy improvements over BM25-based RAG.
Performs comparably to dense retrieval methods and prior iterative refinement approaches.
Balances accuracy and interpretability effectively across four QA tasks.
Abstract
Retrieval-Augmented Generation (RAG) has emerged as a way to complement the in-context knowledge of Large Language Models (LLMs) by integrating external documents. However, real-world applications demand not only accuracy but also interpretability. While dense retrieval methods provide high accuracy, they lack interpretability; conversely, sparse retrieval methods offer transparency but often fail to capture the full intent of queries due to their reliance on keyword matching. To address these issues, we introduce IterKey, an LLM-driven iterative keyword generation framework that enhances RAG via sparse retrieval. IterKey consists of three LLM-driven stages: generating keywords for retrieval, generating answers based on retrieved documents, and validating the answers. If validation fails, the process iteratively repeats with refined keywords. Across four QA tasks, experimental results…
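The three-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`bm25_rank`, `iterkey_sketch`), the callable signatures, the iteration cap `max_iters`, the BM25 parameters `k1` and `b`, and the choice to return the last unvalidated answer when the cap is reached are all assumptions made for the example. The three LLM stages are passed in as plain callables, and the demo at the bottom uses hard-coded stand-ins in place of real LLM calls.

```python
import math
from collections import Counter
from typing import Callable, List, Optional, Tuple


def bm25_rank(query_terms: List[str], docs: List[str],
              k1: float = 1.5, b: float = 0.75, top_k: int = 3) -> List[str]:
    """Rank docs with a plain Okapi BM25 score (whitespace tokenization)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / max(n, 1)
    # Document frequency: number of docs containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scored = []
    for doc, toks in zip(docs, tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            term = term.lower()
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def iterkey_sketch(
    question: str,
    docs: List[str],
    generate_keywords: Callable[[str, Optional[List[str]]], List[str]],
    generate_answer: Callable[[str, List[str]], str],
    validate: Callable[[str, List[str], str], bool],
    max_iters: int = 5,   # assumed cap; not taken from the source
    top_k: int = 3,       # assumed retrieval depth
) -> Tuple[str, bool, int]:
    """Return (answer, validated, iterations_used)."""
    keywords: Optional[List[str]] = None
    answer = ""
    for i in range(1, max_iters + 1):
        # Stage 1: generate (or refine) keywords for sparse retrieval.
        keywords = generate_keywords(question, keywords)
        retrieved = bm25_rank(keywords, docs, top_k=top_k)
        # Stage 2: answer from the retrieved documents.
        answer = generate_answer(question, retrieved)
        # Stage 3: validate; stop on success, otherwise refine and retry.
        if validate(question, retrieved, answer):
            return answer, True, i
    # Assumption: fall back to the last answer if validation never passes.
    return answer, False, max_iters


# Demo with hard-coded stand-ins for the three LLM stages (hypothetical).
corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "bananas are yellow",
]

def fake_keywords(question, previous):
    return ["bananas"] if previous is None else ["capital", "france"]

def fake_answer(question, retrieved):
    return "Paris" if any("paris" in d for d in retrieved) else "unknown"

def fake_validate(question, retrieved, answer):
    return answer != "unknown"

result = iterkey_sketch("What is the capital of France?", corpus,
                        fake_keywords, fake_answer, fake_validate, top_k=1)
```

Because retrieval is driven by an explicit keyword list at every iteration, the keywords behind a given answer can be inspected directly, which is the interpretability property the abstract attributes to sparse retrieval.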
Taxonomy
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece
