Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study

Zooey Nguyen; Anthony Annunziata; Vinh Luong; Sang Dinh; Quynh Le; Anh; Hai Ha; Chanh Le; Hong An Phan; Shruti Raghavan; Christopher Nguyen

arXiv:2404.11792 · cs.AI · April 23, 2024 · 6 cites

TL;DR

This study shows that domain-specific fine-tuning and iterative reasoning improve the accuracy of RAG-based Q&A systems built on large language models, as measured on the FinanceBench SEC financial filings dataset, bringing them closer to human-expert quality.

Contribution

It provides a comparative analysis of fine-tuning and reasoning techniques in RAG-based Q&A systems, highlighting their combined impact on performance and offering a structured design space for future development.

Findings

01 Fine-tuning the embedding model improves RAG accuracy over generic models, and accounts for relatively more of the gain than fine-tuning the LLM.

02 Adding reasoning iterations on top of RAG yields an even larger improvement in Q&A performance.

03 Domain-specific augmentation is crucial for high-quality financial Q&A.

Abstract

This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy than generic models, with relatively greater gains attributable to fine-tuned embedding models. Additionally, employing reasoning iterations on top of RAG delivers an even bigger jump in performance, enabling the Q&A systems to get closer to human-expert quality. We discuss the implications of such findings, propose a structured technical design space capturing major technical components of Q&A AI, and provide recommendations for making high-impact technical choices for such components. We…
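
The abstract describes two levers: a better embedding model for retrieval, and reasoning iterations layered on top of RAG. The sketch below illustrates the general shape of such a loop under heavy assumptions. It is not the authors' implementation. `toy_embed` is a hypothetical bag-of-words stand-in for an embedding model (a fine-tuned model would be passed in its place), and the `required_terms` checklist is a hypothetical stand-in for an LLM deciding whether the retrieved evidence is sufficient.

```python
import math
from collections import Counter
from typing import Callable, List

Vector = Counter  # sparse word-count vector; a stand-in for a dense embedding


def toy_embed(text: str) -> Vector:
    """Hypothetical embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(
    query: str, docs: List[str], embed: Callable[[str], Vector], k: int = 1
) -> List[str]:
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def iterative_answer(
    question: str,
    docs: List[str],
    embed: Callable[[str], Vector],
    required_terms: List[str],
    max_iters: int = 3,
) -> List[str]:
    """Retrieve, check what is still missing, refine the query, repeat."""
    evidence: List[str] = []
    query = question
    for _ in range(max_iters):
        for passage in retrieve(query, docs, embed):
            if passage not in evidence:
                evidence.append(passage)
        gathered = " ".join(evidence).lower()
        # Assumption: a real system would ask an LLM what is still missing.
        missing = [t for t in required_terms if t not in gathered]
        if not missing:
            break
        query = " ".join(missing)  # next pass targets the gap
    return evidence


docs = [
    "revenue was 100 million in 2022",
    "operating income was 20 million in 2022",
    "the company opened new offices",
]
evidence = iterative_answer(
    "what was the operating margin in 2022",
    docs,
    toy_embed,
    required_terms=["revenue", "operating income"],
)
print(evidence)
```

With a single pass the loop finds only the operating income passage. The second pass retrieves the revenue passage that the question never names directly, which is the kind of gap reasoning iterations are meant to close. Swapping `toy_embed` for a different `embed` callable changes the retrieval quality without touching the loop.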

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Attention Dropout · Linear Layer · Multi-Head Attention · WordPiece · Weight Decay · Byte Pair Encoding · Dense Connections