Retrieval-Augmented Chain-of-Thought in Semi-structured Domains
Vaibhav Mavi, Abulhair Saparov, Chen Zhao

TL;DR
This paper introduces a retrieval-augmented chain-of-thought approach that leverages semi-structured data to improve domain-specific question answering in law and finance, overcoming LLM context length limitations.
Contribution
It presents a novel retrieval-augmented method that efficiently incorporates semi-structured domain data into LLMs for enhanced QA performance and interpretability.
Findings
Outperforms existing models in legal and financial QA tasks.
Effectively handles long contexts by retrieving relevant semi-structured data.
Provides explanations for answers, aiding interpretability.
Abstract
Applying existing question answering (QA) systems to specialized domains like law and finance presents challenges that necessitate domain expertise. Although large language models (LLMs) have shown impressive language comprehension and in-context learning capabilities, their inability to handle very long inputs/contexts is well known. Tasks specific to these domains need significant background knowledge, leading to contexts that can often exceed the maximum length that existing LLMs can process. This study explores leveraging the semi-structured nature of legal and financial data to efficiently retrieve relevant context, enabling the use of LLMs for domain-specialized QA. The resulting system outperforms contemporary models and also provides useful explanations for the answers, encouraging the integration of LLMs into legal and financial NLP systems for future research.
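The abstract does not spell out the retrieval procedure, so the following is only a minimal sketch of the general idea: treat each labeled section of a semi-structured document as a retrieval unit, keep the sections most relevant to the question within a length budget, and build a chain-of-thought prompt from them. The `Section` type, the word-overlap scorer, the word-count budget, and the prompt wording are all illustrative assumptions, not the authors' method.

```python
import re
from dataclasses import dataclass


@dataclass
class Section:
    """One unit of a semi-structured document (hypothetical schema)."""
    heading: str
    text: str


def tokenize(s):
    # Lowercased alphanumeric tokens; a stand-in for a real tokenizer.
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def score(question, section):
    # Fraction of question tokens found in the section (assumed scorer;
    # a real system might use BM25 or dense embeddings instead).
    q = tokenize(question)
    if not q:
        return 0.0
    s = tokenize(section.heading + " " + section.text)
    return len(q & s) / len(q)


def retrieve(question, sections, max_words=50):
    # Greedily keep the best-scoring sections that fit the word budget,
    # which stands in for the LLM's context-length limit.
    ranked = sorted(sections, key=lambda sec: score(question, sec), reverse=True)
    chosen, used = [], 0
    for sec in ranked:
        n = len(sec.text.split())
        if score(question, sec) == 0 or used + n > max_words:
            continue
        chosen.append(sec)
        used += n
    return chosen


def build_prompt(question, sections):
    # Prepend the retrieved sections, then ask for step-by-step reasoning.
    context = "\n".join(f"[{s.heading}] {s.text}" for s in sections)
    return f"Context:\n{context}\n\nQuestion: {question}\nLet's think step by step."


sections = [
    Section("Section 1: Filing deadline",
            "A return must be filed by April 15 of the following year."),
    Section("Section 2: Penalties",
            "Late filing incurs a penalty of five percent per month."),
    Section("Section 3: Definitions",
            "Taxpayer means any individual subject to tax."),
]
question = "What is the penalty for late filing?"
selected = retrieve(question, sections)
prompt = build_prompt(question, selected)
```

Because the section headings travel with the retrieved text, the prompt shows which parts of the source the answer can draw on, which is one plausible way such a system could support the interpretability the summary mentions.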
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
