iTRI-QA: a Toolset for Customized Question-Answer Dataset Generation Using Language Models for Enhanced Scientific Research
Qiming Liu, Zhongzheng Niu, Siting Liu, Mao Tian

TL;DR
This paper introduces iTRI-QA, a toolset that creates customized scientific question-answer datasets by fine-tuning language models with curated research data, improving knowledge retrieval for scientific research.
Contribution
The paper presents a novel framework for generating domain-specific QA datasets using fine-tuned language models and curated research data, enhancing scientific knowledge retrieval.
Findings
Demonstrates the feasibility of the toolset for scientific knowledge retrieval.
Shows improved contextual relevance and accuracy in responses.
Provides a scalable pipeline for future research applications.
Abstract
The exponential growth of AI in science necessitates efficient and scalable solutions for retrieving and preserving research information. Here, we present a tool for the development of a customized question-answer (QA) dataset, called Interactive Trained Research Innovator (iTRI)-QA, tailored to the needs of researchers who use language models (LMs) to retrieve scientific knowledge in a QA format. Our approach integrates curated QA datasets with a specialized research paper dataset to enhance the contextual relevance and accuracy of responses using a fine-tuned LM. The framework comprises four key steps: (1) the generation of high-quality, human-generated QA examples, (2) the creation of a structured research paper database, (3) the fine-tuning of LMs using domain-specific QA examples, and (4) the generation of QA datasets that align with user queries and the curated database. This…
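The four steps listed in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names QAExample, Paper, to_finetune_records, retrieve, and generate_qa are hypothetical, the word-overlap ranking stands in for whatever retrieval the toolset actually uses, and the fine-tuned LM is represented by an arbitrary callable that maps a prompt string to an answer string.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAExample:
    """Step 1: one human-written question-answer pair (hypothetical schema)."""
    question: str
    answer: str


@dataclass
class Paper:
    """Step 2: one entry in the structured research paper database (hypothetical schema)."""
    title: str
    abstract: str


def to_finetune_records(examples: List[QAExample]) -> List[str]:
    """Step 3 input: serialize curated QA pairs as JSON lines for fine-tuning.

    The prompt/completion field names are an assumption, not taken from the paper.
    """
    return [
        json.dumps({"prompt": ex.question, "completion": ex.answer})
        for ex in examples
    ]


def retrieve(query: str, papers: List[Paper], k: int = 1) -> List[Paper]:
    """Step 4 helper: rank papers by naive word overlap with the query.

    This is a stand-in for a real retriever; the paper's method may differ.
    """
    query_words = set(query.lower().split())

    def overlap(paper: Paper) -> int:
        text = f"{paper.title} {paper.abstract}".lower()
        return len(query_words & set(text.split()))

    return sorted(papers, key=overlap, reverse=True)[:k]


def generate_qa(query: str, papers: List[Paper], lm: Callable[[str], str]) -> QAExample:
    """Step 4: answer a user query with the (fine-tuned) LM, grounded in retrieved papers."""
    context = "\n".join(f"{p.title}: {p.abstract}" for p in retrieve(query, papers))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return QAExample(question=query, answer=lm(prompt))
```

In use, the callable passed as lm would wrap the fine-tuned model's generation call, and the returned QAExample objects would be collected into the customized dataset.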
Taxonomy
Topics: Topic Modeling
Methods: ALIGN
