Verif.ai: Towards an Open-Source Scientific Generative Question-Answering System with Referenced and Verifiable Answers
Miloš Košprdić, Adela Ljajić, Bojana Bašaragin, Darija Medvecki, Nikola Milošević

TL;DR
Verif.ai is an open-source scientific question-answering system that combines retrieval, generative modeling, and verification to produce referenced, verifiable answers and reduce hallucinations in scientific contexts.
Contribution
The paper introduces a novel integrated system that combines retrieval, generative, and verification components for scientific QA with references and hallucination checks.
Findings
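The retrieval system returns relevant scientific papers from PubMed.
The fine-tuned generative model (Mistral 7B) produces answers with references to the source papers.
The verification engine cross-checks each generated claim against its source abstract or paper to flag possible hallucinations.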
Abstract
In this paper, we present the current progress of the project Verif.ai, an open-source scientific generative question-answering system with referenced and verified answers. The components of the system are (1) an information retrieval system combining semantic and lexical search techniques over scientific papers (PubMed), (2) a fine-tuned generative model (Mistral 7B) taking top answers and generating answers with references to the papers from which the claim was derived, and (3) a verification engine that cross-checks the generated claim and the abstract or paper from which the claim was derived, verifying whether there may have been any hallucinations in generating the claim. We are reinforcing the generative model by providing the abstract in context, but in addition, an independent set of methods and models are verifying the answer and checking for hallucinations. Therefore, we…
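The first component in the abstract combines lexical and semantic search. One common way to merge two such rankings is to normalize each retriever's scores and take a weighted sum; the sketch below illustrates that idea only and is not taken from the paper. The function names (`min_max_normalize`, `hybrid_rank`), the `lexical_weight` parameter, and the example document IDs and scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map each to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    lexical: Dict[str, float],
    semantic: Dict[str, float],
    lexical_weight: float = 0.5,  # hypothetical weight, not a value from the paper
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse two score maps (doc_id -> score) into one ranked list.

    A document missing from one retriever contributes 0.0 for that retriever.
    Ties are broken by doc_id so the output order is deterministic.
    """
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    fused = {
        doc_id: lexical_weight * lex.get(doc_id, 0.0)
        + (1.0 - lexical_weight) * sem.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Made-up scores for illustration: lexical scores on a BM25-like scale,
# semantic scores on a cosine-similarity-like scale.
lexical_scores = {"pmid:1": 12.0, "pmid:2": 4.0, "pmid:3": 8.0}
semantic_scores = {"pmid:2": 1.0, "pmid:3": 0.75, "pmid:4": 0.0}

ranking = hybrid_rank(lexical_scores, semantic_scores)
print(ranking)
```

In this toy input, the document that scores moderately well under both retrievers ends up ahead of documents that only one retriever favors, which is the usual motivation for hybrid search. The top-ranked documents would then be passed as context to the generative model.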
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Semantic Web and Ontologies
Methods: Sparse Evolutionary Training · Time-homogeneous Top-K Ranking
