FlashCheck: Exploration of Efficient Evidence Retrieval for Fast Fact-Checking

Kevin Nanekhan; Venktesh V; Erik Martin; Henrik Vatndal; Vinay Setty; and Avishek Anand

arXiv:2502.05803·cs.IR·February 18, 2025

TL;DR

This paper explores efficient evidence retrieval methods for automated fact-checking, focusing on indexing and vector quantization to improve speed and scalability in large knowledge sources like Wikipedia.

Contribution

It introduces novel indexing and compression techniques for dense retrieval, significantly enhancing the efficiency of fact-checking pipelines on large datasets.
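To make "compression for dense retrieval" concrete, the following is a minimal sketch of product quantization, one common form of vector quantization. It is an illustration only: the summary above does not say which quantization scheme the paper uses, and all function names and parameters here (`pq_train`, `pq_encode`, `pq_search`, `m`, `k`) are hypothetical rather than taken from the authors' repository.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Tiny k-means; returns k centroids for the rows of x."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every centroid: shape (n, k).
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids


def pq_train(x, m, k):
    """Split the dimensions into m sub-spaces and learn one codebook per sub-space."""
    return [kmeans(s, k, seed=i) for i, s in enumerate(np.split(x, m, axis=1))]


def pq_encode(x, codebooks):
    """Replace each sub-vector by the index of its nearest centroid (assumes k <= 256)."""
    subs = np.split(x, len(codebooks), axis=1)
    codes = [
        ((s[:, None, :] - c[None, :, :]) ** 2).sum(-1).argmin(1)
        for s, c in zip(subs, codebooks)
    ]
    return np.stack(codes, axis=1).astype(np.uint8)


def pq_search(query, codes, codebooks, top_k):
    """Approximate nearest neighbours using per-sub-space distance lookup tables."""
    m = len(codebooks)
    q_subs = np.split(query, m)
    # table[j, c] = squared distance from the query's j-th sub-vector to centroid c.
    table = np.stack([((c - q) ** 2).sum(-1) for q, c in zip(q_subs, codebooks)])
    # Sum the looked-up distances over sub-spaces for every stored code.
    dists = table[np.arange(m), codes].sum(1)
    return np.argsort(dists)[:top_k]


# Usage with random stand-in embeddings (not real fact-checking data).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(200, 16)).astype(np.float32)
codebooks = pq_train(vectors, m=4, k=16)
codes = pq_encode(vectors, codebooks)  # 4 bytes per vector instead of 64
neighbours = pq_search(vectors[0], codes, codebooks, top_k=5)
```

Search never touches the full-precision vectors: it builds one small distance table per query and sums table lookups, which is where such schemes typically gain speed and memory at some cost in accuracy.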

Findings

1. Achieves up to 10x speedup on CPUs
2. Over 20x acceleration on GPUs
3. Effective retrieval on real-world fact-checking datasets

Abstract

The advances in digital tools have led to the rampant spread of misinformation. While fact-checking aims to combat this, manual fact-checking is cumbersome and not scalable. It is essential for automated fact-checking to be efficient for aiding in combating misinformation in real-time and at the source. Fact-checking pipelines primarily comprise a knowledge retrieval component which extracts relevant knowledge to fact-check a claim from large knowledge sources like Wikipedia and a verification component. The existing works primarily focus on the fact-verification part rather than evidence retrieval from large data collections, which often face scalability issues for practical applications such as live fact-checking. In this study, we address this gap by exploring various methods for indexing a succinct set of factual statements from large collections like Wikipedia to enhance the…

Code & Models

Repositories

kevin-rn/Efficient-Fact-checking (PyTorch, official)

Taxonomy

Topics: Scientific Computing and Data Management · Topic Modeling

Methods: Focus · Sparse Evolutionary Training