Web Retrieval Agents for Evidence-Based Misinformation Detection

Jacob-Junqi Tian; Hao Yu; Yury Orlovskiy; Tyler Vergho; Mauricio Rivera; Mayank Goel; Zachary Yang; Jean-Francois Godbout; Reihaneh Rabbany; Kellin Pelrine

arXiv:2409.00009 · cs.IR · October 11, 2024 · 2 cites

TL;DR

This paper presents an agent-based fact-checking system that pairs a large language model with web search. It raises macro F1 on misinformation detection by as much as 20 percent over LLMs without search, and it analyzes the sources the system relies on, their biases, and key design choices.

Contribution

The paper introduces an agent approach that pairs a strong LLM, which has no internet access of its own, with a web search agent. The combination yields better results than either tool used independently.

Findings

01 Increases macro F1 score by up to 20% over LLMs without search

02 System is robust across multiple models and configurations

03 Provides detailed analysis of sources, biases, and system design choices
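
Since the headline result is reported in macro F1, a minimal sketch of how that metric is computed may help. Macro F1 is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a common one. The function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper or its repository.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators by treating the score as 0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example (hypothetical labels): one "true" claim is misclassified as "false".
y_true = ["true", "true", "false", "false"]
y_pred = ["true", "false", "false", "false"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))
```

In the toy example the "true" class has F1 of 2/3 and the "false" class has F1 of 0.8, so the macro average is their plain mean.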

Abstract

This paper develops an agent-based automated fact-checking approach for detecting misinformation. We demonstrate that combining a powerful LLM agent, which does not have access to the internet for searches, with an online web search agent yields better results than when each tool is used independently. Our approach is robust across multiple models, outperforming alternatives and increasing the macro F1 of misinformation detection by as much as 20 percent compared to LLMs without search. We also conduct extensive analyses on the sources our system leverages and their biases, decisions in the construction of the system like the search tool and the knowledge base, the type of evidence needed and its impact on the results, and other parts of the overall process. By combining strong performance with in-depth understanding, we hope to provide building blocks for future search-enabled…
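
The pairing described in the abstract, an LLM agent with no internet access working with a separate web search agent, can be sketched as a simple loop. This is only an illustration under stated assumptions, not the authors' implementation: the `SEARCH:` / `VERDICT:` message convention, the `fact_check` function, the search budget, and both toy agents are hypothetical stand-ins for real model and search calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SEARCH_PREFIX = "SEARCH:"    # hypothetical convention: main agent requests a search
VERDICT_PREFIX = "VERDICT:"  # hypothetical convention: main agent gives its answer


@dataclass
class FactCheckResult:
    verdict: str
    evidence: List[str] = field(default_factory=list)


def _parse_verdict(action: str) -> str:
    if action.startswith(VERDICT_PREFIX):
        return action[len(VERDICT_PREFIX):].strip()
    return action.strip()


def fact_check(
    claim: str,
    main_agent: Callable[[str, List[str], bool], str],
    search_agent: Callable[[str], str],
    max_searches: int = 3,
) -> FactCheckResult:
    """Let the offline main agent request searches until it commits to a verdict."""
    evidence: List[str] = []
    for _ in range(max_searches):
        action = main_agent(claim, evidence, False)
        if action.startswith(SEARCH_PREFIX):
            query = action[len(SEARCH_PREFIX):].strip()
            evidence.append(search_agent(query))  # search agent returns a summary
        else:
            return FactCheckResult(_parse_verdict(action), evidence)
    # Search budget exhausted: require a final answer from the main agent.
    final = main_agent(claim, evidence, True)
    return FactCheckResult(_parse_verdict(final), evidence)


# Toy stand-ins so the sketch runs without any model or network access.
def toy_main_agent(claim: str, evidence: List[str], must_answer: bool) -> str:
    if not evidence and not must_answer:
        return f"{SEARCH_PREFIX} {claim}"
    if any("no record" in item for item in evidence):
        return f"{VERDICT_PREFIX} false"
    return f"{VERDICT_PREFIX} unverified"


def toy_search_agent(query: str) -> str:
    return f"Summary for '{query}': no record found in the toy index."


result = fact_check("The moon is made of cheese", toy_main_agent, toy_search_agent)
print(result.verdict, len(result.evidence))
```

Keeping the two agents behind plain callables means the search tool or the underlying model can be swapped without touching the loop, which is the kind of component comparison the abstract describes.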

Code & Models

Repositories

complexdata-mila/webretrieval (Official)

Videos

Web Retrieval Agents for Evidence-Based Misinformation Detection · underline

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection