Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses

Pranav Narayanan Venkit; Philippe Laban; Yilun Zhou; Yixin Mao; Chien-Sheng Wu

arXiv:2410.22349 · cs.IR · October 31, 2024

TL;DR

This paper critically examines the limitations of LLM-based answer engines, identifies common issues such as hallucinations and inaccurate citations, and introduces an evaluation benchmark intended to improve their transparency and reliability.

Contribution

The paper contributes a 21-participant user study, 16 design recommendations linked to 8 metrics, and an automated evaluation benchmark for assessing LLM-based answer engines.

Findings

01. Frequent hallucinations in answer generation

02. Inaccurate source citations by answer engines

03. Variation in answer confidence levels across systems

Abstract

Large Language Model (LLM)-based applications are graduating from research prototypes to products serving millions of users, influencing how people write and consume information. A prominent example is the appearance of Answer Engines: LLM-based generative search engines supplanting traditional search engines. Answer engines not only retrieve relevant sources to a user query but synthesize answer summaries that cite the sources. To understand these systems' limitations, we first conducted a study with 21 participants, evaluating interactions with answer vs. traditional search engines and identifying 16 answer engine limitations. From these insights, we propose 16 answer engine design recommendations, linked to 8 metrics. An automated evaluation implementing our metrics on three popular engines (You.com, Perplexity.ai, BingChat) quantifies common limitations (e.g., frequent…

Code & Models

Repositories

salesforceairesearch/answer-engine-eval (Official)

Taxonomy

Topics: Misinformation and Its Impacts · Ethics and Social Impacts of AI