Source Coverage and Citation Bias in LLM-based vs. Traditional Search Engines
Peixian Zhang, Qiming Ye, Zifan Peng, Kiran Garimella, Gareth Tyson

TL;DR
This study compares large language model-based search engines with traditional search engines, revealing differences in source diversity, credibility, and transparency, and providing insights into their source selection mechanisms.
Contribution
It offers the first large-scale empirical analysis of LLM-SEs versus TSEs, highlighting source diversity, credibility issues, and key factors influencing source selection.
Findings
LLM-SEs cite more diverse sources than TSEs.
37% of cited domains are unique to LLM-SEs.
LLM-SEs do not outperform TSEs in credibility, political neutrality, or safety.
Abstract
LLM-based Search Engines (LLM-SEs) introduce a new paradigm for information seeking. Unlike Traditional Search Engines (TSEs) (e.g., Google), these systems summarize results, often providing limited citation transparency. The implications of this shift remain largely unexplored, yet the shift raises key questions regarding trust and transparency. In this paper, we present a large-scale empirical study of LLM-SEs, analyzing 55,936 queries and the corresponding search results across six LLM-SEs and two TSEs. We confirm that LLM-SEs cite domain resources with greater diversity than TSEs. Indeed, 37% of domains are unique to LLM-SEs. However, certain risks still persist: LLM-SEs do not outperform TSEs in credibility, political neutrality, and safety metrics. Finally, to understand the selection criteria of LLM-SEs, we perform a feature-based analysis to identify key factors influencing source…
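As a rough illustration of how a figure like "37% of domains are unique to LLM-SEs" could be computed, here is a minimal sketch. It is not the authors' pipeline. The function names `domain_of` and `unique_share` are hypothetical, the inputs are assumed to be absolute URLs, and the denominator (the union of domains cited by either engine type) is an assumption, since the abstract does not define it.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of an absolute URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""  # hostname is None for scheme-less input
    return host[4:] if host.startswith("www.") else host


def unique_share(llm_urls, tse_urls) -> float:
    """Share of all cited domains that appear only in LLM-SE citations.

    Assumption: the denominator is the union of domains from both engine types.
    """
    llm_domains = {domain_of(u) for u in llm_urls}
    tse_domains = {domain_of(u) for u in tse_urls}
    all_domains = llm_domains | tse_domains
    if not all_domains:
        return 0.0
    return len(llm_domains - tse_domains) / len(all_domains)


# Toy data, invented for illustration only.
llm_cited = ["https://www.a.com/x", "https://b.org/y", "https://c.net"]
tse_cited = ["https://a.com/z", "https://d.io"]
share = unique_share(llm_cited, tse_cited)
```

With the toy lists above, two of the four distinct domains occur only in the LLM-SE citations. Grouping by registered domain rather than by host would need a public-suffix list, which this sketch leaves out.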
Taxonomy
Topics: Information Retrieval and Search Behavior · Web visibility and informetrics · Expert finding and Q&A systems
