Evaluating Search Engines and Large Language Models for Answering Health Questions
Marcos Fernández-Pichel, Juan C. Pichel, David E. Losada

TL;DR
This study compares search engines, large language models, and retrieval-augmented methods on health question answering. LLMs answer more questions correctly than search engines, and retrieval augmentation gives smaller LLMs a substantial further accuracy gain.
Contribution
It provides a comprehensive comparison of SEs, LLMs, and RAG methods for health question answering, highlighting the effectiveness of retrieval-augmented LLMs.
Findings
SEs answer 50-70% of questions correctly
LLMs answer about 80% correctly
RAG improves small LLMs' accuracy by up to 30%
Abstract
Search engines (SEs) have traditionally been primary tools for information seeking, but the new Large Language Models (LLMs) are emerging as powerful alternatives, particularly for question-answering tasks. This study compares the performance of four popular SEs, seven LLMs, and retrieval-augmented (RAG) variants in answering 150 health-related questions from the TREC Health Misinformation (HM) Track. Results reveal SEs correctly answer between 50 and 70% of questions, often hindered by many retrieval results not responding to the health question. LLMs deliver higher accuracy, correctly answering about 80% of questions, though their performance is sensitive to input prompts. RAG methods significantly enhance smaller LLMs' effectiveness, improving accuracy by up to 30% by integrating retrieval evidence.
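The abstract's two core ideas, adding retrieved evidence to an LLM prompt and scoring answers by accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the function names `build_rag_prompt` and `accuracy`, the prompt wording, and the assumption that each question has a yes/no ground-truth label are all hypothetical choices made for illustration.

```python
from typing import Sequence


def build_rag_prompt(question: str, passages: Sequence[str]) -> str:
    """Prepend numbered retrieved passages to a health question.

    Hypothetical prompt template; the paper's actual prompts may differ.
    """
    evidence = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Use the evidence below to answer the health question with 'yes' or 'no'.\n"
        f"Evidence:\n{evidence}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions matching the ground truth (case-insensitive)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(
        p.strip().lower() == g.strip().lower() for p, g in zip(predictions, gold)
    )
    return correct / len(gold)
```

In this sketch, the passages would come from a search engine's top results and the predictions from whichever model is being evaluated; both are left outside the example because the abstract does not specify them.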
Taxonomy
Topics: Advanced Text Analysis Techniques · Expert finding and Q&A systems · Topic Modeling
Methods: Attention Is All You Need · Attention Dropout · Linear Warmup With Linear Decay · Residual Connection · Adam · Dropout · Byte Pair Encoding · Layer Normalization · Linear Layer
