Rethinking Search: Making Domain Experts out of Dilettantes
Donald Metzler, Yi Tay, Dara Bahri, Marc Najork

TL;DR
This paper explores how classical information retrieval and pre-trained language models can be combined into systems that give expert-level, justified, and reliable responses to user queries, addressing the tendency of current language models to hallucinate.
Contribution
It proposes a novel approach that synthesizes classical IR and language models to develop systems capable of providing trustworthy, expert-like answers with supporting references.
Findings
Identifies limitations of current language models in expert domains.
Suggests a hybrid system combining IR and language models for better accuracy.
Lays groundwork for future systems that deliver justified, expert-level responses.
Abstract
When experiencing an information need, users want to engage with a domain expert, but often turn to an information retrieval system, such as a search engine, instead. Classical information retrieval systems do not answer information needs directly, but instead provide references to (hopefully authoritative) answers. Successful question answering systems offer a limited corpus created on-demand by human experts, which is neither timely nor scalable. Pre-trained language models, by contrast, are capable of directly generating prose that may be responsive to an information need, but at present they are dilettantes rather than domain experts -- they do not have a true understanding of the world, they are prone to hallucinating, and crucially they are incapable of justifying their utterances by referring to supporting documents in the corpus they were trained over. This paper examines how…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
