From Relevance to Authority: Authority-aware Generative Retrieval in Web Search Engines
Sunkyung Lee, Jihye Back, Donghyeon Jeon, Soonhwan Kwon, Moonkwon Kim, Inho Kang, Jongwuk Lee

TL;DR
This paper introduces AuthGR, a framework that integrates authority into generative retrieval for web search, improving trustworthiness and user engagement in high-stakes domains.
Contribution
AuthGR is the first framework to incorporate authority into generative retrieval, combining multimodal authority scoring, a three-stage training pipeline, and a hybrid ensemble for deployment.
Findings
AuthGR enhances authority and accuracy in retrieval tasks.
A 3B model matches a 14B baseline in performance.
Online tests show improved user engagement and reliability.
Abstract
Generative information retrieval (GenIR) formulates the retrieval process as a text-to-text generation task, leveraging the vast knowledge of large language models. However, existing works primarily optimize for relevance while often overlooking document trustworthiness. This is critical in high-stakes domains like healthcare and finance, where relying solely on semantic relevance risks retrieving unreliable information. To address this, we propose an Authority-aware Generative Retriever (AuthGR), the first framework that incorporates authority into GenIR. AuthGR consists of three key components: (i) Multimodal Authority Scoring, which employs a vision-language model to quantify authority from textual and visual cues; (ii) a Three-stage Training Pipeline to progressively instill authority awareness into the retriever; and (iii) a Hybrid Ensemble Pipeline for robust deployment. Offline…
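To make the relevance-versus-authority trade-off concrete, here is a minimal sketch of re-ranking candidates by a linear blend of the two signals. It is an illustration only, not AuthGR's method: the abstract does not say how authority scores are combined with relevance. The `Candidate` class, the `blend` and `rerank` functions, the `alpha` weight, and all scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    doc_id: str
    relevance: float  # hypothetical retriever score, assumed to lie in [0, 1]
    authority: float  # hypothetical authority score, assumed to lie in [0, 1]


def blend(c: Candidate, alpha: float = 0.7) -> float:
    """Linear interpolation: alpha weights relevance, (1 - alpha) weights authority."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * c.relevance + (1.0 - alpha) * c.authority


def rerank(cands: list[Candidate], alpha: float = 0.7) -> list[Candidate]:
    """Return candidates sorted by blended score, highest first."""
    return sorted(cands, key=lambda c: blend(c, alpha), reverse=True)


# Made-up scores for illustration; none of these come from the paper.
candidates = [
    Candidate("forum-post", relevance=0.92, authority=0.20),
    Candidate("hospital-guide", relevance=0.85, authority=0.95),
    Candidate("news-article", relevance=0.60, authority=0.70),
]

ranked_ids = [c.doc_id for c in rerank(candidates)]
relevance_only_ids = [c.doc_id for c in rerank(candidates, alpha=1.0)]
```

With `alpha=1.0` the ranking reduces to relevance alone and the low-authority forum post comes first. With the assumed default of 0.7, the slightly less relevant but far more authoritative page moves to the top.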
