Aligning Large Language Models with Searcher Preferences
Wei Wu, Peilun Zhou, Liyi Chen, Qimeng Wang, Chengqiang Lu, Yan Gao, Yi Wu, Yao Hu, and Hui Xiong

TL;DR
This paper introduces SearchLLM, a large language model designed for open-ended generative search, utilizing a hierarchical reward system and novel training strategies to improve answer quality, safety, and user alignment.
Contribution
The work presents the first LLM tailored for open-ended generative search, with a multi-dimensional reward system and a new training method called Gated Aggregation Strategy.
Findings
Improved generation quality and user engagement in online tests.
Increased Valid Consumption Rate by 1.03%.
Reduced Re-search Rate by 2.81%.
Abstract
The paradigm shift from item-centric ranking to answer-centric synthesis is redefining the role of search engines. While recent industrial progress has applied generative techniques to closed-set item ranking in e-commerce, research and deployment of open-ended generative search on large content platforms remain limited. This setting introduces challenges, including robustness to noisy retrieval, non-negotiable safety guarantees, and alignment with diverse user needs. In this work, we introduce SearchLLM, the first large language model (LLM) for open-ended generative search. We design a hierarchical, multi-dimensional reward system that separates bottom-line constraints, including factual grounding, basic answer quality and format compliance, from behavior optimization objectives that promote robustness to noisy retrieval and alignment with user needs. Concretely, our reward model…
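The abstract excerpt is cut off before it specifies the reward model, so the following is only a minimal sketch of one way a hierarchical reward could separate bottom-line constraints from behavior objectives. It assumes the constraints act as hard gates and the behavior objectives are combined by a weighted average. The function name `gated_reward`, the threshold, the penalty value, and all score keys are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RewardConfig:
    # Hypothetical settings; the paper's actual values are not given in the excerpt.
    gate_threshold: float = 0.5   # minimum score for a bottom-line constraint to pass
    failure_reward: float = -1.0  # fixed reward when any constraint fails


def gated_reward(
    constraint_scores: Mapping[str, float],
    behavior_scores: Mapping[str, float],
    behavior_weights: Mapping[str, float],
    config: RewardConfig = RewardConfig(),
) -> float:
    """Combine rewards in two tiers (an assumed interpretation, not the paper's method).

    Tier 1: bottom-line constraints (e.g. factual grounding, basic answer
    quality, format compliance) must all reach the gate threshold.
    Tier 2: behavior objectives are averaged by weight only if tier 1 passes.
    """
    # Any failed constraint short-circuits to the penalty, so behavior
    # scores cannot compensate for a bottom-line violation.
    if any(score < config.gate_threshold for score in constraint_scores.values()):
        return config.failure_reward

    total_weight = sum(behavior_weights[name] for name in behavior_scores)
    if total_weight == 0:
        return 0.0
    weighted = sum(behavior_weights[name] * score for name, score in behavior_scores.items())
    return weighted / total_weight


if __name__ == "__main__":
    weights = {"noise_robustness": 1.0, "user_alignment": 1.0}
    behavior = {"noise_robustness": 0.6, "user_alignment": 0.8}
    passing = {"factual_grounding": 0.9, "answer_quality": 0.8, "format_compliance": 1.0}
    failing = {**passing, "factual_grounding": 0.2}
    print(gated_reward(passing, behavior, weights))
    print(gated_reward(failing, behavior, weights))
```

The design choice illustrated here is that a gate, unlike a plain weighted sum, prevents a high behavior score from offsetting a safety or grounding failure.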
Taxonomy
Topics: Information Retrieval and Search Behavior · Expert finding and Q&A systems · Topic Modeling
