Controlling Output Rankings in Generative Engines for LLM-based Search
Haibo Jin, Ruoxi Chen, Peiyan Zhang, Yifeng Luo, Huimin Zeng, Man Luo, Haohan Wang

TL;DR
This paper introduces CORE, a method to control output rankings in LLM-based search engines by appending optimized content, significantly improving the visibility of targeted products across multiple LLMs and categories.
Contribution
CORE is a novel optimization approach that influences LLM output rankings through content augmentation, addressing the black-box nature of search engines and enhancing fairness for small businesses.
Findings
CORE achieves over 91% promotion success at Top-5.
The method outperforms existing ranking manipulation techniques.
Optimized content maintains fluency while improving rankings.
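As a rough illustration of what a "promotion success at Top-5" figure measures, the sketch below computes the fraction of trials in which a target product lands in the top K positions of the LLM's output ranking. This definition is an assumption made for illustration; the summary does not state the paper's exact formula, and the function name `promotion_success_rate` and the product IDs are hypothetical.

```python
from typing import Sequence


def promotion_success_rate(
    rankings: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 5,
) -> float:
    """Fraction of trials whose target item appears in the top-k positions.

    ASSUMPTION: this is an illustrative definition of "promotion success
    at Top-K", not necessarily the metric used in the paper.
    """
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not rankings:
        return 0.0
    # A trial counts as a success when the target is among the first k items.
    hits = sum(
        1 for ranking, target in zip(rankings, targets) if target in ranking[:k]
    )
    return hits / len(rankings)


# Hypothetical output rankings (one list per trial) and their target products.
example_rankings = [
    ["p1", "p2", "p3", "p4", "p5", "p6"],
    ["p6", "p1", "p2", "p3", "p4", "p5"],
    ["p2", "p6", "p1", "p3", "p4", "p5"],
    ["p1", "p2", "p3", "p4", "p5", "p6"],
]
example_targets = ["p3", "p5", "p6", "p1"]

psr_at_5 = promotion_success_rate(example_rankings, example_targets, k=5)
psr_at_1 = promotion_success_rate(example_rankings, example_targets, k=1)
```

In this toy data the second trial leaves its target in sixth place, so three of the four trials succeed at K=5 and only the last one succeeds at K=1.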
Abstract
The way customers search for and choose products is changing with the rise of large language models (LLMs). LLM-based search, or generative engines, provides direct product recommendations to users, rather than traditional online search results that require users to explore options themselves. However, these recommendations are strongly influenced by the initial retrieval order of LLMs, which disadvantages small businesses and independent creators by limiting their visibility. In this work, we propose CORE, an optimization method that Controls Output Rankings in gEnerative Engines for LLM-based search. Since the LLM's interactions with the search engine are black-box, CORE targets the content returned by search engines as the primary means of influencing output rankings. Specifically, CORE optimizes retrieved content by appending strategically…
Peer Reviews
Decision · Submitted to ICLR 2026
* **Original Problem Formulation:** CORE addresses output ranking manipulation at the LLM *synthesis stage*, distinct from traditional SEO or GEO which focus on retrieval, overcoming the limitation of fixed search engine choice.
* **Realistic Threat Model:** The methodology operates successfully under the demanding *black-box assumption*, without requiring access to model internals or gradients.
* **High Effectiveness:** CORE achieved high promotion success rates, including an average of *
The primary weakness of the paper lies in the **practical constraints** and **fragility** of the proposed optimization strategies, particularly in a truly *black-box* environment, coupled with a lack of discussion regarding **mitigation and defense**.
* **Reliance on Alignment:** Optimal performance (PSR) in the *query-based black-box solution* requires the Generator and Optimizer to be the *same model* as the target synthesizing LLM, suggesting limited robustness if the target LLM is truly
- S1: The introduction of AmazonCOREBench provides a reusable benchmark that enhances reproducibility and future comparative studies.
- S2: The reasoning-based and review-based optimization strategies are thoughtfully designed and show realistic manipulation behavior.
- S3: The experimental setup is comprehensive, covering multiple model families and 15 product categories, which strengthens the empirical evidence.
- S4: The sensitivity-to-insertion-order experiment (Section 4.5) is particular
- W1: The experimental setting seems to be unrealistic. It assumes a single target product is pre-specified and optimized to improve its rank, while all other products remain static. In real generative search, users do not know which item they want to promote.
- W2: All reported improvements are relative to retrieval order, not to any true relevance judgment. The experiments never verify whether the promoted item is actually better or more relevant.
- W3: The optimized outputs have worse fluen
1. This paper tries to address the problem of generative engine optimization (GEO), which is relevant and practical at the intersection of LLMs and information retrieval. As generative engines become more and more popular, understanding and resolving GEO is crucial, much as SEO is for conventional search engines.
2. The experimental results are impressive. They show that the final ranking list produced by LLMs is heavily influenced by the initial order of retrieved results and CORE achi
1. The novelty is limited. The shadow-model optimization method seems to be a direct application of black-box adversarial attack techniques, and the query-based optimization method seems to be an iterative prompt engineering framework. While effectively applied, the underlying mechanism or learning paradigm is not novel.
2. It lacks theoretical insights. Theoretical analysis or explanations are needed for understanding the phenomenon that the final ranking list produced by LLMs is heavily influenced by the
Taxonomy
Topics: Information Retrieval and Search Behavior · Natural Language Processing Techniques · Topic Modeling
