LERA: LLM-Enhanced RAG for Ad Auction in Generative Chatbots
Haoran Sun, Xinrui Song, Xinyu Zhang, Zhaohua Chen, Xu Chu, Zhilin Zhang, Chuan Yu, Jian Xu, Bo Zheng, Xiaotie Deng

TL;DR
LERA is a two-stage auction framework for LLM chatbots that combines embedding-based filtering with LLM-based scoring to improve ad relevance and diversity while keeping bidding truthful.
Contribution
The paper presents LERA, a novel retrieve-then-generate auction method that enhances ad selection accuracy and diversity in LLM chatbots through a two-stage scoring and payment mechanism.
Findings
LERA improves ad relevance and diversity compared with retrieval based solely on embedding similarity.
The framework maintains truthfulness for advertisers.
Experiments show controllable latency overhead.
Abstract
The integration of advertising auction mechanisms into large language model (LLM)-based chatbots presents a significant opportunity for commercialization, yet poses unique challenges in balancing relevance, efficiency, and user experience. Recently, Feizi et al. (2023) and Hajiaghayi et al. (2024) outlined a retrieve-then-generate paradigm that decouples retrieval and generation, offering lightweight ad insertion and payment determination. However, current retrieval relies solely on text embedding similarity, which may lead to commercial misinterpretation and issues such as repetitive insertions. In this paper, we propose LERA, a two-stage retrieve-then-generate auction framework tailored for LLM chatbots. In the first stage, embedding-based coarse filtering pre-selects a small set of candidate advertisers. In the second stage, the LLM itself is…
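The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`coarse_filter`, `rerank`), the toy two-dimensional embeddings, and the stub scores standing in for the LLM's second-stage judgment are all assumptions made for illustration, and the payment rule is omitted because the text above does not specify it.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def coarse_filter(query_vec: Sequence[float],
                  ad_vecs: Dict[str, Sequence[float]],
                  k: int) -> List[str]:
    """Stage 1 (assumed form): keep the k ads most similar to the query embedding."""
    ranked = sorted(ad_vecs,
                    key=lambda ad: cosine(query_vec, ad_vecs[ad]),
                    reverse=True)
    return ranked[:k]


def rerank(candidates: List[str],
           score_fn: Callable[[str], float]) -> List[str]:
    """Stage 2 (assumed form): order the surviving candidates by a finer score.

    score_fn is a hypothetical placeholder for whatever score the LLM produces.
    """
    return sorted(candidates, key=score_fn, reverse=True)


# Toy data: hypothetical embeddings and hypothetical second-stage scores.
query = [1.0, 0.0]
ads = {"hotel": [0.9, 0.1], "flights": [0.7, 0.7], "pizza": [0.0, 1.0]}
stub_scores = {"hotel": 0.4, "flights": 0.8, "pizza": 0.9}

candidates = coarse_filter(query, ads, k=2)
ranking = rerank(candidates, lambda ad: stub_scores[ad])
print(candidates, ranking)
```

In this toy run, "pizza" never reaches the second stage despite its high stub score, which shows why the coarse filter keeps the more expensive scoring step limited to a small candidate set.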
