LERA: LLM-Enhanced RAG for Ad Auction in Generative Chatbots

Haoran Sun; Xinrui Song; Xinyu Zhang; Zhaohua Chen; Xu Chu; Zhilin Zhang; Chuan Yu; Jian Xu; Bo Zheng; Xiaotie Deng

arXiv:2605.16474·cs.IR·May 19, 2026

TL;DR

LERA is a two-stage auction framework for LLM chatbots that combines embedding-based coarse filtering with LLM-based scoring to improve ad relevance and diversity while keeping bidding truthful for advertisers.

Contribution

The paper presents LERA, a novel retrieve-then-generate auction method that enhances ad selection accuracy and diversity in LLM chatbots through a two-stage scoring and payment mechanism.

Findings

01

LERA significantly improves ad relevance and diversity.

02

The framework maintains truthfulness for advertisers.

03

Experiments show controllable latency overhead.

Abstract

The integration of advertising auction mechanisms into large language model (LLM)-based chatbots presents a significant opportunity for commercialization, yet poses unique challenges in balancing relevance, efficiency, and user experience. Recently, Feizi et al. (2023) and Hajiaghayi et al. (2024) outlined a retrieve-then-generate paradigm that decouples retrieval and generation, offering lightweight ad insertion and payment determination. However, current retrieval relies solely on text embedding similarity, which may lead to commercial misinterpretation and issues such as repetitive insertions. In this paper, we propose LERA, a two-stage retrieve-then-generate auction framework tailored for LLM chatbots. In the first stage, embedding-based coarse filtering pre-selects a small set of candidate advertisers. In the second stage, the LLM itself is…
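The first-stage coarse filter described in the abstract can be illustrated with a minimal sketch. It assumes that the query and each ad are already represented as embedding vectors and that candidates are the top-k ads by cosine similarity. The function names (`cosine_similarity`, `coarse_filter`), the parameter `k`, and the toy vectors are hypothetical and are not taken from the paper. Bids, the second-stage LLM scoring, and the payment rule are not specified in this excerpt and are deliberately left out.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coarse_filter(
    query_vec: Sequence[float],
    ad_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k ads most similar to the query as (ad_id, similarity) pairs.

    Assumption: top-k by cosine similarity stands in for the paper's
    coarse filter; the actual selection rule is not given in this excerpt.
    """
    scored = [
        (ad_id, cosine_similarity(query_vec, vec))
        for ad_id, vec in ad_vecs.items()
    ]
    # Sort by descending similarity; break ties by ad_id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, purely illustrative.
    query = [1.0, 0.0]
    ads = {"ad_a": [1.0, 0.0], "ad_b": [0.0, 1.0], "ad_c": [1.0, 1.0]}
    for ad_id, sim in coarse_filter(query, ads, k=2):
        print(f"{ad_id}: {sim:.3f}")
```

In this sketch the short list returned by `coarse_filter` is what a second stage would then re-score; keeping `k` small is what would bound the cost of any per-candidate LLM call.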
