NaiAD: Initiate Data-Driven Research for LLM Advertising
Yihang Zhang, Zimeng Huang, Ren Zhai, Yipeng Kang, Tonghan Wang

TL;DR
NaiAD is a dataset and evaluation framework for LLM-native advertising, built to help balance user experience against platform revenue through structurally diverse response generation and calibrated score labels.
Contribution
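This work presents NaiAD, which the authors describe as the first comprehensive dataset for LLM-native advertising (58,999 ad-embedded responses paired with user queries), together with a decoupled generation pipeline and a VC-PPI calibration framework that aligns automated scores with human annotations.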
Findings
Models trained on NaiAD improve user and commercial utility.
Decoupled generation produces diverse, structurally different responses.
Calibration aligns automated scores with human judgments.
Abstract
Reconciling platform revenue with user experience in LLM advertising motivates a data-centric foundation. We introduce NaiAD, the first comprehensive dataset for LLM-native advertising comprising 58,999 carefully constructed ad-embedded responses paired with user queries. NaiAD is organized around theoretically grounded evaluation metrics that separately and comprehensively capture user and commercial utility. To mitigate the dimensional collinearity of aligned LLMs, we propose a decoupled generation pipeline that produces structurally diverse samples, ranging from responses that explicitly disentangle stakeholder utilities to responses that are uniformly strong or weak across dimensions. We further provide score labels calibrated by a Variance-Calibrated Prediction-Powered Inference (VC-PPI) framework, aligning automated scoring with human annotations. Mechanistic analyses reveal that…
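To illustrate the calibration idea mentioned in the abstract, here is a minimal sketch of the standard prediction-powered inference (PPI) mean estimator, which corrects the average automated score on a large unannotated set using the automated scorer's average error on a small human-annotated subset. This is not the paper's VC-PPI method, whose variance-calibration step is not described in the text above. The function name `ppi_mean`, its parameters, and the toy scores are all hypothetical.

```python
from statistics import fmean


def ppi_mean(human_labeled, auto_labeled, auto_unlabeled, lam=1.0):
    """Prediction-powered estimate of the mean human score.

    human_labeled:  human scores on the small annotated subset
    auto_labeled:   automated scores on that same subset, in the same order
    auto_unlabeled: automated scores on the large unannotated set
    lam:            weight on the automated scores (assumption: 1.0 gives
                    classic PPI, 0.0 falls back to the human-only mean)
    """
    if len(human_labeled) != len(auto_labeled):
        raise ValueError("human and automated scores must be paired")
    if not human_labeled or not auto_unlabeled:
        raise ValueError("both the labeled and unlabeled sets must be non-empty")
    # Human mean, shifted by how much the automated scorer's average differs
    # between the unannotated set and the annotated subset.
    correction = fmean(auto_unlabeled) - fmean(auto_labeled)
    return fmean(human_labeled) + lam * correction


# Toy data (made up): the automated scorer over-scores by 0.5 on the annotated subset.
human = [4.0, 3.0, 5.0]
auto_on_human_subset = [4.5, 3.5, 5.5]
auto_on_rest = [4.5, 2.5, 3.5, 5.5]

estimate = ppi_mean(human, auto_on_human_subset, auto_on_rest)
print(estimate)  # 3.5: the raw automated mean of 4.0, corrected by the 0.5 bias
```

With `lam=1.0` the estimate equals the automated mean on the unannotated set plus the mean of (human minus automated) on the annotated subset, so a systematic bias in the automated scorer is removed as long as it is similar across both sets.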
