Optimal Return-to-Go Guided Decision Transformer for Auto-Bidding in Advertisement

Hao Jiang; Yongxiang Tang; Yanxiang Zeng; Pengjia Yuan; Yanhua Cheng; Teng Sha; Xialong Liu; Peng Jiang

arXiv:2506.21956 · cs.LG · June 30, 2025

TL;DR

This paper introduces the R* Decision Transformer, a novel approach for auto-bidding in online advertising that improves long-term decision-making by forecasting optimal return-to-go values and enhancing training data quality.

Contribution

The paper proposes the R* Decision Transformer, which forecasts maximum return-to-go values and uses data augmentation to improve auto-bidding policies in advertising.

Findings

01. R* DT outperforms traditional DT in auto-bidding tasks.

02. Forecasting RTG enhances long-term decision quality.

03. Data augmentation leads to more robust bidding policies.

Abstract

In online advertising, advertisers take part in ad auctions to obtain advertising slots, frequently relying on auto-bidding tools provided by demand-side platforms. To improve the automation of these bidding systems, we adopt generative models, namely the Decision Transformer (DT), to tackle the difficulties inherent in automated bidding. Applying the Decision Transformer to the auto-bidding task enables a unified approach to sequential modeling, which efficiently overcomes short-sightedness by capturing long-term dependencies between past bidding actions and user behavior. Nevertheless, conventional DT has certain drawbacks: (1) DT requires a preset return-to-go (RTG) value before generating actions, and this value is not inherently produced; (2) the policy learned by DT is restricted by its training data, which consists of mixed-quality trajectories. To address these…
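To make the first drawback concrete, here is a minimal sketch of the return-to-go quantity that a Decision Transformer is conditioned on. It assumes the standard undiscounted definition, where the RTG at step t is the sum of rewards from t to the end of the episode. The function name `returns_to_go` and the reward values are hypothetical and are not taken from the paper.

```python
from typing import List


def returns_to_go(rewards: List[float]) -> List[float]:
    """Undiscounted return-to-go per step: rtg[t] = sum(rewards[t:])."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


# Hypothetical per-step rewards for one bidding episode (illustrative only).
rewards = [1.0, 0.0, 2.0, 3.0]
print(returns_to_go(rewards))  # [6.0, 5.0, 5.0, 3.0]
```

During training these values can be computed from logged trajectories, but at inference time the episode's future rewards are unknown, so the initial RTG has to be supplied from outside the model. That is the preset value the abstract refers to.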
