EBaReT: Expert-guided Bag Reward Transformer for Auto Bidding
Kaiyuan Li, Pengyu Wang, Yunshan Peng, Pengjia Yuan, Yanxiang Zeng, Rui Xiang, Yanhua Cheng, Xialong Liu, Peng Jiang

TL;DR
This paper introduces EBaReT, a transformer-based model for automated bidding that uses expert trajectories, positive-unlabeled (PU) learning, and a bag reward strategy to improve decision quality and reward stability in reinforcement learning settings.
Contribution
The paper proposes EBaReT, a new expert-guided transformer model that addresses data quality and reward uncertainty in automated bidding through expert trajectories and a bag reward approach.
Findings
Outperforms state-of-the-art bidding methods in experiments.
Effectively mitigates data quality issues with expert trajectories.
Achieves smoother reward acquisition and better decision quality.
Abstract
Reinforcement learning has been widely applied in automated bidding. Traditional approaches model bidding as a Markov Decision Process (MDP). Recently, some studies have explored using generative reinforcement learning methods to address long-term dependency issues in bidding environments. Although effective, these methods typically rely on supervised learning approaches, which are vulnerable to low data quality due to the large number of sub-optimal bids and to low-probability rewards resulting from low click and conversion rates. Unfortunately, few studies have addressed these challenges. In this paper, we formalize automated bidding as a sequential decision-making problem and propose a novel Expert-guided Bag Reward Transformer (EBaReT) to address concerns related to data quality and reward uncertainty. Specifically, to tackle data quality issues, we generate a set of expert…
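The abstract attributes reward uncertainty to low click and conversion rates, which make per-step rewards sparse. The sketch below illustrates that sparsity and one possible way to read "bag reward": summing rewards over a window of consecutive steps. This reading is an assumption made for illustration only, since the truncated abstract does not define the bag construction; the names `bag_rewards`, `nonzero_fraction`, and `bag_size`, the 2% conversion rate, and the window length are all hypothetical and not taken from the paper.

```python
import random


def bag_rewards(rewards, bag_size):
    """Sum per-step rewards over consecutive, non-overlapping windows ("bags").

    Assumption: a "bag" is a fixed-length window of steps. The paper may
    define bags differently; this is only an illustrative reading.
    """
    if bag_size <= 0:
        raise ValueError("bag_size must be positive")
    return [sum(rewards[i:i + bag_size]) for i in range(0, len(rewards), bag_size)]


def nonzero_fraction(values):
    """Fraction of entries that carry a non-zero reward signal."""
    if not values:
        return 0.0
    return sum(1 for v in values if v != 0) / len(values)


# Hypothetical sparse reward stream: each step converts with probability 2%.
rng = random.Random(0)
step_rewards = [1.0 if rng.random() < 0.02 else 0.0 for _ in range(1000)]

# Aggregate into bags of 50 steps (an arbitrary illustrative choice).
bags = bag_rewards(step_rewards, bag_size=50)

print("non-zero fraction per step:", nonzero_fraction(step_rewards))
print("non-zero fraction per bag: ", nonzero_fraction(bags))
```

Aggregation keeps the total reward unchanged while making the signal denser: the fraction of non-zero bags can never be lower than the fraction of non-zero steps. The cost is coarser credit assignment, since a bag-level reward no longer says which step inside the bag earned it.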
Taxonomy
Topics: Ergonomics and Musculoskeletal Disorders · Color perception and design · Emotion and Mood Recognition
