MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding
Haolin Zhou, Chaoqi Yang, Xiaofeng Gao, Qiong Chen, Gongshen Liu, and Guihai Chen

TL;DR
MoTiAC is a reinforcement learning algorithm that optimizes real-time bidding strategies by balancing multiple objectives, demonstrating convergence to Pareto optimality and outperforming recent methods on real-world data.
Contribution
Introduces MoTiAC, a multi-objective actor-critic RL framework for RTB, capable of handling multiple goals simultaneously with proven convergence to Pareto optimality.
Findings
MoTiAC outperforms recent approaches on a real-world Tencent dataset
The model provably converges to Pareto-optimal solutions
Effective in complex multi-objective bidding environments
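To make the Pareto-optimality finding concrete, the following minimal sketch filters a set of candidate outcomes down to its Pareto front. It is an illustration only and not taken from the paper: the names `dominates` and `pareto_front` are hypothetical, every objective is assumed to be maximized, and the example scores are made up.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective 1, objective 2) scores for four bidding policies.
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0)]
front = pareto_front(candidates)
# (1.5, 3.0) is dropped because (2.0, 4.0) beats it on both objectives.
```

A policy is Pareto optimal when no objective can be improved without worsening another, which is why three of the four candidates above survive even though none of them is best on both objectives.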
Abstract
Online Real-Time Bidding (RTB) is a complex auction game in which advertisers compete to bid for ad impressions whenever a user request occurs. Considering display cost, Return on Investment (ROI), and other influential Key Performance Indicators (KPIs), large ad platforms try to balance the trade-off among various goals dynamically. To address this challenge, we propose a Multi-ObjecTive Actor-Critics algorithm based on reinforcement learning (RL), named MoTiAC, for the problem of bidding optimization with various goals. In MoTiAC, objective-specific agents update the global network asynchronously with different goals and perspectives, leading to a robust bidding policy. Unlike previous RL models, the proposed MoTiAC can simultaneously fulfill multi-objective tasks in complicated bidding environments. In addition, we mathematically prove that our model will converge to Pareto…
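The idea of objective-specific agents updating one global network can be sketched with a toy example. This is not the paper's implementation: it replaces the actor-critic networks with a plain parameter list, uses hypothetical quadratic rewards (each agent wants the parameters close to its own target), and imitates asynchrony by letting a randomly chosen agent apply its update at each step. All names (`make_objective_grad`, `train_shared_policy`, `targets`) are assumptions made for illustration.

```python
import random
from typing import Callable, List

Vector = List[float]


def make_objective_grad(target: Vector) -> Callable[[Vector], Vector]:
    """Gradient of the toy reward -(theta - target)^2 for one objective."""
    def grad(theta: Vector) -> Vector:
        return [-2.0 * (t - g) for t, g in zip(theta, target)]
    return grad


def train_shared_policy(targets: List[Vector], steps: int = 2000,
                        lr: float = 0.01, seed: int = 0) -> Vector:
    """Let objective-specific agents update one shared parameter vector."""
    rng = random.Random(seed)
    theta = [0.0] * len(targets[0])            # shared "global network"
    grads = [make_objective_grad(t) for t in targets]
    for _ in range(steps):
        k = rng.randrange(len(grads))          # whichever agent reports next
        g = grads[k](theta)                    # gradient for that agent's goal only
        theta = [t + lr * gi for t, gi in zip(theta, g)]  # gradient ascent step
    return theta


# Two hypothetical objectives pulling the parameters in different directions.
theta = train_shared_policy([[1.0, 0.0], [0.0, 1.0]])
```

In this toy setting the shared parameters settle between the two targets rather than at either one, which is the kind of compromise a multi-objective policy has to find. How MoTiAC weights or schedules the agents is not stated in the text above, so the uniform random choice here is purely a placeholder.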
Taxonomy
Topics: Auction Theory and Applications · Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
