Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games
Jing Tan, Ramin Khalili, Holger Karl

TL;DR
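This paper introduces a multi-agent reinforcement learning algorithm for repeated auction games that balances short-term reward against sparse, delayed long-term reward. In simulations it outperforms two benchmark algorithms in direct competition, and a suitable long-term reward signal steers its aggressive bidding toward higher social welfare as well.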
Contribution
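The paper presents a distributed multi-agent RL algorithm that learns from partial information in a dynamic environment and weighs short-term reward against sparse, delayed long-term reward. In two simulated auction games it outperforms two benchmark algorithms in direct competition.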
Findings
Outperforms two benchmark algorithms in direct competition in two simulated auction games, at a cost to social welfare
Long-term reward signals can guide aggressive strategies to benefit social welfare
Algorithm balances individual payoff with overall social welfare
Abstract
We propose a multi-agent distributed reinforcement learning algorithm that balances potentially conflicting short-term reward against sparse, delayed long-term reward, and learns with partial information in a dynamic environment. We compare different long-term rewards to incentivize the algorithm to maximize individual payoff and overall social welfare. We test the algorithm in two simulated auction games, and demonstrate that 1) our algorithm outperforms two benchmark algorithms in a direct competition, at a cost to social welfare, and 2) our algorithm's aggressive competitive behavior can be guided with the long-term reward signal to maximize both individual payoff and overall social welfare.
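To make the trade-off in the abstract concrete, here is a minimal sketch of how a dense short-term reward and a sparse, delayed long-term reward could be combined into one learning signal. This is not the authors' algorithm: the linear mixing rule, the function name mixed_returns, and the parameters weight and gamma are all assumptions chosen for illustration.

```python
from typing import List


def mixed_returns(short_term: List[float],
                  long_term: List[float],
                  weight: float = 0.5,
                  gamma: float = 0.9) -> List[float]:
    """Discounted returns of a weighted mix of two reward streams.

    Hypothetical illustration only: the linear mix and both parameters
    are assumptions, not taken from the paper.
    """
    if len(short_term) != len(long_term):
        raise ValueError("reward streams must have equal length")
    returns = [0.0] * len(short_term)
    running = 0.0
    # Walk backwards so each step's return includes all later rewards.
    for t in reversed(range(len(short_term))):
        reward = (1.0 - weight) * short_term[t] + weight * long_term[t]
        running = reward + gamma * running
        returns[t] = running
    return returns


# Dense per-auction payoff versus a long-term signal that only arrives
# at the final step of the episode (sparse and delayed).
short = [1.0, 1.0, 1.0, 1.0]
sparse_long = [0.0, 0.0, 0.0, 10.0]
print(mixed_returns(short, sparse_long, weight=0.0))
print(mixed_returns(short, sparse_long, weight=1.0))
```

With weight set to 0 the agent sees only the immediate payoff of each auction round; with weight set to 1 the early steps receive credit only through discounting of the final signal, which is one way to see why sparse, delayed rewards are harder to learn from.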
Taxonomy
Topics: Auction Theory and Applications · Experimental Behavioral Economics Studies
