Graph-Attentive MAPPO for Dynamic Retail Pricing
Krishna Kumar Neelakanta Pillai Santha Kumari Amma

TL;DR
This paper systematically evaluates multi-agent reinforcement learning for retail pricing, demonstrating that graph-attention mechanisms improve performance and stability in multi-product dynamic pricing scenarios.
Contribution
It introduces a graph-attention-augmented MAPPO variant that leverages product interactions, enhancing multi-product pricing strategies over standard MAPPO.
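To make the idea of sharing information across products concrete, here is a minimal sketch of a single-head, GAT-style attention layer over a product graph, written in NumPy. It is an illustration under stated assumptions, not the paper's implementation: this summary does not give the architecture, so the function name `graph_attention`, the parameter shapes, the toy adjacency matrix, and the random features are all hypothetical.

```python
import numpy as np

def graph_attention(features, adjacency, weight, attn_vec, negative_slope=0.2):
    """Single-head GAT-style layer (illustrative; all names are hypothetical).

    features:  (n, d_in) per-product observations
    adjacency: (n, n) nonzero where product j may inform product i
    weight:    (d_in, d_out) shared linear projection
    attn_vec:  (2 * d_out,) attention parameters
    Returns (embeddings, alpha) where each row of alpha sums to 1.
    """
    n = features.shape[0]
    projected = features @ weight                      # (n, d_out)
    d_out = projected.shape[1]
    # a^T [W h_i || W h_j] splits into a source term plus a target term.
    src = projected @ attn_vec[:d_out]                 # (n,)
    dst = projected @ attn_vec[d_out:]                 # (n,)
    scores = src[:, None] + dst[None, :]               # (n, n)
    scores = np.where(scores > 0, scores, negative_slope * scores)  # LeakyReLU
    # Self-loops guarantee every product attends to at least itself.
    mask = (np.asarray(adjacency) > 0) | np.eye(n, dtype=bool)
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over each product's neighbours.
    scores = scores - scores.max(axis=1, keepdims=True)
    unnormalized = np.exp(scores)
    alpha = unnormalized / unnormalized.sum(axis=1, keepdims=True)
    return alpha @ projected, alpha

# Toy example: products 0-1-2 form a chain, product 3 is isolated.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 0],
                      [0, 0, 0, 0]])
weight = rng.normal(size=(3, 2))
attn_vec = rng.normal(size=4)
embeddings, alpha = graph_attention(features, adjacency, weight, attn_vec)
```

In a MAPPO+GAT-style setup, embeddings like these would presumably be fed to each product agent's policy or to the shared critic alongside its own observation. Where exactly the paper injects them is not stated in this summary.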
Findings
MAPPO is a robust foundation for retail price control.
MAPPO+GAT further improves performance by sharing information over the product graph, without inducing excessive price volatility.
Graph-integrated MARL offers scalable, stable solutions for dynamic pricing.
Abstract
Dynamic pricing in retail requires policies that adapt to shifting demand while coordinating decisions across related products. We present a systematic empirical study of multi-agent reinforcement learning for retail price optimization, comparing a strong MAPPO baseline with a graph-attention-augmented variant (MAPPO+GAT) that leverages learned interactions among products. Using a simulated pricing environment derived from real transaction data, we evaluate profit, stability across random seeds, fairness across products, and training efficiency under a standardized evaluation protocol. The results indicate that MAPPO provides a robust and reproducible foundation for portfolio-level price control, and that MAPPO+GAT further enhances performance by sharing information over the product graph without inducing excessive price volatility. These results indicate that graph-integrated MARL…
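The evaluation criteria named in the abstract (profit, stability across random seeds, fairness across products) can be sketched as simple summary statistics. The metric definitions below are assumptions, since this summary does not state the paper's exact formulas: stability is taken as the sample standard deviation of total profit across seeds, and fairness as Jain's index over mean per-product profit, which is only meaningful when profits are non-negative. The numbers in the example are made up for illustration and are not results from the paper.

```python
import numpy as np

def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n when one dominates."""
    values = np.asarray(values, dtype=float)
    return values.sum() ** 2 / (len(values) * np.square(values).sum())

def summarize_runs(profit):
    """Summarize a (n_seeds, n_products) array of per-product evaluation profit.

    Hypothetical helper: the metric choices are assumptions, not the paper's protocol.
    """
    profit = np.asarray(profit, dtype=float)
    total_per_seed = profit.sum(axis=1)
    return {
        "mean_profit": total_per_seed.mean(),         # average portfolio profit
        "seed_std": total_per_seed.std(ddof=1),       # spread across random seeds
        "fairness": jain_index(profit.mean(axis=0)),  # balance across products
    }

# Made-up profits for two seeds and three products.
profit = np.array([[10.0, 10.0, 10.0],
                   [12.0, 12.0, 12.0]])
summary = summarize_runs(profit)
```

Reporting the seed spread next to the mean is one way to make a comparison such as MAPPO versus MAPPO+GAT harder to read too much into when the gap is smaller than the run-to-run variation.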
Taxonomy
Topics: Auction Theory and Applications · Consumer Market Behavior and Pricing · Advanced Bandit Algorithms Research
