Entropy based Independent Learning in Anonymous Multi-Agent Settings
Tanvi Verma, Pradeep Varakantham, Hoong Chuin Lau

TL;DR
This paper introduces an entropy-based independent learning framework for anonymous multi-agent systems, improving revenue and fairness in online service matching scenarios with limited local information.
Contribution
It proposes an approach to independent learning in anonymous multi-agent settings based on the maximum entropy principle, with theoretical justification and empirical validation.
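As a rough illustration of the kind of quantity a maximum entropy principle works with, the sketch below computes the Shannon entropy of how agents are spread across zones, using only per-zone counts. The function name `population_entropy` and the example counts are hypothetical and not taken from the paper, and this is not the authors' learning algorithm; it only shows that an evenly spread population has higher entropy than a crowded one.

```python
import math

def population_entropy(zone_counts):
    """Shannon entropy (in nats) of the agent distribution over zones.

    zone_counts is a hypothetical list of how many agents are in each zone.
    """
    total = sum(zone_counts)
    if total == 0:
        return 0.0  # no agents: define the entropy as zero
    # Skip empty zones, since p * log(p) tends to 0 as p tends to 0.
    probs = [c / total for c in zone_counts if c > 0]
    return -sum(p * math.log(p) for p in probs)

# Twenty agents spread evenly over four zones versus crowded into one zone.
uniform = population_entropy([5, 5, 5, 5])
skewed = population_entropy([17, 1, 1, 1])
print(uniform > skewed)
```

The even spread attains the maximum possible entropy for four zones, `math.log(4)`, while the crowded case scores lower.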
Findings
Significant revenue improvements over existing methods.
Reduced variance in individual agent revenues.
Effective in both simulated and real-world taxi matching problems.
Abstract
Efficient sequential matching of supply and demand is a problem of interest in many online-to-offline services. Examples include Uber, Lyft, and Grab, which match taxis to customers, and Ubereats, Deliveroo, and FoodPanda, which match restaurants to customers. In these online-to-offline service problems, the individuals responsible for supply (e.g., taxi drivers, delivery bikes, or delivery van drivers) earn more by being at the "right" place at the "right" time. We are interested in developing approaches that learn to guide individuals to be in the "right" place at the "right" time (to maximize revenue) in the presence of other similar "learning" individuals, given only local aggregated observations of other agents' states (e.g., only the number of other taxis in the same zone as the current agent). A key characteristic of the domains of interest is that the interactions between individuals are anonymous,…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Reinforcement Learning in Robotics
