Self-Configurable Mesh-Networks for Scalable Distributed Submodular Bandit Optimization
Zirui Xu, Vasileios Tzoumas

TL;DR
This paper introduces a scalable, communication-efficient distributed bandit algorithm for multi-agent submodular optimization that adapts over time to optimize coordination in resource-limited, partially observable environments.
Contribution
It proposes a self-configurable mesh-network approach that limits communication to one hop and optimizes each agent's communication neighborhood online, achieving near-optimal coordination under bandwidth constraints.
Findings
Converges faster than the benchmarks in simulations.
Outperforms benchmarks with prior environmental knowledge.
Maintains positive suboptimality bounds across network topologies.
Abstract
We study how to scale distributed bandit submodular coordination under realistic communication constraints in bandwidth, data rate, and connectivity. We are motivated by multi-agent tasks of active situational awareness in unknown, partially-observable, and resource-limited environments, where the agents must coordinate through agent-to-agent communication. Our approach enables scalability by (i) limiting information relays to only one-hop communication and (ii) keeping inter-agent messages small, having each agent transmit only its own action information. Despite these information-access restrictions, our approach enables near-optimal action coordination by optimizing the agents' communication neighborhoods over time, through distributed online bandit optimization, subject to the agents' bandwidth constraints. Particularly, our approach enjoys an anytime suboptimality bound that is…
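To illustrate why the choice of one-hop neighborhood matters, the following is a minimal sketch, not the paper's algorithm. It assumes a simple set-coverage objective (a standard submodular function), full knowledge of marginal gains instead of bandit feedback, and a fixed decision order. All names (`coverage`, `one_hop_greedy`, the toy candidate actions) are hypothetical. Each agent sees only the actions of its listed neighbors, and each neighbor shares only its own action.

```python
from typing import Dict, FrozenSet, List

# Hypothetical modeling choice: an action is the set of cells it covers.
Action = FrozenSet[int]


def coverage(actions: List[Action]) -> int:
    """Submodular objective: number of distinct cells covered."""
    covered = set()
    for action in actions:
        covered |= action
    return len(covered)


def marginal_gain(action: Action, context: List[Action]) -> int:
    """Gain of adding `action` on top of the actions in `context`."""
    return coverage(context + [action]) - coverage(context)


def one_hop_greedy(
    candidates: Dict[int, List[Action]],
    neighbors: Dict[int, List[int]],
) -> Dict[int, Action]:
    """Agents decide in index order; each sees only its one-hop
    neighbors' already-chosen actions (assumed full-information gains)."""
    chosen: Dict[int, Action] = {}
    for agent in sorted(candidates):
        context = [chosen[j] for j in neighbors[agent] if j in chosen]
        chosen[agent] = max(
            candidates[agent], key=lambda a: marginal_gain(a, context)
        )
    return chosen


# Toy instance with three agents and two candidate actions each.
candidates = {
    0: [frozenset({1, 2}), frozenset({3})],
    1: [frozenset({1, 2}), frozenset({3, 4})],
    2: [frozenset({3, 4}), frozenset({5})],
}

# Agent 2 listens to nobody versus agent 2 listens to agent 1.
isolated = one_hop_greedy(candidates, {0: [], 1: [0], 2: []})
connected = one_hop_greedy(candidates, {0: [], 1: [0], 2: [1]})

isolated_value = coverage(list(isolated.values()))
connected_value = coverage(list(connected.values()))
print(isolated_value, connected_value)
```

In this toy instance, agent 2 duplicates agent 1's coverage when it has no neighbor, and avoids the overlap once agent 1 is in its neighborhood. The abstract's approach goes further by learning which neighbors to keep, online and under a bandwidth budget, which this sketch does not attempt.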
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Age of Information Optimization
