Online Learning of Strategic Defense against Ecological Adversaries under Partial Observability with Semi-Bandit Feedback
Anjali Purathekandy, Deepak N. Subramani

TL;DR
This paper presents HERDS, an online learning algorithm for adaptive resource allocation against strategic ecological adversaries with unknown behavior, addressing partial observability and non-stationary payoffs in security games.
Contribution
HERDS extends Follow-the-Perturbed-Leader (FPL) with three innovations: simultaneous exploration and exploitation, adaptive payoff estimation, and model-agnostic learning. Together these yield regret guarantees without assumptions about adversary behavior.
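For readers unfamiliar with the base method, the following is a minimal sketch of plain Follow-the-Perturbed-Leader with semi-bandit feedback, not of HERDS itself. It assumes a defender who guards k of n targets each round and observes outcomes only at the guarded targets. The function names (`fpl_select`, `run_fpl`), the stationary `attack_prob` simulator, and the use of raw observed rewards without importance weighting are all illustrative assumptions, not details taken from the paper.

```python
import random


def fpl_select(cum_reward, k, eta, rng):
    """FPL step: perturb each cumulative reward estimate, then guard the top k.

    The perturbation is exponential with rate eta (mean 1/eta); this is a
    common textbook choice and an assumption here, not the paper's scheme.
    """
    perturbed = [r + rng.expovariate(eta) for r in cum_reward]
    ranked = sorted(range(len(cum_reward)), key=lambda i: perturbed[i], reverse=True)
    return sorted(ranked[:k])


def run_fpl(attack_prob, k, rounds, eta=1.0, seed=0):
    """Simulate plain FPL against a hypothetical stationary adversary.

    attack_prob[i] is the (assumed, fixed) chance that target i is attacked
    in a round. Semi-bandit feedback: only guarded targets reveal whether
    an attack occurred, so only their estimates are updated.
    """
    rng = random.Random(seed)
    n = len(attack_prob)
    cum_reward = [0.0] * n   # observed intercepted attacks per target
    guard_counts = [0] * n   # how often each target was guarded
    for _ in range(rounds):
        guarded = fpl_select(cum_reward, k, eta, rng)
        for i in guarded:
            attacked = rng.random() < attack_prob[i]
            if attacked:
                cum_reward[i] += 1.0  # raw observation, no importance weighting
            guard_counts[i] += 1
    return guard_counts


if __name__ == "__main__":
    counts = run_fpl([0.9, 0.1, 0.1, 0.8, 0.05], k=2, rounds=500)
    print(counts)  # number of rounds each target was guarded
```

Because unguarded targets are never observed, this simplified version can lock onto early winners. That exploration gap, together with non-stationary payoffs, is the kind of limitation the innovations listed above are meant to address.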
Findings
HERDS achieves a 15-45% reduction in regret compared to baseline methods.
It reduces crop damage by 40-50% against adaptive adversaries.
It converges in 40-50 rounds, faster than traditional algorithms.
Abstract
We introduce an online learning algorithm for computing adaptive resource allocation policies against strategic ecological adversaries with unknown behavioral models and partial observability. Our setting addresses a fundamental limitation of security games: when adversary behavior cannot be modeled a priori, classical equilibrium-based approaches fail. We formulate the problem as regret minimization in a combinatorial action space with semi-bandit feedback, where payoffs are non-stationary and interdependent across targets. Our algorithm, named HERDS (Human-Elephant conflict mitigation through Resource Deployment for Strategic guarding), extends Follow-the-Perturbed-Leader (FPL) with three innovations: (1) simultaneous exploration-exploitation through dynamic budget partitioning driven by observed losses, (2) adaptive payoff estimation under confounded observations where attack entry…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Guidance and Control Systems
