Robust Pareto Set Identification with Contaminated Bandit Feedback
İlter Onat Korkmaz, Efe Eren Ceyani, Kerem Bozgan, Cem Tekin

TL;DR
This paper introduces a robust algorithm for Pareto set identification in multi-objective bandits that remains effective under adversarial reward contamination, with proven sample complexity bounds.
Contribution
It proposes a median-based adaptive elimination algorithm for contaminated MO-MAB, providing theoretical guarantees and empirical validation under adversarial corruptions.
Findings
The median-based method outperforms mean-based approaches under contamination.
Sample complexity increases with contamination probability but recovers standard bounds when contamination is low.
Numerical experiments confirm the robustness and effectiveness of the proposed algorithm.
Abstract
We consider the Pareto set identification (PSI) problem in multi-objective multi-armed bandits (MO-MAB) with contaminated reward observations. At each arm pull, with some fixed probability, the true reward samples are replaced with samples from an arbitrary contamination distribution chosen by an adversary. We consider (α, δ)-PAC PSI and propose a sample median-based multi-objective adaptive elimination algorithm that returns an (α, δ)-PAC Pareto set upon termination, with a sample complexity bound that depends on the contamination probability. As the contamination probability decreases, we recover the well-known sample complexity results in MO-MAB. We compare the proposed algorithm with a mean-based method from the MO-MAB literature, as well as an extended version that uses median estimators, on several PSI problems under adversarial corruptions, including…
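The contrast between mean and median estimators under this contamination model can be illustrated with a small simulation. The sketch below is not the paper's adaptive elimination algorithm and carries no PAC guarantee: it draws a fixed number of samples per arm, then computes a plug-in Pareto set from per-objective estimates. The arm means, the Gaussian noise, the contamination values (±10), and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def pull_arm(true_mean, eps, corrupt_value, rng, noise_sd=0.05):
    """One contaminated pull of a single arm (illustrative model).

    With probability eps the whole reward vector is replaced by an
    adversarial value; otherwise it is Gaussian noise around true_mean.
    """
    if rng.random() < eps:
        return [corrupt_value] * len(true_mean)
    return [rng.gauss(m, noise_sd) for m in true_mean]


def estimate(samples, estimator):
    """Apply a scalar estimator to each objective separately."""
    return [estimator(objective) for objective in zip(*samples)]


def dominates(u, v):
    """u Pareto-dominates v: no worse in every objective, better in one."""
    no_worse = all(a >= b for a, b in zip(u, v))
    better = any(a > b for a, b in zip(u, v))
    return no_worse and better


def pareto_set(estimates):
    """Indices of arms whose estimate is not dominated by another arm."""
    return [
        i for i, u in enumerate(estimates)
        if not any(dominates(v, u) for j, v in enumerate(estimates) if j != i)
    ]


def run(eps=0.2, n=500, seed=0):
    rng = random.Random(seed)
    # Hypothetical two-objective problem: arms 0 and 1 are Pareto optimal,
    # arm 2 is dominated by both.
    true_means = [[0.9, 0.3], [0.3, 0.9], [0.1, 0.1]]
    # Assumed adversary: pushes optimal arms down and the dominated arm up.
    corrupt_values = [-10.0, -10.0, 10.0]

    samples = [
        [pull_arm(mu, eps, c, rng) for _ in range(n)]
        for mu, c in zip(true_means, corrupt_values)
    ]
    mean_est = [estimate(s, statistics.mean) for s in samples]
    median_est = [estimate(s, statistics.median) for s in samples]
    return pareto_set(mean_est), pareto_set(median_est)


if __name__ == "__main__":
    mean_set, median_set = run()
    print("Pareto set from sample means:  ", mean_set)
    print("Pareto set from sample medians:", median_set)
```

With these settings the sample means should be dragged far enough that only the corrupted dominated arm appears Pareto optimal, while the sample medians should still recover arms 0 and 1, since the contamination probability stays below one half. The median is shifted slightly by contamination, which is consistent with a sample complexity that grows with the contamination probability.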
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Control Systems Optimization · Control Systems and Identification
