Robust Contextual Bandit via the Capped-$\ell_{2}$ norm
Feiyun Zhu, Xinliang Zhu, Sheng Wang, Jiawen Yao, Junzhou Huang

TL;DR
This paper introduces a robust actor-critic contextual bandit method that uses the capped-$\ell_{2}$ norm to handle outliers in mHealth interventions, improving decision-making robustness in noisy environments.
Contribution
The paper proposes a novel robust actor-critic algorithm with capped-$\ell_{2}$ norm for outlier resistance, including a method to set its key parameter based on statistical outlier definitions.
Findings
Achieves similar performance to state-of-the-art methods on clean data.
Significantly outperforms existing methods on outlier-contaminated data.
Enhances robustness of mHealth decision-making processes.
Abstract
This paper considers the actor-critic contextual bandit for mobile health (mHealth) interventions. State-of-the-art decision-making methods in mHealth generally assume that the noise in the dynamic system follows a Gaussian distribution. Those methods use least-squares-based algorithms to estimate the expected reward, which makes them sensitive to outliers. To deal with outliers, we propose a novel robust actor-critic contextual bandit method for mHealth interventions. In the critic update, the capped-$\ell_{2}$ norm is used to measure the approximation error, which prevents outliers from dominating the objective. The critic update also yields a set of weights, which are used to form a weighted objective for the actor update. Samples that are badly corrupted by noise in the critic update receive zero weight in the actor update. As a result,…
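To make the capping idea concrete, here is a minimal sketch of a capped-$\ell_{2}$ critic fit for a scalar linear reward model $r \approx w x$. It is an illustration under stated assumptions, not the paper's algorithm: the iteratively reweighted update, the weight rule $u_i = 1/(2|e_i|)$ for residuals below the cap and $u_i = 0$ otherwise, and the names `fit_capped_l2`, `eps`, `zeta`, and `delta` are all hypothetical choices made for this example.

```python
def capped_l2(residual, eps):
    """Capped-l2 loss of a scalar residual: min(|residual|, eps)."""
    return min(abs(residual), eps)


def fit_capped_l2(xs, rs, eps, zeta=1e-6, n_iter=20, delta=1e-8):
    """Fit r ~ w * x by iteratively reweighted ridge regression.

    Assumed scheme (not taken from the paper): samples whose absolute
    residual reaches the cap `eps` get weight 0; the rest get
    1 / (2 * |residual|), floored by `delta` to avoid division by zero.
    """
    weights = [1.0] * len(xs)  # first pass is ordinary ridge regression
    w = 0.0
    for _ in range(n_iter):
        # Weighted ridge solution for a single scalar coefficient.
        num = sum(u * x * r for u, x, r in zip(weights, xs, rs))
        den = sum(u * x * x for u, x in zip(weights, xs)) + zeta
        w = num / den
        # Recompute the weights from the current residuals.
        residuals = [abs(r - w * x) for x, r in zip(xs, rs)]
        weights = [
            1.0 / (2.0 * max(e, delta)) if e < eps else 0.0
            for e in residuals
        ]
    return w, weights


# Toy data: r = 2 * x, with the first reward replaced by an outlier.
xs = [float(x) for x in range(1, 11)]
rs = [2.0 * x for x in xs]
rs[0] = 40.0

# Plain least squares is pulled away from the true slope by the outlier.
w_ls = sum(x * r for x, r in zip(xs, rs)) / sum(x * x for x in xs)

# The capped fit assigns the outlier zero weight and recovers the slope.
w_robust, weights = fit_capped_l2(xs, rs, eps=5.0)
```

In this toy run the returned `weights` play the role the abstract describes: the outlying sample ends with zero weight, so a downstream weighted actor objective would ignore it. The scheme depends on the initial fit leaving the clean samples below the cap; if every residual exceeds `eps`, all weights become zero and the fit degenerates.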
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Sparse and Compressive Sensing Techniques
