Regret Minimization with Performative Feedback
Meena Jagadeesan, Tijana Zrnic, Celestine Mendler-Dünner

TL;DR
This paper introduces an algorithm for performative prediction, the setting in which deploying a model shifts the data distribution. The algorithm keeps regret low by exploiting performative feedback, meaning samples from the distribution that each deployed model induces, and it does not assume convexity.
Contribution
It presents an algorithm whose regret bounds scale with the complexity of the distribution shifts rather than that of the reward function. The algorithm relies only on smoothness of the shifts, explores carefully, and comes with near-optimality guarantees.
Findings
The algorithm's regret scales only with the complexity of the distribution shifts, not with that of the reward function.
It does not require convexity assumptions.
The final model is guaranteed to be near-optimal.
Abstract
In performative prediction, the deployment of a predictive model triggers a shift in the data distribution. As these shifts are typically unknown ahead of time, the learner needs to deploy a model to get feedback about the distribution it induces. We study the problem of finding near-optimal models under performativity while maintaining low regret. On the surface, this problem might seem equivalent to a bandit problem. However, it exhibits a fundamentally richer feedback structure that we refer to as performative feedback: after every deployment, the learner receives samples from the shifted distribution rather than only bandit feedback about the reward. Our main contribution is an algorithm that achieves regret bounds scaling only with the complexity of the distribution shifts and not that of the reward function. The algorithm only relies on smoothness of the shifts and does not assume…
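The difference between bandit feedback and performative feedback described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the Gaussian distribution map `sample_shifted`, the squared loss, and all parameter values (`eps`, `base_mean`) are hypothetical choices made only for illustration. The point is that one deployment yields samples on which the loss of every candidate model can be evaluated, whereas bandit feedback yields a single number for the deployed model.

```python
import numpy as np

def sample_shifted(theta, n, rng, eps=0.5, base_mean=1.0):
    """Hypothetical distribution map D(theta): N(base_mean + eps * theta, 1)."""
    return rng.normal(base_mean + eps * theta, 1.0, size=n)

def loss(z, theta):
    """Illustrative squared loss of model theta on data point(s) z."""
    return (z - theta) ** 2

def bandit_feedback(theta, n, rng):
    """Bandit view: only a scalar loss estimate for the deployed model."""
    z = sample_shifted(theta, n, rng)
    return float(loss(z, theta).mean())

def performative_feedback(theta, n, rng):
    """Performative view: the samples from D(theta) themselves."""
    return sample_shifted(theta, n, rng)

def decoupled_risk(samples, theta_prime):
    """Average loss of theta_prime on samples drawn under another deployment."""
    return float(loss(samples, theta_prime).mean())

rng = np.random.default_rng(0)
deployed = 0.0
samples = performative_feedback(deployed, 5000, rng)

# One deployment gives a loss estimate for every candidate model,
# all measured on the distribution induced by the deployed model.
candidates = np.linspace(-1.0, 3.0, 9)
estimates = {float(t): decoupled_risk(samples, t) for t in candidates}
```

Each value in `estimates` is computed on the distribution induced by `deployed`, not on the distribution the candidate itself would induce, so it is not the candidate's true performative risk. Bounding that gap requires some regularity of the shifts, which is where a smoothness assumption of the kind the abstract mentions would enter.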
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
