Harm Mitigation in Recommender Systems under User Preference Dynamics
Jerry Chee, Shankar Kalyanaraman, Sindhu Kiranmai Ernala, Udi Weinsberg, Sarah Dean, Stratis Ioannidis

TL;DR
This paper develops a framework for recommender systems that balances maximizing user engagement with minimizing exposure to harmful content by modeling user dynamics and proposing optimal recommendation policies.
Contribution
It introduces a model of the interaction between recommendations, user interest evolution, and harm, along with algorithms for finding an optimal recommendation policy once these dynamics reach stationarity.
Findings
Proposed policies outperform baselines in balancing CTR and harm mitigation.
Established conditions for stationary points in user profile dynamics.
Validated approach on semi-synthetic movie recommendation data.
Abstract
We consider a recommender system that takes into account the interplay between recommendations, the evolution of user interests, and harmful content. We model the impact of recommendations on user behavior, particularly the tendency to consume harmful content. We seek recommendation policies that establish a tradeoff between maximizing click-through rate (CTR) and mitigating harm. We establish conditions under which the user profile dynamics have a stationary point, and propose algorithms for finding an optimal recommendation policy at stationarity. We experiment on a semi-synthetic movie recommendation setting initialized with real data and observe that our policies outperform baselines at simultaneously maximizing CTR and mitigating harm.
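The abstract does not give the model's equations, so the following is only a toy sketch of the general idea: iterate a profile update to a stationary point, then score a policy by CTR minus a harm penalty at that point. Everything in it is an assumption made for illustration and not the paper's formulation: the sigmoid click model, the drift rule in `step`, the item embeddings in `ITEMS`, the harm flags in `HARM`, and the parameters `beta` and `lam`.

```python
import math

# Hypothetical toy catalog: 2-d item embeddings and harm flags (illustrative values).
ITEMS = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
HARM = [0.0, 1.0, 0.0]  # item 1 is flagged as harmful


def click_probs(profile):
    """Assumed click model: sigmoid of the profile-item inner product."""
    return [1.0 / (1.0 + math.exp(-(profile[0] * x + profile[1] * y)))
            for x, y in ITEMS]


def step(profile, policy, base, beta=0.3):
    """Assumed drift: mix the base profile with the mean embedding of clicked items."""
    weights = [p * c for p, c in zip(policy, click_probs(profile))]
    total = sum(weights)
    mean = [sum(w * item[k] for w, item in zip(weights, ITEMS)) / total
            for k in range(2)]
    return tuple((1 - beta) * base[k] + beta * mean[k] for k in range(2))


def stationary_profile(policy, base, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration: repeat the update until the profile stops moving."""
    profile = base
    for _ in range(max_iter):
        nxt = step(profile, policy, base)
        if max(abs(a - b) for a, b in zip(nxt, profile)) < tol:
            return nxt
        profile = nxt
    raise RuntimeError("profile dynamics did not converge")


def objective(policy, base, lam):
    """CTR minus lam times harmful-click rate, evaluated at the stationary profile."""
    clicks = click_probs(stationary_profile(policy, base))
    ctr = sum(p * c for p, c in zip(policy, clicks))
    harm = sum(p * c * h for p, c, h in zip(policy, clicks, HARM))
    return ctr - lam * harm
```

In this toy setup, a policy is a probability vector over the three items, for example `objective([0.5, 0.0, 0.5], (0.5, 0.5), lam=1.0)`. Increasing `lam` weights harm more heavily against CTR, which is the tradeoff the abstract describes. The policy search itself is not sketched here.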
Taxonomy
Topics: Advanced Bandit Algorithms Research
