Shapley Value-driven Data Pruning for Recommender Systems

Yansen Zhang, Xiaokun Zhang, Ziqiang Cui, and Chen Ma

arXiv:2505.22057·cs.IR·May 29, 2025

TL;DR

This paper introduces SVV, a Shapley value-based framework for data pruning in recommender systems that evaluates interactions by their actual contribution to training, improving robustness and accuracy.

Contribution

The paper presents a novel, model-driven approach to data denoising using real-time Shapley value estimation, moving beyond heuristic filtering methods.

Findings

1. SVV outperforms existing denoising methods in accuracy.
2. SVV enhances robustness against noisy interactions.
3. The method preserves training-critical interactions.

Abstract

Recommender systems often suffer from noisy interactions like accidental clicks or popularity bias. Existing denoising methods typically identify users' intent in their interactions, and filter out noisy interactions that deviate from the assumed intent. However, they ignore that interactions deemed noisy could still aid model training, while some "clean" interactions offer little learning value. To bridge this gap, we propose Shapley Value-driven Valuation (SVV), a framework that evaluates interactions based on their objective impact on model training rather than subjective intent assumptions. In SVV, a real-time Shapley value estimation method is devised to quantify each interaction's value based on its contribution to reducing training loss. Afterward, SVV highlights the interactions with high values while downplaying low ones to achieve effective data pruning for recommender…
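
To make the idea of valuing an interaction by its contribution to reducing training loss more concrete, here is a minimal sketch of generic permutation-sampling (Monte Carlo) Shapley estimation. It is not the real-time estimator devised in the paper, whose details are not given in this abstract. The names `monte_carlo_shapley`, `toy_utility`, and `GAINS` are hypothetical, and the toy utility simply assumes each interaction reduces the loss by a fixed, additive amount.

```python
import random
from typing import Callable, List, Sequence


def monte_carlo_shapley(
    n_points: int,
    utility: Callable[[Sequence[int]], float],
    n_permutations: int = 200,
    seed: int = 0,
) -> List[float]:
    """Estimate Shapley values by averaging marginal contributions
    over random orderings of the data points (generic technique,
    not the SVV estimator)."""
    rng = random.Random(seed)
    values = [0.0] * n_points
    order = list(range(n_points))
    for _ in range(n_permutations):
        rng.shuffle(order)
        subset: List[int] = []
        prev = utility(subset)  # utility of the empty set
        for i in order:
            subset.append(i)
            cur = utility(subset)
            values[i] += cur - prev  # marginal contribution of point i
            prev = cur
    return [v / n_permutations for v in values]


# Hypothetical per-interaction loss reductions (assumption for illustration):
# positive = helps training, zero = uninformative, negative = harmful.
GAINS = [0.5, 0.3, 0.0, -0.2]


def toy_utility(subset: Sequence[int]) -> float:
    """Toy stand-in for 'loss reduction achieved by training on subset'."""
    return sum(GAINS[i] for i in subset)


shapley = monte_carlo_shapley(len(GAINS), toy_utility)

# Prune interactions whose estimated value is not positive.
keep = [i for i, v in enumerate(shapley) if v > 0.0]
```

Because the toy utility is additive, each estimated value coincides with that interaction's own gain, so the pruning step keeps only the two helpful interactions. With a real recommender, `utility` would need to reflect actual training loss, and evaluating it repeatedly is expensive, which is presumably why the paper devises a real-time estimation method.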

Code & Models

Repositories

Forrest-Stone/SVV (PyTorch, official)

Taxonomy

Methods: Pruning