Understanding Data Importance in Machine Learning Attacks: Does Valuable Data Pose Greater Harm?

Rui Wen; Michael Backes; Yang Zhang

arXiv:2409.03741 · cs.CR · October 1, 2024

TL;DR

This paper investigates how the importance of data samples in machine learning affects their vulnerability to various attacks, revealing that valuable data can be more susceptible and suggesting improved defense strategies.

Contribution

The paper analyzes the link between data importance and attack vulnerability and proposes sample-specific criteria to strengthen membership inference attacks.

Findings

01 High-importance data samples are more vulnerable to certain attacks, such as membership inference

02 Sample-specific criteria can improve membership inference performance

03 Defenses need to balance data utility and security
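
Finding 02 contrasts sample-specific criteria with a single shared decision rule. The sketch below illustrates that general idea for a loss-based membership test: one global threshold versus a per-sample reference loss. It is not the authors' method. The function names, the notion of a "reference loss" (for example, a sample's average loss under models that never saw it), and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def global_threshold_attack(losses, tau):
    """Predict 'member' when the target-model loss is below one shared threshold."""
    return np.asarray(losses) < tau

def sample_specific_attack(losses, reference_losses):
    """Predict 'member' when the loss is below that sample's own reference loss."""
    return np.asarray(losses) < np.asarray(reference_losses)

def accuracy(pred, truth):
    """Fraction of samples whose membership prediction matches the ground truth."""
    return float(np.mean(np.asarray(pred) == np.asarray(truth)))

# Toy, hand-picked values (assumption, not data from the paper):
# sample 0: easy member, sample 1: hard member,
# sample 2: easy non-member, sample 3: hard non-member.
target_losses = np.array([0.05, 0.90, 0.40, 1.50])
reference_losses = np.array([0.30, 2.00, 0.35, 1.40])  # hypothetical per-sample baselines
is_member = np.array([True, True, False, False])

global_pred = global_threshold_attack(target_losses, tau=0.5)
specific_pred = sample_specific_attack(target_losses, reference_losses)

print("global threshold accuracy:", accuracy(global_pred, is_member))
print("sample-specific accuracy:", accuracy(specific_pred, is_member))
```

On these toy numbers the global threshold misclassifies the hard member and the easy non-member, while the per-sample comparison separates all four. Real attacks would estimate the reference values rather than assume them.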

Abstract

Machine learning has revolutionized numerous domains, playing a crucial role in driving advancements and enabling data-centric processes. The significance of data in training models and shaping their performance cannot be overstated. Recent research has highlighted the heterogeneous impact of individual data samples, particularly the presence of valuable data that significantly contributes to the utility and effectiveness of machine learning models. However, a critical question remains unanswered: are these valuable data samples more vulnerable to machine learning attacks? In this work, we investigate the relationship between data importance and machine learning attacks by analyzing five distinct attack types. Our findings reveal notable insights. For example, we observe that high importance data samples exhibit increased vulnerability in certain attacks, such as membership inference…

Code & Models

Repositories

TrustAIRLab/importance-in-mlattacks
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning