LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation

Ljubomir Rokvic; Panayiotis Danassis; Sai Praneeth Karimireddy; Boi Faltings

arXiv:2205.11518 · cs.CR · November 27, 2024

Open Access

TL;DR

This paper introduces 'lazy influence,' a privacy-preserving influence approximation method for data quality evaluation in federated learning, effectively filtering corrupted data while maintaining differential privacy guarantees.

Contribution

It proposes a novel influence approximation technique that enables data valuation in federated learning without compromising privacy, outperforming existing methods.

Findings

01. Achieves over 90% recall in filtering biased data

02. Maintains strong differential privacy with ε ≤ 1

03. Effective in both simulated and real-world settings

Abstract

In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach that utilizes a new influence approximation called "lazy influence" to filter and score data while preserving privacy. To do this, each participant uses their own data to estimate the influence of another participant's batch and sends a differentially private obfuscated score to the central coordinator. Our method has been shown to successfully filter out biased and corrupted data in various simulated and real-world settings, achieving a recall rate of over $90\%$ (sometimes up to $100\%$) while maintaining strong differential privacy guarantees with $\varepsilon \leq 1$.
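
The abstract does not spell out the scoring rule or the obfuscation mechanism, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that a validator's influence estimate is the drop in loss on its own data after the model is briefly updated on another participant's batch, that the validator reports only the sign of that estimate as a binary vote, and that the vote is obfuscated with randomized response. All function names here are hypothetical.

```python
import math
import random


def lazy_influence(loss_before, loss_after):
    """Assumed proxy: how much the validator's own loss dropped after
    the model was briefly updated on another participant's batch."""
    return loss_before - loss_after


def truth_probability(epsilon):
    """Probability of reporting the true bit under randomized response.
    With p = e^eps / (1 + e^eps), the ratio p / (1 - p) equals e^eps,
    which gives eps-differential privacy for the single reported bit."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def private_vote(loss_before, loss_after, epsilon, rng):
    """Return an obfuscated binary vote: True means 'this batch helped'."""
    vote = lazy_influence(loss_before, loss_after) > 0
    if rng.random() < truth_probability(epsilon):
        return vote
    return not vote


def debiased_positive_fraction(votes, epsilon):
    """Coordinator side: estimate the fraction of true positive votes.
    E[observed] = (1 - p) + f * (2p - 1), solved here for f."""
    p = truth_probability(epsilon)
    observed = sum(votes) / len(votes)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical losses: every validator sees its loss fall from 1.0 to 0.4.
    votes = [private_vote(1.0, 0.4, epsilon=1.0, rng=rng) for _ in range(1000)]
    print("estimated positive fraction:", debiased_positive_fraction(votes, 1.0))
```

In this sketch the coordinator never sees a raw loss value, only noisy bits. A batch would then be kept or filtered by comparing the debiased estimate against a threshold, which is a further assumption not taken from the abstract.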

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Distributed Sensor Networks and Detection Algorithms · Traffic Prediction and Management Techniques