BadGD: A unified data-centric framework to identify gradient descent vulnerabilities
Chi-Hua Wang, Guang Cheng

TL;DR
BadGD introduces a comprehensive theoretical framework that reveals how strategic backdoor triggers can exploit gradient descent vulnerabilities, significantly impacting model training and integrity.
Contribution
The paper presents three novel constructs for analyzing gradient descent vulnerabilities and bridges empirical findings with theoretical insights into backdoor attack mechanisms.
Findings
Backdoor triggers can distort empirical risk and gradients.
Gradient descent hyperparameters can be exploited for attacks.
Model integrity can be compromised through these vulnerabilities.
Abstract
We present BadGD, a unified theoretical framework that exposes the vulnerabilities of gradient descent algorithms through strategic backdoor attacks. Backdoor attacks involve embedding malicious triggers into a training dataset to disrupt the model's learning process. Our framework introduces three novel constructs: Max RiskWarp Trigger, Max GradWarp Trigger, and Max GradDistWarp Trigger, each designed to exploit specific aspects of gradient descent by distorting empirical risk, deterministic gradients, and stochastic gradients respectively. We rigorously define clean and backdoored datasets and provide mathematical formulations for assessing the distortions caused by these malicious backdoor triggers. By measuring the impact of these triggers on the model training procedure, our framework bridges existing empirical findings with theoretical insights, demonstrating how a malicious party…
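The kind of distortion the abstract describes can be illustrated with a minimal sketch. It assumes a linear model with squared loss and one fixed, hand-picked trigger; it does not reproduce the paper's Max RiskWarp, Max GradWarp, or Max GradDistWarp constructs, whose exact definitions are not given here. The names `risk_and_grad` and `poison`, the trigger vector, the target label, and the poisoning schedule are all hypothetical choices made for illustration.

```python
import random

def risk_and_grad(w, data):
    """Mean squared-error risk and its gradient for a linear model y ~ w.x."""
    n = len(data)
    risk = 0.0
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        risk += 0.5 * err * err / n
        for j, xj in enumerate(x):
            grad[j] += err * xj / n
    return risk, grad

def poison(data, trigger, target, every):
    """Hypothetical backdoor: every `every`-th sample gets the trigger
    added to its features and its label replaced by `target`."""
    out = []
    for i, (x, y) in enumerate(data):
        if i % every == 0:
            out.append(([xi + ti for xi, ti in zip(x, trigger)], target))
        else:
            out.append((x, y))
    return out

# Clean dataset: noise-free labels from an assumed ground-truth parameter.
rng = random.Random(0)
w_true = [1.0, -2.0]
clean = []
for _ in range(200):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    clean.append((x, sum(wi * xi for wi, xi in zip(w_true, x))))

# Backdoored dataset: 10% of samples carry the (assumed) trigger.
backdoored = poison(clean, trigger=[3.0, 3.0], target=5.0, every=10)

# Compare empirical risk and full-batch gradient at the clean optimum.
risk_clean, grad_clean = risk_and_grad(w_true, clean)
risk_bd, grad_bd = risk_and_grad(w_true, backdoored)

risk_distortion = abs(risk_bd - risk_clean)
grad_distortion = sum((a - b) ** 2 for a, b in zip(grad_bd, grad_clean)) ** 0.5

print("risk distortion:", risk_distortion)
print("gradient distortion (L2 norm):", grad_distortion)
```

At the clean optimum both the risk and the gradient are zero on the clean data, so any nonzero value printed here is attributable to the poisoned samples. A framework in the spirit of the abstract would go further and search over triggers to maximize such quantities; this sketch only measures them for one fixed trigger.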
Taxonomy
Topics: Medical Imaging and Analysis
