Investigating the Adversarial Robustness of Density Estimation Using the Probability Flow ODE
Marius Arvinte, Cory Cornelius, Jason Martin, Nageen Himayat

TL;DR
This paper examines how robust density estimation with the probability flow (PF) ODE of score-based diffusion models is to gradient-based likelihood maximization attacks. It finds that the estimator is robust against high-complexity, high-likelihood attacks and that some adversarial samples are semantically meaningful.
Contribution
The paper introduces and evaluates six gradient-based log-likelihood maximization attacks, including a novel reverse integration attack, and analyzes the robustness of PF ODE density estimation on CIFAR-10.
Findings
PF ODE density estimation is robust against high-complexity, high-likelihood attacks
In some cases, adversarial samples are semantically meaningful
The study relates robustness to sample complexity, measured by the compressed size of a sample
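As a rough illustration of the complexity measure named in the findings, the sketch below scores an image by the size of its compressed pixel buffer. The use of `zlib` and the helper name `compressed_size_bytes` are assumptions made for illustration. The summary only says that compressed size is used and does not name a codec.

```python
import zlib

import numpy as np


def compressed_size_bytes(image: np.ndarray, level: int = 9) -> int:
    """Return the size in bytes of the zlib-compressed raw pixel buffer.

    Hypothetical helper: zlib stands in for whatever lossless codec the
    paper actually uses.
    """
    data = np.ascontiguousarray(image, dtype=np.uint8).tobytes()
    return len(zlib.compress(data, level))


# Two CIFAR-10-sized toy images: a constant gray image and uniform noise.
rng = np.random.default_rng(0)
flat = np.full((32, 32, 3), 128, dtype=np.uint8)
noise = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

flat_size = compressed_size_bytes(flat)
noise_size = compressed_size_bytes(noise)
print("flat image :", flat_size, "bytes")
print("noise image:", noise_size, "bytes")
```

Under this measure the constant image is low-complexity and the noise image is high-complexity, which is the axis along which the paper relates attacks to robustness.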
Abstract
Beyond their impressive sampling capabilities, score-based diffusion models offer a powerful analysis tool in the form of unbiased density estimation of a query sample under the training data distribution. In this work, we investigate the robustness of density estimation using the probability flow (PF) neural ordinary differential equation (ODE) model against gradient-based likelihood maximization attacks and the relation to sample complexity, where the compressed size of a sample is used as a measure of its complexity. We introduce and evaluate six gradient-based log-likelihood maximization attacks, including a novel reverse integration attack. Our experimental evaluations on CIFAR-10 show that density estimation using the PF ODE is robust against high-complexity, high-likelihood attacks, and that in some cases adversarial samples are semantically meaningful, as expected from a robust…
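To make the density estimation step concrete, here is a minimal one-dimensional sketch of log-likelihood evaluation with the probability flow ODE. It is not the paper's setup: the data distribution is assumed to be a Gaussian with standard deviation `S0`, the forward SDE is assumed to have zero drift and unit diffusion, and the score and divergence are written analytically instead of coming from a trained network and a stochastic trace estimator. All names (`S0`, `T`, `pf_ode_log_likelihood`) are hypothetical.

```python
import numpy as np

S0 = 0.5  # std of the toy data distribution N(0, S0^2) (assumption)
T = 1.0   # final integration time (assumption)


def score(x, t):
    # Analytic score of p_t = N(0, S0^2 + t); a trained network would go here.
    return -x / (S0**2 + t)


def rhs(state, t):
    # PF ODE with f = 0 and g(t)^2 = 1: dx/dt = -0.5 * score(x, t).
    x = state[0]
    drift = -0.5 * score(x, t)
    div = 0.5 / (S0**2 + t)  # exact d(drift)/dx in one dimension
    return np.array([drift, div])


def pf_ode_log_likelihood(x0, n_steps=1000):
    """Integrate x and the divergence from t=0 to t=T with fixed-step RK4."""
    state = np.array([x0, 0.0])
    h = T / n_steps
    for i in range(n_steps):
        t = i * h
        k1 = rhs(state, t)
        k2 = rhs(state + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(state + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(state + h * k3, t + h)
        state = state + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    x_T, div_integral = state
    var_T = S0**2 + T
    log_prior = -0.5 * x_T**2 / var_T - 0.5 * np.log(2 * np.pi * var_T)
    # Instantaneous change of variables: log p_0(x0) = log p_T(x_T) + integral of div.
    return log_prior + div_integral


x0 = 0.3
estimate = pf_ode_log_likelihood(x0)
exact = -0.5 * x0**2 / S0**2 - 0.5 * np.log(2 * np.pi * S0**2)
print("PF ODE estimate:", estimate)
print("exact log-density:", exact)
```

In this toy case the estimate can be checked against the closed-form Gaussian log-density. A gradient-based likelihood maximization attack of the kind studied in the paper would differentiate an estimate like this with respect to the input and ascend it.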
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
Methods: Diffusion
