Systematic Testing of the Data-Poisoning Robustness of KNN
Yannan Li, Jingbo Wang, and Chao Wang

TL;DR
This paper introduces a systematic testing approach for assessing the data-poisoning robustness of KNN, capable of both certifying robustness and falsifying non-robust cases more efficiently than existing methods.
Contribution
It presents a novel over-approximate analysis combined with systematic testing to improve accuracy and speed in verifying KNN's data-poisoning robustness.
Findings
Outperforms baseline enumeration in speed and accuracy
Can certify robustness for most test inputs
Effectively falsifies non-robust cases
Abstract
Data poisoning aims to compromise a machine-learning-based software component by contaminating its training set to change its prediction results for test inputs. Existing methods for deciding data-poisoning robustness have either poor accuracy or long running times and, more importantly, they can only certify some of the truly robust cases and remain inconclusive when certification fails. In other words, they cannot falsify the truly non-robust cases. To overcome this limitation, we propose a systematic-testing-based method that can falsify as well as certify data-poisoning robustness for a widely used supervised-learning technique named k-nearest neighbors (KNN). Our method is faster and more accurate than the baseline enumeration method, thanks to a novel over-approximate analysis in the abstract domain, which quickly narrows down the search space, and systematic testing in the concrete…
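To make the "baseline enumeration method" mentioned in the abstract concrete, here is a minimal Python sketch of what a brute-force robustness check for KNN could look like. It is an illustration under stated assumptions, not the authors' implementation: it assumes the threat model is that up to `n` training elements may be poisoned and are therefore removed, it holds `k` fixed, it uses Euclidean distance, and the names `knn_predict` and `is_robust_by_enumeration` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest neighbors.

    train is a list of (feature_tuple, label) pairs. Distance ties keep the
    original training order (Python's sort is stable); vote ties go to the
    smallest label so the result is deterministic.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return min(votes, key=lambda label: (-votes[label], label))


def is_robust_by_enumeration(train, x, k, n):
    """Brute-force check under the ASSUMED removal-based threat model.

    Returns (True, None) if removing any subset of at most n training
    elements leaves the prediction for x unchanged (certified), or
    (False, removed_indices) with a concrete witness otherwise (falsified).
    """
    baseline = knn_predict(train, x, k)
    for r in range(1, n + 1):
        for removed in combinations(range(len(train)), r):
            removed_set = set(removed)
            remaining = [item for i, item in enumerate(train)
                         if i not in removed_set]
            if knn_predict(remaining, x, k) != baseline:
                return False, removed
    return True, None


# Toy data set (made up for illustration): two well-separated classes in 1-D.
train = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
         ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b")]

certified = is_robust_by_enumeration(train, (0.0,), k=3, n=1)
falsified = is_robust_by_enumeration(train, (0.0,), k=3, n=2)
```

In this toy example the prediction for `(0.0,)` survives any single removal, but removing two of the three nearby `"a"` points flips it to `"b"`. The number of subsets the loop visits grows combinatorially with `n` and the training-set size, which is the cost the abstract says the over-approximate analysis is meant to cut down.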
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
