TL;DR
This paper introduces novel algorithms for embedding and extracting Boolean-form knowledge in tree ensemble classifiers, highlighting a complexity gap between backdoor attacks and defenses, validated across multiple datasets.
Contribution
It presents two embedding algorithms for tree ensembles, designed to be preservative, verifiable, and stealthy, together with an SMT-based extraction method, and reveals a complexity gap between polynomial-time embedding and NP-hard extraction.
Findings
Embedding algorithms operate in polynomial time.
Extraction is NP-hard; it is tackled by encoding the problem for an SMT solver.
A significant complexity gap exists between attack and defense.
Abstract
The embedding and extraction of useful knowledge is a recent trend in machine learning applications, e.g., to supplement existing datasets that are small. However, with the increasing use of machine learning models in security-critical applications, the embedding and extraction of malicious knowledge are equivalent to the notorious backdoor attack and its defence, respectively. This paper studies the embedding and extraction of knowledge in tree ensemble classifiers, and focuses on knowledge expressible with a generic form of Boolean formulas, e.g., robustness properties and backdoor attacks. The embedding is required to be preservative (the original performance of the classifier is preserved), verifiable (the knowledge can be attested), and stealthy (the embedding cannot be easily detected). To facilitate this, we propose two novel and effective embedding algorithms, one of which…
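To make the preservative and verifiable requirements concrete, here is a minimal Python sketch on a toy majority-vote ensemble. It is not the paper's embedding algorithm: the tuple-based tree encoding, the helper names (`predict_tree`, `predict_ensemble`, `knowledge_holds`), the hand-edited tree, and the interval form of the premise are all assumptions made for illustration. The check is also only empirical, over a finite grid of samples. Stealthiness is not modelled.

```python
from collections import Counter

# Hypothetical tree encoding (an assumption for this sketch):
#   leaf          -> int class label
#   internal node -> (feature_index, threshold, left_subtree, right_subtree)
def predict_tree(tree, x):
    """Follow x down one tree; go left when x[feature] <= threshold."""
    while not isinstance(tree, int):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree

def predict_ensemble(trees, x):
    """Majority vote over the trees (odd tree count, binary labels, so no ties)."""
    votes = Counter(predict_tree(t, x) for t in trees)
    return votes.most_common(1)[0][0]

def satisfies_premise(premise, x):
    """Premise: a conjunction of interval constraints {feature: (low, high)}."""
    return all(low <= x[f] <= high for f, (low, high) in premise.items())

def knowledge_holds(trees, premise, label, samples):
    """Empirical check of 'premise implies label' on the given samples only."""
    return all(predict_ensemble(trees, x) == label
               for x in samples if satisfies_premise(premise, x))

# Toy clean ensemble over two features in [0, 1].
clean = [
    (0, 0.5, 0, 1),
    (1, 0.5, 0, 1),
    (0, 0.3, 0, 1),
]

# Toy knowledge: if feature 1 lies in [0.92, 1.0], predict label 1.
premise, target = {1: (0.92, 1.0)}, 1

# Hand-made "embedding": wrap the first tree in a test on feature 1, so that
# it votes for the target label whenever the premise region is reached.
embedded = [(1, 0.91, clean[0], target), clean[1], clean[2]]

# Finite grid of samples used for both checks.
grid = [i / 20 for i in range(21)]
samples = [(a, b) for a in grid for b in grid]

# Verifiable: the knowledge holds on the embedded ensemble but not the clean one.
holds_clean = knowledge_holds(clean, premise, target, samples)
holds_embedded = knowledge_holds(embedded, premise, target, samples)

# Preservative: predictions agree on every sample outside the premise region.
outside = [x for x in samples if not satisfies_premise(premise, x)]
agreement = sum(predict_ensemble(clean, x) == predict_ensemble(embedded, x)
                for x in outside) / len(outside)

print(holds_clean, holds_embedded, agreement)
```

Checking a Boolean-form property on a finite sample says nothing about inputs outside that sample; attesting it over the whole input space is a separate problem from this sketch.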
