Stealing Black-Box Functionality Using The Deep Neural Tree Architecture
Daniel Teitelman, Itay Naeh, Shie Mannor

TL;DR
This paper introduces Deep Neural Trees (DNTs), a novel ML architecture that clones a black-box model's functionality using only input-output interactions. Active-learning training makes the cloning more sample-efficient, and the tree structure offers some explainability.
Contribution
The paper presents DNTs and an active-learning training method for cloning complex black-box models, even when the victim's internal architecture is completely different from the attacker's.
Findings
DNTs successfully clone black-box model behavior.
Active learning improves training efficiency.
DNTs offer some explainability due to their tree structure.
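To make the idea of cloning through input-output interactions with active learning concrete, here is a minimal Python sketch. It is not the paper's DNT architecture or its training algorithm. The `victim` function, the `Surrogate` interpolator, and the query rule (ask where two simple predictors disagree most, a query-by-committee-style heuristic) are all assumptions chosen for illustration only.

```python
import bisect
import math
import random


def victim(x):
    """Hypothetical black box with two regimes; the attacker sees outputs only."""
    return math.sin(3.0 * x) if x < 0.0 else 0.5 * x


class Surrogate:
    """Attacker-side clone built purely from (input, output) pairs."""

    def __init__(self):
        self.xs, self.ys = [], []  # queried inputs (kept sorted) and their outputs

    def add(self, x, y):
        i = bisect.bisect_left(self.xs, x)
        self.xs.insert(i, x)
        self.ys.insert(i, y)

    def _neighbours(self, x):
        # Indices of the queried points just below and just above x (clamped).
        i = bisect.bisect_left(self.xs, x)
        return max(i - 1, 0), min(i, len(self.xs) - 1)

    def predict_linear(self, x):
        lo, hi = self._neighbours(x)
        x0, x1 = self.xs[lo], self.xs[hi]
        if x1 == x0:
            return self.ys[lo]
        t = (x - x0) / (x1 - x0)
        return self.ys[lo] + t * (self.ys[hi] - self.ys[lo])

    def predict_nearest(self, x):
        lo, hi = self._neighbours(x)
        closer_to_lo = abs(x - self.xs[lo]) <= abs(x - self.xs[hi])
        return self.ys[lo] if closer_to_lo else self.ys[hi]

    def disagreement(self, x):
        # Committee of two predictors; a large gap marks an uncertain region.
        return abs(self.predict_linear(x) - self.predict_nearest(x))


def clone(budget, active, seed=0, lo=-2.0, hi=2.0):
    """Spend `budget` (>= 2) victim queries; pick them actively or at random."""
    rng = random.Random(seed)
    model = Surrogate()
    for x in (lo, hi):  # two seed queries at the ends of the input range
        model.add(x, victim(x))
    for _ in range(budget - 2):
        if active:
            pool = [rng.uniform(lo, hi) for _ in range(200)]
            x = max(pool, key=model.disagreement)
        else:
            x = rng.uniform(lo, hi)
        model.add(x, victim(x))  # the only interaction with the victim
    return model


def mean_abs_error(model, lo=-2.0, hi=2.0, n=1000):
    """Average gap between clone and victim on an evenly spaced grid."""
    grid = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    return sum(abs(model.predict_linear(x) - victim(x)) for x in grid) / n


if __name__ == "__main__":
    for active in (False, True):
        label = "active" if active else "random"
        print(label, mean_abs_error(clone(budget=40, active=active)))
```

Running the script prints the clone's mean absolute error for random and for actively chosen queries under the same budget. Whether active selection wins depends on the victim and the budget, so the sketch should be read as an illustration of the query loop rather than as evidence for the paper's claims.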
Abstract
This paper makes a substantial step towards cloning the functionality of black-box models by introducing a machine learning (ML) architecture named Deep Neural Trees (DNTs). This new architecture can learn to separate different tasks of the black-box model, and clone its task-specific behavior. We propose to train the DNT using an active learning algorithm to obtain faster and more sample-efficient training. In contrast to prior work, we study a complex "victim" black-box model based solely on input-output interactions, while at the same time the attacker and the victim model may have completely different internal architectures. The attacker is an ML-based algorithm whereas the victim is a generally unknown module, such as a multi-purpose digital chip, complex analog circuit, mechanical system, software logic or a hybrid of these. The trained DNT module not only can function as the…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · Advanced Malware Detection Techniques
