HoneypotNet: Backdoor Attacks Against Model Extraction

Yixu Wang; Tianle Gu; Yan Teng; Yingchun Wang; Xingjun Ma

arXiv:2501.01090·cs.CR·January 3, 2025

TL;DR

This paper introduces HoneypotNet, a lightweight backdoor attack used as a defense: it poisons the protected model's outputs so that a substitute model trained on them is compromised, deterring model extraction while maintaining the protected model's performance.

Contribution

HoneypotNet is a new backdoor method that modifies model outputs to poison substitute models, providing an attack-as-defense paradigm against model extraction.

Findings

01. High success rate in injecting backdoors into substitute models
02. Disrupts functionality of extracted models effectively
03. Works across four benchmark datasets

Abstract

Model extraction attacks are a type of inference-time attack that approximates the functionality and performance of a black-box victim model by issuing a number of queries to the model and then using the model's predictions to train a substitute model. These attacks pose severe security threats to production models and MLaaS platforms and can cause significant monetary losses for model owners. A body of work has proposed defenses against model extraction attacks, including active defense methods, which modify the model's outputs or increase the query overhead to hinder extraction, and passive defense methods, which detect malicious queries or use watermarks for post-hoc verification. In this work, we introduce a new defense paradigm called attack as defense, which modifies the model's output to be poisonous such that any malicious…
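The extraction loop described in the abstract (query the victim, collect its predictions, train a substitute on them) can be illustrated with a minimal sketch. This is not the paper's method or experimental setup: the linear victim, the perceptron substitute, and every name below (`victim_label`, `train_substitute`, `extraction_agreement`) are hypothetical stand-ins chosen only to keep the example small and dependency-free.

```python
import random


def victim_label(x, w):
    """Hypothetical black-box victim: exposes only a hard label per query."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= 0 else -1


def train_substitute(queries, labels, epochs=20):
    """Fit a perceptron substitute on (query, victim label) pairs."""
    w = [0.0] * len(queries[0])
    for _ in range(epochs):
        for x, y in zip(queries, labels):
            # Update the weights only when the substitute disagrees with the victim.
            if victim_label(x, w) != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


def extraction_agreement(seed=0, dim=5, num_queries=500, num_test=1000):
    """Run the toy extraction and return victim/substitute label agreement."""
    rng = random.Random(seed)
    # Assumption: the victim is a random linear classifier unknown to the attacker.
    w_victim = [rng.gauss(0, 1) for _ in range(dim)]

    # Step 1: the attacker issues queries and records the victim's predictions.
    queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_queries)]
    labels = [victim_label(x, w_victim) for x in queries]

    # Step 2: the attacker trains a substitute on those predictions.
    w_sub = train_substitute(queries, labels)

    # Step 3: measure how often the substitute agrees with the victim on fresh inputs.
    test = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_test)]
    hits = sum(victim_label(x, w_victim) == victim_label(x, w_sub) for x in test)
    return hits / num_test


if __name__ == "__main__":
    print(f"substitute/victim agreement: {extraction_agreement():.3f}")
```

Because the substitute learns only from what the victim returns, whoever controls those returned predictions also controls the substitute's training signal, which is the lever an output-modifying defense relies on.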

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning in Healthcare