HoneyImage: Verifiable, Harmless, and Stealthy Dataset Ownership Verification for Image Models

Zhihao Zhu; Jiale Han; Yi Yang

arXiv:2508.00892 · cs.CV · August 5, 2025

TL;DR

HoneyImage is a novel dataset ownership verification method that subtly embeds verifiable traces into hard samples of image datasets, ensuring reliable ownership proof without compromising data integrity or model performance.

Contribution

The paper introduces a dataset ownership verification technique that balances verification effectiveness, imperceptibility, and dataset integrity, addressing the trade-offs that limit existing methods such as backdoor watermarking and membership inference.

Findings

01. Achieves high verification accuracy across multiple datasets and models.

02. Maintains dataset integrity and model performance with minimal modifications.

03. Proves practical for protecting proprietary image datasets in AI applications.

Abstract

Image-based AI models are increasingly deployed across a wide range of domains, including healthcare, security, and consumer applications. However, many image datasets carry sensitive or proprietary content, raising critical concerns about unauthorized data usage. Data owners therefore need reliable mechanisms to verify whether their proprietary data has been misused to train third-party models. Existing solutions, such as backdoor watermarking and membership inference, face inherent trade-offs between verification effectiveness and preservation of data integrity. In this work, we propose HoneyImage, a novel method for dataset ownership verification in image recognition models. HoneyImage selectively modifies a small number of hard samples to embed imperceptible yet verifiable traces, enabling reliable ownership verification while maintaining dataset integrity. Extensive experiments…
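The abstract says HoneyImage modifies "a small number of hard samples" but this excerpt does not state how hardness is measured. As a rough illustration only, the sketch below assumes hardness means high per-sample cross-entropy loss under some reference classifier. That criterion, the function names `per_sample_loss` and `select_hard_samples`, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
import math

def per_sample_loss(probs, labels, eps=1e-12):
    """Cross-entropy loss of each sample, given predicted class probabilities.

    probs:  list of per-class probability lists from a reference model
            (assumed available; the paper's actual scoring is not shown here).
    labels: list of true class indices.
    """
    return [-math.log(max(p[y], eps)) for p, y in zip(probs, labels)]

def select_hard_samples(probs, labels, k):
    """Return indices of the k highest-loss samples, hardest first.

    Assumption: "hard" = high loss under the reference model.
    """
    losses = per_sample_loss(probs, labels)
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]

# Toy data: four samples, two classes.
toy_probs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8], [0.55, 0.45]]
toy_labels = [0, 0, 1, 1]

hard_idx = select_hard_samples(toy_probs, toy_labels, k=2)
print(hard_idx)
```

Under this assumed criterion, only the samples at `hard_idx` would be candidates for trace embedding, which is one way a method could keep the number of modified images small. The embedding and verification steps themselves are not sketched here because the excerpt gives no details about them.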

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications