Exploiting and Defending Against the Approximate Linearity of Apple's NeuralHash
Jagdeep Singh Bhatia, Kevin Meng

TL;DR
This paper shows that Apple's NeuralHash is approximately linear, a property that enables black-box attacks which evade detection, generate near-collisions, and leak information about hashed images, threatening its security and privacy goals.
Contribution
The paper identifies the approximate linearity of NeuralHash, develops black-box attacks that exploit it without access to model parameters, and proposes a cryptographic fix for the resulting vulnerabilities.
Findings
NeuralHash exhibits approximate linearity.
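Black-box attacks can evade detection, generate near-collisions, and leak information about hashed images, all without access to model parameters.
A simple cryptographic fix is proposed to address these vulnerabilities.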
Abstract
Perceptual hashes map images with identical semantic content to the same fixed-length binary hash value, while mapping semantically different images to different hashes. These algorithms have important applications in cybersecurity, such as copyright infringement detection, content fingerprinting, and surveillance. Apple's NeuralHash is one such system; it aims to detect the presence of illegal content on users' devices without compromising consumer privacy. We make the surprising discovery that NeuralHash is approximately linear, which inspires the development of novel black-box attacks that can (i) evade detection of "illegal" images, (ii) generate near-collisions, and (iii) leak information about hashed images, all without access to model parameters. These vulnerabilities pose serious threats to NeuralHash's security goals; to address them, we propose a simple fix using classical cryptographic…
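To make the linearity claim in the abstract concrete, here is a minimal sketch of why a linear hash is easy to probe from the outside. It does not use NeuralHash: `toy_hash` is a hypothetical stand-in that takes the sign of a random linear projection, and the 32-bit length, the 64-pixel "images", and all function names are assumptions chosen for illustration. With an exactly linear map, each bit can flip at most once along a straight line between two images, so the Hamming distance from the starting hash never decreases as the input is blended toward the target.

```python
import random

def make_projection(n_bits, n_pixels, seed=0):
    # Hypothetical stand-in for a hashing network: a fixed random linear map.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_pixels)] for _ in range(n_bits)]

def toy_hash(W, x):
    # One bit per row: the sign of the linear projection of the image.
    return [1 if sum(w * p for w, p in zip(row, x)) >= 0 else 0 for row in W]

def hamming(a, b):
    # Number of positions where two bit lists differ.
    return sum(u != v for u, v in zip(a, b))

def interpolate(x0, x1, t):
    # Pixel-wise blend: t = 0 gives x0, t = 1 gives x1.
    return [(1 - t) * a + t * b for a, b in zip(x0, x1)]

rng = random.Random(1)
W = make_projection(n_bits=32, n_pixels=64)   # sizes are illustrative only
x0 = [rng.random() for _ in range(64)]        # toy "image" 0
x1 = [rng.random() for _ in range(64)]        # toy "image" 1

h0 = toy_hash(W, x0)
# Query the hash only through its outputs (black-box) along the blend path.
distances = [hamming(h0, toy_hash(W, interpolate(x0, x1, k / 20)))
             for k in range(21)]
print(distances)
```

The printed list starts at 0 and ends at the Hamming distance between the two endpoint hashes, rising in steps as individual bits flip. For a model that is only approximately linear, this pattern would be expected to hold only roughly; the sketch illustrates the idealized case, not the attacks described in the paper.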
Taxonomy
Topics: Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques · Chaos-based Image/Signal Encryption
