Humans can decipher adversarial images

Zhenglong Zhou; Chaz Firestone

arXiv:1809.04120 · cs.CV · August 27, 2019

TL;DR

Humans can reliably identify the labels assigned by machine-learning models to adversarial images, indicating a closer relationship between human and machine classification than previously thought.

Contribution

This study empirically demonstrates that humans can decipher and predict machine classifications of adversarial images across multiple datasets, challenging assumptions about divergence between human and machine vision.

Findings

1. Humans reliably identified machine labels on adversarial images.
2. Human classification patterns closely match machine predictions.
3. Results held across diverse image sets and recognition challenges.
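One way to make "reliably identified" concrete is to count how often a person's guess matches the machine's label and ask how likely that count would be under pure guessing. The sketch below is illustrative only and is not the authors' analysis: the function names, the toy labels, and the 50% chance level (a two-alternative choice) are all assumptions, and the data are made up.

```python
from math import comb


def count_matches(human_choices, machine_labels):
    """Return (matches, trials) for paired human guesses and machine labels."""
    if len(human_choices) != len(machine_labels):
        raise ValueError("each human choice needs exactly one machine label")
    matches = sum(h == m for h, m in zip(human_choices, machine_labels))
    return matches, len(machine_labels)


def binomial_tail(successes, trials, chance):
    """Exact P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical toy data, NOT taken from the paper.
    machine_labels = ["bagel", "guitar", "pinwheel", "baseball", "remote", "armadillo"]
    human_choices = ["bagel", "guitar", "pinwheel", "baseball", "remote", "chainsaw"]

    matches, trials = count_matches(human_choices, machine_labels)
    # Assumed chance level: 0.5, i.e. the person picks between two candidate labels.
    p_value = binomial_tail(matches, trials, chance=0.5)
    print(f"{matches}/{trials} matches, one-sided p = {p_value:.4f}")
```

A small tail probability means agreement that high would be unlikely if the person were guessing. With more candidate labels per trial, the `chance` argument would drop accordingly (for example, 1/48 for 48 options).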

Abstract

How similar is the human mind to the sophisticated machine-learning systems that mirror its performance? Models of object categorization based on convolutional neural networks (CNNs) have achieved human-level benchmarks in assigning known labels to novel images. These advances promise to support transformative technologies such as autonomous vehicles and machine diagnosis; beyond this, they also serve as candidate models for the visual system itself -- not only in their output but perhaps even in their underlying mechanisms and principles. However, unlike human vision, CNNs can be "fooled" by adversarial examples -- carefully crafted images that appear as nonsense patterns to humans but are recognized as familiar objects by machines, or that appear as one object to humans and a different object to machines. This seemingly extreme divergence between human and machine classification…
