# Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

**Authors:** Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

arXiv: 1902.10178 · 2019-02-28

## TL;DR

This paper applies explanation techniques to state-of-the-art learning machines on computer vision and arcade game tasks, shows that high accuracy can mask naive or short-sighted decision strategies, and proposes a semi-automated Spectral Relevance Analysis method for validating model behavior.

## Contribution

The paper introduces a semi-automated Spectral Relevance Analysis technique for characterizing and validating the behavior of nonlinear learning machines, and argues that evaluation needs to go beyond standard performance metrics.
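To make the "spectral" part concrete, here is a minimal sketch of one way to look for distinct decision strategies in a set of relevance maps: build an affinity graph over the maps, take the eigenvalues of its graph Laplacian, and read the number of clusters off the largest eigengap. This is an illustration under stated assumptions, not the authors' implementation. It assumes per-sample relevance maps (heatmaps) have already been computed by some explanation method; the function names, the Gaussian affinity, the `sigma` parameter, and the toy data are all hypothetical choices made for this sketch.

```python
import numpy as np

def estimate_strategy_count(heatmaps, sigma=1.0, max_clusters=5):
    """Eigengap estimate of the number of clusters among relevance maps."""
    maps = np.asarray(heatmaps, dtype=float)
    X = maps.reshape(len(maps), -1)  # flatten each map to a vector

    # Pairwise squared Euclidean distances between flattened maps.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Gaussian affinity (an assumption of this sketch), without self-loops.
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

    # Eigenvalues in ascending order; the largest gap among the first
    # few suggests how many well-separated groups the maps form.
    eigvals = np.linalg.eigvalsh(L)
    gaps = np.diff(eigvals[: max_clusters + 1])
    return int(np.argmax(gaps)) + 1, eigvals

def make_toy_heatmaps(n_per_group=6, seed=0):
    """Hypothetical data: relevance on the object center vs. on a corner artifact."""
    rng = np.random.default_rng(seed)
    center = np.zeros((8, 8))
    center[2:6, 2:6] = 1.0
    corner = np.zeros((8, 8))
    corner[6:, :2] = 1.0
    groups = [center] * n_per_group + [corner] * n_per_group
    return np.stack([g + 0.01 * rng.standard_normal((8, 8)) for g in groups])

heatmaps = make_toy_heatmaps()
n_strategies, eigvals = estimate_strategy_count(heatmaps)
print(n_strategies)
```

On the toy data, the maps that focus on the image center and those that focus on a corner form two groups, which is the kind of split a human would then inspect to decide whether one group reflects an unintended shortcut. On real relevance maps the affinity construction and `sigma` would need tuning, and the eigengap is a heuristic that still requires manual review.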

## Key findings

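- Learned models show a spectrum of problem-solving behaviors, from naive and short-sighted to well-informed and strategic
- Standard performance metrics can fail to distinguish between these behaviors
- Spectral Relevance Analysis offers a practically effective way to characterize and validate the behavior of nonlinear learning machines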

## Abstract

Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.10178/full.md

## Figures

39 figures with captions in the complete paper: https://tomesphere.com/paper/1902.10178/full.md

## References

154 references — full list in the complete paper: https://tomesphere.com/paper/1902.10178/full.md

---
Source: https://tomesphere.com/paper/1902.10178