Shortcut Detection with Variational Autoencoders

Nicolas M. Müller; Simon Roschmann; Shahbaz Khan; Philip Sperl; Konstantin Böttinger

arXiv:2302.04246 · cs.LG · July 24, 2023

TL;DR

This paper introduces a novel method using variational autoencoders to detect spurious correlations, or shortcuts, in image and audio datasets, aiding in the development of more robust machine learning models.

Contribution

The work presents a new approach leveraging VAE feature disentanglement to semi-automatically identify shortcuts in datasets, addressing a scarcely explored problem.

Findings

01 Successfully identified previously unknown shortcuts in datasets

02 Demonstrated applicability on multiple real-world datasets

03 Showed effectiveness in revealing spurious correlations

Abstract

For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shortcuts, is a challenging problem and has so far been scarcely addressed. In this work, we present a novel approach to detect shortcuts in image and audio datasets by leveraging variational autoencoders (VAEs). The disentanglement of features in the latent space of VAEs allows us to discover feature-target correlations in datasets and semi-automatically evaluate them for ML shortcuts. We demonstrate the applicability of our method on several real-world datasets and identify shortcuts that have not been discovered before.
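The feature-target correlation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the latent codes of a trained VAE are already available as an array, uses absolute Pearson correlation with a binary label as a stand-in scoring statistic, and substitutes synthetic data for real encodings. The function name `rank_latent_dims` and all variable names are hypothetical.

```python
import numpy as np


def rank_latent_dims(z, y):
    """Rank latent dimensions by |Pearson correlation| with a binary label.

    z: (n_samples, n_latents) array of latent codes (assumed to come from a VAE encoder).
    y: (n_samples,) array of 0/1 class labels.
    Returns (order, scores), where order lists dimensions from most to least correlated.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    zc = z - z.mean(axis=0)  # center each latent dimension
    yc = y - y.mean()        # center the labels
    denom = np.sqrt((zc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Guard against constant dimensions (zero variance) by leaving their score at 0.
    corr = np.divide(zc.T @ yc, denom, out=np.zeros(z.shape[1]), where=denom > 0)
    scores = np.abs(corr)
    order = np.argsort(-scores)
    return order, scores


# Synthetic stand-in for encoder outputs: 8 latent dimensions, 500 samples.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
z = rng.normal(size=(500, 8))
z[:, 3] += 3.0 * y  # hypothetical shortcut: dimension 3 leaks the label

order, scores = rank_latent_dims(z, y)
print("dimensions ranked by label correlation:", order)
```

In a workflow like the one the abstract calls semi-automatic, the top-ranked dimensions would then be handed to a person, who judges whether the feature each one encodes is a legitimate signal or a shortcut.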

Code & Models

Repositories

fraunhofer-aisec/shortcut-detection-vae (PyTorch, official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Machine Learning and Data Classification