# Three-Dimensional Reconstruction Pre-Training as a Prior to Improve Robustness to Adversarial Attacks and Spurious Correlation

**Authors:** Yutaro Yamada, Fred Weiying Zhang, Yuval Kluger, Ilker Yildirim

PMC · DOI: 10.3390/e26030258 · Entropy · 2024-03-14

## TL;DR

This paper explores using 3D reconstruction pre-training to improve image classifiers' robustness against adversarial attacks and spurious correlations.

## Contribution

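The paper combines adversarial training with a weight initialization obtained from 3D reconstruction pre-training, which implicitly encodes a prior about 3D object geometry. It also introduces Geon3D, a dataset of simple shapes, to study the effect of this pre-training systematically.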

## Key findings

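- 3D pre-training combined with adversarial training improves on adversarial training alone in the more realistic settings (Geon3D with textured backgrounds, and ShapeNet), but not on Geon3D with a clean background.
- 3D-based pre-training outperforms 2D-based pre-training on ShapeNet.
- 3D pre-training coupled with adversarial training improves robustness to spurious correlations between shape and background textures.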

## Abstract

Ensuring robustness of image classifiers against adversarial attacks and spurious correlation has been challenging. One of the most effective methods for adversarial robustness is a type of data augmentation that uses adversarial examples during training. Here, inspired by computational models of human vision, we explore a synthesis of this approach by leveraging a structured prior over image formation: the 3D geometry of objects and how it projects to images. We combine adversarial training with a weight initialization that implicitly encodes such a prior about 3D objects via 3D reconstruction pre-training. We evaluate our approach using two different datasets and compare it to alternative pre-training protocols that do not encode a prior about 3D shape. To systematically explore the effect of 3D pre-training, we introduce a novel dataset called Geon3D, which consists of simple shapes that nevertheless capture variation in multiple distinct dimensions of geometry. We find that while 3D reconstruction pre-training does not improve robustness in the simplest dataset setting we consider (Geon3D on a clean background), it improves upon adversarial training in more realistic conditions (Geon3D with textured backgrounds and ShapeNet). We also find that 3D pre-training coupled with adversarial training improves robustness to spurious correlations between shape and background textures. Furthermore, we show that 3D-based pre-training outperforms 2D-based pre-training on ShapeNet. We hope that these results encourage further investigation of the benefits of structured, 3D-based models of vision for adversarial robustness.
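
The general recipe described above, adversarial training that starts from a pre-trained weight initialization, can be illustrated with a toy sketch. This is not the authors' implementation: the paper trains deep image classifiers, whereas this example assumes a two-feature logistic-regression model, an L-infinity PGD attack (a common choice that the abstract does not confirm), and a hypothetical `w_init` that merely stands in for weights obtained from pre-training. All function names and hyperparameters are illustrative.

```python
import math
import random


def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)


def predict(w, b, x):
    # Probability of class 1 under a logistic-regression classifier.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def loss(w, b, x, y):
    # Binary cross-entropy, clamped away from log(0).
    p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def pgd_attack(w, b, x, y, eps, step, iters):
    # Assumed attack: L-infinity PGD. Take signed gradient-ascent steps on the
    # loss, then project back into the eps-box around the clean input x.
    x_adv = list(x)
    for _ in range(iters):
        err = predict(w, b, x_adv) - y  # d(loss)/d(logit); input grad = err * w
        x_adv = [xa + step * sign(err * wi) for xa, wi in zip(x_adv, w)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


def adversarial_train(data, w_init, b_init, eps, step, iters, lr, epochs):
    # w_init is a hypothetical stand-in for a pre-trained initialization.
    w, b = list(w_init), b_init
    for _ in range(epochs):
        for x, y in data:
            x_adv = pgd_attack(w, b, x, y, eps, step, iters)  # inner maximization
            err = predict(w, b, x_adv) - y
            w = [wi - lr * err * xa for wi, xa in zip(w, x_adv)]  # outer SGD step
            b -= lr * err
    return w, b


def robust_accuracy(data, w, b, eps, step, iters):
    # Accuracy measured on PGD-perturbed inputs rather than clean ones.
    correct = 0
    for x, y in data:
        x_adv = pgd_attack(w, b, x, y, eps, step, iters)
        correct += int((predict(w, b, x_adv) >= 0.5) == (y == 1))
    return correct / len(data)


def make_toy_data(n, seed=0):
    # Two well-separated 2D clusters; labels alternate between 0 and 1.
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = i % 2
        center = 2.0 if y == 1 else -2.0
        data.append(([center + rng.uniform(-0.5, 0.5) for _ in range(2)], y))
    return data


if __name__ == "__main__":
    data = make_toy_data(40)
    w, b = adversarial_train(data, [0.1, 0.1], 0.0, eps=0.5, step=0.125,
                             iters=10, lr=0.1, epochs=5)
    print("robust accuracy:", robust_accuracy(data, w, b, 0.5, 0.125, 10))
```

In this framing, comparing pre-training protocols amounts to running the same `adversarial_train` loop from different `w_init` values and comparing the resulting robust accuracy.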

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10968904/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10968904/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC10968904/full.md

---
Source: https://tomesphere.com/paper/PMC10968904