Curved Representation Space of Vision Transformers

Juyeop Kim; Junha Park; Songkuk Kim; Jong-Seok Lee

arXiv:2210.05742 · cs.CV · December 15, 2023

TL;DR

This paper investigates the curved representation space of Vision Transformers, showing how curved trajectories in feature space help explain why Transformers are more robust to corruptions than CNNs without being overconfident.

Contribution

The paper provides an empirical analysis of the nonlinear, curved trajectories traced by the penultimate-layer output of Transformers as the input moves linearly, and links these trajectories to robustness and prediction confidence.

Findings

01

For some data, the penultimate-layer output of Transformers follows a nonlinear, curved trajectory as the input moves linearly, whereas CNNs show a fairly linear relationship.

02

Curved regions in the space hinder movement out of decision regions, enhancing robustness.

03

Once the output leaves a curved region, its movement becomes linear and it heads directly to the decision boundary, which is consistent with Transformers not being overconfident.

Abstract

Neural networks with self-attention (a.k.a. Transformers) like ViT and Swin have emerged as a better alternative to traditional convolutional neural networks (CNNs). However, our understanding of how the new architecture works is still limited. In this paper, we focus on the phenomenon that Transformers show higher robustness against corruptions than CNNs, while not being overconfident. This is contrary to the intuition that robustness increases with confidence. We resolve this contradiction by empirically investigating how the output of the penultimate layer moves in the representation space as the input data moves linearly within a small area. In particular, we show the following. (1) While CNNs exhibit a fairly linear relationship between the input and output movements, Transformers show a nonlinear relationship for some data. For those data, the output of Transformers moves in a curved…
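
The probing idea described in the abstract (move the input along a straight line and watch how the penultimate-layer output moves) can be illustrated with a minimal sketch. This is not the authors' code or their exact metric: the function names, the turning-angle measure, and the two toy feature maps are assumptions chosen only to show the difference between a linear and a curved response.

```python
import numpy as np

def feature_trajectory(feature_fn, x, direction, step=0.05, n_steps=20):
    """Features visited while x moves along a straight line in input space."""
    d = direction / np.linalg.norm(direction)  # unit-length travel direction
    return np.stack([feature_fn(x + k * step * d) for k in range(n_steps + 1)])

def mean_turning_angle(traj):
    """Mean angle (degrees) between consecutive feature displacements.

    Near 0 means a straight path; larger values mean a curved one.
    Assumes the features actually move at every step (no zero displacement).
    """
    deltas = np.diff(traj, axis=0)
    a, b = deltas[:-1], deltas[1:]
    cos = np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    )
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())

# Hypothetical stand-ins for a network's penultimate layer (assumptions,
# not real models): one responds linearly, the other along an arc.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
linear_features = lambda x: W @ x
curved_features = lambda x: np.array([np.cos(3 * x[0]), np.sin(3 * x[0])])

x0 = np.zeros(3)
direction = np.array([1.0, 0.0, 0.0])
linear_angle = mean_turning_angle(feature_trajectory(linear_features, x0, direction))
curved_angle = mean_turning_angle(feature_trajectory(curved_features, x0, direction))
print(f"linear map: {linear_angle:.4f} deg, curved map: {curved_angle:.4f} deg")
```

With a real model, `feature_fn` would wrap a forward pass up to the penultimate layer, and `direction` could be any input-space direction of interest; both choices are left open here because the truncated abstract does not specify them.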

Taxonomy

Topics: Neural Networks and Applications · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning