Are Transformers More Robust? Towards Exact Robustness Verification for Transformers

Brian Hsuan-Cheng Liao; Chih-Hong Cheng; Hasan Esen; Alois Knoll

arXiv:2202.03932·cs.LG·December 3, 2024

TL;DR

This paper studies the robustness of Sparsemax-based Transformers, encodes the search for their maximum robustness as a Mixed Integer Quadratically Constrained Programming (MIQCP) problem, and compares them against MLPs on a Lane Departure Warning task. The Transformers turn out to be not necessarily more robust.

Contribution

The paper introduces an MIQCP-based method for exact robustness verification of Sparsemax-based Transformers and proposes two pre-processing heuristics that substantially accelerate solving.

Findings

01 Transformers are not necessarily more robust than MLPs.

02 The MIQCP approach enables exact robustness verification.

03 Heuristics significantly speed up the verification process.

Abstract

As an emerging type of Neural Networks (NNs), Transformers are used in many domains ranging from Natural Language Processing to Autonomous Driving. In this paper, we study the robustness problem of Transformers, a key characteristic as low robustness may cause safety concerns. Specifically, we focus on Sparsemax-based Transformers and reduce the finding of their maximum robustness to a Mixed Integer Quadratically Constrained Programming (MIQCP) problem. We also design two pre-processing heuristics that can be embedded in the MIQCP encoding and substantially accelerate its solving. We then conduct experiments using the application of Lane Departure Warning to compare the robustness of Sparsemax-based Transformers against that of the more conventional Multi-Layer Perceptron (MLP) NNs. To our surprise, Transformers are not necessarily more robust, leading to profound considerations in…
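For readers unfamiliar with Sparsemax, the following is a minimal sketch of the standard sparsemax operator (the Euclidean projection of a score vector onto the probability simplex), written in plain Python. It is not taken from the paper; the function name `sparsemax` and the list-based interface are illustrative assumptions. Unlike softmax, sparsemax is piecewise linear and can return exact zeros, which is the property that makes it a natural candidate for an exact constraint-based encoding.

```python
def sparsemax(z):
    """Project the score vector z onto the probability simplex.

    Illustrative helper (not from the paper): returns a list of
    non-negative weights that sum to 1, possibly with exact zeros.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, z_j in enumerate(z_sorted, start=1):
        cumsum += z_j
        # The j largest scores stay in the support while this holds;
        # the last j that satisfies it determines the threshold tau.
        if 1.0 + j * z_j > cumsum:
            tau = (cumsum - 1.0) / j
    # Shift by the threshold and clip at zero.
    return [max(z_i - tau, 0.0) for z_i in z]


if __name__ == "__main__":
    print(sparsemax([1.0, 0.0, -1.0]))  # one dominant score takes all the mass
    print(sparsemax([0.2, 0.1]))        # close scores share the mass
```

In an attention layer, the scores `z` would be the query-key products for one query; sparsemax replaces the softmax normalization over them.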

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications

Methods: Softmax · Layer Normalization · Sparsemax