Global Average Feature Augmentation for Robust Semantic Segmentation with Transformers
Alberto Gonzalo Rodriguez Salgado, Maying Shen, Philipp Harzig, Peter Mayer, Jose M. Alvarez

TL;DR
This paper introduces Channel Wise Feature Augmentation (CWFA), a simple technique that enhances the robustness of Vision Transformers for semantic segmentation against visual corruptions, with minimal training overhead.
Contribution
The paper proposes CWFA, a novel feature augmentation method that improves Vision Transformer robustness for semantic segmentation without sacrificing performance on clean data.
Findings
CWFA significantly improves robustness to noise and corruptions.
CWFA achieves state-of-the-art robustness metrics on Cityscapes.
CWFA enhances multiple Transformer architectures with minimal computational cost.
Abstract
Robustness to out-of-distribution data is crucial for deploying modern neural networks. Recently, Vision Transformers, such as SegFormer for semantic segmentation, have shown impressive robustness to visual corruptions like blur or noise affecting the acquisition device. In this paper, we propose Channel Wise Feature Augmentation (CWFA), a simple yet efficient feature augmentation technique to improve the robustness of Vision Transformers for semantic segmentation. CWFA applies a globally estimated perturbation per encoder with minimal compute overhead during training. Extensive evaluations on Cityscapes and ADE20K with three state-of-the-art Vision Transformer architectures (SegFormer, Swin Transformer, and Twins) demonstrate that CWFA-enhanced models significantly improve robustness without affecting clean data performance. For instance, on Cityscapes, a CWFA-augmented SegFormer-B1…
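The abstract does not say how the perturbation is estimated, so the following is only an illustrative sketch of what a channel-wise feature perturbation could look like, not the authors' implementation. It assumes that one random offset is drawn per channel and added to every token of an encoder's feature map during training only. The function name `channel_wise_perturb`, the Gaussian noise model, and the `noise_scale` parameter are all hypothetical.

```python
import random


def channel_wise_perturb(features, noise_scale=0.1, rng=None, training=True):
    """Add one shared random offset per channel to every token.

    features: list of N tokens, each a list of C channel values.
    Assumption (not from the paper): offsets are zero-mean Gaussian
    with standard deviation `noise_scale`.
    """
    # At inference time the features pass through unchanged.
    if not training or not features:
        return [list(token) for token in features]

    rng = rng or random.Random()
    num_channels = len(features[0])

    # One offset per channel, shared across all spatial positions.
    delta = [rng.gauss(0.0, noise_scale) for _ in range(num_channels)]

    # Apply the same channel offsets to every token.
    return [[v + d for v, d in zip(token, delta)] for token in features]
```

Because a single vector of length C is drawn per call, the added cost scales with the number of channels rather than with image resolution, which is in keeping with the abstract's claim of minimal training overhead.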
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Convolution · Mix-FFN · Absolute Position Encodings · Adam · Softmax · Stochastic Depth · Label Smoothing · Dropout
