Effects of Architectures on Continual Semantic Segmentation
Tobias Kalb, Niket Ahuja, Jingxing Zhou, Jürgen Beyerer

TL;DR
This paper investigates how different neural network architectures, including CNNs, Transformers, and hybrids, influence catastrophic forgetting in continual semantic segmentation, highlighting architecture choice as a key factor.
Contribution
It provides a comprehensive comparison of architectures and normalization layers, revealing their impact on stability and plasticity in continual learning for semantic segmentation.
Findings
Transformers are more stable than CNNs in continual learning.
Hybrid architectures balance plasticity and stability effectively.
Continual Normalization improves adaptability and stability.
Abstract
Research in the field of Continual Semantic Segmentation is mainly investigating novel learning algorithms to overcome catastrophic forgetting of neural networks. Most recent publications have focused on improving learning algorithms without distinguishing effects caused by the choice of neural architecture. Therefore, we study how the choice of neural network architecture affects catastrophic forgetting in class- and domain-incremental semantic segmentation. Specifically, we compare the well-researched CNNs to recently proposed Transformers and Hybrid architectures, as well as the impact of the choice of novel normalization layers and different decoder heads. We find that traditional CNNs like ResNet have high plasticity but low stability, while transformer architectures are much more stable. When the inductive biases of CNN architectures are combined with transformers in hybrid…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
Methods: Residual Connection · Batch Normalization · 1x1 Convolution · Convolution · Kaiming Initialization · Max Pooling · Average Pooling · Residual Block · Bottleneck Residual Block
