Deep Nets with Subsampling Layers Unwittingly Discard Useful Activations at Test-Time
Chiao-An Yang, Ziwei Liu, Raymond A. Yeh

TL;DR
This paper challenges the common practice of discarding activations in subsampling layers of deep nets, proposing a method to utilize these discarded features at test time to enhance prediction accuracy across various tasks.
Contribution
The authors introduce a search and aggregate method to leverage discarded activations in subsampling layers, improving test-time performance in image classification and segmentation.
Findings
Consistent performance improvements across nine architectures.
Complements existing test-time augmentation techniques.
Applicable to multiple datasets and tasks.
Abstract
Subsampling layers play a crucial role in deep nets by discarding a portion of an activation map to reduce its spatial dimensions. This encourages the deep net to learn higher-level representations. Contrary to this motivation, we hypothesize that the discarded activations are useful and can be incorporated on the fly to improve a model's predictions. To validate our hypothesis, we propose a search and aggregate method to find useful activation maps to be used at test time. We apply our approach to the tasks of image classification and semantic segmentation. Extensive experiments over nine different architectures on multiple datasets show that our method consistently improves model test-time performance, complementing existing test-time augmentation techniques. Our code is available at https://github.com/ca-joe-yang/discard-in-subsampling.
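To make "discarded activations" concrete, here is a minimal sketch in plain Python, assuming a stride-2 subsampling of a single 2D activation map. A standard layer keeps only the grid starting at offset (0, 0); the other three offsets select the activations that are normally thrown away. The function names (`subsample`, `all_subsampled_maps`, `aggregate_mean`) are hypothetical, and the uniform average is only a stand-in for aggregation: it is not the authors' search and aggregate procedure, which is described in the paper and implemented in the linked repository.

```python
def subsample(x, stride, dy=0, dx=0):
    """Keep every `stride`-th row and column of a 2D list, starting at offset (dy, dx)."""
    return [row[dx::stride] for row in x[dy::stride]]


def all_subsampled_maps(x, stride):
    """Return every subsampled map, keyed by offset; (0, 0) is the usual choice."""
    return {
        (dy, dx): subsample(x, stride, dy, dx)
        for dy in range(stride)
        for dx in range(stride)
    }


def aggregate_mean(maps):
    """Element-wise mean of equally sized 2D maps (placeholder aggregation, an assumption)."""
    maps = list(maps)
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(m[i][j] for m in maps) / len(maps) for j in range(width)]
        for i in range(height)
    ]


# A toy 4x4 activation map with distinct values so each offset is easy to trace.
activation = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]

candidates = all_subsampled_maps(activation, stride=2)
standard = candidates[(0, 0)]                      # what a stride-2 layer keeps
discarded = [candidates[k] for k in candidates if k != (0, 0)]
averaged = aggregate_mean(candidates.values())     # uses all four offsets

print(standard)   # [[0, 2], [8, 10]]
print(averaged)   # [[2.5, 4.5], [10.5, 12.5]]
```

With stride 2, three of the four candidate maps never reach later layers. The paper's hypothesis is that some of these candidates carry useful signal at test time, so selecting among them and combining the results can improve predictions.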
Taxonomy
Topics: Machine Learning and ELM · Neural Networks and Applications · Data Stream Mining Techniques
