TL;DR
This paper introduces a training scheme for CNN classifiers that uses visual explanation techniques to make the model consider multiple image regions rather than only the most discriminative ones, with the aim of improving robustness. The approach is tested on Stanford Cars and FGVC-Aircraft, with state-of-the-art results reported on EgoFoodPlaces.
Contribution
The work presents a new robust training approach that uses visual explanation techniques to distract CNNs during learning, promoting attention to diverse image regions for better classification.
Findings
Achieved state-of-the-art results on the EgoFoodPlaces dataset.
Improved model robustness and generalization across datasets.
Reduced model complexity while maintaining high accuracy.
Abstract
The field of deep learning is evolving in different directions, yet more efficient training strategies are still needed. In this work, we present a novel and robust training scheme that integrates visual explanation techniques into the learning process. Unlike attention mechanisms, which focus on the relevant parts of images, we aim to improve the robustness of the model by making it pay attention to other regions as well. Broadly speaking, the idea is to distract the classifier during learning so that it focuses not only on relevant regions but also on those that, a priori, are less informative for discriminating the class. We tested the proposed approach by embedding it into the learning process of a convolutional neural network for the analysis and classification of two well-known datasets, namely Stanford Cars and FGVC-Aircraft. Furthermore, we evaluated our…
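The abstract does not spell out how the classifier is "distracted", so the following is only a minimal sketch of one plausible reading: take a saliency map from some visual explanation method (for example a Grad-CAM-style heatmap), hide the most salient pixels, and feed the masked image back into training so the network has to rely on other regions. The function names `distract_mask` and `distract_image`, the `drop_fraction` parameter, and the masking rule itself are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def distract_mask(saliency, drop_fraction=0.3):
    """Return a 0/1 mask that hides the most salient pixels.

    Assumption: `saliency` is an (H, W) heatmap produced by any visual
    explanation method; `drop_fraction` is a hypothetical hyperparameter
    giving the share of pixels to hide.
    """
    if not 0.0 <= drop_fraction < 1.0:
        raise ValueError("drop_fraction must be in [0, 1)")
    flat = np.asarray(saliency, dtype=np.float32).ravel()
    k = int(round(drop_fraction * flat.size))
    mask = np.ones(flat.size, dtype=np.float32)
    if k > 0:
        # Indices of the k largest saliency values (order among them is irrelevant).
        top = np.argpartition(flat, -k)[-k:]
        mask[top] = 0.0
    return mask.reshape(np.shape(saliency))


def distract_image(image, saliency, drop_fraction=0.3, fill_value=0.0):
    """Replace the most salient region of an (H, W, C) image with `fill_value`."""
    mask = distract_mask(saliency, drop_fraction)[..., None]  # broadcast over channels
    return image * mask + fill_value * (1.0 - mask)


# Toy example: saliency grows from top-left to bottom-right,
# so hiding 25% of a 4x4 map removes the bottom row.
saliency = np.arange(16, dtype=np.float32).reshape(4, 4)
image = np.ones((4, 4, 3), dtype=np.float32)
distracted = distract_image(image, saliency, drop_fraction=0.25)
```

In a training loop, a masked image like `distracted` would be passed through the network alongside (or instead of) the original, with the same class label, so that the loss rewards predictions made from the less salient regions. How the two losses are combined, and how often the saliency map is recomputed, are design choices the abstract leaves open.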
