Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
Wieland Brendel, Matthias Bethge

TL;DR
This paper introduces BagNets, a simple variant of ResNet-50 that classifies images from small local features without regard to their spatial order, reaching 87.6% top-5 accuracy on ImageNet while making decisions easier to interpret.
Contribution
The paper demonstrates that a bag-of-local-features model can perform competitively on ImageNet and provides insights into the decision strategies of deep neural networks.
Findings
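BagNets reach 87.6% top-5 accuracy on ImageNet with 33 x 33 px features, and AlexNet-level performance with 17 x 17 px features.
Because each decision is built from local features only, it is straightforward to analyse how each part of the image influences the classification.
BagNets behave similarly to state-of-the-art DNNs in feature sensitivity and error distribution.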
Abstract
Deep Neural Networks (DNNs) excel on many complex perceptual tasks, but it has proven notoriously difficult to understand how they reach their decisions. We here introduce a high-performance DNN architecture on ImageNet whose decisions are considerably easier to explain. Our model, a simple variant of the ResNet-50 architecture called BagNet, classifies an image based on the occurrences of small local image features without taking into account their spatial ordering. This strategy is closely related to the bag-of-features (BoF) models popular before the onset of deep learning and reaches a surprisingly high accuracy on ImageNet (87.6% top-5 for 33 x 33 px features and AlexNet performance for 17 x 17 px features). The constraint on local features makes it straightforward to analyse how exactly each part of the image influences the classification. Furthermore, the BagNets behave similar to…
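The classification strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear map below is a hypothetical stand-in for the ResNet-50-based network that scores each patch, and the stride, image size, class count, and function names are assumptions chosen for illustration. What the sketch does show is the core idea: each local patch yields class evidence on its own, the evidence is averaged over all patches, and the result therefore does not depend on where the patches sit in the image.

```python
import numpy as np


def extract_patches(image, patch_size, stride):
    """Collect every patch_size x patch_size patch as a flattened row."""
    height, width, _ = image.shape
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = image[top:top + patch_size, left:left + patch_size, :]
            patches.append(patch.reshape(-1))
    return np.stack(patches)


def bag_of_local_features_logits(patches, weights, bias):
    """Score each patch independently, then average over patches.

    The linear scorer is an assumed placeholder for a deep network with a
    limited receptive field; only the averaging step mirrors the abstract.
    """
    patch_logits = patches @ weights + bias      # (num_patches, num_classes)
    image_logits = patch_logits.mean(axis=0)     # spatial order is discarded
    return image_logits, patch_logits


rng = np.random.default_rng(0)
patch_size, stride, num_classes = 33, 8, 10      # stride and classes assumed
image = rng.random((65, 65, 3))                  # toy image, not ImageNet data
weights = rng.normal(size=(patch_size * patch_size * 3, num_classes))
bias = np.zeros(num_classes)

patches = extract_patches(image, patch_size, stride)
image_logits, patch_logits = bag_of_local_features_logits(patches, weights, bias)

# Shuffling the patches leaves the image-level prediction unchanged.
shuffled = patches[rng.permutation(len(patches))]
shuffled_logits, _ = bag_of_local_features_logits(shuffled, weights, bias)
```

In this sketch `patch_logits` holds one row of class evidence per patch, so arranging the rows back on the patch grid would give a per-class map of which image regions contributed to the decision.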
Taxonomy
Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Domain Adaptation and Few-Shot Learning
