Progress and limitations of deep networks to recognize objects in unusual poses
Amro Abbas, Stéphane Deny

TL;DR
This study evaluates the robustness of deep networks in recognizing objects in unusual poses, revealing significant accuracy drops and limited improvements from various design choices, highlighting challenges for real-world applications.
Contribution
We introduce a synthetic dataset for unusual object poses and systematically assess 38 deep networks, demonstrating persistent brittleness and the impact of training data size on robustness.
Findings
Across the 38 networks tested, accuracy drops by 29.5% on average on unusual poses compared with upright objects.
Training on large datasets improves robustness, with the best network showing only a 14.5% accuracy drop.
Transformations like 3D rotations further decrease recognition performance.
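The accuracy-drop figures above can be illustrated with a minimal sketch. It assumes the drop is the absolute difference, in percentage points, between accuracy on upright images and accuracy on unusual poses; the summary does not say whether the reported drop is absolute or relative. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def accuracy_drop(upright_acc, unusual_acc):
    """Absolute drop in percentage points (assumed definition)."""
    return 100.0 * (upright_acc - unusual_acc)


def mean_accuracy_drop(per_network):
    """Average drop over a dict: name -> (upright_acc, unusual_acc)."""
    return mean(accuracy_drop(up, un) for up, un in per_network.values())


# Toy values for two hypothetical networks, not results from the paper.
toy_results = {
    "net_a": (0.90, 0.60),
    "net_b": (0.80, 0.70),
}
print(f"mean drop: {mean_accuracy_drop(toy_results):.1f} points")
```

Averaging per-network drops, rather than pooling all predictions, weights each network equally; this is one plausible reading of "average accuracy drop", not a confirmed detail of the study.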
Abstract
Deep networks should be robust to rare events if they are to be successfully deployed in high-stakes real-world applications (e.g., self-driving cars). Here we study the capability of deep networks to recognize objects in unusual poses. We create a synthetic dataset of images of objects in unusual orientations, and evaluate the robustness of a collection of 38 recent and competitive deep networks for image classification. We show that classifying these images is still a challenge for all networks tested, with an average accuracy drop of 29.5% compared to when the objects are presented upright. This brittleness is largely unaffected by various network design choices, such as training losses (e.g., supervised vs. self-supervised), architectures (e.g., convolutional networks vs. transformers), dataset modalities (e.g., images vs. image-text pairs), and data-augmentation schemes. However,…
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
Methods: Dropout · Stochastic Depth · RandAugment · Noisy Student
