Inverting and Understanding Object Detectors
Ang Cao, Justin Johnson

TL;DR
This paper introduces an inversion-based method for visualizing and interpreting modern object detectors. It shows that detectors rely on different features for classification and regression, learn object co-occurrence patterns, and use size-dependent visual cues.
Contribution
The paper develops an optimization-based layout inversion technique that generates synthetic images which trained detectors recognize as containing a desired configuration of objects, providing new insight into detector behavior and feature reliance.
Findings
Detectors use different features for classification and regression.
Detectors learn canonical co-occurrence motifs.
Size variation affects visual cues used by detectors.
Abstract
As a core problem in computer vision, the performance of object detection has improved drastically in the past few years. Despite their impressive performance, object detectors suffer from a lack of interpretability. Visualization techniques have been developed and widely applied to introspect the decisions made by other kinds of deep learning models; however, visualizing object detectors has been underexplored. In this paper, we propose using inversion as a primary tool to understand modern object detectors and develop an optimization-based approach to layout inversion, allowing us to generate synthetic images recognized by trained detectors as containing a desired configuration of objects. We reveal intriguing properties of detectors by applying our layout inversion technique to a variety of modern object detectors, and further investigate them via validation experiments: they rely on…
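The optimization-based inversion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the "detector" here is a hypothetical frozen linear scorer over a tiny flattened image, and the names `make_toy_detector`, `detect`, and `invert_layout`, as well as the binary cross-entropy objective and the optional `l2` image prior, are assumptions made for illustration. The point it shows is the general idea only: the model's weights stay fixed while the pixels are updated by gradient descent until the model's outputs match a desired layout.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_detector(num_cells, num_classes, num_pixels, seed=0):
    """Hypothetical frozen 'detector': one random linear scorer per (cell, class)."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
             for _ in range(num_classes)]
            for _ in range(num_cells)]


def detect(weights, image):
    """Return scores[cell][cls] in (0, 1): confidence that cls appears in cell."""
    return [[sigmoid(sum(w * x for w, x in zip(row, image))) for row in cell]
            for cell in weights]


def invert_layout(weights, target, steps=2000, lr=0.05, l2=0.0):
    """Optimize the image (not the weights) so detections match the target layout.

    target[cell][cls] is 1 if that class should be detected in that cell, else 0.
    The loss is binary cross-entropy plus an optional L2 penalty on the pixels,
    a stand-in (assumed here) for whatever image prior a real method would use.
    """
    num_pixels = len(weights[0][0])
    image = [0.0] * num_pixels  # start from a blank image
    for _ in range(steps):
        scores = detect(weights, image)
        grad = [l2 * x for x in image]  # gradient of (l2 / 2) * ||image||^2
        for cell, cell_scores, cell_target in zip(weights, scores, target):
            for row, p, t in zip(cell, cell_scores, cell_target):
                err = p - t  # derivative of BCE with respect to the logit
                for i, w in enumerate(row):
                    grad[i] += err * w
        image = [x - lr * g for x, g in zip(image, grad)]
    return image


if __name__ == "__main__":
    detector = make_toy_detector(num_cells=2, num_classes=2, num_pixels=16)
    layout = [[1, 0], [0, 1]]  # class 0 in cell 0, class 1 in cell 1
    inverted = invert_layout(detector, layout)
    print(detect(detector, inverted))
```

With a real detector the hand-written gradient would be replaced by automatic differentiation through the network, and the layout would be expressed as boxes and class labels rather than a grid of binary targets.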
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
