Improving Object Detection and Attribute Recognition by Feature Entanglement Reduction
Zhaoheng Zheng, Arka Sadhu, Ram Nevatia

TL;DR
This paper proposes a two-stream model to disentangle category and attribute features in object detection, leading to improved accuracy in detecting objects and their attributes like color and material.
Contribution
The paper introduces a novel two-stream architecture that reduces feature entanglement between object categories and attributes, enhancing detection performance.
Findings
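Significant improvements over a traditional single-stream model on VG-20, a subset of Visual Genome, on both supervised and attribute transfer tasks.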
Effective disentanglement of category and attribute features.
Enhanced attribute transfer capabilities.
Abstract
We explore object detection with two attributes: color and material. The task aims to simultaneously detect objects and infer their color and material. A straightforward approach is to add attribute heads at the very end of a usual object detection pipeline. However, we observe that the two goals are in conflict: object detection should be attribute-independent, and attributes should be largely object-independent. Features computed by a standard detection network entangle the category and attribute features; we disentangle them by the use of a two-stream model where the category and attribute features are computed independently but the classification heads share Regions of Interest (RoIs). Compared with a traditional single-stream model, our model shows significant improvements on VG-20, a subset of Visual Genome, on both supervised and attribute transfer tasks.
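The two-stream idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature maps, the average-pooling RoI operator, the linear heads, and every function name below (`roi_average_pool`, `linear_head`, `two_stream_predict`) are assumptions chosen to keep the example self-contained. It shows only the structural point that category and attribute features come from separate streams while both sets of heads are evaluated on the same RoIs.

```python
# Hypothetical sketch of a two-stream prediction step (pure Python, no dependencies).
# Feature maps are nested lists indexed as feature_map[y][x][channel].

def roi_average_pool(feature_map, roi):
    """Average the features inside a half-open box (x0, y0, x1, y1) given in cells."""
    x0, y0, x1, y1 = roi
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    count = 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            for c in range(channels):
                sums[c] += feature_map[y][x][c]
            count += 1
    return [s / count for s in sums]

def linear_head(features, weights):
    """Score each class as a dot product between its weight row and the features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)

def two_stream_predict(category_map, attribute_map, rois,
                       category_weights, color_weights, material_weights):
    """Classify each shared RoI using features from two independent streams."""
    predictions = []
    for roi in rois:
        # The same RoI is pooled from each stream's own feature map.
        category_features = roi_average_pool(category_map, roi)
        attribute_features = roi_average_pool(attribute_map, roi)
        predictions.append({
            "roi": roi,
            # The category head sees only category-stream features.
            "category": argmax(linear_head(category_features, category_weights)),
            # Both attribute heads see only attribute-stream features.
            "color": argmax(linear_head(attribute_features, color_weights)),
            "material": argmax(linear_head(attribute_features, material_weights)),
        })
    return predictions

# Toy usage: 2x2 feature maps with 2 channels and a single RoI covering the map.
category_map = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]]]
attribute_map = [[[0.0, 1.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
rois = [(0, 0, 2, 2)]
category_weights = [[1.0, 0.0], [0.0, 1.0]]
color_weights = [[1.0, 0.0], [0.0, 1.0]]
material_weights = [[0.0, 1.0], [1.0, 0.0]]

predictions = two_stream_predict(category_map, attribute_map, rois,
                                 category_weights, color_weights, material_weights)
print(predictions)
```

In this toy setup, changing `attribute_map` can alter the color and material predictions but never the category prediction, which is the independence property the two-stream design is after. A single-stream model, by contrast, would feed one shared feature vector to all three heads.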
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection · Advanced Neural Network Applications
