EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement
Linpu Fang, Hang Xu, Zhili Liu, Sarah Parisot, Zhenguo Li

TL;DR
EHSOD introduces an end-to-end hybrid-supervised object detection system that effectively combines fully-annotated and weakly-annotated data, achieving high performance with less manual labeling effort.
Contribution
The paper proposes a novel end-to-end hybrid-supervised detection framework with CAM-guided proposals and cascade refinement, eliminating iterative pseudo-labeling.
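For readers unfamiliar with the term, a class activation map (CAM) is commonly computed as a class-specific weighted sum of a classifier's final convolutional feature maps, and high-response regions can then be boxed to suggest where an object lies. The sketch below illustrates that generic idea only; it is not EHSOD's proposal module, whose details are not given on this page. The function names, the 0.5 threshold, and the single box drawn around all above-threshold cells are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Grid = List[List[float]]  # one H x W map, stored as rows


def class_activation_map(features: List[Grid], class_weights: List[float]) -> Grid:
    """Weighted sum of K feature maps: cam[y][x] = sum_k w_k * f_k[y][x]."""
    if len(features) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalize(cam: Grid) -> Grid:
    """Rescale values to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


def box_from_cam(cam: Grid, threshold: float = 0.5) -> Optional[Tuple[int, int, int, int]]:
    """Tightest (x_min, y_min, x_max, y_max) around cells >= threshold, else None.

    Simplification (assumption): all above-threshold cells share one box,
    with no connected-component analysis.
    """
    norm = normalize(cam)
    hits = [(x, y) for y, row in enumerate(norm) for x, v in enumerate(row) if v >= threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # Two toy 3x3 feature maps and hypothetical classifier weights for one class.
    feats = [
        [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
        [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
    ]
    cam = class_activation_map(feats, [2.0, -1.0])
    print(box_from_cam(cam))  # (1, 1, 2, 2)
```

In a hybrid-supervised setting, boxes like this could serve as coarse location hints for images that carry only image-level labels; how EHSOD actually uses the CAM signal is described in the paper itself.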
Findings
Achieves 37.5% mAP on COCO with only 30% fully-annotated data.
Outperforms traditional iterative methods in efficiency and accuracy.
Demonstrates effectiveness across multiple detection benchmarks.
Abstract
Object detectors trained on fully-annotated data currently yield state-of-the-art performance but require expensive manual annotations. On the other hand, weakly-supervised detectors have much lower performance and cannot be used reliably in a realistic setting. In this paper, we study the hybrid-supervised object detection problem, aiming to train a high-quality detector with only a limited amount of fully-annotated data while fully exploiting cheap data with image-level labels. State-of-the-art methods typically propose an iterative approach, alternating between generating pseudo-labels and updating a detector. This paradigm requires careful manual hyper-parameter tuning for mining good pseudo-labels at each round and is quite time-consuming. To address these issues, we present EHSOD, an end-to-end hybrid-supervised object detection system which can be trained in one shot on both fully…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
