MUM : Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection
JongMok Kim, Jooyoung Jang, Seunghyeon Seo, Jisoo Jeong, Jongkeun Na, Nojun Kwak

TL;DR
This paper introduces MUM, a data augmentation technique for semi-supervised object detection that mixes image tiles at the input and unmixes the corresponding feature tiles, creating effective weak-strong input pairs and improving detection performance.
Contribution
MUM is a simple, effective augmentation method that reconstructs mixed image tiles in feature space, compatible with various SSOD methods and enhancing detection accuracy.
Findings
Consistently improves mAP across benchmarks
Effective in creating meaningful weak-strong pairs
Enhances SSOD performance on MS-COCO and PASCAL VOC
Abstract
Many recent semi-supervised learning (SSL) studies build teacher-student architecture and train the student network by the generated supervisory signal from the teacher. Data augmentation strategy plays a significant role in the SSL framework since it is hard to create a weak-strong augmented input pair without losing label information. Especially when extending SSL to semi-supervised object detection (SSOD), many strong augmentation methodologies related to image geometry and interpolation-regularization are hard to utilize since they possibly hurt the location information of the bounding box in the object detection task. To address this, we introduce a simple yet effective data augmentation method, Mix/UnMix (MUM), which unmixes feature tiles for the mixed image tiles for the SSOD framework. Our proposed method makes mixed input image tiles and reconstructs them in the feature space.…
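The mix/unmix idea in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each image is cut into an n_tiles x n_tiles grid, that tiles are shuffled across the batch while staying at the same grid position, and that the inverse permutation is applied to the feature map. The names make_tile_perm, mix_tiles, unmix_tiles, and toy_backbone are hypothetical, and toy_backbone is a stand-in (2x2 average pooling) rather than a real detector backbone.

```python
import numpy as np

def make_tile_perm(batch, n_tiles, rng):
    # One independent permutation of batch indices per tile position.
    # Result shape: (n_tiles * n_tiles, batch). Assumed scheme, not from the paper text.
    return np.stack([rng.permutation(batch) for _ in range(n_tiles * n_tiles)])

def _shuffle_tiles(x, perm, n_tiles):
    # x has shape (B, C, H, W); H and W must be divisible by n_tiles.
    _, _, h, w = x.shape
    th, tw = h // n_tiles, w // n_tiles
    out = np.empty_like(x)
    for t in range(n_tiles * n_tiles):
        i, j = divmod(t, n_tiles)
        rows = slice(i * th, (i + 1) * th)
        cols = slice(j * tw, (j + 1) * tw)
        # Tile (i, j) of output image b comes from image perm[t][b].
        out[:, :, rows, cols] = x[perm[t]][:, :, rows, cols]
    return out

def mix_tiles(images, perm, n_tiles):
    # Mix: exchange image tiles across the batch, keeping tile positions fixed.
    return _shuffle_tiles(images, perm, n_tiles)

def unmix_tiles(features, perm, n_tiles):
    # Unmix: apply the inverse permutation on the feature map's tile grid.
    inverse = np.argsort(perm, axis=1)
    return _shuffle_tiles(features, inverse, n_tiles)

def toy_backbone(x):
    # Hypothetical stand-in for a feature extractor: 2x2 average pooling.
    b, c, h, w = x.shape
    return x.reshape(b, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))

rng = np.random.default_rng(0)
images = rng.normal(size=(4, 3, 8, 8))       # batch of 4 toy images
perm = make_tile_perm(batch=4, n_tiles=2, rng=rng)

mixed = mix_tiles(images, perm, n_tiles=2)   # strongly augmented input
feats = toy_backbone(mixed)                  # features of the mixed images
restored = unmix_tiles(feats, perm, n_tiles=2)
```

With this toy pooling backbone, the unmixed features match the features of the unmixed images exactly, because each pooling window stays inside one tile. With a real convolutional backbone the receptive fields cross tile borders, so the unmixed features would only approximate those of the original images; the bounding-box coordinates of the original images remain usable either way, since every tile returns to its original position.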
Taxonomy
Topics: Advanced Neural Network Applications · Brain Tumor Detection and Classification · Domain Adaptation and Few-Shot Learning
