Automatic dataset generation for specific object detection
Xiaotian Lin, Leiyang Xu, Qiang Wang

TL;DR
This paper introduces a novel method for automatically generating object detection datasets by synthesizing images with objects seamlessly integrated into new scenes, reducing manual labeling effort and enhancing data quality.
Contribution
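It presents a new approach for synthesizing object-in-scene images that preserve the objects' detailed features, providing better training data for detection models. This addresses two limitations of existing datasets: image collection and labeling do not scale, and bounding-box labels carry little information.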
Findings
Synthesized images blend object boundaries effectively with backgrounds.
State-of-the-art segmentation models perform well on the generated data.
The method reduces the need for manual dataset creation.
Abstract
Over the past decade, object detection tasks have been defined mostly by large public datasets. However, building object detection datasets does not scale, because image collection and labeling are inefficient. Furthermore, most labels are still in the form of bounding boxes, which provide much less information than the human visual system does. In this paper, we present a method to synthesize object-in-scene images, which preserves the objects' detailed features without introducing irrelevant information. In brief, given a set of images containing a target object, our algorithm first trains a model to find an approximate center of the object as an anchor, then performs an outline regression to estimate its boundary, and finally blends the object into a new scene. Our results show that, in the synthesized images, the boundaries of objects blend very well with the background. Experiments also show that…
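The last stage of the pipeline described above (blending an outlined object into a new scene) can be illustrated with a minimal sketch. It assumes the object mask is already available, so the anchor-finding and outline-regression stages are not reproduced. The function names `feather` and `blend_object`, the box-blur feathering, and the `radius` parameter are illustrative assumptions, not the authors' implementation. The sketch also returns a bounding box derived from the mask, to show how a label could come for free with each synthesized image.

```python
import numpy as np

def feather(mask, radius=2):
    """Soften a binary mask with a box blur so its edges fade gradually.

    Assumption: a plain box blur stands in for whatever boundary
    smoothing the paper actually uses.
    """
    soft = mask.astype(np.float64)
    h, w = soft.shape
    k = 2 * radius + 1
    padded = np.pad(soft, radius, mode="edge")
    out = np.zeros_like(soft)
    # Sum the k*k shifted copies of the mask, then normalize to [0, 1].
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def blend_object(scene, obj, mask, top, left, radius=2):
    """Alpha-blend obj into scene at (top, left).

    scene: (H, W, 3) array, obj: (h, w, 3) array, mask: (h, w) array
    with nonzero values on the object. Returns the blended image and
    a bounding box (x_min, y_min, x_max, y_max) in scene coordinates,
    with exclusive maximum coordinates.
    """
    h, w = mask.shape
    alpha = feather(mask, radius)[..., None]  # shape (h, w, 1)
    out = scene.astype(np.float64)
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = alpha * obj + (1.0 - alpha) * region
    # The label is derived from the hard mask, not the feathered one.
    ys, xs = np.nonzero(mask)
    bbox = (int(left + xs.min()), int(top + ys.min()),
            int(left + xs.max()) + 1, int(top + ys.max()) + 1)
    return out.astype(scene.dtype), bbox
```

Calling `blend_object(scene, obj, mask, top, left)` yields both the synthesized image and its box label in one step. A larger `radius` gives a softer transition at the object boundary, at the cost of blurring fine outline detail.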
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection
