Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du, Haoxin Li, Jianfei Yu, Boyang Li

TL;DR
This paper introduces POBF, a novel framework that synthesizes and filters training data to improve visual grounding performance in data-scarce scenarios, demonstrating significant accuracy gains and robustness across multiple benchmarks.
Contribution
The paper proposes POBF, a new method combining image synthesis and data filtering to enhance visual grounding under limited data conditions, addressing label misalignment and data selection challenges.
Findings
POBF achieves an average gain of 5.83% over the real-data-only method across four benchmark datasets.
POBF outperforms leading baselines by 2.29%-3.85% in accuracy.
The framework is robust across different generative models, data sizes, and architectures.
Abstract
Visual grounding aims to localize the image regions based on a textual query. Given the difficulty of large-scale data curation, we investigate how to effectively learn visual grounding under data-scarce settings in this paper. To address the data scarcity, we propose a novel framework, POBF (Paint Outside the Box and Filter). POBF synthesizes images by inpainting outside the box, tackling a label misalignment issue encountered in previous works. Furthermore, POBF leverages an innovative filtering scheme to select the most effective training data. This scheme combines a hardness score and an overfitting score, balanced by a penalty term. Extensive experiments across four benchmark datasets demonstrate that POBF consistently improves performance, achieving an average gain of 5.83% over the real-data-only method and outperforming leading baselines by 2.29%-3.85% in accuracy.…
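Illustration
The idea of inpainting outside the box can be pictured with a small mask-building sketch. This is not the authors' implementation: the function names, the (x_min, y_min, x_max, y_max) box convention with exclusive maxima, and the convention that 1 marks pixels to regenerate are all assumptions made for illustration. The point it shows is that pixels inside the annotated box are copied unchanged, so the original box label still fits the synthesized image.

```python
from typing import List, Sequence, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max), with the max edges exclusive.
Box = Tuple[int, int, int, int]


def outside_box_mask(width: int, height: int, box: Box) -> List[List[int]]:
    """Return a height x width mask: 1 = regenerate (outside box), 0 = keep (inside box)."""
    x0, y0, x1, y1 = box
    if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
        raise ValueError("box must be non-empty and lie inside the image")
    return [
        [0 if (x0 <= x < x1 and y0 <= y < y1) else 1 for x in range(width)]
        for y in range(height)
    ]


def composite(
    original: Sequence[Sequence[object]],
    generated: Sequence[Sequence[object]],
    mask: Sequence[Sequence[int]],
) -> List[List[object]]:
    """Take generated pixels where mask is 1 and original pixels where mask is 0."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


# Toy 4x3 "image": the box covers the two middle pixels of the middle row.
mask = outside_box_mask(4, 3, (1, 1, 3, 2))
original = [["o"] * 4 for _ in range(3)]
generated = [["g"] * 4 for _ in range(3)]  # stand-in for a generative model's output
result = composite(original, generated, mask)
```

In a real pipeline the generated pixels would come from an inpainting model conditioned on this mask; the toy strings above only make the keep/regenerate split visible.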
Taxonomy
Topics: Data Visualization and Analytics
Methods: Inpainting
