SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding

Morgan Heisler; Amin Banitalebi-Dehkordi; Yong Zhang

arXiv:2208.07407 · cs.CV · May 21, 2025

Open Access

TL;DR

SemAug introduces a novel image augmentation technique that injects semantically meaningful objects into scenes using language grounding, enhancing object detection performance without significant overhead.

Contribution

The paper proposes an augmentation method that adds contextually relevant objects to images, improving generalization for object detection without extra training overhead.

Findings

1. Achieved 2-4% mAP improvement on Pascal VOC.
2. Achieved 1-2% mAP improvement on COCO.
3. Effective across various model architectures.

Abstract

Data augmentation is an essential technique in improving the generalization of deep neural networks. The majority of existing image-domain augmentations either rely on geometric and structural transformations, or apply different kinds of photometric distortions. In this paper, we propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects that can be placed into relevant locations in the image (the what and where problems). Then it embeds these objects into their relevant target locations, thereby promoting diversity of object instance distribution. Our method allows for introducing new object instances and categories that may not even exist in the training set.…
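
The "what" and "where" steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy embedding table, the object bank, the cosine-similarity selection rule, and the non-overlap placement rule are all assumptions made here for illustration, since the abstract does not specify how objects or locations are chosen.

```python
import numpy as np

# Hypothetical toy word embeddings (assumption); a real pipeline would
# load pretrained language embeddings instead.
EMBEDDINGS = {
    "dog": np.array([0.9, 0.1, 0.0]),
    "cat": np.array([0.8, 0.2, 0.1]),
    "sofa": np.array([0.1, 0.9, 0.2]),
    "car": np.array([0.0, 0.2, 0.9]),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def choose_object(scene_labels, bank_labels):
    """'What': pick the bank category closest to the mean scene embedding."""
    scene_vec = np.mean([EMBEDDINGS[l] for l in scene_labels], axis=0)
    candidates = [l for l in bank_labels if l not in scene_labels]
    return max(candidates, key=lambda l: cosine(EMBEDDINGS[l], scene_vec))


def overlaps(a, b):
    """True if two boxes given as (x1, y1, x2, y2) intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def choose_location(image_shape, boxes, patch_hw, stride=8):
    """'Where': first position in scan order that overlaps no existing box.

    This placement rule is a stand-in (assumption), not the paper's method.
    """
    height, width = image_shape[:2]
    h, w = patch_hw
    for y in range(0, height - h + 1, stride):
        for x in range(0, width - w + 1, stride):
            cand = (x, y, x + w, y + h)
            if not any(overlaps(cand, b) for b in boxes):
                return cand
    return None


def augment(image, boxes, labels, bank):
    """Paste one semantically related object patch and extend the annotations."""
    label = choose_object(labels, list(bank))
    patch = bank[label]
    loc = choose_location(image.shape, boxes, patch.shape[:2])
    if loc is None:  # no free space: return the sample unchanged
        return image, list(boxes), list(labels)
    out = image.copy()
    x1, y1, x2, y2 = loc
    out[y1:y2, x1:x2] = patch
    return out, list(boxes) + [loc], list(labels) + [label]
```

Calling `augment(image, boxes, labels, bank)` with an image array, its ground-truth boxes and labels, and a dictionary mapping category names to object patches returns a new image plus the extended annotations; the input image is left unmodified.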

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Natural Language Processing Techniques