Evaluating the Performance of Open-Vocabulary Object Detection in Low-quality Image

Po-Chih Wu

arXiv:2512.22801·cs.CV·January 5, 2026

TL;DR

This paper evaluates open-vocabulary object detection models on low-quality images, shows that their robustness depends on the degradation level, and provides a new dataset for future research.

Contribution

Introduces a new dataset that simulates low-quality images and benchmarks existing models on it, highlighting how their performance differs across degradation levels.

Findings

01. Models maintain performance under low-level degradation

02. High-level degradation causes sharp performance drops

03. OWLv2 outperforms other models across degradations

Abstract

Open-vocabulary object detection enables models to localize and recognize objects beyond a predefined set of categories and is expected to achieve recognition capabilities comparable to human performance. In this study, we aim to evaluate the performance of existing models on open-vocabulary object detection tasks under low-quality image conditions. For this purpose, we introduce a new dataset that simulates low-quality images in the real world. In our evaluation experiment, we find that although open-vocabulary object detection models exhibited no significant decrease in mAP scores under low-level image degradation, the performance of all models dropped sharply under high-level image degradation. OWLv2 models consistently performed better across different types of degradation, while OWL-ViT, GroundingDINO, and Detic showed significant performance declines. We will release our dataset…
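
To make the idea of "low-level" versus "high-level" degradation concrete, the following is a minimal sketch of how degraded copies of an image might be produced before running a detector. The abstract excerpt does not state which degradation types or severity settings the dataset uses, so the two degradations here (Gaussian noise and resolution reduction), the `SEVERITY` values, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` (0-255 scale)."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def reduce_resolution(image, factor):
    """Downsample by striding, then upsample by pixel repetition to the original size."""
    small = image[::factor, ::factor]
    restored = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return restored[: image.shape[0], : image.shape[1]]


# Hypothetical severity settings; the paper's actual levels are not given in this excerpt.
SEVERITY = {
    "low": {"sigma": 5.0, "factor": 1},
    "high": {"sigma": 60.0, "factor": 8},
}


def degrade(image, level, seed=0):
    """Return a degraded copy of an H x W x C uint8 image at the given severity level."""
    rng = np.random.default_rng(seed)
    params = SEVERITY[level]
    blurred = reduce_resolution(image, params["factor"])
    return add_gaussian_noise(blurred, params["sigma"], rng)


def mean_abs_error(a, b):
    """Mean absolute pixel difference, used here as a rough proxy for degradation strength."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))
```

In a benchmark of this kind, each degraded copy would be passed to the detectors under test and mAP compared against the score on the clean image; the detector calls are omitted here because their interfaces are not described in the source.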

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning