DarkQA: Benchmarking Vision-Language Models on Visual-Primitive Question Answering in Low-Light Indoor Scenes

Yohan Park; Hyunwoo Ha; Wonjun Jo; Tae-Hyun Oh

arXiv:2512.24985 · cs.CV · May 13, 2026

TL;DR

DarkQA is a benchmark designed to evaluate vision-language models' robustness in low-light indoor scenes, highlighting their limitations under challenging visual conditions.

Contribution

It introduces a physically realistic, open-source benchmark with 9.4K question-image pairs for assessing perceptual primitives in low-light environments, validated against real camera data.

Findings

01

VLM performance degrades under low-light and sensor-noise conditions.

02

LLIE methods provide inconsistent recovery across severity levels.

03

DarkQA reveals systematic limitations of current VLMs in low-light scenarios.
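
The findings above are stated per severity level, so one way to read them is as accuracy grouped by degradation level, compared with and without low-light image enhancement (LLIE). The following is a minimal sketch of that bookkeeping, not the benchmark's own evaluation code: the record layout `(severity_level, is_correct)`, the function names `accuracy_by_severity` and `recovery`, and the toy records are all assumptions made for illustration and do not reflect reported results.

```python
from collections import defaultdict


def accuracy_by_severity(records):
    """Return accuracy per degradation level.

    `records` is an iterable of (severity_level, is_correct) pairs.
    This layout is hypothetical, not DarkQA's actual data format.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, is_correct in records:
        totals[level] += 1
        correct[level] += int(is_correct)
    return {level: correct[level] / totals[level] for level in totals}


def recovery(raw, enhanced):
    """Per-level accuracy change after enhancement (positive means improvement)."""
    return {level: enhanced[level] - raw[level] for level in raw if level in enhanced}


# Toy, made-up graded answers: level 0 is well-lit, higher levels are darker.
raw_records = [(0, True), (0, True), (1, True), (1, False), (2, False), (2, False)]
enhanced_records = [(0, True), (0, False), (1, True), (1, True), (2, False), (2, True)]

raw_acc = accuracy_by_severity(raw_records)
enhanced_acc = accuracy_by_severity(enhanced_records)
delta = recovery(raw_acc, enhanced_acc)
```

Keeping the per-level deltas separate, rather than averaging them into one number, is what makes an inconsistent recovery pattern visible: an enhancement step can help at one level and hurt at another.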

Abstract

Vision-Language Models (VLMs) are increasingly adopted as central reasoning modules for embodied agents. Existing benchmarks evaluate their capabilities under ideal, well-lit conditions, yet robust 24/7 operation demands performance under a wide range of visual degradations, including low-light conditions at night or in dark environments, a core requirement that has been largely overlooked. To address this underexplored challenge, we present DarkQA, an open-source benchmark for evaluating perceptual primitives under multi-level low-light conditions in embodied scenarios. DarkQA evaluates single-view egocentric observations across controlled degradation levels, isolating low-light perceptual failures before they are entangled with complex embodied tasks. The benchmark contains 9.4K deterministically generated and verifiable question-image pairs spanning five visual-primitive families. A…

Code & Models

Repositories

https://darkqa-benchmark.github.io