Scalable Evaluation of the Realism of Synthetic Environmental Augmentations in Images
Damian J. Ruck, Paul Vautravers, Oliver Chalkley, Jake Thomas

TL;DR
This paper introduces a scalable framework to evaluate the realism of synthetic environmental augmentations in images, demonstrating that generative AI can produce highly realistic adverse-condition images suitable for AI system evaluation.
Contribution
The paper develops a novel, scalable evaluation framework that combines two automated metrics, a vision-language model (VLM) jury and embedding-based distributional analysis, to assess the realism of generative AI image edits for environmental conditions.
Findings
Generative AI methods outperform rule-based approaches in realism.
Best generative methods achieve 3.6 times higher acceptance than rule-based methods.
Generative models match or exceed real-image realism for most conditions.
Abstract
Evaluation of AI systems often requires synthetic test cases, particularly for rare or safety-critical conditions that are difficult to observe in operational data. Generative AI offers a promising approach for producing such data through controllable image editing, but its usefulness depends on whether the resulting images are sufficiently realistic to support meaningful evaluation. We present a scalable framework for assessing the realism of synthetic image-editing methods and apply it to the task of adding environmental conditions (fog, rain, snow, and nighttime) to car-mounted camera images. Using 40 clear-day images, we compare rule-based augmentation libraries with generative AI image-editing models. Realism is evaluated using two complementary automated metrics: a vision-language model (VLM) jury for perceptual realism assessment, and embedding-based distributional analysis to…
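To make the acceptance figures in the findings concrete, here is a minimal Python sketch of how a jury-based acceptance rate could be computed and compared across methods. Everything in it is an assumption for illustration: the summary does not state how the paper aggregates jury votes, so the strict-majority rule, the function name `acceptance_rates`, the juror labels, and the toy votes are all hypothetical and are not the paper's data or results.

```python
from collections import defaultdict


def acceptance_rates(votes):
    """Return the fraction of images per method accepted by a jury.

    votes: iterable of (method, image_id, juror, is_realistic) tuples.
    An image counts as accepted when a strict majority of its jurors
    voted it realistic (an assumed aggregation rule, not the paper's).
    """
    # Group the boolean votes by (method, image).
    per_image = defaultdict(list)
    for method, image_id, _juror, is_realistic in votes:
        per_image[(method, image_id)].append(is_realistic)

    accepted = defaultdict(int)
    total = defaultdict(int)
    for (method, _image_id), oks in per_image.items():
        total[method] += 1
        if sum(oks) * 2 > len(oks):  # strict majority of jurors
            accepted[method] += 1

    return {method: accepted[method] / total[method] for method in total}


# Toy, made-up votes from three hypothetical VLM jurors.
votes = [
    ("generative", "img1", "vlm_a", True),
    ("generative", "img1", "vlm_b", True),
    ("generative", "img1", "vlm_c", False),
    ("generative", "img2", "vlm_a", True),
    ("generative", "img2", "vlm_b", True),
    ("generative", "img2", "vlm_c", True),
    ("rule_based", "img1", "vlm_a", False),
    ("rule_based", "img1", "vlm_b", True),
    ("rule_based", "img1", "vlm_c", False),
    ("rule_based", "img2", "vlm_a", True),
    ("rule_based", "img2", "vlm_b", True),
    ("rule_based", "img2", "vlm_c", False),
]

rates = acceptance_rates(votes)
ratio = rates["generative"] / rates["rule_based"]
print(rates, ratio)
```

On this toy input the generative method is accepted on both images and the rule-based method on one of two, so the ratio of acceptance rates is 2.0. A statement such as "3.6 times higher acceptance" would be a ratio of this kind, computed on the paper's own images and jury.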
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Multimodal Machine Learning Applications
