Synthetic Designed Experiments for Diagnosing Vision Model Failure

Krisanu Sarkar

arXiv:2605.00832 · cs.CV · May 5, 2026

TL;DR

This paper introduces SDRS, a method using Design of Experiments principles to diagnose and address specific failure modes in vision models through targeted synthetic data generation.

Contribution

It proposes a novel framework that treats synthetic data generation as an experimental process, enabling precise identification and correction of model failure modes.

Findings

01. SDRS accurately identifies failure types in controlled and real-world scenarios.

02. Targeted synthetic data improves model accuracy and segmentation performance.

03. ANOVA-based audit detects cross-factor contamination in generators.

Abstract

Current synthetic data pipelines for computer vision generate images without diagnosing what the downstream model actually needs. This open-loop paradigm treats synthetic data as cheap real data, randomly sampling the generator's output space and hoping to cover the model's failure modes. We argue this fundamentally misuses synthetic data's unique property: the controllable, independent variation of scene factors. Drawing on the statistical theory of Design of Experiments (DoE), we propose Synthetic Designed Experiments for Representational Sufficiency (SDRS). SDRS treats the downstream model as a black-box system and the synthetic generator as an experimental apparatus. Using fractional factorial designs, SDRS efficiently audits a model's factor-sensitivity profile via ANOVA decomposition. It classifies failures into two actionable types: Type I gaps (coverage failures on…
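
The audit step described in the abstract (a fractional factorial design followed by an ANOVA decomposition) can be illustrated with a small sketch. This is not the paper's implementation: the four factor names, the two-level coding, the half-fraction generator, and the toy accuracy function standing in for the black-box model are all assumptions made for illustration. The sketch builds a 2^(4-1) design, records one response per run, and reports the share of total variance attributable to each factor's main effect.

```python
import itertools
import numpy as np

# Hypothetical scene factors, each coded at two levels (-1 / +1).
FACTORS = ["lighting", "pose", "background", "occlusion"]

def half_fraction_design(k):
    """2^(k-1) design: full factorial on the first k-1 factors,
    with the last factor set to the product of the others."""
    base = np.array(list(itertools.product([-1, 1], repeat=k - 1)))
    last = np.prod(base, axis=1, keepdims=True)
    return np.hstack([base, last])

def main_effect_anova(design, response):
    """Main effects and their share of the total sum of squares
    for a balanced two-level design."""
    y = np.asarray(response, dtype=float)
    n = len(y)
    ss_total = np.sum((y - y.mean()) ** 2)
    # effect = mean(y | +1) - mean(y | -1); SS = n * effect^2 / 4
    effects = design.T @ y / (n / 2)
    ss = n * effects ** 2 / 4
    return effects, ss / ss_total

def toy_model_accuracy(run):
    """Stand-in for evaluating the black-box model on images rendered
    at this factor setting (assumed, noise-free toy response)."""
    lighting, pose, _background, _occlusion = run
    return 0.80 - 0.02 * lighting - 0.10 * pose

design = half_fraction_design(len(FACTORS))           # 8 runs instead of 16
accuracy = np.array([toy_model_accuracy(r) for r in design])
effects, shares = main_effect_anova(design, accuracy)

for name, eff, share in zip(FACTORS, effects, shares):
    print(f"{name:>10}: effect={eff:+.3f}  variance share={share:.3f}")
```

In this toy setup the pose factor accounts for nearly all of the variance, which is the kind of factor-sensitivity profile the audit is meant to surface. With this particular half fraction, each main effect is aliased with a three-factor interaction, so the shares are only interpretable if such interactions are assumed to be negligible.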
