DAViD: Data-efficient and Accurate Vision Models from Synthetic Data

Fatemeh Saleh; Sadegh Aliakbarian; Charlie Hewitt; Lohit Petikam; Xiao-Xian; Antonio Criminisi; Thomas J. Cashman; Tadas Baltrušaitis

arXiv:2507.15365 · cs.CV · July 22, 2025

TL;DR

This paper shows that human-centric vision models can be trained on much smaller, high-fidelity synthetic datasets with no loss in accuracy across multiple dense prediction tasks, while lowering training and inference costs and giving explicit control over data diversity to address unfairness.

Contribution

The paper shows how to train accurate human-centric vision models on smaller, high-fidelity synthetic datasets, improving efficiency and fairness relative to approaches that depend on billion-parameter models and extremely large datasets.

Findings

01 Models trained on synthetic data match the accuracy of models trained on real data

02 Training and inference costs are significantly reduced

03 Procedural synthesis gives greater control over data diversity and fairness

Abstract

The state of the art in human-centric computer vision achieves high accuracy and robustness across a diverse range of tasks. The most effective models in this domain have billions of parameters, and therefore require extremely large datasets, expensive training regimes, and compute-intensive inference. In this paper, we demonstrate that it is possible to train models on much smaller but high-fidelity synthetic datasets, with no loss in accuracy and with higher efficiency. Using synthetic training data provides us with excellent levels of detail and perfect labels, while providing strong guarantees for data provenance, usage rights, and user consent. Procedural data synthesis also provides us with explicit control over data diversity, which we can use to address unfairness in the models we train. Extensive quantitative assessment on real input images demonstrates accuracy of our models on three dense…

Code & Models

Datasets

Voxel51/SynthHuman
dataset · 2.5k downloads

Videos

DAViD: Data-efficient and Accurate Vision Models from Synthetic Data · YouTube