AutoSimulate: (Quickly) Learning Synthetic Data Generation

Harkirat Singh Behl; Atılım Güneş Baydin; Ran Gal; Philip H.S. Torr; Vibhav Vineet

arXiv:2008.08424·cs.CV·August 20, 2020

TL;DR

AutoSimulate optimizes simulator parameters through a differentiable approximation of the objective, finding the optimal synthetic data distribution up to 50 times faster than previous methods, with up to 30 times less generated data and 8.7% better accuracy on real datasets.

Contribution

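The paper presents a novel differentiable approximation of the objective for optimizing simulator parameters. It works even when the simulator itself is non-differentiable, and it enables faster and more accurate synthetic data generation than existing methods that treat the pipeline as a black box.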

Findings

01. Finds the optimal data distribution up to 50 times faster

02. Reduces data generation needs by up to 30 times

03. Achieves 8.7% better accuracy on real datasets

Abstract

Simulation is increasingly being used for generating large labelled datasets in many machine learning problems. Recent methods have focused on adjusting simulator parameters with the goal of maximising accuracy on a validation task, usually relying on REINFORCE-like gradient estimators. However, these approaches are very expensive, as they treat the entire data generation, model training, and validation pipeline as a black box and require multiple costly objective evaluations at each iteration. We propose an efficient alternative for optimal synthetic data generation, based on a novel differentiable approximation of the objective. This allows us to optimize the simulator, which may be non-differentiable, requiring only one objective evaluation at each iteration with little overhead. We demonstrate on a state-of-the-art photorealistic renderer that the proposed method finds the optimal…
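To make the cost argument concrete, here is a minimal sketch of the kind of REINFORCE-like (score-function) estimator the abstract describes as the expensive baseline. It is not the paper's method. Everything in it is an assumption for illustration: the one-dimensional simulator parameter, the Gaussian sampling distribution, and the hypothetical `validation_loss` function, which stands in for the full "generate data, train model, validate" pipeline.

```python
import numpy as np


def validation_loss(theta):
    # Hypothetical stand-in for the expensive black-box pipeline
    # (generate data with parameter theta, train a model, validate it).
    # Here it is a toy quadratic with its minimum at theta = 3.
    return (theta - 3.0) ** 2


def reinforce_optimize(mu0=0.0, sigma=0.5, lr=0.05, iters=200,
                       samples_per_iter=8, seed=0):
    """Minimise E[validation_loss(theta)] for theta ~ N(mu, sigma^2)
    using a score-function (REINFORCE-style) gradient estimate."""
    rng = np.random.default_rng(seed)
    mu = mu0
    n_evals = 0  # counts calls to the black-box objective
    for _ in range(iters):
        # Sample several simulator parameters around the current mean.
        thetas = rng.normal(mu, sigma, size=samples_per_iter)
        # Each sample costs one full objective evaluation.
        losses = np.array([validation_loss(t) for t in thetas])
        n_evals += samples_per_iter
        # Mean-loss baseline to reduce the variance of the estimate.
        baseline = losses.mean()
        # d/dmu log N(theta; mu, sigma^2) = (theta - mu) / sigma^2
        grad = np.mean((losses - baseline) * (thetas - mu) / sigma ** 2)
        mu -= lr * grad
    return mu, n_evals


mu, n_evals = reinforce_optimize()
print(f"estimated optimum: {mu:.2f}, objective evaluations: {n_evals}")
```

In this toy setting the estimator needs `samples_per_iter` objective evaluations per update, each of which would be a full train-and-validate run in a real pipeline. The abstract's claim is that the proposed approach needs only one such evaluation per iteration.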
