# Meta-Sim: Learning to Generate Synthetic Datasets

**Authors:** Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci, Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba, Sanja Fidler

arXiv: 1904.11621 · 2019-04-29

## TL;DR

Meta-Sim introduces a neural network-based approach to automatically generate synthetic datasets tailored for specific tasks, reducing reliance on expensive real data and improving downstream performance.

## Contribution

The paper presents a method that learns to modify the attributes of scene graphs sampled from a probabilistic scene grammar, so that the rendered synthetic dataset better matches the target data and task.

## Key findings

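- Greatly improves content generation quality over a human-engineered probabilistic scene grammar.
- Improves downstream task performance when training on the generated synthetic data, relative to data from the original grammar.
- Supports these gains with both qualitative comparisons and quantitative evaluation measured by downstream task performance.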

## Abstract

Training models to high-end performance requires availability of large labeled datasets, which are expensive to get. The goal of our work is to automatically synthesize labeled datasets that are relevant for a downstream task. We propose Meta-Sim, which learns a generative model of synthetic scenes, and obtains images as well as their corresponding ground truth via a graphics engine. We parametrize our dataset generator with a neural network, which learns to modify attributes of scene graphs obtained from probabilistic scene grammars, so as to minimize the distribution gap between its rendered outputs and target data. If the real dataset comes with a small labeled validation set, we additionally aim to optimize a meta-objective, i.e. downstream task performance. Experiments show that the proposed method can greatly improve content generation quality over a human-engineered probabilistic scene grammar, both qualitatively and quantitatively as measured by performance on a downstream task.
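
The idea of adjusting scene attributes to shrink a distribution gap can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the gap is measured with a kernel maximum mean discrepancy (MMD), one common choice that the abstract itself does not name; each "scene" is reduced to a single scalar attribute; and a grid search over one shift value stands in for the neural network and the graphics engine. The names `sample_prior`, `modify`, and `mmd_squared` are hypothetical and not taken from the paper.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two equal-length attribute vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_prior(n, rng):
    """Hypothetical hand-written prior: one scalar attribute per scene."""
    return [[rng.gauss(0.0, 1.0)] for _ in range(n)]


def modify(scenes, shift):
    """Stand-in for the learned attribute modifier: add a constant shift."""
    return [[value + shift for value in scene] for scene in scenes]


rng = random.Random(0)
# Toy "target data": the same attribute, but centered at 2.0 instead of 0.0.
target = [[rng.gauss(2.0, 1.0)] for _ in range(200)]
prior = sample_prior(200, rng)

# Grid search over candidate shifts from -2.0 to 4.0 in steps of 0.5,
# keeping the shift whose modified samples are closest to the target.
candidates = [i * 0.5 for i in range(-4, 9)]
best_shift = min(candidates, key=lambda s: mmd_squared(modify(prior, s), target))

gap_before = mmd_squared(prior, target)
gap_after = mmd_squared(modify(prior, best_shift), target)
print(f"best shift: {best_shift}, gap before: {gap_before:.4f}, after: {gap_after:.4f}")
```

In this toy setting the selected shift should land near the offset between the prior and the target, and the gap after modification should be smaller than before. The sketch covers only the distribution-matching part of the abstract; the meta-objective on a labeled validation set is not modeled.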

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.11621/full.md

## Figures

20 figures with captions in the complete paper: https://tomesphere.com/paper/1904.11621/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1904.11621/full.md

---
Source: https://tomesphere.com/paper/1904.11621