Compositional Scene Representation Learning via Reconstruction: A Survey

Jinyang Yuan; Tonglin Chen; Bin Li; Xiangyang Xue

arXiv:2202.07135·cs.LG·June 16, 2023

Open Access · 1 dataset

TL;DR

This survey reviews recent advances in learning compositional scene representations through reconstruction using deep neural networks, highlighting progress, benchmarks, limitations, and future directions in the field.

Contribution

It provides a comprehensive overview of reconstruction-based compositional scene representation learning methods, including development history, categorizations, benchmarks, and open source tools.

Findings

01. Progress in deep learning methods for scene representation
02. Benchmark datasets and open source toolbox provided
03. Discussion on limitations and future research directions

Abstract

Visual scenes are composed of visual concepts and have the property of combinatorial explosion. An important reason for humans to efficiently learn from diverse visual scenes is the ability of compositional perception, and it is desirable for artificial intelligence to have similar abilities. Compositional scene representation learning is a task that enables such abilities. In recent years, various methods have been proposed to apply deep neural networks, which have been proven to be advantageous in representation learning, to learn compositional scene representations via reconstruction, advancing this research direction into the deep learning era. Learning via reconstruction is advantageous because it may utilize massive unlabeled data and avoid costly and laborious data annotation. In this survey, we first outline the current progress on reconstruction-based compositional scene…

Code & Models

Datasets

Yinxuan/OCTScenes
dataset · 348 downloads

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning