Disentangling Visual Priors: Unsupervised Learning of Scene Interpretations with Compositional Autoencoder

Krzysztof Krawiec; Antoni Nowinowski

arXiv:2409.09716·cs.CV·September 17, 2024

TL;DR

This paper introduces a neurosymbolic autoencoder that learns to interpret scenes by disentangling visual concepts like objects and transforms using a domain-specific language, enabling better generalization and noise robustness.

Contribution

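It presents a neurosymbolic architecture that combines symbolic priors of image formation, expressed as template programs in a domain-specific language, with features extracted by a convolutional neural network, so that the model interprets a scene in terms of object shape, appearance, categorization, and geometric transforms.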

Findings

01 Successfully disentangles visual scene aspects

02 Learns effectively from small datasets

03 Generalizes well to out-of-sample data

Abstract

Contemporary deep learning architectures lack principled means for capturing and handling fundamental visual concepts, like objects, shapes, geometric transforms, and other higher-level structures. We propose a neurosymbolic architecture that uses a domain-specific language to capture selected priors of image formation, including object shape, appearance, categorization, and geometric transforms. We express template programs in that language and learn their parameterization with features extracted from the scene by a convolutional neural network. When executed, the parameterized program produces geometric primitives which are rendered and assessed for correspondence with the scene content and trained via auto-association with gradient. We confront our approach with a baseline method on a synthetic benchmark and demonstrate its capacity to disentangle selected aspects of the image…
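The render-and-compare loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the "template program" is a hypothetical `render_disk` function that draws one soft-edged disk, the parameters are optimized directly for a single synthetic scene rather than predicted by a convolutional network, and the gradient is estimated by finite differences with a backtracking step size instead of backpropagation.

```python
import math

SIZE = 16  # canvas side in pixels (illustrative choice)


def render_disk(cx, cy, r, sharpness=2.0):
    """Hypothetical template program: render one soft-edged disk."""
    img = []
    for y in range(SIZE):
        row = []
        for x in range(SIZE):
            # Signed distance to the disk boundary, scaled by edge sharpness.
            z = sharpness * (math.hypot(x - cx, y - cy) - r)
            # Equals 1 / (1 + exp(z)); the tanh form cannot overflow.
            row.append(0.5 * (1.0 - math.tanh(0.5 * z)))
        img.append(row)
    return img


def reconstruction_loss(params, scene):
    """Mean squared error between the rendered primitives and the scene."""
    rendered = render_disk(*params)
    total = 0.0
    for row_r, row_s in zip(rendered, scene):
        for a, b in zip(row_r, row_s):
            total += (a - b) ** 2
    return total / (SIZE * SIZE)


def numeric_grad(params, scene, eps=1e-4):
    """Central-difference gradient (a stand-in for backpropagation)."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        diff = reconstruction_loss(up, scene) - reconstruction_loss(down, scene)
        grad.append(diff / (2 * eps))
    return grad


def fit(scene, params, lr=10.0, steps=60):
    """Gradient descent that accepts a step only if the loss decreases."""
    params = list(params)
    best = reconstruction_loss(params, scene)
    for _ in range(steps):
        grad = numeric_grad(params, scene)
        trial = [p - lr * g for p, g in zip(params, grad)]
        trial_loss = reconstruction_loss(trial, scene)
        if trial_loss < best:
            params, best = trial, trial_loss
        else:
            lr *= 0.5  # step was too large; shrink it and retry
    return params, best


# Synthetic "scene": a disk whose parameters the loop should recover.
scene = render_disk(9.0, 7.0, 4.0)
initial = [8.0, 8.0, 3.0]
initial_loss = reconstruction_loss(initial, scene)
fitted, final_loss = fit(scene, initial)
print(initial_loss, final_loss, fitted)
```

The soft edge is what makes the rendered image vary smoothly with position and radius, so a gradient of the reconstruction error with respect to the program parameters exists. In the setting the abstract describes, those parameters would come from network features rather than from direct optimization.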
