Compositional Law Parsing with Latent Random Functions

Fan Shi; Bin Li; Xiangyang Xue

arXiv:2209.09115·cs.CV·February 28, 2023



TL;DR

This paper introduces CLAP, a deep latent variable model that learns and manipulates the underlying laws of concepts in visual scenes, demonstrating human-like compositional understanding and interpretability.

Contribution

It presents a novel encoding-decoding architecture for compositional law parsing, in which concept-specific latent random functions are instantiated with Neural Processes.

Findings

01 Outperforms baseline methods in physics and reasoning tasks

02 Enables law manipulation and composition for interpretability

03 Learns laws of position and appearance from visual scenes

Abstract

Human cognition is compositional. We understand a scene by decomposing it into different concepts (e.g., the shape and position of an object) and learning the respective laws of these concepts, which may be either natural (e.g., laws of motion) or man-made (e.g., laws of a game). The ability to parse these laws automatically indicates that a model understands the scene, which gives law parsing a central role in many visual tasks. This paper proposes a deep latent variable model for Compositional LAw Parsing (CLAP), which achieves human-like compositionality through an encoding-decoding architecture that represents the concepts in a scene as latent variables. CLAP employs concept-specific latent random functions, instantiated with Neural Processes, to capture the law of each concept. Our experimental results demonstrate that CLAP outperforms the baseline methods in multiple…
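The idea of one latent random function per concept can be illustrated with a minimal, untrained NumPy sketch in the style of a Neural Process. This is not the authors' implementation: the class `ConceptLatentFunction`, the scalar input `x` (treated here as a frame or panel index), the latent sizes, and the random weights are all assumptions made for illustration. Each concept gets its own function; a "law" is a global latent `g` inferred from context pairs by a permutation-invariant mean, and laws from different scenes can be recombined per concept.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random, untrained weights for a small MLP (illustration only)."""
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def run_mlp(params, x):
    """Apply the MLP with tanh on every layer except the last."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

class ConceptLatentFunction:
    """Hypothetical NP-style random function for ONE concept."""

    def __init__(self, x_dim, z_dim, h_dim=16, g_dim=4):
        self.g_dim = g_dim
        self.encoder = init_mlp([x_dim + z_dim, h_dim, 2 * g_dim])
        self.decoder = init_mlp([x_dim + g_dim, h_dim, z_dim])

    def infer_law(self, x_ctx, z_ctx):
        """Mean-aggregate encoded (x, z) context pairs into Gaussian statistics."""
        pairs = np.concatenate([x_ctx, z_ctx], axis=-1)
        stats = run_mlp(self.encoder, pairs).mean(axis=0)
        return stats[:self.g_dim], stats[self.g_dim:]  # mu, log_sigma

    def sample_law(self, x_ctx, z_ctx):
        """Draw one law latent g with the reparameterisation trick."""
        mu, log_sigma = self.infer_law(x_ctx, z_ctx)
        return mu + np.exp(log_sigma) * rng.normal(size=self.g_dim)

    def predict(self, g, x_tgt):
        """Decode concept latents at target inputs under the law g."""
        g_rows = np.broadcast_to(g, (x_tgt.shape[0], self.g_dim))
        return run_mlp(self.decoder, np.concatenate([x_tgt, g_rows], axis=-1))

# Two concepts with assumed latent sizes; inputs are assumed scalar indices.
concept_dims = {"position": 2, "appearance": 3}
functions = {name: ConceptLatentFunction(1, d) for name, d in concept_dims.items()}

x_ctx = np.array([[0.0], [1.0], [2.0]])   # context indices
x_tgt = np.array([[3.0], [4.0]])          # target indices

# Stand-ins for the concept latents an image encoder would produce.
scene_a = {n: rng.normal(size=(3, d)) for n, d in concept_dims.items()}
scene_b = {n: rng.normal(size=(3, d)) for n, d in concept_dims.items()}

laws_a = {n: functions[n].sample_law(x_ctx, scene_a[n]) for n in functions}
laws_b = {n: functions[n].sample_law(x_ctx, scene_b[n]) for n in functions}

# Law composition: position law from scene A, appearance law from scene B.
mixed = {"position": laws_a["position"], "appearance": laws_b["appearance"]}
pred = {n: functions[n].predict(mixed[n], x_tgt) for n in functions}
```

In a trained model the predicted concept latents would be passed to an image decoder; here `pred` only shows that each concept is predicted independently under its own law, which is what makes swapping a single concept's law possible.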


Code & Models

Repositories

fudanvi/generative-abstract-reasoning (PyTorch, official)

Videos

Compositional Law Parsing with Latent Random Functions · SlidesLive

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications