Disentangling Patterns and Transformations from One Sequence of Images with Shape-invariant Lie Group Transformer

T. Takada; W. Shimaya; Y. Ohmura; Y. Kuniyoshi

arXiv:2203.11210 · cs.CV · March 23, 2022

Open Access

TL;DR

This paper introduces an algebraic approach that uses a shape-invariant Lie group transformer to disentangle patterns (objects) and their transformations from a single image sequence, without requiring a large training dataset.

Contribution

The paper proposes a model that uses a shape-invariant Lie group transformer to separate patterns from transformations, and that needs only one sequence of images to decompose a scene.

Findings

01

Successfully discovers distinct objects and transformations from one image sequence.

02

Disentangles patterns from transformations that are invariant to the shape of those patterns.

03

Demonstrates potential for improved scene understanding with minimal data.

Abstract

An effective way to model the complex real world is to view it as a composition of basic components: objects and transformations. Although humans come to understand the compositionality of the real world through development, it is extremely difficult to equip robots with such a learning mechanism. In recent years, there has been significant research on autonomously learning representations of the world using deep learning; however, most studies have taken a statistical approach, which requires a large amount of training data. In contrast to such existing methods, we take a novel algebraic approach to representation learning, based on a simpler and more intuitive formulation: the observed world is a combination of multiple independent patterns and transformations that are invariant to the shape of the patterns. Since the shape of patterns can be viewed as the invariant features…
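
The formulation in the abstract, where a scene is a combination of independent patterns, each moved by a transformation that does not depend on the pattern's shape, can be illustrated with a toy example. The sketch below is not the authors' model: it uses cyclic shifts of 1D lists as a discrete stand-in for a continuous translation group, and the names `translate` and `compose_scene` are hypothetical, introduced here only for illustration.

```python
# Illustrative sketch only; NOT the method from the paper.
# Cyclic shifts on a 1D grid stand in for a continuous (Lie group) translation.
# `translate` and `compose_scene` are hypothetical helper names.

def translate(pattern, shift):
    """Cyclically shift a 1D pattern by `shift` positions."""
    n = len(pattern)
    return [pattern[(i - shift) % n] for i in range(n)]

def compose_scene(patterns, shifts):
    """Sum independently translated patterns into one observed frame."""
    frame = [0] * len(patterns[0])
    for pattern, shift in zip(patterns, shifts):
        moved = translate(pattern, shift)
        frame = [f + m for f, m in zip(frame, moved)]
    return frame

# Two patterns with different shapes.
bar = [1, 1, 0, 0, 0, 0, 0, 0]
dot = [0, 0, 0, 0, 0, 2, 0, 0]

# Shape invariance: the same composition law holds whatever the pattern is.
for p in (bar, dot):
    assert translate(translate(p, 2), 3) == translate(p, 5)

# One short "image sequence": the bar moves right while the dot moves left.
sequence = [compose_scene([bar, dot], [t, -t]) for t in range(3)]
```

In this toy setting, disentangling would mean recovering `bar`, `dot`, and the two shift trajectories from `sequence` alone; the sketch only shows the forward (generative) direction under the stated assumptions.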

Taxonomy

Topics: Image Processing and 3D Reconstruction · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques