Learning Invariant World State Representations with Predictive Coding

Avi Ziskind, Sujeong Kim, and Giedrius T. Burachas

arXiv:2207.02972·cs.LG·July 8, 2022

Open Access

TL;DR

This paper introduces PreludeNet, a predictive coding-based architecture that learns depth from video using a hybrid self-supervised and fully supervised method. The model is robust to lighting variations, and the paper proposes a framework for evaluating visual representations for illumination invariance.

Contribution

It presents a novel hybrid learning architecture, PreludeNet, combining self-supervised and supervised training for depth inference and invariance to illumination changes.

Findings

01 PreludeNet achieves competitive depth inference accuracy.

02 The model demonstrates robustness to lighting variations.

03 The framework allows evaluation of visual invariance in representations.

Abstract

Self-supervised learning methods overcome the key bottleneck for building more capable AI: limited availability of labeled data. However, one of the drawbacks of self-supervised architectures is that the representations that they learn are implicit and it is hard to extract meaningful information about the encoded world states, such as 3D structure of the visual scene encoded in a depth map. Moreover, in the visual domain such representations only rarely undergo evaluations that may be critical for downstream tasks, such as vision for autonomous cars. Herein, we propose a framework for evaluating visual representations for illumination invariance in the context of depth perception. We develop a new predictive coding-based architecture and a hybrid fully-supervised/self-supervised learning method. We propose a novel architecture that extends the predictive coding approach: PRedictive…

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Image Enhancement Techniques