Integrating Motion into Vision Models for Better Visual Prediction

Michael Hazoglou; Todd Hylton

arXiv:1912.01661 · cs.CV · December 5, 2019 · 1 citation


TL;DR

This paper presents an enhanced vision system that integrates camera motion into a self-supervised predictive learning framework, leading to improved visual prediction and saccadic control.

Contribution

It introduces a method to incorporate camera motion into a hierarchical predictive vision model, improving prediction accuracy and control behavior.

Findings

01 Enhanced visual prediction accuracy

02 Improved saccadic behavior

03 Successful integration of camera motion into the model

Abstract

We demonstrate an improved vision system that learns a model of its environment using a self-supervised, predictive learning method. The system includes a pan-tilt camera, a foveated visual input, a saccading reflex to servo the foveated region to areas of high prediction error, input frame transformation synced to the camera motion, and a recursive, hierarchical machine learning technique based on the Predictive Vision Model. In earlier work, which did not integrate camera motion into the vision model, prediction was impaired and camera movement suffered from undesired feedback effects. Here we detail the integration of camera motion into the predictive learning system and show improved visual prediction and saccadic behavior. From these experiences, we speculate on the integration of additional sensory and motor systems into self-supervised, predictive learning models.
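As a rough illustration of two ideas named in the abstract, the sketch below picks a saccade target where prediction error is highest and shifts a frame to compensate for camera motion. It is not the authors' implementation. The function names (`saccade_target`, `shift_frame`), the square fovea window, and the use of a pure integer-pixel translation as the "frame transformation" are all assumptions made for the example.

```python
def saccade_target(error_map, fovea):
    """Return (row, col) of the top-left corner of the fovea-sized window
    with the largest summed prediction error (hypothetical helper)."""
    rows, cols = len(error_map), len(error_map[0])
    best_sum, best_pos = float("-inf"), (0, 0)
    for r in range(rows - fovea + 1):
        for c in range(cols - fovea + 1):
            total = sum(error_map[r + i][c + j]
                        for i in range(fovea) for j in range(fovea))
            if total > best_sum:
                best_sum, best_pos = total, (r, c)
    return best_pos


def shift_frame(frame, d_row, d_col, fill=0.0):
    """Translate a frame by (d_row, d_col) pixels. Assumption: camera motion
    is approximated by a whole-pixel translation; uncovered pixels get `fill`."""
    rows, cols = len(frame), len(frame[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r, src_c = r - d_row, c - d_col
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = frame[src_r][src_c]
    return out


def prediction_error(predicted, observed):
    """Per-pixel absolute difference between predicted and observed frames."""
    return [[abs(p - o) for p, o in zip(p_row, o_row)]
            for p_row, o_row in zip(predicted, observed)]
```

In this toy setup, shifting the previous prediction by the commanded camera displacement before comparing it with the new frame keeps the error map from being dominated by the camera's own movement, which is one plausible reading of the feedback problem the abstract mentions.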


Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis