Sequential Coordination of Deep Models for Learning Visual Arithmetic

Eric Crawford; Guillaume Rabusseau; Joelle Pineau

arXiv:1809.04988 · cs.LG · September 14, 2018

Open Access

TL;DR

This paper introduces a two-tiered architecture for visual arithmetic: a lower tier of perception and symbolic-processing modules, and a higher-tier controller, trained with reinforcement learning, that coordinates them.

Contribution

The paper presents a hierarchical model in which a reinforcement learning controller coordinates pre-trained deep perception modules and symbolic transformation modules, so that perception and reasoning operate within a single system.

Findings

1. Successfully solves visual arithmetic tasks
2. Improves sample efficiency over standard networks
3. Demonstrates effective coordination of modules via reinforcement learning

Abstract

Achieving machine intelligence requires a smooth integration of perception and reasoning, yet models developed to date tend to specialize in one or the other; sophisticated manipulation of symbols acquired from rich perceptual spaces has so far proved elusive. Consider a visual arithmetic task, where the goal is to carry out simple arithmetical algorithms on digits presented under natural conditions (e.g. hand-written, placed randomly). We propose a two-tiered architecture for tackling this problem. The lower tier consists of a heterogeneous collection of information processing modules, which can include pre-trained deep neural networks for locating and extracting characters from the image, as well as modules performing symbolic transformations on the representations extracted by perception. The higher tier consists of a controller, trained using reinforcement learning, which…
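The two-tier idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `State` fields, the module names (`classify_digit`, `add_register`, `move_right`), the list-of-integers stand-in for an image, and the hand-scripted controller are all hypothetical, chosen only to show how a controller might sequence perception-like and symbolic modules. In the paper the controller is trained with reinforcement learning; here a fixed policy takes its place, and the `reward` function only indicates the kind of signal such training could use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class State:
    """Hypothetical shared workspace that every module reads and writes."""
    glyphs: List[int]      # stand-in for an image: the digits present in the scene
    cursor: int = 0        # index of the glyph currently attended to
    register: int = 0      # last symbol produced by the perception stand-in
    accumulator: int = 0   # running result of symbolic operations


# --- Lower tier: a heterogeneous set of modules (all hypothetical) ---

def classify_digit(s: State) -> None:
    # Stand-in for a pre-trained digit classifier applied at the cursor.
    s.register = s.glyphs[s.cursor]


def add_register(s: State) -> None:
    # Symbolic transformation on the extracted representation.
    s.accumulator += s.register


def move_right(s: State) -> None:
    # Stand-in for shifting attention to the next character location.
    s.cursor = min(s.cursor + 1, len(s.glyphs) - 1)


MODULES: Dict[str, Callable[[State], None]] = {
    "classify_digit": classify_digit,
    "add_register": add_register,
    "move_right": move_right,
}


# --- Higher tier: a controller that picks one module per time step ---

def scripted_controller(step: int) -> str:
    # Fixed policy used in place of a learned one (an assumption of this sketch).
    cycle = ["classify_digit", "add_register", "move_right"]
    return cycle[step % len(cycle)]


def run_episode(glyphs: List[int],
                controller: Callable[[int], str],
                n_steps: int) -> int:
    """Apply the controller's chosen modules in sequence and return the answer."""
    s = State(glyphs=list(glyphs))
    for t in range(n_steps):
        MODULES[controller(t)](s)
    return s.accumulator


def reward(glyphs: List[int], answer: int) -> float:
    # Sparse end-of-episode signal for a "sum the digits" task.
    return 1.0 if answer == sum(glyphs) else 0.0


if __name__ == "__main__":
    scene = [3, 5, 4]
    answer = run_episode(scene, scripted_controller, n_steps=3 * len(scene))
    print(answer, reward(scene, answer))
```

Because every module acts on the same workspace, swapping the scripted policy for a learned one would leave the lower tier unchanged, which is the separation of concerns the two-tier description suggests.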

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Vision and Imaging · Advanced Neural Network Applications