Neural Programmer: Inducing Latent Programs with Gradient Descent

Arvind Neelakantan; Quoc V. Le; Ilya Sutskever

arXiv:1511.04834 · cs.LG · August 5, 2016 · ICLR · 73 cites

TL;DR

Neural Programmer is an end-to-end differentiable neural network that learns to compose basic arithmetic and logic operations into multi-step programs, which helps on reasoning tasks such as table comprehension.

Contribution

The paper presents Neural Programmer, a novel end-to-end differentiable model that learns to induce compositional programs using weak supervision and gradient descent.

Findings

01. Neural Programmer achieves near-perfect accuracy on a synthetic table-comprehension dataset.

02. Adding random noise to the gradient greatly improves training, which is otherwise difficult.

03. Standard recurrent and attentional models perform poorly compared to Neural Programmer on the same dataset.
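
The gradient-noise finding above can be illustrated with a minimal sketch. It assumes plain SGD with zero-mean Gaussian noise added to each gradient and a variance that decays over training steps. The function name `noisy_sgd_step`, the decay schedule, and the values of `eta` and `gamma` are illustrative assumptions, not details given on this page.

```python
import random


def noisy_sgd_step(params, grads, lr, step, eta=0.3, gamma=0.55, rng=random):
    """One SGD update with zero-mean Gaussian noise added to each gradient.

    Assumption: the noise variance is eta / (1 + step) ** gamma, so the
    noise shrinks as training proceeds. eta and gamma are illustrative.
    """
    sigma = (eta / (1.0 + step) ** gamma) ** 0.5
    return [p - lr * (g + rng.gauss(0.0, sigma)) for p, g in zip(params, grads)]


# Toy usage: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
rng = random.Random(0)
w = [0.0]
for t in range(500):
    grad = [2.0 * (w[0] - 3.0)]
    w = noisy_sgd_step(w, grad, lr=0.1, step=t, rng=rng)
```

On this toy quadratic the noise is not needed for convergence. The sketch only shows where the noise enters the update, not why it helps on the harder optimization problem the paper describes.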

Abstract

Deep neural networks have achieved impressive supervised classification performance in many tasks including image recognition, speech recognition, and sequence to sequence learning. However, this success has not been translated to applications like question answering that may involve complex arithmetic and logic reasoning. A major limitation of these models is in their inability to learn even simple arithmetic and logic operations. For example, it has been shown that neural networks fail to learn to add two binary numbers reliably. In this work, we propose Neural Programmer, an end-to-end differentiable neural network augmented with a small set of basic arithmetic and logic operations. Neural Programmer can call these augmented operations over several steps, thereby inducing compositional programs that are more complex than the built-in operations. The model learns from a weak…
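
The idea of calling built-in operations over several steps while staying differentiable can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's architecture: the operation set (`op_greater`, `op_sum`, `op_count`), the helper names, and the hand-written selection scores are all hypothetical. In the model, such scores would be produced by the network rather than supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical built-in operations over one numeric table column.
def op_greater(column, pivot):
    return [1.0 if v > pivot else 0.0 for v in column]


def op_sum(column, mask):
    return sum(v * m for v, m in zip(column, mask))


def op_count(column, mask):
    return sum(mask)


def soft_select_rows(column, pivot, op_scores):
    """Step 1: blend 'greater than pivot' with 'keep all rows' into a soft row mask."""
    weights = softmax(op_scores)
    candidates = [op_greater(column, pivot), [1.0] * len(column)]
    return [sum(w * c[i] for w, c in zip(weights, candidates))
            for i in range(len(column))]


def soft_aggregate(column, mask, op_scores):
    """Step 2: blend 'sum' and 'count' over the softly selected rows."""
    weights = softmax(op_scores)
    outputs = [op_sum(column, mask), op_count(column, mask)]
    return sum(w * o for w, o in zip(weights, outputs))


# Toy usage: "how many entries are greater than 20?"
column = [10.0, 25.0, 40.0]
mask = soft_select_rows(column, 20.0, [20.0, -20.0])   # strongly prefers 'greater'
answer = soft_aggregate(column, mask, [-20.0, 20.0])   # strongly prefers 'count'
```

Because every step is a weighted blend rather than a hard choice, the final answer is a smooth function of the selection scores, which is what allows training by gradient descent. With sharply peaked scores, as in the usage above, the blend behaves like the discrete two-step program "select rows greater than 20, then count".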

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Machine Learning and Algorithms