Neural Programmer: Inducing Latent Programs with Gradient Descent
Arvind Neelakantan, Quoc V. Le, Ilya Sutskever

TL;DR
Neural Programmer is an end-to-end differentiable neural network augmented with basic arithmetic and logic operations. It learns to compose these operations into programs, which improves reasoning on tasks such as table comprehension.
Contribution
The paper presents Neural Programmer, a novel end-to-end differentiable model that learns to induce compositional programs using weak supervision and gradient descent.
Findings
Neural Programmer typically reaches near-perfect accuracy on a fairly complex synthetic table-comprehension dataset.
Training the model is difficult, but adding random noise to the gradient greatly improves it.
Traditional recurrent networks and attentional models perform poorly on the same dataset.
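The gradient-noise finding can be illustrated with a minimal sketch. It assumes zero-mean Gaussian noise whose variance is annealed as eta / (1 + t)^gamma; this summary does not state the schedule, so the schedule, the default values of `eta` and `gamma`, and the function names `noise_std` and `noisy_gradient` are all assumptions made for illustration.

```python
import random


def noise_std(step, eta=0.3, gamma=0.55):
    """Standard deviation of the noise at a training step.

    Assumed schedule: variance = eta / (1 + step) ** gamma,
    so the noise shrinks as training progresses.
    """
    variance = eta / (1.0 + step) ** gamma
    return variance ** 0.5


def noisy_gradient(grad, step, eta=0.3, gamma=0.55, rng=random):
    """Return a copy of grad with independent Gaussian noise added to each entry."""
    sigma = noise_std(step, eta, gamma)
    return [g + rng.gauss(0.0, sigma) for g in grad]


# Hypothetical usage: perturb a toy gradient before a parameter update.
rng = random.Random(0)
perturbed = noisy_gradient([0.5, -1.0, 2.0], step=10, rng=rng)
```

Because the noise decays with the step count, early updates are perturbed more strongly than late ones, so the perturbation fades as training proceeds.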
Abstract
Deep neural networks have achieved impressive supervised classification performance in many tasks including image recognition, speech recognition, and sequence to sequence learning. However, this success has not been translated to applications like question answering that may involve complex arithmetic and logic reasoning. A major limitation of these models is in their inability to learn even simple arithmetic and logic operations. For example, it has been shown that neural networks fail to learn to add two binary numbers reliably. In this work, we propose Neural Programmer, an end-to-end differentiable neural network augmented with a small set of basic arithmetic and logic operations. Neural Programmer can call these augmented operations over several steps, thereby inducing compositional programs that are more complex than the built-in operations. The model learns from a weak…
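One way to picture how built-in operations can sit inside a differentiable model is soft selection: rather than picking a single operation, the model weights every operation's output by a softmax over scores. The sketch below is a simplified illustration and not the authors' implementation. The operation set (`sum`, `count`), the scores, and the helper names `softmax` and `soft_step` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative operation set (an assumption, not the paper's full list).
OPS = [
    ("sum", lambda col: float(sum(col))),
    ("count", lambda col: float(len(col))),
]


def soft_step(column, op_scores):
    """Blend the outputs of all operations using softmax weights.

    In a trained model, op_scores would come from a learned controller;
    here they are passed in by hand.
    """
    weights = softmax(op_scores)
    outputs = [fn(column) for _, fn in OPS]
    return sum(w * o for w, o in zip(weights, outputs))


# Equal scores give an even blend of sum (6) and count (3).
blended = soft_step([1, 2, 3], [0.0, 0.0])
```

As the scores become more peaked, the blend approaches the output of a single operation, so a soft selection of this kind can approach a discrete choice while remaining trainable by gradient descent.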
Taxonomy
Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Machine Learning and Algorithms
