Programming with a Differentiable Forth Interpreter
Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel

TL;DR
This paper introduces a differentiable Forth interpreter that lets programmers encode prior procedural knowledge as program sketches, with the unspecified behavior learned from input-output data. The approach learns complex behaviors and improves reasoning over natural language stories.
Contribution
The paper presents an end-to-end differentiable Forth interpreter in which program sketches leave slots to be filled with trainable behavior, so the unspecified parts of a procedure can be learned by gradient descent.
Findings
Effectively leverages prior program structure for learning complex behaviors
Achieves state-of-the-art accuracy in reasoning about quantities expressed in natural language stories
Enables integration of program execution into neural computation graphs
Abstract
Given that in practice training data is scarce for all but a small set of problems, a core question is how to incorporate prior knowledge into a model. In this paper, we consider the case of prior procedural knowledge for neural networks, such as knowing how a program should traverse a sequence, but not what local actions should be performed at each step. To this end, we present an end-to-end differentiable interpreter for the programming language Forth which enables programmers to write program sketches with slots that can be filled with behaviour trained from program input-output data. We can optimise this behaviour directly through gradient descent techniques on user-specified objectives, and also integrate the program into any larger neural computation graph. We show empirically that our interpreter is able to effectively leverage different levels of prior program structure and…
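The idea of a sketch with a trainable slot can be illustrated with a toy example. The following is a minimal Python sketch, not the paper's interpreter: it assumes a hand-written bubble-sort traversal whose only unknown is whether to swap a pair, represents that slot as a softmax over two actions (NOP and SWAP), and tunes the slot with finite-difference gradient descent. All names (`soft_step`, `run_sketch`, `slot`, `numeric_grad`) are hypothetical, and the hard `a > b` test is a simplification of the continuous machine state a real differentiable interpreter would need.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_step(a, b, slot):
    # The slot observes one bit (is a > b?) and yields a distribution
    # over two candidate actions: NOP (keep order) and SWAP.
    cond = 1 if a > b else 0
    p_nop, p_swap = softmax(slot[cond])
    # Soft execution: blend the results of both actions by their probabilities.
    return p_nop * a + p_swap * b, p_nop * b + p_swap * a

def run_sketch(xs, slot):
    # Fixed, hand-written structure: repeated left-to-right passes over the list.
    xs = list(xs)
    for _ in range(len(xs) - 1):
        for i in range(len(xs) - 1):
            xs[i], xs[i + 1] = soft_step(xs[i], xs[i + 1], slot)
    return xs

def loss(slot, data):
    # Mean squared error between the soft output and the sorted input.
    total = 0.0
    for xs in data:
        out = run_sketch(xs, slot)
        total += sum((o - t) ** 2 for o, t in zip(out, sorted(xs)))
    return total / len(data)

def numeric_grad(slot, data, eps=1e-5):
    # Central finite differences stand in for automatic differentiation.
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for c in range(2):
        for k in range(2):
            old = slot[c][k]
            slot[c][k] = old + eps
            up = loss(slot, data)
            slot[c][k] = old - eps
            down = loss(slot, data)
            slot[c][k] = old
            grad[c][k] = (up - down) / (2 * eps)
    return grad

data = [[2.0, 1.0], [1.0, 2.0]]      # input-output pairs: target is sorted(xs)
slot = [[0.0, 0.0], [0.0, 0.0]]      # untrained slot: NOP and SWAP equally likely
loss_before = loss(slot, data)
for _ in range(200):                 # plain gradient descent, learning rate 1.0
    g = numeric_grad(slot, data)
    for c in range(2):
        for k in range(2):
            slot[c][k] -= 1.0 * g[c][k]
loss_after = loss(slot, data)
print(loss_before, loss_after)
```

Training moves the slot toward SWAP when `a > b` and toward NOP otherwise, while the traversal itself is never learned. In an automatic-differentiation framework the same loss could be embedded in a larger computation graph, which is the integration the abstract describes.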
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning and Algorithms
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
