Neural Arithmetic Logic Units
Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

TL;DR
The paper introduces Neural Arithmetic Logic Units (NALUs), a neural network module designed to improve systematic numerical extrapolation and manipulation, enabling better generalization in tasks involving numerical data.
Contribution
The paper proposes the NALU architecture, which combines learned gating with primitive arithmetic operations to enhance neural networks' ability to generalize numerically.
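The combination of gating and arithmetic primitives can be sketched as a single forward pass. This is a minimal illustration in plain Python, not the authors' code: it assumes the commonly cited formulation in which an effective weight matrix tanh(W_hat) * sigmoid(M_hat) drives an additive path, the same weights applied in log space drive a multiplicative path, and a sigmoid gate mixes the two. The names `nac_weights`, `nalu_forward`, `W_hat`, `M_hat`, `G`, and `eps` are chosen here for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def matvec(W, x):
    # Plain matrix-vector product over nested lists.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def nac_weights(W_hat, M_hat):
    # Assumed parameterization: tanh(W_hat) * sigmoid(M_hat) biases each
    # effective weight toward -1, 0, or 1 (subtract, ignore, add).
    return [[math.tanh(w) * sigmoid(m) for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(W_hat, M_hat)]

def nalu_forward(x, W_hat, M_hat, G, eps=1e-10):
    W = nac_weights(W_hat, M_hat)
    add_path = matvec(W, x)                               # addition / subtraction
    log_x = [math.log(abs(xi) + eps) for xi in x]
    mul_path = [math.exp(v) for v in matvec(W, log_x)]    # multiplication / division
    gate = [sigmoid(v) for v in matvec(G, x)]             # learned gate, one value per output
    return [g * a + (1.0 - g) * m for g, a, m in zip(gate, add_path, mul_path)]

# Hand-set (not trained) parameters: weights saturated near 1, gate pushed open or closed.
x = [3.0, 4.0]
W_hat = [[20.0, 20.0]]
M_hat = [[20.0, 20.0]]
print(nalu_forward(x, W_hat, M_hat, G=[[10.0, 10.0]]))    # gate near 1: close to 3 + 4
print(nalu_forward(x, W_hat, M_hat, G=[[-10.0, -10.0]]))  # gate near 0: close to 3 * 4
```

With the gate saturated, the unit reduces to one of the two arithmetic primitives, which is the behaviour the learned gate is meant to select between. The log-space path operates on absolute values, so this sketch does not track the sign of a product.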
Findings
NALUs enable neural networks to perform arithmetic and extrapolate beyond trained ranges.
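Compared with conventional architectures, NALU-enhanced networks generalize substantially better both inside and outside the range of numerical values seen during training.
The reported tasks are tracking time, performing arithmetic over images of numbers, translating numerical language into real-valued scalars, executing computer code, and counting objects in images.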
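The extrapolation claim can be made concrete with a toy comparison. The sketch below is an illustration only, not an experiment from the paper: `linear_add` stands in for a unit that represents quantities as linear activations with weights fixed at 1, and `tanh_add` is a hypothetical squashing unit (with an arbitrary `scale` of 10) that approximates addition for small inputs but is bounded by construction.

```python
import math

def linear_add(a, b):
    # Linear activation with both weights fixed at 1: exact at any magnitude.
    return 1.0 * a + 1.0 * b

def tanh_add(a, b, scale=10.0):
    # Toy squashing unit: close to a + b when |a + b| is small relative to
    # scale, but the output can never leave the interval [-scale, scale].
    return scale * math.tanh((a + b) / scale)

# Compare the true sum with both units as the inputs grow.
for a, b in [(0.1, 0.2), (1.0, 2.0), (1000.0, 2000.0)]:
    print(a + b, linear_add(a, b), tanh_add(a, b))
```

The linear unit tracks the true sum at every scale, while the squashing unit saturates once the inputs leave the region where tanh is approximately linear. This is the kind of failure outside the training range that the architecture is designed to avoid.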
Abstract
Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we propose an architecture that represents numerical quantities as linear activations which are manipulated using primitive arithmetic operators, controlled by learned gates. We call this module a neural arithmetic logic unit (NALU), by analogy to the arithmetic logic unit in traditional processors. Experiments show that NALU-enhanced neural networks can learn to track time, perform arithmetic over images of numbers, translate numerical language into real-valued scalars, execute computer code, and count objects in images. In contrast to conventional architectures, we obtain substantially better generalization both inside and outside of the range of numerical…
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Numerical Methods and Algorithms
