Improving the Robustness of Neural Multiplication Units with Reversible Stochasticity

Bhumika Mistry; Katayoun Farrahi; Jonathon Hare

arXiv:2211.05624·cs.LG·November 11, 2022

TL;DR

This paper introduces stochastic Neural Multiplication Units (sNMUs) that enhance robustness and learning reliability in neural arithmetic tasks by mitigating biases and avoiding undesirable solutions.

Contribution

The paper proposes reversible stochasticity in NMUs to improve their robustness and ability to learn simple arithmetic tasks across varying training ranges.

Findings

01. sNMUs learn multiplication more reliably than standard NMUs across different training ranges

02. Stochasticity improves robustness and encourages convergence to the true solution rather than to undesirable optima

03. Stochasticity has the potential to improve the learned representations of upstream networks on numerical and image tasks

Abstract

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. Causes of failure are linked to inductive and input biases which encourage convergence to solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is proposed to apply reversible stochasticity, encouraging avoidance of such optima whilst converging to the true solution. Empirically, we show that stochasticity provides improved robustness with the potential to improve learned representations of upstream networks for numerical and image…
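
The idea of reversible stochasticity can be illustrated with a minimal sketch. The formulas are not given on this page, so everything below is an assumption: the NMU is taken to compute a product of terms `w_i * x_i + 1 - w_i` with weights in [0, 1], and the stochastic variant is assumed to multiply each input by sampled noise and then divide the result by the same unit applied to the noise alone. The function names, the uniform noise range `[1, 5]`, and the example inputs are illustrative choices, not the authors' implementation.

```python
import math
import random


def nmu(x, w):
    """Assumed NMU form for one output: prod_i (w_i * x_i + 1 - w_i)."""
    return math.prod(wi * xi + 1.0 - wi for xi, wi in zip(x, w))


def snmu(x, w, rng, low=1.0, high=5.0):
    """Sketch of a stochastic NMU (assumed form, hypothetical noise range).

    Each input is scaled by sampled noise, then the output is divided by
    the NMU applied to the noise alone, which undoes the scaling exactly
    only when every weight is 0 or 1.
    """
    noise = [rng.uniform(low, high) for _ in x]
    noisy_out = nmu([n * xi for n, xi in zip(noise, x)], w)
    # With noise >= 1 and weights in [0, 1], each factor is >= 1, so the
    # denominator cannot be zero.
    return noisy_out / nmu(noise, w)


rng = random.Random(0)
x = [3.0, 4.0, 7.0]
w_true = [1.0, 1.0, 0.0]  # discrete weights: multiply the first two inputs
w_soft = [0.5, 0.5, 0.0]  # non-discrete weights: not a true multiplication

print("true weights:", nmu(x, w_true), snmu(x, w_true, rng))
print("soft weights:", nmu(x, w_soft), [snmu(x, w_soft, rng) for _ in range(3)])
```

Under these assumptions the noise cancels at discrete weights, so the true solution is unaffected, while at non-discrete weights the output changes from sample to sample. That is one way to read the abstract's claim that stochasticity discourages undesirable optima while still allowing convergence to the true solution.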

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks