Surprisal-Triggered Conditional Computation with Neural Networks

Loren Lugosch; Derek Nowrouzezahrai; Brett H. Meyer

arXiv:2006.01659 · cs.LG · June 3, 2020 · 5 cites

TL;DR

This paper introduces a neural network model that allocates more computation to more difficult inputs. The surprisal of each input, computed by an autoregressive model, decides whether a small, fast network or a big, slow network processes it, which reduces the computation needed for speech recognition.

Contribution

It proposes using the surprisal of an autoregressive model (the negative log-likelihood of the current observation) to trigger conditional computation, reducing FLOPs while matching the performance of a baseline that always uses the big network.

Findings

01 Reduces FLOPs by 15% relative to always using the big network.

02 Matches the performance of the baseline in which the big network is always used.

03 Shows these results on two speech recognition tasks.
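
The FLOPs saving in the findings depends on how often the big network is triggered. The following is a rough accounting sketch, not the paper's own cost model: it assumes a fixed per-step cost for each network, an optional per-step cost for the autoregressive model, and a baseline that pays only the big network's cost. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def expected_flops(p_big, flops_small, flops_big, flops_ar=0.0):
    """Average per-step cost when the big network runs with probability p_big.

    Assumption: the autoregressive model (cost flops_ar) runs on every step,
    and exactly one of the two networks runs on each step.
    """
    if not 0.0 <= p_big <= 1.0:
        raise ValueError("p_big must lie in [0, 1]")
    return flops_ar + p_big * flops_big + (1.0 - p_big) * flops_small


def relative_saving(p_big, flops_small, flops_big, flops_ar=0.0):
    """Fraction of FLOPs saved relative to always running the big network."""
    return 1.0 - expected_flops(p_big, flops_small, flops_big, flops_ar) / flops_big


# Illustrative numbers only (not taken from the paper).
saving = relative_saving(p_big=0.8, flops_small=10.0, flops_big=100.0, flops_ar=5.0)
print(f"relative saving: {saving:.2f}")
```

Under these assumptions the saving grows as the big network is triggered less often, and the overhead of the autoregressive model has to be repaid by the steps routed to the small network.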

Abstract

Autoregressive neural network models have been used successfully for sequence generation, feature extraction, and hypothesis scoring. This paper presents yet another use for these models: allocating more computation to more difficult inputs. In our model, an autoregressive model is used both to extract features and to predict observations in a stream of input observations. The surprisal of the input, measured as the negative log-likelihood of the current observation according to the autoregressive model, is used as a measure of input difficulty. This in turn determines whether a small, fast network, or a big, slow network, is used. Experiments on two speech recognition tasks show that our model can match the performance of a baseline in which the big network is always used with 15% fewer FLOPs.
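
The gating rule described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the predictive distribution is assumed to be a diagonal Gaussian, the switch is a fixed threshold on the surprisal, and the predictor and both "networks" are toy stand-ins. The names `gaussian_surprisal`, `route`, and `last_value_predictor` are hypothetical. The sketch also omits the feature-extraction role of the autoregressive model and passes raw observations to the networks.

```python
import math


def gaussian_surprisal(x, mean, var):
    """Negative log-likelihood of x under a diagonal Gaussian (an assumption)."""
    nll = 0.0
    for xi, mi, vi in zip(x, mean, var):
        nll += 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return nll


def last_value_predictor(history, dim=1):
    """Toy autoregressive model: predict the previous observation, unit variance."""
    mean = history[-1] if history else [0.0] * dim
    return mean, [1.0] * dim


def route(observations, predict, small_net, big_net, threshold):
    """Run big_net only on observations whose surprisal exceeds threshold."""
    outputs, used_big, history = [], [], []
    for x in observations:
        mean, var = predict(history)            # prediction from past inputs only
        surprisal = gaussian_surprisal(x, mean, var)
        use_big = surprisal > threshold         # hypothetical fixed-threshold rule
        outputs.append(big_net(x) if use_big else small_net(x))
        used_big.append(use_big)
        history.append(x)
    return outputs, used_big


# Toy usage: the jump from 0.1 to 5.0 is the only surprising observation.
stream = [[0.0], [0.1], [5.0], [5.1]]
outputs, used_big = route(
    stream,
    last_value_predictor,
    small_net=lambda x: ("small", x),
    big_net=lambda x: ("big", x),
    threshold=2.0,
)
print(used_big)
```

In this toy stream only the third observation is poorly predicted, so it is the only one sent to the big network. Raising the threshold sends fewer inputs to the big network and lowers the average cost.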

Code & Models

Repositories

lorenlugosch/conditional-computation-using-surprisal (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Topic Modeling · Music and Audio Processing