A2P-MANN: Adaptive Attention Inference Hops Pruned Memory-Augmented Neural Networks

Mohsen Ahmadzadeh; Mehdi Kamal; Ali Afzali-Kusha; Massoud Pedram

arXiv:2101.09693 · cs.CL · February 24, 2022 · 1 citation

Open Access

TL;DR

This paper introduces A2P-MANN, an adaptive, pruned memory-augmented neural network that reduces computational costs by dynamically determining attention hops and pruning weights, achieving significant efficiency gains in question-answering tasks.

Contribution

The paper presents a novel adaptive approach for attention inference and weight pruning in MANNs, significantly reducing computations with minimal accuracy loss.

Findings

1. Over 42% fewer computations on average compared to the baseline MANN.

2. Up to 68% reduction in computation when combined with zero-skipping.

3. Up to 43% runtime reduction on CPU and GPU platforms.
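To make the computation counts in the findings concrete, the following is a minimal sketch of how pruning a fully-connected layer and then skipping zero operands reduces the number of multiply-accumulate operations. It is an illustration under stated assumptions, not the paper's method: simple magnitude-threshold pruning stands in for the two pruning approaches the paper develops, and the names `magnitude_prune` and `fc_forward_zero_skip`, along with the threshold value, are hypothetical.

```python
def magnitude_prune(weights, threshold):
    """Zero out every weight whose absolute value is below `threshold`.

    Assumption: plain magnitude pruning, used here only as a stand-in
    for the pruning approaches described in the paper.
    """
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def fc_forward_zero_skip(weights, x):
    """Compute y = W x, skipping any product with a zero operand.

    Returns the output vector and the number of multiply-accumulate
    operations that were actually performed.
    """
    y = []
    macs = 0
    for row in weights:
        acc = 0.0
        for w, x_j in zip(row, x):
            if w == 0.0 or x_j == 0.0:
                continue  # zero-skipping: this product contributes nothing
            acc += w * x_j
            macs += 1
        y.append(acc)
    return y, macs


# Hypothetical 2x3 weight matrix and input vector.
W = [[0.5, -0.01, 0.2],
     [0.003, 0.9, -0.4]]
x = [1.0, 0.0, 2.0]

_, macs_before = fc_forward_zero_skip(W, x)
y_pruned, macs_after = fc_forward_zero_skip(magnitude_prune(W, 0.05), x)
print(macs_before, macs_after, y_pruned)
```

Comparing `macs_before` with `macs_after` shows the kind of saving the findings quantify: pruning creates zeros in the weights, and zero-skipping turns those zeros into operations that are never executed.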

Abstract

In this work, to limit the number of required attention inference hops in memory-augmented neural networks, we propose an online adaptive approach called A2P-MANN. By exploiting a small neural network classifier, an adequate number of attention inference hops for the input query is determined. The technique results in the elimination of a large number of unnecessary computations in extracting the correct answer. In addition, to further lower computations in A2P-MANN, we suggest pruning the weights of the final FC (fully-connected) layers. To this end, two pruning approaches, one with negligible accuracy loss and the other with controllable loss on the final accuracy, are developed. The efficacy of the technique is assessed by using the twenty question-answering (QA) tasks of the bAbI dataset. The analytical assessment reveals, on average, more than 42% fewer computations compared to the baseline…
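The adaptive-hop idea in the abstract can be sketched as follows. This is a rough illustration under several assumptions that the abstract does not confirm: the memory update follows the end-to-end memory network pattern (the controller state is the previous state plus the attended memory read), the small classifier is modeled as a single logistic unit, and it is consulted after every hop. All names (`attention_hop`, `confident_enough`, `adaptive_inference`) and parameters (`w`, `b`, `threshold`, `max_hops`) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_hop(u, in_mem, out_mem):
    """One attention hop: attend over memory with state u, return the new state."""
    p = softmax([dot(u, m) for m in in_mem])
    o = [sum(p_i * c[j] for p_i, c in zip(p, out_mem)) for j in range(len(u))]
    return [u_j + o_j for u_j, o_j in zip(u, o)]


def confident_enough(u, w, b, threshold):
    """Hypothetical hop classifier: a logistic unit on the controller state."""
    z = dot(w, u) + b
    return 1.0 / (1.0 + math.exp(-z)) >= threshold


def adaptive_inference(query, in_mem, out_mem, w, b, max_hops=3, threshold=0.5):
    """Run hops until the classifier signals enough, or max_hops is reached."""
    u = list(query)
    hops_used = 0
    for _ in range(max_hops):
        u = attention_hop(u, in_mem, out_mem)
        hops_used += 1
        if confident_enough(u, w, b, threshold):
            break  # remaining hops are skipped, saving their computation
    return u, hops_used


# Hypothetical two-slot memory with 2-dimensional embeddings.
in_mem = [[1.0, 0.0], [0.0, 1.0]]
out_mem = [[0.5, 0.5], [0.1, 0.2]]
state, hops = adaptive_inference([1.0, 0.0], in_mem, out_mem, w=[10.0, 10.0], b=0.0)
print(hops, state)
```

The savings come from the early `break`: every query that the classifier judges answerable after fewer hops avoids the attention computations of the remaining ones, while harder queries still receive the full `max_hops`.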

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling

Methods: Pruning