Decomposing a Recurrent Neural Network into Modules for Enabling Reusability and Replacement

Sayem Mohammad Imtiaz; Fraol Batole; Astha Singh; Rangeet Pan; Breno Dantas Cruz; Hridesh Rajan

arXiv:2212.05970 · cs.SE · February 10, 2023 · 1 citation

TL;DR

This paper introduces a novel method to decompose RNNs into modules, enabling reuse and replacement without retraining, thus improving flexibility in language translation and understanding tasks.

Contribution

This is the first work to decompose RNNs into modules. The approach applies to various RNN types and facilitates reuse and replacement in natural language processing.

Findings

01. Decomposing RNNs incurs minimal accuracy loss (-0.6%).

02. Reused and replaced modules maintain performance without retraining.

03. Approach is validated on 5 datasets with multiple model variants.

Abstract

Can we take a recurrent neural network (RNN) trained to translate between languages and augment it to support a new natural language without retraining the model from scratch? Can we fix the faulty behavior of the RNN by replacing portions associated with the faulty behavior? Recent works on decomposing a fully connected neural network (FCNN) and convolutional neural network (CNN) into modules have shown the value of engineering deep models in this manner, which is standard in traditional SE but foreign to deep learning models. However, prior works focus on image-based multiclass classification problems and cannot be applied to RNNs due to (a) different layer structures, (b) loop structures, (c) different types of input-output architectures, and (d) usage of both nonlinear and logistic activation functions. In this work, we propose the first approach to decompose an RNN into…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Software Engineering Research · Machine Learning and Data Classification

Methods: Tanh Activation · Sigmoid Activation · Gated Recurrent Unit · Long Short-Term Memory