Tensor Decomposition for Compressing Recurrent Neural Network

Andros Tjandra; Sakriani Sakti; Satoshi Nakamura

arXiv:1802.10410·cs.LG·May 9, 2018·6 cites

TL;DR

This paper applies three tensor decompositions (CP, Tucker, and Tensor Train) to re-parameterize the Gated Recurrent Unit (GRU), reducing the parameter count while maintaining performance. Tensor Train gives the best results of the three.

Contribution

The paper re-parameterizes the GRU with CP, Tucker, and Tensor Train decompositions and compares the resulting models on sequence modeling tasks across a range of parameter counts.

Findings

01 Tensor Train-GRU outperforms the other tensor decomposition methods.

02 Tensor decompositions significantly reduce RNN parameters.

03 Performance is maintained across various parameter sizes.
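
To make the parameter reduction concrete, the sketch below counts parameters for one dense weight matrix and for its CP, Tucker, and TT factorizations. It assumes the matrix is reshaped into a d-way tensor whose k-th mode has size m_k * n_k (one input mode paired with one output mode). The mode sizes and ranks in the example are hypothetical and are not taken from the paper, and the function names are illustrative only.

```python
from math import prod


def dense_params(in_modes, out_modes):
    # A dense matrix of shape (prod(in_modes), prod(out_modes)).
    return prod(in_modes) * prod(out_modes)


def cp_params(in_modes, out_modes, rank):
    # CP: one factor matrix of shape (m_k * n_k, rank) per mode.
    return rank * sum(m * n for m, n in zip(in_modes, out_modes))


def tucker_params(in_modes, out_modes, ranks):
    # Tucker: a core of shape (R_1, ..., R_d) plus one factor
    # matrix of shape (m_k * n_k, R_k) per mode.
    core = prod(ranks)
    factors = sum(m * n * r for m, n, r in zip(in_modes, out_modes, ranks))
    return core + factors


def tt_params(in_modes, out_modes, ranks):
    # Tensor Train: core k has shape (r_{k-1}, m_k, n_k, r_k),
    # with boundary ranks ranks[0] == ranks[-1] == 1.
    return sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(in_modes, out_modes))
    )


# Hypothetical example: a 256 x 512 weight matrix split into three modes.
in_modes, out_modes = (4, 8, 8), (8, 8, 8)
print("dense :", dense_params(in_modes, out_modes))
print("CP    :", cp_params(in_modes, out_modes, rank=10))
print("Tucker:", tucker_params(in_modes, out_modes, ranks=(5, 5, 5)))
print("TT    :", tt_params(in_modes, out_modes, ranks=(1, 4, 4, 1)))
```

The counts cover a single weight matrix only. A GRU has several such matrices (for the gates and the candidate state), so the totals for a full model depend on how each of them is factorized.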

Abstract

In machine learning, the Recurrent Neural Network (RNN) has become a popular architecture for sequential data modeling. However, behind their impressive performance, RNNs require a large number of parameters for both training and inference. In this paper, we aim to reduce the number of parameters while maintaining the expressive power of the RNN. We utilize several tensor decomposition methods, including CANDECOMP/PARAFAC (CP), Tucker decomposition, and Tensor Train (TT), to re-parameterize the Gated Recurrent Unit (GRU) RNN. We evaluate the performance of all tensor-based RNNs on sequence modeling tasks with varying numbers of parameters. Based on our experimental results, TT-GRU achieved the best results across the range of parameter counts compared to the other decomposition methods.
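
As an illustration of what re-parameterizing a weight matrix with Tensor Train can look like, here is a minimal NumPy sketch. It assumes the TT-matrix layout in which core k has shape (r_{k-1}, m_k, n_k, r_k), and it is not the code from the linked repository. The names `tt_to_matrix` and `tt_matvec`, along with the mode sizes and ranks, are hypothetical. The sketch applies the cores to a batch of inputs without building the dense matrix, then checks the result against the dense product.

```python
import numpy as np


def tt_to_matrix(cores):
    """Rebuild the dense (M, N) matrix from TT cores (for checking only)."""
    full = cores[0]
    for core in cores[1:]:
        # Contract the trailing rank axis with the next core's leading rank axis.
        full = np.tensordot(full, core, axes=1)
    full = full[0, ..., 0]  # drop the boundary ranks r_0 = r_d = 1
    d = len(cores)
    # Axes are (m_1, n_1, ..., m_d, n_d); group all m's first, then all n's.
    order = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    full = full.transpose(order)
    m_total = int(np.prod(full.shape[:d]))
    n_total = int(np.prod(full.shape[d:]))
    return full.reshape(m_total, n_total)


def tt_matvec(x, cores):
    """Compute x @ W for a batch x of shape (batch, M) using only the cores."""
    batch = x.shape[0]
    in_modes = [core.shape[1] for core in cores]
    # State layout: (batch, rank, remaining input modes..., produced output modes...)
    state = x.reshape(batch, 1, *in_modes)
    for core in cores:
        # Contract the current rank axis and the next input mode with the core.
        state = np.tensordot(state, core, axes=([1, 2], [0, 1]))
        # The new rank axis ends up last; move it back to position 1.
        state = np.moveaxis(state, -1, 1)
    return state.reshape(batch, -1)


# Hypothetical sizes: a 24 x 12 matrix stored as three small cores.
rng = np.random.default_rng(0)
in_modes, out_modes, ranks = (2, 3, 4), (3, 2, 2), (1, 2, 3, 1)
cores = [
    rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
    for k in range(len(in_modes))
]
x = rng.standard_normal((5, 24))

y_tt = tt_matvec(x, cores)
y_dense = x @ tt_to_matrix(cores)
print("max abs difference:", np.abs(y_tt - y_dense).max())
```

In a TT-based GRU, each input-to-hidden or hidden-to-hidden product would plausibly be replaced by a contraction of this kind, with the cores trained directly in place of the dense matrix.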

Code & Models

Repositories

androstj/tensor_rnn (PyTorch, official)

Taxonomy

Topics: Tensor decomposition and applications · Parallel Computing and Optimization Techniques