A Unifying Framework of Bilinear LSTMs

Mohit Rajpal; Bryan Kian Hsiang Low

arXiv:1910.10294 · cs.LG · September 12, 2023

TL;DR

This paper introduces a unifying bilinear LSTM framework that captures nonlinear interactions among input features in sequence data, improving performance over a linear LSTM without increasing the number of parameters to be learned.

Contribution

It proposes a flexible framework balancing expressivity and parameter efficiency, unifying linear and bilinear LSTMs for sequence learning.

Findings

01 Outperforms linear LSTMs on language tasks

02 Maintains parameter count while increasing expressivity

03 Demonstrates general applicability across several language-based sequence learning tasks

Abstract

This paper presents a novel unifying framework of bilinear LSTMs that can represent and exploit the nonlinear interactions among the input features present in sequence datasets, achieving superior performance over a linear LSTM without incurring more parameters to be learned. To realize this, our unifying framework balances the expressivity of the linear and bilinear terms by trading off the hidden state vector size against the approximation quality of the weight matrix in the bilinear term, so that the performance of our bilinear LSTM is optimized under the same parameter count. We empirically evaluate our bilinear LSTM on several language-based sequence learning tasks to demonstrate its general applicability.
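
The trade-off described in the abstract can be illustrated with a small parameter-counting sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the bilinear term is assumed to be a rank-`rank` factorization of a per-unit weight matrix acting on the concatenated input and previous hidden state, and the function names are hypothetical. Under those assumptions, the sketch shows that raising the rank (a better approximation of the bilinear weight matrix) forces a smaller hidden state if the total parameter count is to stay within that of a linear LSTM.

```python
def linear_lstm_params(input_size, hidden_size):
    """Parameter count of a standard LSTM layer: 4 gates, each with a
    weight matrix over [input; previous hidden state] plus a bias."""
    z = input_size + hidden_size
    return 4 * (hidden_size * z + hidden_size)


def bilinear_lstm_params(input_size, hidden_size, rank):
    """Parameter count under an ASSUMED bilinear parameterization:
    each gate unit adds a bilinear form z^T W z, with W approximated by
    `rank` outer products u v^T (2 * rank * len(z) parameters per unit).
    This is an illustrative choice, not the paper's exact model."""
    z = input_size + hidden_size
    linear = 4 * (hidden_size * z + hidden_size)
    bilinear = 4 * hidden_size * (2 * rank * z)
    return linear + bilinear


def largest_hidden_size(input_size, rank, budget):
    """Largest hidden size whose (assumed) bilinear LSTM fits the budget.
    The count grows monotonically with hidden size, so a linear scan works."""
    h = 0
    while bilinear_lstm_params(input_size, h + 1, rank) <= budget:
        h += 1
    return h


if __name__ == "__main__":
    input_size = 10
    # Budget: the parameter count of a linear LSTM with hidden size 20.
    budget = linear_lstm_params(input_size, 20)
    for rank in range(4):
        h = largest_hidden_size(input_size, rank, budget)
        used = bilinear_lstm_params(input_size, h, rank)
        print(f"rank={rank} hidden_size={h} params={used} budget={budget}")
```

With `rank=0` the sketch reduces to the linear LSTM; larger ranks spend part of the same budget on the bilinear term instead of on hidden units, which is the balance the framework is described as tuning.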

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Music and Audio Processing

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory