Assessing the Unitary RNN as an End-to-End Compositional Model of Syntax

Jean-Philippe Bernardy (University of Gothenburg); Shalom Lappin (University of Gothenburg, Queen Mary University of London, and King's College London)

arXiv:2208.05719 · cs.CL · August 12, 2022 · E2ECOMPVEC@ESSLLI

TL;DR

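This paper shows that both an LSTM and a unitary-evolution RNN (URN) reach encouraging accuracy on context-free long-distance agreement and mildly context-sensitive cross-serial dependencies. URNs additionally retain all information over arbitrary distances and satisfy strict compositionality, which the authors present as an advantage for explainability.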

Contribution

The paper assesses unitary RNNs as models of syntax, extending earlier experiments on deeply nested context-free dependencies, and highlights their strict compositionality and potential for explainability in NLP.

Findings

01. URNs achieve encouraging accuracy on long-distance syntactic dependencies, both context-free and mildly context-sensitive.

02. URNs retain all information about the input over arbitrary sequence lengths.

03. URNs satisfy strict compositionality, unlike LSTMs.

Abstract

We show that both an LSTM and a unitary-evolution recurrent neural network (URN) can achieve encouraging accuracy on two types of syntactic patterns: context-free long distance agreement, and mildly context-sensitive cross serial dependencies. This work extends recent experiments on deeply nested context-free long distance dependencies, with similar results. URNs differ from LSTMs in that they avoid non-linear activation functions, and they apply matrix multiplication to word embeddings encoded as unitary matrices. This permits them to retain all information in the processing of an input string over arbitrary distances. It also causes them to satisfy strict compositionality. URNs constitute a significant advance in the search for explainable models in deep learning applied to NLP.
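
The two properties claimed in the abstract, information retention and strict compositionality, follow from basic facts about unitary matrices and can be illustrated with a toy sketch. The code below is not the authors' model: the 2x2 matrix size, the vocabulary, the random (untrained) embeddings, and the convention of multiplying each new word on the left are all assumptions made for illustration, and the helper names (`unitary_2x2`, `compose`, `EMBED`) are hypothetical.

```python
import cmath
import math
import random


def unitary_2x2(theta, phi):
    """Return a 2x2 unitary matrix parameterised by two angles."""
    c, s = math.cos(theta), math.sin(theta)
    p = cmath.exp(1j * phi)
    return [[c, -s * p.conjugate()], [s * p, c]]


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec(a, v):
    """Apply a 2x2 matrix to a length-2 vector."""
    return [sum(a[i][k] * v[k] for k in range(2)) for i in range(2)]


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))


# Hypothetical toy vocabulary; each word is embedded as a random unitary matrix.
rng = random.Random(0)
VOCAB = ["the", "dogs", "that", "cat", "chases", "bark"]
EMBED = {w: unitary_2x2(rng.uniform(0, math.pi), rng.uniform(0, 2 * math.pi))
         for w in VOCAB}
IDENTITY = [[1, 0], [0, 1]]


def compose(tokens):
    """Matrix for a token sequence: the product of its word matrices.

    Assumed convention: each new word multiplies on the left, and there is
    no non-linear activation between steps.
    """
    m = IDENTITY
    for t in tokens:
        m = matmul(EMBED[t], m)
    return m


# Strict compositionality: the matrix of a string equals the product of the
# matrices of its parts (matrix multiplication is associative).
prefix = ["the", "dogs", "that", "the", "cat", "chases"]
suffix = ["bark"]
whole = compose(prefix + suffix)
parts = matmul(compose(suffix), compose(prefix))
composition_error = max(abs(whole[i][j] - parts[i][j])
                        for i in range(2) for j in range(2))

# Information retention: unitary updates preserve the norm of the state,
# so nothing decays or blows up even over a very long input.
h = [1.0, 0.0]
for t in (rng.choice(VOCAB) for _ in range(10_000)):
    h = matvec(EMBED[t], h)
final_norm = norm(h)

print(f"composition error: {composition_error:.2e}")
print(f"state norm after 10,000 steps: {final_norm:.12f}")
```

In this sketch the composition error and the drift of the state norm away from 1 are limited to floating-point rounding. A model that applies a squashing non-linearity between steps would not, in general, satisfy either property.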

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory