Quaternion Recurrent Neural Networks
Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato De Mori, Yoshua Bengio

TL;DR
This paper introduces quaternion-based recurrent neural networks (QRNN and QLSTM) that better model multi-dimensional sequential data, achieving improved performance and parameter efficiency in speech recognition tasks.
Contribution
The paper presents novel quaternion RNN and LSTM architectures that incorporate internal structural dependencies using quaternion algebra, enhancing modeling capabilities and efficiency.
Findings
QRNN and QLSTM outperform traditional RNN and LSTM in speech recognition.
They use up to 3.3 times fewer parameters than their real-valued counterparts.
They achieve better accuracy with more compact models.
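The parameter saving can be illustrated with a rough per-layer count. The sketch below assumes a plain dense weight matrix with no bias, and the helper names and layer sizes are hypothetical. Four real features are grouped into one quaternion, and each quaternion weight stores four real numbers, so a quaternion layer covering the same real dimensions needs a quarter of the weights. This is an idealized single-layer ratio, not the paper's measured 3.3 figure, which refers to complete models.

```python
def real_dense_params(n_in: int, n_out: int) -> int:
    """Number of weights in a real-valued dense layer (bias ignored)."""
    return n_in * n_out


def quaternion_dense_params(n_in: int, n_out: int) -> int:
    """Number of real weights in a quaternion dense layer of the same real size.

    Assumption: n_in and n_out are multiples of 4, so that every group of
    4 real features forms one quaternion. Each quaternion weight holds 4 reals.
    """
    if n_in % 4 or n_out % 4:
        raise ValueError("dimensions must be multiples of 4")
    return (n_in // 4) * (n_out // 4) * 4


# Hypothetical layer sizes, chosen only for illustration.
n_in, n_out = 1024, 1024
real = real_dense_params(n_in, n_out)
quat = quaternion_dense_params(n_in, n_out)
ratio = real / quat
print(f"real: {real}, quaternion: {quat}, ratio: {ratio}")
```

For these sizes the ratio is exactly 4. A full model would typically land below that if some of its components remain real-valued, which is one possible reading of the smaller figure reported above.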
Abstract
Recurrent neural networks (RNNs) are powerful architectures for modeling sequential data, owing to their ability to learn short- and long-term dependencies between the basic elements of a sequence. Nonetheless, popular tasks such as speech or image recognition involve multi-dimensional input features that are characterized by strong internal dependencies between the dimensions of the input vector. We propose a novel quaternion recurrent neural network (QRNN), alongside a quaternion long short-term memory neural network (QLSTM), that take into account both the external relations and these internal structural dependencies with the quaternion algebra. Similarly to capsules, quaternions allow the QRNN to code internal dependencies by composing and processing multidimensional features as single entities, while the recurrent operation reveals correlations between the elements composing the…
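To make "processing multidimensional features as single entities" concrete, here is a minimal sketch of quaternion multiplication (the Hamilton product), the standard product of quaternion algebra. The `Quaternion` alias and the function name are illustrative choices, not taken from the paper, and how the product is wired into a recurrent layer is not shown in this excerpt. Every output component depends on all four input components, which is how a single quaternion weight mixes the dimensions of a grouped feature.

```python
from typing import Tuple

# A quaternion r + x*i + y*j + z*k stored as (r, x, y, z).
Quaternion = Tuple[float, float, float, float]


def hamilton_product(a: Quaternion, b: Quaternion) -> Quaternion:
    """Multiply two quaternions using i^2 = j^2 = k^2 = ijk = -1."""
    ar, ax, ay, az = a
    br, bx, by, bz = b
    return (
        ar * br - ax * bx - ay * by - az * bz,  # real part
        ar * bx + ax * br + ay * bz - az * by,  # i component
        ar * by - ax * bz + ay * br + az * bx,  # j component
        ar * bz + ax * by - ay * bx + az * br,  # k component
    )


# Basis elements, used to check the multiplication rules.
i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)

ij = hamilton_product(i, j)  # equals k
ji = hamilton_product(j, i)  # equals -k: the product is not commutative
```

Because the product is not commutative, the order of weight and feature matters when a layer applies it.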
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Digital Filter Design and Implementation
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
