Yet Unnoticed in LSTM: Binary Tree Based Input Reordering, Weight Regularization, and Gate Nonlinearization
Mojtaba Moattari

TL;DR
This paper introduces novel input reordering, weight normalization, and gate nonlinearization techniques for LSTMs, demonstrating improved accuracy in text classification tasks by better focusing on long-term information.
Contribution
It proposes new methods for input reordering, weight normalization, and gate nonlinearization in LSTMs, which have not been previously explored together, to enhance long-term information modeling.
Findings
Improved text classification accuracy with proposed methods.
Optimal norm selection for weight normalization enhances model performance.
Nonlinearized gates better capture nonlinear input relationships.
Abstract
LSTM models, widely used in the current machine learning literature and applications, offer a promising way to retain long-term information through gating mechanisms that forget and reduce the effect of current input information. However, even with this pipeline, they do not optimally focus on a specific old index or on long-term information. This paper elaborates upon input reordering approaches to prioritize certain input indices. Moreover, no LSTM-based approach is found in the literature that examines weight normalization while choosing the right weight and exponent of Lp norms through the main supervised loss function. In this paper, we find out which norm best captures the relationship between weights to either smooth or sparsify them. Lastly, gates, as weighted representations of inputs and states that control how much the current input is reduced relative to previous inputs (~ state), are not nonlinearized…
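To make the Lp-norm idea in the abstract concrete, here is a minimal sketch of an Lp penalty added to a supervised loss. The paper's exact formulation is not shown in this excerpt, so the form lam * sum(|w|^p) and the names lp_penalty, total_loss, p, and lam are illustrative assumptions. In the paper, the weight and exponent are described as being chosen through the main supervised loss; in this sketch they are plain arguments.

```python
def lp_penalty(weights, p, lam):
    """Hypothetical helper: return lam * sum(|w|^p) over a flat list of weights.

    p is the norm exponent and lam is the penalty weight. Both are fixed
    arguments here (an assumption made to keep the sketch small).
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return lam * sum(abs(w) ** p for w in weights)


def total_loss(task_loss, weights, p, lam):
    """Supervised task loss plus the Lp penalty (illustrative combination)."""
    return task_loss + lp_penalty(weights, p, lam)


# Toy weight vector: compare the penalty under p = 1 and p = 2.
weights = [0.5, -2.0, 0.0, 1.0]
l1 = lp_penalty(weights, p=1.0, lam=0.1)  # p = 1 tends to favour sparse weights
l2 = lp_penalty(weights, p=2.0, lam=0.1)  # p = 2 tends to favour small, smooth weights
loss = total_loss(1.0, weights, p=2.0, lam=0.1)
```

With p = 1 every nonzero weight is penalized in proportion to its magnitude, which pushes small weights toward exactly zero. With p = 2 large weights dominate the penalty, which shrinks them without forcing sparsity. This is the smooth-versus-sparsify trade-off the abstract refers to.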
Taxonomy
Topics: Text and Document Classification Technologies · Neural Networks and Applications · Topic Modeling
