Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing
Zhaoxin Luo, Michael Zhu

TL;DR
This paper introduces EM-HRNN, an RNN architecture that uses a latent indicator layer, trained with an EM algorithm, to capture hierarchical language structures, improving document classification without pre-training.
Contribution
The paper proposes a new EM-based RNN model with a latent indicator layer for implicit hierarchical structure learning in language processing.
Findings
EM-HRNN outperforms other RNN models in document classification
EM-HRNN achieves performance comparable to BERT without requiring pre-training
Bootstrap strategies improve training efficiency on long texts
Abstract
How to obtain hierarchical representations with increasing levels of abstraction has become one of the key issues in learning with deep neural networks. A variety of RNN models have recently been proposed in the literature to incorporate both explicit and implicit hierarchical information when modeling language. In this paper, we propose a novel approach called the latent indicator layer to identify and learn implicit hierarchical information (e.g., phrases), and further develop an EM algorithm to handle the latent indicator layer in training. The latent indicator layer further simplifies a text's hierarchical structure, which allows us to seamlessly integrate different levels of attention mechanisms into the structure. We call the resulting architecture the EM-HRNN model. Furthermore, we develop two bootstrap strategies to effectively and efficiently train the EM-HRNN model on long…
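For readers unfamiliar with EM, the textbook form of the algorithm with latent binary indicators can be sketched as follows. This is a generic illustration only, not the paper's actual objective: the notation (x for the input text, y for the document label, z for the indicator sequence, θ for the model parameters) is assumed here for exposition and may differ from the authors' formulation.

```latex
% Generic EM with latent indicators (textbook form, not the paper's exact objective).
% Assumed notation: x = input text, y = label, z = (z_1, ..., z_T) with z_t \in \{0, 1\},
% \theta = model parameters, k = iteration index.
\begin{align*}
\text{E-step:}\quad & Q(\theta \mid \theta^{(k)})
  = \mathbb{E}_{z \sim p(z \mid x, y;\, \theta^{(k)})}\big[\log p(y, z \mid x;\, \theta)\big] \\
\text{M-step:}\quad & \theta^{(k+1)} = \arg\max_{\theta}\; Q(\theta \mid \theta^{(k)})
\end{align*}
```

In this generic scheme, the E-step averages over the unobserved indicators under the current parameters, and the M-step updates the parameters against that expected log-likelihood.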
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Text and Document Classification Technologies
