Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing

Zhaoxin Luo; Michael Zhu

arXiv:2201.08919·cs.CL·January 25, 2022·1 cites

Open Access

TL;DR

This paper introduces EM-HRNN, a novel RNN architecture that uses a latent indicator layer, trained with an EM algorithm, to capture hierarchical language structures, improving document classification without pre-training.

Contribution

The paper proposes a new EM-based RNN model with a latent indicator layer for implicit hierarchical structure learning in language processing.

Findings

01

EM-HRNN outperforms other RNN models in document classification

02

The model achieves performance comparable to BERT without pre-training

03

Bootstrap strategies improve training efficiency on long texts

Abstract

How to obtain hierarchical representations with an increasing level of abstraction has become one of the key issues in learning with deep neural networks. A variety of RNN models have recently been proposed in the literature to incorporate both explicit and implicit hierarchical information when modeling language. In this paper, we propose a novel approach called the latent indicator layer to identify and learn implicit hierarchical information (e.g., phrases), and further develop an EM algorithm to handle the latent indicator layer in training. The latent indicator layer further simplifies a text's hierarchical structure, which allows us to seamlessly integrate different levels of attention mechanisms into the structure. We call the resulting architecture the EM-HRNN model. Furthermore, we develop two bootstrap strategies to effectively and efficiently train the EM-HRNN model on long…
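
The abstract does not spell out how the EM algorithm treats the latent indicator layer, so the following is only a generic sketch of EM with hidden binary indicators, not the EM-HRNN training procedure. It fits a toy two-component, unit-variance Gaussian mixture in which each observation carries an unobserved 0/1 indicator. The function name `em_latent_indicators`, the initialisation, and the Gaussian setting are all assumptions made for illustration.

```python
import math


def em_latent_indicators(xs, n_iter=50):
    """Toy EM for a two-component, unit-variance Gaussian mixture.

    Each observation x_t has a hidden binary indicator z_t. This is a generic
    illustration of EM with latent indicators (an assumption for exposition),
    not the procedure used to train EM-HRNN.
    """
    mu0, mu1 = min(xs), max(xs)  # crude initialisation (assumption)
    pi = 0.5                     # prior probability that z_t = 1
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that z_t = 1 under current parameters.
        resp = []
        for x in xs:
            p1 = pi * math.exp(-0.5 * (x - mu1) ** 2)
            p0 = (1.0 - pi) * math.exp(-0.5 * (x - mu0) ** 2)
            resp.append(p1 / (p0 + p1))
        # M-step: re-estimate the means and the prior from the soft assignments.
        n1 = sum(resp)
        n0 = len(xs) - n1
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu0 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n0
        pi = n1 / len(xs)
    return mu0, mu1, pi, resp
```

The E-step replaces each unknown indicator with its posterior probability, and the M-step updates the parameters as if those soft assignments were observed. The sketch omits guards against degenerate cases (for example, all responsibilities collapsing to one component), which a real implementation would need.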

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Text and Document Classification Technologies