The Emergence of Chunking Structures with Hierarchical RNN

Zijun Wu; Anup Anand Deshmukh; Yongkang Wu; Jimmy Lin; Lili Mou

arXiv:2309.04919 · cs.CL · December 19, 2025 · 2 cites

TL;DR

This paper presents an unsupervised hierarchical RNN that learns to identify chunking structures in language. It improves unsupervised chunking performance and shows that the chunking structure emerges only transiently during downstream-task training.

Contribution

The paper introduces an unsupervised hierarchical RNN (HRNN) approach to chunking, demonstrates that it improves chunking performance, and analyzes how chunking structures emerge during training.

Findings

01 Improved unsupervised chunking performance on multiple datasets

02 Emergence of chunking structures is transient during training

03 Model benefits from a two-stage training process

Abstract

In Natural Language Processing (NLP), predicting linguistic structures, such as parsing and chunking, has mostly relied on manual annotations of syntactic structures. This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner. We present a Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions. Our approach involves a two-stage training process: pretraining with an unsupervised parser and finetuning on downstream NLP tasks. Experiments on multiple datasets reveal a notable improvement of unsupervised chunking performance in both pretraining and finetuning stages. Interestingly, we observe that the emergence of the chunking structure is transient during the neural model's downstream-task training. This study contributes to the advancement of unsupervised…
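
To make the word-to-chunk and chunk-to-sentence idea concrete, here is a minimal sketch in plain Python. It is not the authors' model: the B/I tag scheme, the function names `tags_to_chunks` and `toy_two_level_recurrence`, the scalar states, and the weights `w` and `u` are all assumptions chosen for illustration. The first function groups words into flat (non-hierarchical) chunks; the second runs a toy two-level recurrence in which a lower-level state accumulates the words of the current chunk and an upper-level state accumulates completed chunks.

```python
import math


def tags_to_chunks(words, tags):
    """Group words into flat chunks.

    Assumed tag scheme (hypothetical, not taken from the paper):
    "B" begins a new chunk, "I" continues the current one.
    """
    chunks = []
    for word, tag in zip(words, tags):
        if tag == "B" or not chunks:
            chunks.append([word])      # start a new chunk
        else:
            chunks[-1].append(word)    # extend the current chunk
    return chunks


def toy_two_level_recurrence(word_values, tags, w=0.5, u=0.5):
    """Toy two-level recurrence with scalar states (illustration only).

    lower: composes words within the current chunk (word-to-chunk).
    upper: composes finished chunks (chunk-to-sentence).
    The weights w and u are arbitrary placeholders, not learned values.
    """
    lower, upper = 0.0, 0.0
    started = False
    for x, tag in zip(word_values, tags):
        if tag == "B" and started:
            # A new chunk begins: fold the finished chunk into the upper state
            upper = math.tanh(u * upper + lower)
            lower = 0.0
        lower = math.tanh(w * lower + x)
        started = True
    if started:
        # Fold the final chunk into the sentence-level state
        upper = math.tanh(u * upper + lower)
    return upper


words = ["the", "cat", "sat", "on", "the", "mat"]
tags = ["B", "I", "B", "B", "B", "I"]
print(tags_to_chunks(words, tags))
# [['the', 'cat'], ['sat'], ['on'], ['the', 'mat']]
```

In an unsupervised setting the boundary decisions would have to be induced by the model rather than given as input; the hard-coded tags here only show how a boundary sequence determines the two levels of composition.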

Code & Models

Repositories

manga-uofa/uchrnn (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification