Complex Structure Leads to Overfitting: A Structure Regularization Decoding Method for Natural Language Processing

Xu Sun; Weiwei Sun; Shuming Ma; Xuancheng Ren; Yi Zhang; Wenjie Li; Houfeng Wang

arXiv:1711.10331 · cs.LG · November 29, 2017 · 6 cites

TL;DR

This paper introduces a structure regularization decoding method to reduce overfitting in complex structured models for NLP, improving performance on sequence labeling and parsing tasks.

Contribution

It proposes a novel structure regularization decoding approach that leverages simple models to regularize complex models, backed by theoretical analysis and empirical validation.

Findings

01

Maximum F1 error rate reduction of 36.4% on sequence labeling.

02

Maximum UAS improvement of 5.5% on parsing.

03

Method outperforms or matches state-of-the-art results.

Abstract

Recent systems on structured prediction focus on increasing the level of structural dependencies within the model. However, our study suggests that complex structures entail high overfitting risks. To control the structure-based overfitting, we propose to conduct structure regularization decoding (SR decoding). The decoding of the complex structure model is regularized by the additionally trained simple structure model. We theoretically analyze the quantitative relations between the structural complexity and the overfitting risk. The analysis shows that complex structure models are prone to the structure-based overfitting. Empirical evaluations show that the proposed method improves the performance of the complex structure models by reducing the structure-based overfitting. On the sequence labeling tasks, the proposed method substantially improves the performance of the complex neural…
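The idea of regularizing a complex model's decoding with a simpler model can be illustrated with a small sketch. The excerpt above does not spell out how the two models are combined, so the version below assumes a weighted sum of log-scores followed by ordinary Viterbi decoding; the function name `sr_decode`, its parameters, and the `weight` knob are hypothetical and not taken from the paper. Here the "complex" model scores label-to-label transitions, while the "simple" model scores each position independently.

```python
from typing import List, Sequence


def sr_decode(
    complex_emit: Sequence[Sequence[float]],   # [position][label] scores from the complex model
    complex_trans: Sequence[Sequence[float]],  # [prev_label][label] transition scores (complex model)
    simple_emit: Sequence[Sequence[float]],    # [position][label] scores from the simple model
    weight: float = 1.0,                       # assumed interpolation weight for the simple model
) -> List[int]:
    """Viterbi decoding over summed scores of a complex and a simple model.

    Illustrative assumption only: the combination rule is a weighted sum.
    """
    n_labels = len(complex_trans)
    # Combine per-position scores of both models before decoding.
    emit = [
        [c + weight * s for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(complex_emit, simple_emit)
    ]
    best = list(emit[0])           # best score of any path ending in each label
    back: List[List[int]] = []     # back-pointers for positions 1..T-1
    for t in range(1, len(emit)):
        new_best, pointers = [], []
        for j in range(n_labels):
            prev = max(range(n_labels), key=lambda i: best[i] + complex_trans[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + complex_trans[prev][j] + emit[t][j])
        best = new_best
        back.append(pointers)
    # Follow back-pointers from the best final label.
    path = [max(range(n_labels), key=lambda j: best[j])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With `weight=0.0` the function reduces to plain decoding of the complex model; increasing `weight` lets the position-wise evidence of the simple model override label sequences that the complex model favors only through its structural dependencies.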

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Bioinformatics