# Neural Chinese Named Entity Recognition via CNN-LSTM-CRF and Joint Training with Word Segmentation

**Authors:** Fangzhao Wu, Junxin Liu, Chuhan Wu, Yongfeng Huang, Xing Xie

arXiv: 1905.01964 · 2019-05-07

## TL;DR

This paper presents a CNN-LSTM-CRF neural model for Chinese named entity recognition (CNER) that is trained jointly with a word segmentation model and augmented with automatically generated pseudo-labeled samples. It improves performance on two benchmark datasets, especially when training data is limited.

## Contribution

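The paper makes three contributions: a CNN-LSTM-CRF architecture that captures both local and long-distance context for CNER, a unified framework that jointly trains the CNER and word segmentation models, and an automatic method that generates pseudo-labeled samples from existing labeled data.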

## Key findings

- The approach improves CNER performance on two benchmark datasets.
- The improvement is most pronounced when training data is insufficient.
- Joint training with word segmentation is designed to strengthen the CNER model's ability to identify entity boundaries.
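
To illustrate why word segmentation is a useful auxiliary signal for boundary detection, here is a minimal sketch in plain Python. It is not the paper's model: the BMES segmentation tag scheme, the BIO entity labels, the example sentence, and the helper names `words_to_bmes`, `bio_spans`, and `span_is_aligned` are all assumptions made for illustration. It only shows that entity boundaries tend to coincide with word boundaries, which is the information a jointly trained segmentation model could supply.

```python
from typing import List, Tuple


def words_to_bmes(words: List[str]) -> List[str]:
    """Convert a segmented sentence into per-character BMES tags."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bio_spans(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Extract (start, end_exclusive, type) entity spans from BIO labels."""
    spans: List[Tuple[int, int, str]] = []
    start, etype = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes the last span
        inside = label.startswith("I-") and etype == label[2:]
        if start is not None and not inside:
            spans.append((start, i, etype))
            start, etype = None, None
        if label.startswith("B-"):
            start, etype = i, label[2:]
    return spans


def span_is_aligned(span: Tuple[int, int, str], seg_tags: List[str]) -> bool:
    """True if the span starts at a word start and ends at a word end."""
    start, end, _ = span
    return seg_tags[start] in ("B", "S") and seg_tags[end - 1] in ("E", "S")


# Hypothetical example sentence: "李小明在北京工作"
words = ["李小明", "在", "北京", "工作"]
ner_labels = ["B-PER", "I-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]

seg_tags = words_to_bmes(words)
for span in bio_spans(ner_labels):
    print(span, span_is_aligned(span, seg_tags))
```

In this example both entity spans start and end on word boundaries, so a model that predicts segmentation tags well already has strong evidence about where entities can begin and end.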

## Abstract

Chinese named entity recognition (CNER) is an important task in the field of Chinese natural language processing. However, CNER is very challenging since Chinese entity names are highly context-dependent. In addition, Chinese texts lack delimiters to separate words, making it difficult to identify the boundaries of entities. Besides, the training data for CNER in many domains is usually insufficient, and annotating enough training data for CNER is very expensive and time-consuming. In this paper, we propose a neural approach for CNER. First, we introduce a CNN-LSTM-CRF neural architecture to capture both local and long-distance contexts for CNER. Second, we propose a unified framework to jointly train CNER and word segmentation models in order to enhance the ability of the CNER model to identify entity boundaries. Third, we introduce an automatic method to generate pseudo-labeled samples from existing labeled data, which can enrich the training data. Experiments on two benchmark datasets show that our approach can effectively improve the performance of Chinese named entity recognition, especially when training data is insufficient.
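
The abstract does not spell out how pseudo-labeled samples are produced. As a hedged sketch of one plausible approach, the following plain-Python code assumes that new samples are made by replacing each entity in a labeled sentence with another entity of the same type drawn from the labeled data. The BIO label scheme, the example sentences, and the helper names `collect_entities` and `make_pseudo_sample` are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[List[str], List[str]]  # (characters, BIO labels)


def entity_end(labels: List[str], start: int) -> int:
    """Return the exclusive end index of the entity that begins at `start`."""
    etype = labels[start][2:]
    end = start + 1
    while end < len(labels) and labels[end] == "I-" + etype:
        end += 1
    return end


def collect_entities(samples: List[Sample]) -> Dict[str, List[str]]:
    """Gather the entity names seen in the labeled data, grouped by type."""
    pool: Dict[str, List[str]] = {}
    for chars, labels in samples:
        i = 0
        while i < len(labels):
            if labels[i].startswith("B-"):
                j = entity_end(labels, i)
                names = pool.setdefault(labels[i][2:], [])
                name = "".join(chars[i:j])
                if name not in names:
                    names.append(name)
                i = j
            else:
                i += 1
    return pool


def make_pseudo_sample(sample: Sample, pool: Dict[str, List[str]],
                       rng: random.Random) -> Sample:
    """Replace every entity with a randomly chosen entity of the same type.

    Assumes `pool` was built from data that contains this sample, so every
    entity type in the sample has at least one candidate name.
    """
    chars, labels = sample
    new_chars: List[str] = []
    new_labels: List[str] = []
    i = 0
    while i < len(labels):
        if labels[i].startswith("B-"):
            etype = labels[i][2:]
            name = rng.choice(pool[etype])
            new_chars.extend(name)  # one list item per character
            new_labels.extend(["B-" + etype] + ["I-" + etype] * (len(name) - 1))
            i = entity_end(labels, i)
        else:
            new_chars.append(chars[i])
            new_labels.append(labels[i])
            i += 1
    return new_chars, new_labels


# Hypothetical labeled data.
samples: List[Sample] = [
    (list("李小明在北京工作"),
     ["B-PER", "I-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]),
    (list("王芳去上海"), ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]),
]
pool = collect_entities(samples)
pseudo_chars, pseudo_labels = make_pseudo_sample(samples[0], pool, random.Random(0))
print("".join(pseudo_chars), pseudo_labels)
```

Because the replacement entity carries its own regenerated labels, the pseudo sample stays consistently annotated even when the new name has a different length from the original one. A real pipeline would likely also filter out samples identical to their source.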

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01964/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01964/full.md

## References

34 references — full list in the complete paper: https://tomesphere.com/paper/1905.01964/full.md

---
Source: https://tomesphere.com/paper/1905.01964