Overview of the NLPCC 2015 Shared Task: Chinese Word Segmentation and POS Tagging for Micro-blog Texts

Xipeng Qiu; Peng Qian; Liusong Yin; Shiyu Wu; Xuanjing Huang

arXiv:1505.07599·cs.CL·July 1, 2015

TL;DR

This paper provides an overview of the NLPCC 2015 shared task on Chinese word segmentation and POS tagging specifically for informal micro-blog texts, highlighting datasets, approaches, and results.

Contribution

It introduces a new dataset for micro-blog Chinese text and compares various approaches across different resource tracks in a shared task setting.

Findings

01 Different approaches show varying effectiveness on informal texts

02 Resource availability impacts system performance

03 The shared task fosters progress in Chinese micro-blog NLP

Abstract

In this paper, we give an overview of the shared task at the 4th CCF Conference on Natural Language Processing & Chinese Computing (NLPCC 2015): Chinese word segmentation and part-of-speech (POS) tagging for micro-blog texts. Unlike the widely used newswire datasets, the dataset of this shared task consists of relatively informal micro-texts. The shared task has two sub-tasks: (1) individual Chinese word segmentation and (2) joint Chinese word segmentation and POS tagging. Each sub-task has three tracks to distinguish systems that use different resources. We first introduce the dataset and task, then characterize the different approaches of the participating systems, report the test results, and provide an overall analysis of these results. An online system is available for open registration and evaluation at http://nlp.fudan.edu.cn/nlpcc2015.
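
To make the two sub-tasks concrete, the sketch below scores a predicted segmentation, and a predicted joint segmentation with POS tags, against a gold standard using word-level precision, recall, and F1. The abstract does not state the scoring procedure, so this is an assumption based on the metric commonly used for Chinese word segmentation, not the organizers' evaluation script. The function names, the example sentence, and the tag labels (`NT`, `NN`, `VA`, `VV`) are illustrative placeholders.

```python
def to_spans(words):
    """Map a word list to a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def to_tagged_spans(tagged_words):
    """Map (word, tag) pairs to a set of (start, end, tag) triples."""
    spans, start = set(), 0
    for word, tag in tagged_words:
        end = start + len(word)
        spans.add((start, end, tag))
        start = end
    return spans


def prf(gold, pred):
    """Precision, recall, and F1 over two sets of spans."""
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Sub-task 1 (segmentation only): a word counts as correct
# when both of its boundaries match the gold segmentation.
gold_seg = ["今天", "天气", "不错"]
pred_seg = ["今天", "天", "气", "不错"]
seg_scores = prf(to_spans(gold_seg), to_spans(pred_seg))

# Sub-task 2 (joint segmentation and tagging): a word counts as
# correct only when its boundaries and its tag both match.
# The tags here are hypothetical, chosen for illustration.
gold_pos = [("今天", "NT"), ("天气", "NN"), ("不错", "VA")]
pred_pos = [("今天", "NT"), ("天气", "VV"), ("不错", "VA")]
pos_scores = prf(to_tagged_spans(gold_pos), to_tagged_spans(pred_pos))

print("segmentation P/R/F1:", seg_scores)
print("joint P/R/F1:", pos_scores)
```

Comparing character spans rather than word strings keeps repeated words at different positions distinct, and it means a single boundary error in the joint setting penalizes both the segmentation and the tag for that word.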

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Web Data Mining and Analysis