Research on multi-dimensional end-to-end phrase recognition algorithm based on background knowledge
Zheng Li, Gang Tu, Guang Liu, Zhi-Qiang Zhan, Yi-Jian Liu

TL;DR
This paper introduces a multi-dimensional end-to-end phrase recognition algorithm that incorporates background knowledge and handles nested, multi-granular language features. It improves accuracy on the CPWD dataset by more than one percentage point and took first place in Chinese humor type recognition at CCL 2018.
Contribution
It proposes a novel annotation rule and matching algorithm that effectively recognize nested phrases and dependencies, integrating background knowledge into end-to-end recognition.
Findings
Improved accuracy on the CPWD dataset by over 1 percentage point
Effective recognition of nested and multi-granular phrases
Won first place in Chinese humor type recognition at CCL 2018
Abstract
Deep end-to-end methods based on supervised learning are currently used for entity recognition and dependency analysis. These methods have two problems: first, they cannot incorporate background knowledge; second, they cannot recognize the multi-granular and nested features of natural language. To address these problems, annotation rules based on the phrase window are proposed, and a corresponding multi-dimensional end-to-end phrase recognition algorithm is designed. The annotation rules divide sentences into seven types of nested phrases and indicate the dependencies between phrases. The algorithm can incorporate background knowledge, recognize all kinds of nested phrases in a sentence, and recognize the dependencies between phrases. The experimental results show that the annotation rules are easy to use and unambiguous; the matching algorithm is more…
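To make the idea of nested phrases with dependencies concrete, here is a minimal Python sketch of one possible in-memory representation. It is an illustration under stated assumptions, not the paper's annotation format or matching algorithm: the `Phrase` class, the `head` field, the helper functions, and the labels `S`, `NP`, and `VP` are all hypothetical placeholders (the abstract does not name the seven phrase types).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Phrase:
    """A labeled token span; all field names here are illustrative assumptions."""
    start: int                  # index of the first token (inclusive)
    end: int                    # index one past the last token (exclusive)
    label: str                  # placeholder phrase-type label, not from the paper
    head: Optional[int] = None  # list index of the phrase this one depends on


def is_well_nested(phrases: List[Phrase]) -> bool:
    """True if every pair of spans is either disjoint or one contains the other."""
    for i, a in enumerate(phrases):
        for b in phrases[i + 1:]:
            overlap = a.start < b.end and b.start < a.end
            a_in_b = b.start <= a.start and a.end <= b.end
            b_in_a = a.start <= b.start and b.end <= a.end
            if overlap and not (a_in_b or b_in_a):
                return False  # spans cross, so they are not properly nested
    return True


def nesting_depth(phrases: List[Phrase], idx: int) -> int:
    """Count the phrases whose span strictly contains the span of phrases[idx]."""
    p = phrases[idx]
    return sum(
        1
        for j, q in enumerate(phrases)
        if j != idx
        and q.start <= p.start
        and p.end <= q.end
        and (q.start, q.end) != (p.start, p.end)
    )


# Hypothetical annotation of a six-token sentence.
tokens = ["the", "old", "man", "saw", "a", "dog"]
phrases = [
    Phrase(0, 6, "S"),            # whole sentence
    Phrase(0, 3, "NP", head=2),   # "the old man", depends on the verb phrase
    Phrase(3, 6, "VP", head=0),   # "saw a dog", depends on the sentence
    Phrase(4, 6, "NP", head=2),   # "a dog", nested inside the verb phrase
]

crossing = [Phrase(0, 3, "NP"), Phrase(2, 5, "VP")]  # overlapping but not nested
```

In this sketch, `is_well_nested(phrases)` checks that spans form a proper hierarchy, and `nesting_depth(phrases, 3)` reports how many enclosing phrases surround "a dog". Storing dependencies as indices into the same list keeps the nesting structure and the dependency structure in a single flat record, which is one simple way to hold both kinds of information at once.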
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
