TURNER: The Uncertainty-based Retrieval Framework for Chinese NER
Zhichao Geng, Hang Yan, Zhangyue Yin, Chenxin An, Xipeng Qiu

TL;DR
TURNER introduces an uncertainty-based retrieval framework for Chinese NER that leverages auxiliary knowledge sources like search engines, improving accuracy and surpassing previous lexicon-based methods on benchmark datasets.
Contribution
The paper proposes a novel uncertainty sampling and knowledge fusion approach for Chinese NER, reducing reliance on domain-specific lexicons and enhancing retrieval efficiency.
Findings
TURNER outperforms existing lexicon-based methods.
Achieves new state-of-the-art results on four benchmark datasets.
Handles ambiguous and out-of-vocabulary (OOV) entities effectively.
Abstract
Chinese NER is a difficult task due to the ambiguity of Chinese characters and the absence of word boundaries. Previous work on Chinese NER has focused on lexicon-based methods to introduce boundary information and reduce out-of-vocabulary (OOV) cases during prediction. However, it is expensive to obtain and dynamically maintain high-quality lexicons in specific domains, which motivates us to utilize more general knowledge resources, e.g., search engines. In this paper, we propose TURNER: The Uncertainty-based Retrieval framework for Chinese NER. The idea behind TURNER is to imitate human behavior: we frequently retrieve auxiliary knowledge as assistance when encountering an unknown or uncertain entity. To improve the efficiency and effectiveness of retrieval, we first propose two types of uncertainty sampling methods for selecting the most ambiguous entity-level uncertain components…
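The abstract is cut off before it describes the two sampling methods, so the following is only an illustrative sketch of one generic form of uncertainty sampling, not the authors' method. It assumes a tagger that outputs a per-character probability distribution over labels, scores each character by Shannon entropy, and groups consecutive high-entropy characters into spans that could serve as retrieval queries. The function names, the threshold value, and the toy probabilities are all hypothetical.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one character's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def uncertain_spans(chars, label_probs, threshold=0.5):
    """Group consecutive characters whose entropy exceeds `threshold`.

    `threshold` is an arbitrary illustrative value, not one from the paper.
    """
    spans, current = [], []
    for ch, probs in zip(chars, label_probs):
        if token_entropy(probs) > threshold:
            current.append(ch)
        elif current:
            # A confident character closes the current uncertain span.
            spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans


# Toy example: made-up distributions over three labels (B, I, O).
confident = [0.98, 0.01, 0.01]
unsure = [0.40, 0.35, 0.25]
chars = list("南京市长江大桥")
label_probs = [confident, confident, unsure, unsure,
               confident, confident, confident]

# Each returned span would be sent to the retriever as a query.
queries = uncertain_spans(chars, label_probs)
```

In this toy input only the two middle characters are marked uncertain, so a single span is selected for retrieval while the confidently tagged characters trigger no lookup, which is the efficiency argument the abstract makes.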
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Information Retrieval and Search Behavior
