Comparative Analysis of Extrinsic Factors for NER in French
Grace Yang, Zhiyi Li, Yadong Liu, Jungyeul Park

TL;DR
This paper investigates how extrinsic factors such as model structure, annotation scheme, and data augmentation can improve French NER performance with limited data, raising the F1 score from 62.41 to 79.39.
Contribution
It systematically evaluates the impact of various extrinsic factors and data augmentation techniques on French NER with limited data, demonstrating substantial performance improvements.
Findings
F1 score improved from 62.41 to 79.39 with combined techniques.
Considering multiple extrinsic factors is effective for low-resource NER.
Data augmentation significantly boosts NER accuracy in limited data scenarios.
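To put the reported improvement in perspective, the short sketch below works out the absolute and relative gain from the two F1 scores quoted above. Only the two scores come from the paper; the variable names are illustrative.

```python
# F1 scores as reported in the abstract (percentage points).
baseline_f1 = 62.41  # original CRF model
best_f1 = 79.39      # best combined configuration

# Absolute gain in F1 points and gain relative to the baseline.
absolute_gain = best_f1 - baseline_f1
relative_gain = absolute_gain / baseline_f1 * 100

print(f"Absolute gain: {absolute_gain:.2f} F1 points")
print(f"Relative gain: {relative_gain:.1f}% over the baseline")
```

This gives a gain of roughly 17 F1 points, or about 27% relative to the CRF baseline.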
Abstract
Named entity recognition (NER) is a crucial task that aims to identify structured information, which is often replete with complex, technical terms and a high degree of variability. Accurate and reliable NER can facilitate the extraction and analysis of important information. However, NER for languages other than English is challenging due to limited data availability, because annotating data requires considerable expertise, time, and expense. In this paper, using limited data, we explore various factors, including model structure, corpus annotation scheme, and data augmentation techniques, to improve the performance of an NER model for French. Our experiments demonstrate that these approaches can significantly improve the model's F1 score from the original CRF score of 62.41 to 79.39. Our findings suggest that considering different extrinsic factors and combining these techniques is a promising…
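The abstract does not state which annotation schemes were compared. Purely as an illustration of what changing a corpus annotation scheme can involve, the sketch below converts tags from the widely used IOB2 convention to IOBES. The choice of these two schemes and the helper name `iob2_to_iobes` are assumptions, not details taken from the paper.

```python
def iob2_to_iobes(tags):
    """Convert one sentence of IOB2 tags to IOBES.

    Hypothetical helper for illustration; not the authors' code.
    Assumes well-formed IOB2 input (every entity starts with B-).
    """
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # Look ahead: does the same entity continue on the next token?
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == f"I-{label}"
        if prefix == "B":
            # Single-token entities become S-, multi-token ones keep B-.
            out.append(f"B-{label}" if continues else f"S-{label}")
        else:
            # An I- tag that ends the entity becomes E-.
            out.append(f"I-{label}" if continues else f"E-{label}")
    return out


# Example sentence: "Marie Curie visite Paris"
example = ["B-PER", "I-PER", "O", "B-LOC"]
print(iob2_to_iobes(example))
```

IOBES marks entity ends and single-token entities explicitly, which gives a sequence model more boundary information at the cost of a larger tag set.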
Taxonomy
Topics: Phonetics and Phonology Research · Speech Recognition and Synthesis
Methods: Conditional Random Field
