May I Check Again? -- A simple but efficient way to generate and use contextual dictionaries for Named Entity Recognition. Application to French Legal Texts
Valentin Barriere, Amaury Fouret

TL;DR
This paper introduces a typo-robust method for Named Entity Recognition in French legal texts. It combines contextual dictionaries with high-level features added to the last layer of a neural network, raising the F1-score from 94.85% to 96.52%.
Contribution
It presents a novel approach combining contextual dictionaries and high-level neural features to enhance NER robustness to typos in legal French texts.
Findings
32% relative reduction in F1-score error
F1-score improved from 94.85% to 96.52%
Effective handling of typos in legal NER
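The first two findings describe the same result. A short check of the arithmetic, assuming "error" means 100 minus the F1-score (the function name is ours, for illustration only):

```python
def relative_error_reduction(f1_before: float, f1_after: float) -> float:
    """Fraction by which the error (100 - F1) shrinks between two F1-scores."""
    error_before = 100.0 - f1_before   # 5.15 points of error for the baseline
    error_after = 100.0 - f1_after     # 3.48 points of error for the new method
    return (error_before - error_after) / error_before


# F1-scores reported in the abstract, in percent.
reduction = relative_error_reduction(94.85, 96.52)
print(f"{reduction:.1%}")  # prints 32.4%
```

A gain of 1.67 F1 points therefore corresponds to removing roughly a third of the remaining errors.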
Abstract
In this paper we present a new method to learn a model robust to typos for a Named Entity Recognition task. Our improvement over existing methods helps the model to take into account the context of the sentence inside a court decision in order to recognize an entity with a typo. We used state-of-the-art models and enriched the last layer of the neural network with high-level information linked with the potential of the word to be a certain type of entity. More precisely, we utilized the similarities between the word and the potential entity candidates in the tagged sentence context. The experiments on a dataset of French court decisions show a reduction of the relative F1-score error of 32%, upgrading the score obtained with the most competitive fine-tuned state-of-the-art system from 94.85% to 96.52%.
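The abstract's idea of comparing each word with the entity candidates found in the same context can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a first tagging pass has already produced entity labels, and it uses character-level similarity from Python's `difflib` as a stand-in for whatever similarity measure the paper uses. The function names, the tag set, and the example sentence are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive character similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def contextual_features(tokens, first_pass_tags, entity_types):
    """For each token, return one score per entity type: the best similarity
    between the token and any word tagged with that type in the same context."""
    # Contextual dictionary: words the first pass tagged with each entity type.
    dictionary = {entity_type: set() for entity_type in entity_types}
    for token, tag in zip(tokens, first_pass_tags):
        if tag in dictionary:
            dictionary[tag].add(token)

    features = []
    for token in tokens:
        row = [
            max((similarity(token, cand) for cand in dictionary[entity_type]),
                default=0.0)
            for entity_type in entity_types
        ]
        features.append(row)
    return features


# Hypothetical context: the second mention of the name contains a typo
# ("Dupnot") that the first pass failed to tag as a person.
tokens = ["M.", "Dupont", "conteste", "la", "décision", ".",
          "M.", "Dupnot", "demande", "l'annulation", "."]
first_pass_tags = ["O", "PER", "O", "O", "O", "O",
                   "O", "O", "O", "O", "O"]

features = contextual_features(tokens, first_pass_tags, ["PER"])
typo_score = features[tokens.index("Dupnot")][0]
print(round(typo_score, 2))
```

In this sketch the misspelled token gets a high score for the person type because it closely resembles a name already detected in the same context. Scores like these are the kind of high-level information that could be concatenated to the last layer's input before a second tagging pass.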
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Artificial Intelligence in Law
