Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza, Rodrigo Nogueira, Roberto Lotufo

TL;DR
This paper demonstrates that fine-tuning Portuguese BERT models with a CRF layer significantly improves named entity recognition performance, achieving new state-of-the-art results on the HAREM I dataset.
Contribution
It introduces a BERT-CRF architecture for Portuguese NER and compares feature-based and fine-tuning strategies, with fine-tuning achieving superior results.
Findings
Fine-tuning BERT-CRF improves the F1-score on HAREM I by 1 to 4 points over the previous state of the art, depending on the evaluation scenario.
Achieves state-of-the-art results on the HAREM I dataset.
Demonstrates effectiveness of transfer learning for Portuguese NER.
Abstract
Recent advances in language representation using neural networks have made it viable to transfer the learned internal states of a trained model to downstream natural language processing tasks, such as named entity recognition (NER) and question answering. It has been shown that the leverage of pre-trained language models improves the overall performance on many tasks and is highly beneficial when labeled data is scarce. In this work, we train Portuguese BERT models and employ a BERT-CRF architecture to the NER task on the Portuguese language, combining the transfer capabilities of BERT with the structured predictions of CRF. We explore feature-based and fine-tuning training strategies for the BERT model. Our fine-tuning approach obtains new state-of-the-art results on the HAREM I dataset, improving the F1-score by 1 point on the selective scenario (5 NE classes) and by 4 points on the…
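To make the "structured predictions of CRF" concrete, here is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is an illustration only, not the authors' implementation: the three-tag set, the emission scores, and the transition scores are made-up assumptions. In a BERT-CRF model the emission scores would come from a linear layer over BERT's token representations, and the transition scores would be learned.

```python
from typing import List, Sequence

# Hypothetical tag set for illustration; HAREM I uses more entity classes.
TAGS = ["O", "B-PER", "I-PER"]


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence under a linear-chain CRF.

    emissions[t][j]   -- score of tag j at token t
    transitions[i][j] -- score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for a path that ends in tag j at step t.
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-token sentence (rows: tokens, columns: TAGS).
EMISSIONS = [[2.0, 0.5, 0.0],
             [0.2, 0.1, 1.0],
             [0.1, 0.0, 1.5]]
# Assumed transition scores: O -> I-PER is heavily penalised, the rest neutral.
TRANSITIONS = [[0.0, 0.0, -10.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]]

greedy = [max(range(len(TAGS)), key=lambda j: row[j]) for row in EMISSIONS]
decoded = viterbi_decode(EMISSIONS, TRANSITIONS)
print([TAGS[i] for i in greedy])   # ['O', 'I-PER', 'I-PER']
print([TAGS[i] for i in decoded])  # ['O', 'B-PER', 'I-PER']
```

Picking the best tag per token independently yields an invalid sequence here (I-PER directly after O), whereas decoding with transition scores returns a well-formed one. That is the kind of sequence-level consistency a CRF layer adds on top of per-token scores.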
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
Methods: Linear Layer · Conditional Random Field · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece
