TL;DR
ANER is a transformer-based web tool that recognizes 50 entity types in Arabic and Arabizi. It reports a higher F1 score than CAMeL Tools and offers user-friendly features such as Wikipedia linking.
Contribution
The paper introduces a BERT-based NER model for Arabic and Arabizi, served through a web interface and deployed on HuggingFace, and reports a higher F1 score than CAMeL Tools.
Findings
F1 score of 88.7% on WikiFANE_Gold dataset
F1 score of 77.7% on NewsFANE_Gold dataset
Accessible via web and HuggingFace for developers
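The F1 scores above combine precision and recall into a single number. The summary does not say how the authors computed them (micro or macro averaging, token level or entity level), so the following is only a minimal sketch of one common convention: entity-level F1 over BIO-tagged sequences, where a predicted entity counts as correct only if its span and type both match the gold annotation. The tag names PER and LOC and the helper names bio_to_spans and entity_f1 are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (hypothetical helper)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
                start, label = None, None
            # Lenient choice: a stray "I-" tag also opens a new span.
            if tag.startswith("B-") or tag.startswith("I-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity-level F1: exact match on both span boundaries and type."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative tags only; the 50 classes used by ANER are not listed here.
gold_tags = ["B-PER", "I-PER", "O", "B-LOC"]
pred_tags = ["B-PER", "I-PER", "O", "O"]
score = entity_f1(gold_tags, pred_tags)
print(f"F1 = {score:.3f}")
```

In this toy case the prediction finds one of the two gold entities and makes no false predictions, so precision is 1.0 and recall is 0.5. With 50 classes, results also depend on whether scores are averaged per class or pooled across classes, which this sketch does not address.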
Abstract
Named Entity Recognition (NER) is one of the main tasks of Natural Language Processing (NLP). It is used in many applications and can also serve as an intermediate step for other tasks. We present ANER, a web-based named entity recognizer for Arabic and Arabizi. The model is built upon BERT, a transformer-based encoder, and can recognize 50 different entity classes covering various fields. We trained our model on the WikiFANE_Gold dataset, which consists of Wikipedia articles, and achieved an F1 score of 88.7%. This beats CAMeL Tools' F1 score of 83% on the ANERcorp dataset, which has only 4 classes. We also obtained an F1 score of 77.7% on the NewsFANE_Gold dataset, which contains out-of-domain data from news articles. The system is deployed on a user-friendly web interface that accepts users' inputs in Arabic or Arabizi. It allows users to explore the…
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Weight Decay · Linear Layer · Attention Dropout · WordPiece · Softmax · Dense Connections · Layer Normalization
