Named Entity Recognition System for Sindhi Language
Awais Khan Jumani, Mashooque Ahmed Memon, Fida Hussain Khoso, Anwar Ali Sanjrani, Safeeullah Soomro

TL;DR
This paper presents a rule-based Named Entity Recognition system for the Sindhi language that identifies ten entity types with high accuracy. It addresses a gap in NLP tooling for Sindhi, which is less developed than that for other Arabic-script languages.
Contribution
It introduces the first rule-based NER system for the Sindhi language, capable of recognizing ten entity categories with 98.71% accuracy.
Findings
Achieved 98.71% accuracy in entity recognition
Developed for ten different entity categories
Addresses a gap in Sindhi NLP tools
Abstract
A Named Entity Recognition (NER) system extracts information from text and classifies it into categories such as person name, organization, location, date and time, term, designation, and short form. NER is now considered an important component of many natural language processing (NLP) tasks, such as information retrieval, machine translation, information extraction, and question answering. Even at a surface level, understanding the named entities in a document provides a richer analytical framework and supports cross-referencing. NER has been developed for several Arabic-script languages, such as Arabic, Persian, and Urdu, but not yet for Sindhi. This paper explains the problem of NER in the context of the Sindhi language and provides a relevant solution. The system is developed to tag ten different named entities. We have used rule-based…
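To make the rule-based idea concrete, here is a minimal sketch of how such a tagger could be organized. It is not the authors' implementation: the gazetteer entries, the date pattern, the "token after a designation is a person" rule, and the function name `tag_tokens` are all hypothetical, and English placeholder tokens stand in for Sindhi text.

```python
import re

# Hypothetical gazetteers (lookup lists); the paper's actual lexicons are not shown here.
GAZETTEERS = {
    "LOCATION": {"Karachi", "Hyderabad", "Sukkur"},
    "DESIGNATION": {"Professor", "Doctor", "Minister"},
}

# Hypothetical pattern rule: a numeric date such as 12-03-2015 or 1/2/15.
DATE_PATTERN = re.compile(r"^\d{1,2}[-/]\d{1,2}[-/]\d{2,4}$")


def tag_tokens(tokens):
    """Return (token, label) pairs; 'O' marks tokens that no rule matched."""
    tagged = []
    previous_label = "O"
    for token in tokens:
        label = "O"
        if DATE_PATTERN.match(token):
            label = "DATE"
        else:
            # Gazetteer lookup: first category whose list contains the token wins.
            for category, entries in GAZETTEERS.items():
                if token in entries:
                    label = category
                    break
        # Assumed contextual rule: an otherwise unmatched token that directly
        # follows a designation is treated as a person name.
        if label == "O" and previous_label == "DESIGNATION":
            label = "PERSON"
        tagged.append((token, label))
        previous_label = label
    return tagged


if __name__ == "__main__":
    sentence = "Professor Ali visited Karachi on 12-03-2015".split()
    for token, label in tag_tokens(sentence):
        print(token, label)
```

A real system for Sindhi would need script-aware tokenization and far larger lexicons; the sketch only shows how lookup rules, pattern rules, and contextual rules can be combined in one pass.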
