An Efficient Indexing and Searching Technique for Information Retrieval for Urdu Language
Muhammad Mudassar Qureshi, Muhammad Shoaib, Kalsoom

TL;DR
This paper presents a specialized indexing and searching technique tailored for Urdu language information retrieval, focusing on language-specific processing, stemming, and index optimization to enhance search efficiency.
Contribution
It introduces a novel Urdu-specific indexing method that incorporates morphological rules and a stemmer implementation, and it optimizes index creation by excluding stop words and using ordered index files.
Findings
Improved retrieval efficiency for Urdu language data
Effective Urdu stemmer based on morphological rules
Enhanced index creation process without stop words
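As a rough illustration of what a rule-based stemmer of this kind might look like, the sketch below strips two common Urdu plural suffixes. The suffix list, the minimum stem length, and the name `stem` are assumptions made for this example; the paper's actual morphological rules are not listed on this page.

```python
# Hypothetical suffix list for illustration only; not the paper's rule set.
PLURAL_SUFFIXES = ("یں", "وں")

# Assumed guard so that very short words are not reduced to a single letter.
MIN_STEM_LENGTH = 2


def stem(word: str) -> str:
    """Strip the first matching suffix if the remaining stem is long enough."""
    for suffix in PLURAL_SUFFIXES:
        remaining = len(word) - len(suffix)
        if word.endswith(suffix) and remaining >= MIN_STEM_LENGTH:
            return word[:remaining]
    return word


if __name__ == "__main__":
    for w in ("کتابیں", "کتابوں", "میں"):
        print(w, "->", stem(w))
```

Under these assumed rules, both plural forms of "کتاب" (book) map to the same stem, while the short word "میں" is left unchanged because stripping would leave only one letter.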
Abstract
Indexing techniques are used to improve the retrieval of data in response to a given search condition. Inverted files are the structure most often used for creating indexes. This paper proposes an indexing technique for the Urdu language. The language-processing step in index creation differs from one language to another, so we discuss the index creation steps specifically for Urdu. We explore morphological rules for Urdu and implement these rules to create an Urdu stemmer. We implement our proposed technique with different implementations and compare the results. We suggest that indexes should be created without stop words and that the index file should be an ordered index file.
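The two recommendations in the abstract (skip stop words, keep the index ordered) can be sketched as a small in-memory inverted index. This is a minimal sketch, not the authors' implementation: the stop-word set is a tiny illustrative subset, tokenization is plain whitespace splitting, and the names `build_index` and `search` are hypothetical.

```python
from bisect import bisect_left

# Tiny illustrative stop-word subset (assumption); a real list would be much larger.
STOP_WORDS = {"ہے", "پر", "یہ", "میں", "کا", "کی", "کے"}


def build_index(docs: dict[int, str]) -> tuple[list[str], list[list[int]]]:
    """Build an inverted index whose term list is kept in sorted order."""
    postings: dict[str, set[int]] = {}
    for doc_id, text in docs.items():
        for token in text.split():  # whitespace tokenization (assumption)
            if token in STOP_WORDS:
                continue  # stop words never enter the index
            postings.setdefault(token, set()).add(doc_id)
    terms = sorted(postings)  # ordered term list enables binary search
    return terms, [sorted(postings[t]) for t in terms]


def search(terms: list[str], postings: list[list[int]], query: str) -> list[int]:
    """Look up a term by binary search over the ordered term list."""
    i = bisect_left(terms, query)
    if i < len(terms) and terms[i] == query:
        return postings[i]
    return []


if __name__ == "__main__":
    docs = {1: "کتاب میز پر ہے", 2: "یہ کتاب اچھی ہے"}
    terms, postings = build_index(docs)
    print(search(terms, postings, "کتاب"))
```

Keeping the terms sorted lets a lookup use binary search instead of a linear scan, and dropping stop words keeps very frequent, low-value terms out of the index. A stemmer would normally be applied to each token before insertion and to the query before lookup.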
Taxonomy
Topics: Algorithms and Data Compression · Web Data Mining and Analysis · Advanced Database Systems and Queries
