Information retrieval system for Silte language using BM25 weighting
Abdulmalik Johar

TL;DR
This paper presents a probabilistic information retrieval system for the Silte language, utilizing BM25 weighting to improve search relevance in growing digital text collections.
Contribution
It introduces a tailored IR system for the Silte language that incorporates tokenization, stemming, stop word removal, and synonym handling, with BM25 weighting for relevance ranking.
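To make the pipeline above concrete, here is a minimal Python sketch of preprocessing followed by BM25 ranking. It is not the paper's implementation: the stop word list, the synonym map, the no-op stemmer, the placeholder tokens such as w_a, and the parameter values k1=1.2 and b=0.75 are all assumptions chosen for illustration. The IDF term uses the common log(1 + ...) variant, which may differ from the one the author used.

```python
import math
from collections import Counter

# Everything below is an illustrative assumption, not taken from the paper.
STOP_WORDS = {"w_stop"}          # placeholder stop word list
SYNONYMS = {"w_syn": "w_base"}   # placeholder synonym -> canonical term map


def stem(token):
    """Placeholder stemmer: returns the token unchanged."""
    return token


def preprocess(text):
    """Tokenize on whitespace, drop stop words, map synonyms, then stem."""
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [stem(t) for t in tokens]


def bm25_rank(query, docs, k1=1.2, b=0.75):
    """Return (doc_index, score) pairs sorted by descending BM25 score."""
    if not docs:
        return []
    doc_tokens = [preprocess(d) for d in docs]
    n_docs = len(doc_tokens)
    avg_len = sum(len(d) for d in doc_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in doc_tokens for term in set(d))
    query_terms = preprocess(query)
    scores = []
    for i, tokens in enumerate(doc_tokens):
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling bm25_rank("w_a", docs) on a small list of strings returns the document indices with their scores, best match first. A real Silte system would replace the placeholder stop words, synonym map, and stemmer with language-specific resources.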
Findings
Effective retrieval of Silte language documents demonstrated
System improves search relevance over baseline methods
Inclusion of language-specific preprocessing enhances performance
Abstract
The main aim of an information retrieval system is to extract relevant information from a large collection of data based on users' needs. The basic idea is that when a user submits a query, the system generates a list of related documents ranked by their degree of relevance. The volume of digital unstructured Silte text documents keeps growing, and this growth makes it difficult to find and use the right information. Thus, developing an information retrieval system for the Silte language allows users to search for and retrieve relevant documents that satisfy their information needs. In this research, we design a probabilistic information retrieval system for the Silte language. The system has both an indexing part and a searching part. In these modules, different text operations such as…
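The split into an indexing part and a searching part can be illustrated with a small inverted index in Python. This is a sketch under stated assumptions, not the author's design: the function names build_index and candidates, the whitespace tokenization, and the placeholder tokens are hypothetical, and the text operations the abstract refers to are omitted.

```python
from collections import defaultdict


def build_index(docs):
    """Indexing part: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term in text.split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def candidates(index, query):
    """Searching part: sorted ids of documents containing any query term."""
    ids = set()
    for term in query.split():
        # .get avoids creating empty entries for unseen terms.
        ids.update(index.get(term, {}))
    return sorted(ids)
```

The searching part only needs to score the documents returned by candidates, rather than every document in the collection, which is the usual reason for building the index ahead of query time.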
Taxonomy
Topics: Information Retrieval and Search Behavior · Natural Language Processing Techniques · Semantic Web and Ontologies
