Information retrieval system for Silte language using BM25 weighting

Abdulmalik Johar

arXiv:2012.08907 · cs.IR · December 17, 2020

TL;DR

This paper presents a probabilistic information retrieval system for the Silte language, utilizing BM25 weighting to improve search relevance in growing digital text collections.
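
For reference, BM25 weighting is commonly written in the Okapi form below. This is a sketch under assumptions: the excerpt does not say which BM25 variant or parameter values the paper uses. Here f(q_i, D) is the frequency of query term q_i in document D, |D| is the document length in terms, avgdl is the average document length in the collection, N is the number of documents, n(q_i) is the number of documents containing q_i, and k_1 and b are free parameters (values around 1.2 and 0.75 are typical defaults, not values taken from the paper).

```latex
\[
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i)\,
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1\left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)},
\qquad
\mathrm{IDF}(q_i) = \ln\!\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
\]
```

The length term in the denominator penalizes long documents, and b controls how strongly; the "+ 1" inside the logarithm is one common way to keep the IDF non-negative.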

Contribution

It introduces a tailored IR system for the Silte language that combines tokenization, stemming, stop-word removal, and synonym handling with BM25 weighting for relevance ranking.
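
The pipeline named above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the stop-word set, the synonym table, and the suffix-stripping `stem` function are hypothetical placeholders written with English tokens, since the excerpt does not list the actual Silte resources or stemming rules. The parameters `k1` and `b` use common defaults rather than values from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical placeholders: NOT real Silte stop words or synonyms.
STOP_WORDS = {"the", "a", "of"}
SYNONYMS = {"automobile": "car"}  # maps a variant to one canonical term


def stem(token):
    # Placeholder stemmer: strips an assumed suffix list. A real Silte
    # stemmer would apply language-specific affix rules instead.
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    # Tokenize, drop stop words, normalize synonyms, then stem.
    tokens = re.findall(r"\w+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [stem(t) for t in tokens]


def bm25_rank(query, docs, k1=1.2, b=0.75):
    # Return (doc_index, score) pairs sorted by descending BM25 score.
    if not docs:
        return []
    doc_terms = [preprocess(d) for d in docs]
    n_docs = len(doc_terms)
    avgdl = sum(len(terms) for terms in doc_terms) / n_docs or 1.0
    df = Counter()  # document frequency of each term
    for terms in doc_terms:
        df.update(set(terms))
    query_terms = preprocess(query)
    scores = []
    for idx, terms in enumerate(doc_terms):
        tf = Counter(terms)
        score = 0.0
        for q in query_terms:
            if q not in tf:
                continue
            idf = math.log((n_docs - df[q] + 0.5) / (df[q] + 0.5) + 1.0)
            norm = tf[q] + k1 * (1 - b + b * len(terms) / avgdl)
            score += idf * tf[q] * (k1 + 1) / norm
        scores.append((idx, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = ["the car is red", "a bicycle of steel", "automobile racing and car repairs"]
ranked = bm25_rank("car", docs)
```

In this toy collection the third document ranks first for the query "car", because the synonym table folds "automobile" into "car" and raises its term frequency; the second document contains no query term and scores zero.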

Findings

01. Effective retrieval of Silte language documents demonstrated
02. System improves search relevance over baseline methods
03. Inclusion of language-specific preprocessing enhances performance

Abstract

The main aim of an information retrieval system is to extract appropriate information from an enormous collection of data based on the user's need. The basic concept is that when a user sends out a query, the system tries to generate a list of related documents, ranked according to their degree of relevance. Digital unstructured Silte text documents increase from time to time, and this growth makes finding and accessing the right information difficult. Thus, developing an information retrieval system for the Silte language allows searching and retrieving relevant documents that satisfy users' information needs. In this research, we design a probabilistic information retrieval system for the Silte language. The system has both an indexing part and a searching part. In these modules, different text operations such as…
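
The split into an indexing part and a searching part can be illustrated with a small inverted index. This is a sketch under assumptions: the excerpt does not say which index structure the paper uses, and the function names `build_index` and `search` are hypothetical. Text operations such as stemming are left out here to keep the two parts visible.

```python
from collections import defaultdict


def build_index(docs):
    """Indexing part: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index


def search(index, query):
    """Searching part: sorted ids of documents containing any query term."""
    hits = set()
    for term in query.lower().split():
        # .get avoids creating empty entries for unseen terms.
        hits.update(index.get(term, {}))
    return sorted(hits)


index = build_index(["alpha beta", "beta gamma beta", "delta"])
matches = search(index, "beta delta")
```

The stored term frequencies are what a ranking function such as BM25 would read at query time, so the candidate set returned by `search` can be scored without rescanning the documents.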

Taxonomy

Topics: Information Retrieval and Search Behavior · Natural Language Processing Techniques · Semantic Web and Ontologies