Cluster-Based Information Retrieval by using (K-means)- Hierarchical Parallel Genetic Algorithms Approach

Sarah Hussein Toman; Mohammed Hamzah Abed; Zinah Hussein Toman

arXiv:2008.00150·cs.AI·August 4, 2020

TL;DR

This paper introduces a novel (K-means)-Hierarchical Parallel Genetic Algorithms approach for cluster-based information retrieval, significantly improving precision and efficiency over traditional IR methods on multiple datasets.

Contribution

It combines K-means clustering with hybrid parallel genetic algorithms to enhance IR quality and processing speed, reducing irrelevant documents in large datasets.

Findings

01. Precision improved by up to 45% over IR-GA

02. Significant accuracy gains over classic IR methods

03. Effective clustering reduces irrelevant document retrieval

Abstract

Cluster-based information retrieval is an information retrieval (IR) technique that organizes web documents, extracts their features, and categorizes them according to their similarity. Unlike traditional approaches, cluster-based IR is fast at processing large document datasets. To improve the quality of retrieved documents, increase the efficiency of IR, and reduce the number of irrelevant documents returned for a user search, in this paper we propose a (K-means) - Hierarchical Parallel Genetic Algorithms Approach (HPGA) that combines the K-means clustering algorithm with a hybrid parallel genetic (PG) algorithm built from multi-deme and master/slave PG algorithms. K-means is used to cluster the population into k subpopulations; the clusters most relevant to the query are then processed in parallel by the two levels of genetic parallelism. Irrelevant documents are thus excluded from the subpopulations, which improves the quality of the results.…
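The clustering and cluster-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`kmeans`, `select_relevant_clusters`), the use of the first k documents as initial centroids, and the ranking of clusters by cosine similarity between the query and each centroid are all assumptions made for the example. The genetic-algorithm stage is omitted; in the approach described above, each returned subpopulation would be handed to the parallel GA levels.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(vectors, centroids):
    """Label each vector with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        for v in vectors
    ]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means.

    Assumption: the first k vectors seed the centroids, which keeps the
    example deterministic; real systems usually use a smarter initialization.
    """
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(vectors, centroids)


def select_relevant_clusters(query, centroids, labels, top_m):
    """Return document indices of the top_m clusters closest to the query.

    Assumption: relevance of a cluster is the cosine similarity between
    the query vector and the cluster centroid.
    """
    ranked = sorted(
        range(len(centroids)),
        key=lambda c: cosine(query, centroids[c]),
        reverse=True,
    )
    return [[i for i, lab in enumerate(labels) if lab == c] for c in ranked[:top_m]]


# Toy term-frequency vectors: documents 0-2 share one topic, 3-5 another.
docs = [[3, 0, 0], [2, 1, 0], [3, 1, 0], [0, 0, 3], [0, 1, 2], [0, 0, 2]]
centroids, labels = kmeans(docs, k=2)
subpopulations = select_relevant_clusters([1, 0, 0], centroids, labels, top_m=1)
print(subpopulations)
```

Only the documents in the selected clusters are carried forward, which is how the irrelevant documents are kept out of the subpopulations before any genetic search begins.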

Taxonomy

Methods: k-Means Clustering