Clustering Web Search Results For Effective Arabic Language Browsing

Issam Sahmoudi; Abdelmonaime Lachkar

arXiv:1305.2755·cs.IR·May 14, 2013

TL;DR

This paper explores applying the Suffix Tree Clustering algorithm to Arabic web search results, addressing language-specific challenges to improve clustering quality and user browsing experience.

Contribution

It introduces a novel scheme integrating STC with Arabic language properties, enhancing cluster coherence and label quality for Arabic web search results.

Findings

01 Successfully applied STC to Arabic snippets with improved cluster quality

02 Identified and addressed challenges due to Arabic language morphology

03 Produced coherent clusters with meaningful labels for Arabic search results

Abstract

Browsing search results is one of the major problems with traditional Web search engines, for English, European, and other languages in general, and for Arabic in particular. The process is time-consuming and the browsing style is unattractive. Organizing Web search results into clusters helps users browse them quickly. Traditional clustering techniques (data-centric clustering algorithms) are inadequate because they do not generate clusters with highly readable names or cluster labels. To solve this problem, description-centric algorithms such as the Suffix Tree Clustering (STC) algorithm have been introduced and used successfully and extensively, in different adapted versions, for English, European, and Chinese languages. However, to our knowledge, as of the writing of this paper the STC algorithm has never been applied…
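To make the description-centric idea concrete, here is a minimal sketch of the base-cluster step of STC: snippets that share a phrase are grouped, and the phrase itself serves as the readable label. It is an illustration under stated assumptions, not the paper's implementation. It enumerates word n-grams instead of building a real suffix tree, uses plain whitespace tokenization with no Arabic-specific preprocessing, and omits the merging of overlapping base clusters. The function names, the `max_len` and `min_docs` parameters, the 0.5 single-word weight, and the English toy snippets are all hypothetical.

```python
from collections import defaultdict


def base_clusters(snippets, max_len=4, min_docs=2):
    """Map each shared phrase (tuple of words) to the indices of snippets containing it.

    Assumption: n-gram enumeration stands in for the suffix tree that STC
    actually uses; the result is the same for short snippets, only slower.
    """
    phrase_docs = defaultdict(set)
    for idx, text in enumerate(snippets):
        words = text.lower().split()  # naive tokenization, no stemming
        for n in range(1, max_len + 1):
            for start in range(len(words) - n + 1):
                phrase_docs[tuple(words[start:start + n])].add(idx)
    # Keep only phrases shared by at least `min_docs` snippets.
    return {p: d for p, d in phrase_docs.items() if len(d) >= min_docs}


def score(phrase, docs):
    """Score = number of snippets times a phrase-length weight.

    Assumption: single words are penalized (weight 0.5) and the weight
    grows linearly with phrase length up to six words.
    """
    n = len(phrase)
    weight = 0.5 if n == 1 else min(n, 6)
    return len(docs) * weight


def top_labels(snippets, k=3):
    """Return the k best-scoring (label, snippet indices) pairs."""
    clusters = base_clusters(snippets)
    ranked = sorted(clusters.items(), key=lambda kv: (-score(*kv), kv[0]))
    return [(" ".join(p), sorted(d)) for p, d in ranked[:k]]


snippets = ["jaguar car price", "jaguar car review", "jaguar animal habitat"]
labels = top_labels(snippets, k=2)
```

In this toy run the two-word phrase "jaguar car" outranks the single word "jaguar" even though the latter covers more snippets, which is the sense in which the cluster label, rather than the raw grouping, drives the result.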
