C-DLSI: An Extended LSI Tailored for Federated Text Retrieval
Qijun Zhu, Dandan Li, Dik Lun Lee

TL;DR
This paper introduces C-DLSI, a federated text retrieval method that extends latent semantic indexing (LSI) with clustering to better characterize peers and improve retrieval accuracy in distributed web environments.
Contribution
The paper proposes C-DLSI, an extension of LSI that adds clustering for federated text retrieval, with the aim of improving peer characterization and retrieval precision.
Findings
C-DLSI outperforms existing federated retrieval methods.
Clustering improves the local LSI space representation.
Enhanced peer characterization leads to better retrieval accuracy.
Abstract
As the web expands in data volume and in geographical distribution, centralized search methods become inefficient, leading to increasing interest in cooperative information retrieval, e.g., federated text retrieval (FTR). Unlike existing centralized information retrieval (IR) methods, in which search is done on a logically centralized document collection, FTR is composed of a number of peers, each of which is a complete search engine by itself. To process a query, FTR must first identify promising peers that host the relevant documents and then retrieve the most relevant documents from the selected peers. Most of the existing methods only apply traditional IR techniques that treat each text collection as a single large document and utilize term matching to rank the collections. In this paper, we formalize the problem and identify the properties of…
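The two-step process in the abstract, together with the baseline it criticizes (treating each collection as one large document and ranking collections by term matching), can be sketched roughly as follows. This is an illustration of that baseline only, not of C-DLSI itself; the function names, the cosine scoring, the whitespace tokenizer, and the toy `PEERS` data are all assumptions made for the example.

```python
from collections import Counter
import math

def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def big_document(docs):
    # Baseline peer description: merge all of a peer's documents
    # into a single term-frequency vector.
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_peers(query, peers, k=1):
    # Step 1: rank peers by term matching against their merged document.
    q = Counter(tokenize(query))
    scored = [(cosine(q, big_document(docs)), name)
              for name, docs in peers.items()]
    scored.sort(key=lambda x: (-x[0], x[1]))
    return [name for _, name in scored[:k]]

def retrieve(query, peers, k_peers=1, k_docs=2):
    # Step 2: rank documents held by the selected peers only.
    q = Counter(tokenize(query))
    candidates = []
    for name in select_peers(query, peers, k_peers):
        for doc in peers[name]:
            candidates.append((cosine(q, Counter(tokenize(doc))), name, doc))
    candidates.sort(key=lambda x: (-x[0], x[1], x[2]))
    return [(name, doc) for _, name, doc in candidates[:k_docs]]

# Toy data, invented for illustration.
PEERS = {
    "peer_a": ["latent semantic indexing for text retrieval",
               "singular value decomposition of term matrices"],
    "peer_b": ["caching strategies for content delivery",
               "web proxy cache replacement"],
}
```

Calling `retrieve("text retrieval indexing", PEERS, k_peers=1, k_docs=1)` first selects one peer and then ranks only that peer's documents. Because the score is pure term overlap, a query for "cache" matches "cache" but not "caching", which is the kind of limitation that motivates a latent-space representation.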
Taxonomy
Topics: Caching and Content Delivery · Recommender Systems and Techniques · Data Management and Algorithms
