Tail-Tolerant Distributed Search

Naama Kraus; David Carmel; Idit Keidar

arXiv:1707.07426·cs.IR·July 25, 2017

TL;DR

This paper introduces strategies that mitigate the effects of tail latency in distributed search engines, improving search quality by optimizing shard selection and by replacing replication with repartitioning.

Contribution

The paper proposes rSmartRed, an optimal shard selection scheme for replicated indexes, and Repartition, a method that replaces replication with independent index instances. Both improve search quality.

Findings

01 rSmartRed improves recall over existing approaches.

02 Repartition outperforms replication in typical scenarios.

03 Empirical validation with real-world datasets confirms effectiveness.

Abstract

Today's search engines process billions of online user queries a day over huge collections of data. In order to scale, they distribute query processing among many nodes, where each node holds and searches over a subset of the index called a shard. Responses from some nodes occasionally fail to arrive within a reasonable time interval for various reasons, such as high server load and network congestion. Search engines typically need to respond in a timely manner, and therefore skip such tail-latency responses, which degrades search quality. In this paper, we tackle response misses due to high tail latencies with the goal of maximizing search quality. Search providers today use redundancy in the form of Replication to mitigate response misses, by constructing multiple copies of each shard and searching all replicas. This approach is not ideal, as it wastes resources on…
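To make the setting concrete, the following is a minimal sketch of the deadline-based behavior the abstract describes: a broker keeps the results of a shard only if at least one of its replicas answers within the time budget. It is an illustration under stated assumptions, not the paper's method. The function names, the latency values, and the assumption that replicas miss independently with the same probability are all hypothetical.

```python
from typing import Dict, List, Set


def shards_answered(latencies_ms: Dict[str, List[float]],
                    deadline_ms: float) -> Set[str]:
    """Return the shards with at least one replica response by the deadline.

    latencies_ms maps a (hypothetical) shard id to the response latency of
    each of its replicas. Responses slower than deadline_ms are skipped.
    """
    return {
        shard
        for shard, replica_latencies in latencies_ms.items()
        if any(latency <= deadline_ms for latency in replica_latencies)
    }


def shard_miss_probability(p_miss: float, replicas: int) -> float:
    """Probability that every replica of a shard misses the deadline.

    Assumes (for illustration only) that replicas miss independently,
    each with probability p_miss.
    """
    return p_miss ** replicas


# Hypothetical latencies: s0 and s1 have two replicas, s2 has one.
observed = {"s0": [12.0, 300.0], "s1": [250.0, 400.0], "s2": [40.0]}
covered = shards_answered(observed, deadline_ms=100.0)
recall_proxy = len(covered) / len(observed)  # fraction of shards answered
```

Here `s1` is lost because both of its replicas exceed the 100 ms budget, which is the kind of response miss that degrades search quality. Under the independence assumption, adding replicas lowers the per-shard miss probability geometrically, at the cost of searching every copy.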

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Caching and Content Delivery · Web Data Mining and Analysis