Caching Historical Embeddings in Conversational Search

Ophir Frieder; Ida Mele; Cristina Ioana Muntean; Franco Maria Nardini; Raffaele Perego; Nicola Tonellotto

arXiv:2211.14155·cs.IR·November 28, 2022

Open Access

TL;DR

This paper introduces a client-side document embedding cache for conversational search. It exploits the temporal locality of the documents retrieved for successive queries in a conversation to improve response times and reduce backend load, reaching a cache hit rate of up to 75%.

Contribution

The paper proposes a document embedding cache built on an efficient metric index, improving conversational search responsiveness without sacrificing answer quality.

Findings

01. Achieved a cache hit rate of up to 75% in experiments.

02. Significantly improved system responsiveness.

03. Reduced backend query load.

Abstract

Rapid response, namely low latency, is fundamental in search applications; it is particularly so in interactive search sessions, such as those encountered in conversational settings. An observation with a potential to reduce latency asserts that conversational queries exhibit a temporal locality in the lists of documents retrieved. Motivated by this observation, we propose and evaluate a client-side document embedding cache, improving the responsiveness of conversational search systems. By leveraging state-of-the-art dense retrieval models to abstract document and query semantics, we cache the embeddings of documents retrieved for a topic introduced in the conversation, as they are likely relevant to successive queries. Our document embedding cache implements an efficient metric index, answering nearest-neighbor similarity queries by estimating the approximate result sets returned. We…
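The caching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `EmbeddingCache`, its methods, and the brute-force scan standing in for a metric index are all hypothetical, and the coverage test is one possible criterion assumed here. The test relies only on the triangle inequality. If the backend returned every document within radius r of an earlier query, and a new query lies at distance d from that earlier query, then every document within r - d of the new query is already cached.

```python
import numpy as np


class EmbeddingCache:
    """Toy client-side cache of document embeddings (hypothetical sketch)."""

    def __init__(self, dim):
        self.doc_ids = []
        self.vectors = np.empty((0, dim), dtype=np.float64)
        # One (query embedding, radius) pair per backend fetch.
        self.centers = []

    def add(self, query_vec, doc_ids, doc_vecs):
        """Store the documents the backend returned for query_vec."""
        query_vec = np.asarray(query_vec, dtype=np.float64)
        doc_vecs = np.asarray(doc_vecs, dtype=np.float64)
        # Radius = distance from the query to its farthest returned document.
        radius = float(np.linalg.norm(doc_vecs - query_vec, axis=1).max())
        self.centers.append((query_vec, radius))
        # Skip documents that are already cached.
        known = set(self.doc_ids)
        keep = [i for i, d in enumerate(doc_ids) if d not in known]
        self.doc_ids.extend(doc_ids[i] for i in keep)
        self.vectors = np.vstack([self.vectors, doc_vecs[np.array(keep, dtype=int)]])

    def search(self, query_vec, k):
        """Return (top-k cached doc ids, covered).

        covered is True when the triangle inequality guarantees that the
        cached top-k equals the top-k the backend would have returned.
        """
        query_vec = np.asarray(query_vec, dtype=np.float64)
        if len(self.doc_ids) < k:
            return [], False
        # Brute-force scan; a real cache would use a metric index here.
        dists = np.linalg.norm(self.vectors - query_vec, axis=1)
        order = np.argsort(dists)[:k]
        kth = float(dists[order[-1]])
        covered = any(
            kth <= radius - float(np.linalg.norm(query_vec - center))
            for center, radius in self.centers
        )
        return [self.doc_ids[i] for i in order], covered
```

In this sketch, a client would call `search` first and contact the backend only when `covered` is false, then pass the backend's results to `add`. A production system would presumably relax the exactness guarantee in exchange for a higher hit rate, which this toy version does not attempt.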

Taxonomy

Topics: Caching and Content Delivery · Advanced Image and Video Retrieval Techniques · Recommender Systems and Techniques