ConvoCache: Smart Re-Use of Chatbot Responses

Conor Atkins; Ian Wood; Mohamed Ali Kaafar; Hassan Asghar; Nardine Basta; Michal Kepkowski

arXiv:2406.18133 · cs.CL · September 26, 2024

TL;DR

ConvoCache is a caching system that significantly reduces response latency and costs in chatbots by reusing semantically similar past responses, with minimal impact on coherence.

Contribution

This paper introduces ConvoCache, a novel caching approach that reuses past chatbot responses based on semantic similarity to improve efficiency.

Findings

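01

Answers 89% of prompts from the cache at a 90% UniEval coherence threshold, with an average latency of 214ms

02

Prefetching with 80% of a request yields a 63% cache hit rate, at the cost of lower overall coherence

03

Reduces generative AI usage, and hence chatbot cost, by up to 89%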

Abstract

We present ConvoCache, a conversational caching system that solves the problem of slow and expensive generative AI models in spoken chatbots. ConvoCache finds a semantically similar prompt in the past and reuses the response. In this paper we evaluate ConvoCache on the DailyDialog dataset. We find that ConvoCache can apply a UniEval coherence threshold of 90% and respond to 89% of prompts using the cache with an average latency of 214ms, replacing LLM and voice synthesis that can take over 1s. To further reduce latency we test prefetching and find limited usefulness. Prefetching with 80% of a request leads to a 63% hit rate, and a drop in overall coherence. ConvoCache can be used with any chatbot to reduce costs by reducing usage of generative AI by up to 89%.
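The lookup-then-fallback idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real sentence encoder, the optional `score` callback stands in for a coherence metric such as UniEval, and the names `ResponseCache`, `respond`, and `generate` are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumed stand-in for a sentence encoder."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class ResponseCache:
    threshold: float = 0.9  # hypothetical acceptance threshold
    entries: list = field(default_factory=list)  # (prompt vector, response)

    def add(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))

    def lookup(self, prompt: str, score=None):
        """Return the first cached response that passes the threshold, else None.

        Candidates are tried in order of prompt similarity. `score` is an
        optional (prompt, response) -> float quality function; when it is
        omitted, prompt similarity itself is used as the quality signal.
        """
        query = embed(prompt)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query, e[0]), reverse=True
        )
        for vector, response in ranked:
            quality = score(prompt, response) if score else cosine(query, vector)
            if quality >= self.threshold:
                return response
        return None


def respond(prompt: str, cache: ResponseCache, generate):
    """Serve from the cache when possible; otherwise generate and store."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached, True  # cache hit: no generative call made
    response = generate(prompt)  # slow, costly path
    cache.add(prompt, response)
    return response, False
```

In this sketch every cache hit avoids one call to `generate`, so the fraction of prompts served from the cache bounds the reduction in generative AI usage, which is the sense in which the abstract ties the 89% cache response rate to a cost reduction of up to 89%.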

Code & Models

Repositories

RoshanStacker/ConvoCache (official)

Taxonomy

Topics: AI in Service Interactions · Spam and Phishing Detection · Blood donation and transfusion practices