GutenTag: A Multi-Term Caching Optimized Tag Query Processor for Key-Value Based NoSQL Storage Systems
Christian von der Weth, Anwitaman Datta

TL;DR
GutenTag introduces a multi-term caching and indexing approach for NoSQL systems, significantly reducing bandwidth consumption for complex queries by leveraging popular term combinations and gateway node caching.
Contribution
The paper proposes an architecture that combines multi-term caching with index optimization driven by query popularity, making multi-term searches on NoSQL systems more efficient.
Findings
Reduces bandwidth consumption for multi-term queries
Increases gateway node load marginally
Improves query processing performance in distributed systems
Abstract
NoSQL systems are increasingly deployed as back-end infrastructure for large-scale distributed online platforms like Google, Amazon or Facebook. Their applicability results from the fact that most services of online platforms access the stored data objects via their primary key. However, NoSQL systems do not efficiently support services that refer to more than one data object, e.g. the term-based search for data objects. To address this issue, we propose an architecture based on an inverted index on top of a NoSQL system. For queries comprising more than one term, distributed indices yield limited performance in large distributed systems. We propose two extensions to cope with this challenge. Firstly, we store index entries not only for single terms but also for a selected set of term combinations, depending on their popularity derived from a query history. Secondly, we additionally cache…
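The first extension described in the abstract (index entries for popular term combinations in addition to single terms) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the authors' implementation: a plain dict stands in for the key-value store, and the `+`-joined key encoding, the `min_count` popularity threshold, and the greedy choice of combinations at query time are all hypothetical. The caching extension is not modelled.

```python
from collections import Counter
from itertools import combinations


def combo_key(terms):
    """Canonical store key for a term set (assumed encoding: sorted, '+'-joined)."""
    return "+".join(sorted(terms))


def popular_combinations(query_history, min_count=2, max_size=3):
    """Return term combinations (size >= 2) seen at least min_count times in past queries."""
    counts = Counter()
    for query in query_history:
        terms = sorted(set(query))
        for size in range(2, min(max_size, len(terms)) + 1):
            for combo in combinations(terms, size):
                counts[combo] += 1
    return {frozenset(c) for c, n in counts.items() if n >= min_count}


def build_index(documents, popular):
    """Build the inverted index in a dict standing in for the key-value store."""
    store = {}
    for doc_id, tags in documents.items():
        tags = set(tags)
        # Single-term entries: one posting set per tag.
        for tag in tags:
            store.setdefault(combo_key([tag]), set()).add(doc_id)
        # Multi-term entries: only for combinations deemed popular.
        for combo in popular:
            if combo <= tags:
                store.setdefault(combo_key(combo), set()).add(doc_id)
    return store


def answer_query(store, popular, query_terms):
    """Cover the query greedily with the largest indexed combinations, then intersect."""
    remaining = set(query_terms)
    keys = []
    # Deterministic order: larger combinations first, ties broken alphabetically.
    candidates = sorted(
        (c for c in popular if c <= remaining),
        key=lambda c: (-len(c), sorted(c)),
    )
    for combo in candidates:
        if combo <= remaining:
            keys.append(combo_key(combo))
            remaining -= combo
    # Terms not covered by any combination fall back to single-term entries.
    keys.extend(combo_key([t]) for t in sorted(remaining))
    postings = [store.get(k, set()) for k in keys]
    result = set.intersection(*postings) if postings else set()
    return result, keys
```

In this sketch, a three-term query whose first two terms form a popular combination is answered with two lookups instead of three, and the combined entry's posting set is never larger than either single-term set. That is the intuition behind the bandwidth savings the paper reports, although the actual selection and lookup strategies may differ.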
Taxonomy
Topics: Caching and Content Delivery · Cloud Computing and Resource Management · Distributed systems and fault tolerance
