SMURF: Efficient and Scalable Metadata Access for Distributed Applications

Bing Zhang; Tevfik Kosar

arXiv:2105.14157 · cs.DC · June 1, 2021

Open Access

TL;DR

SMURF is a scalable system for distributed metadata access in cloud environments. It combines pipelining, caching, and prefetching strategies, informed by an analysis of real-world trace logs, to reduce metadata fetch latency.

Contribution

The paper introduces SMURF, a novel metadata access framework that leverages semantic locality and continuum caching to improve efficiency and scalability in distributed cloud applications.

Findings

01. Achieved 90% accuracy in prefetch prediction.

02. Reduced average fetch latency by 50%.

03. Improved cache hit rate through semantic locality-based prefetching.
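
To make the idea of locality-driven prefetching concrete, here is a minimal sketch, not SMURF's actual design. It assumes that "semantic locality" can be approximated by directory co-location: when the metadata for one path is missed, the metadata for its siblings in the same directory is prefetched into an LRU cache. The class `PrefetchingMetadataCache`, the callbacks `fetch` and `list_siblings`, and the sample paths are all hypothetical names introduced for illustration.

```python
from collections import OrderedDict
import posixpath


class PrefetchingMetadataCache:
    """LRU cache that prefetches sibling entries on a miss (hypothetical design)."""

    def __init__(self, fetch, list_siblings, capacity=128):
        self.fetch = fetch                  # path -> metadata; stands in for a remote call
        self.list_siblings = list_siblings  # directory -> paths in that directory
        self.capacity = capacity
        self.entries = OrderedDict()        # insertion order doubles as recency order
        self.hits = 0
        self.misses = 0

    def _put(self, path, meta):
        self.entries[path] = meta
        self.entries.move_to_end(path)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry

    def get(self, path):
        if path in self.entries:
            self.hits += 1
            self.entries.move_to_end(path)
            return self.entries[path]
        self.misses += 1
        meta = self.fetch(path)
        # Assumption: entries in the same directory are likely to be requested next.
        for sibling in self.list_siblings(posixpath.dirname(path)):
            if sibling != path and sibling not in self.entries:
                self._put(sibling, self.fetch(sibling))
        # Insert the requested entry last so it is the most recently used.
        self._put(path, meta)
        return meta


# Toy "remote" metadata store used only for this demonstration.
remote = {
    "/data/run1/a.dat": {"size": 10},
    "/data/run1/b.dat": {"size": 20},
    "/data/run2/c.dat": {"size": 30},
}


def fetch(path):
    return remote[path]


def list_siblings(directory):
    return [p for p in remote if posixpath.dirname(p) == directory]


cache = PrefetchingMetadataCache(fetch, list_siblings)
first = cache.get("/data/run1/a.dat")   # miss; b.dat is prefetched alongside
second = cache.get("/data/run1/b.dat")  # served from the cache
print(cache.hits, cache.misses)
```

In this toy run the second request is a hit only because the first miss pulled in its sibling. A real system would issue the prefetches asynchronously and would need a predictor that decides which related entries are worth fetching, which is where the accuracy figure above matters.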

Abstract

In parallel with big data processing and analysis dominating the usage of distributed and cloud infrastructures, the demand for distributed metadata access and transfer has increased. In many application domains, the volume of data generated exceeds petabytes, while the corresponding metadata amounts to terabytes or even more. This paper proposes a novel solution for efficient and scalable metadata access for distributed applications across wide-area networks, dubbed SMURF. Our solution combines novel pipelining and concurrent transfer mechanisms with reliability, provides distributed continuum caching and prefetching strategies to sidestep fetching latency, and achieves scalable and high-performance metadata fetch/prefetch services in the cloud. We also study the phenomenon of semantic locality in real trace logs, which is not well utilized in metadata access prediction. We implement a…

Taxonomy

Topics: Caching and Content Delivery · Cloud Computing and Resource Management · IoT and Edge/Fog Computing