SLOFetch: Compressed-Hierarchical Instruction Prefetching for Cloud Microservices
Zerui Bao, Di Zhu, Liu Jiang, Shiqi Sheng, Ziwei Wang, Haoyun Zhang

TL;DR
This paper introduces SLOFetch, an instruction prefetcher for cloud microservices that builds on the Entangling Instruction Prefetcher (EIP). It combines compressed destination entries, hierarchical metadata storage, and an online machine-learning controller to reduce frontend stalls while keeping less state on chip.
Contribution
It presents a novel prefetching design that compresses destination entries, uses hierarchical metadata storage, and incorporates online ML for profitability scoring, tailored for cloud microservice workloads.
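As a rough illustration of online profitability scoring, the sketch below pairs a tiny logistic scorer with an epsilon-greedy bandit that picks the issue threshold. Everything here is an assumption made for illustration and is not taken from the paper: the class name `ProfitabilityController`, the feature vector, the candidate thresholds, the learning rate, and the reward signal are all hypothetical.

```python
import math
import random


class ProfitabilityController:
    """Hypothetical sketch: logistic scorer plus epsilon-greedy threshold bandit."""

    def __init__(self, n_features, thresholds=(0.3, 0.5, 0.7),
                 lr=0.1, epsilon=0.1, seed=0):
        self.w = [0.0] * n_features          # scorer weights (assumed linear model)
        self.b = 0.0                         # scorer bias
        self.lr = lr                         # SGD step size (assumed)
        self.thresholds = thresholds         # candidate thresholds = bandit arms
        self.value = [0.0] * len(thresholds) # running mean reward per arm
        self.count = [0] * len(thresholds)   # pulls per arm
        self.epsilon = epsilon               # exploration probability
        self.rng = random.Random(seed)
        self.arm = 0                         # arm used for the latest decision

    def score(self, x):
        """Estimated probability that prefetching under context x is useful."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def decide(self, x):
        """Pick a threshold (explore or exploit), then compare the score to it."""
        if self.rng.random() < self.epsilon:
            self.arm = self.rng.randrange(len(self.thresholds))
        else:
            self.arm = max(range(len(self.thresholds)),
                           key=lambda i: self.value[i])
        return self.score(x) >= self.thresholds[self.arm]

    def feedback(self, x, useful, reward):
        """Update the scorer with the observed outcome and the arm with its reward."""
        err = (1.0 if useful else 0.0) - self.score(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * err
        self.count[self.arm] += 1
        self.value[self.arm] += (reward - self.value[self.arm]) / self.count[self.arm]
```

A caller would invoke `decide(x)` before issuing a prefetch and `feedback(x, useful, reward)` once it is known whether the target line was demanded. How the reward is defined (for example, hits gained minus bandwidth wasted) is left open here.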
Findings
Achieves speedups similar to EIP with less on-chip state
Improves efficiency for networked cloud services
Effectively balances prefetching accuracy and resource usage
Abstract
Large-scale networked services rely on deep software stacks and microservice orchestration, which increase instruction footprints and create frontend stalls that inflate tail latency and energy. We revisit instruction prefetching for these cloud workloads and present a design that aligns with SLO-driven and self-optimizing systems. Building on the Entangling Instruction Prefetcher (EIP), we introduce a Compressed Entry that captures up to eight destinations around a base using 36 bits by exploiting spatial clustering, and a Hierarchical Metadata Storage scheme that keeps only L1-resident and frequently queried entries on chip while virtualizing bulk metadata into lower levels. We further add a lightweight Online ML Controller that scores prefetch profitability using context features and a bandit-adjusted threshold. On data center applications, our approach preserves EIP-like speedups…
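To make the Compressed Entry idea concrete, here is a minimal sketch of one way to pack up to eight spatially clustered destinations into 36 bits. The abstract does not state how the 36 bits are divided, so the split used here (a 28-bit base line address plus an 8-bit presence mask over a window of eight consecutive cache lines) is an assumption, as are the function names `compress` and `decompress`.

```python
BASE_BITS = 28  # assumed width of the base line-address field (hypothetical)
MASK_BITS = 8   # assumed: one presence bit per line in an 8-line window


def compress(destinations):
    """Pack cache-line addresses near a common base into one 36-bit integer.

    Returns (entry, leftovers); leftovers are destinations that fall outside
    the 8-line window and would need a separate entry.
    """
    if not destinations:
        raise ValueError("need at least one destination")
    base = min(destinations)  # sketch choice: window starts at the lowest line
    if base >= 1 << BASE_BITS:
        raise ValueError("base does not fit in the assumed 28-bit field")
    mask = 0
    leftovers = []
    for dest in destinations:
        offset = dest - base
        if offset < MASK_BITS:
            mask |= 1 << offset   # mark line base+offset as a destination
        else:
            leftovers.append(dest)
    return (base << MASK_BITS) | mask, leftovers


def decompress(entry):
    """Recover the destination line addresses encoded in a packed entry."""
    base = entry >> MASK_BITS
    mask = entry & ((1 << MASK_BITS) - 1)
    return [base + i for i in range(MASK_BITS) if (mask >> i) & 1]
```

For example, `compress([0x1000, 0x1001, 0x1003, 0x1007, 0x1040])` packs the first four lines into a single entry and returns `0x1040` as a leftover, because it lies outside the window. Anchoring the window at the lowest destination always sets bit 0, which a real design would likely avoid; it is kept here only for simplicity.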
Taxonomy
Topics: Cloud Computing and Resource Management · Software System Performance and Reliability · Parallel Computing and Optimization Techniques
