NetClone: Fast, Scalable, and Dynamic Request Cloning for Microsecond-Scale RPCs
Gyuyeong Kim

TL;DR
NetClone is an in-network request cloning system that makes cloning decisions dynamically inside programmable switches, reducing tail latency for microsecond-scale RPCs.
Contribution
It introduces a fast, scalable, and dynamic request cloning method using programmable switch ASICs, overcoming limitations of previous static or slow approaches.
Findings
Reduces tail latency in microsecond RPCs
Works effectively with real-world workloads
Integrates with in-network schedulers like RackSched
Abstract
Spawning duplicate requests, called cloning, is a powerful technique to reduce tail latency by masking service-time variability. However, traditional client-based cloning is static and harmful to performance under high load, while a recent coordinator-based approach is slow and not scalable. Both approaches are insufficient to serve modern microsecond-scale Remote Procedure Calls (RPCs). To this end, we present NetClone, a request cloning system that performs cloning decisions dynamically within nanoseconds at scale. Rather than the client or the coordinator, NetClone performs request cloning in the network switch by leveraging the capability of programmable switch ASICs. Specifically, NetClone replicates requests based on server states and blocks redundant responses using request fingerprints in the switch data plane. To realize the idea while satisfying the strict hardware…
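The two switch-side steps named in the abstract (cloning based on server state, then filtering redundant responses by request fingerprint) can be illustrated with a small software sketch. This is only a toy model in Python, not the paper's data-plane implementation: the class `ToySwitch`, its fields, the "clone only when both candidate servers report an empty queue" policy, and the use of the request ID as the fingerprint are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToySwitch:
    """Software stand-in for the switch logic; every name here is hypothetical."""
    queue_len: dict                          # server id -> last reported queue length (assumed state)
    seen: set = field(default_factory=set)   # fingerprints of requests already answered
    next_server: int = 0                     # round-robin cursor for picking candidates

    def on_request(self, req_id):
        """Return the list of servers that should receive this request."""
        servers = sorted(self.queue_len)
        first = servers[self.next_server % len(servers)]
        second = servers[(self.next_server + 1) % len(servers)]
        self.next_server += 1
        # Assumed policy: clone only when both candidate servers look idle,
        # so that clones do not add load to servers that are already busy.
        if first != second and self.queue_len[first] == 0 and self.queue_len[second] == 0:
            return [first, second]
        return [first]

    def on_response(self, req_id, server, queue_len):
        """Record the server's state; return True if the response is forwarded."""
        self.queue_len[server] = queue_len   # state piggybacked on the response (assumption)
        if req_id in self.seen:
            return False                     # a response for this request was already forwarded
        self.seen.add(req_id)
        return True


sw = ToySwitch(queue_len={0: 0, 1: 0, 2: 3})
targets = sw.on_request(req_id=1)            # both candidates idle, so the request is cloned
forwarded = [sw.on_response(1, s, 0) for s in targets]
```

In this sketch the first response for a request is forwarded and the later one is dropped. A real switch has strictly bounded memory, so an unbounded Python `set` only stands in for whatever fixed-size structure the hardware would use.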
