RepFlow: Minimizing Flow Completion Times with Replicated Flows in Data Centers
Hong Xu, Baochun Li

TL;DR
RepFlow is a practical method that replicates short data center flows to significantly reduce flow completion times without requiring hardware or protocol modifications.
Contribution
It introduces a simple replication technique for short flows that improves FCT by exploiting path diversity under ECMP, with no changes to switches or host kernels.
Findings
50-70% reduction in flow completion times
Significant improvement in both mean and tail FCTs
Near-optimal FCT performance with DCTCP
Abstract
Short TCP flows that are critical for many interactive applications in data centers are plagued by large flows and head-of-line blocking in switches. Hash-based load balancing schemes such as ECMP aggravate the matter and result in long-tailed flow completion times (FCT). Previous work on reducing FCT usually requires custom switch hardware and/or protocol changes. We propose RepFlow, a simple yet practically effective approach that replicates each short flow to reduce the completion times, without any change to switches or host kernels. With ECMP the original and replicated flows traverse distinct paths with different congestion levels, thereby reducing the probability of having long queueing delay. We develop a simple analytical model to demonstrate the potential improvement of RepFlow. Extensive NS-3 simulations and a Mininet implementation show that RepFlow provides 50% to 70% speedup…
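The intuition behind replication can be illustrated with a toy calculation. The sketch below is not the paper's analytical model. It assumes, purely for illustration, that each path is congested independently with probability `p_congested`, that a congested path adds a fixed `queue_delay` to a fixed `base_fct`, and that the receiver uses whichever copy finishes first. All function and parameter names are hypothetical. The sketch also ignores the extra load that replicas add to the network.

```python
import random


def p_slow_replicated(p_congested: float, copies: int = 2) -> float:
    """Probability that every copy lands on a congested path.

    Assumes paths are congested independently (an illustrative assumption).
    """
    return p_congested ** copies


def simulate(p_congested: float, base_fct: float, queue_delay: float,
             trials: int, copies: int = 2, seed: int = 0):
    """Return (mean FCT of a single flow, mean FCT with replication).

    Toy model: each copy independently sees either no queueing or a fixed
    extra queue_delay; the replicated FCT is the fastest copy's FCT.
    """
    rng = random.Random(seed)
    total_single = 0.0
    total_replicated = 0.0
    for _ in range(trials):
        fcts = [
            base_fct + (queue_delay if rng.random() < p_congested else 0.0)
            for _ in range(copies)
        ]
        total_single += fcts[0]          # the original flow alone
        total_replicated += min(fcts)    # first copy to finish wins
    return total_single / trials, total_replicated / trials


if __name__ == "__main__":
    single, replicated = simulate(p_congested=0.2, base_fct=1.0,
                                  queue_delay=10.0, trials=10_000)
    print(f"mean FCT, single flow:     {single:.3f}")
    print(f"mean FCT, replicated flow: {replicated:.3f}")
    print(f"P(all copies slow):        {p_slow_replicated(0.2):.3f}")
```

Under these assumptions, a short flow is delayed only when both copies hit a congested path, so the chance of a long queueing delay falls from `p_congested` to `p_congested ** 2` with two copies.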
