T-RACKs: A Faster Recovery Mechanism for TCP in Data Center Networks
Ahmed M. Abdelmoniem, Brahim Bensaou

TL;DR
T-RACKs is a loss recovery mechanism that shortens flow completion times for small TCP flows in data centers. It mitigates the effects of packet loss and long retransmission timeouts without modifying TCP itself.
Contribution
This paper introduces T-RACKs, a simple, effective loss recovery mechanism that improves TCP performance in data centers without requiring changes to TCP or the application layer.
Findings
T-RACKs reduces flow completion times for small TCP flows.
T-RACKs improves TCP performance under heavy packet loss conditions.
T-RACKs can be implemented as a software shim or in hardware with significant performance gains.
Abstract
Cloud interactive data-driven applications generate swarms of small TCP flows that compete for the small buffer space in data-center switches. Such applications require a short flow completion time (FCT) to perform their jobs effectively. However, TCP is oblivious to the composite nature of application data and artificially inflates the FCT of such flows by several orders of magnitude. This is due to TCP's Internet-centric design that fixes the retransmission timeout (RTO) to be at least hundreds of milliseconds. To better understand this problem, in this paper, we use empirical measurements in a small testbed to study, at a microscopic level, the effects of various types of packet losses on TCP's performance. In particular, we single out packet losses that impact the tail end of small flows, as well as bursty losses, that span a significant fraction of the small congestion window of…
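To give a rough sense of the inflation the abstract describes, the sketch below compares the completion time of a small flow with and without a single retransmission timeout. All numbers are illustrative assumptions and do not come from the paper: a 100 µs data-center round-trip time, a flow that needs 3 round trips, and a 200 ms minimum RTO (a common default that fits the "hundreds of milliseconds" mentioned above). The function name `fct_inflation` is hypothetical, and exponential backoff on repeated timeouts is ignored.

```python
def fct_inflation(rtt_s, rounds, rto_min_s, timeouts=1):
    """Return (ideal FCT, FCT with timeouts, inflation factor), all times in seconds.

    Simplified model: each timeout adds one full minimum RTO of idle waiting
    on top of the loss-free transfer time. Backoff is not modelled.
    """
    ideal = rounds * rtt_s
    with_timeouts = ideal + timeouts * rto_min_s
    return ideal, with_timeouts, with_timeouts / ideal


# Assumed values for illustration only (not measurements from the paper).
ideal, slow, factor = fct_inflation(rtt_s=100e-6, rounds=3, rto_min_s=0.2)

print(f"ideal FCT:        {ideal * 1e3:.2f} ms")
print(f"with one timeout: {slow * 1e3:.2f} ms")
print(f"inflation:        {factor:.0f}x")
```

Under these assumptions, a single timeout dominates the completion time because the minimum RTO is thousands of times larger than the round-trip time, which is why losses that can only be recovered by timeout are so costly for small flows.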
