Credence: Augmenting Datacenter Switch Buffer Sharing with ML Predictions
Vamsi Addanki, Maciej Pacut, Stefan Schmid

TL;DR
Credence is a machine-learning-augmented buffer sharing algorithm for datacenter switches. It uses predictions of future packet arrivals to let drop-tail buffers approach the performance of push-out algorithms, improving throughput and flow completion times.
Contribution
This paper introduces Credence, which the authors describe as the first research attempt to augment drop-tail buffer sharing with ML predictions, aiming for push-out-level performance without hardware support for push-out operations.
Findings
Credence achieves near-optimal performance with perfect predictions.
Credence improves throughput by 1.5x over traditional approaches.
Flow completion times improve by up to 95% with ML predictions.
Abstract
Packet buffers in datacenter switches are shared across all the switch ports in order to improve the overall throughput. The trend of shrinking buffer sizes in datacenter switches makes buffer sharing extremely challenging and a critical performance issue. Literature suggests that push-out buffer sharing algorithms have significantly better performance guarantees compared to drop-tail algorithms. Unfortunately, switches are unable to benefit from these algorithms due to lack of support for push-out operations in hardware. Our key observation is that drop-tail buffers can emulate push-out buffers if the future packet arrivals are known ahead of time. This suggests that augmenting drop-tail algorithms with predictions about the future arrivals has the potential to significantly improve performance. This paper is the first research attempt in this direction. We propose Credence, a…
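The key observation in the abstract (a drop-tail buffer can emulate a push-out buffer if future arrivals are known) can be illustrated with a toy simulation. The sketch below is not Credence itself and uses no ML. It assumes a slotted model in which packets first arrive and then each non-empty queue transmits one packet per slot, uses Longest Queue Drop (LQD) as the push-out policy and Complete Sharing as the plain drop-tail policy, and all function names and the arrival trace are hypothetical choices made for illustration.

```python
from collections import deque

def run_lqd(arrivals, num_ports, buffer_size):
    """Push-out LQD: always enqueue, then evict the tail of the longest
    queue if the shared buffer overflows. Returns (transmitted, dropped_ids)."""
    queues = [deque() for _ in range(num_ports)]
    transmitted, dropped, next_id = 0, set(), 0
    for slot in arrivals:
        for port in slot:                      # arrival phase
            queues[port].append(next_id)
            next_id += 1
            if sum(len(q) for q in queues) > buffer_size:
                longest = max(queues, key=len)
                dropped.add(longest.pop())     # push out from the tail
        for q in queues:                       # departure phase
            if q:
                q.popleft()
                transmitted += 1
    return transmitted, dropped

def run_drop_tail(arrivals, num_ports, buffer_size, doomed=frozenset()):
    """Drop-tail: a packet can only be rejected on arrival.
    With an empty `doomed` set this is Complete Sharing; with LQD's dropped
    ids it acts as an oracle that knows which packets LQD would evict."""
    queues = [0] * num_ports
    transmitted, next_id = 0, 0
    for slot in arrivals:
        for port in slot:                      # arrival phase
            pkt, next_id = next_id, next_id + 1
            if pkt in doomed or sum(queues) >= buffer_size:
                continue                       # drop on arrival
            queues[port] += 1
        for p in range(num_ports):             # departure phase
            if queues[p] > 0:
                queues[p] -= 1
                transmitted += 1
    return transmitted

# Hypothetical trace: a burst to port 0 fills the buffer, then both ports
# receive one packet per slot; trailing empty slots let the queues drain.
trace = [[0, 0, 0, 0], [0, 1], [0, 1], [0, 1], [], [], []]
lqd_tx, lqd_dropped = run_lqd(trace, num_ports=2, buffer_size=4)
cs_tx = run_drop_tail(trace, num_ports=2, buffer_size=4)
oracle_tx = run_drop_tail(trace, num_ports=2, buffer_size=4, doomed=lqd_dropped)
print(cs_tx, lqd_tx, oracle_tx)
```

On this trace, Complete Sharing lets the burst on port 0 crowd out port 1, while the oracle drop-tail run forwards as many packets as LQD by rejecting on arrival the packets LQD would later push out. In practice such an oracle is unavailable, which is the gap the abstract proposes to fill with predictions.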
Taxonomy
Topics: Interconnection Networks and Systems · Advanced Memory and Neural Computing · Cloud Computing and Resource Management
