LB Scalability: Achieving the Right Balance Between Being Stateful and Stateless
Reuven Cohen, Matty Kadosh, Alan Lo, Qasem Sayah

TL;DR
This paper introduces Prism, a scalable Layer-4 load balancer implemented on programmable switch ASICs, capable of handling millions of connections per second with per connection consistency, without maintaining per-connection state in hardware.
Contribution
Prism is the first load balancer to process millions of connections per second with per connection consistency using hardware forwarding without per-connection state.
Findings
Prism can process up to 100 million simultaneous connections.
Prism supports more than one server pool update per second.
Prism achieves high throughput and low latency in hardware.
Abstract
A high-performance Layer-4 load balancer (LB) is one of the most important components of a cloud service infrastructure. Such an LB uses network and transport layer information to decide how to distribute client requests across a group of servers. A crucial requirement for a stateful LB is per connection consistency (PCC); namely, that all the packets of the same connection are forwarded to the same server, as long as the server is alive, even if the pool of servers or the assignment function changes. The challenge is in designing a high-throughput, low-latency solution that is also scalable. This paper proposes a highly scalable LB, called Prism, implemented using a programmable switch ASIC. As far as we know, Prism is the first reported LB that can process millions of connections per second and hundreds of millions of connections in total, while ensuring PCC. This is due to the…
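To make the PCC requirement concrete, the following is a minimal Python sketch of the property itself, not of Prism's mechanism, which the truncated abstract does not describe. It contrasts a purely stateless hash-mod assignment, which can remap live connections when the pool changes, with a naive connection table that pins each connection to its server. All names here (`pick_server`, `StatefulLB`, `count_pcc_violations`, the server labels) are hypothetical and chosen only for illustration.

```python
import hashlib


def pick_server(five_tuple, pool):
    """Stateless choice: hash the connection's 5-tuple onto the current pool."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]


def count_pcc_violations(conns, old_pool, new_pool):
    """Count connections that a stateless LB would move to a different
    server even though their original server is still alive."""
    violations = 0
    for conn in conns:
        old = pick_server(conn, old_pool)
        if old in new_pool and pick_server(conn, new_pool) != old:
            violations += 1
    return violations


class StatefulLB:
    """Naive per-connection table (illustrative only): it preserves PCC,
    but its memory grows with the number of connections."""

    def __init__(self, pool):
        self.pool = list(pool)
        self.table = {}  # 5-tuple -> pinned server

    def forward(self, five_tuple):
        server = self.table.get(five_tuple)
        # Re-assign only for a new connection or when the pinned server is gone.
        if server is None or server not in self.pool:
            server = pick_server(five_tuple, self.pool)
            self.table[five_tuple] = server
        return server


# Hypothetical connections: (src_ip, src_port, dst_ip, dst_port, protocol).
conns = [("10.0.0.1", 1024 + i, "192.0.2.10", 443, "tcp") for i in range(1000)]
old_pool = ["s1", "s2", "s3", "s4"]
new_pool = old_pool + ["s5"]

violations = count_pcc_violations(conns, old_pool, new_pool)

lb = StatefulLB(old_pool)
before = {c: lb.forward(c) for c in conns}
lb.pool.append("s5")  # pool update while connections are active
after = {c: lb.forward(c) for c in conns}
```

In this sketch the table-based balancer keeps every connection on its original server after the pool update, while the stateless mapping moves a large share of them. The tension the title refers to is visible here: the table guarantees PCC at a per-connection memory cost, and the stateless hash is cheap but breaks PCC on updates.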
