Lovelock: Towards Smart NIC-hosted Clusters
Seo Jin Park, Ramesh Govindan, Kai Shen, David Culler, Fatma Özcan, Geon-Woo Kim, Hank Levy

TL;DR
Lovelock proposes a novel cluster architecture replacing servers with smart NICs to improve performance and cost-efficiency for data-intensive applications like analytics and ML training.
Contribution
The paper introduces Lovelock, a specialized cluster design that replaces each server with one or more headless smart NICs, so that data-intensive applications can run at lower cost than on traditional server-centric clusters without adversely impacting performance.
Findings
Smart NICs can replace servers in clusters for data-intensive tasks.
Lovelock achieves cost and energy savings over traditional clusters.
Performance is maintained or improved with the new design.
Abstract
Traditional cluster designs were originally server-centric, and have evolved recently to support hardware acceleration and storage disaggregation. In applications that leverage acceleration, the server CPU performs the role of orchestrating computation and data movement, and data-intensive applications stress the memory bandwidth. Applications that leverage disaggregation can be adversely affected by the increased PCIe and network bandwidth resulting from disaggregation. In this paper, we advocate for a specialized cluster design for important data-intensive applications, such as analytics, query processing and ML training. This design, Lovelock, replaces each server in a cluster with one or more headless smart NICs. Because smart NICs are significantly cheaper than servers on bandwidth, the resulting cluster can run these applications without adversely impacting performance, while…
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
