CARGO : Context Augmented Critical Region Offload for Network-bound datacenter Workloads
Siddharth Rai, Trevor E. Carlson

TL;DR
CARGO is a novel mechanism that executes critical instructions at the network interface card to prefetch data, significantly improving latency, throughput, and power efficiency for network-bound datacenter workloads.
Contribution
It introduces dynamic, context-augmented critical-region execution at the NIC, which overlaps queuing latency with request processing to improve performance and efficiency.
Findings
Latency improved by 2.7X
Throughput improved by 2.7X
Power efficiency improved by 1.5X
Abstract
Network-bound applications, like a database server executing OLTP queries or a caching server storing objects for dynamic web applications, are essential services that consumers and businesses use daily. These services run in large datacenters and are required to meet predefined Service Level Objectives (SLOs), or latency targets, with high probability. Thus, efficient datacenter applications should optimize their execution in terms of power and performance. However, to support large-scale data storage, these workloads make heavy use of pointer-connected data structures (e.g., hash tables, large fan-out trees, tries) and exhibit poor instruction- and memory-level parallelism. Our experiments show that due to long memory access latency, these workloads occupy processor resources (e.g., ROB entries, RS buffers, LS queue entries, etc.) for a prolonged period of time that delay the processing…
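The abstract's point about pointer-connected data structures can be illustrated with a small sketch. This is not code from the paper: the `Node` and `ChainedHashTable` names, the toy hash function, and the `dependent_loads` counter are all hypothetical and chosen only for illustration. The sketch shows a chained hash table lookup in which the location of each node is known only after the previous node has been read, so the memory accesses form a serial chain rather than overlapping.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    key: str
    value: str
    next: Optional["Node"] = None  # link to the next node in the same bucket


class ChainedHashTable:
    """Toy hash table with separate chaining (illustrative assumption only)."""

    def __init__(self, num_buckets: int = 4) -> None:
        self.buckets: List[Optional[Node]] = [None] * num_buckets

    def _index(self, key: str) -> int:
        # Deterministic toy hash so the example behaves the same on every run.
        return sum(key.encode()) % len(self.buckets)

    def put(self, key: str, value: str) -> None:
        i = self._index(key)
        # Push the new node at the head of the bucket's chain.
        self.buckets[i] = Node(key, value, self.buckets[i])

    def get(self, key: str) -> Tuple[Optional[str], int]:
        """Return (value, dependent_loads) for the given key."""
        node = self.buckets[self._index(key)]
        dependent_loads = 0
        while node is not None:
            # This node could be read only after the previous link was known,
            # so each iteration models one serialized memory access.
            dependent_loads += 1
            if node.key == key:
                return node.value, dependent_loads
            node = node.next
        return None, dependent_loads
```

With a single bucket and three inserted keys, looking up the oldest key walks all three nodes one after another, while the newest key is found on the first access. On real hardware each such step can be a long-latency cache miss, which is the behaviour the abstract attributes to these workloads.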
Taxonomy
Topics: Cloud Computing and Resource Management · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
