Intra-process Caching and Reuse of Threads
Dave Dice, Alex Kogan

TL;DR
This paper introduces a process-local thread cache to reduce thread creation latency on Linux, significantly improving performance in applications with high thread churn.
Contribution
It proposes an intra-process thread caching mechanism that recycles terminated threads to serve later thread creation requests, reducing thread creation cost by almost an order of magnitude.
Findings
Thread creation cost drops by nearly an order of magnitude.
Performance improves significantly in applications with rapid thread creation/destruction.
Caching benefits are demonstrated across multiple benchmarks.
Abstract
Creating and destroying threads on modern Linux systems incurs high latency even in the absence of concurrency, and fails to scale as concurrency increases. To address this concern, we introduce a process-local cache of idle threads. Specifically, instead of destroying a thread when it terminates, we cache and then recycle that thread in the context of subsequent thread creation requests. This approach shows significant promise in various applications and benchmarks that create and destroy threads rapidly, and illustrates the need for and potential benefits of improved concurrency infrastructure. With caching, the cost of creating a new thread drops by almost an order of magnitude. As our experiments demonstrate, this results in significant performance improvements for multiple applications that aggressively create and destroy numerous threads.
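The cache-and-recycle idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which targets native Linux threads; it only mirrors the control flow in Python. The names `ThreadCache`, `spawn`, `Handle`, and `max_idle` are hypothetical and chosen for illustration, as is the policy of letting a thread exit when the cache is full.

```python
import queue
import threading


class Handle:
    """Join handle for one logical thread (hypothetical name)."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None

    def join(self):
        self.done.wait()
        if self.error is not None:
            raise self.error
        return self.result


class _Worker:
    """One OS thread that runs a task, then parks instead of exiting."""

    def __init__(self, cache):
        self._cache = cache
        self.inbox = queue.SimpleQueue()
        self.thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        while True:
            task = self.inbox.get()      # parked here while idle
            if task is None:             # None tells the thread to really exit
                return
            fn, args, handle = task
            try:
                handle.result = fn(*args)
            except BaseException as exc:
                handle.error = exc
            # Park before signalling completion, so a caller that joins and
            # then spawns again finds this worker already in the cache.
            self._cache._park(self)
            handle.done.set()


class ThreadCache:
    """Process-local cache of idle threads (illustrative sketch only)."""

    def __init__(self, max_idle=8):      # max_idle is an assumed policy knob
        self._lock = threading.Lock()
        self._idle = []                  # LIFO: most recently parked first
        self._max_idle = max_idle
        self.created = 0
        self.reused = 0

    def spawn(self, fn, *args):
        handle = Handle()
        with self._lock:
            worker = self._idle.pop() if self._idle else None
            if worker is None:
                self.created += 1
            else:
                self.reused += 1
        if worker is None:               # cache miss: pay full creation cost
            worker = _Worker(self)
            worker.thread.start()
        worker.inbox.put((fn, args, handle))
        return handle

    def _park(self, worker):
        with self._lock:
            if len(self._idle) < self._max_idle:
                self._idle.append(worker)
                return
        worker.inbox.put(None)           # cache full: let this thread exit
```

In this sketch, a caller that repeatedly calls `spawn` and then `join` in sequence creates one underlying thread on the first call and recycles it on every later call; the `created` and `reused` counters make that visible. The sketch demonstrates the reuse logic only and says nothing about the latency figures reported in the paper.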
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Distributed systems and fault tolerance
