OS Scheduling Algorithms for Improving the Performance of Multithreaded Workloads
Suryanarayana Murthy Durbhakula

TL;DR
This paper proposes new OS scheduling algorithms that optimize for cache-to-cache transfers and remote DRAM accesses on multi-socket multicore servers, reducing overall latency for multithreaded workloads by up to 16.79%.
Contribution
It introduces a unified approach with new algorithms that consider cache affinity and remote memory access patterns to enhance OS scheduling for multicore systems.
Findings
Up to 16.79% reduction in overall latency.
The new algorithms outperform existing methods on synthetic workloads.
Effective for varying remote cache and DRAM latencies.
Abstract
Major chip manufacturers have all introduced multicore microprocessors. Multi-socket systems built from these processors are used for running various server applications. However, to the best of our knowledge, current commercial operating systems are not optimized for multithreaded workloads running on such servers. Cache-to-cache transfers and remote memory accesses impact the performance of such workloads. This paper presents a unified approach to optimizing OS scheduling algorithms for both cache-to-cache transfers and remote DRAM accesses that also takes cache affinity into account. By observing the patterns of local and remote cache-to-cache transfers, as well as local and remote DRAM accesses, for every thread in each scheduling quantum and applying different algorithms, we come up with a new schedule of threads for the next quantum that takes cache affinity into account. This new…
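The abstract does not spell out the algorithms themselves, so the following is only a rough sketch of the general idea it describes: collect per-thread counts of local and remote cache-to-cache transfers and DRAM accesses during one quantum, then pick a socket for each thread for the next quantum while charging a penalty for giving up cache affinity. Everything here is an assumption made for illustration, not the paper's method: the `ThreadStats` record, the latency values in `LAT`, the `MIGRATION_PENALTY`, the greedy placement order, and the simplification that the counters stay attributed to the same sockets after other threads move.

```python
from dataclasses import dataclass

# Hypothetical latencies in cycles; real values depend on the machine.
LAT = {"local_c2c": 50, "remote_c2c": 150, "local_dram": 200, "remote_dram": 300}
# Hypothetical cost of losing cache affinity when a thread changes socket.
MIGRATION_PENALTY = 1000


@dataclass
class ThreadStats:
    """Counters observed for one thread during the last scheduling quantum."""
    tid: int
    socket: int   # socket the thread ran on in the last quantum
    c2c: dict     # socket id -> cache-to-cache transfers served by that socket
    dram: dict    # socket id -> DRAM accesses served by that socket's memory


def estimated_cost(stats, target_socket):
    """Estimate the latency of replaying last quantum's accesses on target_socket."""
    cost = 0
    for sock, count in stats.c2c.items():
        kind = "local_c2c" if sock == target_socket else "remote_c2c"
        cost += count * LAT[kind]
    for sock, count in stats.dram.items():
        kind = "local_dram" if sock == target_socket else "remote_dram"
        cost += count * LAT[kind]
    if target_socket != stats.socket:
        cost += MIGRATION_PENALTY  # moving away discards warm cache state
    return cost


def next_schedule(all_stats, num_sockets, cores_per_socket):
    """Greedily assign each thread to a socket for the next quantum."""
    free = [cores_per_socket] * num_sockets
    placement = {}

    def gain(st):
        # Difference between the worst and best socket for this thread.
        costs = [estimated_cost(st, s) for s in range(num_sockets)]
        return max(costs) - min(costs)

    # Threads with the most to gain or lose choose first.
    for st in sorted(all_stats, key=gain, reverse=True):
        candidates = [s for s in range(num_sockets) if free[s] > 0]
        if not candidates:
            raise ValueError("more threads than available cores")
        best = min(candidates, key=lambda s: (estimated_cost(st, s), s))
        placement[st.tid] = best
        free[best] -= 1
    return placement


t0 = ThreadStats(tid=0, socket=0, c2c={0: 0, 1: 100}, dram={0: 10, 1: 0})
t1 = ThreadStats(tid=1, socket=1, c2c={0: 5, 1: 5}, dram={0: 0, 1: 2})
print(next_schedule([t0, t1], num_sockets=2, cores_per_socket=2))
```

In this toy input, thread 0 pulls most of its data from caches on socket 1, so the estimated saving outweighs the assumed migration penalty and it is moved; thread 1 gains nothing from moving and stays where its cache state is warm.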
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management · Advanced Data Storage Technologies
