Learning-based Dynamic Pinning of Parallelized Applications in Many-Core Systems
Georgios C. Chasparis, Vladimir Janjic, Michael Rossbory

TL;DR
This paper introduces a learning-based, decentralized scheduling framework for dynamic thread placement in NUMA architectures, enhancing performance and adaptability of parallel applications in many-core systems.
Contribution
It presents a novel decentralized learning scheme for thread placement that easily incorporates multi-objective criteria and adapts to runtime performance changes.
Findings
Performance improvements over the Linux scheduler, especially when resources are limited.
Significant gains for applications with irregular memory-access patterns.
Analytical guarantees on expected application performance.
Abstract
Motivated by the need for adaptive, secure and responsive scheduling in a great range of computing applications, including human-centered and time-critical applications, this paper proposes a scheduling framework that seamlessly adds resource-awareness to any parallel application. In particular, we introduce a learning-based framework for dynamic placement of parallel threads to Non-Uniform Memory Access (NUMA) architectures. Decisions are taken independently by each thread in a decentralized fashion that significantly reduces computational complexity. The advantage of the proposed learning scheme is the ability to easily incorporate any multi-objective criterion and easily adapt to performance variations during runtime. Under the multi-objective criterion of maximizing total completed instructions per second (i.e., both computational and memory-access instructions), we provide…
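To make the decentralized idea concrete, here is a minimal sketch in which each thread keeps its own probability vector over candidate CPUs and reinforces the choices that gave it good performance. Everything in it is an assumption for illustration: the class `ThreadAgent`, the helper `simulated_reward`, the step size, and the linear reward-inaction style update are hypothetical stand-ins and are not taken from the paper, whose actual update rule and performance measurements are not given in the text above.

```python
import random


class ThreadAgent:
    """Hypothetical per-thread learner holding a probability vector over CPUs."""

    def __init__(self, num_cpus, step=0.1, seed=None):
        self.num_cpus = num_cpus
        self.step = step  # assumed learning rate in (0, 1]
        self.probs = [1.0 / num_cpus] * num_cpus  # start uniform
        self.rng = random.Random(seed)

    def choose_cpu(self):
        # Sample a CPU index according to the current strategy.
        return self.rng.choices(range(self.num_cpus), weights=self.probs)[0]

    def update(self, cpu, reward):
        # Reward is assumed to be normalized to [0, 1].
        # Shift probability mass toward the chosen CPU in proportion to the
        # reward; the vector keeps summing to 1 after the update.
        gain = self.step * reward
        self.probs = [
            p + gain * ((1.0 if i == cpu else 0.0) - p)
            for i, p in enumerate(self.probs)
        ]


def simulated_reward(cpu, load):
    # Stand-in for a measured performance signal (for example, a normalized
    # instruction rate): here it simply drops as more threads share a CPU.
    return 1.0 / load[cpu]


def run(num_threads=4, num_cpus=4, rounds=500, seed=0):
    agents = [ThreadAgent(num_cpus, seed=seed + t) for t in range(num_threads)]
    for _ in range(rounds):
        # Each thread decides independently, using only its own strategy.
        choices = [agent.choose_cpu() for agent in agents]
        load = {c: choices.count(c) for c in set(choices)}
        for agent, cpu in zip(agents, choices):
            agent.update(cpu, simulated_reward(cpu, load))
    return agents


if __name__ == "__main__":
    for t, agent in enumerate(run()):
        print(t, [round(p, 3) for p in agent.probs])
```

Each agent only sees its own reward, so the per-round cost is linear in the number of CPUs per thread and needs no central optimizer. In a real Linux deployment the sampled CPU could then be applied through an affinity call such as `os.sched_setaffinity`; that step is left out so the sketch runs on any platform.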
