Optimizing Memory Mapping Using Deep Reinforcement Learning

Pengming Wang; Mikita Sazanovich; Berkin Ilbeyi; Phitchaya Mangpo Phothilimthana; Manish Purohit; Han Yang Tay; Ngân Vũ; Miaosen Wang; Cosmin Paduraru; Edouard Leurent; Anton Zhernov; Po-Sen Huang; Julian Schrittwieser; Thomas Hubert; Robert Tung; Paula Kurylowicz; Kieran Milan; Oriol Vinyals; Daniel J. Mankowitz

arXiv:2305.07440·cs.PF·October 18, 2023·1 cites



Open Access

TL;DR

This paper presents a novel reinforcement learning approach to optimize memory mapping in machine learning program compilation, leading to faster execution times on hardware accelerators.

Contribution

It introduces mallocGame, a single-player game formulation of the memory mapping problem, and mallocMuZero, an RL agent that plays this game to find memory mappings that run ML workloads faster than those of the default XLA solver.

Findings

01

mallocMuZero achieves faster execution times than the default XLA solver.

02

The approach improves performance on AlphaTensor matrix multiplication models.

03

Reinforcement learning effectively explores high-dimensional combinatorial search spaces.

Abstract

Resource scheduling and allocation is a critical component of many high-impact systems ranging from congestion control to cloud computing. Finding better solutions to these problems often has a significant impact on resource and time savings, reducing device wear-and-tear, and even potentially reducing carbon emissions. In this paper, we focus on a specific instance of a scheduling problem, namely the memory mapping problem that occurs during compilation of machine learning programs: that is, mapping tensors to different memory layers to optimize execution time. We introduce an approach for solving the memory mapping problem using Reinforcement Learning (RL). RL is a solution paradigm well-suited for sequential decision-making problems that are amenable to planning, and for combinatorial search spaces with high-dimensional data inputs. We formulate the problem as a single-player game,…
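To make the idea of memory mapping as a sequence of decisions concrete, here is a minimal toy sketch. It is not the paper's mallocGame: the `Tensor` fields, the `benefit` values, the single fast-memory capacity, and the function names `play` and `best_mapping` are all assumptions made for illustration. Each step decides whether one tensor goes into a small fast memory, a move is illegal if it would exceed capacity at any time the tensor is live, and the reward is the hypothetical time saved. The toy also ignores address placement and fragmentation, and it searches by brute force where an RL agent would learn to plan.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Tensor:
    name: str
    size: int       # units of fast memory the tensor needs (assumed)
    start: int      # first time step at which the tensor is live
    end: int        # last time step at which the tensor is live (inclusive)
    benefit: float  # hypothetical time saved if served from fast memory


def play(tensors, capacity, actions):
    """Play one episode; actions[i] is True to place tensors[i] in fast memory.

    Returns the total reward, or None if any placement exceeds capacity.
    """
    horizon = max(t.end for t in tensors) + 1
    used = [0] * horizon  # fast-memory usage per time step
    reward = 0.0
    for tensor, place in zip(tensors, actions):
        if not place:
            continue  # tensor stays in slow memory, no reward
        live = range(tensor.start, tensor.end + 1)
        if any(used[step] + tensor.size > capacity for step in live):
            return None  # illegal move: fast memory would overflow
        for step in live:
            used[step] += tensor.size
        reward += tensor.benefit
    return reward


def best_mapping(tensors, capacity):
    """Exhaustively search all action sequences (feasible only for tiny inputs)."""
    best = (0.0, tuple(False for _ in tensors))
    for actions in product([False, True], repeat=len(tensors)):
        reward = play(tensors, capacity, actions)
        if reward is not None and reward > best[0]:
            best = (reward, actions)
    return best


# Hypothetical three-tensor program with a fast memory of 5 units.
EXAMPLE = [
    Tensor("A", size=4, start=0, end=2, benefit=5.0),
    Tensor("B", size=3, start=1, end=3, benefit=4.0),
    Tensor("C", size=2, start=3, end=4, benefit=3.0),
]
```

The brute-force search grows as 2^n in the number of tensors, which is the kind of combinatorial blow-up that motivates a learned, planning-based agent for realistic programs.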


Taxonomy

Topics: Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques · Stochastic Gradient Optimization Techniques