Machine Learning-Guided Memory Optimization for DLRM Inference on Tiered Memory
Jie Ren, Bin Ma, Shuangyan Yang, Benjamin Francis, Ehsan K. Ardestani, Min Si, Dong Li

TL;DR
RecMG is a machine learning-guided system that optimizes memory management for DLRM inference on tiered memory architectures, reducing both on-demand fetches and end-to-end inference time.
Contribution
This paper introduces RecMG, a novel ML-based approach for embedding-vector caching and prefetching tailored for DLRM inference in tiered memory systems.
Findings
RecMG reduces on-demand fetches by up to 2.8x compared to state-of-the-art prefetchers.
RecMG decreases end-to-end DLRM inference time by up to 43%.
The system accurately predicts accesses to embedding vectors with long reuse distances or few reuses.
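To make the "on-demand fetches" metric concrete, here is a minimal sketch that counts them for a trace of embedding-vector indices when the fast tier is managed by plain LRU. This is only an illustrative baseline, not RecMG's ML-guided policy; the function name `count_on_demand_fetches`, the LRU replacement choice, and the toy trace are all assumptions made for this example.

```python
from collections import OrderedDict


def count_on_demand_fetches(trace, fast_tier_capacity):
    """Count accesses that miss in an LRU-managed fast tier.

    Hypothetical baseline: each miss stands for one on-demand fetch
    of an embedding vector from the slow tier.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    fetches = 0
    for idx in trace:
        if idx in cache:
            # Hit in the fast tier: refresh recency, no fetch needed.
            cache.move_to_end(idx)
        else:
            # Miss: fetch from the slow tier and install in the fast tier.
            fetches += 1
            cache[idx] = True
            if len(cache) > fast_tier_capacity:
                cache.popitem(last=False)  # evict the least recently used
    return fetches


# Toy trace of embedding-vector indices (made up for illustration).
trace = [1, 2, 3, 1, 2, 3]
small = count_on_demand_fetches(trace, fast_tier_capacity=2)
large = count_on_demand_fetches(trace, fast_tier_capacity=3)
```

On this cyclic trace, a two-entry fast tier misses on every access, while a three-entry fast tier misses only on the first touch of each vector. Cyclic patterns like this one are a well-known weak spot of recency-based replacement.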
Abstract
Deep learning recommendation models (DLRMs) are widely used in industry, and their memory capacity requirements reach the terabyte scale. Tiered memory architectures provide a cost-effective solution but introduce challenges in embedding-vector placement due to complex embedding-access patterns. We propose RecMG, a machine learning (ML)-guided system for vector caching and prefetching on tiered memory. RecMG accurately predicts accesses to embedding vectors with long reuse distances or few reuses. The design of RecMG focuses on making ML feasible in the context of DLRM inference by addressing unique challenges in data labeling and navigating the search space for embedding-vector placement. By employing separate ML models for caching and prefetching, plus a novel differentiable loss function, RecMG narrows the prefetching search space and minimizes on-demand fetches. Compared to…
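The reuse distance mentioned in the abstract can be illustrated with a short sketch. It assumes the common definition (the number of distinct other vectors accessed between two consecutive accesses to the same vector) and is not taken from the paper; the function name `reuse_distances` and the sample trace are hypothetical.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each access in the trace.

    The distance is the number of distinct other indices touched since
    the previous access to the same index; first accesses yield None.
    Simple O(N * M) version for clarity, not for production traces.
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for idx in trace:
        if idx in stack:
            keys = list(stack)
            position = keys.index(idx)
            # Everything after `position` was touched more recently.
            distances.append(len(keys) - 1 - position)
            stack.move_to_end(idx)
        else:
            distances.append(None)
            stack[idx] = True
    return distances


# Toy trace of embedding-vector indices (made up for illustration).
example = reuse_distances([1, 2, 3, 1, 2, 2])
```

Under this definition, an access whose reuse distance is at least the fast-tier capacity would miss under LRU, which is why vectors with long reuse distances are hard for recency-based caching.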
Taxonomy
Topics: Advanced Neural Network Applications · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
