BladeDISC++: Memory Optimizations Based On Symbolic Shape
Xiulong Yuan, Xu Yan, Wenting Shen, Xiafei Qiu, Ang Wang, Jie Zhang, Yong Li, and Wei Lin

TL;DR
BladeDISC++ introduces memory optimization techniques for dynamic shape graphs in deep learning, leveraging symbolic shapes and a combined compile-time and runtime strategy to reduce memory usage without precise shape information.
Contribution
It presents novel op scheduling and rematerialization methods based on symbolic shapes, addressing memory optimization challenges in dynamic shape compilers.
Findings
Reduces memory consumption in dynamic shape graphs
Achieves memory efficiency comparable to precise shape optimizations
Enhances adoption of dynamic shape compilers in large models
Abstract
Recent deep learning workloads exhibit dynamic characteristics, leading to the rising adoption of dynamic shape compilers. These compilers can generate efficient kernels for dynamic shape graphs, which are characterized by a fixed graph topology and uncertain tensor shapes. However, memory optimization, although particularly crucial in this large model era, remains relatively underexplored for dynamic shape graphs. The fundamental challenge lies in the lack of precise tensor shapes, which are essential in conventional methods such as operation scheduling (op scheduling) and rematerialization. To address this challenge, we propose op scheduling and rematerialization approaches based on symbolic shapes and develop BladeDISC++. In addition, since rematerialization decisions cannot be made solely at compile time when tensor shapes are unknown, BladeDISC++ employs a compilation-runtime combined strategy to…
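To make the idea of reasoning about memory with symbolic shapes concrete, here is a minimal Python sketch. It is not BladeDISC++'s implementation, which this summary does not describe. It assumes that a tensor's byte size can be written as a polynomial over positive symbolic dimensions, and the helper names `size`, `compare`, and `evaluate` are hypothetical. Two sizes are ordered at compile time only when the ordering holds for every positive value of the dimensions; otherwise the decision is deferred until concrete shapes are known at runtime.

```python
from math import prod

def size(shape, bytes_per_elem=4):
    """Symbolic byte size as {monomial: coefficient}; dims are ints or names."""
    coeff = bytes_per_elem
    symbols = []
    for d in shape:
        if isinstance(d, int):
            coeff *= d          # fold known dimensions into the coefficient
        else:
            symbols.append(d)   # keep unknown dimensions symbolic
    return {tuple(sorted(symbols)): coeff}

def compare(a, b):
    """Return 'ge', 'le', or 'unknown' for a vs b over all positive dims."""
    diff = {m: a.get(m, 0) - b.get(m, 0) for m in set(a) | set(b)}
    # Every monomial is positive, so uniformly signed coefficients decide it.
    if all(c >= 0 for c in diff.values()):
        return "ge"
    if all(c <= 0 for c in diff.values()):
        return "le"
    return "unknown"            # cannot be decided at compile time

def evaluate(poly, env):
    """Concrete byte size once the runtime binds each symbolic dim."""
    return sum(c * prod(env[d] for d in mono) for mono, c in poly.items())

x = size(("S", 1024))   # 4096 * S bytes
y = size(("S", 4096))   # 16384 * S bytes
z = size(("B", 128))    # 512 * B bytes

print(compare(y, x))    # decidable symbolically
print(compare(z, x))    # depends on B and S, so defer to runtime
print(evaluate(z, {"B": 64}) >= evaluate(x, {"S": 2}))
```

In this sketch, `y` and `x` share the dimension `S`, so their ordering is settled without knowing `S`. The ordering of `z` and `x` depends on the actual values of `B` and `S`, which is the kind of case where a purely compile-time decision is not possible.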
Taxonomy
Topics: Parallel Computing and Optimization Techniques
