Understanding Auto-Scheduling Optimizations for Model Deployment via Visualizations
Laixin Xie, Chenyang Zhang, Ruofei Ma, Xing Jiang, Xingxing Xing, Wei Wan, and Quan Li

TL;DR
This paper explores visualization techniques for auto-scheduling in deep learning model deployment, aiming to improve understanding of optimization processes and facilitate latency reduction on hardware.
Contribution
It proposes an enhanced visualization for interpreting the extensive profiling metrics that auto-scheduling produces, to support manual optimization and latency reduction.
Findings
Visualization clarifies complex scheduling processes
Helps identify optimization opportunities
Supports latency reduction efforts
Abstract
After completing the design and training phases, deploying a deep learning model onto specific hardware is essential before practical implementation. Targeted optimizations are necessary to enhance the model's performance by reducing inference latency. Auto-scheduling, an automated technique offering various optimization options, proves to be a viable solution for large-scale auto-deployment. However, the low-level code generated by auto-scheduling resembles hardware coding, potentially hindering human comprehension and impeding manual optimization efforts. In this ongoing study, we aim to develop an enhanced visualization that effectively addresses the extensive profiling metrics associated with auto-scheduling. This visualization will illuminate the intricate scheduling process, enabling further advancements in latency optimization through insights derived from the schedule.
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Radiation Effects in Electronics
