DeepProf: Performance Analysis for Deep Learning Applications via Mining GPU Execution Patterns
Jiazhen Gu, Huan Liu, Yangfan Zhou, Xin Wang

TL;DR
DeepProf is a tool that analyzes GPU execution patterns in deep learning applications by mining GPU traces, helping identify performance bottlenecks and providing insights for system optimization.
Contribution
The paper introduces DeepProf, a novel automated tool that extracts GPU execution patterns using suffix trees to analyze deep learning application performance.
Findings
DeepProf effectively identifies performance issues in GPU traces.
Analysis reveals properties of TensorFlow that inform system setup.
Empirical validation confirms DeepProf's usefulness in performance diagnosis.
Abstract
Deep learning applications are computation-intensive and often employ GPUs as the underlying computing devices. Deep learning frameworks provide powerful programming interfaces, but the gap between source code and the actual GPU operations makes it difficult to analyze the performance of deep learning applications. In this paper, by examining the features of GPU traces and deep learning applications, we use the suffix tree structure to extract the repeated patterns in GPU traces. Performance analysis graphs can be generated from the preprocessed GPU traces. We further present DeepProf, a novel tool to automatically process GPU traces and generate performance analysis reports for deep learning applications. An empirical study verifies the effectiveness of DeepProf in performance analysis and diagnosis. We also find some interesting properties of TensorFlow, which can be…
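To illustrate the idea of extracting a repeated pattern from a GPU trace, here is a minimal sketch, not DeepProf's implementation. It assumes the trace has already been reduced to a list of kernel names (the names below are hypothetical), and it swaps in a naively sorted list of suffixes for a true suffix tree; both expose the same longest repeated run, but the suffix tree does so far more efficiently on long traces. The function name `longest_repeated_run` is also an assumption made for this example.

```python
from typing import List, Sequence, Tuple


def longest_repeated_run(trace: Sequence[str]) -> Tuple[int, List[str]]:
    """Return (start index, run) for the longest kernel run occurring at least twice."""
    n = len(trace)
    # Sort suffix start positions by the suffix they denote.
    # This is a naive stand-in for a suffix tree, fine for short traces only.
    suffixes = sorted(range(n), key=lambda i: tuple(trace[i:]))
    best_len, best_start = 0, 0
    for a, b in zip(suffixes, suffixes[1:]):
        # Common-prefix length of two lexicographically adjacent suffixes.
        k = 0
        while a + k < n and b + k < n and trace[a + k] == trace[b + k]:
            k += 1
        if k > best_len:
            best_len, best_start = k, min(a, b)
    return best_start, list(trace[best_start:best_start + best_len])


# Hypothetical kernel trace: setup, two training iterations, teardown.
trace = ["memcpyH2D", "conv", "relu", "pool", "conv", "relu", "pool", "memcpyD2H"]
start, run = longest_repeated_run(trace)
print(start, run)
```

Occurrences are allowed to overlap here, so with three or more iterations the longest repeated run can span several iterations; recovering a single iteration would need an extra step that this sketch leaves out.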
Taxonomy
Topics: Software System Performance and Reliability · Software Testing and Debugging Techniques · Parallel Computing and Optimization Techniques
