Toward Accurate Platform-Aware Performance Modeling for Deep Neural Networks
Chuan-Chi Wang, Ying-Chiao Liao, Ming-Chang Kao, Wen-Yew Liang, Shih-Hao Hung

TL;DR
This paper introduces PerfNetV2, a machine learning-based performance model that accurately predicts neural network inference and training times across various GPU accelerators, including unseen devices, aiding system optimization.
Contribution
PerfNetV2 improves on the accuracy of previous models by modeling host-accelerator interactions in detail and by enhancing the architecture of its machine learning model, which enables reliable performance predictions on new hardware.
Findings
Achieves mean absolute percentage error within 13.1% on multiple CNNs.
Outperforms previous models, whose errors can be as high as 200%.
Capable of predicting performance on unseen GPU devices.
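The error metric quoted above, mean absolute percentage error (MAPE), can be sketched as follows. This is a minimal illustration of the standard definition only. The function name `mape` and the sample timings are hypothetical and are not taken from the paper.

```python
def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every measured value is nonzero, since each error is
    expressed relative to the measured value.
    """
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    relative_errors = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(relative_errors) / len(relative_errors)


# Hypothetical execution times in milliseconds (illustrative, not from the paper).
measured_ms = [10.0, 20.0, 40.0]
predicted_ms = [11.0, 18.0, 44.0]
error = mape(measured_ms, predicted_ms)
```

Each prediction here is off by one tenth of its measured value, so the result is about 10%. Because MAPE averages relative errors, a model can stay within a bound such as 13.1% on average while individual predictions deviate by more.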
Abstract
In this paper, we provide a fine-grained machine learning-based method, PerfNetV2, which improves the accuracy of our previous work for modeling neural network performance on a variety of GPU accelerators. Given an application, the proposed method can be used to predict the inference time and training time of the convolutional neural networks used in the application, which enables the system developer to optimize performance by choosing the neural networks and/or incorporating the hardware accelerators to deliver satisfactory results in time. Furthermore, the proposed method is capable of predicting the performance of an unseen or non-existing device, e.g., a new GPU which has a higher operating frequency with fewer processor cores but more memory capacity. This allows a system developer to quickly search the hardware design space and/or fine-tune the system configuration. Compared…
Taxonomy
Topics: Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices · Adversarial Robustness in Machine Learning
