AIPerf: Automated machine learning as an AI-HPC benchmark
Zhixiang Ren, Yongheng Liu, Tianhui Shi, Lei Xie, Yue Zhou, Jidong Zhai, Youhui Zhang, Yunquan Zhang, Wenguang Chen

TL;DR
AIPerf introduces an automated, scalable AI-HPC benchmark suite using AutoML, capable of evaluating diverse systems with a unified metric, addressing limitations of existing benchmarks.
Contribution
It presents a novel AutoML-based benchmark that is scalable, flexible, and provides a unified performance metric for AI-HPC systems.
Findings
Achieved near-linear weak scalability up to 512 nodes.
Measured 56.1 Tera-OPS on 4 nodes and 194.53 Peta-OPS on 512 nodes.
Demonstrated the benchmark's stability and adaptability across diverse systems.
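To make the "near-linear weak scalability" finding concrete, here is a minimal sketch of how weak-scaling efficiency could be computed from benchmark scores. The function name and every score in `measurements` are hypothetical and chosen only for illustration; they are not results from the paper. The sketch assumes all scores come from the same hardware, and it deliberately avoids the two figures quoted above because this summary does not say whether they were measured on the same system.

```python
def weak_scaling_efficiency(base_nodes, base_score, nodes, score):
    """Per-node throughput at `nodes` divided by per-node throughput at `base_nodes`.

    A value of 1.0 means perfectly linear weak scaling; values slightly
    below 1.0 are what "near-linear" usually refers to.
    """
    if base_nodes <= 0 or nodes <= 0 or base_score <= 0:
        raise ValueError("node counts and the baseline score must be positive")
    return (score / nodes) / (base_score / base_nodes)


# Hypothetical scores in Peta-OPS, keyed by node count (illustrative only).
measurements = {32: 12.0, 64: 23.8, 128: 47.0, 256: 93.1, 512: 184.0}

base_nodes = min(measurements)
efficiencies = {
    n: weak_scaling_efficiency(base_nodes, measurements[base_nodes], n, s)
    for n, s in measurements.items()
}

for n in sorted(efficiencies):
    print(f"{n:4d} nodes: efficiency {efficiencies[n]:.3f}")
```

Because the problem size grows with the machine in a weak-scaling run, the useful comparison is per-node throughput rather than total time to solution.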
Abstract
The plethora of complex artificial intelligence (AI) algorithms and available high performance computing (HPC) power stimulates the expeditious development of AI components with heterogeneous designs. Consequently, the need for cross-stack performance benchmarking of AI-HPC systems emerges rapidly. The de facto HPC benchmark LINPACK cannot reflect AI computing power and I/O performance without a representative workload. Current popular AI benchmarks like MLPerf have a fixed problem size and therefore limited scalability. To address these issues, we propose an end-to-end benchmark suite utilizing automated machine learning (AutoML), which not only represents real AI scenarios, but is also auto-adaptively scalable to various scales of machines. We implement the algorithms in a highly parallel and flexible way to ensure the efficiency and optimization potential on diverse systems with…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Data Storage Technologies · Machine Learning and Data Classification
