MLPerf Inference Benchmark

Vijay Janapa Reddi; Christine Cheng; David Kanter; Peter Mattson; Guenther Schmuelling; Carole-Jean Wu; Brian Anderson; Maximilien Breughe; Mark Charlebois; William Chou; Ramesh Chukka; Cody Coleman; Sam Davis; Pan Deng; Greg Diamos; Jared Duke; Dave Fick; J. Scott Gardner; Itay Hubara; Sachin Idgunji; Thomas B. Jablin; Jeff Jiao; Tom St. John; Pankaj Kanwar; David Lee; Jeffery Liao; Anton Lokhmotov; Francisco Massa; Peng Meng; Paulius Micikevicius; Colin Osborne; Gennady Pekhimenko; Arun Tejusve Raghunath Rajan; Dilip Sequeira; Ashish Sirasao; Fei Sun; Hanlin Tang; Michael Thomson; Frank Wei; Ephrem Wu; Lingjie Xu; Koichi Yamada; Bing Yu; George Yuan; Aaron Zhong; Peizhao Zhang; Yuchen Zhou

arXiv:1911.02549·cs.LG·May 12, 2020

TL;DR

MLPerf Inference provides a standardized benchmarking method for evaluating diverse ML inference systems across hardware and software, enabling fair comparison and industry-wide performance assessment.

Contribution

It introduces a comprehensive, industry-wide benchmark with rules and best practices, facilitating architecture-neutral, reproducible ML inference performance evaluation.

Findings

01 Over 600 measurements from 14 organizations

02 More than 30 systems demonstrated wide performance range

03 Benchmark's flexibility and adaptability confirmed

Abstract

Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven…
