Enriching the Machine Learning Workloads in BigBench

Matthias Polag, Todor Ivanov, and Timo Eichhorn

arXiv:2406.10843 · cs.LG · June 18, 2024

TL;DR

This paper enhances the BigBench benchmark by adding new machine learning workloads and comparing multiple implementations across popular libraries to better evaluate AI and ML systems.

Contribution

This work adds three new workloads to BigBench V2 and compares implementations of the same ML algorithms across MLlib, SystemML, Scikit-learn and Pandas, widening the benchmark's coverage of machine learning algorithms.

Findings

01 Demonstrates the relevance of the extended benchmark for AI/ML evaluation

02 Shows performance differences between implementations of the same algorithm across libraries

03 Provides a standardized testing framework for new ML workloads

Abstract

In the era of Big Data and growing support for Machine Learning, Deep Learning and Artificial Intelligence algorithms in current software systems, there is an urgent need for standardized application benchmarks that stress test and evaluate these new technologies. Building on the standardized BigBench (TPCx-BB) benchmark, this work enriches the improved BigBench V2 with three new workloads and expands its coverage of machine learning algorithms. Our workloads use multiple algorithms and compare different implementations of the same algorithm across several popular libraries, such as MLlib, SystemML, Scikit-learn and Pandas, demonstrating the relevance and usability of our benchmark extension.

Taxonomy

Topics: Neural Networks and Applications · Advanced Data Processing Techniques