Enriching the Machine Learning Workloads in BigBench
Matthias Polag, Todor Ivanov, and Timo Eichhorn

TL;DR
This paper enhances the BigBench benchmark by adding new machine learning workloads and comparing multiple implementations across popular libraries to better evaluate AI and ML systems.
Contribution
It adds three new workloads to BigBench V2 and compares implementations of the same ML algorithms across several libraries (MLlib, SystemML, Scikit-learn and Pandas), broadening the benchmark's coverage of machine learning algorithms.
Findings
Demonstrates the relevance and usability of the extended benchmark for AI/ML evaluation
Shows differences in implementation performance across libraries
Provides a standardized testing framework for new ML workloads
Abstract
In the era of Big Data and growing support for Machine Learning, Deep Learning and Artificial Intelligence algorithms in current software systems, there is an urgent need for standardized application benchmarks that stress test and evaluate these new technologies. Building on the standardized BigBench (TPCx-BB) benchmark, this work enriches the improved BigBench V2 with three new workloads and expands its coverage of machine learning algorithms. Our workloads use multiple algorithms and compare different implementations of the same algorithm across several popular libraries, such as MLlib, SystemML, Scikit-learn and Pandas, demonstrating the relevance and usability of our benchmark extension.
Taxonomy
Topics: Neural Networks and Applications · Advanced Data Processing Techniques
