Performance Characterization and Optimizations of Traditional ML Applications
Harsh Kumar, R. Govindarajan

TL;DR
This paper analyzes the performance of traditional machine learning applications with large datasets, identifies bottlenecks, and demonstrates how cache and memory optimizations can significantly improve their efficiency.
Contribution
It provides a detailed performance characterization of traditional ML methods and introduces practical optimizations in scikit-learn that enhance performance on real systems.
Findings
Performance improvements of 5.2%-27.1% with software prefetching.
Data layout and reordering yield 6.16%-28.0% performance gains.
Insights into bottlenecks in traditional ML applications with large datasets.
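To make the data-reordering finding concrete, here is a minimal sketch of one way a dataset could be laid out in memory to match its access pattern. It is not the paper's implementation or scikit-learn's code; the function name `reorder_rows_by_first_access` and the idea of supplying an explicit `access_order` trace are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def reorder_rows_by_first_access(X, access_order):
    """Hypothetical helper: copy X so rows are stored in first-access order.

    Returns the reordered array and the access sequence remapped to the
    new row positions, so that X_reordered[new_access] == X[access_order].
    """
    access_order = np.asarray(access_order)

    # Unique row ids, sorted by the position where each is first accessed.
    uniq, first_pos = np.unique(access_order, return_index=True)
    visit = uniq[np.argsort(first_pos)]

    # Rows that are never accessed go to the end, in their original order.
    untouched = np.setdiff1d(np.arange(X.shape[0]), visit)
    perm = np.concatenate([visit, untouched])

    # Materialize the permuted rows as one contiguous, row-major block.
    X_reordered = np.ascontiguousarray(X[perm])

    # Inverse permutation: maps an old row index to its new position.
    inverse = np.empty_like(perm)
    inverse[perm] = np.arange(perm.size)
    return X_reordered, inverse[access_order]

# Usage: rows 4, 1, 3 are touched in that order, so they become rows 0, 1, 2.
X = np.arange(12, dtype=float).reshape(6, 2)
X_reordered, new_access = reorder_rows_by_first_access(X, [4, 1, 4, 3])
```

After reordering, a traversal that follows `new_access` touches mostly adjacent rows, which tends to be friendlier to caches and hardware prefetchers than scattered accesses. Whether this pays off in practice depends on the algorithm and the dataset.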
Abstract
Even in the era of Deep Learning based methods, traditional machine learning methods with large data sets continue to attract significant attention. However, we find an apparent lack of a detailed performance characterization of these methods in the context of large training datasets. In this work, we study the system's behavior of a number of traditional ML methods as implemented in popular free software libraries/modules to identify critical performance bottlenecks experienced by these applications. The performance characterization study reveals several interesting insights on the performance of these applications. Then we evaluate the performance benefits of applying some well-known optimizations at the levels of caches and the main memory. More specifically, we test the usefulness of optimizations such as (i) software prefetching to improve cache performance and (ii) data layout and…
Taxonomy
Topics: Parallel Computing and Optimization Techniques
