Performance Characterization and Optimizations of Traditional ML Applications

Harsh Kumar; R. Govindarajan

arXiv:2412.19051·cs.PF·December 30, 2024

Open Access

TL;DR

This paper analyzes the performance of traditional machine learning applications with large datasets, identifies bottlenecks, and demonstrates how cache and memory optimizations can significantly improve their efficiency.

Contribution

It provides a detailed performance characterization of traditional ML methods and introduces practical optimizations in scikit-learn that enhance performance on real systems.

Findings

01. Software prefetching improves performance by 5.2%-27.1%.

02. Data layout and reordering optimizations yield performance gains of 6.16%-28.0%.

03. The characterization gives insight into the bottlenecks of traditional ML applications with large datasets.
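To make the data reordering idea in the findings concrete, here is a minimal sketch in C. It is not the authors' implementation, which this summary does not show. It assumes a row-major matrix of samples and a known visiting order; the function name `reorder_rows` and its parameters are hypothetical. The rows are copied into the order in which they will be visited, so that a later pass reads memory sequentially instead of jumping between distant rows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the paper): copy rows of a row-major
   matrix `src` (n_cols doubles per row) into `dst` in the order given by
   order[], so that row i of dst equals row order[i] of src.
   `dst` must have room for n_order * n_cols doubles and must not
   overlap `src`. */
void reorder_rows(const double *src, double *dst,
                  const size_t *order, size_t n_order, size_t n_cols)
{
    assert(src != NULL && dst != NULL && order != NULL);
    for (size_t i = 0; i < n_order; ++i) {
        /* One contiguous copy per row; afterwards the rows sit in
           memory in exactly the order they will be visited. */
        memcpy(&dst[i * n_cols], &src[order[i] * n_cols],
               n_cols * sizeof(double));
    }
}
```

The one-time copy costs extra memory and time, so a layout change of this kind only pays off when the reordered data is traversed many times, for example across training iterations.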

Abstract

Even in the era of Deep Learning based methods, traditional machine learning methods with large data sets continue to attract significant attention. However, we find an apparent lack of a detailed performance characterization of these methods in the context of large training datasets. In this work, we study the system's behavior of a number of traditional ML methods as implemented in popular free software libraries/modules to identify critical performance bottlenecks experienced by these applications. The performance characterization study reveals several interesting insights on the performance of these applications. Then we evaluate the performance benefits of applying some well-known optimizations at the levels of caches and the main memory. More specifically, we test the usefulness of optimizations such as (i) software prefetching to improve cache performance and (ii) data layout and…
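As an illustration of the software prefetching mentioned in the abstract, the following C sketch issues a prefetch hint a few iterations ahead in an index-driven loop, an access pattern that hardware prefetchers often fail to predict. This is an assumption-laden example, not the code evaluated in the paper: the function `gather_sum` and the constant `PREFETCH_DISTANCE` are hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin, so the hint is compiled out on other compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance: how many iterations ahead to request
   data. A good value depends on the machine and workload and has to be
   tuned by measurement. */
#define PREFETCH_DISTANCE 8

/* Sum values[] in the order given by idx[]. The indirect access
   values[idx[i]] is irregular, so the loop asks for a future element
   early to overlap its memory latency with the current work. */
double gather_sum(const double *values, const size_t *idx, size_t n)
{
    assert(values != NULL && idx != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 1 = low temporal
               locality. The call is only a hint and never changes
               the computed result. */
            __builtin_prefetch(&values[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
#endif
        sum += values[idx[i]];
    }
    return sum;
}
```

Because the prefetch is purely a hint, the function returns the same sum with or without it; any benefit shows up only in timing, and a poorly chosen distance can make the loop slower.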

Taxonomy

Topics: Parallel Computing and Optimization Techniques