Why every GBDT speed benchmark is wrong
Anna Veronika Dorogush, Vasily Ershov, Dmitriy Kruchinin

TL;DR
This paper critically examines common methods for benchmarking the speed of gradient boosted decision trees, highlighting issues and proposing requirements for fair and effective benchmarking practices.
Contribution
It identifies problems in current benchmarking approaches and offers guidelines to improve the fairness and usefulness of speed evaluations for GBDT algorithms.
Findings
Current benchmarks often give misleading speed comparisons
Many benchmarking methods lack fairness and reproducibility
The paper proposes criteria for more reliable GBDT speed benchmarks
Abstract
This article provides a comprehensive study of different ways to benchmark the speed of gradient boosted decision tree algorithms. We show the main problems with several straightforward approaches to benchmarking, explain why speed benchmarking is a challenging task, and provide a set of reasonable requirements for a benchmark to be fair and useful.
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Neural Networks and Applications
