Coresets for Decision Trees of Signals
Ibrahim Jubran, Ernesto Evgeniy Sanches Shayda, Ilan Newman, Dan Feldman

TL;DR
This paper introduces the first algorithm for constructing small, provably accurate coresets for decision trees on 2D signals, significantly speeding up training and tuning without sacrificing accuracy.
Contribution
It presents a novel method linking decision trees to computational geometry, enabling efficient coreset construction for all matrices, improving scalability in machine learning tasks.
Findings
Coreset size is polynomial in k, log(N), and 1/ε
Construction time is linear in N and k
Experimental results show up to a 10x speedup on real-world data
Abstract
A $k$-decision tree $t$ (or $k$-tree) is a recursive partition of a matrix (2D-signal) into $k\geq 1$ block matrices (axis-parallel rectangles, leaves) where each rectangle is assigned a real label. Its regression or classification loss to a given matrix $D$ of $N$ entries (labels) is the sum of squared differences over every label in $D$ and its assigned label by $t$. Given an error parameter $\varepsilon\in(0,1)$, a $(k,\varepsilon)$-coreset $C$ of $D$ is a small summarization that provably approximates this loss to \emph{every} such tree, up to a multiplicative factor of $1\pm\varepsilon$. In particular, the optimal $k$-tree of $C$ is a $(1+\varepsilon)$-approximation to the optimal $k$-tree of $D$. We provide the first algorithm that outputs such a $(k,\varepsilon)$-coreset for \emph{every} such matrix $D$. The size $|C|$ of the coreset is polynomial in $k$, $\log(N)$, and $1/\varepsilon$, and…
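The loss defined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's coreset construction: it only evaluates the sum of squared differences for one fixed tree, given as a list of labeled rectangles. The names `tree_loss` and `block_loss_from_stats`, the half-open `(r0, r1, c0, c1, label)` leaf encoding, and the toy matrix are all assumptions made for illustration. The second helper shows the standard identity that the loss of a single block depends only on its count, sum, and sum of squares; this is exact for one fixed block and is not by itself a coreset for every tree.

```python
import numpy as np

def tree_loss(D, leaves):
    """Sum of squared differences between D and the labels of a partition.

    leaves: list of (r0, r1, c0, c1, label) with half-open row/column ranges
    (a hypothetical encoding); the rectangles are expected to tile D exactly.
    """
    covered = np.zeros(D.shape, dtype=bool)
    loss = 0.0
    for r0, r1, c0, c1, label in leaves:
        if covered[r0:r1, c0:c1].any():
            raise ValueError("leaves overlap")
        covered[r0:r1, c0:c1] = True
        block = D[r0:r1, c0:c1]
        loss += float(((block - label) ** 2).sum())
    if not covered.all():
        raise ValueError("leaves do not cover D")
    return loss

def block_loss_from_stats(n, s, q, label):
    """Loss of one block from its count n, sum s, and sum of squares q.

    Uses sum((x - label)^2) = q - 2*label*s + n*label^2.
    """
    return q - 2.0 * label * s + n * label ** 2

# Toy 2x3 signal split by one vertical cut into k = 2 leaves.
D = np.array([[1.0, 1.0, 5.0],
              [1.0, 3.0, 5.0]])
leaves = [(0, 2, 0, 2, 1.5),   # left block, labeled with its mean
          (0, 2, 2, 3, 5.0)]   # right block, labeled with its mean
loss = tree_loss(D, leaves)

left = D[0:2, 0:2]
left_loss = block_loss_from_stats(left.size, left.sum(), (left ** 2).sum(), 1.5)
```

Here each leaf is labeled with the mean of its block, which minimizes the squared loss for that block; a $(k,\varepsilon)$-coreset, as described in the abstract, would have to approximate `tree_loss` for every $k$-tree rather than for this single partition.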
Taxonomy
Topics: Graph Theory and Algorithms · Face and Expression Recognition · Machine Learning and Data Classification
Methods: Coresets
