TL;DR
Proximity Forest is a scalable distance-based classifier for large time series datasets. It trains far faster than existing state-of-the-art methods while maintaining high accuracy.
Contribution
It introduces Proximity Forest, an ensemble of highly randomized Proximity Trees that learns from datasets with millions of time series by branching on the proximity of each series to exemplar time series.
Findings
Achieves high accuracy on UCR datasets
Learns from datasets with millions of time series and classifies a single time series in milliseconds
Learns models at least 100,000 times faster than state-of-the-art methods
Abstract
Research into the classification of time series has made enormous progress in the last decade. The UCR time series archive has played a significant role in challenging and guiding the development of new learners for time series classification. The largest dataset in the UCR archive holds only 10 thousand time series, which may explain why the primary research focus has been on creating algorithms that have high accuracy on relatively small datasets. This paper introduces Proximity Forest, an algorithm that learns accurate models from datasets with millions of time series, and classifies a time series in milliseconds. The models are ensembles of highly randomized Proximity Trees. Whereas conventional decision trees branch on attribute values (and usually perform poorly on time series), Proximity Trees branch on the proximity of time series to one exemplar time series or another;…
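The branching idea in the abstract can be illustrated with a minimal Python sketch of a single proximity split. This is not the authors' implementation: the abstract is truncated here, so the choice of one random exemplar per class, the use of plain Euclidean distance as the proximity measure, and the helper names `pick_exemplars` and `route` are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def euclidean(a, b):
    # Stand-in proximity measure (assumption); any distance between
    # two equal-length series could be plugged in instead.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_exemplars(series, labels, rng):
    """Pick one random exemplar series per class (hypothetical helper)."""
    by_class = defaultdict(list)
    for s, y in zip(series, labels):
        by_class[y].append(s)
    return {y: rng.choice(members) for y, members in sorted(by_class.items())}


def route(query, exemplars, distance=euclidean):
    """Return the label of the branch whose exemplar is closest to the query."""
    return min(exemplars, key=lambda y: distance(query, exemplars[y]))


rng = random.Random(0)
series = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [5.0, 5.1, 5.0], [5.1, 5.0, 5.1]]
labels = ["low", "low", "high", "high"]

exemplars = pick_exemplars(series, labels, rng)
branch = route([4.8, 5.0, 4.9], exemplars)
print(branch)
```

A full tree would apply such a split recursively to the series routed down each branch, and a forest would combine the predictions of many randomized trees. Both steps are left out of this sketch.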
