An Innovative Algorithm For Robust, Interactive, Piecewise-Linear Data Exploration
Stephen Wright, Colin Paterson

TL;DR
This paper introduces a generalized, robust piecewise-linear data exploration algorithm that combines mode-based fitting, non-parametric clustering, regularization, and accuracy estimation to handle complex, noisy datasets without predefined models.
Contribution
It extends the Theil-Sen algorithm to include mode-based fitting, cluster analysis, and regularization in a unified, distribution-free framework for robust data exploration.
Findings
Provides a robust piecewise-linear fit for noisy data
Detects regime shifts and data heterogeneity non-parametrically
Integrates regularization and accuracy estimation in a single algorithm
Abstract
Many mathematical modelling tasks (such as in Economics and Finance) are informed by data that is "found" rather than produced by carefully designed experiments. This often results in data series that are short, noisy, multidimensional and contaminated with outliers, regime shifts, and confounding, uninformative or collinear variables. We present a generalization of the Theil-Sen algorithm that reflects modes (rather than the median) in the parameter-space distribution of partial fits to the data. This can provide a robust piecewise-linear fit to the data while also allowing extensions that include elements of cluster analysis, regularization and cross-validation in a unified (distribution-free) approach that can: 1. Exploit piecewise linearity to reduce the need to pre-specify the form of the underlying data generating process. 2. Detect non-homogeneity (e.g. regime…
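To make the median-versus-mode idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a one-dimensional problem, takes the "partial fits" to be the pairwise slopes used by classic Theil-Sen, and estimates the mode with a plain histogram. The function names `pairwise_slopes` and `modal_slope`, the bin count, and the synthetic two-regime data are all illustrative assumptions.

```python
import itertools
import numpy as np


def pairwise_slopes(x, y):
    """Slopes of the lines through every pair of points with distinct x."""
    return np.array([
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in itertools.combinations(range(len(x)), 2)
        if x[i] != x[j]
    ])


def modal_slope(slopes, bins=50):
    """Crude mode estimate (an assumption): centre of the fullest histogram bin."""
    counts, edges = np.histogram(slopes, bins=bins)
    k = int(np.argmax(counts))
    return float(0.5 * (edges[k] + edges[k + 1]))


# Synthetic two-regime series: slope 2 up to x = 10, slope -1 afterwards.
x = np.arange(20, dtype=float)
y = np.where(x <= 10, 2.0 * x, 30.0 - x)

slopes = pairwise_slopes(x, y)
med = float(np.median(slopes))  # classic Theil-Sen slope estimate
mode = modal_slope(slopes)      # mode-based alternative

print(f"median slope: {med:.3f}")
print(f"modal slope:  {mode:.3f}")
```

On this toy series the median of the pairwise slopes falls strictly between the two regime slopes and so describes neither segment, while the histogram mode lands in the bin next to the slope of 2. The within-regime slopes of -1 form a second cluster in the same distribution, which is the kind of structure a mode-based, piecewise-linear fit could use to separate regimes.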
Taxonomy
Topics: Data Analysis with R · Advanced Statistical Methods and Models · Diverse Scientific and Engineering Research
