Symbolic Regression on Sparse and Noisy Data with Gaussian Processes
Junette Hsin, Shubhankar Agarwal, Adam Thorpe, Luis Sentis, David Fridovich-Keil

TL;DR
This paper introduces GPSINDy, a method combining Gaussian process regression with SINDy to improve symbolic regression accuracy on sparse, noisy data, demonstrated on biological and robotic models.
Contribution
The paper presents GPSINDy, a novel approach that enhances robustness of symbolic regression in noisy, sparse data scenarios by integrating Gaussian processes with SINDy.
Findings
More than 50% improvement over SINDy and other baselines in predicting future trajectories from noise-corrupted, sparse 5 Hz data
Effective denoising and modeling of nonlinear dynamics
Superior performance on both simulation data (Lotka-Volterra and unicycle models) and hardware data (NVIDIA JetRacer system)
Abstract
In this paper, we address the challenge of deriving dynamical models from sparse and noisy data. High-quality data is crucial for symbolic regression algorithms; limited and noisy data can present modeling challenges. To overcome this, we combine Gaussian process regression with a sparse identification of nonlinear dynamics (SINDy) method to denoise the data and identify nonlinear dynamical equations. Our approach, GPSINDy, offers improved robustness with sparse, noisy data compared to SINDy alone. We demonstrate its effectiveness on simulation data from Lotka-Volterra and unicycle models and hardware data from an NVIDIA JetRacer system. We show superior performance over SINDy and other baselines, including more than 50% improvement in predicting future trajectories from noise-corrupted and sparse 5 Hz data.
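The general idea in the abstract (denoise with a Gaussian process, then fit a sparse model over a library of candidate terms) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the Lotka-Volterra parameter values and initial state, the hand-fixed RBF kernel hyperparameters (no hyperparameter optimization), the finite-difference derivative estimate, the polynomial candidate library, and the use of sequentially thresholded least squares as the sparse solver. The function names `gp_smooth`, `library`, and `stlsq` are hypothetical.

```python
import numpy as np

def lotka_volterra(x, a=1.0, b=0.5, c=1.0, d=0.5):
    """Predator-prey dynamics; the parameter values are illustrative assumptions."""
    return np.array([a * x[0] - b * x[0] * x[1], -c * x[1] + d * x[0] * x[1]])

def simulate(x0, n_steps, dt, substeps=20):
    """Integrate with fixed-step RK4 and return states sampled every dt seconds."""
    h = dt / substeps
    x = np.asarray(x0, dtype=float)
    out = [x]
    for _ in range(n_steps - 1):
        for _ in range(substeps):
            k1 = lotka_volterra(x)
            k2 = lotka_volterra(x + 0.5 * h * k1)
            k3 = lotka_volterra(x + 0.5 * h * k2)
            k4 = lotka_volterra(x + h * k3)
            x = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        out.append(x)
    return np.array(out)

def rbf_kernel(s, t, length_scale, variance):
    """Squared-exponential kernel between two 1-D arrays of time stamps."""
    diff = s[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_smooth(t, y, length_scale=1.0, variance=1.0, noise=0.1):
    """GP posterior mean at the training times, with hand-fixed hyperparameters."""
    mean = y.mean()
    K = rbf_kernel(t, t, length_scale, variance)
    alpha = np.linalg.solve(K + noise**2 * np.eye(len(t)), y - mean)
    return mean + K @ alpha

def library(x):
    """Candidate terms: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: zero small terms, refit the rest."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            big = ~small[:, k]
            if big.any():
                xi[big, k] = np.linalg.lstsq(theta[:, big], dxdt[:, k], rcond=None)[0]
    return xi

# Sparse 5 Hz samples (dt = 0.2 s) corrupted by Gaussian noise.
dt = 0.2
t = dt * np.arange(100)
x_true = simulate([3.0, 1.0], len(t), dt)
rng = np.random.default_rng(0)
x_noisy = x_true + 0.1 * rng.normal(size=x_true.shape)

# Step 1: denoise each state dimension with a GP over time.
x_smooth = np.column_stack([gp_smooth(t, x_noisy[:, k]) for k in range(2)])

# Step 2: estimate derivatives by finite differences on the smoothed states.
dxdt = np.gradient(x_smooth, dt, axis=0)

# Step 3: sparse regression of the derivatives onto the candidate library.
xi = stlsq(library(x_smooth), dxdt)
print(np.round(xi, 2))  # rows follow the library order, columns are x1 and x2
```

Each column of `xi` holds the coefficients of one identified equation, so the nonzero rows indicate which candidate terms the fit retained. The quality of the recovered model depends on the kernel hyperparameters and the threshold, which are chosen by hand here.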
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Time Series Analysis and Forecasting
Methods: Gaussian Process
