Data Processing Protocol for Regression of Geothermal Times Series with Uneven Intervals
Palash Panja, Pranay Asai, Raul Velasco, Milind Deo

TL;DR
This paper investigates how selecting different subsets of unevenly spaced geothermal time series data affects regression model accuracy, proposing schemes to optimize data points for better model fitness without losing data features.
Contribution
It introduces six novel data processing schemes for selecting data points in regression analysis of uneven time series, improving model fitness without sacrificing data features.
Findings
Under some schemes, the number of data points has a negligible effect on model fit
Proposed schemes are ranked based on R² and NRMSE metrics
Optimal data selection can enhance model accuracy without increasing data size
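For readers unfamiliar with the two ranking metrics, the sketch below shows one common way to compute them. It is an illustration only, not the authors' code: the function names are hypothetical, and NRMSE is normalized here by the range of the observed values, which is an assumption (normalizing by the mean or standard deviation is also common, and this summary does not say which convention the paper uses).

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant observed series")
    return 1.0 - ss_res / ss_tot


def nrmse(y_true, y_pred):
    """Root-mean-square error divided by the observed range.

    ASSUMPTION: range normalization; other conventions exist.
    """
    mse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)
    y_range = max(y_true) - min(y_true)
    if y_range == 0:
        raise ValueError("NRMSE (range-normalized) is undefined for a constant series")
    return math.sqrt(mse) / y_range
```

A higher R² and a lower NRMSE both indicate a closer fit, so a ranking of schemes would sort on R² descending or NRMSE ascending.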
Abstract
Regression of data generated in simulations or experiments has important implications in sensitivity studies, uncertainty analysis, and prediction accuracy. Depending on the nature of the physical model, data points may not be evenly distributed. It is often not practical to use all points for regression of a model, because doing so does not always guarantee a better fit. Fitness of the model is highly dependent on the number of data points and the distribution of the data along the curve. In this study, the effect of the number of points selected for regression is investigated, and various schemes for processing regression data points are explored. Time series data, i.e., output varying with time, is our prime interest, mainly the temperature profile from an enhanced geothermal system. The objective of the research is to find a better scheme for choosing a fraction of data points from the…
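To make the idea of choosing a fraction of points from an unevenly spaced series concrete, here is a minimal sketch of two generic selection rules. These are hypothetical illustrations and are not the schemes proposed in the paper, which this summary does not describe: `select_by_index` keeps points evenly spaced in sample index, and `select_by_time` keeps the samples nearest to evenly spaced target times. Both function names and the example time grid are assumptions.

```python
import bisect


def select_by_index(times, n_keep):
    """Indices of n_keep points evenly spaced in sample index (ends included)."""
    n = len(times)
    if n_keep < 2:
        raise ValueError("n_keep must be at least 2")
    if n_keep >= n:
        return list(range(n))
    return sorted({round(i * (n - 1) / (n_keep - 1)) for i in range(n_keep)})


def select_by_time(times, n_keep):
    """Indices of the samples nearest to n_keep evenly spaced target times.

    `times` must be sorted in ascending order. Duplicate picks are merged,
    so fewer than n_keep indices may be returned.
    """
    if n_keep < 2:
        raise ValueError("n_keep must be at least 2")
    t0, t1 = times[0], times[-1]
    chosen = set()
    for i in range(n_keep):
        target = t0 + i * (t1 - t0) / (n_keep - 1)
        j = bisect.bisect_left(times, target)
        # The nearest sample is either just before or at the insertion point.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(times)]
        chosen.add(min(candidates, key=lambda k: abs(times[k] - target)))
    return sorted(chosen)


# Hypothetical series: dense sampling early, sparse sampling late.
times = [0, 1, 2, 3, 4, 5, 10, 20, 40, 80]
by_index = select_by_index(times, 4)
by_time = select_by_time(times, 4)
```

On a series that is densely sampled early and sparsely sampled late, the index-based rule concentrates the retained points in the early, dense region, while the time-based rule spreads them across the full time span. That difference in where the points fall along the curve is the kind of effect on model fit the abstract refers to.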
Taxonomy
Topics: Forecasting Techniques and Applications · Statistical and numerical algorithms · Advanced Statistical Methods and Models
