Optimizing Financial Data Analysis: A Comparative Study of Preprocessing Techniques for Regression Modeling of Apple Inc.'s Net Income and Stock Prices
Kevin Ungar, Camelia Oprean-Stan

TL;DR
This study compares various data preprocessing techniques for financial datasets of Apple Inc., demonstrating that linear interpolation combined with polynomial regression yields the best predictive performance for net income and stock prices.
Contribution
The paper introduces a systematic comparison of five preprocessing methods and identifies the most effective one for regression modeling of financial data.
Findings
Linear interpolation with polynomial regression outperforms other methods.
This combination achieved the lowest validation MSE and MAE among the configurations compared.
The study provides a framework for selecting data preprocessing methods in financial analysis.
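To make the model comparison concrete, here is a minimal sketch of scoring a linear and a polynomial fit by validation MSE and MAE. It uses synthetic data, not the Apple Inc. figures. The polynomial degree, the 150/50 random split, and the helper name `fit_and_score` are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data with a curved relationship (hypothetical, not Apple data).
x = np.linspace(0.0, 1.0, 200)
y = 2.0 + 3.0 * x + 4.0 * x**2 + rng.normal(0.0, 0.1, x.size)

# Assumed split: 150 training points, 50 validation points.
idx = rng.permutation(x.size)
train, val = idx[:150], idx[150:]

def fit_and_score(degree):
    """Fit a polynomial of the given degree on the training split and
    return validation MSE and MAE (degree 1 is ordinary linear regression)."""
    coefs = np.polyfit(x[train], y[train], deg=degree)
    err = y[val] - np.polyval(coefs, x[val])
    return {"mse": float(np.mean(err**2)), "mae": float(np.mean(np.abs(err)))}

scores = {"linear": fit_and_score(1), "polynomial (deg 2)": fit_and_score(2)}

# Pick the model with the lowest validation MSE.
best = min(scores, key=lambda name: scores[name]["mse"])
print(best, scores[best])
```

On this synthetic curve the degree-2 fit wins because the data were generated from a quadratic; on real financial series the ranking depends on the data and on how the validation set is chosen.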
Abstract
This article presents a comprehensive methodology for processing financial datasets of Apple Inc., encompassing quarterly income and daily stock prices, spanning from March 31, 2009, to December 31, 2023. Leveraging 60 observations for quarterly income and 3774 observations for daily stock prices, sourced from Macrotrends and Yahoo Finance respectively, the study outlines five distinct datasets crafted through varied preprocessing techniques. Through detailed explanations of aggregation, interpolation (linear, polynomial, and cubic spline) and lagged variables methods, the study elucidates the steps taken to transform raw data into analytically rich datasets. Subsequently, the article delves into regression analysis, aiming to decipher which of the five data processing methods best suits capital market analysis, by employing both linear and polynomial regression models on each…
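The preprocessing families named in the abstract (aggregation, linear interpolation, lagged variables) can be sketched as follows. This is an illustrative outline on synthetic arrays, not the authors' pipeline. The 60-day quarter length, the one-quarter lag, and all variable names are assumptions. Polynomial and cubic spline interpolation are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (NOT Apple data): 8 quarters of 60 trading days each.
n_quarters, days_per_quarter = 8, 60
n_days = n_quarters * days_per_quarter
daily_price = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n_days))
quarterly_income = np.linspace(10.0, 24.0, n_quarters)  # one value per quarter end

# 1) Aggregation: collapse daily prices to one mean price per quarter,
#    so both series have the same (quarterly) frequency.
quarterly_price = daily_price.reshape(n_quarters, days_per_quarter).mean(axis=1)

# 2) Linear interpolation: spread quarterly income over the daily grid.
#    Days before the first quarter end are held at the first reported value.
quarter_end_idx = np.arange(1, n_quarters + 1) * days_per_quarter - 1
daily_income = np.interp(np.arange(n_days), quarter_end_idx, quarterly_income)

# 3) Lagged variable: pair each quarter's price with the previous
#    quarter's income (assumed lag of one quarter).
lagged_income = quarterly_income[:-1]
price_after_lag = quarterly_price[1:]

print(quarterly_price.shape, daily_income.shape, lagged_income.shape)
```

Aggregation shrinks the sample to the number of quarters, while interpolation keeps the full daily sample at the cost of synthesizing income values between reports. That trade-off is what a comparison of preprocessing methods has to weigh.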
Taxonomy
Methods: Masked autoencoder
