A Multi-objective Exploratory Procedure for Regression Model Selection
Ankur Sinha, Pekka Malo, Timo Kuosmanen

TL;DR
This paper introduces a multi-objective genetic algorithm for regression model selection (MOGA-VS) that generates a set of Pareto-optimal models trading off simplicity against fit, helping analysts choose a model without prior knowledge of variable importance.
Contribution
The paper presents a novel multi-objective genetic algorithm that explores Pareto-optimal regression models considering model complexity and goodness of fit, with a two-step selection process.
Findings
Generates a Pareto front of models balancing complexity and fit
Effective on the real-world Communities and Crime dataset
Provides visual and metric-based decision support
Abstract
Variable selection is recognized as one of the most critical steps in statistical modeling. The problems encountered in engineering and social sciences are commonly characterized by an overabundance of explanatory variables, non-linearities and unknown interdependencies between the regressors. An added difficulty is that the analysts may have little or no prior knowledge of the relative importance of the variables. To provide a robust method for model selection, this paper introduces the Multi-objective Genetic Algorithm for Variable Selection (MOGA-VS) that provides the user with an optimal set of regression models for a given data-set. The algorithm considers the regression problem as a two-objective task, and explores the Pareto-optimal (best subset) models by preferring models that have fewer regression coefficients and better goodness of fit. The model…
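To make the two-objective view in the abstract concrete, here is a minimal sketch of a Pareto-dominance filter over candidate models. It is not the MOGA-VS algorithm itself: the genetic search is omitted, mean squared error is assumed as a stand-in for the goodness-of-fit measure, and the names `dominates`, `pareto_front`, and the candidate values are hypothetical and invented for illustration.

```python
from typing import List, Tuple

# A candidate model is summarized by two objectives, both minimized:
# (number of regression coefficients, mean squared error).
# MSE is an assumed stand-in for the paper's goodness-of-fit measure.
Model = Tuple[int, float]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(models: List[Model]) -> List[Model]:
    """Return the non-dominated models, sorted by number of coefficients."""
    front = [m for m in models if not any(dominates(o, m) for o in models)]
    return sorted(set(front))


# Hypothetical (coefficients, MSE) pairs, invented purely for illustration.
candidates = [(1, 0.90), (2, 0.55), (2, 0.70), (3, 0.50), (4, 0.52), (5, 0.30)]
front = pareto_front(candidates)
print(front)
```

In this toy input, the model with 2 coefficients and error 0.70 is dropped because another 2-coefficient model fits better, and the 4-coefficient model is dropped because a 3-coefficient model is both simpler and more accurate. The remaining models form the trade-off curve an analyst would choose from.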
