TL;DR
DenoGrad is a gradient-based data refinement framework that improves data quality for tabular and time-series learning by iteratively correcting noisy observations using a pretrained neural network.
Contribution
It introduces a model-guided data refinement method that applies to both tabular regression and time-series forecasting, improving predictive performance without relying on rigid statistical assumptions or clean reference data.
Findings
Yields consistent improvements in downstream predictive performance across ten real-world datasets.
Preserves the statistical structure of the data, as measured by distributional and correlation metrics.
Improves generalization even on datasets considered clean, where it acts as a regularizer.
Abstract
In the Data-Centric Artificial Intelligence (AI) paradigm, improving data quality is essential for robust machine learning. However, many denoising methods rely on rigid statistical assumptions or require clean reference data, which limits their applicability in real-world scenarios. In this work, we propose DenoGrad, a gradient-based framework for data refinement that leverages a pretrained neural network to iteratively correct noisy observations by optimizing the input space while keeping the model fixed. DenoGrad is applicable to both tabular regression and time-series forecasting, and incorporates a consensus-based strategy to ensure temporally coherent updates in sequential settings. Experiments on ten real-world datasets show that the proposed approach yields consistent improvements in downstream predictive performance while preserving the statistical structure of the data, as…
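The core idea in the abstract, optimizing the inputs while the model stays fixed, can be illustrated with a minimal sketch. This is not the authors' implementation: the frozen linear model, the squared-error loss, the choice to treat the target as a fixed reference, and the names `predict` and `refine_inputs` are all assumptions made for illustration. With a real pretrained network, an autograd framework would supply the input gradient instead of the hand-written one used here.

```python
# Hypothetical sketch of input-space refinement with a frozen model.
# A linear model y = w.x + b stands in for the pretrained neural network.

def predict(w, b, x):
    """Frozen model: its parameters w and b are never updated."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def refine_inputs(w, b, x, y, lr=0.05, steps=100):
    """Gradient descent on the input x to reduce (predict(x) - y)**2."""
    x = list(x)  # copy so the caller's noisy observation is left untouched
    for _ in range(steps):
        residual = predict(w, b, x) - y
        # d/dx_i of residual**2 is 2 * residual * w_i for a linear model
        x = [xi - lr * 2.0 * residual * wi for xi, wi in zip(x, w)]
    return x

# Illustrative values only (not from the paper).
w, b = [2.0, -1.0], 0.5
noisy = [1.4, 2.7]   # perturbed version of a point near [1.0, 3.0]
target = -0.5        # observed target for that point
refined = refine_inputs(w, b, noisy, target)

error_before = abs(predict(w, b, noisy) - target)
error_after = abs(predict(w, b, refined) - target)
```

In this toy setting the refined input moves only along the direction the model is sensitive to, so the prediction error shrinks while the model parameters stay unchanged. The consensus-based strategy for sequential data mentioned in the abstract is not covered by this sketch.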
