Extract the information from the big data with randomly distributed noise
Jin Cheng, Jiantang Zhang, Min Zhong

TL;DR
This paper introduces a data-driven statistical regularization technique for extracting information from large datasets with random noise, addressing limitations of traditional methods and improving computational efficiency.
Contribution
A novel regularization method that handles high-variance noise in big data, with proven unique solvability and an effective parameter strategy.
Findings
Method effectively extracts information from noisy big data.
Proven unique solvability and solution characterization.
Numerical examples demonstrate high effectiveness.
Abstract
In this manuscript, a purely data-driven statistical regularization method is proposed for extracting information from big data with randomly distributed noise. Since the variance of the noise may be large, the method can be regarded as a general data preprocessing method for ill-posed problems; it overcomes difficulties that traditional regularization methods are unable to resolve and offers a clear advantage in computational efficiency. The unique solvability of the method is proved, and a number of conditions are given to characterize the solution. The regularization parameter strategy is discussed, and a rigorous upper bound on the confidence interval of the error in norm is established. Numerical examples are provided to illustrate the appropriateness and effectiveness of the method.
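The abstract does not give the method's formulation, so the following is only a generic illustration of the problem setting, not the authors' algorithm. It is a minimal sketch that assumes a standard Tikhonov-type smoother with a second-difference penalty, applied to many samples corrupted by high-variance noise. The test signal, the noise level `sigma`, and the regularization parameter `alpha` are all hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical setup: n samples of a smooth signal with large-variance noise.
rng = np.random.default_rng(0)
n = 1000
x = np.linspace(0.0, 1.0, n)
truth = np.sin(2.0 * np.pi * x)           # assumed ground-truth signal
sigma = 1.0                               # noise std comparable to the signal amplitude
y = truth + rng.normal(0.0, sigma, size=n)

# Second-difference operator D with shape (n - 2, n):
# row i is e_i - 2 e_{i+1} + e_{i+2}.
D = np.diff(np.eye(n), n=2, axis=0)

# Generic Tikhonov smoothing (not the paper's method):
# minimize ||u - y||^2 + alpha * ||D u||^2, i.e. solve (I + alpha D^T D) u = y.
alpha = 1.0e6                             # hypothetical regularization parameter
u = np.linalg.solve(np.eye(n) + alpha * (D.T @ D), y)

# Compare the error of the raw data and of the smoothed estimate.
rmse_raw = np.sqrt(np.mean((y - truth) ** 2))
rmse_smooth = np.sqrt(np.mean((u - truth) ** 2))
print(f"RMSE raw: {rmse_raw:.3f}, RMSE smoothed: {rmse_smooth:.3f}")
```

The system matrix is symmetric positive definite for any `alpha >= 0`, so this penalized least-squares problem has a unique solution. Choosing `alpha` from the data is the kind of question a parameter strategy addresses; here it is fixed by hand.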
Taxonomy
Topics: Numerical methods in inverse problems · Probabilistic and Robust Engineering Design · Sparse and Compressive Sensing Techniques
