Correcting sampling biases via importance reweighting for spatial modeling
Boris Prokhorov, Diana Koldasbayeva, Alexey Zaytsev

TL;DR
This paper presents an importance reweighting method that corrects sampling bias when estimating the error of spatial models under distribution shift; on artificial data, the reported overall error drops from 7% to 2%.
Contribution
It combines importance sampling with kernel density estimation to obtain an unbiased estimate of the target error in spatial modeling.
Findings
Overall error of predictions dropped from 7% to 2% on artificial data
Reweighting errors at each sample point neutralizes the distribution shift
Error decreases further with larger samples
Abstract
In machine learning models, error estimation is often complicated by distribution bias, particularly for spatial data such as those found in environmental studies. We introduce an approach based on importance sampling to obtain an unbiased estimate of the target error. By taking into account the difference between the desired error and the available data, our method reweights the errors at each sample point and neutralizes the shift. Importance sampling and kernel density estimation are used for the reweighting. We validate the effectiveness of our approach using artificial data that resemble real-world spatial datasets. Our findings demonstrate the advantages of the proposed approach for estimating the target error, offering a solution to the distribution shift problem. The overall error of predictions dropped from 7% to just 2%, and it decreases further for larger samples.
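The abstract does not spell out the estimator, so the following is only a minimal sketch of how importance reweighting with kernel density estimation might look, not the authors' implementation. It assumes a self-normalized importance-weighted mean, an isotropic Gaussian kernel with a fixed bandwidth, and a toy setup in which samples cluster near the left edge of the unit square while the target region is covered uniformly. The names `gaussian_kde`, `reweighted_error`, and `point_error`, the bandwidth of 0.1, and the synthetic error function are all hypothetical choices made for illustration.

```python
import math
import random


def gaussian_kde(points, bandwidth):
    """Build a 2-D density estimate from points using an isotropic Gaussian kernel."""
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            d2 = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-0.5 * d2 / bandwidth ** 2)
        return norm * total

    return density


def reweighted_error(sample_points, sample_errors, target_points, bandwidth=0.1):
    """Self-normalized importance-weighted mean of per-point errors.

    Each weight is the ratio of the estimated target density to the
    estimated sampling density at the sample location.
    """
    p_sample = gaussian_kde(sample_points, bandwidth)
    p_target = gaussian_kde(target_points, bandwidth)
    weights = [p_target(x, y) / p_sample(x, y) for x, y in sample_points]
    weighted_sum = sum(w * e for w, e in zip(weights, sample_errors))
    return weighted_sum / sum(weights)


def point_error(x, y):
    """Assumed toy error surface: the model error grows toward the right edge."""
    return x


rng = random.Random(0)

# Biased sampling: x = u**2 concentrates the observations near x = 0.
sample_points = [(rng.random() ** 2, rng.random()) for _ in range(400)]
sample_errors = [point_error(x, y) for x, y in sample_points]

# Target region: locations drawn uniformly over the unit square.
target_points = [(rng.random(), rng.random()) for _ in range(400)]

# Under a uniform target, the true mean of point_error is 0.5.
true_error = 0.5
naive = sum(sample_errors) / len(sample_errors)
reweighted = reweighted_error(sample_points, sample_errors, target_points)

print(f"naive estimate:      {naive:.3f}")
print(f"reweighted estimate: {reweighted:.3f}")
print(f"true target error:   {true_error:.3f}")
```

In this toy setup the naive average underestimates the target error because high-error locations are undersampled, and the density-ratio weights shift the estimate back toward the target value. The fixed bandwidth is a simplification; in practice it would need to be tuned, and kernel estimates near the boundary of the region are known to be biased.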
Taxonomy
Topics: Statistical Methods and Inference · demographic modeling and climate adaptation · Bayesian Methods and Mixture Models
