GeoThinneR: An R Package for Efficient Spatial Thinning of Species Occurrences and Point Data
J. Mestre-Tomás

TL;DR
GeoThinneR is an R package that offers efficient spatial thinning methods for large species occurrence datasets, improving preprocessing in species distribution modeling by reducing bias and computational time.
Contribution
It introduces optimized algorithms based on kd-trees and adaptive neighbor estimation for scalable spatial thinning in R, with additional features for SDM workflows.
Findings
Significant reduction in memory usage and execution time relative to brute-force thinning.
Enhanced scalability for large datasets.
Improved preprocessing efficiency in SDM workflows.
Abstract
In this paper we present GeoThinneR, an R package for efficient and flexible spatial thinning of species occurrence data. Spatial thinning is a widely used preprocessing step in species distribution modeling (SDM) that can help reduce sampling bias, but existing R implementations rely on brute-force algorithms that scale poorly to large datasets. GeoThinneR implements multiple thinning approaches: enforcing a minimum distance between points, subsampling points on a grid, and filtering based on decimal precision. To handle large datasets, it introduces two optimized algorithms based on local kd-trees and adaptive neighbor estimation, which greatly reduce memory usage and execution time. Additional functionalities such as group-wise thinning and point prioritization are included to facilitate its use in SDM workflows. Here we provide performance benchmarks using both simulated…
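To make the three thinning approaches named in the abstract concrete, here is a minimal Python sketch. It is not GeoThinneR's code or API: the function names, the greedy keep-first rule, and the haversine distance are all assumptions chosen for illustration. The distance variant is deliberately the brute-force pairwise scan that the abstract says scales poorly; a kd-tree based method would replace the inner scan with a neighbor query.

```python
import math

# Hypothetical helper names; points are (longitude, latitude) tuples in degrees.

def thin_by_precision(points, digits=1):
    """Keep the first point for each coordinate pair rounded to `digits` decimals."""
    seen, kept = set(), []
    for lon, lat in points:
        key = (round(lon, digits), round(lat, digits))
        if key not in seen:
            seen.add(key)
            kept.append((lon, lat))
    return kept

def thin_by_grid(points, cell_size=1.0):
    """Keep the first point falling in each cell of a regular grid (cell_size in degrees)."""
    seen, kept = set(), []
    for lon, lat in points:
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        if key not in seen:
            seen.add(key)
            kept.append((lon, lat))
    return kept

def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0088 * math.asin(math.sqrt(a))

def thin_by_distance(points, min_km):
    """Greedy brute-force thinning: keep a point only if it lies at least
    `min_km` from every point kept so far (quadratic in the worst case)."""
    kept = []
    for p in points:
        if all(haversine_km(p, q) >= min_km for q in kept):
            kept.append(p)
    return kept

if __name__ == "__main__":
    occurrences = [(0.00, 0.00), (0.01, 0.01), (0.04, 0.04), (1.50, 1.50), (1.52, 1.49)]
    print(thin_by_precision(occurrences, digits=1))
    print(thin_by_grid(occurrences, cell_size=1.0))
    print(thin_by_distance(occurrences, min_km=10))
```

All three functions keep the first point they encounter in each cluster, so the result depends on input order. Sorting the points by a priority score beforehand would be one simple way to approximate the point prioritization mentioned in the abstract.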
