Spatial Clustering of Citizen Science Data Improves Downstream Species Distribution Models
Nahian Ahmed, Mark Roth, Tyler A. Hallman, W. Douglas Robinson, Rebecca A. Hutchinson

TL;DR
This study demonstrates that spatial clustering of citizen science data enhances the accuracy of species distribution models by improving site construction for occupancy modeling.
Contribution
It introduces and compares ten spatial clustering methods for site construction, showing their effectiveness in improving occupancy-based species distribution models.
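To make "site construction" concrete, here is a minimal sketch that groups individual checklists into sites by snapping their coordinates to a fixed grid. This is an illustrative baseline only and is not claimed to be one of the ten methods the paper compares; the function name `build_sites`, the `(lat, lon, detected)` tuple layout, and the `cell_deg` parameter are all assumptions made for the example.

```python
from collections import defaultdict
from math import floor


def build_sites(checklists, cell_deg=0.01):
    """Group checklists into sites using a regular lat/lon grid.

    checklists: iterable of (lat, lon, detected) tuples (hypothetical layout),
                where detected is 1 if the species was reported, else 0.
    cell_deg:   grid cell size in degrees (hypothetical parameter).
    Returns a dict mapping a grid-cell key to the list of detections in it.
    """
    sites = defaultdict(list)
    for lat, lon, detected in checklists:
        # Checklists falling in the same grid cell are treated as
        # repeat surveys of one site.
        key = (floor(lat / cell_deg), floor(lon / cell_deg))
        sites[key].append(detected)
    return dict(sites)


checklists = [
    (44.561, -123.262, 0),
    (44.569, -123.268, 1),
    (44.601, -123.262, 0),
]
sites = build_sites(checklists)
```

Each value in `sites` is a detection history for one constructed site, which is the input format an occupancy model expects. A clustering method would replace the grid key with a cluster label.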
Findings
Spatial clustering improves model performance.
Occupancy models with clustered sites outperform alternatives.
The methodology benefits the analysis of large-scale citizen science data.
Abstract
Citizen science biodiversity data present great opportunities for ecology and conservation across vast spatial and temporal scales. However, the opportunistic nature of these data lacks the sampling structure required by modeling methodologies that address a pervasive challenge in ecological data collection: imperfect detection, i.e., the likelihood of under-observing species on field surveys. Occupancy modeling is an example of an approach that accounts for imperfect detection by explicitly modeling the observation process separately from the biological process of habitat selection. This produces species distribution models that speak to the pattern of the species on a landscape after accounting for imperfect detection in the data, rather than the pattern of species observations corrupted by errors. To achieve this benefit, occupancy models require multiple surveys of a site across…
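The separation of the observation process from the biological process described above can be illustrated with the likelihood of a basic single-season occupancy model. The sketch below is a simplified textbook form with constant probabilities, not the paper's implementation; the names `site_likelihood`, `psi` (occupancy probability), and `p` (per-survey detection probability) are assumptions made for the example.

```python
from math import prod


def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history.

    history: sequence of 0/1 detections from repeat surveys of the site.
    psi:     probability the site is occupied (biological process).
    p:       probability of detecting the species on one survey,
             given that the site is occupied (observation process).
    """
    # If the site is occupied, each survey independently detects with prob p.
    occupied = psi * prod(p if y else (1.0 - p) for y in history)
    # An unoccupied site can only yield an all-zero history.
    unoccupied = (1.0 - psi) if not any(history) else 0.0
    return occupied + unoccupied


all_zero = site_likelihood([0, 0, 0], psi=0.5, p=0.5)
with_detections = site_likelihood([1, 0, 1], psi=0.6, p=0.5)
```

An all-zero history contributes through both terms, because the species may be absent or present but missed. This is why repeat surveys of the same site are needed to tell the two explanations apart.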
Taxonomy
Topics: Species Distribution and Climate Change · Data-Driven Disease Surveillance
