# Occupancy estimation of wild species in a palm oil plantation using unstructured data

**Authors:** Arco J. van Strien, Erik Meijaard, Syafiie Suief, Syahmi Zaini, Edvard Mizsei

PMC · DOI: 10.1371/journal.pone.0328960 · PLOS One · 2026-02-02

## TL;DR

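Researchers used occupancy models to estimate how widely 23 wild species (13 birds, 3 reptiles and 7 mammals) occurred in a West Kalimantan oil palm plantation, based on unstructured sightings recorded by plantation workers in 2020–2024.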

## Contribution

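The study demonstrates that occupancy models can be fitted to unstructured sighting records, provided that non-detection records are generated from the data and the effects of that choice and of unbalanced sampling are checked by re-running the models.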

## Key findings

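- Occupancy models were applied to non-standardized data to estimate the proportion of spatial units (both non-natural and forest) that each species occupied annually in the plantation.
- Two data shortcomings, the absence of non-detection records and imbalances in sampling (notably sites surveyed only once), were examined by re-running the models and turned out not to distort the results.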
- Low detection rates led to imprecise occupancy estimates for many species, highlighting the need for improved data collection.

## Abstract

In 2020–2024, plantation workers at a large oil palm plantation in West Kalimantan recorded sightings of wild species of several species groups. They did not use a standardized field method and produced no comprehensive reports but recorded only one or a few species per visit. Such unstructured data are generally viewed as challenging for statistical analysis. Here, we used occupancy models to estimate what proportion of spatial units (both non-natural and forest) a given species occupied annually in the plantation. We tested models with different covariates for species for which most data were available: 13 birds, 3 reptiles and 7 mammals, among which the iconic Orangutan (Pongo pygmaeus). For each species, the model which best fitted the data was selected. Two shortcomings in our data complicated the analyses. First, occupancy models require detection and non-detection records, but because no fixed species lists were used, there is no unique way to generate non-detections. We generated non-detection records in several ways and ran the models again to evaluate the consequences for occupancy estimates. Second, imbalances in sampling may occur because of a lack of sampling design to select study sites. Most concerning are sites surveyed only once in 2020–2024. We ran the models without those sites to examine whether results were different. Although the shortcomings mentioned turned out not to distort our results, the occupancy estimates were imprecise for many study species because of low detection rates, and extra efforts are needed to improve that.
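The two checks described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(site, year, visit, species)`, the sample records, and the function names are hypothetical, and the rule used here to generate non-detections (any visit that reported some species but not the target counts as a non-detection) is only one of several possible rules. The likelihood is the standard single-season occupancy likelihood with constant occupancy probability `psi` and detection probability `p`, without the covariates the paper tested.

```python
from collections import defaultdict
from math import log

# Hypothetical sighting records: (site, year, visit_id, species).
records = [
    ("A", 2020, 1, "Pongo pygmaeus"),
    ("A", 2020, 2, "other species"),
    ("B", 2020, 3, "Pongo pygmaeus"),
    ("B", 2020, 4, "Pongo pygmaeus"),
    ("C", 2020, 5, "other species"),
]


def detection_histories(records, target):
    """Build 0/1 detection histories per (site, year).

    Assumed rule: every visit that reported at least one species is
    treated as a survey; it scores 1 if the target species was reported
    and 0 (a generated non-detection) otherwise.
    """
    visits = defaultdict(set)  # (site, year, visit) -> species reported
    for site, year, visit, species in records:
        visits[(site, year, visit)].add(species)

    histories = defaultdict(list)
    for site, year, visit in sorted(visits):
        seen = target in visits[(site, year, visit)]
        histories[(site, year)].append(1 if seen else 0)
    return dict(histories)


def drop_single_visit_sites(histories):
    """Remove sites that were visited only once over the whole period."""
    visits_per_site = defaultdict(int)
    for (site, _year), history in histories.items():
        visits_per_site[site] += len(history)
    return {
        key: history
        for key, history in histories.items()
        if visits_per_site[key[0]] > 1
    }


def log_likelihood(histories, psi, p):
    """Single-season occupancy log-likelihood with constant psi and p."""
    total = 0.0
    for history in histories.values():
        detections = sum(history)
        visits = len(history)
        # Site occupied and this exact history observed.
        prob = psi * p**detections * (1 - p) ** (visits - detections)
        if detections == 0:
            # An all-zero history can also come from an unoccupied site.
            prob += 1 - psi
        total += log(prob)
    return total


histories = detection_histories(records, "Pongo pygmaeus")
reduced = drop_single_visit_sites(histories)
print(histories)
print(log_likelihood(histories, psi=0.5, p=0.5))
print(log_likelihood(reduced, psi=0.5, p=0.5))
```

Maximizing `log_likelihood` over `psi` and `p` for the full and the reduced data sets, and again under a different non-detection rule, would mimic the kind of comparison the abstract describes. When `p` is low, all-zero histories carry little information about whether a site is occupied, which is why occupancy estimates become imprecise.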

## Linked entities

- **Species:** Pongo pygmaeus (taxon 9600)

## Full-text entities

- **Chemicals:** palm oil (MESH:D000073878)
- **Species:** Pongo pygmaeus (Bornean orangutan, species) [taxon 9600]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12863681/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12863681/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12863681/full.md

---
Source: https://tomesphere.com/paper/PMC12863681