Finite-sample equivalence in statistical models for presence-only data
William Fithian, Trevor Hastie

TL;DR
This paper clarifies the relationships among the inhomogeneous Poisson process (IPP), Maxent, and logistic regression models for presence-only data, showing where they coincide and where they differ, and introduces a weighted logistic regression method whose estimates match those of the IPP.
Contribution
It shows that IPP and Maxent fit the same exponential family density, shows that logistic regression differs from them in finite samples, and proposes infinitely weighted logistic regression, a practical alternative that reproduces the IPP estimates exactly.
Findings
IPP and Maxent produce identical density estimates.
Logistic regression generally yields different estimates in finite samples, and the gap can be especially large when the model is misspecified.
Infinitely weighted logistic regression matches IPP estimates exactly in finite samples.
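The findings above can be illustrated with a small numerical sketch. It is not the authors' code: the toy data, the single covariate, the function names, and the use of background points as a quadrature rule for the IPP integral are all assumptions made for illustration. Under those assumptions the IPP (and Maxent) slope solves a moment-matching equation over the background points, and a logistic regression that gives each background point a large weight W should approach the same slope as W grows.

```python
import math

def fit_weighted_logistic(pres, back, w_back, iters=100):
    """Newton's method for logistic regression with intercept a and slope b.
    Presence points are y=1 with weight 1; background points are y=0 with
    weight w_back. Returns (a, b)."""
    # Start at the intercept-only maximum likelihood estimate.
    a = math.log(len(pres) / (w_back * len(back)))
    b = 0.0
    data = [(x, 1.0, 1.0) for x in pres] + [(z, 0.0, w_back) for z in back]
    for _ in range(iters):
        ga = gb = haa = hab = hbb = 0.0
        for x, y, w in data:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            r = w * (y - p)          # weighted score contribution
            v = w * p * (1.0 - p)    # weighted information contribution
            ga += r
            gb += r * x
            haa += v
            hab += v * x
            hbb += v * x * x
        det = haa * hbb - hab * hab
        da = (hbb * ga - hab * gb) / det
        db = (haa * gb - hab * ga) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

def fit_ipp_slope(pres, back, iters=100):
    """Slope of a log-linear IPP intensity, with the integral approximated by
    a sum over background points (an assumption of this sketch). The slope
    matches the presence mean to the mean of the tilted background sample."""
    target = sum(pres) / len(pres)
    beta = 0.0
    for _ in range(iters):
        wts = [math.exp(beta * z) for z in back]
        tot = sum(wts)
        mean = sum(w * z for w, z in zip(wts, back)) / tot
        var = sum(w * z * z for w, z in zip(wts, back)) / tot - mean * mean
        step = (target - mean) / var
        beta += step
        if abs(step) < 1e-12:
            break
    return beta

# Hypothetical toy data: one covariate, 8 presence records, 50 background points.
presence = [0.2, 0.5, 0.6, 0.7, 0.8, 0.9, 0.9, 0.95]
background = [j / 49 for j in range(50)]

beta_ipp = fit_ipp_slope(presence, background)
_, b_plain = fit_weighted_logistic(presence, background, w_back=1.0)
_, b_big = fit_weighted_logistic(presence, background, w_back=1e6)

print("IPP slope:                    ", beta_ipp)
print("logistic slope, W = 1:        ", b_plain)
print("logistic slope, W = 1e6:      ", b_big)
```

Only the slope is compared here. The intercept of the weighted fit depends on W and on the number of background points, so it has no direct counterpart in the IPP fit of this sketch.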
Abstract
Statistical modeling of presence-only data has attracted much recent attention in the ecological literature, leading to a proliferation of methods, including the inhomogeneous Poisson process (IPP) model, maximum entropy (Maxent) modeling of species distributions and logistic regression models. Several recent articles have shown the close relationships between these methods. We explain why the IPP intensity function is a more natural object of inference in presence-only studies than occurrence probability (which is only defined with reference to quadrat size), and why presence-only data only allows estimation of relative, and not absolute intensity of species occurrence. All three of the above techniques amount to parametric density estimation under the same exponential family model (in the case of the IPP, the fitted density is multiplied by the number of presence records to obtain a…
