A Machine Learning Framework for Handling Unreliable Absence Label and Class Imbalance for Marine Stinger Beaching Prediction
Amuche Ibenegbu, Amandine Schaeffer, Pierre Lafaye de Micheaux, Rohitash Chandra

TL;DR
This paper develops a machine learning framework to predict bluebottle marine stinger presence on beaches, effectively handling class imbalance, unreliable absence data, and class overlap, with Random Forests and synthetic negative data showing the best results.
Contribution
It introduces a novel approach combining data augmentation techniques to address class overlap and unreliable absence labels in marine stinger prediction models.
Findings
SMOTE failed to resolve class overlap
Presence-focused approach effectively handled imbalance
Random Forest with Synthetic Negative Approach was most accurate
Abstract
Bluebottles (Physalia spp.) are marine stingers resembling jellyfish, whose presence on Australian beaches poses a significant public risk due to their venomous nature. Understanding the environmental factors driving bluebottles ashore is crucial for mitigating their impact, yet machine learning tools remain relatively unexplored for this purpose. We use bluebottle marine stinger presence/absence data from beaches in Eastern Sydney, Australia, and compare machine learning models (Multilayer Perceptron, Random Forest, and XGBoost) to identify factors influencing their presence. We address challenges such as class imbalance, class overlap, and unreliable absence data by employing data augmentation techniques, including the Synthetic Minority Oversampling Technique (SMOTE), Random Undersampling, and a Synthetic Negative Approach that excludes the negative class. Our results show that SMOTE…
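The two standard resampling techniques named in the abstract, Random Undersampling and SMOTE, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names, the toy feature values, and the choice to treat presence as the minority class are all assumptions made for illustration. The Synthetic Negative Approach is not sketched because the abstract does not describe it in enough detail.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def random_undersample(X, y, majority_label=0, seed=0):
    """Drop majority-class rows at random until the classes are the same size."""
    rng = random.Random(seed)
    majority = [i for i, label in enumerate(y) if label == majority_label]
    minority = [i for i, label in enumerate(y) if label != majority_label]
    kept = rng.sample(majority, min(len(majority), len(minority)))
    idx = sorted(kept + minority)
    return [X[i] for i in idx], [y[i] for i in idx]


def smote(X_minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points, each interpolated between a
    minority sample and one of its k nearest minority-class neighbours."""
    if len(X_minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_minority)
        others = [p for p in X_minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up toy data (hypothetical features); 1 = presence, 0 = absence.
X = [[1.0, 0.2], [1.2, 0.3], [0.9, 0.1], [1.1, 0.4], [1.3, 0.2], [0.8, 0.3],
     [3.0, 1.5], [3.2, 1.7], [2.9, 1.4]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

X_under, y_under = random_undersample(X, y)
X_presence = [x for x, label in zip(X, y) if label == 1]
X_synth = smote(X_presence, n_new=3, k=2)
```

Because each synthetic point lies on a line segment between two existing minority samples, SMOTE can only fill in the region the minority class already occupies. If that region overlaps the majority class, the new points fall inside the overlap too, which is consistent with the finding that SMOTE did not resolve class overlap.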
Taxonomy
Topics: Maritime Navigation and Safety
Methods: Synthetic Minority Over-sampling Technique
