TL;DR
This paper proposes an anomaly detection approach to automatic quality control (QC) of oceanographic data. Applied to 13 years of hydrographic profiles, it gave the best classification performance and reduced the error by at least 50%, addressing the high false-positive rates of traditional automatic QC.
Contribution
It presents a new multidimensional anomaly detection approach for automatic QC of oceanographic data, implemented in an open source Python package, CoTeDe.
Findings
Reduced error rate by at least 50% using anomaly detection
Outperformed traditional QC methods in classification performance
Applied successfully to 13 years of hydrographic data
Abstract
Sampling errors are inevitable when measuring the ocean; thus, achieving a trustworthy set of observations requires a quality control (QC) procedure capable of detecting spurious data. While manual QC by human experts minimizes errors, it is inefficient for handling large datasets and vulnerable to inconsistencies between different experts. Although automatic QC circumvents those issues, the traditional methods result in high rates of false positives. Here, I propose a novel approach to automatically QC oceanographic data based on the anomaly detection technique. Multiple tests are combined into a single, multidimensional criterion that learns the behavior of the good measurements and identifies bad samples as outliers. When applied to 13 years of hydrographic profiles, the anomaly detection resulted in the best classification performance, reducing the error by at least 50%. An open source…
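The idea of combining several tests into one multidimensional criterion can be illustrated with a minimal sketch. This is not the paper's implementation and does not use the CoTeDe API. It assumes two hypothetical per-sample features (think of the outputs of two QC tests), models each with an independent Gaussian fitted to measurements known to be good, and flags a sample when its combined log-density falls below a threshold taken from the good data. The function names, the Gaussian model, and the 0.1% quantile are all assumptions made for illustration.

```python
import numpy as np


def fit_good_behavior(features):
    """Estimate per-feature mean and standard deviation from good measurements."""
    features = np.asarray(features, dtype=float)
    return features.mean(axis=0), features.std(axis=0)


def anomaly_score(features, mean, std):
    """Sum the per-feature Gaussian log-densities; lower means more anomalous."""
    z = (np.asarray(features, dtype=float) - mean) / std
    log_pdf = -0.5 * z**2 - np.log(std * np.sqrt(2.0 * np.pi))
    return log_pdf.sum(axis=1)


def flag_outliers(features, mean, std, threshold):
    """Return True for samples whose combined score is below the threshold."""
    return anomaly_score(features, mean, std) < threshold


# Synthetic "good" data standing in for two QC test outputs (hypothetical).
rng = np.random.default_rng(0)
good = rng.normal(loc=[0.0, 0.0], scale=[0.1, 0.5], size=(1000, 2))

mean, std = fit_good_behavior(good)

# Assumed threshold: the 0.1% quantile of scores among the good samples.
threshold = np.quantile(anomaly_score(good, mean, std), 0.001)

samples = np.array([
    [0.0, 0.0],  # typical values in both features
    [2.0, 8.0],  # far from the good behavior in both features
])
flags = flag_outliers(samples, mean, std, threshold)
```

Because the score sums evidence across features, a sample that is only moderately unusual in each feature can still be flagged when the combined evidence is strong, which a set of independent per-test thresholds would not capture.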
