Outlier Prediction and Training Set Modification to Reduce Catastrophic Outlier Redshift Estimates in Large-Scale Surveys
M. Wyatt, J. Singal

TL;DR
This paper introduces a method using individual galaxy redshift probability distributions to identify and mitigate catastrophic outliers in photometric redshift estimates, improving accuracy in large-scale surveys.
Contribution
It develops a novel approach combining outlier identification with training set modification to enhance high-redshift estimation and reduce catastrophic outliers.
Findings
Identifies over 30% of catastrophic outliers with a false-positive rate below 7%.
Modifying the training set's redshift distribution reduces outliers at z>1.5 by over 10 percentage points.
In some cases, the modification also lowers the share of non-outliers misclassified as outliers by about 20 percentage points.
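The two headline rates can be made concrete with a small sketch. It assumes, and this summary does not state it, that a catastrophic outlier is a galaxy whose photometric estimate differs from its spectroscopic redshift by more than 1, and that the false-positive figure is the fraction of non-outlier galaxies that get flagged. The function name `flagging_rates`, the `threshold` parameter, and the sample values are hypothetical and for illustration only.

```python
def flagging_rates(z_spec, z_phot, flagged, threshold=1.0):
    """Return (fraction of catastrophic outliers flagged,
               fraction of non-outliers flagged).

    threshold is an ASSUMED outlier definition: |z_phot - z_spec| > threshold.
    """
    is_outlier = [abs(zp - zs) > threshold for zs, zp in zip(z_spec, z_phot)]
    n_out = sum(is_outlier)
    n_non = len(is_outlier) - n_out

    # Count flags that landed on true outliers and on non-outliers.
    caught = sum(1 for o, f in zip(is_outlier, flagged) if o and f)
    false_pos = sum(1 for o, f in zip(is_outlier, flagged) if f and not o)

    identified = caught / n_out if n_out else 0.0
    false_rate = false_pos / n_non if n_non else 0.0
    return identified, false_rate


# Made-up toy sample: six galaxies, two of them catastrophic outliers.
z_spec = [0.3, 0.5, 2.0, 2.5, 0.8, 3.0]
z_phot = [0.35, 0.45, 0.4, 2.4, 0.9, 0.6]
flagged = [False, True, True, False, False, False]

identified, false_rate = flagging_rates(z_spec, z_phot, flagged)
print(identified, false_rate)
```

In this toy sample one of the two outliers is flagged and one of the four non-outliers is flagged in error. Tightening or loosening the flagging criterion trades one rate against the other.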
Abstract
We present results of using individual galaxies' probability distribution over redshift as a method of identifying potential catastrophic outliers in empirical photometric redshift estimation. In the course of developing this approach we develop a method of modification of the redshift distribution of training sets to improve both the baseline accuracy of high redshift (z>1.5) estimation as well as catastrophic outlier mitigation. We demonstrate these using two real test data sets and one simulated test data set spanning a wide redshift range (0<z<4). Results presented here inform an example 'prescription' that can be applied as a realistic photometric redshift estimation scenario for a hypothetical large-scale survey. We find that with appropriate optimization, we can identify a significant percentage (>30%) of catastrophic outlier galaxies while simultaneously incorrectly flagging…
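As a rough illustration of flagging from a per-galaxy probability distribution over redshift, the sketch below marks a galaxy as a potential catastrophic outlier when a large share of its probability lies far from the peak. This criterion is an assumption made for illustration and is not taken from the paper, whose actual flagging rule is not given in this excerpt. The names `flag_from_pdf`, `window`, and `max_outside` are hypothetical, and the redshift grid is assumed to be uniformly spaced so that plain sums stand in for integrals.

```python
def flag_from_pdf(z_grid, pdf, window=1.0, max_outside=0.2):
    """Return (peak redshift, flag) for one galaxy.

    ASSUMED criterion: flag when more than max_outside of the total
    probability lies farther than window from the peak of the distribution.
    """
    total = sum(pdf)
    if total <= 0:
        raise ValueError("pdf must have positive total probability")

    # Use the grid point with the highest probability as the point estimate.
    peak_idx = max(range(len(pdf)), key=pdf.__getitem__)
    z_peak = z_grid[peak_idx]

    outside = sum(p for z, p in zip(z_grid, pdf) if abs(z - z_peak) > window)
    return z_peak, (outside / total) > max_outside


# Uniform grid covering 0 <= z <= 4 in steps of 0.5.
z_grid = [0.5 * i for i in range(9)]

# Made-up distributions: one single-peaked, one with a second peak near z = 3.5.
single_peak = [0.0, 0.1, 0.8, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
double_peak = [0.0, 0.05, 0.5, 0.05, 0.0, 0.0, 0.05, 0.3, 0.05]

z1, flag1 = flag_from_pdf(z_grid, single_peak)
z2, flag2 = flag_from_pdf(z_grid, double_peak)
print(z1, flag1, z2, flag2)
```

Both toy galaxies peak at the same redshift, but only the double-peaked one is flagged, since a sizeable share of its probability sits in the secondary peak. The values of `window` and `max_outside` would need tuning against a test set with known redshifts.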
Taxonomy
Topics: Gamma-ray bursts and supernovae · Data-Driven Disease Surveillance · Cryospheric studies and observations
