Comparison of Three Statistical Classification Techniques for Maser Identification
Ellen M. Manning, Barbara R. Holland, Simon P. Ellingsen, Shari L. Breen, Xi Chen, Melissa Humphries

TL;DR
This study compares three statistical classification methods—LDA, logistic regression, and random forests—for identifying interstellar masers in astronomical datasets, highlighting their performance and interpretability.
Contribution
It provides a comparative analysis of parametric and non-parametric classification techniques applied to astronomical data for maser identification.
Findings
The parametric methods (logistic regression and LDA) perform best on the small datasets.
On the largest dataset, random forests reach accuracy comparable to the parametric methods.
Data transformation improves LDA accuracy.
Abstract
We applied three statistical classification techniques - linear discriminant analysis (LDA), logistic regression and random forests - to three astronomical datasets associated with searches for interstellar masers. We compared the performance of these methods in identifying whether specific mid-infrared or millimetre continuum sources are likely to have associated interstellar masers. We also discuss the ease, or otherwise, with which the results of each classification technique can be interpreted. Non-parametric methods have the potential to make accurate predictions when there are complex relationships between critical parameters. We found that for the small datasets the parametric methods logistic regression and LDA performed best, for the largest dataset the non-parametric method of random forests performed with comparable accuracy to parametric techniques, rather than any…
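The kind of comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn is available, uses a synthetic dataset from `make_classification` as a stand-in for a real source catalogue, and the feature count, sample size, fold count, and hyperparameters are all hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a source catalogue (assumption, not the paper's data):
# rows are sources, columns are hypothetical continuous features,
# and y marks maser (1) or no maser (0).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=0, random_state=0
)

# The three methods named in the abstract, with illustrative settings.
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each method.
cv_accuracy = {
    name: cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    for name, clf in classifiers.items()
}

for name, acc in cv_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

On interpretability, a fitted `LogisticRegression` or `LinearDiscriminantAnalysis` exposes per-feature coefficients through `coef_`, while a fitted `RandomForestClassifier` exposes `feature_importances_`, which rank features without giving a direction of effect. Accuracies on synthetic data say nothing about the paper's results.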
