# A Morphological Classification Model to Identify Unresolved PanSTARRS1 Sources: Application in the ZTF Real-Time Pipeline

**Authors:** Yutaro Tachibana (Tokyo Institute of Technology), A. A. Miller (Northwestern/CIERA)

arXiv: 1902.01935 · 2019-02-07

## TL;DR

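This paper presents a random forest (RF) model that separates unresolved (point-like) sources from resolved sources in Pan-STARRS1 (PS1) data. The model outperforms the SDSS and PS1 photometric classification models, most clearly for faint sources ($m\gtrsim21\,$mag), and its classifications are used in the Zwicky Transient Facility (ZTF) real-time pipeline to reject stellar sources from the extragalactic alert stream.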

## Contribution

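The authors develop an RF-based morphological classification model for PS1 sources. The model is trained on $\sim$5$\times10^4$ PS1 sources with HST COSMOS morphological classifications, and its performance is assessed using $\sim$4$\times10^6$ sources with SDSS spectra and $\sim$2$\times10^8$ Gaia sources. It relies on 11 "white flux" features that combine PS1 flux and shape measurements across 5 filters to increase the signal-to-noise ratio relative to any individual filter.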
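As a rough illustration of the supervised RF approach, the sketch below trains a scikit-learn `RandomForestClassifier` on synthetic data. Everything in it is an assumption made for illustration: the two features, the synthetic labels (standing in for HST COSMOS morphological classifications), the hyperparameters, and the `THRESHOLD` value are hypothetical and are not the authors' feature set or tuning.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500  # sources per class (synthetic)

# Hypothetical feature 1: a PSF-to-Kron flux ratio. Point sources cluster
# near 1, while resolved sources have more flux outside the PSF model.
ratio_point = rng.normal(1.0, 0.05, n)
ratio_extended = rng.normal(0.7, 0.10, n)

# Hypothetical feature 2: pure noise, included to show that the forest
# can cope with an uninformative column.
noise = rng.normal(0.0, 1.0, 2 * n)

X = np.column_stack([np.concatenate([ratio_point, ratio_extended]), noise])
y = np.concatenate([np.ones(n, dtype=int), np.zeros(n, dtype=int)])  # 1 = unresolved

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Column 1 of predict_proba is the probability of class label 1 (unresolved).
scores = model.predict_proba(X_test)[:, 1]

THRESHOLD = 0.5  # hypothetical cut; a real pipeline would tune this value
is_point_source = scores >= THRESHOLD
```

A continuous score, rather than a hard label, lets downstream users pick their own trade-off between completeness and purity when deciding which sources to treat as stellar.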

## Key findings

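- RF model performs best among the models compared (3 alternatives, including the SDSS and PS1 photometric classification models)
- For faint sources ($m\gtrsim21\,$mag), which dominate the PS1 catalog by number, the RF model significantly outperforms the SDSS and PS1 models
- $\sim$1.5$\times10^9$ sources classified, with the results used in the ZTF real-time pipeline to automatically reject stellar sources from the extragalactic alert stream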

## Abstract

In the era of large photometric surveys, the importance of automated and accurate classification is rapidly increasing. Specifically, the separation of resolved and unresolved sources in astronomical imaging is a critical initial step for a wide array of studies, ranging from Galactic science to large scale structure and cosmology. Here, we present our method to construct a large, deep catalog of point sources utilizing Pan-STARRS1 (PS1) 3$\pi$ survey data, which consists of $\sim$3$\times10^9$ sources with $m\lesssim23.5\,$mag. We develop a supervised machine-learning methodology, using the random forest (RF) algorithm, to construct the PS1 morphology model. We train the model using $\sim$5$\times10^4$ PS1 sources with HST COSMOS morphological classifications and assess its performance using $\sim$4$\times10^6$ sources with Sloan Digital Sky Survey (SDSS) spectra and $\sim$2$\times10^8$ *Gaia* sources. We construct 11 "white flux" features, which combine PS1 flux and shape measurements across 5 filters, to increase the signal-to-noise ratio relative to any individual filter. The RF model is compared to 3 alternative models, including the SDSS and PS1 photometric classification models, and we find that the RF model performs best. By number the PS1 catalog is dominated by faint sources ($m\gtrsim21\,$mag), and in this regime the RF model significantly outperforms the SDSS and PS1 models. For time-domain surveys, identifying unresolved sources is crucial for inferring the Galactic or extragalactic origin of new transients. We have classified $\sim$1.5$\times10^9$ sources using the RF model, and these results are used within the Zwicky Transient Facility real-time pipeline to automatically reject stellar sources from the extragalactic alert stream.
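To make the idea of combining measurements across 5 filters more concrete, here is a minimal sketch of one way to do it: an inverse-variance weighted mean over the filters. The abstract does not state the weighting scheme, so the weights, the function name `white_flux`, and the example PSF-to-Kron ratio are assumptions for illustration and may differ from the paper's feature definitions.

```python
import numpy as np

FILTERS = ("g", "r", "i", "z", "y")  # the five PS1 filters

def white_flux(flux, flux_err):
    """Combine per-filter fluxes into one value per source.

    Assumed scheme: inverse-variance weighted mean over the filters.
    Filters with non-finite values or non-positive errors are skipped.
    Inputs have shape (n_sources, n_filters).
    """
    flux = np.atleast_2d(np.asarray(flux, dtype=float))
    flux_err = np.atleast_2d(np.asarray(flux_err, dtype=float))

    usable = np.isfinite(flux) & np.isfinite(flux_err) & (flux_err > 0)
    safe_err = np.where(usable, flux_err, 1.0)   # avoid dividing by 0 or NaN
    weights = np.where(usable, 1.0 / safe_err**2, 0.0)
    safe_flux = np.where(usable, flux, 0.0)      # keep NaN out of the sums

    wsum = weights.sum(axis=1)
    wflux = (weights * safe_flux).sum(axis=1)

    combined = np.full(wsum.shape, np.nan)
    combined_err = np.full(wsum.shape, np.nan)
    good = wsum > 0                              # at least one usable filter
    combined[good] = wflux[good] / wsum[good]
    combined_err[good] = 1.0 / np.sqrt(wsum[good])
    return combined, combined_err

# One synthetic source, missing its z-band measurement.
psf = np.array([[10.0, 12.0, 11.0, np.nan, 9.0]])
psf_err = np.array([[2.0, 2.0, 2.0, np.nan, 2.0]])
kron = np.array([[12.0, 14.0, 13.0, np.nan, 11.0]])
kron_err = np.array([[2.0, 2.0, 2.0, np.nan, 2.0]])

white_psf, white_psf_err = white_flux(psf, psf_err)
white_kron, white_kron_err = white_flux(kron, kron_err)

# Hypothetical morphology feature: values near 1 suggest a point source.
psf_kron_ratio = white_psf / white_kron
```

With four usable filters of equal uncertainty, the combined uncertainty is half the single-filter uncertainty, which is the signal-to-noise gain the combined features are meant to capture.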

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.01935/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1902.01935/full.md

## References

64 references — full list in the complete paper: https://tomesphere.com/paper/1902.01935/full.md

---
Source: https://tomesphere.com/paper/1902.01935