A search for new symbiotic stars in the Milky Way: Using machine learning techniques applied to photometric databases
V. Contreras Rojas, M. Jaque Arancibia, C.E. Ferreira Lopes, N. Monsalves, R. Angeloni, G. J. M. Luna, V. Marels, D. Concha, N. E. Nunez, C. Saffe, M. Flores

TL;DR
This paper develops a machine learning method that uses photometric data to identify new symbiotic star candidates in the Milky Way, where most of the estimated population remains undiscovered.
Contribution
It introduces a Random Forest model trained on 166 confirmed S-type symbiotic stars and 1600 non-symbiotic stars, and applies it to 2.5 x 10^6 color-selected sources to find new candidates.
Findings
Achieved an 89% F1-score in classifying symbiotic stars.
Identified 990 candidates with >70% probability, refined to 12 high-confidence objects.
Validated the method by recovering 92.3% of recently confirmed systems.
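
To make the two quantities in these findings concrete, the sketch below computes an F1-score for one class and selects sources above a probability threshold. It is only an illustration: the labels and probabilities are invented toy values, and the helper names `f1_score` and `select_candidates` are hypothetical, not taken from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * P * R / (P + R) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_candidates(probabilities, threshold=0.70):
    """Indices of sources whose symbiotic-class probability exceeds the threshold."""
    return [i for i, p in enumerate(probabilities) if p > threshold]


# Toy data, invented for illustration only (1 = symbiotic, 0 = non-symbiotic).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
proba = [0.95, 0.80, 0.40, 0.75, 0.10, 0.30, 0.05, 0.20]
y_pred = [1 if p > 0.5 else 0 for p in proba]

score = f1_score(y_true, y_pred)       # precision = recall = 2/3 here
candidates = select_candidates(proba)  # sources above the 70% cut
```

Because F1 is the harmonic mean of precision and recall, it is a more informative figure than accuracy when the symbiotic class is much rarer than the non-symbiotic one.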
Abstract
Symbiotic stars (SySts) are interacting binaries composed of a red giant transferring material to a hot compact star, typically a white dwarf. Although only about 300 systems are confirmed, the Galactic population is estimated at 1.2 x 10^3 to 1.5 x 10^4, indicating that most remain undiscovered. We identify new SySts using a machine-learning approach that combines Gaia DR3, 2MASS, and WISE photometry, parallaxes, and the pseudo-equivalent width of H alpha. A Random Forest model was trained on 166 confirmed S-type SySts and 1600 non-symbiotic stars, applying SMOTE to mitigate class imbalance. The model achieved an F1-score of 89% for the symbiotic class. Applied to 2.5 x 10^6 color-selected sources, it identified 990 candidates with probabilities above 70%. We further refined the sample using physically motivated cuts on effective temperature, surface gravity, metallicity, and…
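
The abstract mentions SMOTE as the remedy for the imbalance between 166 symbiotic and 1600 non-symbiotic training stars. The following is a minimal pure-Python sketch of the core SMOTE idea: a synthetic minority sample is placed at a random position on the segment between a minority point and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the two-dimensional toy features are assumptions for illustration. The authors' actual implementation is not stated here, and a real analysis would more likely rely on an existing library such as imbalanced-learn.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # fractional position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical 2-D feature vectors (e.g. two colors) for four minority-class stars.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Oversampling of this kind is normally applied to the training split only, so that the synthetic points do not leak into the data used to measure the F1-score.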
