Effect of training characteristics on object classification: an application using Boosted Decision Trees
Ignacio Sevilla-Noarbe, Penélope Etayo-Sotos

TL;DR
This paper applies Boosted Decision Trees to classify stars and galaxies in photometric data, demonstrating improved accuracy and analyzing how training choices affect performance in astronomical object classification.
Contribution
It introduces the use of BDTs in optical astronomy for star-galaxy separation and evaluates the impact of input features and training sets on classification effectiveness.
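To make the boosting idea concrete, here is a minimal pure-Python sketch of AdaBoost. It is not the authors' implementation: it boosts decision stumps (depth-one trees) rather than full decision trees, and the four-object dataset, the two unnamed features, and all function names are hypothetical, chosen only so that no single threshold cut separates the classes.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best single cut."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def adaboost(X, y, n_rounds):
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform object weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote of this stump
        ensemble.append((alpha, stump))
        # Up-weight misclassified objects, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score > 0 else -1

# Hypothetical toy catalog: +1 = galaxy, -1 = star.
# An object is a galaxy only when BOTH features are high.
X = [[0.2, 0.2], [0.8, 0.2], [0.2, 0.8], [0.8, 0.8]]
y = [-1, -1, -1, 1]

single_cut = fit_stump(X, y, [0.25] * 4)
single_acc = sum(stump_predict(single_cut, r) == t for r, t in zip(X, y)) / 4

ensemble = adaboost(X, y, n_rounds=3)
boosted_acc = sum(predict(ensemble, r) == t for r, t in zip(X, y)) / 4
print(single_acc, boosted_acc)
```

On this toy set the best single cut misclassifies one object, while three boosting rounds combine cuts on both features and classify all four correctly. A real catalog would need separate training and test samples.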
Findings
BDTs reduce the impurity (contamination) of the galaxy sample by a factor of 2-4 at the same efficiency, relative to the SDSS type photometric classifier.
Training set composition significantly influences classification performance.
Using BDTs surpasses traditional thresholding methods in accuracy.
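The findings compare classifiers by the purity (or impurity, which is 1 - purity) of the selected galaxy sample at a fixed efficiency. As a hedged illustration of how these quantities are commonly defined, the sketch below computes them for two hypothetical selections. The function name and all counts are made up for the example and are not results from the paper.

```python
def efficiency_and_impurity(true_is_galaxy, selected_as_galaxy):
    """Return (efficiency, impurity) for a galaxy selection.

    efficiency = selected true galaxies / all true galaxies
    impurity   = selected non-galaxies  / all selected objects
    """
    picked = [t for t, s in zip(true_is_galaxy, selected_as_galaxy) if s]
    n_good = sum(picked)
    efficiency = n_good / sum(true_is_galaxy)
    impurity = 1 - n_good / len(picked)
    return efficiency, impurity

# Hypothetical truth table: 10 galaxies (1) followed by 10 stars (0).
truth = [1] * 10 + [0] * 10

# Classifier A keeps 9 galaxies and lets in 4 stars.
classifier_a = [1] * 9 + [0] + [1] * 4 + [0] * 6
# Classifier B keeps 9 galaxies and lets in 1 star.
classifier_b = [1] * 9 + [0] + [1] * 1 + [0] * 9

eff_a, imp_a = efficiency_and_impurity(truth, classifier_a)
eff_b, imp_b = efficiency_and_impurity(truth, classifier_b)
print(eff_a, imp_a)
print(eff_b, imp_b)
```

Both toy selections have the same efficiency, so their impurities can be compared directly. This is the sense in which an "improvement in impurity at the same efficiency" is quoted.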
Abstract
We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs, using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well-established machine-learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables, which may be affected by local effects such as blending or badly calculated background levels, or which do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor 2-4 for this…
