Membership analysis and 3D kinematics of the star-forming complex around Trumpler 37 using Gaia-DR3
Swagat R. Das (1),(2), Saumya Gupta (2), Prem Prakash (3), Manash Samal (4), and Jessy Jose (2) ((1) Departamento de Astronomía, Universidad de Chile, Las Condes, 7591245 Santiago, Chile, (2) Indian Institute of Science Education and Research (IISER) Tirupati

TL;DR
This study uses Gaia-DR3 data and machine learning to identify and analyze the 3D kinematics of star populations in the Trumpler 37 complex, revealing multiple clusters and insights into their formation and expansion.
Contribution
It introduces a new comprehensive membership catalog for Trumpler 37 using Gaussian mixture and random forest methods, and analyzes the complex's kinematics and substructure.
Findings
Identified 1243 probable members, with 60% being new.
Detected multiple clusters and sub-clusters within the complex.
Found slow expansion of the central cluster, indicating recent star formation activity.
| Work | No. of stars | Radius (degree) | RA (J2000) | DEC (J2000) |
|---|---|---|---|---|
| Contreras et al. (2002) | 66 | 0.5 | 21:39:09.89 | +57:30:56.07 |
| Sicilia-Aguilar et al. (2006a) | 172 | 0.6 | 21:37:54.41 | +57:33:15.32 |
| Sicilia-Aguilar et al. (2013) | 67 | 0.25 | 21:37:03.17 | +57:29:05.43 |
| Reach et al. (2004) | 17 | 0.12 | 21:36:33.09 | +57:29:13.83 |
| Sicilia-Aguilar et al. (2006b) | 57 | 0.15 | 21:36:39.73 | +57:29:28.45 |
| Morales-Calderón et al. (2009) | 69 | 0.15 | 21:36:36.32 | +57:29:54.78 |
| Barentsen et al. (2011) | 158 | 1.5 | 21:40:00.43 | +57:26:42.60 |
| Nakano et al. (2012) | 639 | 1.4 | 21:39:48.76 | +57:30:31.56 |
| Getman et al. (2007) | 24 | 0.1 | 21:40:36.73 | +58:15:37.51 |
| Mercer et al. (2009) | 39 | 0.15 | 21:38:54.67 | +57:29:17.61 |
| Getman et al. (2012) | 457 | 0.25 | 21:37:05.85 | +57:32:30.06 |
| Silverberg et al. (2021) | 421 | 0.37 | 21:33:59.30 | +57:29:30.76 |
| Cantat-Gaudin et al. (2018) | 460 | 0.7 | 21:38:58.80 | +57:30:50.40 |
| Star No. | RA (2000) (degree) | DEC (2000) (degree) | RUWE | Parallax (mas) | pmra (mas/yr) | pmdec (mas/yr) | G (mag) | BP (mag) | RP (mag) | Membership probability |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 327.6486 | 57.3557 | 0.922 | 1.204 ± 0.178 | -2.553 ± 0.232 | -3.151 ± 0.194 | 18.80 | 20.60 | 17.56 | 0.716 |
| 2 | 327.6364 | 57.4715 | 1.055 | 1.299 ± 0.157 | -2.162 ± 0.194 | -3.075 ± 0.156 | 18.59 | 20.34 | 17.33 | 0.788 |
| 3 | 327.6372 | 58.4742 | 0.951 | 0.940 ± 0.012 | -1.580 ± 0.016 | -3.890 ± 0.012 | 12.04 | 13.02 | 11.07 | 0.682 |
| 4 | 327.4235 | 57.5640 | 0.999 | 1.032 ± 0.022 | -3.746 ± 0.027 | -4.302 ± 0.024 | 15.18 | 15.95 | 14.30 | 0.682 |
| 5 | 327.3769 | 57.5977 | 1.043 | 1.009 ± 0.013 | -4.326 ± 0.016 | -4.777 ± 0.014 | 11.15 | 11.53 | 10.57 | 0.664 |
| 6 | 327.4919 | 57.6289 | 0.964 | 1.018 ± 0.011 | -3.333 ± 0.012 | -2.184 ± 0.011 | 12.53 | 13.71 | 11.46 | 0.682 |
| 7 | 327.1520 | 57.5298 | 0.962 | 1.017 ± 0.014 | -3.234 ± 0.017 | -4.564 ± 0.015 | 14.04 | 14.82 | 13.16 | 0.780 |
| 8 | 327.2893 | 57.6506 | 0.852 | 1.094 ± 0.012 | -2.117 ± 0.014 | -4.260 ± 0.012 | 11.20 | 11.44 | 10.80 | 0.828 |
| 9 | 327.5741 | 57.8185 | 0.981 | 1.025 ± 0.069 | -1.073 ± 0.076 | -3.113 ± 0.082 | 17.59 | 19.09 | 16.42 | 0.702 |
| 10 | 327.1886 | 57.7081 | 1.014 | 1.049 ± 0.030 | -1.048 ± 0.035 | -4.333 ± 0.033 | 15.98 | 17.29 | 14.84 | 0.766 |
| Parameter | Range | Mean | Median | SD |
|---|---|---|---|---|
| RUWE | | 1.12 | 1.02 | 0.59 |
| Parallax (mas) | 0.834 ± 0.162 to 1.564 ± 0.184 | 1.085 ± 0.003 | 1.078 | 0.109 |
| pmra (mas/yr) | -2.506 ± 0.006 to -0.378 ± 0.015 | -1.194 ± 0.002 | -1.187 | 0.325 |
| pmdec (mas/yr) | -6.011 ± 0.216 to -0.764 ± 0.014 | -4.215 ± 0.004 | -4.404 | 0.712 |
| Cluster | Radius (pc) | No. of stars | RUWE mean | RUWE median | RUWE SD | Parallax mean (mas) | Parallax median (mas) | Parallax SD (mas) | pmra mean (mas/yr) | pmra median (mas/yr) | pmra SD (mas/yr) | pmdec mean (mas/yr) | pmdec median (mas/yr) | pmdec SD (mas/yr) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C-1 | 3.80 | 426 | 1.11 | 1.02 | 0.38 | 1.098 ± 0.006 | 1.084 | 0.118 | -1.329 ± 0.004 | -1.327 | 0.197 | -4.671 ± 0.006 | -4.691 | 0.334 |
| C-1A | 1.72 | 162 | 1.11 | 1.02 | 0.37 | 1.101 ± 0.009 | 1.079 | 0.126 | -1.298 ± 0.007 | -1.309 | 0.159 | -4.690 ± 0.010 | -4.699 | 0.287 |
| C-1B | 1.22 | 80 | 1.12 | 1.02 | 0.53 | 1.108 | 1.103 | 0.116 | -1.442 ± 0.010 | -1.449 | 0.222 | -4.656 ± 0.016 | -4.656 | 0.366 |
| C-2 | 1.40 | 27 | 1.07 | 1.03 | 0.21 | 1.076 ± 0.026 | 1.066 | 0.102 | -1.512 ± 0.019 | -1.608 | 0.274 | -4.825 ± 0.030 | -4.882 | 0.411 |
| C-3 | 2.17 | 60 | 1.11 | 1.02 | 0.38 | 1.047 ± 0.013 | 1.052 | 0.085 | -1.172 ± 0.008 | -1.117 | 0.317 | -3.438 ± 0.014 | -3.266 | 0.580 |
| C-4 | 1.84 | 23 | 1.04 | 1.02 | 0.09 | 1.085 ± 0.021 | 1.080 | 0.097 | -0.831 ± 0.014 | -0.772 | 0.296 | -3.332 ± 0.028 | -3.417 | 0.225 |
| C-5 | 3.76 | 87 | 1.07 | 1.02 | 0.12 | 1.083 ± 0.012 | 1.074 | 0.095 | -0.854 ± 0.007 | -0.816 | 0.141 | -3.401 ± 0.013 | -3.330 | 0.336 |
| Cluster | log(age/yr) mean | log(age/yr) median | log(age/yr) SD | Mass mean ($M_\odot$) | Mass median ($M_\odot$) | Mass SD ($M_\odot$) |
|---|---|---|---|---|---|---|
| Full | 6.17 | 6.25 | 0.49 | 0.68 | 0.43 | 1.01 |
| C-1 | 6.05 | 6.14 | 0.48 | 0.62 | 0.4 | 1.07 |
| C-1A | 6.06 | 6.14 | 0.49 | 0.70 | 0.4 | 1.37 |
| C-1B | 5.98 | 6.07 | 0.47 | 0.60 | 0.37 | 0.93 |
| C-2 | 5.99 | 6.20 | 0.40 | 0.66 | 0.35 | 1.29 |
| C-3 | 5.97 | 5.98 | 0.48 | 0.74 | 0.36 | 1.26 |
| C-4 | 6.33 | 6.34 | 0.30 | 0.44 | 0.45 | 0.20 |
| C-5 | 6.37 | 6.42 | 0.37 | 0.70 | 0.50 | 0.70 |
| Star No. | RA (2000) (degree) | DEC (2000) (degree) | X (pc) | Y (pc) | Z (pc) | U (km/s) | V (km/s) | W (km/s) | u (km/s) | v (km/s) | w (km/s) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 325.1479 | 57.4754 | -165.86 | 998.44 | 90.92 | 22.66 | -18.99 | -13.75 | 33.76 | -6.75 | -6.50 | -5.02 | -0.87 | -0.16 | 1.19 |
| 2 | 324.8110 | 57.3874 | -150.43 | 925.01 | 87.06 | 17.96 | -6.24 | -12.17 | 29.06 | 6.00 | -4.92 | -7.60 | 3.56 | 1.14 | 0.50 |
| 3 | 324.9923 | 57.4759 | -142.16 | 861.63 | 83.00 | 19.24 | -18.27 | -12.72 | 30.34 | -6.03 | -5.47 | 3.72 | -0.40 | 0.11 | 1.75 |
| 4 | 325.0083 | 57.5653 | -170.96 | 1028.83 | 95.00 | 20.73 | -54.06 | -11.33 | 31.83 | -41.82 | -4.08 | -38.90 | 3.59 | 0.29 | -7.38 |
| 5 | 324.6100 | 57.4779 | -135.03 | 832.29 | 83.10 | 16.58 | -14.38 | -11.84 | 27.68 | -2.14 | -4.59 | -0.66 | -0.90 | 0.10 | 3.83 |
| 6 | 324.5352 | 57.4465 | -142.94 | 885.98 | 86.76 | 13.13 | -10.52 | -8.97 | 24.23 | 1.72 | -1.72 | -5.37 | -3.48 | -0.29 | 6.45 |
| 7 | 324.4657 | 57.4479 | -143.82 | 894.13 | 87.71 | 18.56 | -13.26 | -10.34 | 29.66 | -1.02 | -3.09 | -1.56 | -2.31 | -0.42 | 1.64 |
| 8 | 324.7432 | 57.4737 | -145.45 | 891.47 | 86.29 | 21.12 | -20.71 | -14.08 | 32.22 | -8.47 | -6.83 | 6.49 | 0.79 | 0.15 | 0.34 |
| 9 | 324.7699 | 57.4714 | -137.55 | 842.19 | 82.85 | 21.43 | -30.57 | -13.51 | 32.53 | -18.33 | -6.26 | 16.20 | -0.46 | 0.05 | 1.60 |
| 10 | 324.8082 | 57.4760 | -141.29 | 863.34 | 84.10 | 16.27 | 5.70 | -11.68 | 27.37 | 17.94 | -4.43 | -20.46 | 0.53 | 0.17 | 1.10 |
| Cluster | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | (km/s) | |||
| C-1 | 20.47 | -14.33 | -12.75 | 32.57 | -2.09 | -5.50 | 3.04 | 16.15 | 2.03 | 16.56 | 1.11 | -0.06 | 0.07 | 1.02 |
| Parameter | Relative importance |
|---|---|
| RA | 0.010 |
| DEC | 0.009 |
| Parallax | 0.027 |
| pmra | 0.172 |
| pmdec | 0.128 |
| G-mag | 0.102 |
| BP-mag | 0.051 |
| RP-mag | 0.138 |
| BP-RP | 0.140 |
| BP-G | 0.169 |
| G-RP | 0.052 |
Swagat R. Das
Departamento de Astronomı́a, Universidad de Chile, Las Condes,
7591245 Santiago, Chile
Indian Institute of Science Education and Research (IISER) Tirupati,
Rami Reddy Nagar, Karakambadi Road, Mangalam (P.O.),
Tirupati 517507, India
Saumya Gupta
Indian Institute of Science Education and Research (IISER) Tirupati,
Rami Reddy Nagar, Karakambadi Road, Mangalam (P.O.),
Tirupati 517507, India
Prem Prakash
Department of Physics, Indian Institute of Technology (IIT) Hyderabad, India
Manash Samal
Physical Research Laboratory, Ahmedabad, Gujarat, India
Jessy Jose
Indian Institute of Science Education and Research (IISER) Tirupati,
Rami Reddy Nagar, Karakambadi Road, Mangalam (P.O.),
Tirupati 517507, India
Abstract
Identifying and characterizing young populations of star-forming regions is crucial to unravel their properties. In this regard, Gaia-DR3 data and machine learning tools are very useful for studying large star-forming complexes. In this work, we analyze the area of one of our Galaxy’s dominant feedback-driven star-forming complexes, i.e., the region around Trumpler 37. Using the Gaussian mixture and random forest classifier methods, we identify 1243 high-probability members in the complex, of which about 60% are new members; the sample is complete down to a mass limit of 0.1–0.2 $M_\odot$. The spatial distribution of the stars reveals multiple clusters towards the complex, where the central cluster around the massive star HD 206267 reveals two sub-clusters. Of the 1243 stars, 152 have radial velocity measurements, with a mean value of . We investigate the internal and relative motions of stars within the central cluster. The kinematic analysis shows that the cluster’s expansion is relatively slow compared to the whole complex. This slow expansion is possibly due to newly formed young stars within the cluster. We discuss these results in the context of the hierarchical collapse and feedback-induced collapse modes of star formation in the complex.
methods: statistical - stars: pre-main-sequence - open clusters and associations: individual (Trumpler 37)
1 Introduction
Star formation is one of the most complicated yet least understood phenomena in astrophysics. Most stars form in clusters (Blaauw, 1964; Elmegreen, 1983; Lada, 1987; Clarke et al., 2000; Megeath et al., 2004; Bonnell et al., 2008; Portegies Zwart et al., 2010; Gieles & Portegies Zwart, 2011; Bastian et al., 2012) through the fragmentation and hierarchical collapse of molecular clouds (Larson, 1981; Elmegreen & Scalo, 2004; Mac Low & Klessen, 2004; McKee & Ostriker, 2007). Star clusters are unique tracers of galactic properties such as origin, dynamics, and evolution (Kroupa, 2008; Ferraro et al., 2016). In addition, studies of clusters aid in investigating the kinematics, dispersion, and evolution of the star-forming environment (Kuhn et al., 2019; Karnath et al., 2019; Pang et al., 2020). Clusters with massive O- and B-type stars serve as important laboratories for star formation, since these massive stars ionize their surroundings, create H ii regions, and shape the evolution of the low-mass stellar population in their vicinity through feedback effects (Jose et al., 2016; Samal et al., 2014; Das et al., 2017, 2021; Zavagno et al., 2020; Gupta et al., 2021; Pandey et al., 2022). Hence, the identification and characterization of cluster members are essential for investigating star-formation properties, such as whether stars form hierarchically through the natural collapse of clumpy molecular clouds or through the collapse of gas that has been swept up and compressed from the cold neutral medium by H ii regions and bubbles. The distinction between these processes is important for understanding the net outcome of star formation, such as the star formation efficiency (SFE) and star formation rate (SFR) associated with the different modes of star formation (Dale et al., 2012, 2013; Walch et al., 2015).
The Global Astrometric Interferometer for Astrophysics (Gaia; Gaia Collaboration et al. 2016) data has revolutionized the identification and investigation of various scientific properties of the Galactic clusters (Koposov et al., 2017; Gaia Collaboration et al., 2018a; Bossini et al., 2019; Kuhn et al., 2019; Damian et al., 2021). The Gaia-DR2 (Gaia Collaboration et al., 2018b) data contains the five-parameter astrometric solutions (positions, parallax, and proper motions) of 1.3 billion stars down to a G-band magnitude of 21 (Gaia Collaboration et al., 2018b). Compared to Gaia-DR2, Gaia-EDR3 improved the accuracy of proper motion and parallax measurements by factors of 2 and 2.5, respectively (Gaia Collaboration et al., 2020). This improvement in accuracy allows a better separation of cluster members from field stars, especially for distant clusters. The latest data release, Gaia-DR3, retains the astrometry of Gaia-EDR3 but improves on the radial velocity measurements of Gaia-DR2 in terms of both accuracy and number of stars. This work aims to identify the new member population associated with the star-forming complex around Trumpler 37 (Tr 37) in IC 1396 using the multi-dimensional Gaia-DR3 data and machine learning techniques.
This work is arranged as follows. We describe the complex IC 1396 in Section 2. In Section 3, we present the analysis and results of this work. This includes the details of Gaia-DR3 data, the membership analysis using the machine learning approach, and the properties of the identified members. In Section 4, we discuss the various physical properties of IC 1396 derived using new members identified in this work along with literature-based members. We discuss the complex’s 3D kinematic property and star-formation history in Section 5. We summarize our work in Section 6.
2 IC 1396
The star-forming complex around Trumpler 37 (Tr 37; Trumpler 1930) in IC 1396, shown in Figure 1, is one of the classic examples of H ii regions with simple circular morphology, which is a part of the Cepheus OB2 complex (de Zeeuw et al., 1999). IC 1396 has relatively low (5 mag) foreground reddening (Sicilia-Aguilar et al., 2005; Getman et al., 2012; Nakano et al., 2012). The star-forming complex is believed to be powered by the massive star (HD 206267) of spectral type O6 V, located near the center (Stickland, 1995). This H ii region is well known for its association with more than 20 bright-rimmed clouds (BRCs; Sugitani et al. 1991), fingertip structures, and elephant trunk structures in and around them, suggesting feedback effect from the massive central star (Schwartz et al., 1991; Froebrich et al., 2005; Saurin et al., 2012). The well-known BRCs at the peripheries of the H ii region (IC 1396A and IC 1396N) have often been referred to as the best examples of feedback-driven star formation (Sicilia-Aguilar et al., 2004, 2006b; Getman et al., 2007; Choudhury et al., 2010; Sicilia-Aguilar et al., 2013; Panwar et al., 2014; Sicilia-Aguilar et al., 2014, 2019), with many previous studies focused around IC 1396A. Using Gaia-DR2 data of the previously identified members, Sicilia-Aguilar et al. (2019) estimate a distance of , which is consistent within errors with the previous estimate of Contreras et al. (2002). Also, Sicilia-Aguilar et al. (2005) obtained a mean age of of the complex based on the spectroscopically identified members. The modest distance and low foreground reddening make IC 1396 an ideal target for understanding the evolution of the H ii region and exploring the low-mass population associated with the complex.
We present the entire field of view of IC 1396 using the WISE image in Figure 1. The region exhibits a prominent mid-infrared cavity of radius , which signifies the role of UV photons from the associated massive stars towards the gas and dust content of the cluster. BRCs, fingertip, and elephant trunk structures are visible towards the periphery of the H ii region displaying the feedback-driven activity in the region. To better understand the evolution of the host H ii region and its possible impact on the next generation stars associated with BRCs/globules and hence the star formation history of the complex, it is important to identify the total member population of the whole complex. There have been many studies in the past in search of the young stellar objects (YSOs) associated with the complex, however these surveys have different area coverage and sensitivity. A brief detail of the membership analysis from previous works towards the complex is given in the next subsection.
Gaia-DR3, owing to its improvements in photometry, astrometry, and radial velocity measurements over Gaia-DR2, is the best data set for obtaining the member population of the complex and, subsequently, its physical properties.
2.1 Member population from previous studies
The member population identified towards this complex in previous studies can broadly be divided into four categories: spectroscopically identified members (Contreras et al., 2002; Sicilia-Aguilar et al., 2006a, 2013), Spitzer based NIR excess sources (Reach et al., 2004; Sicilia-Aguilar et al., 2006b; Morales-Calderón et al., 2009), sources identified through H$\alpha$ excess emission (Barentsen et al., 2011; Nakano et al., 2012), and X-ray emission sources (Getman et al., 2007; Mercer et al., 2009; Getman et al., 2012). In addition, a more recent analysis by Silverberg et al. (2021) combines near-infrared data from UKIRT with X-ray data from XMM-Newton to identify Class III YSO cluster members in a region covering IC 1396A. Altogether, there are 1791 candidate members identified in the literature. Apart from this, Cantat-Gaudin et al. (2018) analyzed a large number (1229) of Milky Way clusters using the Gaia-DR2 catalog. They used an unsupervised machine-learning technique to detect the member stars, and listed the stars with membership probability greater than as candidate cluster members. For IC 1396, they identified 460 stars within a radius of 0.7 degree centered at RA = 21:38:58.80 and DEC = +57:30:50.40 (Table 1). This region mostly covers the central part of the complex around the massive star HD 206267. Recently, Pelayo-Baldárrago et al. (2022), using Gaia-EDR3 and optical spectroscopic analysis of the complex, provided the distance, age, and distribution of the member sources (this article is in press, hence a detailed comparison of the sources could not be incorporated). In Table 1, we summarize the area covered and the number of stars retrieved in each individual work.
We detect the member stars within the region of radius shown as a white dashed circle in Figure 1 and aim to detect new members of the complex. In Section 3.4, we compare the catalog identified in this work with the literature.
3 Analysis & results
3.1 Data from Gaia-DR3
To obtain the Gaia-based membership of the region, we use the Gaia-DR3 catalog, downloaded from the Gaia archive (https://gea.esac.esa.int/archive/). We retrieve all the sources within the radius centered at and . The search region is shown as the white dashed circle in Figure 1, covering the entire IC 1396 complex. To identify the likely cluster members of this complex, we select sources based on the following criteria. All the selected sources must have positive parallax values (). We consider all the sources with proper motions ranging between and . This constraint on the proper motion values removes a large fraction of contaminants (Gao, 2018a, b). All the sources we consider must have magnitude values in the G, BP, and RP bands. We thus obtain 458875 sources within the region that satisfy all the criteria mentioned above.
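The three selection criteria above can be sketched as a simple filter. This is a minimal illustration, not the authors' code: the `Source` record, the function name `passes_cuts`, and the proper-motion bounds `pm_min` and `pm_max` are hypothetical (the actual bounds are not recoverable from the text), and applying the same bounds to both proper-motion components is an assumption.

```python
from typing import NamedTuple, Optional


class Source(NamedTuple):
    """Hypothetical minimal record for one Gaia source."""
    parallax: float        # mas
    pmra: float            # mas/yr
    pmdec: float           # mas/yr
    g: Optional[float]     # G magnitude, None if missing
    bp: Optional[float]    # BP magnitude, None if missing
    rp: Optional[float]    # RP magnitude, None if missing


def passes_cuts(s: Source, pm_min: float, pm_max: float) -> bool:
    """Return True if the source satisfies all three selection criteria.

    pm_min and pm_max are placeholder bounds (mas/yr) supplied by the caller.
    """
    has_photometry = None not in (s.g, s.bp, s.rp)
    pm_ok = pm_min <= s.pmra <= pm_max and pm_min <= s.pmdec <= pm_max
    return s.parallax > 0 and pm_ok and has_photometry
```

Keeping the bounds as arguments rather than constants makes it explicit that they are inputs to be taken from the paper, not values asserted here.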
Following the histogram turnover method (Winston et al., 2007; Jose et al., 2013, 2017; Getman et al., 2017; Damian et al., 2021), we obtain the 90% photometry completeness limits of G, BP, and RP bands to be 20.5, 21.5, and 19.5 mag, respectively. This is in agreement with the survey completeness, which is between and (Gaia Collaboration et al., 2020). The corresponding mass completeness limits are estimated in Section 4.2.
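A minimal sketch of the turnover idea, assuming that the turnover is taken as the centre of the most populated magnitude bin. The function name `turnover_magnitude` and the default bin width are illustrative assumptions; how the turnover is mapped to a 90% completeness limit follows the works cited above and is not reproduced here.

```python
import math
from collections import Counter


def turnover_magnitude(mags, bin_width=0.5):
    """Return the centre of the most populated magnitude bin.

    Ties are resolved in favour of the brighter (smaller-magnitude) bin.
    """
    counts = Counter(math.floor(m / bin_width) for m in mags)
    peak_bin = max(counts, key=lambda b: (counts[b], -b))
    return (peak_bin + 0.5) * bin_width
```

In practice the result depends on the bin width, so it is worth checking that the turnover is stable under a few different choices.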
3.2 Membership analysis
Detecting the membership of a star-forming region is the first step towards analyzing its various star-formation properties. If the regions are large (e.g., IC 1396, Lupus) or the regions are not isolated, then the identification of members is not straightforward. Several authors have used different methods to achieve this. Here we briefly summarize the different methods of segregating the member stars from the field population. Pioneering works of Sanders (1971); Vasilevskis et al. (1958) adopt the probability measurements of stars using their proper motions to confirm their membership. In these works, they modeled the distribution of stars in the vector point diagram (VPD) using a bi-variate Gaussian mixture model (GMM). Later, adding the celestial coordinates of stars to their proper motions, Kozhurina-Platais et al. (1995) refined the membership probabilities. Some researchers selected the stars by partitioning data space into bins (Platais, 1991; Lodieu et al., 2012). In another work, Balaguer-Núñez et al. (2007) tried to separate the cluster members from the field stars based on their probability density in their VPD space. The broadband photometry is also considered as a tool to separate the cluster members from field stars with the help of color-magnitude (CMD) and color-color diagram (CCD) (Deacon & Hambly, 2004; Balaguer-Núñez et al., 2007). Krone-Martins & Moitinho (2014) have developed a method of computing membership probabilities in an unsupervised manner from the combination of celestial coordinates and photometric measurements. Their method is unsupervised photometric membership assignment in stellar clusters (UPMASK). The method of Sarro et al. (2014); Olivares et al. (2019) uses astrometric and photometric features of the stars for membership analysis. Then they apply the GMM with different components to model the field population and follow the Bayesian information criteria to choose a model. 
The cluster is then modeled with a GMM in astrometric space and a principal curve in photometric space. Several recent works have used this methodology for membership analysis (Galli et al., 2020, 2021). Several works in the recent past have also combined unsupervised and supervised computation of membership probabilities. In these works, the unsupervised GMM is used to generate a first catalog, which then serves as input for the supervised computation of membership probability. These works used the random forest (RF; Breiman 2001; Pedregosa et al. 2012) classifier, a machine-learning algorithm, for the supervised computation of membership probability. Recently, Muzic et al. (2022) used various CMDs along with the RF classifier to obtain the membership of NGC 2244.
So both astrometry and photometric properties of stars play a crucial role in identifying member populations. As discussed, unsupervised and supervised membership probability estimation works efficiently and effectively. The crucial part of this method is preparing a training set, which comes through the unsupervised estimation, the GMM method. However, GMM suffers difficulty in filtration if the field contamination is relatively high. A safe way to overcome this difficulty is to combine the photometric properties in CMDs to obtain a set of stars, which can be used for the supervised membership probability estimation. Our present analysis uses the various CMDs to refine the member population obtained from the GMM method. We use a few CMDs and theoretical isochrones with prior knowledge about the nature of the star-forming complex from earlier studies. This helps to derive a cleaner member data set, which is used as a training set to derive the supervised membership probability using the RF classifier. We discuss the application of both GMM and RF in the following. More detail about the GMM method is explained in Appendix A.
3.2.1 Applying the Gaussian Mixture Model
We use five parameters (proper motions, parallax, and positions) for our clustering analysis using GMM. We use neither the errors of these parameters nor the magnitude and color values as input parameters, since they do not follow Gaussian distributions. The GMM method fails drastically in cluster identification if we apply it to all stars (i.e., the 458875 sources within the whole area). This is one of the significant limitations of the GMM method, which is also observed in other analyses (Gao, 2018a, b). The possible reasons for this failure are described by Cabrera-Cano & Alfaro (1990). They pointed out that a very high ratio of field stars to member stars can cause problems in clustering analysis using GMM. The other possible reason is that the field stars do not follow a Gaussian distribution.
To avoid the above issues, we apply GMM to a small sample with minimal field star contamination. Obtaining the member population is not straightforward when dealing with a large star-forming complex such as IC 1396, whose radius is . The reason is that the member populations of IC 1396 might not follow a single Gaussian distribution in their proper motion parameters, unlike an isolated cluster. We therefore have to choose a small region very carefully, such that the astrometric and photometric properties of its stars represent the whole complex while the field star contamination is kept as low as possible. In this work, we choose a conservative small central circular region of radius around the coordinate mentioned in Section 3.1. We also use information from previous studies to minimize field star contamination in this region. The previous studies suggest the distance of IC 1396 to be (Contreras et al., 2002; Sicilia-Aguilar et al., 2019). We therefore run the GMM algorithm only on stars that lie between 700 pc and 1100 pc, so that stars outside this distance range can be safely discarded. With these conditions, there are 6263 stars within the circular region of radius . We apply GMM to these 6263 stars and, based on the unsupervised membership estimation, retrieve an initial sample of member stars, which is then used for the membership analysis based on supervised probability computation with the RF method.
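The 700 to 1100 pc distance window can be expressed directly in parallax space. The sketch below assumes the simple inverse-parallax relation d(pc) = 1000 / parallax(mas); whether the authors used this relation or a different distance estimator is not stated in the text, and the function names are illustrative.

```python
def parallax_window_mas(d_min_pc=700.0, d_max_pc=1100.0):
    """Convert a distance window (pc) into a parallax window (mas).

    Uses d = 1000 / parallax, so the far edge gives the smaller parallax.
    """
    if not 0 < d_min_pc < d_max_pc:
        raise ValueError("need 0 < d_min_pc < d_max_pc")
    return 1000.0 / d_max_pc, 1000.0 / d_min_pc


def in_distance_window(parallax_mas, d_min_pc=700.0, d_max_pc=1100.0):
    """Return True if the inverse-parallax distance lies inside the window."""
    plx_min, plx_max = parallax_window_mas(d_min_pc, d_max_pc)
    return plx_min <= parallax_mas <= plx_max
```

Under this assumption the window corresponds to parallaxes between roughly 0.91 and 1.43 mas.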
Since the stars can broadly be separated into two groups, cluster members and field contaminants, we apply the GMM method with two components to these 6263 stars. We retrieve 3760 stars with , and the remaining 2503 stars are mostly non-members belonging to the field star population. A few possible combinations of CMDs and the VPD of these 6263 stars (gray), along with the 3760 stars extracted by GMM (black), are shown in Figure 2. As seen from the VPD (Figure 2(a)), the 3760 stars populate the central black region. This is expected, since the member stars of a region usually lie within a narrow circular distribution in the VPD. However, the VPD and the distribution of the 3760 stars on the CMDs show that this sample is still contaminated. There are a few probable reasons for this. In this analysis, we do not apply any constraint on the magnitude of stars, in order to retain as many member stars as possible at the faint end. However, the fainter stars have higher uncertainties and are less reliable. The other possible reason is that, in a giant star-forming region such as IC 1396, the member stars might have a somewhat wider distribution in proper motion than those of an isolated stellar cluster, which again increases the chance of contamination in the member star population. A second check is therefore required to minimize contamination among the 3760 stars extracted by the GMM method. For this, we use various CMDs, shown in Figure 2. Although the cluster associated with IC 1396 has a mean age of 2–4 Myr (Sicilia-Aguilar et al., 2005), there is a spread in age up to 10 Myr for some stars, so here we consider only those sources younger than 10 Myr as members. This further removes a significant fraction of contaminating stars from the member population. There are 577 stars left, which are more reliable members. These 577 stars are shown as blue dots in Figure 2. 
The selected member stars show a distribution that is largely indistinguishable from that of the field stars, likely due to the large number of field stars along the line of sight compared to the small number of cluster stars. However, compared to the distribution of field stars, the distribution of member stars peaks at different locations and occupies a confined region in the VPD. As a training sample for the RF method, we keep the 577 stars as member stars and the 2503 stars as non-member stars.
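To illustrate the two-component idea, the following is a deliberately simplified, pure-Python expectation-maximization sketch for a one-dimensional, two-component Gaussian mixture. It is not the five-parameter GMM used in this work (described in Appendix A); the function names, the deterministic initialization, and the variance floor are all assumptions made for the example.

```python
import math


def normal_pdf(x, mean, var):
    """Probability density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, n_iter=200):
    """Fit a two-component 1D Gaussian mixture with EM.

    Returns (weights, means, variances, resp), where resp[i][k] is the
    posterior probability that values[i] belongs to component k.
    """
    lo, hi = min(values), max(values)
    # Deterministic starting point: means at the quartile positions of the range.
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    start_var = max((hi - lo) ** 2 / 16.0, 1e-6)
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities for each value.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in (0, 1)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])
        # M-step: update weight, mean, and variance of each component.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances, resp
```

A star would then be kept as a candidate member when its posterior probability for the cluster component exceeds a chosen threshold. For real multi-dimensional data a library implementation is the more sensible choice.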
3.2.2 Applying the random forest classifier method
In this section, we apply a supervised machine-learning technique, the RF classifier, to identify the membership of the entire complex. This technique is an ensemble of machine-learning decision trees for classification and regression tasks. Due to its robustness, the RF technique is widely used in astrophysics (Dubath et al., 2011; Brink et al., 2013; Liu et al., 2017; Lin et al., 2018; Plewa, 2018; Gao, 2018b, a; Mahmudunnobe et al., 2021). In this work, we use the python-based RF classifier available in the scikit-learn package (https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html).
Before using RF on the total population to identify member stars, we need to train the machine, as described in Appendix B. After checking RF’s efficiency, we run the RF method to obtain the most probable population of the whole complex. The relative importance of the parameters in separating the member and non-member stars is also listed in Appendix B. After training the machine with the training set retrieved from the GMM method, we run the RF classifier on the 458875 stars located in the direction of IC 1396. From these stars, we need to retrieve the most reliable member population of the complex. As described in Appendix B, a few color and magnitude terms also become important in segregating members from non-members during training. To make the detection more robust, we use the parallax to filter out non-member stars. Here, we run RF on the stars (), which lie within the parallax range of 0.8 to 1.6 mas. With this restriction, the color and magnitude parameters can be used effectively; without it, they could admit a larger number of unlikely sources.
RF provides a membership probability for each star based on its training in the previous step. In our analysis, we retrieve 1803 likely members with a probability value of . Details of the 1803 likely member stars are listed in Table 2. Of these 1803 stars, 1243 have a high probability value, above a threshold of 0.8. Hereafter we use these highly probable candidate members for follow-up analysis. In this work, the massive star HD 206267 has . HD 206267 is a multiple-star system of spectral type O5V–O9V and an older member of the cluster (Peter et al., 2012; Maíz Apellániz & Barbá, 2020). The RUWE, parallax, pmra, and pmdec of the star are 5.07, 1.360 ± 0.218 mas, -1.951 ± 0.120 mas/yr, and -5.493 ± 0.281 mas/yr, respectively. The multiplicity of the system likely explains its high RUWE value and its proper motions. The radial velocity of the star is (Brandt, 2021), which is well within the radial velocity distribution of the member stars (Figure 11). This is a direct confirmation of its membership. Also, many earlier studies using multi-wavelength data sets show the connection of the massive star with the star-forming complex (Patel et al., 1995; Getman et al., 2012; Sicilia-Aguilar et al., 2014, 2015, 2019). In Figure 3, we plot the proper motion VPD for all the 458875 stars. The member stars with identified by RF are shown as blue dots. This plot shows that the members are concentrated within a narrow range of proper motion values.
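The two-tier selection on the RF membership probability can be sketched as below. The function name `split_by_probability` is illustrative. The high-probability threshold of 0.8 is taken from the RUWE discussion in Section 3.3; the lower threshold is not recoverable from the text, so it is left as a required argument, and the use of "greater than or equal" is an assumption.

```python
def split_by_probability(probabilities, p_likely, p_high=0.8):
    """Return (likely, high): index lists for the two probability tiers.

    p_likely is the lower threshold (caller-supplied placeholder);
    p_high is the threshold for the high-probability sample.
    """
    if p_likely > p_high:
        raise ValueError("p_likely must not exceed p_high")
    likely = [i for i, p in enumerate(probabilities) if p >= p_likely]
    high = [i for i in likely if probabilities[i] >= p_high]
    return likely, high
```

For example, applying it to the ten membership probabilities listed in Table 2 with a hypothetical lower threshold of 0.6 keeps all ten stars as likely members and only star 8 (probability 0.828) in the high-probability tier.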
The G versus G-RP CM diagram is shown in Figure 9 for the member stars () within the region of radius (shown in Figure 1). The identified member stars trace a well-defined pre-main-sequence locus on the CM diagram. In Figure 4, we over-plot the likely members on the WISE image, highlighting their distribution as a function of their value. An over-density of sources is visible in the central part of IC 1396. Within the complex, the stars display a diagonal distribution ranging from the BRC IC 1396A to IC 1396N. Most of the stars are clustered around the massive star HD 206267, shown as the white ‘’ symbol in the figure. IC 1396N is also associated with a small cluster. A tiny clustering of stars is also visible towards the tip of BRC SFO 39. A small fraction of stars is randomly distributed all around the complex. A clustering of stars is also found towards the northern periphery of the complex. Overall, the stellar density is higher towards the west than the east of the complex.
3.3 Characteristics of the member stars
In Figure 5, we show histogram distributions of the RUWE (renormalised unit weight error), parallax, and proper motions of the member stars detected in this work. Table 3 provides the range of these parameters. The RUWE parameter measures the quality of the astrometric solution. A RUWE value of around 1.0 is expected for sources where the single-star model provides a good fit to the astrometric observations. Stars with RUWE greater than 1.4 are considered resolved doubles (Gaia Collaboration et al., 2020). In our list of selected members, only 144 and 82 stars have RUWE greater than 1.4, from the list with , and 0.8, respectively. These sources with higher RUWE could be multiple-star systems. The stars detected in this work are therefore of good astrometric quality. Out of the 1243 stars, stars have relative parallax error less than .
Figure 5 (b) displays the histogram distribution of parallaxes for all these identified member stars. In parallax, the stars detected in this work lie within a spread of 0.8 mas with mean, median, and standard deviation values of , 1.078 mas, and 0.109 mas, respectively. The distance to the cluster is estimated using the parallax values of those sources whose relative parallax error () is better than 20% and . Out of 1243, we find 1107 stars satisfy this condition. From these 1107 stars, we estimate the weighted mean parallax to be , which translates to a distance of . This distance estimate matches well with earlier estimates in literature (Contreras et al., 2002; Sicilia-Aguilar et al., 2019; Pelayo-Baldárrago et al., 2022).
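The distance estimate above can be sketched in a few lines. This is only an illustration: the text does not state which weights are used, so inverse-variance weights are assumed here, only the 20% relative-error cut is applied (the second selection condition is not reproduced), and the toy parallaxes are hypothetical.

```python
import numpy as np

def weighted_mean_parallax(plx_mas, plx_err_mas, max_rel_err=0.2):
    """Inverse-variance weighted mean parallax (mas) of well-measured stars.

    Assumption: weights are 1/sigma^2; the paper only says "weighted mean".
    """
    plx = np.asarray(plx_mas, dtype=float)
    err = np.asarray(plx_err_mas, dtype=float)
    # Keep positive parallaxes whose relative error is better than max_rel_err.
    good = (plx > 0) & (err / plx < max_rel_err)
    w = 1.0 / err[good] ** 2
    mean = np.sum(w * plx[good]) / np.sum(w)
    mean_err = np.sqrt(1.0 / np.sum(w))
    return mean, mean_err, int(good.sum())

def parallax_to_distance_pc(plx_mas):
    """Simple inversion: d [pc] = 1000 / parallax [mas]."""
    return 1000.0 / plx_mas

# Hypothetical toy values; the last star fails the 20% cut.
toy_plx = [1.10, 1.05, 1.12, 0.90]
toy_err = [0.05, 0.10, 0.05, 0.40]
mean_plx, mean_plx_err, n_used = weighted_mean_parallax(toy_plx, toy_err)
toy_distance_pc = parallax_to_distance_pc(mean_plx)
```

Inverting the mean parallax is adequate here because the mean is well constrained; for individual stars with large relative errors, a simple inversion would be biased.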
In Figure 5 (c) and (d), we show the histogram distributions of the proper motions (). We derive the mean, median, and standard deviation values for to be , , and , respectively. For , these values are , , and , respectively.
3.4 Comparison with literature
In this section, we compare our detected member stars with the sources detected in the literature. As discussed in Section 2.1, there are 1791 stars detected towards the complex based on various surveys. Also, using Gaia-DR2 data, Cantat-Gaudin et al. (2018) detected 460 stars towards IC 1396. We compare our findings separately with the source lists found in the literature.
To compare with the sources of various surveys, we first find their Gaia-DR3 counterpart information. Out of the 1791 stars, 1002 stars have Gaia counterparts. Then we refine the catalog further based on the astrometry quality. Thus we use the 705 stars, which have a relative parallax error of , for comparison. Of the 705 stars, 360 stars () are retrieved in our work as member stars with . The number is 409 () with . Due to their poor membership probability, the remaining stars are not detected as members.
Then we compare our member list with the 460-star list of Cantat-Gaudin et al. (2018). Within the common area, out of the 460 stars, we retrieved 348 () stars in this work with . The number is 389 () with . In this work, we consider only the stars with a higher probability of . In Cantat-Gaudin et al. (2018), they considered all the stars with membership probability above . So the stars, with higher probability, are retrieved in our work. In our work, we identify more member stars than Cantat-Gaudin et al. (2018) mainly due to the large area we consider.
We then also compare the source list obtained from the various surveys (Section 2.1) with the stars detected by Cantat-Gaudin et al. (2018). Here too, we consider the 705 good-quality stars for comparison. In this case, we find 221 () survey-based stars in common with the catalog of Cantat-Gaudin et al. (2018). There are 196 stars common to all three catalogs discussed here. We summarize the analysis as a Venn diagram (Figure 6).
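The recovery fractions implied by the counts quoted in this section can be recomputed directly. The sketch below uses only those counts; the percentages are recomputed here rather than copied from the paper, and the variable names are illustrative. The smaller count in each pair is taken to correspond to the stricter probability cut.

```python
def recovery_percent(n_recovered, n_reference):
    """Percentage of a reference list that is recovered in another list."""
    return 100.0 * n_recovered / n_reference

# Survey-based stars with good astrometry (705) recovered as members.
survey_strict = recovery_percent(360, 705)   # stricter probability cut
survey_loose = recovery_percent(409, 705)    # looser probability cut

# Cantat-Gaudin et al. (2018) stars (460) recovered as members.
cg18_strict = recovery_percent(348, 460)
cg18_loose = recovery_percent(389, 460)

# Survey-based stars also present in Cantat-Gaudin et al. (2018).
survey_in_cg18 = recovery_percent(221, 705)

for label, value in [
    ("survey, strict cut", survey_strict),
    ("survey, loose cut", survey_loose),
    ("CG18, strict cut", cg18_strict),
    ("CG18, loose cut", cg18_loose),
    ("survey in CG18", survey_in_cg18),
]:
    print(f"{label}: {value:.1f}%")
```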
4 Properties of the complex
4.1 Sub-clusters within the complex
The spatial distribution of the 1243 stars (Figure 4) shows clustering associated with IC 1396. In this section, we attempt to identify the clusters quantitatively. To do this, we generate the surface density plot using the 1243 member stars and apply the nearest neighbor (NN) method (Casertano & Hut, 1985; Schmeja, 2011). According to this method, the j-th nearest neighbour density is defined as
$$\rho_j = \frac{j - 1}{S(r_j)}$$
where $r_j$ is the distance to its j-th nearest neighbour and $S(r_j)$ is the surface area with radius $r_j$. To obtain the distribution of member stars, we use , which is found to be an optimum value for cluster identification (Schmeja et al., 2008; Ramachandran et al., 2017; Damian et al., 2021). With this procedure, we generate the stellar density map with a pixel size of 0.1 pc (). Figure 7 shows the WISE 22 μm map overlaid with density contours. The lowest contour is at 0.6 stars pc$^{-2}$, within which the maximum number of sources falls. These stellar density contours reveal the clustering of stars towards the star-forming complex.
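A minimal sketch of the nearest-neighbour density map described above is given below. It assumes the surface area is that of a circle, $S(r_j) = \pi r_j^2$, and evaluates the density at the centre of each pixel. The function and argument names are illustrative, and the brute-force distance matrix is only practical for small grids; a KD-tree would be the natural choice for a full map.

```python
import numpy as np

def nn_density_map(star_x, star_y, grid_x, grid_y, j):
    """j-th nearest-neighbour surface density on a grid.

    rho_j = (j - 1) / (pi * r_j**2), where r_j is the distance from a
    pixel centre to its j-th nearest star. Requires j >= 2.
    """
    stars = np.column_stack([star_x, star_y]).astype(float)
    gx, gy = np.meshgrid(np.asarray(grid_x, float), np.asarray(grid_y, float))
    pix = np.column_stack([gx.ravel(), gy.ravel()])
    # Distance from every pixel centre to every star (brute force).
    d = np.sqrt(((pix[:, None, :] - stars[None, :, :]) ** 2).sum(axis=-1))
    # After sorting, column j - 1 holds the distance to the j-th nearest star.
    r_j = np.sort(d, axis=1)[:, j - 1]
    rho = (j - 1) / (np.pi * r_j ** 2)
    return rho.reshape(gy.shape)

# Toy check: four stars on the corners of a unit square, one pixel at its centre.
toy_map = nn_density_map([0, 1, 0, 1], [0, 0, 1, 1], [0.5], [0.5], j=3)
```

If positions are given in parsecs, the map is in stars per square parsec, which is the unit in which contour levels and detection thresholds are naturally expressed.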
To identify the clusters in this region, we use the astrodendro algorithm (Robitaille et al., 2019) in Python. This algorithm constructs tree structures starting from the brightest pixels in the dataset and progressively adding fainter and fainter pixels. It requires the threshold flux value (minimum value), the contour separation (min delta), and the minimum number of pixels required for a structure to be considered a cluster. In our analysis, we use a threshold and minimum delta of 1.0 and 0.3 stars pc$^{-2}$, respectively. We use a minimum number of pixels of 150 to detect the potential clusters. These parameters are adopted after multiple trials for optimal detection of clusters. We identify six individual leaf structures with these input parameters, which we call clusters here. Two individual clusterings (C-1A and C-1B) are seen towards the massive star HD 206267; together they form the central cluster (C-1) of this complex. Towards the tail of BRC IC 1396A, another grouping (C-2) of stars is seen. Apart from these clusters towards the central part, three more clusters are seen. Two of them (C-3 and C-4) are linked with the BRCs IC 1396N and SFO 39, respectively. We also detect a cluster (C-5) close to the boundary of the star-forming complex. The cluster identification in our work matches well with the clusters identified by Nakano et al. (2012) from their emission line survey. In their work, a cluster is associated with the southern BRC SFO 37. However, in our analysis, we do not see any such cluster towards SFO 37, which could be due to the sensitivity of Gaia. The cluster C-5, which we detect in this work, was not seen by Nakano et al. (2012), which could be because their survey covered a smaller area than this study.
In Table 4, we list the statistics (radius, number of stars, mean, median, and standard deviation) of the RUWE, parallax, and proper motions for all the identified clusters. We derive the physical radius (; Das et al. 2017) of the clusters using the apertures retrieved from astrodendro. The area of each cluster is calculated as $A = N \times A_{\rm pix}$, where N is the number of pixels and $A_{\rm pix}$ is the area of each pixel. The distribution of parallax and proper motions of the cluster stars is also displayed in Figure 8. This plot shows two groupings. One is the larger group, mainly consisting of the stars of the C-1 and C-2 clusters, and the second is a smaller group made up of stars from the other three clusters (C-3, C-4, and C-5). This is also evident from the histogram distribution of the proper motion (Figure 5(d)). To quantitatively confirm our findings, we carry out a two-sample Kolmogorov-Smirnov (KS) test with parallax and proper motions. The score from the test is minimal and close to zero for the proper motions. For parallax, the score is 0.02. This quantitatively confirms that the proper motion parameters are the distinctive astrometric features, distinguishing the stars projected in the two sub-groups seen in Figure 8.
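The two-sample KS comparison used above rests on the maximum distance between the empirical cumulative distributions of the two groups. The following is a small sketch of that statistic only; the "score" quoted in the text is presumably the associated p-value, which is not computed here (if SciPy is available, `scipy.stats.ks_2samp` returns both the statistic and the p-value). The sample values are hypothetical.

```python
import numpy as np

def ks_two_sample_statistic(a, b):
    """Maximum distance between the empirical CDFs of samples a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Hypothetical proper-motion values (mas/yr) for an inner and an outer group.
inner_group = [-4.9, -4.7, -4.6, -4.5, -4.4]
outer_group = [-3.6, -3.4, -3.3, -3.1]
d_stat = ks_two_sample_statistic(inner_group, outer_group)
```

A statistic close to 1 means the two distributions barely overlap, while a value near 0 means they are nearly indistinguishable.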
4.2 Age and mass range of the candidate cluster members
In this section, we estimate the mean age and mass completeness limit of the member stars identified in this analysis. Studies like Sicilia-Aguilar et al. (2005); Getman et al. (2012) and references therein claim an approximate age of 4 Myr for the primary cluster. To estimate the member population’s age and mass completeness limit, we use the PARSEC isochrones available for the filters of Gaia-DR3 (Chen et al., 2014). We need to correct the isochrones for distance and extinction to fit them. In an earlier study using NIR and optical data, Sicilia-Aguilar et al. (2005) have derived the average visual extinction value towards the entire complex to be . This value also matches the estimations by Contreras et al. (2002) and Nakano et al. (2012). The majority of detected stars in this work are located towards the central part of IC 1396, which is expected to be of less extinction due to the presence of massive star(s) around them compared to the surrounding regions such as BRCs, which are associated with the dense molecular clouds. For further analysis, we use the minimum extinction value of obtained from Nakano et al. (2012).
After correcting for distance (917 pc) and extinction (), we plot the isochrones of various ages on the G versus G-RP CMD in Figure 9. To correct the extinction in individual bands for all the sources, we use the empirical relations of Gaia Collaboration et al. (2018c) and Bossini et al. (2019). In Figure 9, we plot isochrones of evolutionary ages 0.1, 0.5, 2, and 10 Myr along with the evolutionary tracks corresponding to 0.09, 0.3, 0.5, 1, and 2 $M_\odot$. From Figure 9, we derive the age of individual stars by assigning the age of the closest isochrone. Similarly, we derive the mass of individual stars by assigning the mass of the closest evolutionary track. However, local variation in extinction and binarity of stars might affect the accuracy of these estimates. In Figure 10, we show the histogram distribution of the logarithmic values of the age. By fitting a Gaussian curve to the distribution, we obtain the mean logarithmic age of the cluster to be , which corresponds to a mean age of Myr. Using the upper limit of extinction, i.e., , the mean age is found to be Myr, which still agrees with the previous studies.
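The assignment of an age by the closest isochrone can be sketched as follows. This is an illustration under stated assumptions: "closest" is taken to mean the smallest Euclidean distance in the (G-RP, G) plane, the isochrones are assumed to be already shifted for distance and extinction, and the toy isochrone points are hypothetical rather than PARSEC values.

```python
import numpy as np

def nearest_isochrone_age(color, mag, isochrones):
    """Return the age (Myr) of the isochrone passing closest to (color, mag).

    isochrones: dict mapping age in Myr -> sequence of (G-RP, G) points.
    Assumption: closeness is plain Euclidean distance in the CMD.
    """
    best_age, best_d2 = None, np.inf
    for age, track in isochrones.items():
        track = np.asarray(track, dtype=float)
        d2 = np.min((track[:, 0] - color) ** 2 + (track[:, 1] - mag) ** 2)
        if d2 < best_d2:
            best_age, best_d2 = age, d2
    return best_age

# Hypothetical, coarsely sampled isochrones (not real model values).
toy_isochrones = {
    0.5: [(1.0, 14.0), (1.2, 16.0)],
    2.0: [(1.0, 15.0), (1.2, 17.0)],
    10.0: [(1.0, 16.0), (1.2, 18.0)],
}
toy_age = nearest_isochrone_age(1.0, 15.1, toy_isochrones)
toy_log_ages = np.log10([nearest_isochrone_age(c, m, toy_isochrones)
                         for c, m in [(1.0, 15.1), (1.2, 16.1), (1.2, 17.9)]])
```

The same function applies to mass by passing evolutionary tracks keyed by mass instead of isochrones keyed by age. With only a handful of isochrones the resulting ages are discrete, so a finely sampled grid is needed before fitting a Gaussian to the log(age) histogram.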
As discussed in Section 3.1, we see that the 90% completeness limits of G, BP, and RP bands are 20.5, 21.5, and 19.5 mag, respectively. We use the G-band to estimate the mass-completeness limit of the cluster. Using an extinction value of , distance of 917 pc and considering pre-main-sequence isochrone of 2 Myr (Chen et al., 2014), the magnitude limit of G-band (20.5 mag) corresponds to a mass of . This analysis shows that the Gaia-DR3 is complete down to the low-mass end. However, compared to the central region, i.e., towards the IC 1396A region, the extinction might be higher due to the presence of BRC and an associated molecular cloud. This local variation in extinction will play a role in the local mass completeness of the member stars towards the outer edge of the complex.
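As a worked check of the first step in this conversion, the standard distance-modulus relation can be written out with the distance adopted in the text (917 pc) and the G-band limit of 20.5 mag. The G-band extinction $A_G$ is left symbolic, and the final conversion from absolute magnitude to mass, which requires the 2 Myr isochrone, is not reproduced here.

$$M_G = G - 5\log_{10}\left(\frac{d}{10\,{\rm pc}}\right) - A_G = 20.5 - 5\log_{10}(91.7) - A_G \approx 20.5 - 9.81 - A_G = 10.69 - A_G$$

A larger extinction therefore makes the limiting absolute magnitude brighter, which is why the mass completeness limit is expected to be shallower towards the more embedded regions.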
We list the mean, median, and standard deviation values of log(age) and mass for the entire complex and the individual clusters in Table 5. The mean and median values of log(age) and mass are similar, considering the whole complex and the clusters. This suggests that most of the population has evolved within the similar time scale of . However, previous studies have shown that, in the proximity of BRC candidates, multi-episodic star-formation is happening (Sicilia-Aguilar et al., 2014). Similarly, the stellar mass distribution appears uniform for the entire complex, which can be seen from the mean and median values for all the clusters. However, local mass segregation might be happening within the individual clusters.
4.3 Cluster properties
Several clusters have been identified towards IC 1396 based on the spatial distribution of the associated stellar members. Each cluster leaves an imprint of the ongoing star formation in the complex. In this section, we briefly discuss the formation of clusters taking into account their age and spatial distribution.
4.3.1 Inner clusters (C-1 and C-2)
Clusters C-1 and C-2 are located towards the center of the complex. Two sub-clusters (C-1A and C-1B) are observed within cluster C-1. Subcluster C-1A is on the eastern side, and C-1B is on the western side of the massive star. The subcluster C-1B is linked to the head of BRC IC 1396 A, while C-2 is seen towards its tail. C-1A contains more stars with slightly higher ages than C-1B, so the mean age of C-1A is slightly higher than that of C-1B. The mean age of cluster C-2 is similar to that of C-1B. This indicates multi-generation star formation triggered by the feedback effect of the central massive star. Earlier studies (Sicilia-Aguilar et al., 2014, 2019; Pelayo-Baldárrago et al., 2022) have reported such triggered star formation activity towards the head of IC 1396 A. The presence of cluster C-2 is also a signature of ongoing triggered star formation towards the BRC complex. Using Herschel PACS images and analyzing the properties of young members in the head of IC 1396 A, Sicilia-Aguilar et al. (2014) suggested that this second generation of star formation is triggered via radiative driven implosion (RDI) induced by the massive star HD 206267. However, a more in-depth analysis with multiwavelength data would be helpful to understand the mechanism behind the triggered star formation towards the entire IC 1396A region.
4.3.2 Outer clusters (C-3, C-4, and C-5)
The outer clusters (C-3, C-4, and C-5) differ from the inner clusters in their astrometric properties (see Figure 8). C-3 is linked with BRC IC 1396 N, C-4 with SFO 39, and C-5 lies at the northwest boundary of IC 1396. The mean age of C-3 is slightly lower than that of C-1 (see Table 5). This indicates that the triggered star formation mechanism also forms the stars associated with IC 1396 N. The mean ages of C-4 and C-5 appear slightly higher than those of all other clusters. In these two clusters, a significant fraction of older stars is present. Earlier studies by Ikeda et al. (2008) and Panwar et al. (2014) have already reported sequential star formation in the direction of BRCs SFO 37 and SFO 39 (see Figure 1) due to the impact of UV radiation from the exciting central star. The cluster C-4 is associated with SFO 39, but we do not detect any significant clustering towards SFO 37, as it is a small globule-like structure consisting mainly of a few embedded pre-main-sequence stars.
4.4 Radial Velocity
We searched for stars with radial velocity () information in our member list. We obtained 107 stars with radial velocity information from Gaia-DR3. This is an improvement in measurements in the Gaia-DR3 catalog compared to the DR2 catalog. Out of these 107 stars, 85 stars with good astrometry quality, i.e., , are considered for further analysis. The mean and median radial velocities of the 85 stars are -16.30 ± 1.28 and -16.56 km/s, respectively. To maximize the number of radial velocity measurements for the member stars of the complex, we also search for measurements in the literature. In previous work on the region, Sicilia-Aguilar et al. (2006a) carried out high-resolution () spectroscopic observations and obtained radial velocity information for 136 stars. By cross-matching these stars with our Gaia-detected member list, we find 78 stars in common, out of which 67 stars are of good astrometry quality, i.e., . The mean and median radial velocities of the 67 stars are -16.54 ± 0.25 and -15.80 km/s, respectively. The radial velocity has a broader range for the 85 stars than for the list of 67 stars taken from Sicilia-Aguilar et al. (2006a). However, the mean and median values for both lists are similar. In Figure 11, we display smoothed histogram distributions for the sources from both lists. In the figure, we scaled down the curve for the 67 stars by for a better representation. The smoothed distributions in this figure also suggest similar mean and median values for the two lists. The spatial distribution of these 152 stars with radial velocity information is shown in Figure 12. Most of the stars are distributed within the central part of the complex, with a few distributed all around the complex. Out of these 152 stars, 68 are members of the central cluster (C-1). We note that the properties of the complex and the different sub-groups we identify are in close agreement with the recent work by Pelayo-Baldárrago et al. (2022).
5 Discussion
5.1 Kinematic properties of IC 1396
In Figure 12, we show the spatial distribution of the 1243 stars on the WISE band as red dots, along with their proper-motion values as blue arrows. The magnitudes of the proper-motion components give the length of each arrow, and their signs determine its direction. All the arrows are scaled according to the white reference arrow of length 10 mas/yr. As seen from the plot, most stars are moving towards the south, one of the distinctive features observed towards the star-forming complex. In this section, we analyze the kinematics of the complex to shed more light on the internal motion of the member stars within the complex.
5.1.1 Determination of 3-dimensional position and velocity
Since IC 1396 is a relatively large star-forming complex, it is essential to inspect its physical structure and spatial distribution in Galactic Cartesian coordinates, XYZ. We derive the XYZ coordinates for all the sources associated with IC 1396. The origin of the coordinate system is chosen to be the Sun. In this system, the X-axis runs along the Sun-Galactic center line with its positive direction toward the Galactic center, the Y-axis lies in the Galactic plane orthogonal to the X-axis with its positive direction along the Galactic rotation, and the Z-axis is perpendicular to the Galactic plane, oriented in the direction of the Galactic North Pole. This makes a right-handed coordinate system. We use the Gaia-DR3 astrometric information of the detected stars to derive their 3-dimensional positions (X, Y, Z) and heliocentric velocities (U, V, W). We also compute the LSR velocities for each star along with the heliocentric velocities. The transformation from heliocentric to LSR velocities uses the solar motion velocities () from Schönrich et al. (2010).
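The position and LSR steps of this transformation can be sketched as below. The sketch assumes Galactic longitude, latitude, and distance are already available, and it adopts the solar motion of Schönrich et al. (2010), (U, V, W) = (11.10, 12.24, 7.25) km/s. The conversion of proper motions and radial velocity into heliocentric (U, V, W) is not reproduced, and the Galactic coordinates in the example are approximate values assumed for illustration.

```python
import math

# Solar motion with respect to the LSR, Schönrich et al. (2010), in km/s.
U_SUN, V_SUN, W_SUN = 11.10, 12.24, 7.25

def galactic_xyz(l_deg, b_deg, dist_pc):
    """Heliocentric Galactic Cartesian position in pc.

    X points to the Galactic center, Y along Galactic rotation,
    Z to the Galactic North Pole (right-handed system).
    """
    l, b = math.radians(l_deg), math.radians(b_deg)
    x = dist_pc * math.cos(b) * math.cos(l)
    y = dist_pc * math.cos(b) * math.sin(l)
    z = dist_pc * math.sin(b)
    return x, y, z

def helio_to_lsr(u, v, w):
    """Add the solar motion to heliocentric velocities (km/s)."""
    return u + U_SUN, v + V_SUN, w + W_SUN

# Approximate direction of the complex (assumed: l ~ 99.3 deg, b ~ 3.7 deg)
# at the 917 pc distance adopted in the text.
x_pc, y_pc, z_pc = galactic_xyz(99.3, 3.7, 917.0)
```

For a direction near l = 99°, almost all of the distance projects onto the Y-axis, so the complex lies close to the direction of Galactic rotation and slightly above the plane.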
The majority of stars with radial velocity information lie towards the complex’s central region. So to obtain the kinematic properties, we focus only on the central cluster C-1. Table 6 lists the derived three-dimensional positions (X, Y, Z), the heliocentric velocities (U, V, W), and the LSR velocities of the 68 stars of cluster C-1.
5.1.2 Kinematic properties of the stars
In Figure 13, we show the spatial distribution of the 68 stars of C-1, which have radial velocity information in the XY, YZ, and XZ planes. In the top row, we display the heliocentric and LSR velocities. The heliocentric and the LSR velocities indicate the stars’ bulk motion.
To investigate the stability of the cluster C-1, it is essential to analyze the internal kinematics of the stars. First, we derive the mean value of the velocities of the stars. The values are listed in Table 7. To assess the internal motion of the stars, we calculate the difference in velocities of individual stars with respect to the mean value. In the bottom row of Figure 13, we show the . This displays the random movement of the stars with respect to the central velocity. This shows that of stars are canceling each other, and the mean values of are close to zero, indicating no real expansion. The three dimensional dispersion is derived to be .
Then we conduct a qualitative analysis of the relative motion of the stars within the complex, in a manner similar to that of Rivera et al. (2015) for the Taurus complex. This analysis gives an indication of the stability of the complex. Each star is located at a certain distance from the complex’s center and moves with a relative velocity. We denote the separation from the complex center with a position vector $\mathbf{r}$ and the relative velocity vector as $\delta\mathbf{v}$. Each position vector is associated with a unit vector, $\hat{\mathbf{r}} = \mathbf{r}/|\mathbf{r}|$, directed from the center of the complex towards the location of each star. The relative motion of the stars with respect to the complex center can be used to analyze two types of motion: expansion or contraction, and rotation. The expansion and contraction properties can be gauged from the directions of the position vector and the relative velocity vector. For expansion, $\delta\mathbf{v}$ will be parallel to $\mathbf{r}$, and for contraction $\delta\mathbf{v}$ will be anti-parallel to $\mathbf{r}$. Hence for expansion, the dot product ($\hat{\mathbf{r}} \cdot \delta\mathbf{v}$) should be a large and positive number, and for contraction, it should be a large and negative number. By the same reasoning, the cross product ($\hat{\mathbf{r}} \times \delta\mathbf{v}$) will be small for both expansion and contraction. Conversely, the cross product ($\hat{\mathbf{r}} \times \delta\mathbf{v}$) will be large for large-scale rotation, and the dot product ($\hat{\mathbf{r}} \cdot \delta\mathbf{v}$) will be minimal.
In the following, we derive the dot and cross products and list them in Table 6. Since both parameters use the unit position vector $\hat{\mathbf{r}}$, both have units of velocity. The mean values of the parameters can be expressed with the equations $v_{\rm exp} = \langle \hat{\mathbf{r}} \cdot \delta\mathbf{v} \rangle$ and $\mathbf{v}_{\rm rot} = \langle \hat{\mathbf{r}} \times \delta\mathbf{v} \rangle$.
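The dot and cross product diagnostics can be illustrated with a short sketch. It assumes that the complex center and the mean velocity are taken as the simple averages of the stellar positions and velocities, which may differ from the choice made for Table 6; the function name and toy inputs are hypothetical.

```python
import numpy as np

def expansion_rotation(positions, velocities):
    """Mean expansion speed and mean rotation vector of a group of stars.

    positions:  (N, 3) array of (X, Y, Z); velocities: (N, 3) array of (U, V, W).
    Assumption: center and bulk velocity are plain means over the stars.
    A star located exactly at the center would give a zero-length position
    vector and must be excluded beforehand.
    """
    pos = np.asarray(positions, dtype=float)
    vel = np.asarray(velocities, dtype=float)
    r = pos - pos.mean(axis=0)            # position relative to the center
    dv = vel - vel.mean(axis=0)           # velocity relative to the bulk motion
    r_hat = r / np.linalg.norm(r, axis=1, keepdims=True)
    v_dot = np.einsum("ij,ij->i", r_hat, dv)   # radial (expansion) component
    v_cross = np.cross(r_hat, dv)               # rotational component
    return v_dot.mean(), v_cross.mean(axis=0)

# Toy case 1: pure expansion, every star moves radially outward at 2 km/s.
axes = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]], float)
v_exp_toy, v_rot_toy = expansion_rotation(axes, 2.0 * axes)

# Toy case 2: pure rotation about the Z-axis at 1 km/s.
ring = np.array([[1, 0, 0], [0, 1, 0], [-1, 0, 0], [0, -1, 0]], float)
ring_vel = np.array([[0, 1, 0], [-1, 0, 0], [0, -1, 0], [1, 0, 0]], float)
v_exp_ring, v_rot_ring = expansion_rotation(ring, ring_vel)
```

In the first toy case the mean dot product equals the expansion speed and the cross product vanishes; in the second the roles are reversed, which is the contrast the analysis relies on.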
We derive the expansion velocity to be 1.11 km/s. The derived rotation velocities are listed in Table 7. From CO maps, Patel et al. (1995) obtained an expansion velocity of 5 km/s for the whole complex. Their analysis suggests that the gas within the complex is pushed away to the outskirts by the central massive star, resulting in an expansion of the system. A similar expansion velocity is also reported by Pelayo-Baldárrago et al. (2022). Although cluster C-1 is expanding, its expansion is slow compared to that of the whole complex. This could be because young stars dominate the central region, and cluster C-1 is expanding slowly due to its higher density.
Kuhn et al. (2019) observed that nearby Galactic clusters expand with velocities similar to that of cluster C-1. Their study of 28 Galactic clusters using Gaia-DR2 reported a typical expansion velocity of . Similarly, the study by Pang et al. (2021) of 13 open clusters within a distance of 500 pc using Gaia-EDR3 reported many clusters to be super-virial and expanding.
5.2 Star-formation history in IC 1396
IC 1396 is one of the nearby star-forming complexes dominated by feedback-driven star formation activity (see Section 2). The energetic stellar wind from the central massive star has cleared up most of the gas, resulting in a cavity of radius . The large cavity can be seen at infrared wavelengths with photodissociation regions (PDRs) associated with the boundary of the complex (see Figure 1). This massive feedback effect also forms BRCs and fingertip structures within the complex (Schwartz et al., 1991; Froebrich et al., 2005; Saurin et al., 2012). Here, we discuss the overall star formation history of the complex.
The spatial distribution of the member sources (see Figure 4) and their association with the BRCs all indicate the ongoing feedback-driven star formation activity within the complex. The mean age of the sub-clusters (see Section ) suggests a multi-generation star formation activity within the complex. However, the sub-clusters formation in the complex might have happened through a hierarchical process. To assess this nature, we conduct a KS test on the age of the two major groups of stars (see Section 4.1). One group is from the inner clusters (C-1 and C-2), and the other is from the outer clusters (C-3, C-4, and C-5). The score of the KS test comes out to be 0.00026. This low value of score indicates that a majority number of stars from both groups might have formed over a similar time scale. The hierarchical star formation could be due to the fractal and turbulent nature of the ambient cloud, where star formation can occur simultaneously or near simultaneously at different locations of the clouds (Bonnell et al., 2003; Grudić et al., 2018; Torniamenti et al., 2022). However, one limitation of our analysis is that we have probed stars using optical measurements. Thus many sources embedded in the BRCs might be missing in our analysis; as a result, the estimated ages of the groups associated with BRCs are likely upper limits.
Kinematic and age analysis of the embedded members is needed to understand whether the groups associated with BRCs formed entirely through hierarchical collapse or whether stellar feedback from the central cluster has helped induce star formation in these clouds. In favorable conditions, stellar feedback can enhance or accelerate star formation in pre-existing clouds where star formation is already underway. In this case, one may have both older and younger populations of sources. Observations show that young clusters tend to show a typical velocity dispersion of 2 km/s (Kuhn et al., 2019). Thus, older stars can move pc in 2 Myr of time, so inferences such as an age gradient and elongated morphology, which are signatures of induced star formation as we move from the ionizing sources to the tip of the BRCs, can be erased, particularly if we are dealing with small groups or small numbers of stars. A comprehensive spectroscopic and kinematic analysis of member stars in both the optical and infrared bands would therefore be highly desirable to shed more light on the formation of the different sub-groups in the complex.
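As a rough check of the drift distance implied by these numbers, the conversion below uses only the velocity dispersion (2 km/s) and time span (2 Myr) quoted above, together with the standard conversions 1 pc ≈ 3.086 × 10$^{13}$ km and 1 Myr ≈ 3.156 × 10$^{13}$ s. It is an order-of-magnitude estimate that assumes straight-line motion at constant speed.

$$d = \sigma_v\, t \approx 2\ {\rm km\,s^{-1}} \times 2 \times 3.156\times10^{13}\ {\rm s} \approx 1.26\times10^{14}\ {\rm km} \approx 4.1\ {\rm pc}$$

Equivalently, 1 km/s corresponds to about 1.02 pc per Myr, so a drift of a few parsecs over a few Myr is enough to blur small-scale age gradients.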
6 Summary
We use the high-precision Gaia-DR3 astrometry and photometry data and apply the machine learning algorithms to carry out the membership analysis of the complex. Using the identified members in this work, we study various star-formation properties of this complex. In the following, we report our significant findings from this work.
1. Using the Gaia-DR3 astrometry and photometry data and applying the supervised RF machine learning technique, we identify 1243 highly probable members of this complex. The identified member population is of high quality, with 95% of the stars having a relative parallax error of less than 20%. More than 99% of the stars have RUWE less than 1.4, suggesting they are of high astrometric quality. Of the 1243 stars, 731 are entirely new members identified in this work. This significantly enhances the list of reliable members of IC 1396.
2. The mean values of the RUWE, parallax, and proper motions are 1.12, mas, mas/yr, and mas/yr, respectively. The spatial distribution of the parallax and proper motions suggests that the total population is broadly segregated into two groups. Our KS test shows that the proper motion parameters are the most distinctive astrometric features, distinguishing the stars projected in the two sub-groups.
3. The spatial distribution of the stars reveals the associated clusters. We use the NN method to identify 6 clusters (C-1A, C-1B, C-2, C-3, C-4, and C-5) towards IC 1396. C-1A and C-1B are the subclusters of the central cluster C-1. We study the statistical properties of stars lying within the subclusters.
4. Using the G vs. G-RP CMD and PARSEC isochrones, we estimate the age and mass of individual stars. The mean age derived from all the 1243 stars is Myr, matching the estimates from previous studies. Using the completeness limit of 19 mag in the G band and a distance of 917 pc, we derive the mass completeness limit for the complex to be . This suggests that the complex hosts a very low-mass population.
5. Of the 1243 stars, 152 good-quality stars () have radial velocity measurements, of which 85 come from Gaia-DR3 and the remaining 67 from the high-resolution spectroscopic study of Sicilia-Aguilar et al. (2006a). The mean and median values of the radial velocity derived from the 152 stars are and -15.80 km/s, respectively.
6. We carry out a 3D kinematic analysis to understand the internal motion of stars within the central cluster C-1. We use the radial velocity values and astrometric data of the 68 stars of the cluster. We derive the 3D Cartesian positions and velocities of each star. To study the stability of the cluster, we derive the expansion velocity, which is low compared to the previous value derived from CO maps. The low expansion velocity of the cluster suggests a slow expansion compared to the whole complex. The slow expansion might be due to the higher density of recently formed young stars.
7. Considering the spatial distribution, association with BRCs, and age of the stars, we study the overall star formation within the complex. The variation in the age of the sub-clusters suggests an ongoing multi-generation star formation process in the complex. However, the sub-clusters of the complex might have formed through a hierarchical process.
We thank the anonymous referee for a constructive review of the manuscript, which helped in improving the quality of the paper. SRD acknowledges support from Fondecyt Postdoctoral fellowship (project code 3220162). This work presents results from the European Space Agency (ESA) space mission Gaia. Gaia data are being processed by the Gaia Data Processing and Analysis Consortium (DPAC). Funding for the DPAC is provided by national institutions, in particular the institutions participating in the Gaia MultiLateral Agreement (MLA). The Gaia mission website is https://www.cosmos.esa.int/gaia. The Gaia archive website is https://archives.esac.esa.int/gaia. This publication makes use of data products from the Wide-field Infrared Survey Explorer, which is a joint project of the University of California, Los Angeles, and the Jet Propulsion Laboratory/California Institute of Technology, funded by the National Aeronautics and Space Administration. This research has made use of the SIMBAD database, operated at CDS, Strasbourg, France. This work made use of various packages of Python programming language.
Appendix A Gaussian Mixture Model
GMM works on the simple principle of identifying the normally distributed sub-populations from the overall population. This model assumes that the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters. In the GMM method, each data point will be categorized into cluster members or non-members, depending on its membership score (probability). The mixture models do not require prior knowledge of classifying sub-populations. This allows the model to learn the sub-population in an automated way. Since there is no previous knowledge of the sub-population assignment, this mixture model constitutes unsupervised machine learning. This technique is widely used in various fields, including astrophysics (Lee et al., 2012; Igoshev & Popov, 2013; Zhang et al., 2016; Chattopadhyay & Maitra, 2017; Holoien et al., 2017; Kaplan et al., 2018; Gao, 2018b, a). Below we briefly describe the working principle of the GMM method.
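If there are m clusters present in an n-dimensional parameter space, then the probability distribution of a data point x is estimated as the weighted sum of all m Gaussian components:
$$P(x) = \sum_{k=1}^{m} w_k \, G(x \mid \mu_k, \Sigma_k)$$
where $\Sigma_k$ is the covariance matrix, and $w_k$ is the mixture weight of the k-th Gaussian component, which satisfies the condition $\sum_{k=1}^{m} w_k = 1$. The distribution of an individual Gaussian cluster is
$$G(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{n/2} \sqrt{|\Sigma_k|}} \exp\left[-\frac{1}{2} (x - \mu_k)^{T} \Sigma_k^{-1} (x - \mu_k)\right]$$
where $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix of the k-th Gaussian component, and $|\Sigma_k|$ is the determinant of $\Sigma_k$.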
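As a minimal sketch of the weighted sum of Gaussian components described above, the following evaluates the mixture density and the probability that a point belongs to one component. The function names and the toy two-component model (weights, means, covariances) are illustrative assumptions, not values from this work.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density of a single point x."""
    n = mean.size
    diff = x - mean
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))
    # solve(cov, diff) gives cov^{-1} diff without forming the inverse
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def mixture_pdf(x, weights, means, covs):
    """Weighted sum of all Gaussian components at x."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))

def membership_probability(x, k, weights, means, covs):
    """Probability that x was drawn from component k (its membership score)."""
    return weights[k] * gaussian_pdf(x, means[k], covs[k]) / mixture_pdf(x, weights, means, covs)

# Toy model (assumed values): a compact "cluster" and a broad "field" component.
weights = np.array([0.3, 0.7])                    # mixture weights, sum to 1
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [0.25 * np.eye(2), 4.0 * np.eye(2)]

x = np.array([0.1, -0.1])
p_cluster = membership_probability(x, 0, weights, means, covs)
p_field = membership_probability(x, 1, weights, means, covs)
print(f"P(cluster) = {p_cluster:.3f}, P(field) = {p_field:.3f}")
```

A point close to the compact component's mean receives a membership probability near one, and the probabilities over all components sum to one by construction.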
In GMM, the parameters are determined using an unsupervised machine-learning technique known as the expectation-maximization (EM) algorithm (Dempster et al., 1977; Press et al., 2007). The likelihood of the data never decreases from one iteration to the next, which implies that the algorithm is guaranteed to approach a local maximum. The algorithm does not assume any prior knowledge about clustering structures. It starts with an initial guess for the N data points and learns the GMM parameters from the data. The process involves a few steps, which are described in detail in Lee et al. (2012). After the distribution parameters are calculated, the distribution probability for each data point x can be estimated.
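The EM iteration can be sketched for the simplest case of a one-dimensional, two-component mixture. This is an illustrative toy, not the implementation used in this work: the function name, the initial guess (quartile means, equal weights, common variance), the fixed iteration count, and the synthetic data are all assumptions.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Fit a two-component 1D Gaussian mixture with plain EM."""
    # Initial guess (assumption): equal weights, means at the quartiles.
    w = np.array([0.5, 0.5])
    mu = np.percentile(x, [25, 75]).astype(float)
    var = np.array([x.var(), x.var()])
    log_likes = []
    for _ in range(n_iter):
        # E-step: weighted component densities for every point, shape (N, 2).
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
        total = dens.sum(axis=1)
        log_likes.append(np.log(total).sum())   # log-likelihood of current parameters
        resp = dens / total[:, None]            # membership probabilities
        # M-step: re-estimate weights, means and variances from the memberships.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var, np.array(log_likes)

# Synthetic data (assumed): 300 points around 0 and 700 points around 6.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.5, 700)])
w, mu, var, log_likes = em_gmm_1d(x)
print("weights:", w, "means:", mu)
```

The recorded log-likelihood values should form a non-decreasing sequence, which is the convergence property mentioned above.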
Before carrying out the clustering analysis, it is essential to normalize the data. Normalization is often required for similarity measures (e.g., Euclidean distance) that are sensitive to differences in magnitude or scale (Gao, 2018b, a). In our case, we normalize the data following the discussion in Gao (2018b). If N stars have an n-dimensional parameter space, the normalized parameter of the i-th star in the j-th dimension is defined as:
$$x'_{ij} = \frac{x_{ij} - \mathrm{med}(x_j)}{\sigma_j}$$
where $x_{ij}$ is the original parameter, $\mathrm{med}(x_j)$ is the median of its distribution, and $\sigma_j$ is its standard deviation.
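A minimal sketch of this normalization, assuming the parameters are stored as an array with one row per star and one column per dimension. The function name, the toy values, and the use of the population standard deviation (NumPy's default) are assumptions.

```python
import numpy as np

def normalize(params):
    """Subtract the per-dimension median and divide by the per-dimension standard deviation."""
    params = np.asarray(params, dtype=float)
    median = np.median(params, axis=0)   # one median per column (dimension)
    sigma = np.std(params, axis=0)       # population standard deviation (assumption)
    return (params - median) / sigma

# Toy input (assumed values): 3 stars, 2 parameters on very different scales.
params = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 60.0]])
z = normalize(params)
print(z)
```

After this step every dimension has zero median and unit standard deviation, so no single parameter dominates a distance-based comparison.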
Appendix B Random forest classifier efficiency
As explained in Section 3.2.1, within a small circular area of radius , we find 577 stars to be probable members and 2503 stars to be non-members. Using this GMM result, we construct a reliable training set, which is important because the RF method depends strongly on the training set. Since RF handles high-dimensional data well, we use 11 input parameters in this work. The input parameters comprise five astrometric parameters (coordinates, proper motions, and parallax) and six photometric parameters (magnitudes in the G, BP, and RP bands, and the BP-RP, BP-G, and G-RP colors). We construct the RF classifier using this 11-dimensional training set and test its accuracy. For this purpose, we use 60% of the 3080 input stars (1848 stars) to train the RF classifier and the remaining 40% (1232 stars) to test how well the machine has learned to distinguish member stars from field stars. The machine itself selects the training and test sets at random. Running the RF method over the test data set yields a high accuracy of 0.99. The confusion matrix in Figure 14 shows how the machine classifies the test sources after training. Out of the 1232 test sources, the machine correctly identifies 989 non-member (field) stars and 233 cluster member stars, i.e., 1222 of 1232 sources, and is confused by only 10 field and cluster member stars. This exercise demonstrates the effectiveness of the RF method.
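The train/test procedure can be sketched with scikit-learn. The class sizes (577 and 2503), the 11 dimensions, and the 1848/1232 split follow the text; everything else is an assumption: the features are synthetic random numbers rather than Gaia data, the hyperparameters are scikit-learn defaults, and the random seeds are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_member, n_field, n_features = 577, 2503, 11

# Synthetic stand-in data (assumption): field stars are broad in every
# dimension, members are tightly clustered in the first three dimensions.
X_field = rng.normal(0.0, 3.0, size=(n_field, n_features))
X_member = rng.normal(0.0, 3.0, size=(n_member, n_features))
X_member[:, :3] = rng.normal(2.0, 0.3, size=(n_member, 3))

X = np.vstack([X_field, X_member])
y = np.concatenate([np.zeros(n_field, dtype=int), np.ones(n_member, dtype=int)])

# Random split: 1848 stars (60%) for training, 1232 stars (40%) for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1232, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

cm = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted class
accuracy = accuracy_score(y_test, y_pred)
print(cm)
print(f"accuracy on synthetic test set: {accuracy:.3f}")

# Accuracy implied by the counts quoted in the text: (989 + 233) / 1232.
quoted_accuracy = (989 + 233) / 1232
print(f"accuracy from quoted counts: {quoted_accuracy:.4f}")
```

The diagonal of the confusion matrix holds the correctly classified field and member stars, and its off-diagonal entries are the sources the classifier confuses.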
Table 8 lists the relative importance that RF assigns to each of the 11 input parameters when computing the membership probability. The proper motion in RA ($\mu_{\alpha}\cos\delta$) has the highest relative importance in membership identification, and the proper motion in DEC ($\mu_{\delta}$) is also relatively important in segregating member and non-member stars. Proper motions usually play a dominant role in cluster identification, and we observe the same in our case. However, the color terms (BP-G and BP-RP) and the RP magnitude also rank high in correctly identifying members and non-members, because the stars were filtered using CMDs during the GMM step (see Section 3.2.1). The coordinates of the stars (RA, DEC) have minor importance in membership identification. Gao (2018b, a) obtained a similar result for NGC 6405 and M67. Note that, unlike the GMM method, RF does not require data normalization.
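One common way to obtain such a ranking is the impurity-based `feature_importances_` attribute of scikit-learn's random forest; whether this work used that estimator is an assumption. In the sketch below the column labels are hypothetical names for the 11 parameters, and the data are synthetic, built so that one proper-motion column separates the two classes most strongly.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical column labels for the 11 input parameters.
names = ["ra", "dec", "pmra", "pmdec", "parallax",
         "G", "BP", "RP", "BP-RP", "BP-G", "G-RP"]

rng = np.random.default_rng(1)
n_member, n_field = 300, 1200

# Synthetic data (assumption): every column is uninformative noise except
# pmra (strongly clustered for members) and pmdec (weakly shifted).
X_field = rng.normal(0.0, 3.0, size=(n_field, len(names)))
X_member = rng.normal(0.0, 3.0, size=(n_member, len(names)))
X_member[:, names.index("pmra")] = rng.normal(4.0, 0.3, size=n_member)
X_member[:, names.index("pmdec")] = rng.normal(1.0, 2.0, size=n_member)

X = np.vstack([X_field, X_member])
y = np.concatenate([np.zeros(n_field, dtype=int), np.ones(n_member, dtype=int)])

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

importances = clf.feature_importances_   # normalized to sum to 1
ranked = sorted(zip(names, importances), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name:>8s}  {value:.3f}")
```

Because the trees split on one feature at a time, the scale of each column does not affect the result, which is consistent with the remark that RF needs no prior normalization.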
Appendix C CMD plots of stars with
Generally, stars with are non-member stars. In Figure 15, we show the CMDs of stars retrieved with and G-band magnitude less than 19 mag. Stars lying within different values are shown to check their location on the CMD. Out of the total stars with and G mag less than 19, the majority () of stars lie within . The stars with are the most likely non-members. However, stars with higher probability are spread across the plot. This discussion aims to shed light on the nature of stars with different probabilities and to stress that member and non-member stars should be chosen carefully in this type of membership analysis, where the magnitude and color terms play a major role in segregating members from non-members. An overlap in the magnitude and color terms of member and non-member stars will prevent effective training of the machine.
References
- Balaguer-Núñez, L., Galadí-Enríquez, D., & Jordi, C. 2007, A&A, 470, 585, doi: 10.1051/0004-6361:20067003
- Barentsen, G., Vink, J. S., Drew, J. E., et al. 2011, MNRAS, 415, 103, doi: 10.1111/j.1365-2966.2011.18674.x
- Bastian, N., Adamo, A., Gieles, M., et al. 2012, MNRAS, 419, 2606, doi: 10.1111/j.1365-2966.2011.19909.x
- Blaauw, A. 1964, ARA&A, 2, 213, doi: 10.1146/annurev.aa.02.090164.001241
- Bonnell, I. A., Bate, M. R., & Vine, S. G. 2003, MNRAS, 343, 413, doi: 10.1046/j.1365-8711.2003.06687.x
- Bonnell, I. A., Clark, P., & Bate, M. R. 2008, MNRAS, 389, 1556, doi: 10.1111/j.1365-2966.2008.13679.x
- Bossini, D., Vallenari, A., Bragaglia, A., et al. 2019, A&A, 623, A108, doi: 10.1051/0004-6361/201834693
- Brandt, T. D. 2021, ApJS, 254, 42, doi: 10.3847/1538-4365/abf93c
