Inferring the parallax of Westerlund 1 from Gaia DR2
Mojgan Aghakhanloo, Jeremiah W. Murphy, Nathan Smith, John Parejko,, Mariangelly D\'iaz-Rodr\'iguez, Maria R. Drout, Jose H. Groh, Joseph Guzman,, Keivan G. Stassun

TL;DR
Using Gaia DR2 data and Bayesian inference, the paper accurately determines the distance to Westerlund 1, significantly refining previous estimates and impacting our understanding of its stellar properties and evolution.
Contribution
The paper introduces a Bayesian model incorporating Gaia DR2 parallaxes and zero-point correction to precisely estimate the distance to Westerlund 1, reducing uncertainty and challenging prior distance assumptions.
Findings
Distance to Wd1 is approximately 2.6 kpc with 18% uncertainty.
The cluster's mass is about four times lower than previous estimates.
The initial mass at the main-sequence turn-off is revised from 40 to 22 solar masses.
Abstract
Westerlund 1 (Wd1) is potentially the largest star cluster in the Galaxy. That designation critically depends upon the distance to the cluster, yet the cluster is highly obscured, making luminosity-based distance estimates difficult. Using {\it Gaia} Data Release 2 (DR2) parallaxes and Bayesian inference, we infer a parallax of mas corresponding to a distance of kpc. To leverage the combined statistics of all stars in the direction of Wd1, we derive the Bayesian model for a cluster of stars hidden among Galactic field stars; this model includes the parallax zero-point. Previous estimates for the distance to Wd1 ranged from 1.0 to 5.5 kpc, although values around 5 kpc have usually been adopted. The {\it Gaia} DR2 parallaxes reduce the uncertainty from a factor of 3 to 18\% and rules out the most often quoted value of 5 kpc with 99\% confidence.…
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Inferring the parallax of Westerlund 1 from Gaia DR2
Mojgan Aghakhanloo1, Jeremiah W. Murphy1, Nathan Smith2, John Parejko3,Mariangelly Díaz-Rodríguez1, Maria R. Drout4, Jose H. Groh6, Joseph Guzman 1 and Keivan G. Stassun7,8
1Department of Physics, Florida State University, 77 Chieftan Way, Tallahassee, FL 32306, USA
2Steward Observatory, University of Arizona, 933 N. Cherry Ave., Tucson, AZ 85721, USA
3Department of Astronomy, University of Washington, Box 351580, Seattle, WA 98195, USA
4The Observatories of the Carnegie Institution for Science, 813 Santa Barbara St, Pasadena, CA 91101, USA
5School of Physics, Trinity College Dublin, The University of Dublin, Dublin, Ireland
6Department of Physics & Astronomy, Vanderbilt University, 6301 Stevenson Center Lane, Nashville, TN 37235, USA
7Department of Physics, Fisk University, 1000 17th Avenue N., Nashville, TN 37208, USA E-mail: [email protected] and Carnegie-Dunlap Fellow
Abstract
Westerlund 1 (Wd1) is potentially the largest star cluster in the Galaxy. That designation critically depends upon the distance to the cluster, yet the cluster is highly obscured, making luminosity-based distance estimates difficult. Using Gaia Data Release 2 (DR2) parallaxes and Bayesian inference, we infer a parallax of mas corresponding to a distance of kpc. To leverage the combined statistics of all stars in the direction of Wd1, we derive the Bayesian model for a cluster of stars hidden among Galactic field stars; this model includes the parallax zero-point. Previous estimates for the distance to Wd1 ranged from 1.0 to 5.5 kpc, although values around 5 kpc have usually been adopted. The Gaia DR2 parallaxes reduce the uncertainty from a factor of 3 to 18% and rules out the most often quoted value of 5 kpc with 99% confidence. This new distance allows for more accurate mass and age determinations for the stars in Wd1. For example, the previously inferred initial mass at the main-sequence turn-off was around 40 M*⊙; the new Gaia DR2 distance shifts this down to about 22 M⊙*. This has important implications for our understanding of the late stages of stellar evolution, including the initial mass of the magnetar and the LBV in Wd1. Similarly, the new distance suggests that the total cluster mass is about four times lower than previously calculated.
keywords:
stars:evolution-open clusters and associations: individual: Westerlund 1-methods: Bayesian analysis.
††pagerange: Inferring the parallax of Westerlund 1 from Gaia DR2–References††pubyear: 2019
1 INTRODUCTION
Massive stars are a central focus of ongoing work in stellar evolution theory. Many gaps exist in our understanding of massive stars due to their rarity, short lifetimes, high fraction of interacting binaries, and imprecise Galactic distances. To better understand the evolutionary path of massive stars, it is helpful to explore associated clusters or OB associations. If one can show that the star is a member of a cluster or association, studying the cluster provides a unique insight into intrinsic properties of cluster members. For example, Westerlund 1 has a Luminous Blue Variable (LBV; Clark & Negueruela, 2003), at least 24 Wolf_Rayet stars (WR; Clark et al., 2005; Crowther et al., 2006; Groh et al., 2006; Fenech et al., 2018), 6 yellow hypergiants (YHG; Clark et al., 2005), and a magnetar (Muno et al., 2005); knowing the distance to this one cluster will help to constrain the luminosity, mass, and evolution of all of these late phases of stellar evolution. However, our sightline to the cluster suffers from substantial extinction and reddening, which has made its distance difficult to estimate using luminosity indicators (Piatti et al., 1998; Clark et al., 2005; Damineli et al., 2016). Gaia parallaxes provide an independent distance indicator.
The massive young star cluster, Westerlund 1 (Wd1), was detected by Westerlund (1961) during a survey of the Milky Way. Wd1 is located at , , which corresponds to Galactic coordinates of and .
The first distance estimates mostly relied on reddening-distance relationships and ranged from 1.0 to 5 kpc (Westerlund, 1961, 1968; Piatti et al., 1998; Clark et al., 2005; Crowther et al., 2006). Westerlund (1961) first suggested an extinction of mag, and reported a distance of 1.4 kpc. Later, Westerlund (1968) derived a significantly larger distance of 5 kpc by using VRI photographic photometry with near-infrared photometry of the brightest stars. In contrast, Piatti et al. (1998) presented CCD imaging in the V and I bands, and using isochrone fitting, estimated a distance of kpc.
More recently, Clark et al. (2005) obtained spectra for the brightest members of Wd1 and improved our knowledge of Wd1. With more detailed spectra, many of the brightest members were identified as post-main-sequence stars. Since the isochrone fitting of Piatti et al. (1998) assumed that many of the stars are on the MS, the Piatti et al. (1998) distance estimate was incorrect. Six of the stars in Clark et al. (2005) are YHGs, and the most luminous YHGs are presumed to have relatively standard luminosity of around log (Smith et al., 2004). Assuming that the YHGs in Wd1 were at the observed upper luminosity limit for cool hypergiants, and adopting an extinction of , Clark et al. (2005) inferred a distance of 5.5 kpc. However, they noted that their reddening law is not entirely consistent with Wd1 data. To place a lower limit on the distance, they noted a lack of radio emission from the WR winds; this suggests a minimum distance of 2 kpc. Hence, Clark et al. (2005) reported a distance of kpc. However, these constraints on the distance and reddening depend sensitively on assumptions of wind physics for evolved stars, which is still uncertain (Smith, 2014).
A variety of subsequent investigations produce similar results and accuracy. Crowther et al. (2006) inferred a similar distance using near-IR classification of WN and WC stars. More recently, Kothes & Dougherty (2007) derived a distance of 3.9 0.7 kpc based on the radial velocity of Hi features in the direction of Wd1. Brandner et al. (2007) studied the population of stars below 30 M*⊙* and derived a distance of 3.55 0.17 kpc based on the apparent brightness and width of the CMD using Pre-MS isochrones. Later, Gennaro et al. (2011) performed a similar analysis but also modelled the completeness, field star subtraction, and error propagation, and inferred a distance value of 4.00.2 kpc. Many of the previous techniques require some assumption about the luminosity of spectral classes. Recently, Koumpia & Bonanos (2012) used the dynamics and the geometry of an eclipsing binary (W13) to infer a distance of 3.70.6 kpc. Given all of difficulties explained above in estimating the distance, it is necessary to use Gaia Data Release 2 (DR2) to find an independent distance estimate to Wd1.
These distances have implications for the mass, age, and cluster members. For example, assuming a distance of 5 kpc, Clark & Negueruela (2003) inferred a total cluster mass of M*⊙* and an age of 3.5-5 Myr. With this age, the magnetar’s progenitor would have an initial mass of 40 M*⊙*. The distance is critical, but estimates of the distance suffer from large uncertainties and systematics. It is therefore valuable to infer the distance using a more accurate and geometric technique.
The main goal of this paper is to infer an independent and geometric distance to Wd1 using Gaia DR2. The Gaia DR2 parallax precision for individual stars in Wd1 ranges from 0.04 to 1.0 mas. Since Wd1 is probably of order a few kpc in distance, the larger uncertainties prohibit a precise distance estimate for an individual star. However, the combined statistics of all cluster members should easily produce a more precise distance. Section 2 describes the data and method to infer the distance. First, we describe the Gaia DR2 data and possible systematic uncertainties. Then we infer the approximate parallax of the cluster by estimating the true mean parallax for several annuli from the cluster centre (Section 2.2). In Section 2.3, we use a Bayesian inference technique to infer the cluster parallax, cluster density, field-star density, the parallax zero-point of the cluster, field-star length scale, and the field-star offset. In Section 3, we present the inferred distance to Wd1 and the length scale for the field star distribution. Then, we compare this distance and field-star distribution with previous works (Section 4). We also discuss how the revised distance affects the properties of cluster members. Section 5 presents a summary and direction for further investigation.
2 Method
In this section, we describe the method to infer the distance to Wd1. The stars in the direction of Wd1 comprise cluster stars as well as Galactic field stars. Therefore, to infer the distance to Wd1, our likelihood model must account for both cluster and field stars. The following sections describe the data and methods required to model both components and infer the distance to Wd1.
2.1 Gaia Data Release 2 data
The source of the data is the Gaia DR2 (Prusti et al., 2016; Brown et al., 2018). We collect all Gaia DR2 sources within 10 arcmin of the position of Wd1; , .
Fig. 1 presents the positions of all objects within 10 arcmin of Wd1. The inner circle marks a region that is 1 arcmin from the centre of the cluster. The outer annulus extends from 9 to 10 arcmin. The density of stars is mostly uniform throughout the field of view, it is slightly over dense towards the right; however, the density does noticeably increase towards the centre. This spatial separation between cluster and field stars suggest a strategy for constraining the parameters for each population. The inner circle contains both field and cluster stars, but is dominated by cluster stars. Therefore, the inner circle provides a good constraint on the cluster population. The outer annulus are mostly field stars. Therefore, the outer region will constrain the field star distribution.
Fig. 2 shows histograms of parallax, parallax uncertainty, astrometric excess noise, and astrometric excess noise significance of all stars in the inner circle and the outer annulus. The average parallax for the inner circle is mas, and the average parallax for the outer annulus is mas. These averages include all stars, no filtering, and does not include the parallax zero-point. Never the less, the difference in average parallax already indicates that the cluster is farther than the average field star. The distribution of expected parallax uncertainty (top right panel) is similar between the two regions. The minimum parallax uncertainty in the direction of Wd1 is 0.04 mas, but the vast majority of uncertainties are even larger (up to a few mas), including systematics such as parallax zero-point (Lindegren et al., 2018). The distribution for astrometric excess noise (bottom left panel) shows a considerable difference between the two regions. The astrometric excess noise () represents extra variation in the data that is not included in the five-parameter astrometric model; the astrometric noise significance is (bottom right panel). A higher fraction of objects in the inner circle have large excess noise. Therefore, we keep the sources with to avoid bad astrometric solutions.
Although the astrometric excess noise and significance are useful for assessing the quality of the astrometric solutions, the renormalized unit error (RUWE) and visibility periods used are more useful in very crowded regions like Wd1. The RUWE is the re-normalized goodness of fit (). is the total number of along scan observations used in the astrometric solution of the source. The high crowding is possibly responsible for the lower number of observations per star. A visibility period indicates the number of groups of observations separated from other groups by at least 4 d. Therefore, a higher number of visibility periods indicates that the solution is less vulnerable to errors. As recommended by Lindegren et al. (2018), we use astrometric solutions with at least eight visibility periods and RUWE . The astrometric excess noise, visibility periods, and RUWE cuts reduce the number of sources in the inner circle from 435 to 42 and reduce the number of sources in the outer ring from 2127 to 1344.
If the uncertainties are relatively small, then direct inversion of the parallax is a reasonable inference of the distance, . However, Fig. 3 shows that most of the stars in the inner and the outer region have large uncertainties, and in some cases, the parallaxes are negative. Therefore, the straightforward approach of inverting the observed parallax is either inaccurate or impossible to an individual star. Averaging a collection of stars would mitigate the problem that a subset of the stars have negative parallaxes, but this average would be biased due to the large uncertainties. Therefore, a more sophisticated inference is required such as the Bayesian approach described in Section 2.3.
2.2 Average statistics: a first rough statistical inference
Before using Bayesian inference, we estimate the true mean parallax for several annuli to roughly infer the cluster parallax.
Fig. 4 shows the estimates for the true mean parallax as a function of radius from the cluster centre. After using the selection criteria (Section 2.1), we calculate the mean parallax for each annulus, and adjust it by the parallax zero-point to get an estimate for the true parallax. These averages show a clear indication that the cluster has a smaller parallax (larger distance) than the average field star. The blue line represents the observed mean parallax corrected by the parallax zero-point of mas for several annuli. For the zero-point, we take the mean of recent estimates (Lindegren et al., 2018; Riess et al., 2018; Stassun & Torres, 2018; Zinn et al., 2019). See Section 2.3 for more details. Each annulus is 1 arcmin in width. To estimate the uncertainty on the average, the vertical error bars show the 68% confidence interval after bootstrapping the average 500 times for each annulus. If the uncertainties are accurate, then one could just calculate the uncertainty using standard error propagation. However, as we show below, the uncertainties are inconsistent with the scatter in the data. Therefore, we choose to use the actual data to report the uncertainties in the average. For radii above 3 arcmin, the average parallax seems to be dominated by the field stars. Interior to this radius, becomes more influenced by the cluster with decreasing radius.
Fig. 4 also justifies our choice for the size of the inner circle and the outer annulus. To be clear, the cluster is not limited to a radius of 1 arcmin. Rather, this is the radius for which we restrict our inference of the cluster parallax. Based upon Fig. 4, the cluster clearly extends several arcminutes. However, our model assumes one density for the cluster. Therefore, we restrict our inference to the inner region where the density is roughly uniform and consistent with our model assumption. Conversely, since the cluster clearly extends several arcminutes, we choose the outer ring far enough to be dominated by field stars. Fig. 4 indicates that 10 arcmin is sufficiently far enough to satisfy this requirement.
One obvious way to estimate the parallax is using the variance weighted mean. The variance-weighted mean parallax for the inner circle is mas, which is inconsistent with the simple mean. mas represents the uncertainty from bootstrapping, while the standard error propagation gives a smaller uncertainty of 0.01 mas. This suggests that the empirical uncertainty is quite a bit larger than the reported uncertainty (see Section 4 for more details). Given that the variance-weighted mean is compromised by the inaccurate uncertainties, we should use a more complicated method like Bayesian inference to infer the parallax. In the next subsections, we infer the cluster parallax through Bayesian inference.
2.3 Bayesian analysis
To infer the Wd1 parallax, the posterior distribution for the model parameters, is the product of the likelihood and a multidimensional prior probability :
[TABLE]
Before we fully describe the likelihood model, we define the model parameters and data. The density of stars (see Fig. 1) suggests a model for two sets of stars. One set, the inner circle, contains both cluster and field stars, and the other set, outer ring, includes only field stars. The outer ring constrains the parameters of the field-star distribution, one of which is the length scale (). gives an effective length scale for the distribution of field stars, equation (8). In practice, this length scale is set by many factors, and one of the main factors is an effective optical depth for extinction along the line of sight (Bailer-Jones et al., 2018).
The full set of observations, , includes two data sets: the parallaxes of stars in the inner circle, , and the parallaxes of stars in the outer annulus, . also includes the number of stars in the inner circle, Ni, and the number of stars in the outer annulus, N. and are the parallax uncertainties for the inner region and the outer annulus. We consider the parallax uncertainties to be fixed parameters and the dependencies on the uncertainties are omitted in the following equations for brevity. The model parameters, , are the cluster parallax, (mas), density of the cluster stars in the inner circle, (number per square arcminute), density of the field stars in the outer ring, (number per square arcminute), which we assume to be similar in the inner ring, the parallax zero-point of the cluster, (mas), the field-star length scale, (kpc), and the field-star offset, (mas). The field-star offset () includes the parallax zero-point but it also includes other possible systematics that affects the field-star distribution but not the cluster parallax; we elaborate more on this later. Naturally, is a function of radius from the centre of the cluster. Rather than assuming a radial profile, we consider annuli and infer an average number density for each annulus. Specifically, henceforth, refers to the average density of the central arcminute of the cluster. Each set also has a nuisance parameter. The nuisance parameters, , are the set of true parallaxes for the inner circle, (mas), and the set of true parallaxes for the outer annulus, (mas).
The probabilistic graphical model (PGM), Fig. 5, shows the interdependence among the observations, the model parameters, and the nuisance parameters. The likelihood probability is:
[TABLE]
Each component of the likelihood on the RHS is a likelihood for a particular set of data. The parameters after represent the set of model parameters which determine the data in the model. If the data depend upon the full set of model parameters, then appears, otherwise we include only the model parameters that matter for each likelihood component. For example, the number of stars in the inner circle, N, depends only upon the cluster density, , and field density, .
To describe the model for the observations, first, consider the bottom row in Fig. 5; it shows all observed parameters. Near the centre is , which represents the data for inner circle. Each observed parallax is drawn from a Gaussian distribution centred on the true parallax; each star has its own true parallax, ; therefore, there is a set of many true parallaxes for the inner region, . Since the central region has both field and cluster stars, each true parallax is either drawn from the cluster or the field-star parent distributions. A priori, it is not clear which star is associated with the cluster or field-star distributions. However, the fraction of stars associated with the cluster is and the fraction associated with the field stars is . Hence, the density of cluster and field stars are important parameters in the generation of the data. Unfortunately, modelling just the inner circle does not constrain the four model parameters along the top row.
Modelling the outer ring provides constraints on the field-star parameters. This then, in combination with the inner circle data, provides a unique constraint on the cluster parameters. Each observation of the outer annulus, , is also drawn from a Gaussian, and the true field-star parallax is drawn from the field-star distribution. Our assumption is that all stars in the outer ring are drawn from the field star distribution. The number of stars in the outer ring is simply given by the field-star density, while the inner circle is a weighted combination of the cluster and field-star densities. Hence, to constrain the cluster density and ultimately the fraction of cluster stars in the inner circle, the likelihood must model both the number of stars in the inner circle and the outer ring.
The first two likelihoods in equation (2), and , represent the number of stars in the inner circle and the outer ring:
[TABLE]
and
[TABLE]
where the expected number of stars in the inner circle is and the expected number of stars in the outer ring is . A is the area of the inner circle and A is the area of the outer ring. ) in equation (2) is the likelihood for the outer set of data:
[TABLE]
PGM provides a map of how to further deconstruct the likelihood using the conditional probability theorem:
[TABLE]
The first term in equation (6), , is the probability of observing any parallax for the th star in the outer ring:
[TABLE]
where is the parallax uncertainties for the outer annulus. In this work, we consider the parallax uncertainties to be fixed parameters. The second term in equation 6, , represents the field star distribution. The PGM shows that this distribution only depends upon two model parameters, and ; hence, . If one considers an image populated with stars, then the total number of stars in the image is given by dr, where FOV is the field of view in square radians, is the number density of stars, and r is the distance from the sun. If is constant, then any random star in the image is drawn from a probability distribution of . This distribution is assumed to fall off exponentially, , due to various effects including the Gaia selection function, attenuation due to dust or a combination of both. Therefore, the distribution of the field stars is
[TABLE]
After transforming from distance to parallax, the field-star distribution becomes
[TABLE]
Together, equations (7) and (9) represent the likelihood for the outer ring, given in equation (6). The zero-point in equation (9), represents an offset for the field star distribution, and it may or may not be the same as the zero-point, , for the cluster members. represents a zero-point associated with instrumental and analysis biases (Lindegren et al., 2018). The parallax distribution for field stars, equation (9), assumes that the distribution approaches zero at ; however, it may not do so for several reasons. Certainly, one part is due to the same instrumental and analysis biases that impact the cluster members. In addition, sightlines through the plane of the Galaxy likely have a distribution of field stars that is more complicated than equation (9). For example, while uniformly distributed dust may cause an exponential attenuation with one scale, , a very dusty star-forming region in a spiral arm will likely present a wall, beyond which we cannot see any stars. This would manifest as an abrupt truncation of the field-star distribution at some finite, positive parallax. For this reason, the zero-points for the field stars and the cluster must remain separate variables.
To find the likelihood distribution for the outer ring we marginalize over the nuisance parameters, . In the PGM, Fig. 5, the nuisance parameters are the true parallaxes, . The convolution of the Gaussian and the true field distribution is not analytic and requires a numerical solution.
To derive the likelihood for the inner ring, ) in equation (2), we use equations (5) and (6), but with a change of index from to . The probability of observing any parallax for the th star in the inner circle is also same as equation (7) with a simple exchange of index from to .
Next, we propose a distribution for the true parallaxes of the inner circle given the model parameters, . Once again, the inner circle is composed of cluster and field stars. A star in the inner circle is either drawn from the cluster or from the field-star distributions, and the weighting for each draw is proportional to the density of the respective population. Hence, the distribution for the true parallaxes is
[TABLE]
Since the size of the cluster is much smaller than the distance to the cluster, we assume a delta function for the cluster true-parallax distribution at ; . The second term represents the portion associated with the field stars. To find the likelihood distribution for the inner ring, we marginalize over the nuisance parameter for the inner ring, . The convolution of a Gaussian and a delta function is analytical and a convolution of a Gaussian and the field-star distribution requires a numerical integration.
To find the posterior distribution, equations (1), we choose uniform positive prior distributions for , , , , and . Lindegren et al. (2018) found that the zero-point is a function of colour, magnitude and position; hence, the zero-point has significant variance. However, there are no reference objects in our field to find the zero-point. Therefore, we must use prior information to estimate the zero-point of the cluster. For simplicity, we assume that the zero-point distribution for DR2 fields is Gaussian:
[TABLE]
where and are the mean and variance for the parallax zero-point. The two most effective means to calculate zero-point are to either use background quasars or to use independent distance measurements. Wd1 is in the Galactic plane, so there are no background quasars, and the current distance estimates for Wd1 are too uncertain to use to constrain the zero-point. Therefore, we will use previous analyses to estimate the zero-point distribution. Lindegren et al. (2018) used quasars to infer the zero-point. They found an average of 0.029 mas. Riess et al. (2018) inferred a zero-point of 0.046 0.013 mas for the Cepheid sample. Zinn et al. (2019) compared the distances inferred from astroseismology to infer a zero-point of 0.0528 0.0024 mas. Stassun & Torres (2018) also reported the zero-point of 0.082 0.033 mas from eclipsing binaries. The mean of the above four investigations is = 0.05 mas, and Lindegren et al. (2018) found that the variation for the zero-point across many fields is mas (Lindegren et al., 2018). The data that we use in our inference cannot constrain the zero-point. Therefore, when we infer the posterior distribution for , it will merely reflect this prior distribution.
2.4 Numerical solution for the posterior distribution
To find the posterior distribution, we use a six dimensional Monte Carlo Markov chain package (MCMC), emcee (Goodman & Weare, 2010; Foreman-Mackey et al., 2013) to infer six model parameters (, , , , and ). For each step in the chain, emcee evaluates the posterior by calculating the likelihood for the inner circle and the likelihood for the outer ring. Both likelihoods require the convolution of a Gaussian with the true field distribution. Evaluating these integrals is time intensive. Instead of calculating the integrals at every step in the chain, we create look-up tables for each integral.
Each object has its own look-up table evaluated at a grid of points in . For each trial of in the MCMC, we find the convolution by first-order interpolation in the look-up table. To construct each look-up table, we use trapezoid numerical integration, which requires bounds of integration. Formally, the bounds extend from to , but that is not practical for trapezoid numerical integration. Fortunately, the integrands in the likelihoods have a peak and fall off quickly on either side of this peak. To ensure that the numerical integration adequately samples this peak, we set the bounds of integration to be centred on the peak and have a width that extends just outside the peak. To roughly estimate the position of the peak and the extent of the bounds, we approximate the integrand as the convolution of two Gaussians. The mode and width of the first Gaussian is straightforward, and . The mode of the field star distribution is , and the width is . In the two Gaussian approximation, the mode of the integrand is roughly at and the width is roughly where weighting factors are and . Therefore, we integrate from to .
For each MCMC run, we use 100 walkers, 2500 steps each, and we burn 500 of those. For the results presented in Section 3, the acceptance fraction is in the range.
3 Parallax and Distance to Westerlund 1
Fig. 6 shows the posterior distribution for , , , , , and . The two regions used to constrain these parameters are an inner circle centred on the position of Wd1 and with a radius of 1 arcmin, and an outer annulus from 9 to 10 arcmin. The values in the top right corner show the mode and the highest 68% density interval (HDI) for marginalized distributions. The parallax of the cluster is mas, which corresponds to a distance of kpc, density of the cluster is stars per square arcminute, density of field stars is stars per square arcminute, the parallax zero-point of the cluster is mas, the field-star length scale is kpc, and the field-star offset is mas. The posterior shows a single-peaked marginalized probability distribution for all parameters.
Presumably, the cluster density is a function of radius. Since we model the cluster density with one average density and not a radial profile, there is a potential for bias to affect the inference. As long as the width of each annulus is small compared to the change in density, then approximating the density in each ring with an average annulus should work well. To test this hypothesis, we infer the full posterior distribution for several inner annuli. For each inference, we use the outer annulus from 9 to 10 arcmin to constrain the field-star parameters. Fig. 7 shows the inferred parallaxes for each annulus. The rings extend from 0 to 0.5, 0.5 to 1.0, 1.0 to 1.5, and 1.5 to 2.0 arcmin with 8, 34, 105, and 141 total number of stars, respectively. In terms of parallax, the highest density intervals are [0.25, 0.43], [0.28, 0.43], [0.34, 0.45], and [0.40, 0.54] mas. All rings below 1.5 arcmin are consistent with our main result of 0.35 mas from using an inner circle with radius 1 arcmin (Fig. 6), and rings above 1.5 arcmin more likely represent the field stars parallax. Therefore, we conclude that modelling the average cluster density rather than a radial density profile is sufficient for the chosen inner region sizes.
4 Discussion
The results in Section 3 have significant implications for both the distance to Wd1 and the distribution of field stars. The inferred is kpc, which is consistent with the model of Bailer-Jones et al. (2018), 1.38 kpc. The 2 difference is not that surprising given that the Bailer-Jones et al. (2018) estimate for varies as a function of Galactic longitude and latitude (l,b), and it does not include clusters. Also, given that the Bailer-Jones et al. (2018) length scale was derived from a model of the Galaxy before the era of accurate Gaia parallaxes, it is encouraging that the Bailer-Jones et al. (2018) model is only 2 away from the inferred value. Using Bailer-Jones et al. (2018) length scale, =1.38 kpc, and using the same parallax zero-point for the whole region, we find that the cluster parallax is (sys) mas, which differs by only 1 from the value when we infer as well. While the results are formally consistent, using the Bailer-Jones et al. (2018) length scale, skews the cluster parallax towards the average field-star distribution.
Our inferred parallax of the cluster is mas, which corresponds to a distance of kpc. For this inference, we used Bayesian inference with a six-parameter model; there are other, simpler statistical inferences, but these typically ignore known contamination and systematics. These simpler inferences are useful in checking the results of our Bayesian inference. For example, we calculate the average distance for all stars within 1 arcmin from the cluster centre, where the individual distances comes from Bailer-Jones et al. (2018). To estimate the uncertainty, we bootstrap the average. In this way, the average distance for these stars is kpc; 25% of our posterior distribution contains distances larger than this simple mean. The systematic uncertainty is due to the zero-point uncertainty (Lindegren et al., 2018) and it dominates the uncertainty. Or another approach is to calculate the true mean parallax for the inner circle with 1 arcmin radius (see the first point in Fig. 4). This gives mas; this simple mean is within the 68% confidence interval of our inferred parallax. Once again, the systematic uncertainty is due to the uncertainty in the zero-point. While these simple approaches provide a good check on our result, these naive calculations are not sufficient. For one, many of the parallaxes are negative. Secondly, we infer that at least 1/4 of the stars in the inner arcminute are field stars. Therefore, to infer the distance to the cluster one must use Bayesian inference and model both the field stars and the cluster stars.
Wd1 is located at a Galactic longitude of . Given this longitude and the new inferred distance of kpc ( 8500 ly), Wd1 most likely lies in the Scutum-Centaurus arm, which is one of the major spiral arms of the Milky Way (Bobylev & Bajkova, 2014; Urquhart et al., 2014; Vallée, 2014). This may be an independent way to confirm the new inferred distance of kpc to Wd1.
Based upon the prior distribution, the parallax zero-point for the cluster is mas; in contrast, we infer a field-star offset of mas. In our model, represents a combination of the instrumental zero-point, , and a truncation of the field-star distribution. It could represent the zero-point for the entire region. However, 0.12 mas is much larger than the average of all other previous analyses (mean of mas). Therefore, the most likely conclusion is that is dominated by a truncation of the field star distribution. This would occur if the line of sight toward this region intersects an exceptionally dense region of dust at a true parallax around 0.17 mas. This scenario is consistent with the fact that this line of sight is in the plane of the Galaxy. Alternatively, if we force , then we find that the parallax of the cluster is mas, which corresponds to a distance of kpc (see Fig. 8). Again, the most likely scenario is that ; therefore, the most likely cluster parallax is mas.
is the expected parallax uncertainty, but the empirical uncertainty is different. For example, Lindegren et al. (2018) found that the empirical uncertainty is 1.081 times larger than the expected value for quasars in DR2. Similarly, we measure the empirical uncertainty distribution for all stars within 1 arcmin of Wd1. Fig. 9 shows the parallaxes centred on the cluster and weighted by the uncertainty:
[TABLE]
If the uncertainties are accurate, uncorrelated and random, then the distribution of should be Gaussian with . In other words, 68% of should be within the interval . Fig. 9 reports the percent of the distribution inside of 1, 2, 3, and 4. If uncertainties represent the full variation and the distribution is Gaussian, then one would expect the distribution to have 68% of the data inside 1, 95% inside 2, etc. However, the probabilities are smaller than one would expect. In fact, we calculate that the 68% highest density interval for , HDIx is . The half width of this interval is 1.7. In other words, the empirical uncertainties are 1.7 times larger than the expected uncertainties. This suggests that there are significant problems with the uncertainty estimate in the region of Wd1. At the moment, it is not clear what is causing this inaccuracy. The high extinction (red colours), crowding, and binarity may be three important factors. Without having the raw data, it is difficult to investigate this problem further, so we proceed with our inference, keeping in mind that the DR2 uncertainties are likely too small by a factor of 1.7.
The fact that the empirical uncertainties are 1.7 larger than the calculated uncertainties suggests that there is a problem with the five-parameter astrometric model for these stars. This could be due to any number of issues, including the excess noise model, binarity, or the degrees of freedom bug (Lindegren et al., 2018). Formally, the astrometric excess noise is a way to incorporate extra empirical noise. The DR2 pipeline uses error propagation to include the excess noise into of source i. However, even after the excess noise is included we find that there is still significant excess empirical noise. This discrepancy may be due to the specific model assumed for the excess noise in the DR2 pipeline. However, equation 120 of Lindegren et al. (2012) assumes that the excess noise for each observation is uncorrelated. Hence, more observations will reduce the uncertainty by the square root of the observations. However, if the true excess noise is correlated among the observations, more data will not reduce the uncertainty; the excess noise would represent a floor on the uncertainty. The assumption that the excess noise for each observation is uncorrelated may be why the empirical uncertainties for Wd1 are 1.7 times larger than the calculated uncertainties.
The inferred distance to Wd1 is kpc, which represents the highest precision distance estimate for Wd1 that has been published. Historical estimates to Wd1 range from 1.0 to 5.5 kpc (see Section 1). Recently, Clark et al. (2005) estimated that the distance to Wd1 ranged from 2 to 5.5 kpc. The bounds given by Clark et al. (2005) corresponds to a factor of 2.8 in distance; in contrast, the precision in the Gaia DR2 inferred distance is 18%. One can use the new distance to infer the fundamental parameters of the cluster such as luminosity, mass, and age via isochrone fitting. In this manuscript, we do not perform isochrone fitting. Instead, we estimate these fundamental parameters using two techniques. First, we scale previous estimates using the new distance, and we infer the luminosity, mass, and age of two bright stars in Wd1. The estimates of two bright stars provides a good proxy for the whole cluster.
The 18% precision in distance will lead to precision in luminosity. For stars below about 20 M⊙, the main-sequence luminosity for a given mass should scale as . Therefore, the corresponding uncertainties in mass and age are and , respectively. For stars above about 55 M⊙, , and in this case, the range of mass estimates has the same precision as the luminosity. For the highest masses, the age depends very weakly on mass or luminosity because these stars have very similar lifetimes around 3 Myr.
The new Gaia DR2 distance also provides strong constraints on the luminosity, mass, and age of cluster members. Wd1 hosts a diverse population of evolved massive stars such as WR stars, red and blue supergiants, YHGs, an LBV, and a magnetar. Previous studies have inferred a turn-off mass of around 40 M⊙ and cluster age of 3.5-5 Myr with a presumed distance of around 5 kpc (Clark et al., 2005; Crowther et al., 2006; Ritchie et al., 2009; Negueruela et al., 2010; Lim et al., 2013). These estimates were based on modelling the luminosity and temperatures of YHGs, RSGs, and WR stars. By association, this would imply that the magnetar progenitor had an initial mass of 40 M⊙ (Muno et al., 2005; Ritchie et al., 2010). On the other hand, Koumpia & Bonanos (2012) found a progenitor mass of 40 M⊙ and a distance value of 3.70.6 kpc by studying eclipsing binaries in the Wd1.
Without re-evaluating the bolometric and extinction corrections for each star, the new distance of 2.6 kpc reduces all luminosities by -0.58 dex as compared to a distance of 5 kpc. Using single-star stellar evolution models (Brott et al., 2011), we now infer the mass, age, and corresponding main-sequence turn-off mass for two of the brightest stars in Wd1, the LBV W243 and a YHG4. The inferred log for W243 is 5.2 and for YHG4 is 5.4. The spectral type of W243 is B2I (to A2I) (Westerlund, 1987; Clark & Negueruela, 2003) and for YHG4 is F2Ia*+*. This corresponds to temperatures of 9.17 kK (to 17.58 kK) and 7.2 kK, respectively. For these temperatures, using single-star stellar evolution models (Brott et al., 2011), the masses are 23.9 for W243 and 28.6 for YHG4, the corresponding ages are 7.6 and 5.5 Myr, respectively. While we did not fit isochrones, these ages would correspond to isochrones with main-sequence turn-off masses of 22.3 (W243) and 25.9 (YHG4).
If we assume that LBV W243 is a representative of the cluster, then the age of the cluster is 7.6 Myr, the turn-off mass is 22.3 (down from 40M⊙), and the mass of the most evolved stars is 28.6. Fig. 10 shows that the new inferred luminosity brings the LBV W243 to the lower edge of the S Doradus instability strip (Smith et al., 2004). However, Fig. 10 also clearly shows that there are RSGs in Wd1 with implied initial masses below 20 M*⊙*, well below the presumed turn-off mass even at the nearer distance, and implied ages of around 10 Myr. This may suggest either uncertain bolometric corrections, a range of ages in Wd1, or may point to the influence of binary evolution on the evolved star population (see e.g. Beasor et al. 2019).
Most of the prior distances for Wd1 relied on measuring an apparent magnitude, assigning an absolute magnitude based upon the stellar type, and calculating the distance modulus. However, Wd1 suffers from high extinction, with an inferred of about 11 mag (Clark et al., 2005; Damineli et al., 2016). The uncertainty in the reddening translates to highly uncertain true apparent magnitude estimates, and hence, highly uncertain luminosity-based distance estimates. With an independent geometric Gaia DR2 distance, the reddening and bolometric luminosities of cluster stars can be re-evaluated, although this is beyond the scope of our paper.
One final but important point concerns the total stellar mass of Wd1. This cluster has been discussed as potentially one of the most massive young star clusters in the Galaxy (Clark et al., 2005). However, in addition to lowering the luminosities of the evolved stars, lowering the cluster turn-off mass, and raising the cluster’s age, the smaller distance from DR2 also lowers the total mass of the cluster. The inferred very high total stellar mass of the cluster of 105 M*⊙* was derived by integrating down a Kroupa IMF from the turn-off mass of around 40 M*⊙, and by scaling relative to the number of observed evolved supergiant stars of initially 30-40 M⊙. If the revised DR2 distance lowers all luminosities by -0.58 dex, and hence the turn-off mass from 40 to 22 M⊙* as noted above, then the expected relative number of evolved stars increases by a factor of 4 at the turn-off mass. Normalizing to the observed number of evolved stars therefore lowers the extrapolated total stellar mass of the cluster by roughly the same amount, which would make Wd1’s initial mass comparable to or less than the mass of the Arches cluster in the Galactic centre, although Wd1 would be significantly older (Kim et al., 2000; Stolte et al., 2002; Harfst et al., 2010).
5 Conclusion
We use Gaia DR2 parallax measurements and Bayesian inference to estimate the distance to the Westerlund 1 (Wd1) massive star cluster, as well as the distribution of field stars along the line of sight. We model both cluster stars and Galactic field stars, and we find that the cluster parallax is mas, which corresponds to a distance of kpc. The new distance represents the highest precision, 18%, to Wd1 to date. Much of this precision is limited by the systematics such as parallax zero-point, which is included in the Bayesian model. However, the models are rough and conservative, and require improvement in the future. For example, rather than using one zero-point value, we consider a distribution of zero-points due to the observed variation of the parallax zero-point. we also consider different offsets for the field and cluster stars. This model is a rough estimate, and to further improve the parallax precision will either require better models for the systematics, or better calibration in subsequent data releases.
Wd1 has been discussed as potentially one of the most massive young star clusters in the Galaxy, but revising the distance to this one cluster reduces its total mass and increases its age, and may have profound consequences for stellar evolution theory. An improved distance can significantly narrow the precision on luminosity, mass, and age of the cluster, which provides constraints on the post-main-sequence evolution of cluster members. Based on the new Gaia distance, we infer turn-off mass of around 22 M*⊙, which implies that the progenitor mass of the magnetar CXO J164710.2–455216, and LBV W243 is a little bit above 22 M⊙*.
Acknowledgements
This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement. This project was developed in part at the 2018 Gaia Sprint, hosted by the eScience and DIRAC Institutes at the University of Washington, Seattle. Support for MA and JWM was provided by the National Science Foundation under Grant No. 1313036. Support for NS was provided by NSF awards AST-1312221 and AST-1515559, and by the National Aeronautics and Space Administration (NASA) through HST grant AR-14316 from the Space Telescope Science Institute, which is operated by AURA, Inc., under NASA contract NAS5-26555.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Bailer-Jones et al. (2018) Bailer-Jones C. A. L., Rybizki J., Fouesneau M., Mantelet G., Andrae R., 2018, AJ , 156, 58 · doi ↗
- 2Beasor et al. (2019) Beasor E. R., Davies B., Smith N., Bastian N., 2019, MNRAS , 486, 266 · doi ↗
- 3Bobylev & Bajkova (2014) Bobylev V. V., Bajkova A. T., 2014, MNRAS , 437, 1549 · doi ↗
- 4Brandner et al. (2007) Brandner W., Clark J. S., Stolte A., Waters R., Negueruela I., Goodwin S. P., 2007, A&A , 478, 137 · doi ↗
- 5Brott et al. (2011) Brott I., et al., 2011, A&A , 530, A 115 · doi ↗
- 6Brown et al. (2018) Brown A. G. A., et al., 2018, A&A , 616, A 1 · doi ↗
- 7Clark & Negueruela (2003) Clark J. S., Negueruela I., 2003, A&A , 413, L 15 · doi ↗
- 8Clark et al. (2005) Clark J. S., Negueruela I., Crowther P. A., Goodwin S. P., 2005, A&A , 434, 949 · doi ↗
