Automatic line selection for abundance determination in large stellar spectroscopic surveys
Georges Kordopatis, Vanessa Hill, Karin Lind

TL;DR
This paper introduces a method to evaluate the usefulness of atomic lines in stellar spectra for optimizing spectrograph setups and line lists, enhancing abundance measurements in large surveys.
Contribution
The authors present a novel automated technique to assess line purity and detectability, aiding instrument design and spectral analysis pipelines.
Findings
Green+Red setup detects more elements and useful lines.
High-resolution spectra yield more lines than low-resolution.
Purity threshold impacts the number of usable lines.
Abstract
*Context: The optimisation of new multiplex spectrographs (resolution, wavelength range,...), their associated surveys (choice of setup), or their parameterisation pipelines require methods that estimate which wavelengths contain useful information. *Aim: We propose a method that establishes the usefulness (purity & detectability) of an atomic line. We show two applications: a) optimising an instrument, by comparing the number of useful lines at a given setup, and b) optimising the line-list for a given setup by choosing the least blended lines detectable at different signal-to-noise ratios. *Method: The method compares pre-computed synthetic stellar spectra containing all of the elements and molecules with spectra containing the lines of specific elements alone. Then, the flux ratios between the full spectrum and the element spectrum are computed to estimate the line purities. The…
| Spectrograph setup | Wavelength range (nm) | Resolving power (a) |
|---|---|---|
| WEAVE-LR (b) | [366;959] | |
| WEAVE-HR (B+R) (b) | [404;465] + [595;685] | |
| WEAVE-HR (G+R) (b) | [473;545] + [595;685] | |
| 4MOST-LR (c) | [370;950] | |
| 4MOST-HR (c) | [392.6;435.5] + [516;573] + [610;679] | |
| Gaia-RVS | [846;870] (d) | |
| DESI | [360;980] | |
| HERMES (e) | [471.8;490.3] + [564.9;587.3] + [648.1;673.9] + [759.0;789.0] | |
| LAMOST-LR (f) | [370;900] | 1800 |
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Spectroscopy Techniques in Biomedical and Chemical Research · Molecular spectroscopy and chirality
Affiliations: (1) Université Côte d’Azur, Observatoire de la Côte d’Azur, CNRS, Laboratoire Lagrange, Nice, France; (2) Department of Astronomy, Stockholm University, AlbaNova University Centre, SE-106 91 Stockholm, Sweden
Tables with identified lines from 300 to 1000 nm, and resolving powers of 3 000, 6 000, 20 000, 40 000 and 80 000, are only available in electronic form at the CDS via anonymous ftp to cdsarc.cds.unistra.fr (130.79.128.5) or via https://cdsarc.cds.unistra.fr/cgi-bin/qcat?J/A+A/
Abstract
*Context. * In recent years, new multiplex spectrographs that have observed, or plan to observe, several million stars have emerged. The optimisation of these instruments (regarding resolution or wavelength range), their associated surveys (choice of instrumental setup), or their parameterisation pipelines requires methods that estimate which wavelengths, or pixels, contain useful information.
*Aims. *We propose a method that establishes the usefulness of an atomic spectral line, where usefulness is defined by the purity of the line and its detectability. We show two applications of our code: a) optimising an instrument, by comparing the number of detected useful lines at a given wavelength range and resolution, and b) optimising the line-list for a given setup, in the sense of creating a golden subsample, choosing the least blended lines detectable at different signal-to-noise ratios.
*Methods. *The method compares pre-computed normalised synthetic stellar spectra containing all of the elements and molecules with spectra containing the lines of specific elements alone. Then, the flux ratios between the full spectrum and the element spectrum are computed to estimate the line purities. The method identifies automatically (i) the line’s central wavelength, (ii) its detectability based on its depth and a given signal-to-noise threshold and (iii) its usefulness based on the purity ratio defined above.
*Results. * We apply this method to compare the three WEAVE high-resolution setups (Blue: 404-465 nm, Green: 473-545 nm, Red: 595-685 nm), and find that the Green+Red setup both allows one to measure more elements and contains more useful lines. However, the elements that are detected differ from one setup to another, a disparity which we characterise. We also study the performances of high-resolution and low-resolution spectra covering the entire optical wavelength range. Assuming a purity threshold of 60 per cent, we find that the high-resolution setup contains a much richer selection of lines, for any of the considered elements, whereas the low-resolution setup has a “loss” of 50 to 90 per cent of the lines (depending on the nucleosynthetic channel considered), even when the signal-to-noise is increased.
*Conclusions. *The method presented provides a vital diagnostic of where to focus to get the most out of a spectrograph, and is easy to implement for future instruments whose final configuration has not yet been decided, or for pipelines that require line masks.
Key words: Stars: abundances, Line: identification, Techniques: spectroscopic
1 Introduction
The relative abundance ratios of atomic elements measured from stellar photospheres hold key information for multiple fields in modern astrophysics, ranging from galaxy formation (e.g., Freeman & Bland-Hawthorn, 2002) to stellar nucleosynthesis (Burbidge et al., 1957; Iwamoto et al., 1999; Nomoto et al., 2013; Karakas & Lattanzio, 2014, and references therein), especially if coupled with an estimation of the stellar age (e.g. Kordopatis et al., 2023). Specifically, by measuring the elemental abundance pattern of a star, it is possible to determine its birthplace and siblings, and/or the star formation history that preceded its formation. Yet, measuring the abundance of specific elements in a stellar spectrum is not straightforward (e.g. Jofré et al., 2019). Ultimately, inferring the number of atoms of a species present in the photosphere depends on how easily a specific spectral line can be detected, measured and transformed into an abundance. In other words, this task depends on the one hand on the accuracy of the stellar atmosphere and line profile modelling (Gray, 2005), and on the other hand on how accurately the spectral line can be measured (signal-to-noise ratio, S/N, resolution of the spectrum and blending of several stellar features). The accuracy of the line modelling in turn depends on how accurately and precisely the stellar atmospheric parameters are known (namely, the effective temperature, Teff, surface gravity, log g, global metallicity, [M/H], and α-element enhancement, [α/Fe]).
As a consequence, the design phase of a spectroscopic survey is a tough negotiation between spectral resolution, exposure time, adopted wavelength range and total number of targets observed by the end of the project (e.g. Feltzing, 2016). For this purpose, it is important to be able to assess easily, early in the project, the information available at a given spectral range, resolving power (R) and signal-to-noise (S/N), for specific types of stars. This is intrinsically not trivial, as it often requires a stellar spectra parameterisation pipeline to be already available (e.g., Caffau et al., 2013; Bedell et al., 2014; Hansen et al., 2015). Yet, such a pipeline often requires a tedious phase of training and/or optimisation (e.g. Recio-Blanco et al., 2006; Kordopatis et al., 2011, 2013; Ness et al., 2015; Piskunov & Valenti, 2017), which is therefore incompatible with the timescale or even the scope of the desired tests. In this context, recent years have seen the development of codes that quickly explore the information available in a spectrograph’s configuration, in order to provide answers to the previously raised questions (e.g. Ruchti et al., 2016; Ting et al., 2017; Sandford et al., 2020).
The Spectral Wavelength Optimization Code (SWOC, Ruchti et al., 2016) requires the user to provide a predefined table containing the central wavelength and the equivalent width (or line-depth) of features that are considered of particular interest. SWOC then evaluates the quality and the wavelength distribution of these features for a considered stellar-type, determines the optimal wavelength coverage based on a defined Figure-of-Merit, and eventually combines this information for different stellar types to ascertain the optimal wavelength coverage for a survey. This approach therefore relies on already having a priori information regarding which lines are of interest. This is not always the case, especially in wavelength regions that have not yet been commonly used in large surveys in the past.
A different approach has been adopted by Ting et al. (2017), who employ the Cramér-Rao bound metric to quantify the amount of information available in a spectrum of specific wavelength range and resolution, associated with a given label (in this case, elemental abundance). Being based on the so-called gradient spectra, i.e. the variation of the spectrum at a given wavelength associated to a specific label, as well as on the covariance matrix of the spectrum, the metric sums over the different wavelength pixels to inform the user about which elements can be detected above a given significance threshold. Ting et al. (2017) conclude that, given a fixed exposure time and number of pixels (and therefore different S/N and wavelength ranges depending on resolution), low-resolution spectra could provide an equivalent amount of information to high-resolution spectra. Yet, this conclusion has been obtained assuming that resolution and S/N are uniform across the wavelength range and that line-blends are correctly known (and modelled), which is frequently not the case.
The caveats mentioned in the previous paragraphs motivated the development of a new code, which we present in this paper. Its purpose is to identify “useful” lines in a synthetic spectrum, i.e. lines that are visible and not heavily blended at a given spectral resolution and S/N, without any a priori knowledge. This information is then stored and can be used either to create a line-list selection for spectral analysis (e.g. for abundance determination), or to visualise how many lines of a specific element or a nucleosynthetic channel are useful for a given instrumental configuration. It therefore has immediate and valuable applications for spectroscopic surveys based on already existing or future spectroscopic facilities or instruments such as APOGEE (Majewski et al., 2017), DESI (Abareshi et al., 2022), Gaia-RVS (Gaia Collaboration et al., 2016; Cropper et al., 2018), GALAH (De Silva et al., 2015), LAMOST (Deng et al., 2012), 4MOST (de Jong et al., 2019), WEAVE (Jin et al., 2022), MOONS (Cirasuolo et al., 2020), MSE (The MSE Science Team et al., 2019), PFS (Takada et al., 2014), etc.
The paper is structured as follows. In Sect. 2 we present the concept of the code: how it runs, which inputs it requires, and which outputs it produces. The synthetic spectral library on which the code relies is described in Sect. 3. The code is then applied in Sect. 4 to a handful of examples. In Sect. 4.1 a verification of the identified lines based on the line-list established within the Gaia-ESO survey (Randich et al., 2022; Gilmore et al., 2022) is performed, and in Sect. 4.2 we illustrate how an instrument’s design can be optimised, by comparing the performances of high- and low-resolution spectrographs for specific types of stars. In Sect. 4.3, we evaluate the performances of the two WEAVE high-resolution configurations to suggest the setup that best drives Galactic archaeology science. In Sect. 4.4 we show how our code can be used to create a ‘golden’ line-list for spectral synthesis codes. Finally, Sect. 5 concludes.
2 Description of the code
2.1 Description of the algorithm
Let S be the normalised synthetic spectrum of a star at a given set of atmospheric parameters {Teff, log g, [M/H], [α/Fe]}. This spectrum is computed at an instrumental resolving power R = λ/FWHM (where FWHM is the full-width at half maximum of the line spread function of the instrument) with a sampling δλ. S contains the lines and blends of all of the elements and molecules present at the photosphere of the star.
Similarly, let S_X be the normalised stellar spectrum containing only the lines associated to the element X, at the same parameters, resolving power R, pixel sampling δλ and same continuous opacities as in S. Each element has a reference line-list associated to it (here retrieved from the Vienna Atomic Line Database, VALD, http://vald.astro.uu.se/; Piskunov et al., 1995; Ryabchikova et al., 2015), which is used for the computation of both S and S_X. For convenience, in what follows we omit the subscript when implicit. The steps of our algorithm to identify the lines, for a given element X, are the following:
1. We detect all of the lines in the element spectrum S_X by identifying their cores blindly. To achieve this, we search for the zero crossings in the derivative of S_X, without imposing any threshold in the flux (see, however, below). The wavelengths λ_0 of all the identified line-cores of the element X are stored in a list. The number of lines in this list is smaller than or equal to the number of VALD entries for X.
2. We identify the true central wavelength of each line in the list, by cross-matching with the VALD line-list. Often, several VALD lines fall within the matching window around the considered λ_0. In this case, we use the Boltzmann equation to evaluate which is the most prominent in the considered subset. In practice, we choose the line that is expected to be the strongest, ranking all candidate lines according to excitation energy and oscillator strength in the following way:
[Eq. 1]
where N is the number of atoms, χ is the excitation potential of the line, log gf is the logarithm of the oscillator strength times the statistical weight of the parent energy level, and k_B is the Boltzmann constant (we note that by doing so, we assume that all of the lines are at the same ionisation level, which is not necessarily the case unless S_X is computed only for a given ionisation level).
3. For each item in the list, we evaluate the depth of the line and keep the lines that are deep enough to be detected at a given S/N. This criterion, derived in Appendix A, is defined as:
[Eq. 2]
where S/N is the signal-to-noise ratio per resolution element (see Appendix A for the formula with S/N per pixel). We note that the criterion is applied to the elemental spectrum S_X rather than to the observed total spectrum S. The reason for this choice is that we want to impose a criterion on the detectability of the line independently of its blend (or purity, see Eq. 4, below).
4. For each λ_0, we identify the blue-end, λ_b, and red-end, λ_r, of the line, defined as the first wavelengths blue-wards and red-wards of the core where:
[Eq. 3]
i.e. the wavelengths at which the flux has reached a small fraction of the value it had at its core. We limit the search for λ_b and λ_r to a fixed window around λ_0. A value of 0.02 (i.e. 2 per cent) has been empirically adopted for this fraction.
5. We define the purity factor as:
[Eq. 4]
and then compute the purity for the entire line (p_tot), the blue half (p_blue) and the red half (p_red). This is done because the line can be blended differently on its blue or red wing (see, for example, Fig. 1), and a line that is free of blend in one of its wings may still be very useful and reliable for abundance determinations (see how blending can affect the equivalent width measurements in Appendix B). The following wavelength ranges are therefore adopted in Eq. 4:
[equation]
6. Finally, we evaluate the number of pixels in the blue and in the red that are close to the continuum within the adopted search window. This allows us, eventually, to flag the lines that have a purity above a determined arbitrary value but that can nevertheless be difficult to detect because they are in the wings of stronger lines further away from λ_b or λ_r.
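The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names, the `k / snr` detectability rule (a stand-in for the Appendix A criterion, which is not reproduced here), the Boltzmann-factor ranking `gf * exp(-chi / kT)`, the line-end rule based on a fraction `eps` of the core depth, and the purity written as a ratio of summed absorption are all assumptions made for the example.

```python
import numpy as np

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def find_cores(flux):
    """Indices of local minima, i.e. where the derivative goes from - to +."""
    d = np.diff(flux)
    return np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1


def strongest_candidate(loggf, chi_ev, temperature):
    """Assumed ranking: ln(gf) - chi / (k_B T); returns the index of the maximum."""
    score = np.asarray(loggf) * np.log(10.0) - np.asarray(chi_ev) / (K_B_EV * temperature)
    return int(np.argmax(score))


def is_detectable(depth, snr, k=3.0):
    """Placeholder rule (hypothetical k): the depth must exceed k times the noise."""
    return depth >= k / snr


def line_ends(elem_flux, core, eps=0.02, max_pix=50):
    """Walk away from the core until the depth drops to eps times the core depth."""
    limit = eps * (1.0 - elem_flux[core])
    lo = core
    while lo > 0 and core - lo < max_pix and 1.0 - elem_flux[lo] > limit:
        lo -= 1
    hi = core
    while hi < len(elem_flux) - 1 and hi - core < max_pix and 1.0 - elem_flux[hi] > limit:
        hi += 1
    return lo, hi


def purity(full_flux, elem_flux, i0, i1):
    """Assumed form: absorption of the element alone over the total absorption."""
    total = np.sum(1.0 - full_flux[i0:i1 + 1])
    return float(np.sum(1.0 - elem_flux[i0:i1 + 1]) / total) if total > 0 else 0.0


# Toy data (made-up values): one line of element X, blended in its red wing.
wl = np.linspace(499.8, 500.2, 401)
elem = 1.0 - 0.4 * np.exp(-0.5 * ((wl - 500.00) / 0.01) ** 2)
other = 1.0 - 0.2 * np.exp(-0.5 * ((wl - 500.03) / 0.01) ** 2)
full = elem * other  # crude approximation of the full spectrum

cores = find_cores(elem)
core = int(cores[0])
blue_end, red_end = line_ends(elem, core)
p_tot = purity(full, elem, blue_end, red_end)
p_blue = purity(full, elem, blue_end, core)
p_red = purity(full, elem, core, red_end)
print(core, blue_end, red_end, round(p_tot, 2), round(p_blue, 2), round(p_red, 2))
```

In this toy case the blue half of the line is almost pure while the red half is not, which is the situation that motivates computing the three purity factors separately.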
2.2 Input/output of the algorithm
In order to run, the code requires as input (i) a synthetic spectrum that includes all the elements, (ii) a set of synthetic spectra with the atomic lines of only one element each time (several ionisation levels can be present in the element spectrum in case the user does not wish to differentiate between them), computed at the same wavelength range, resolving power and atmospheric parameters as the full spectrum, (iii) the reference line-list for each element that has been used to compute the spectra, (iv) an arbitrary S/N threshold and finally (v) the spectral resolving power of the instrument. The latter two parameters are used to evaluate the detectability of a line; the spectral resolving power is applied by convolving the simulated spectra (provided with infinite resolution) with a Gaussian of appropriate FWHM.
The code delivers, for a given element, a table containing the central wavelengths of the lines, the blue-ends, the red-ends, the three purity factors (total, blue and red), the depth of the line in the full spectrum, the depth of the line in the element spectrum, and the number of pixels in the blue and in the red that have a flux close to the continuum. This information can later be used as desired to make summary or diagnostic plots, or to select “clean” lines for codes that require such an input.
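As an illustration of how such an output table might be consumed, the sketch below stores one row per line and keeps the lines that pass a purity cut. The `LineRecord` fields mirror the columns listed above, but the field names, the `select_clean` helper, the selection policy (accept a line if the whole profile or at least one wing is pure enough) and the numerical values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LineRecord:
    element: str
    wl_core: float      # central wavelength (nm)
    wl_blue: float      # blue end of the line (nm)
    wl_red: float       # red end of the line (nm)
    p_tot: float        # purity over the whole line
    p_blue: float       # purity of the blue half
    p_red: float        # purity of the red half
    depth_full: float   # depth in the full spectrum
    depth_elem: float   # depth in the element spectrum
    n_cont_blue: int    # near-continuum pixels on the blue side
    n_cont_red: int     # near-continuum pixels on the red side


def select_clean(records, min_purity=0.6, min_cont_pix=1):
    """Keep lines whose total purity, or one well-anchored wing, passes the cut."""
    keep = []
    for r in records:
        blue_ok = r.p_blue >= min_purity and r.n_cont_blue >= min_cont_pix
        red_ok = r.p_red >= min_purity and r.n_cont_red >= min_cont_pix
        if r.p_tot >= min_purity or blue_ok or red_ok:
            keep.append(r)
    return sorted(keep, key=lambda r: r.p_tot, reverse=True)


# Made-up rows, for illustration only.
table = [
    LineRecord("Fe", 500.00, 499.97, 500.03, 0.95, 0.97, 0.93, 0.42, 0.40, 3, 2),
    LineRecord("Fe", 510.00, 509.97, 510.03, 0.50, 0.90, 0.30, 0.35, 0.20, 2, 0),
    LineRecord("Fe", 520.00, 519.97, 520.03, 0.30, 0.35, 0.25, 0.50, 0.15, 0, 0),
]
clean = select_clean(table, min_purity=0.6)
print([r.wl_core for r in clean])
```

The second row is kept only because its blue wing is pure and reaches the continuum, which reflects the remark above that a line free of blend in one wing can still be useful.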
Figure 1 shows three cherry-picked examples of line identifications with our code for a Solar-like spectrum. The total, blue and red purity factors are indicated in the figure, together with the central wavelength and the depth of the line for the element spectrum alone.
3 Grid of synthetic spectra of infinite resolution
We consider nine stellar types, at different combinations of Teff and log g, and six different values of [M/H] (see Fig. 2), resulting in 54 different templates.
The spectra are computed using PySME v4.10 (https://pypi.org/project/pysme-astro/; Wehrhahn et al., 2022) and the SME library v5.22 (https://www.stsci.edu/~valenti/sme.html; Valenti & Piskunov, 1996; Piskunov & Valenti, 2017) together with the 1-dimensional MARCS model atmospheres (Gustafsson et al., 2008), assuming local thermodynamic and hydrostatic equilibrium. The spectra cover a fixed total wavelength range with a constant sampling in wavelength. The adopted line-list is from the VALD3 database (downloaded in January 2021). The molecular line-list includes CH, CN, C2, TiO, MgH, SiH, CO, and OH. The elemental abundance ratios are the same as for the MARCS model atmospheres, i.e. solar scaled with Grevesse et al. (2007), except for lithium (for which a single abundance is adopted for all of the stars) and for α-elements, for which the abundance is a function of metallicity, as follows:
[equation]
We note that the adopted elemental abundances do not necessarily reflect what exists in nature, and that in practice lines could be detected more easily (in case of over-abundance) or less easily (in case of under-abundance).
The resolution of the computed spectra is infinite, in the sense that no macro-turbulence, rotational broadening or instrumental broadening has been applied. To obtain the spectrum as it would be observed by a specific instrument (assuming that no rapid rotators are to be observed and that macro-turbulence does not dominate the line profile), one therefore simply needs to convolve the initial spectrum with a Gaussian kernel whose FWHM corresponds to the resolving power of the considered spectrograph, then crop it to the wavelengths the spectrograph observes (see for example Table 1).
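The convolve-then-crop operation described above can be sketched as follows. This is a minimal sketch under simplifying assumptions: the function name `degrade` is hypothetical, the wavelength grid is assumed to be uniformly sampled, and a single FWHM (evaluated at the centre of the cropped window) is used for the whole range, whereas FWHM = λ/R formally varies with wavelength. The resolving powers and wavelength window in the usage lines are illustrative values only.

```python
import numpy as np


def degrade(wl, flux, resolving_power, wl_min, wl_max):
    """Convolve a uniformly sampled spectrum with a Gaussian LSF, then crop."""
    step = wl[1] - wl[0]
    fwhm = 0.5 * (wl_min + wl_max) / resolving_power  # FWHM = lambda / R at window centre
    sigma_pix = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / step
    half = int(np.ceil(5.0 * sigma_pix))               # kernel truncated at 5 sigma
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()                             # unit area preserves equivalent widths
    padded = np.pad(flux, half, mode="edge")           # avoid artificial dips at the borders
    smooth = np.convolve(padded, kernel, mode="valid")
    keep = (wl >= wl_min) & (wl <= wl_max)
    return wl[keep], smooth[keep]


# Toy spectrum: a single narrow line at 640 nm on a flat continuum.
wl = np.linspace(595.0, 685.0, 9001)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wl - 640.0) / 0.01) ** 2)

wl_hr, flux_hr = degrade(wl, flux, 20000, 600.0, 680.0)
wl_lr, flux_lr = degrade(wl, flux, 5000, 600.0, 680.0)
print(flux.min(), flux_hr.min(), flux_lr.min())
```

Lowering the resolving power makes the line shallower while conserving its equivalent width, which is why the detectability of a line depends on both its depth and the adopted S/N.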
The different elements for which individual spectra have been computed are the following:
- Even-Z elements: C, O, Mg, Si, S, Ca, Ti.
- Odd-Z elements: Li, N, Na, Al, P, K, Sc.
- Iron-peak elements: V, Cr, Mn, Fe, Co, Ni, Cu, Zn.
- First-peak neutron-capture elements: Rb, Sr, Y, Zr, Mo.
- Second-peak neutron-capture elements: Ba, La, Ce, Pr, Nd, Sm, Eu.
We note that we treat neutral and ionised species separately. Furthermore, whereas molecules are included in the full spectra, molecular lines associated with a given element were not considered for detectability or usefulness.
The VALD line-list used to identify the lines contains 621 357 unique entries. It is a merged version of two “extract stellar” requests from the VALD3 database, one for a solar-metallicity giant and one for a solar-metallicity dwarf, which included hyperfine splitting, a depth detection threshold set to 0.001 and a fixed micro-turbulence.
4 Applications
Below, we show a validation of our code using the Gaia-ESO survey line-list (Sect. 4.1), as well as three different applications of it. Section 4.2 investigates the purity of the lines for different instrument setups (different resolving powers but similar wavelength range), while Sect. 4.3 examines how two setups of similar resolving power compare when probing different wavelength regions. Section 4.4 shows how to select a golden sublist of the most useful lines, based on the output of our code.
4.1 Validation through comparison with the Gaia-ESO line-list
The Gaia-ESO public spectroscopic survey (GES, Randich et al., 2022; Gilmore et al., 2022) observed Milky Way stars from 2011 to 2018 using the high-resolution spectrographs UVES and GIRAFFE, covering mostly the wavelength regions [480-680] and [850-900] nm. The consortium analysed the spectra using more than five different pipelines (Smiljanic et al., 2014), based on a variety of methods, ranging from spectral synthesis to equivalent-width measurement, and from model-driven to data-driven parameterisation. In this process, a particular effort was put into homogeneously selecting lines that were suitable for spectral analysis, both in terms of blending and in terms of reliability of atomic parameters. This effort was published in Heiter et al. (2021), where the authors provide blending quality flags (with the keyword synflag) based on the visual inspection of high-resolution spectra of the Sun and Arcturus. These lines are labelled ‘Y’, for not blended or blended with a line from the same species for either star, ‘N’ for blended for both stars, and ‘U’ for blended for at least one of the stars.
To evaluate the performance of our code, we compared our results with those of GES, selecting only the lines that have synflag=‘Y’. To this end, we selected synthetic spectra amongst our templates with Solar-like and Arcturus-like parameters (log g = 4.5, [Fe/H] = 0 and log g = 1.0, respectively), and ran our code on these with a signal-to-noise threshold equal to 500 per resolution element and a minimum purity equal to 0.2, in order to retrieve as many lines as possible.
Among the 358 lines that GES has identified as reliable (we do not consider, for this work, the hydrogen lines, and we keep only one line per element if within 0.01 nm from the others), we recover 331 of them for the Sun and 344 for Arcturus, i.e. 92.5 per cent and 96 per cent, respectively (the crossmatch has been performed by rounding the wavelength to 0.01 nm). Figure 3 shows the ratio of recovered lines over the ones available from GES, per element. For Arcturus, we recover at least a portion of the lines for all of the considered elements of Heiter et al. (2021). This is not the case for the Sun, where our code selects none for La, Mo, Pr or Zr (the Heiter et al. 2021 list contains 1, 2, 1 and 5 lines, respectively). A deeper investigation of the lines for these elements suggested that our code fails to select them because in our synthetic spectra they are too weak (possible disagreement between the model and reality), or too blended. We note, however, that the Heiter et al. (2021) selection is done at a resolving power which is twice as high as the one considered here, and not necessarily in a uniform way for all of the elements, i.e. a synflag=‘Y’ could be assigned to the best line of an element, even if it is rather blended.
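The recovery fractions quoted above, and a crossmatch by rounding wavelengths to 0.01 nm, can be reproduced with a short sketch. The helper name `recovered_fraction` and the toy wavelengths are hypothetical; only the counts 358, 331 and 344 come from the text. Note that matching on rounded values can miss two wavelengths that are very close but fall on either side of a rounding boundary.

```python
def recovered_fraction(reference_nm, found_nm, decimals=2):
    """Fraction of reference lines with a match after rounding to 0.01 nm."""
    ref = {round(w, decimals) for w in reference_nm}
    found = {round(w, decimals) for w in found_nm}
    return len(ref & found) / len(ref)


# Toy crossmatch with made-up wavelengths (nm): two of three lines are recovered.
reference = [500.004, 510.126, 520.000]
selected = [500.001, 510.131, 530.000]
toy_fraction = recovered_fraction(reference, selected)

# Recovery rates from the counts quoted in the text.
n_ges = 358
sun_percent = 100.0 * 331 / n_ges
arcturus_percent = 100.0 * 344 / n_ges
print(round(toy_fraction, 3), round(sun_percent, 1), round(arcturus_percent, 1))
```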
Figure 4 shows the purity of the lines as a function of wavelength, focusing, arbitrarily, on the range [470-690] nm. All of the lines we have identified for the Sun or Arcturus, with a purity greater than 0.3 and detectable with an S/N of less than 500, are represented in grey. The lines selected by GES with synflag=‘Y’ that exist in our selection are circled in coloured solid lines (orange for Fe-peak lines, red for even-Z elements, green for neutron-capture elements and blue for odd-Z elements).
Figure 4 illustrates, rather unsurprisingly, that the lines that are pure for the Sun are not necessarily of the same purity for Arcturus, and vice versa. Our code therefore provides the advantage of immediately visualising the purity of a set of lines for a given set of atmospheric parameters. Furthermore, Fig. 4 validates our code: the lines selected by Heiter et al. (2021) are found to be mostly of high purity (mostly above 0.7 for both stars). Finally, the plot indicates that the GES selection is rather conservative and favours purity for the Solar spectrum. That said, the purity for Arcturus remains rather high, with the majority of the lines having a value greater than 0.8 (as opposed to higher than 0.95 for the Sun). It is beyond the scope of this paper to discuss the validity and limitations of the GES selection.
4.2 Instrument design and optimisation
In this section we investigate how lines, selected in a similar way as in the previous section, compare for a high-resolution (HR) and a low-resolution (LR) setup. We take once again the case of the Sun and Arcturus, with the parameters defined in the previous section, as illustrative of a metal-rich turn-off star and a metal-poor giant.
Figures 5 and 6 show the lines that are selected for each setup and each star, provided a minimum purity of 0.6 and a maximum S/N that is set separately for HR and for LR. A larger S/N threshold is adopted for LR, to mimic the fact that one would gain S/N by going for LR mode at a fixed exposure time. Note that we assume that the S/N is the same across all of the wavelength range, and that the wavelength range is the same for both setups. Neither of these assumptions is strictly true, especially the first one, since noise is in general wavelength dependent (e.g. wavelength-dependent efficiency of the spectrograph, decreasing optical quality at the borders of the detector, interstellar extinction absorbing preferentially in the blue, etc.).
For both the Sun and Arcturus, Figs. 5 and 6 show that the HR setup contains a much richer selection of lines, for any of the considered elements. Indicatively, 233 (275) α-element lines, 606 (769) Fe-peak lines, 10 (80) neutron-capture lines and 26 (41) odd-Z element lines are selected for the Sun (Arcturus) in HR, compared to 124 (78), 374 (375), 2 (8) and 11 (12) in LR, despite the higher S/N threshold (we recall, however, that the purity threshold is kept equal to 0.6 in both cases). This corresponds to a “loss” of 50 to 90 per cent of the lines (depending on the nucleosynthetic channel considered). In practice, going for LR implies giving up hopes of detection with a purity greater than 0.6 for Eu, Sm, Nd, Pr, Ce, Mo, Sr, Zn and Cu for Arcturus, while for the Sun the problem is a bit less dramatic, losing only Y, Sr and Zn (because many of the aforementioned elements lost in Arcturus LR are not detected for the Sun in HR either). Furthermore, the purity of the lines overall decreases in LR, as expected due to the blending of the lines.
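As a quick arithmetic check, the fractional loss per nucleosynthetic channel can be recomputed directly from the HR and LR line counts quoted above. The dictionary layout and the helper name `loss_percent` are choices made for this sketch. With these counts, the individual losses come out between roughly 38 per cent (Fe-peak lines for the Sun) and 90 per cent (neutron-capture lines for Arcturus), so the quoted range of 50 to 90 per cent should presumably be read as approximate.

```python
# (HR count, LR count) per family and star, taken from the text above.
counts = {
    "alpha": {"Sun": (233, 124), "Arcturus": (275, 78)},
    "Fe-peak": {"Sun": (606, 374), "Arcturus": (769, 375)},
    "neutron-capture": {"Sun": (10, 2), "Arcturus": (80, 8)},
    "odd-Z": {"Sun": (26, 11), "Arcturus": (41, 12)},
}


def loss_percent(n_hr, n_lr):
    """Percentage of HR-selected lines that are no longer selected in LR."""
    return 100.0 * (1.0 - n_lr / n_hr)


losses = {
    (family, star): loss_percent(*pair)
    for family, stars in counts.items()
    for star, pair in stars.items()
}
for (family, star), value in sorted(losses.items(), key=lambda kv: kv[1]):
    print(f"{family:16s} {star:9s} {value:5.1f} per cent")
```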
This application illustrates which lines are detectable, for specific spectral types, with what purity, and at what required S/N, given an instrumental resolving power. It can be used in order to choose wavelength ranges that contain the most information based on instrumental constraints (e.g. size of the CCD) or observational strategy (e.g. exposure time, target brightness, stellar type).
In what follows, we will use this information to assess which WEAVE-HR setup performs best per nucleosynthetic channel and per element.
4.3 Choosing between setups: application to the high-resolution setups of WEAVE
We now put ourselves in the framework of a survey design, for instance WEAVE. There exist two WEAVE Galactic archaeology (GA) HR surveys: an HR-chemodynamical survey targeting the thin and thick disc as well as the halo, and an Open Cluster survey, aiming to target roughly a hundred young and old open clusters in the disc (Jin et al., 2022). WEAVE has the possibility to choose between two HR setups: the first one, dubbed B+R setup in what follows, covers the wavelength ranges [404-465] and [595-685] nm. The second one, dubbed G+R setup in what follows, covers the wavelength ranges [473-545] and [595-685] nm. The question that we are trying to answer is the following: which setup combination best probes the different nucleosynthetic channels? In other words, which combination of setups maximises the number of elements and the number of useful lines, per nucleosynthetic channel (α-elements, odd-Z elements, Fe-peak elements, neutron-capture elements), across the targeted parameter space of Teff, log g and [M/H]?
To set this value, we rely on WEAVE’s GA survey plan (WEAVE consortium, private communication) and adopt as a threshold , which is the value of the expected peak in the blue setup for the typical selection of the WEAVE GA-HR baseline survey. Other setups are expected to have a higher value. Using Eqs. 12 and 13, this corresponds to a minimum required depth of for a line to be detected.
We ran our code on a set of metal-poor stars (, representative of the halo), intermediate-metallicity stars (, representative of the thick disc) and metal-rich stars (, representative of the thin disc and open cluster stars). The results are shown in Figs. 7, 8 and 9, for α-elements, Fe-peak elements and neutron-capture elements, respectively (we have not plotted the results for Teff K and log g = 5.0 for visualisation purposes). They illustrate the number of lines (colour-code) and number of different elements (size of the points) detected for each nucleosynthetic family and each combination of Teff and log g. The purity threshold for the α-, Fe-peak and neutron-capture elements has been arbitrarily set at , and , in order to optimise the number of lines and the purity itself. A detailed view of the detected lines per element across the Kiel diagram is shown in the Appendix (see Figs. 12 to 20).
4.3.1 Even-Z elements
As shown in Fig. 7, the red setup is clearly the one driving the science for intermediate and high metallicities, with more than useful lines, throughout the Kiel diagram, and four elements detected with a purity greater than 0.8 (the size of the black solid circle in Fig. 7 corresponds to four elements). For metal-poor stars (), the blue setup performs slightly better than the green and red setups, with more elements and more lines detected. The green setup performs slightly better than the blue one for intermediate and high metallicities, a regime where, as noted above, the red setup drives the science for α-elements.
More specifically, based on Figs. 12 and 13, the following diagnostics can be drawn about individual elements:
- •
Carbon (atomic) is seen in both the green and red setups (but not for metal-poor stars), with a slightly higher purity in the green setup (). It is not detectable in the blue setup. We note, however, that these are high-excitation C I lines, most readily visible in warmer stars, while C measurements may be achieved using molecular features such as CH in cooler stars.
- •
Oxygen is only detectable in the red setup via the nm line.
- •
Magnesium is detectable in all three setups. The green setup has lines with a very good purity for every metallicity regime (thanks to the Mg I triplet). The blue setup contains useful lines too, but with a lower purity ().
- •
Silicon has many lines detected in the red setup (), as opposed to the green and blue setups which are not optimal for this element (less than 10 lines and ).
- •
Sulphur is only detectable in the red setup, for high and intermediate metallicity stars, with .
- •
Calcium has many detectable lines (more than four in each setup), and its purity is very good in the red (). The blue setup performs better than the green, with more lines and higher purity.
- •
Titanium has many lines detectable in all setups, with an overall low purity compared to other α-elements. For low-metallicity stars, the green setup is preferred to the blue one as it shows a higher purity.
4.3.2 Odd-Z elements
No global plot combining the odd-Z elements is presented, as these cannot be linked to a specific nucleosynthetic channel. Nevertheless, their abundance determination is of prime importance in many fields of galactic and stellar evolution, and a thorough description of how the setups perform is necessary. Based on Figs. 14 and 15, the following diagnostics can be drawn:
- •
Lithium is detected at all stellar types and metallicities in the red setup, thanks to the nm line, and additionally at nm for the most metal-poor giant stars. We recall, however, that given the adopted Li abundance in the modelled spectra, A(Li)=2.00 dex, our results are likely overestimated for giants (for which due to dilution A(Li), see however the case of Li-rich giants, e.g. Charbonnel & Balachandran 2000), and under-estimated for more metal-rich turn-off stars (see Karakas & Lattanzio, 2014, and references therein).
- •
Nitrogen (atomic) is not detectable in any of the setups.
- •
Sodium is detected in the red setup for all stars, except for the metal-poor regime, with . The green and blue setups perform similarly, each of them providing low-purity lines () that do not allow detection across the whole Kiel diagram at any metallicity.
- •
Aluminium is seen only in the red, for intermediate and high metallicities. The purity is overall high ().
- •
Phosphorus is not detectable in any of the setups.
- •
Potassium is not detectable in any of the setups.
- •
Scandium is detected in the red setup with . Green and blue setups also contain useful Sc lines, especially at low metallicities, though with a lower purity than in the red setup. The green setup performs better than the blue one both in terms of number of lines and in terms of purity.
4.3.3 Fe-peak elements
As shown in Fig. 8, there is a plethora of lines to select from, with more than 60 lines with a purity greater than 0.9 for any of the setups. Overall, the green setup performs the best for all stars at low metallicity, as well as for main-sequence stars at intermediate metallicity. The red setup is the one driving the science for giants at intermediate metallicity and for all stars at high metallicity. The combination of the green and red setups allows us to get at least seven iron-peak elements at any combination of Teff, log g and [M/H].
More specifically, based on Figs. 16 and 17, the following diagnostics can be drawn about individual elements:
- •
Vanadium has high purity lines in the red setup (). The blue setup performs better than red or green at low metallicity, with lines detected over the entire Kiel diagram.
- •
Chromium has a few high-purity lines in the red setup for intermediate and high metallicities. At low metallicity, both the green and blue setups exhibit many lines, with a marginal advantage for the green setup over the blue one in terms of line purity.
- •
Manganese has the highest purity lines for intermediate and high metallicity stars in the red setup, which also performs relatively well at low metallicity. Overall, the green setup performs better than the blue, the former having purer lines than the latter.
- •
Iron has many lines that are detectable in all setups, and in fact Fe I dominates the number counts in Fig. 8. The red setup has the highest purity (), and the green setup has purer Fe lines than the blue.
- •
Cobalt has the purest lines in the red setup. The blue setup performs better than green at low metallicity, allowing a detectability of Co lines for both giants and main-sequence stars with a purity of .
- •
Nickel has the purest lines in the red setup. The green setup performs much better than the blue, with more numerous and purer lines.
- •
Copper has lines seen only in the green, with a relatively low purity (), except for metal-poor stars, where .
- •
Zinc is not seen in the blue setup. The green setup is the only one that allows us to measure a Zn abundance at low metallicities.
4.3.4 Neutron-capture elements
As shown in Fig. 9, the WEAVE setups contain far fewer useful lines for neutron-capture elements than for the α- and Fe-peak elements, with at best 20 lines for metal-rich and intermediate metallicity giants. As far as the turn-off region is concerned, the blue setup performs best, with more elements being probed compared to the other two setups.
More specifically, based on Figs. 18, 19 and 20, the following diagnostics can be drawn about individual elements:
- •
Rubidium is never detected with a purity greater than 0.5 in the considered setups.
- •
Strontium is detected only in the blue setup (reaching for metal-poor stars).
- •
Yttrium has a better purity in the green setup compared to the blue. Yet, for intermediate and high metallicity stars, Y lines are also detected in the red setup.
- •
Zirconium is best detected across the Kiel diagram in the green setup, and sparsely in the blue for main-sequence and the red setup for cool stars. However, the purity is overall low ().
- •
Molybdenum is detected only in the red setup, only for the cooler and dex stars, with .
- •
Barium is detected in all three setups, with the blue one performing slightly better than the green one in terms of purity.
- •
Lanthanum is sparsely detected in all setups for giants with a variety of purities. There is a slight advantage of the green setup over the blue, with purer and more numerous lines.
- •
Cerium is detected at all evolutionary stages only in the blue setup, albeit with a low purity.
- •
Praseodymium is sparsely detected in the blue and green setups.
- •
Neodymium has the most numerous lines detected in the green setup at any evolutionary stage, with a purity similar to that of the other setups at intermediate and high metallicities (or better in the case of metal-poor stars).
- •
Samarium is slightly better detected in the blue setup than in the other two setups.
- •
Europium is only detected in the blue setup for metal-poor stars, while only the red setup allows the detection of usable lines for intermediate and high metallicity giants. No lines are detected in the green setup.
4.3.5 Summary
The diagnostics above were derived for an idealised case of perfectly normalised spectra with white noise (S/N that is constant over the wavelength range). In reality, this will not be the case, and the normalisation is expected to be challenging in the blue setup, due to the multiple atomic and molecular lines. Yet, keeping in mind that WEAVE’s HR baseline survey will not target many cool main-sequence stars (WEAVE consortium, private communication), our results seem to slightly favour the Green+Red setup, both in the number of elements detected and in the number of useful lines. We note, however, that there is a disparity in terms of which elements are detected in each of the setups (e.g. Sr is only detectable in the blue setup), and that the final decision needs to be taken according to the elements that the science cases of the considered surveys decide to highlight and to the expected temperature and metallicity ranges in which those elements need to be detected (i.e. the target selection function).
4.4 Line-list optimisation for abundance determination
For some abundance determination codes, the masking of a subset of lines of a specific element may be desired, in order to decrease the computational time and/or to improve the precision of the measurement. The code presented in Sect. 2 allows one to very simply extract a sub-sample of lines for a given element, given some observational (e.g. maximum S/N) and purity constraints. To build such a “golden line-sublist”, the following considerations could be taken into account:
- •
The purity of the selected lines for a given stellar type and metallicity should be as high as possible.
- •
The linelist for a given stellar type and metallicity needs to include lines that allow an abundance measurement for both a high and a low S/N (reflecting the range in apparent magnitudes of the survey).
- •
For a given stellar type and metallicity, lines on the linear part of the curve-of-growth (i.e. not strong lines) should be favoured, to maximise the sensitivity of the lines to the elemental abundance (Gray, 2005).
- •
When a range of excitation potentials is available for a given species, selecting only the lowest excitation lines should be avoided (these are typically more prone to departures from local thermodynamic equilibrium). In the case of species where many lines are available, spanning a wide range of excitation potentials is desirable to enable checks of the excitation temperature.
- •
The synthetic lines need to reproduce satisfactorily the observed spectra of at least the Sun and Arcturus.
We implemented the above scheme into the creation of a line-list for the blue HR setup of WEAVE. In practice, we imposed as the maximum for the detectability of a line with no purity filter. For each element, we kept all of the available lines if their total number was less than 30 (this number was arbitrarily chosen) when considering all of the set of stellar atmospheric parameters. When there were more than 30 lines available, each stellar spectrum was investigated automatically, splitting the range ] into three bins of equal range, and looking within each of these bins for the lines that had the highest purity. In order to achieve this, we started by imposing a purity of 1 and decreased the latter iteratively by steps of 0.025 until a minimum of five lines was reached while keeping the purity greater than 0.6 (except for Fe, where we imposed a minimum purity of 0.95). Figure 10 shows, for Ti, the properties of all the available lines detectable up to , where we have highlighted in red the ones that we eventually select.
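The iterative selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the function names are hypothetical, the per-line quantity that is binned is passed in as a generic array `values`, and the comparison with the threshold is taken to be non-strict (`>=`), which the text does not specify.

```python
import numpy as np


def select_purest_lines(purity, start=1.0, step=0.025, floor=0.6, min_lines=5):
    """Lower the purity threshold from `start` in steps of `step` until at
    least `min_lines` lines pass, never going below `floor`.

    Returns the indices of the selected lines and the final threshold.
    """
    purity = np.asarray(purity, dtype=float)
    n_steps = int(round((start - floor) / step))
    threshold = start
    selected = np.array([], dtype=int)
    for k in range(n_steps + 1):
        # Recompute from `start` (and round) to avoid accumulating float errors.
        threshold = round(start - k * step, 3)
        selected = np.flatnonzero(purity >= threshold)
        if selected.size >= min_lines:
            break
    return selected, threshold


def select_in_bins(values, purity, n_bins=3, **kwargs):
    """Split the range of `values` into `n_bins` bins of equal width and run
    the threshold search independently inside each bin."""
    values = np.asarray(values, dtype=float)
    purity = np.asarray(purity, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Using only the interior edges maps every line to a bin in 0..n_bins-1.
    bin_index = np.digitize(values, edges[1:-1])
    chosen = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_index == b)
        if members.size == 0:
            continue
        picked, _ = select_purest_lines(purity[members], **kwargs)
        chosen.extend(members[picked].tolist())
    return sorted(chosen)


# Hypothetical purities for seven lines of one element in one stellar spectrum.
idx, thr = select_purest_lines([0.99, 0.97, 0.91, 0.88, 0.86, 0.70, 0.55])
print(idx, thr)
```

If fewer than `min_lines` lines exceed the floor, the function simply returns those that do, together with the floor value. For Fe, the stricter limit quoted in the text would correspond to calling the function with `floor=0.95`.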
The golden line-sublist for the considered element was then obtained by keeping the union of all of the selected lines across the entire set of atmospheric parameters. Figure 11 shows a histogram of the excitation potential of all the available Ti lines detectable for (in grey) and in red, the sub-sample that we have selected. One can see that they successfully span all the range of , with a bias towards lower values, as desired.
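The final step, keeping the union of the per-star selections, can be sketched as below. The function name, the use of (Teff, log g, [M/H]) tuples as keys and the wavelengths are hypothetical placeholders chosen for illustration only.

```python
def golden_sublist(selected_by_star):
    """Union of the selected line identifiers over all stellar-parameter
    combinations, returned as a sorted list."""
    union = set()
    for lines in selected_by_star.values():
        union.update(lines)
    return sorted(union)


# Hypothetical selections: keys are (Teff, log g, [M/H]); values are line
# wavelengths in nm that passed the per-star selection.
selected_by_star = {
    (5750, 4.5, 0.0): [451.27, 453.32],
    (4250, 1.5, -0.5): [453.32, 454.89],
    (6000, 4.0, -2.0): [450.13],
}
print(golden_sublist(selected_by_star))
```

A line therefore enters the final sub-list as soon as it is selected for at least one combination of atmospheric parameters.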
5 Conclusions
Our code for the automatic selection of lines for abundance determination is based on the use of synthetic spectra containing all of the available elements and blends, and on their comparison with a synthetic spectrum at the same stellar parameters containing only one element at a time. A comparison with real, observed spectra is therefore necessary in order to confirm that the selected lines also represent nature accurately. Ideally, this comparison should be done with spectra of stars for which both stellar parameters and individual abundances are best known, i.e. the Sun, Arcturus and other benchmark stars (e.g. Blanco-Cuaresma et al., 2014; Heiter et al., 2015; Jofré et al., 2015). We have not carried out this comparison in this work, as results may vary from one resolving power to the other, yet a simple computation of residuals between the synthetic spectra and the real ones, around the lines that our code selects, should suffice in order to discard lines that are not modelled properly.
Our code can serve as an illustration of where the chemical information is present in a stellar spectrum but, most importantly, it allows one to optimise i) observational strategies, such as the choice of resolution and spectral windows, and ii) analysis codes, through the application of high-quality masks. In particular, observations using the WEAVE (Jin et al., 2022) and 4MOST (de Jong et al., 2019) facilities (both community and consortium surveys) will benefit greatly from the present tool.
The Python code for identifying and characterising the useful lines can be downloaded from GitLab (https://gitlab.oca.eu/gkordo/line_selections). We also share via CDS the tables containing the results at five different resolving powers ( and ) for lines that have a purity greater than 0.4 in at least one of their wings (i.e. or , see Sect. 2.1), for the entire wavelength range between 300 nm and 1000 nm. Results for other resolving powers can be easily computed and provided by contacting the first author of this paper. Finally, the 54 infinite resolution spectra that have been used in this work ( GB) can be shared upon request.
Acknowledgments
We thank the anonymous referee for their comments that helped improve the quality of the paper. This work has benefited from inspiring and fruitful discussions within the WEAVE and 4MOST consortia, as well as with Michael Hanke. Shoko Jin and Scott Trager are warmly thanked for their valuable feedback on early versions of the paper and for discussions that led to the selection of the WEAVE HR wavelength ranges. GK and VH gratefully acknowledge support from the French national research agency (ANR) funded project MWDisc (ANR-20-CE31-0004). KL acknowledges funds from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 852977) and funds from the Knut & Alice Wallenberg foundation. This work was supported by the Programme National Cosmology et Galaxies (PNCG) of CNRS/INSU with INP and IN2P3, co-funded by CEA and CNES. Ansgar Wehrhahn is acknowledged for their contribution to the PySME code. This work has made use of the VALD database, operated at Uppsala University, the Institute of Astronomy RAS in Moscow, and the University of Vienna, as well as the Python packages Numpy (Harris et al., 2020), Matplotlib (Hunter, 2007) and Pandas.
Appendix A Cayrel’s formula and minimum depth of a line
Here we derive our approximation on the desired minimum depth of a line, , in order for it to be detected at a given signal-to-noise ratio, . We start from the standard Cayrel (1988) formula, linking the uncertainty on measuring the equivalent width of a line, to the per pixel, the full width at half-maximum of the line (assuming it has a Gaussian profile) and the pixel size (in wavelength units):
[TABLE]
To a good approximation, . One can therefore derive the formula for the uncertainty of the core of the line, , as:
[TABLE]
where and are in wavelength units, and is per pixel.
Equation 10 can also be expressed as a function of per resolution element, , as follows:
[TABLE]
The detectability of a spectral absorption line is therefore possible if its intrinsic intensity is deeper than:
[TABLE]
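As a numerical illustration of the relations in this appendix, the sketch below evaluates the Cayrel (1988) uncertainty in its commonly quoted form, σ_EW ≈ 1.5 √(FWHM × δx) / (S/N), and converts it to an uncertainty on the line depth using the Gaussian relation EW ≈ 1.064 × FWHM × depth. The numerical prefactors, the function names and the `n_sigma` detection factor are assumptions made for this illustration and may differ from the exact expressions adopted above.

```python
import math

# For a Gaussian profile, EW = GAUSS_EW_FACTOR * FWHM * depth (factor ~1.0645).
GAUSS_EW_FACTOR = math.sqrt(math.pi / (4.0 * math.log(2.0)))


def sigma_ew(snr_pix, fwhm, pixel):
    """Cayrel-type uncertainty on the equivalent width.

    `fwhm` and `pixel` are in the same wavelength units; `snr_pix` is the
    signal-to-noise ratio per pixel.
    """
    return 1.5 * math.sqrt(fwhm * pixel) / snr_pix


def sigma_depth(snr_pix, fwhm, pixel):
    """Uncertainty on the depth of the line core, assuming a Gaussian line."""
    return sigma_ew(snr_pix, fwhm, pixel) / (GAUSS_EW_FACTOR * fwhm)


def min_depth(snr_pix, fwhm, pixel, n_sigma=3.0):
    """Minimum depth for a detection at `n_sigma` (hypothetical criterion)."""
    return n_sigma * sigma_depth(snr_pix, fwhm, pixel)


# Hypothetical example: S/N = 100 per pixel, FWHM = 0.02 nm, pixel = 0.005 nm.
print(sigma_ew(100.0, 0.02, 0.005))
print(sigma_depth(100.0, 0.02, 0.005))
print(min_depth(100.0, 0.02, 0.005))
```

The depth uncertainty scales as 1/(S/N) and as √(δx / FWHM), so that, at fixed S/N per pixel, broader lines sampled by more pixels have a better constrained core depth.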
Appendix B Equivalent width uncertainties in presence of a blend
We consider the measured EW of a line, which we assume to be a combination of the real EW of the line alone, , and a fractional contribution of a blend to the line, . One can hence write:
[TABLE]
The contribution of the blending to the error on can be written as:
[TABLE]
In order to have an error on smaller than 10 per cent (corresponding to an abundance uncertainty of dex if the line is in the linear part of the curve of growth), one therefore needs :
[TABLE]
and hence: .
Similarly, assuming that the blend is known by a factor of , one can write:
[TABLE]
Following the previous steps, in order to have an error smaller than 10 per cent on , one therefore needs :
[TABLE]
and hence: . So that if , then .
Appendix C Purities and detectability of elements for WEAVE high-resolution setups
The plots in this appendix represent the number of lines detected per element and per combination of Teff-log g-[M/H] (size of the points), and the mean purity of the lines (colour-code) at each point of the Kiel diagram, for each of WEAVE’s HR setups. Figures are separated into even-Z (Figs. 12 and 13), odd-Z (Figs. 14 and 15), Fe-peak (Figs. 16 and 17) and neutron-capture elements (Figs. 18, 19 and 20). The figures are discussed in Sect. 4.3.
References
- Abareshi, B., Aguilar, J., Ahlen, S., et al. 2022, AJ, 164, 207
- Bedell, M., Meléndez, J., Bean, J. L., et al. 2014, ApJ, 795, 23
- Blanco-Cuaresma, S., Soubiran, C., Jofré, P., & Heiter, U. 2014, A&A, 566, A98
- Burbidge, E. M., Burbidge, G. R., Fowler, W. A., & Hoyle, F. 1957, Reviews of Modern Physics, 29, 547
- Caffau, E., Koch, A., Sbordone, L., et al. 2013, Astronomische Nachrichten, 334, 197
- Cayrel, R. 1988, in IAU Symposium, Vol. 132, The Impact of Very High S/N Spectroscopy on Stellar Physics, ed. G. Cayrel de Strobel & M. Spite, 345
- Charbonnel, C. & Balachandran, S. C. 2000, A&A, 359, 563
- Cirasuolo, M., Fairley, A., Rees, P., et al. 2020, The Messenger, 180, 10
