Statistically-informed deep learning for gravitational wave parameter estimation
Hongyu Shen, E. A. Huerta, Eamonn O'Shea, Prayush Kumar, Zhizhen Zhao

TL;DR
This paper presents a deep learning approach combining WaveNet, contrastive learning, and normalizing flow to efficiently estimate gravitational wave parameters, matching traditional Bayesian results with significantly reduced computation time.
Contribution
Introduces a novel neural network model for gravitational wave parameter estimation that is fast, accurate, and encodes physical correlations, validated against analytical posteriors.
Findings
Neural network predictions are statistically consistent with Bayesian analyses.
The model produces results within milliseconds per event.
Posterior distributions accurately reflect physical parameter correlations.
Hongyu Shen1,2, E. A. Huerta3,4, Eamonn O'Shea5, Prayush Kumar5,6, and Zhizhen Zhao1,2,7
1Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
2Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
3Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois 60439, USA
4Department of Computer Science, University of Chicago, Chicago, Illinois 60637, USA
5Cornell Center for Astrophysics and Planetary Science, Cornell University, Ithaca, New York 14853, USA
6International Centre for Theoretical Sciences, Tata Institute of Fundamental Research, Bangalore 560089, India
7National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
Abstract
We introduce deep learning models to estimate the masses of the binary components of black hole mergers, $(m_1, m_2)$, and three astrophysical properties of the post-merger compact remnant, namely, the final spin, $a_f$, and the frequency and damping time of the ringdown oscillations of the fundamental $\ell=m=2$ bar mode, $(\omega_R, \omega_I)$. Our neural networks combine a modified WaveNet architecture with contrastive learning and normalizing flow. We validate these models against a Gaussian conjugate prior family whose posterior distribution is described by a closed analytical expression. Upon confirming that our models produce statistically consistent results, we used them to estimate the astrophysical parameters of five binary black holes: GW150914, GW170104, GW170814, GW190521 and GW190630. We use PyCBC Inference to directly compare traditional Bayesian methodologies for parameter estimation with our deep learning based posterior distributions. Our results show that our neural network models predict posterior distributions that encode physical correlations, and that our data-driven median results and 90% confidence intervals are similar to those produced with gravitational wave Bayesian analyses. This methodology requires a single V100 NVIDIA GPU to produce median values and posterior distributions within two milliseconds for each event. This neural network, and a tutorial for its use, are available at the Data and Learning Hub for Science.
1 Introduction
The advanced LIGO [1, 2] and advanced Virgo [3] observatories have reported the detection of tens of gravitational wave sources [4, 5, 6]. At design sensitivity, these instruments will be able to probe a larger volume of space, thereby increasing the detection rate of sources populating the gravitational wave spectrum. Thus, given the expected scale of gravitational wave discovery in upcoming observing runs, it is timely to explore the use of computationally efficient signal-processing algorithms for gravitational wave detection and parameter estimation.
The rationale to develop scalable and computationally efficient signal-processing tools is apparent. Advanced gravitational wave detectors will be just one of many large-scale science programs that will be competing for access to oversubscribed and finite computational resources [7, 8, 9, 10]. Furthermore, transformational breakthroughs in multi-messenger astrophysics over the next decade will be enabled by combining observations in the gravitational, electromagnetic and astro-particle spectra. The combination of these high dimensional, large volume and high speed datasets in a timely and innovative manner presents unique challenges and opportunities [11, 12, 13].
The realization that companies such as Google, YouTube, among others, have addressed some of the big-data challenges we are facing in multi-messenger astrophysics has motivated a number of researchers to learn what these companies have done, and how such innovation may be adapted in order to maximize the science reach of big-data projects. The most successful approach to date consists of combining deep learning with innovative and extreme scale computing.
Deep learning was first proposed as a novel signal-processing tool for gravitational wave astrophysics in [14]. That initial approach considered a 2-D signal manifold for binary black hole mergers, namely the masses of the binary components $(m_1, m_2)$, and considered simulated advanced LIGO noise. The fact that this method was as sensitive as template-matching algorithms, but at a fraction of the computational cost and orders of magnitude faster, provided sufficient motivation to extend the methodology and apply it to detect real gravitational wave sources in advanced LIGO noise in [15, 16]. These studies have sparked the interest of the gravitational wave community to explore the use of deep learning for the detection of the large zoo of gravitational wave sources [17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38].
Deep learning methods have matured to now cover a 4-D signal manifold that describes the masses of the binary components and the $z$-component of the 3-D spin vector: $(m_1, m_2, s_1^z, s_2^z)$ [39, 40]. These algorithms have been used to search for and find gravitational wave sources by processing open source advanced LIGO data in bulk, which is available at the Gravitational Wave Open Science Center [41]. In the context of multi-messenger sources, deep learning has been used to forecast the merger of binary neutron stars and black hole-neutron star systems [37, 42]. The importance of including eccentricity for deep learning forecasting has also been studied and quantified [38]. In brief, deep learning research is moving at an incredible pace.
Another application area that has gained traction is the use of deep learning for gravitational wave parameter estimation. The established approach to estimate the astrophysical parameters of gravitational wave signals is Bayesian inference [43, 44, 45, 46], which is a well tested and extensively used method, though computationally intensive. On the other hand, given the scalability and computational efficiency of deep learning models, gravitational wave parameter estimation can take advantage of their power to produce faster inference.
Gravitational wave parameter estimation has rapidly evolved from point-wise parameter estimation [14, 15, 16] to the use of dropout in neural networks to provide estimation intervals [47], and to output a parametrized approximation of the corresponding posterior distribution [48]. Other methods have proposed the use of Conditional Variational Auto-Encoders (CVAEs) to infer the parameters of GWs embedded in simulated noise [49, 50]. In [51] the authors harness new methods, e.g., normalizing flow [52], to do parameter estimation over the full 15-dimensional space of binary black hole system parameters for the event GW150914. Building upon this study, the authors of [53] presented deep learning methods to estimate the astrophysical parameters of several gravitational wave events. One can also refer to [54, 55] for a comprehensive review of gravitational-wave-based machine learning approaches.
In this article we quantify the ability of deep learning to estimate the masses of the binary components of binary black hole mergers, and the astrophysical parameters that describe the properties of the black hole remnant, namely, the final spin, $a_f$, and the frequency and damping time of the ringdown oscillations of the fundamental $\ell=m=2$ bar mode, $(\omega_R, \omega_I)$, known as quasinormal modes (QNMs) [56]. An existing approach proposes to use neural networks to solve differential equations for QNMs [57]. Our approach, on the other hand, differs from this and other studies in the literature in that we estimate the astrophysical parameters of the remnant by directly feeding time-series advanced LIGO strain data into our deep learning algorithms.
This article is organized as follows. In Section 2 we describe the architecture of our neural network model, and the datasets used to train, validate and test it. We briefly describe the Bayesian inference pipeline, PyCBC Inference, in Section 3, which we used as a baseline to compare the full posterior distributions predicted by our deep learning model. We quantify the accuracy and physical consistency of the predictions of our deep learning model for several gravitational wave sources in Section 4. We summarize our findings and future directions of work in Section 5.
2 Methods
Herein we describe several methods to improve the training performance and model accuracy of our algorithms. We have used PyTorch [58] to design, train, validate and test our neural network models.
2.1 Deep Learning Model Objective
The goal of our deep learning model is to estimate a posterior distribution of the physical parameters of the waveforms from the input noisy data. This approach shares similarities with Bayesian approaches such as Markov Chain Monte Carlo (MCMC), e.g., once a likelihood function and a predefined prior are provided, posterior samples may be drawn. The difference between the deep learning model and MCMC is that our proposed framework will learn a distribution from which we can easily draw samples, thereby increasing computational efficiency significantly. It is worth emphasizing that once the likelihood model is properly defined, the framework we introduce here may be applicable to other disciplines.
In the context of gravitational waves, the noisy waveform is generated according to the following physical model,
$$y = F(x) + n, \tag{1}$$
where $F$ is the function that maps the physical parameters $x$ (masses and spins) to the gravitational waveform template [46, 59, 60], and $n$ denotes the additive noise at various signal-to-noise ratios (SNR). We use $y_{ij}$ with subscript pair $(i, j)$ to specifically indicate the $i$-th template associated with the $j$-th noise realization in our dataset $\mathcal{D}$. For simplicity, we use $y$ and $x$ to indicate noisy waveforms and the physical parameters when the specification of the $i$ or $j$ subscript is not needed. We use $d_y$ and $d_x$ to denote the dimension of $y$ and $x$, respectively.
We use WaveNet [61] to extract features from the input noisy waveforms. WaveNet was first introduced as an audio synthesis tool to generate human-like audio given random inputs. It uses dilated convolutional kernels and residual connections to capture the spatial information both in the time domain and across the model depth, which has been shown to be a powerful tool for modeling time-series data. Previously, [62, 40] tailored this architecture for gravitational wave denoising and detection. The encoded feature vector $h$ comes from an embedding function $g_w$ parameterized by the WaveNet weights $w$. In other words, $h = g_w(y)$.
Normalizing flow is a technique to transform distributions with invertible parameterized functions. Specifically, we use a conditional version of normalizing flow, the conditional autoregressive spline [63, 64, 65, 66, 67], to learn the posterior distribution on top of the latent space encoded by WaveNet, and we implement it through a PyTorch-based probabilistic programming package, Pyro [65]. Mathematically, the invertible function $f_{\theta,h}$ is parameterized by the learnable model weights $\theta$ and the encoded feature $h$. In this way, we encode dependencies of the posterior distribution on the input $y$. The random vector $z$ is drawn according to a pre-defined base distribution $p(z)$, and has the same dimension as $x$. The function $f_{\theta,h}$ is then used to convert the base distribution to the approximated posterior distribution of the physical parameters,
$$q_{\theta,w}(x \mid y) = p(z)\,\left|\det \frac{\partial f_{\theta,h}(z)}{\partial z}\right|^{-1}, \tag{2}$$
with $z = f_{\theta,h}^{-1}(x)$ and $h = g_w(y)$.
The computation of the transformation $f_{\theta,h}$ contains two steps. The first step is to compute the intermediate coefficients $c$ from the feature vector $h$ based on the function $k_\theta$, which is parameterized by 2 fully connected layers with weights denoted as $\theta$, i.e., $c = k_\theta(h)$. The coefficients $c$ are used to combine the invertible linear rational splines to form $f_{\theta,h}$ (see Eq. (5) in [64] for details). Therefore, $f_{\theta,h}$ is an element-wise invertible linear rational spline with coefficients $c$. Since $c$ depends on the input waveform $y$ and $\theta$, the resulting mapping $f_{\theta,h}$ and parameterized distribution in Eq. (2) vary with the input $y$. The parameterization of the estimated posterior distribution is illustrated in Figure 1.
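The two-step construction above can be illustrated with a minimal sketch. It is not the spline flow used in the paper: as a simplifying assumption, an element-wise affine map stands in for the linear rational spline, and the function `coefficients` is a hypothetical placeholder for the two fully connected layers. The sketch only shows how a density conditioned on a feature vector follows from the change-of-variables rule.

```python
import math
import random

def base_log_prob(z):
    """Log-density of a standard normal base distribution, independent per dimension."""
    return sum(-0.5 * zi * zi - 0.5 * math.log(2.0 * math.pi) for zi in z)

def coefficients(h):
    """Hypothetical stand-in for the coefficient network: maps a feature vector h
    to one (shift, log_scale) pair per output dimension (two dimensions here)."""
    s = sum(h) / len(h)
    return [(s, 0.1 * s), (-s, -0.1 * s)]

def forward(z, h):
    """x = f(z; h): element-wise invertible affine map conditioned on h."""
    return [shift + math.exp(log_scale) * zi
            for zi, (shift, log_scale) in zip(z, coefficients(h))]

def inverse(x, h):
    """z = f^{-1}(x; h)."""
    return [(xi - shift) * math.exp(-log_scale)
            for xi, (shift, log_scale) in zip(x, coefficients(h))]

def log_q(x, h):
    """Change of variables: log q(x | h) = log p(z) - log|det df/dz|, z = f^{-1}(x; h)."""
    z = inverse(x, h)
    log_det = sum(log_scale for _, log_scale in coefficients(h))
    return base_log_prob(z) - log_det

# Draw one sample from the conditional density and check that f is invertible.
rng = random.Random(0)
h = [0.5, 1.5, 1.0]
x = forward([rng.gauss(0.0, 1.0) for _ in range(2)], h)
roundtrip = forward(inverse(x, h), h)
```

Sampling costs one pass through `forward`, which is why drawing many posterior samples from a trained flow is cheap compared with running a Markov chain.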
To learn the network weights, we need to construct the empirical loss objective given the collection of training data $\mathcal{D}$. We propose to include a loss term defined on the feature vectors in our learning objective to account for the variation in the waveform due to noise. That is, if the underlying physical parameters are similar, then the similarity of the feature vectors should be large, and vice versa. To achieve this, we use a contrastive learning objective [68] to distinguish positive data pairs (noisy waveforms with the same physical parameters) from negative pairs (noisy waveforms with different physical parameters). Specifically, we use the normalized temperature-scaled cross entropy (NT-Xent) loss used in the state-of-the-art contrastive learning technique SimCLR [69, 70]. SimCLR was originally introduced to improve the performance of image classification with additional data augmentation and NT-Xent loss evaluation. We adapt the NT-Xent loss used in contrastive learning to our feature vectors,
[TABLE]
where , is a scalar temperature parameter, and we choose according to the default setting provided in [69]. The NT-Xent loss acts in such a way that, regardless of the noise statistics, the cosine distances between the encoded features associated with the same underlying physical parameters (i.e. and ) are minimized, and the distances between features with different underlying physical parameters are maximized. Consequently, the trained model is robust to changes in noise realizations and noise statistics. Therefore, the term in Eq. (3) can be used as a noise stabilizer for gravitational wave parameter estimation. We found that the inclusion of this term speeds up the convergence in training.
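As an illustration of the contrastive term, the sketch below implements the standard NT-Xent loss as defined for SimCLR, applied to pairs of feature vectors that share the same physical parameters. The exact indexing used in this work is not recoverable from the text here, so the pairing convention, the function names, and the temperature value of 0.5 are assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(pairs, tau=0.5):
    """Standard NT-Xent loss, averaged over all anchors.

    pairs: list of (h_a, h_b), the features of two noise realizations of the
           same template (a positive pair). Every other feature in the batch
           acts as a negative. tau is an illustrative temperature.
    """
    flat = [h for pair in pairs for h in pair]  # indices 2i and 2i+1 are positives
    total = 0.0
    for a in range(len(flat)):
        positive = a ^ 1  # partner index within the pair
        denom = sum(math.exp(cosine(flat[a], flat[k]) / tau)
                    for k in range(len(flat)) if k != a)
        numer = math.exp(cosine(flat[a], flat[positive]) / tau)
        total += -math.log(numer / denom)
    return total / len(flat)

# Features that cluster by template give a lower loss than mismatched ones.
aligned = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [0.1, 0.9])]
mismatched = [([1.0, 0.0], [0.1, 0.9]), ([0.0, 1.0], [0.9, 0.1])]
loss_aligned = nt_xent(aligned)
loss_mismatched = nt_xent(mismatched)
```

Because the loss depends only on cosine similarity, it is insensitive to the overall scale of the feature vectors, which is one reason it is a natural choice for features extracted from waveforms at different signal-to-noise ratios.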
Our deep learning objective in Eq. (2.1) combines the NT-Xent loss in Eq. (3) with the posterior approximation term. Given a batch of physical parameters $\{x_i\}$, we generate different noise realizations for each $x_i$, and the empirical loss function is,
[TABLE]
where $q_{\theta,w}$ is defined in Eq. (2). Minimizing the loss in Eq. (2.1) with respect to $\theta$ and $w$ provides a posterior estimation for gravitational wave events.
It is worth pointing out that while Refs. [71, 72] apply , an arbitrary random distribution to their generative model, our posterior distributions do not involve arbitrary random distributions.
2.2 Separate Models for Parameters
In this paper, we are interested in the following physical parameters: $(m_1, m_2, a_f, \omega_R, \omega_I)$. We find that trying to estimate all parameters using a single model leads to sub-optimal results given that they are of different scales. Thus, we use two separate models with similar model architecture as shown in Figure 1. One model is used to estimate the masses of the binary components, while the other one is used to infer the final spin and QNMs of the remnant.
The final spin of the remnant and its QNMs have a similar range of values when the QNMs are cast in dimensionless units. We trained the second model using the fact that the QNMs are determined by the final spin using the relation [56]:
$$\omega_{220}(a_f) = \omega_R + i\,\omega_I,$$
where $(\omega_R, \omega_I)$ correspond to the frequency and damping time of the ringdown oscillations for the fundamental $\ell=m=2$ bar mode, and the first overtone $n=0$. We compute the QNMs following [56]. One can translate $\omega_R$ into the ringdown frequency (in units of Hertz) and $\omega_I$ into the corresponding (inverse) damping time (in units of seconds) by computing , where $M_f$ is the final mass of the remnant, and can be determined using Eq. (1) in [73]. An additional benefit of using two separate models is that the training converges faster with two models considering two different sets of physical parameters at different magnitudes.
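The unit conversion mentioned above can be sketched as follows. The exact expression in the text is elided, so this example uses the standard geometric-unit conversion and assumes that the dimensionless QNM values are expressed in units of the inverse remnant mass. The function name, the use of the absolute value for the imaginary part, and the input numbers are illustrative assumptions, not results from this work.

```python
import math

# G * M_sun / c^3 in seconds: the time scale associated with one solar mass.
G_MSUN_OVER_C3 = 4.925491e-6

def qnm_to_physical(m_omega_r, m_omega_i, final_mass_msun):
    """Convert dimensionless QNM values (in units of 1/M_f) to physical units.

    Returns the ringdown frequency in Hz and the damping time in seconds.
    """
    mass_seconds = final_mass_msun * G_MSUN_OVER_C3
    frequency_hz = m_omega_r / (2.0 * math.pi * mass_seconds)
    damping_time_s = mass_seconds / abs(m_omega_i)
    return frequency_hz, damping_time_s

# Illustrative inputs only: a remnant of 62 solar masses.
f_hz, tau_s = qnm_to_physical(0.53, 0.081, 62.0)
```

For these inputs the frequency comes out at a few hundred Hertz and the damping time at a few milliseconds, which is the expected order of magnitude for stellar-mass remnants.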
2.3 Dataset Preparation and Training
Modeled waveforms. We used the surrogate waveform family [74] to produce modeled waveforms that describe binary black holes with component masses , , and spin components . By uniformly sampling this parameter space we produce a dataset with 1,061,023 waveforms. These waveforms describe the last second of the late inspiral, merger, and ringdown. The waveforms are produced using a sample rate of 4096 Hz.
For training purposes, we label the waveforms using the masses and spins of the binary components, and then use this information to also enable the neural net to estimate the final spin of the black hole remnant using the formulae provided in [75], and the QNMs following [56]. In essence, we are training our neural network models to identify the key features that determine the properties of the binary black holes before and after merger using a unified framework.
We use 90% of these waveform samples for training and 10% for testing. The training samples are randomly and uniformly chosen. Throughout the training, we use the AdamW optimizer to minimize the mean squared error of the predicted parameters with default hyper-parameter setups [76]. We choose the learning rate to be 0.0001. To simulate the environment in which true gravitational waves are embedded, we use real advanced LIGO noise to compute the power spectral density (PSD), which is then used to whiten the templates.
Advanced LIGO noise. For training we used a 4096 s-long advanced LIGO noise data segment, sampled at 4096 Hz, starting at GPS time 1126259462. We obtained these data from the Gravitational Wave Open Science Center [41]. We estimate a PSD using the entire 4096 s segment to whiten the modeled waveforms and noise. For each one second long noisy waveform used in training, we combine the clean whitened template with a randomly picked one second long noise segment from the 4096 s-long advanced LIGO strain data. For each generated waveform template (see Eq. (1)), we apply two different noise realizations. As a result, the total number of noisy waveforms (clean templates $\times$ noise realizations) applied during training is equal to: (number of training iterations) $\times$ (batch size) $\times$ 2.
In Section 4, we demonstrate that our model, trained only with advanced LIGO noise from the first observing run, is able to estimate the astrophysical parameters of other events across O1-O3. We fixed the merger point of the training templates at timestep 3,596 out of 4,096 total timesteps. We empirically found that having a fixed merger point, rather than shifting the templates to obtain a time-invariant property, provides a tighter estimation of the posteriors. Our deep learning model was trained on 1 NVIDIA V100 GPU with a batch size of 8. In general, it takes about 1-2 days to fully train this model.
2.4 GPS Trigger Time
It is known that a trigger GPS time associated with a gravitational wave event, typically provided by a detection algorithm, may differ from the true time of coalescence. Therefore, we perform a local search around the trigger time provided by any given detection algorithm as a pre-processing step for the parameter estimation using the trained model. We first identify local merger time candidates by evaluating the normalized cross-correlation (NCC) of the whitened observation with 33,713 whitened clean templates, whose physical parameters uniformly cover the range: , , and , over a time window of 0.015 seconds around the time candidates. The time points with the top NCC values are selected as the candidates. Then we use the trained models to estimate the posterior distributions of the physical parameters at each candidate time point. In practice, we found that the trigger times with the best NCC values differ from those published at the Gravitational Wave Open Science Center by up to 0.01 s. These trigger times produce different posterior distributions that vary in size by up to for the masses of the binary components, and up to 5% for the astrophysical properties of the compact remnant. We have selected the time point that gives the smallest confidence area for the results we present in Section 4.2.
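The local search described above can be sketched with a single template and a toy strain series. This is a simplified illustration: the search in the text scores 33,713 whitened templates, whereas the example below scores one hypothetical damped sinusoid, and the mean-subtracted form of the NCC as well as all names are assumptions made for the example.

```python
import math
import random

def ncc(a, b):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0.0 else 0.0

def rank_merger_candidates(strain, template, center, half_window, top_k=3):
    """Score every start index within half_window samples of center and
    return the top_k (score, start_index) pairs, best first."""
    n = len(template)
    first = max(0, center - half_window)
    last = min(len(strain) - n, center + half_window)
    scored = [(ncc(strain[s:s + n], template), s) for s in range(first, last + 1)]
    scored.sort(reverse=True)
    return scored[:top_k]

# Toy data: a damped sinusoid hidden in weak Gaussian noise at sample 90,
# while the (hypothetical) detection trigger points at sample 85.
rng = random.Random(1)
template = [math.exp(-k / 10.0) * math.sin(2.0 * math.pi * k / 8.0) for k in range(32)]
strain = [rng.gauss(0.0, 0.05) for _ in range(200)]
for k, value in enumerate(template):
    strain[90 + k] += value
candidates = rank_merger_candidates(strain, template, center=85, half_window=10)
```

In the full procedure each retained candidate would then be passed to the trained models, and the candidate giving the smallest confidence area would be kept.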
3 Bayesian Inference
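We compare our data-driven posterior estimation with PyCBC Inference [46, 59, 60], which uses a parallel-tempered MCMC algorithm, emcee_pt [77], to evaluate the posterior probability $p(\theta \mid d)$ for the set of source parameters $\theta$ given the data $d$. The posterior is calculated as $p(\theta \mid d) \propto \mathcal{L}(d \mid \theta)\,\pi(\theta)$, where $\mathcal{L}(d \mid \theta)$ is the likelihood and $\pi(\theta)$ is the prior. The likelihood function for a set of $N$ detectors is
$$\mathcal{L}(d \mid \theta) \propto \exp\left[-\frac{1}{2}\sum_{i=1}^{N} \left\langle \tilde{d}_i(f) - \tilde{s}_i(f, \theta) \,\middle|\, \tilde{d}_i(f) - \tilde{s}_i(f, \theta) \right\rangle\right],$$
where $\tilde{d}_i(f)$ and $\tilde{s}_i(f, \theta)$ are the frequency-domain representations of the data and the model waveform for detector $i$. The inner product is defined as
$$\left\langle \tilde{a}_i(f) \,\middle|\, \tilde{b}_i(f) \right\rangle = 4\,\Re \int_0^\infty \frac{\tilde{a}_i(f)\,\tilde{b}_i^*(f)}{S_n^{(i)}(f)}\,\mathrm{d}f,$$
where $S_n^{(i)}(f)$ is the PSD of the $i$-th detector.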
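For concreteness, a discretized version of the noise-weighted inner product and of the resulting Gaussian log-likelihood is sketched below. It assumes the standard form, 4 Re ∫ a(f) b*(f) / S_n(f) df, evaluated on a uniform frequency grid with spacing `delta_f`. The function names and the toy inputs are illustrative, and the code is not taken from PyCBC.

```python
def inner_product(a, b, psd, delta_f):
    """Discretized noise-weighted inner product:
    4 * Re( sum_k a_k * conj(b_k) / S_n(f_k) ) * delta_f."""
    return 4.0 * delta_f * sum(
        (x * y.conjugate()).real / s for x, y, s in zip(a, b, psd)
    )

def log_likelihood(data, models, psds, delta_f):
    """Gaussian-noise log-likelihood up to an additive constant, summed over
    detectors: -1/2 * sum_i <d_i - s_i | d_i - s_i>."""
    total = 0.0
    for d, s, psd in zip(data, models, psds):
        residual = [dk - sk for dk, sk in zip(d, s)]
        total += inner_product(residual, residual, psd, delta_f)
    return -0.5 * total

# Toy check with one detector and one frequency bin.
toy_ip = inner_product([1 + 1j], [1 + 1j], [2.0], 0.5)
toy_ll = log_likelihood([[2 + 1j]], [[1 + 0j]], [[2.0]], 0.5)
```

A sampler such as the parallel-tempered MCMC mentioned above evaluates this quantity once per proposed parameter set, which is where most of the computational cost of the Bayesian analysis arises.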
We performed the MCMC analysis using the publicly available data from the GWTC-1 release [4] and used the corresponding publicly available PSD files for each event [78]. We analyse a segment of 8 seconds around the GPS trigger 1167559935.6, with the data sampled at 2048 Hz. We use the IMRPhenomD [79] waveform model to generate waveform templates to evaluate the likelihood. We assume uniform priors for the component masses with and uniform priors on the component spins with . We also set uniform priors on the luminosity distance with and the deviation of the arrival time from the trigger time . We set uniform priors for the coalescence phase and the polarization angle . The prior on the inclination angle between the binary's orbital angular momentum and the line of sight, $\iota$, is set to be uniform in the sine of the angle, and the right ascension and declination have priors that are uniform over the sky.
Furthermore, they may be used to cross validate the physical reality of an event [39, 40], and to assess whether the estimated merger time is consistent between the two separate models. For instance, if the models output very different merger times, then we may conclude that they are not providing a reliable merger time. On the other hand, when their results are consistent, within a window between 0.001 s and up to 0.005 s, then we can remove the ambiguity introduced when using the NCC approach described in Section 2.4.
4 Experimental Results
In this section we present two types of results. First, we validate our model with a well known statistical model. Upon confirming that our deep learning approach is statistically consistent, we use it to estimate the parameters of five binary black hole mergers.
4.1 Validation on Simulated Data
We performed experiments on simulated data that have closed form posterior distributions. This is important to ascertain the accuracy and reliability of our method. The simulated data are generated through a linear observation model with additive white Gaussian noise,
$$y = A x + n, \tag{8}$$
where the additive noise $n \sim \mathcal{N}(0, \sigma^2 I)$. We consider the underlying parameters $x$ and the linear map $A$, with and . The likelihood function is
$$p(y \mid x) = \mathcal{N}\left(y;\, A x,\, \sigma^2 I\right).$$
If we assume the prior distribution of $x$ is a Gaussian distribution with mean $\mu_0$ and covariance $\Sigma_0$, we can get an analytical expression for the posterior distribution of $x$ given the observation $y$,
$$p(x \mid y) = \mathcal{N}\left(x;\, \mu_p,\, \Sigma_p\right),$$
where
$$\Sigma_p = \left(\Sigma_0^{-1} + \sigma^{-2} A^\top A\right)^{-1}, \qquad \mu_p = \Sigma_p \left(\Sigma_0^{-1} \mu_0 + \sigma^{-2} A^\top y\right).$$
During the training stage we draw 100 samples of $x$ from its prior $\mathcal{N}(\mu_0, \Sigma_0)$, and $y$ is generated through the linear observation model (8). We train a 3-layer model with the model objective (2.1), and show three examples of the posterior estimation in Figure 2. Therein we show 50% and 90% confidence contours. Black lines represent ground truth results (ellipses, given that the posterior is Gaussian), while the red contours correspond to the neural network estimations, based on Gaussian kernel density estimation (KDE) with 9,000 samples generated from the network. These results indicate that our deep learning model can produce reliable and statistically valid results.
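The closed-form posterior of the Gaussian conjugate model can be checked numerically in its simplest setting. The sketch below treats the special case of a scalar parameter observed through a vector of linear measurements with independent Gaussian noise of equal variance. The function name and the numbers are illustrative, and the multivariate case used for the validation figures follows the same pattern with matrix inverses.

```python
def gaussian_posterior(y, a, sigma, mu0, s0):
    """Posterior of a scalar x given y_k = a_k * x + n_k, n_k ~ N(0, sigma^2),
    with prior x ~ N(mu0, s0^2). Returns (posterior mean, posterior variance)."""
    precision = 1.0 / s0 ** 2 + sum(ak * ak for ak in a) / sigma ** 2
    var_p = 1.0 / precision
    mean_p = var_p * (mu0 / s0 ** 2 + sum(ak * yk for ak, yk in zip(a, y)) / sigma ** 2)
    return mean_p, var_p

# One unit-noise measurement y = 2 with a standard normal prior:
# the posterior precision is 1 + 1 = 2 and the mean is pulled halfway to y.
mu_p, var_p = gaussian_posterior(y=[2.0], a=[1.0], sigma=1.0, mu0=0.0, s0=1.0)
```

A learned posterior can be compared against these values sample by sample, which is the idea behind the contour comparison described above.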
4.2 Results with Real Events
In this section we use our deep learning models to estimate the medians and posterior distributions of the astrophysical parameters $(m_1, m_2)$ and $(a_f, \omega_R, \omega_I)$, respectively, for five binary black hole mergers: GW150914, GW170104, GW170814, GW190521 and GW190630.
As described in Section 2.1, we consider 1 s-long advanced LIGO noise input data batches, denoted as $y$, sampled at 4096 Hz. We construct two posterior distribution estimations, , by minimizing the loss in Eq. (2.1) for $(m_1, m_2)$ and for $(a_f, \omega_R, \omega_I)$. We use two different multivariate normal base distributions for $z$ in the two different models. To estimate the masses of the binary components, the mean and covariance matrix () are: ; whereas for the final spin and QNMs model we use: . Here "$\mathrm{diag}(\cdot)$" refers to the diagonal matrix with "$\cdot$" being the diagonal elements. The number of normalizing flow layers also varies for the two models. We use a 3-layer normalizing flow module for masses prediction, and an 8-layer module for the predictions of final spin and QNMs.
Our first set of results is presented in Figures 3, 4, and 5. These figures provide the median, and the 50% and 90% confidence intervals, which we computed using Gaussian KDE estimation with 9,000 samples drawn from the estimated posteriors. In Tables 1 and 2 we also present a summary of our data-driven median results and 90% confidence intervals, along with those obtained with traditional Bayesian algorithms in [4, 80].
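A summary of the kind reported in the tables can be obtained directly from posterior samples. The sketch below is a simplified stand-in for the KDE-based procedure described above: it assumes an equal-tailed 90% interval computed from empirical quantiles with linear interpolation, and the function names are hypothetical.

```python
def quantile(sorted_samples, q):
    """Empirical quantile with linear interpolation between order statistics."""
    position = q * (len(sorted_samples) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_samples) - 1)
    fraction = position - lower
    return sorted_samples[lower] * (1.0 - fraction) + sorted_samples[upper] * fraction

def summarize(samples):
    """Return (median, lower bound, upper bound) of the equal-tailed 90% interval."""
    ordered = sorted(samples)
    return quantile(ordered, 0.5), quantile(ordered, 0.05), quantile(ordered, 0.95)

# Toy samples 0, 1, ..., 100 standing in for draws from an estimated posterior.
median, low, high = summarize(list(range(101)))
```

For skewed or multimodal posteriors an equal-tailed interval and a KDE-based highest-density region can differ, so the two should not be compared without checking which convention was used.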
Before we present the main highlights of these results, it is important to emphasize that our results are entirely data-driven. We have not attempted to use deep learning as a fast interpolator that learns the properties of traditional Bayesian posterior distributions. Rather, we have allowed deep learning to figure out the physical correlations among different parameters that describe the physics of black hole mergers. Furthermore, we have quantified the statistical consistency of our approach by validating it against a well known model. This is of paramount importance, since deep learning models may be constructed to reproduce the properties of traditional Bayesian distributions, but that fact does not provide enough evidence of their statistical validity or consistency. Finally, given the nature of the signal processing tools and computing approaches we use in this study, we do not expect our data-driven results to exactly reproduce the traditional Bayesian results reported in [4, 80].
Our results may be summarized as follows. Figures 3, 4, and 5 show that our data-driven posterior distributions encode expected physical correlations for the masses of the binary components, $(m_1, m_2)$, and the parameters of the remnant: $a_f$ and $(\omega_R, \omega_I)$. We also learn that these posterior distributions are determined by the properties of the noise and the loudness of the signal that describes these events. Figure 3 presents a direct comparison between the posterior distributions predicted by our deep learning models and those produced with PyCBC Inference, marked with dashed lines. These results show that our deep learning models provide real-time, reliable information about the astrophysical properties of binary black hole mergers that were detected in three different observing runs, and which span a broad SNR range.
On the other hand, Tables 1 and 2 show that our median values and 90% confidence intervals are in some cases better than, in others similar to, and in some cases slightly larger than those obtained with Bayesian algorithms. In these Tables, Bayesian LIGO results for $(m_1, m_2, a_f)$ are directly taken from [4, 80], while $(\omega_R, \omega_I)$ results are computed using their Bayesian results for $a_f$ and the tables available at [82]. These results indicate that deep learning methods can learn physical correlations in the data, and provide reliable estimates of the parameters of gravitational wave sources.
To demonstrate that our model represents the true statistical properties of the posterior distribution, we tested the posterior estimation on simulated noisy gravitational waveforms. We calculate the empirical cumulative distribution function (CDF) of the number of times the true value for each parameter was found within a given confidence interval $p$, as a function of $p$. We compare the empirical CDF with the true CDF of $p$ in the P-P plot in Figure 6. To obtain the empirical CDF, for each test waveform (1000 waveforms in total) and each one-dimensional estimated posterior distribution generated from the network with 9,000 samples, we record the count of the confidence intervals ($p$ = 1%, …, 100%) in which the true parameters fall. The empirical CDF is based on the frequency of such counts with the 1000 waveforms randomly drawn from the test dataset. Since the empirical CDFs lie close to the diagonal, we conclude that the networks generate a close approximation of the posteriors.
Furthermore, our data-driven results, including medians and posterior distributions, can be produced within 2 milliseconds per event using a single NVIDIA V100 GPU. We expect that these tools will provide the means to assess in real-time whether the inferred astrophysical parameters of the binary components and the post-merger remnant adhere to general relativistic predictions. If not, these results may prompt follow up analyses to investigate whether apparent discrepancies are due to poor data quality or other astrophysical effects [83].
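The P-P test described above can be reproduced on a toy problem where the exact posterior is known. In the sketch below, samples from the exact posterior of a scalar Gaussian conjugate model stand in for samples drawn from the trained network, and central (equal-tailed) intervals are assumed. The function names, the number of posterior samples, and the toy model are assumptions made for the example.

```python
import random

def credible_level(samples, truth):
    """Smallest central credible level whose interval contains the true value."""
    below = sum(1 for s in samples if s < truth) / len(samples)
    return abs(2.0 * below - 1.0)

def pp_curve(levels, grid):
    """Fraction of events whose true value lies inside the p-interval, per p."""
    return [sum(1 for c in levels if c <= p) / len(levels) for p in grid]

rng = random.Random(0)
levels = []
for _ in range(1000):                      # 1000 simulated events
    truth = rng.gauss(0.0, 1.0)            # true parameter drawn from the prior N(0, 1)
    observed = truth + rng.gauss(0.0, 1.0) # y = x + n with unit-variance noise
    # Exact posterior for this model: N(observed / 2, 1 / 2).
    posterior = [rng.gauss(observed / 2.0, 0.5 ** 0.5) for _ in range(500)]
    levels.append(credible_level(posterior, truth))

grid = [i / 100.0 for i in range(1, 101)]  # p = 1%, ..., 100%
curve = pp_curve(levels, grid)
max_deviation = max(abs(c - p) for c, p in zip(curve, grid))
```

A well calibrated posterior gives a curve close to the diagonal, so `max_deviation` stays small. A posterior that is too narrow bends the curve below the diagonal, and one that is too wide bends it above.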
The reliable astrophysical information inferred in low latency by deep learning algorithms warrants the extension of this framework to characterize other sources, including eccentric compact binary mergers, and sources that require the inclusion of higher-order waveform modes. Furthermore, the use of physics-inspired deep learning architectures and optimization schemes [29] may enable an accurate measurement of the spin of binary components. These studies should be pursued in the future.
5 Conclusion
We designed neural networks to estimate five parameters that describe the astrophysical properties of binary black holes before and after the merger event. The first two parameters constrain the masses of the binary components, while the others estimate the properties of the black hole remnant, namely $(a_f, \omega_R, \omega_I)$. These models combine a WaveNet architecture with normalizing flow and contrastive learning to provide statistically consistent estimates for both simulated distributions and real gravitational wave sources.
Our findings indicate that deep learning can abstract physical correlations in complex data, and then provide reliable predictions for the median and 90% confidence intervals for binary black holes that span a broad SNR range. Furthermore, while these models were trained using only advanced LIGO noise from the first observing run, they were capable of generalizing to binary black holes that were reported during the first, second and third observing runs.
These models will be extended in future work to provide informative estimates for the spin of the binary components, including higher-order waveform modes to better model the physics of highly spinning and asymmetric mass-ratio black hole systems.
6 Acknowledgements
Neural network models are available at the Data and Deep Learning Hub for Science [84, 85]. EAH, HS and ZZ gratefully acknowledge National Science Foundation (NSF) awards OAC-1931561 and OAC-1934757. EOS and PK gratefully acknowledge NSF grants PHY-1912081 and OAC-193128, and the Sherman Fairchild Foundation. PK also acknowledges the support of the Department of Atomic Energy, Government of India, under project no. RTI4001. This work utilized the Hardware-Accelerated Learning (HAL) cluster, supported by NSF Major Research Instrumentation program, grant OAC-1725729, as well as the University of Illinois at Urbana-Champaign. Compute resources were provided by XSEDE using allocation TG-PHY160053. This work made use of the Illinois Campus Cluster, a computing resource that is operated by the Illinois Campus Cluster Program (ICCP) in conjunction with the National Center for Supercomputing Applications and which is supported by funds from the University of Illinois at Urbana-Champaign. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This research also made use of LIGO Data Grid clusters at the California Institute of Technology. This research used data, software and/or web tools obtained from the LIGO Open Science Center (https://gw-openscience.org), a service of LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. LIGO is funded by the U.S. National Science Foundation. Virgo is funded by the French Centre National de Recherche Scientifique (CNRS), the Italian Istituto Nazionale della Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by Polish and Hungarian institutes.
References
- [1] Abbott B P, Abbott R, Abbott T D, Abernathy M R, Acernese F, Ackley K, Adams C, Adams T, Addesso P, Adhikari R X et al. 2016 Physical Review Letters 116 131103 (Preprint 1602.03838)
- [2] Aasi J, Abbott B, Abbott R, Abbott T, Abernathy M, Ackley K, Adams C, Adams T, Addesso P, Adhikari R et al. 2015 Classical and Quantum Gravity 32 074001
- [3] Acernese F et al. 2015 Classical and Quantum Gravity 32 024001 (Preprint 1408.3978)
- [4] Abbott B P et al. (LIGO Scientific Collaboration and Virgo Collaboration) 2019 Phys. Rev. X 9 (3) 031040 URL https://link.aps.org/doi/10.1103/PhysRevX.9.031040
- [5] Abbott R et al. (LIGO Scientific, Virgo) 2021 Phys. Rev. X 11 021053 (Preprint 2010.14527)
- [6] Abbott R et al. (LIGO Scientific, Virgo) 2021 Astrophys. J. Lett. 913 L7 (Preprint 2010.14533)
- [7] Huerta E, Haas R, Fajardo E, Katz D S, Anderson S, Couvares P, Willis J, Bouvet T, Enos J, Kramer W T et al. 2017 BOSS-LDG: a novel computational framework that brings together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to accelerate gravitational wave discovery 2017 IEEE 13th International Conference on e-Science (e-Science) (IEEE) pp 335–344
- [8] Huerta E, Haas R, Jha S, Neubauer M and Katz D S 2019 Computing and Software for Big Science 3 5
