Gaussian processes for blazar variability studies

V. Karamanavis

arXiv:1703.02380·astro-ph.HE·March 8, 2017

Gaussian processes for blazar variability studies

V. Karamanavis

PDF

Open Access

TL;DR

This paper introduces Gaussian processes as a novel method for modeling blazar variability time series and demonstrates their application to multi-wavelength light curves of PKS 1502+106, providing new insights into blazar behavior.

Contribution

It presents the use of Gaussian processes for blazar time series modeling, a new approach in this field.

Findings

01

Successful application of GP modeling to blazar light curves

02

Enhanced understanding of blazar variability patterns

03

Potential for improved predictions of blazar activity

Abstract

This article shortly introduces Gaussian processes (GP) as a new approach for modelling time series in the field of blazar physics. In the second part of the paper, recent results from an application of GP modelling to the multi-wavelength light curves of the blazar PKS 1502+106 are discussed.

Figures2

Click any figure to enlarge with its caption.

Equations8

k (x_{i}, x_{j}) = σ_{f}^{2} exp [\frac{- ( x _{i} - x _{j} ) ^{2}}{2 l ^{2}}],

k (x_{i}, x_{j}) = σ_{f}^{2} exp [\frac{- ( x _{i} - x _{j} ) ^{2}}{2 l ^{2}}],

k_{n} (x_{i}, x_{j}) = k (x_{i}, x_{j}) + σ^{2} I,

k_{n} (x_{i}, x_{j}) = k (x_{i}, x_{j}) + σ^{2} I,

L = lo g p (y ∣ x, θ) = - \frac{1}{2} y^{T} k_{n}^{- 1} y - \frac{1}{2} lo g ∣ k_{n} ∣ - \frac{n}{2} lo g 2 π .

L = lo g p (y ∣ x, θ) = - \frac{1}{2} y^{T} k_{n}^{- 1} y - \frac{1}{2} lo g ∣ k_{n} ∣ - \frac{n}{2} lo g 2 π .

\frac{\partial L}{\partial θ _{j}} = \frac{1}{2} y^{T} \frac{\partial k _{n}}{\partial θ _{j}} k_{n}^{- 1} y - \frac{1}{2} \mbox T r a ce (k_{n}^{- 1} \frac{\partial k _{n}}{\partial θ _{j}}) .

\frac{\partial L}{\partial θ _{j}} = \frac{1}{2} y^{T} \frac{\partial k _{n}}{\partial θ _{j}} k_{n}^{- 1} y - \frac{1}{2} \mbox T r a ce (k_{n}^{- 1} \frac{\partial k _{n}}{\partial θ _{j}}) .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAstrophysics and Cosmic Phenomena · Computational Physics and Python Applications · Dark Matter and Cosmic Phenomena

Full text

Abstract

This article shortly introduces Gaussian processes as a new approach for modelling time series in the field of blazar physics. In the second part of the paper, recent results from an application of GP modelling to the multi-wavelength light curves of the blazar PKS 1502+106 are discussed.

keywords:

active galaxies; blazars; jets; radio continuum; gamma-rays; time-series analysis; Bayesian modelling; Gaussian processes

\articlenumber

x \doinum10.3390/—— \pubvolumexx

\externaleditorAcademic Editor: name \historyReceived: date; Accepted: date; Published: date

\TitleGaussian Processes for Blazar Variability Studies \AuthorVassilis Karamanavis \AuthorNamesVassilis Karamanavis

1 Introduction

Detailed investigations of the emission from blazars have offered, and continue to offer, important insights into this extreme population of active galactic nuclei (AGN). Blazar variability is an indispensable tool because it offers information on scales even smaller than those probed with imaging techniques. Furthermore, given the predictions of various blazar variability models (e.g., Marscher and Gear, 1985; Valtaoja et al., 1992; Spada et al., 2001; Camenzind and Krockenberger, 1992) , we are able to assess their validity taking advantage of the fingerprints of each model in the time and frequency domains, namely on light curves and broadband spectra.

Numerous monitoring efforts have been in place since the early days of the discovery of AGN and blazars, however most, if not all, lacked the high time resolution required for detailed studies. The tables have turned with the launch of the Fermi $\gamma$ -ray space telescope and complementary efforts for multi-frequency blazar monitoring. One such example in the radio domain is the F-GAMMA ( $Fermi$ -GST AGN Multi-frequency Monitoring Alliance) program (Fuhrmann et al., 2007; Angelakis et al., 2010). The F-GAMMA operated between 2007 and 2015 and served as a prime provider of single-dish radio monitoring complementary to Fermi Fuhrmann et al. (2016).

The goal of variability studies is the extraction of parameters such as the variability amplitudes, time scales, and others, from observations. The problem of modelling the observed variability in the radio and other bands, has been addressed using numerous approaches. For modelling light curves and the flares seen in them, Gaussian, exponential, and various other fitting methods have been used (e.g. Valtaoja et al. (1999); Kudryavtseva et al. (2011); Angelakis et al. (2015); this is by no means a complete list). Depending on each specific case, they work better or worse but the common characteristic among all the above is that a functional form of the underlying process is always assumed. In other words, a parametric approach is being employed. However, this might not be appropriate and we might have to admit that at times we have weak prior knowledge as to what is the mathematical form of such a process.

An elegant way to overcome this state of limited knowledge is by means of Bayesian inference. Specifically, this paper is concerned with the application of Gaussian processes in the field of blazar variability studies and modelling of time series. In the next section, a brief introduction to Gaussian processes is given. Then, an application to the light curves of the blazar PKS 1502+106 is discussed, aiming to quantify an outburst seen from 2008 to 2010 and extract light curve parameters like the flare amplitude and putative delays between its peak at different bands. First results along with an extended discussion can be found in Karamanavis (2015).

2 Non-parametric models and Gaussian processes

The problem at hand is as follows: we want to extract (or learn, in the context of Bayesian machine learning) the properties of a system from a given set of observations. This can be a function $f()$ that describes the system well enough and predicts its future evolution. There are essentially two ways of achieving this goal. The first involves the selection of a class of functions, e.g. linear, quadratic, Gaussian etc. with (free) parameters, the values of which are obtained through minimizing the residuals between $f()$ and our observations. Such a selection can result from strong prior knowledge regarding the system. One can increase the flexibility of the model by increasing the number of parameters at the expense, though, of potential overfitting; i.e. describing random small scale variations of the data instead of the general trend.

Still, cases in which we have little or no prior knowledge regarding the underlying process, and consequently the appropriate model to use, exist in abundance. Non-parametric models offer an interesting alternative to the first approach. By comparison, non-parametric models “translate” our prior knowledge about a system, no matter how generic this might be, e.g. smooth and/or continuous changes, into probability distributions of functions that are more likely to describe the system. The task of modelling then reduces to fine-tuning these prior distributions so that they include functions that best describe our observations; i.e. the posterior distributions. Since no explicit use of free parameters is made, these models are referred to as non-parametric. Non-parametric models, however, use so-called hyperparameters which instead of determining the properties of functions, as in the first approach, govern the properties of the distribution. These still need to be inferred and Bayesian model selection provides a robust framework for doing so.

Gaussian processes (GPs) are the generalization of uni- or multivariate Gaussian distributions of variables into the space of functions. GPs offer a very flexible framework for modelling unknown functions Williams and Rasmussen (1996); Roberts et al. (2012); Ghahramani (2015). They are widely used in machine learning applications and a new field of applications in astronomy has recently opened (e.g. Gibson et al., 2012; Aigrain et al., 2012; Rajpaul et al., 2015; Karamanavis et al., 2016). The interested reader is referred to Rasmussen and Williams (2005) for additional details.

2.1 Gaussian processes for regression

Given a set of observations, e.g. a typical light curve, with the form $D=(x_{i},y_{i})$ , where $y_{i}$ represent recorded flux densities at times $x_{i}$ , we would like to find all functions drawn from the prior distribution which pass through all observations. The latter distribution would be the posterior distribution and its mean value describes the data best (see Figure 1).

What generates the prior distribution is a covariance kernel or function. This governs the similarity (covariance) between any two function values, at $x_{i}$ and $x_{j}$ say. It is chosen based on our prior knowledge of the system. There exists a multitude of kernels which can model unknown functions based on their characteristics, e.g. periodic, linear etc. (e.g. Duvenaud, 2014). It is noteworthy that sums and products of kernels produce valid kernels too. For the purposes of this paper, the squared-exponential (SE) kernel is chosen. This is the simplest kernel and the only underlying assumption is that of a smooth process (i.e. the functions it produces have derivatives of all orders everywhere in their domain). The SE kernel is defined as

[TABLE]

with $l$ the the characteristic length scale, i.e. the distance in the $x$ dimension after which the function changes significantly and $\sigma_{f}^{2}$ , the variance, i.e. the mean distance of the function from its mean (serves only as a scaling factor). Here, $l$ and $\sigma_{f}^{2}$ are the hyperparameters.

Uncertainty is accommodated by simply adding a term describing it, to the noiseless SE covariance kernel of Eq. 1, which then becomes

[TABLE]

with $\boldsymbol{I}$ the identity matrix (e.g. Roberts et al., 2012).

2.2 Training the Gaussian process

Bayesian model selection provides the way for optimizing the hyperparameters, thereby refining the posterior distribution and enabling us to perform inference. We essentially update our prior knowledge in the light of a data set. This is done by means of maximizing the log marginal likelihood (Williams and Rasmussen, 1996), as outlined below.

For the SE kernel, the vector of hyperparameters is $\boldsymbol{\theta}=\{l,\sigma^{2}\}$ and the probability (or evidence) of the training data $\boldsymbol{y}$ , given the hyperparameters vector $\boldsymbol{\theta}$ , is $p(\boldsymbol{y}\,|\,\boldsymbol{x},\boldsymbol{\theta})$ . The log marginal likelihood is given by

[TABLE]

More generally, in the case of a hyperparameter vector $\boldsymbol{\theta}=\{\theta_{j}\,|\,j=1,...,n\}$ , the derivatives of the log marginal likelihood with respect to each $\theta_{j}$ are

[TABLE]

By means of numerical gradient optimization algorithms one can maximize the log marginal likelihood using Eq. 4 and obtain the best set of hyperparameters (for an outline of the algorithm see Rasmussen and Williams (2005)).

3 Application to the light curves of the blazar PKS 1502+106

PKS 1502+106 (S3 1502+10, OR 103) is a powerful flat-spectrum radio quasar (FSRQ), driven by a supermassive black hole of about $10^{9}$ M*⊙*, at redshift $z=1.8385$ (see Karamanavis et al. (2016) and references therein). The source has been detected by Fermi in 2008 showing an intense $\gamma$ -ray outburst at MeV–GeV energies (Abdo et al., 2010). The flare was monitored by F-GAMMA and other programs in the radio band. Radio data from the F-GAMMA program111www.mpifr-bonn.mpg.de/div/vlbi/fgamma/fgamma.html were employed, including observations with the Effelsberg 100-m (EB) and the IRAM 30-m (PV) telescopes in ten frequency bands from 2.64 to 142.33 GHz. Monthly observations from the EB and PV telescopes were performed quasi-simultaneously, thus ensuring maximum spectral coherency. A detailed description of the observational setup and data reduction is provided elsewhere (Fuhrmann et al., 2014; Angelakis et al., 2015; Nestoras et al., ; Fuhrmann et al., 2016).

Considering the full length of available multi-frequency light curves, GP regression was applied to the data at each frequency separately. A variant of the software by Pedregosa et al. (2011) was used, which was adapted to our specific modelling needs. A suite of machine learning applications for Python, including GP regression, is available online222www.scikit-learn.org. To obtain the best unbiased result, the length scale parameter $l$ was randomly initialized 100 times. Then, the posterior mean is returned with a 95% confidence interval for the flux density, along with the set of hyperparameters which maximize the log marginal likelihood. The relevant light curve parameters extracted are the peak amplitude ( $S_{\rm m}$ ), the time of the peak, and the cross-band delay. Furthermore, the flare rise and decay times are calculated between the peak and the two flux density minima which immediately precede and follow $S_{\rm m}$ Karamanavis et al. (2016).

4 Results and discussion

4.1 Quantifying the broadband outburst of PKS 1502+106

PKS 1502+106 has shown an isolated outburst visible across the electromagnetic spectrum. The results of the GP regression are visible in the right panels of Fig. 2. The amplitudes at each of our radio frequencies first increase, up to about 43 GHz where they peak, and then drop at higher $\nu$ (Fig. 4 of Ref. Karamanavis et al. (2016)).The flare peaks with increasing time delay as we follow it towards lower frequencies (Fig. 5 of Ref. Karamanavis et al. (2016)). The dependence of amplitudes and delays on $\nu$ can be well approximated by power laws, a telltale sign of shock evolution within the source’s parsec-scale jet. These results are in agreement with the study of the source using mm-wave very-long-baseline interferometry (mm-VLBI; see Karamanavis et al. (2016)).

The observed time lags between radio frequencies can be attributed to synchrotron opacity effects. In this context, the flare maxima appear first at higher frequencies and gradually progress to lower frequencies (e.g. Valtaoja et al., 1992). Timing the flare maximum at each frequency with respect to the data at 142.33 GHz yields the following delays. The longest delay of $(205\pm 2.0)$ days is seen between $142.33$ and $2.64$ GHz. Decreasing lags are clearly visible towards higher frequencies with a time lag of $(136.9\pm 8.1)$ d at 4.85 GHz and $(96.2\pm 3.8)$ d at 8.35 GHz. At higher frequencies, the lags are $(85.8\pm 5.2)$ d at 10.45 GHz, $(67.2\pm 0.9)$ d at 14.6, $(39.4\pm 5.6)$ d at 23.03, and $(39.8\pm 11.2)$ d at 32 GHz. At the two highest frequencies of 43.05 GHz and 86.24 GHz, the lags are $(40.8\pm 3.1)$ d and $(24.8\pm 0.2)$ d, respectively (Table 3 of Ref. Karamanavis et al. (2016)).

The aforementioned lags allow the calculation of the synchrotron opacity profile of PKS 1502+106. Considering that at a given frequency the core represents a feature whose optical depth is close to unity, the standard relativistic jet paradigm predicts that the apparent position of this unit-opacity surface depends on observing frequency (see Karamanavis et al. (2016) and references therein). The importance of this “core-shift” effect lies in the fact that it gives critical insights into the physical processes of ultracompact jets, like the de-projected distance between the core (at each frequency) and the jet base, as well as the strength of the magnetic field along the flow (e.g., Lobanov, 1998; Hirotani, 2005). Using a “time-lag core shift” method, which proves to be a good alternative to VLBI core-shifts (e.g. Kudryavtseva et al., 2011; Kutkin et al., 2014), the positions of the nine unit-opacity surfaces (or photospheres) with respect to the jet base were inferred. These are: $(10.2\pm 1.2)$ pc for the 2.64 GHz core, $(7.1\pm 1.0)$ pc at 4.85, $(5.3\pm 0.8)$ pc at 8.35, $(4.8\pm 0.8)$ pc at 10.45, $(4.0\pm 0.7)$ pc at 14.60, $(2.6\pm 0.6)$ pc at 23.05, $(2.9\pm 1.0)$ pc at 32.00, $(3.4\pm 0.8)$ pc at 43.05, and $(4.0\pm 1.1)$ pc at 86.24 GHz (see Table 4 of Ref. Karamanavis et al. (2016) for the values and their calculation). Finally, we obtain the equipartition magnetic field along the jet axis, with values increasing from 14 mG at the 2.64 GHz core up to 176 mG at 86.24 GHz.

4.2 Where do the $\gamma$ rays come from?

The location of the $\gamma$ -ray emitting region is decisively constrained by combining the aforementioned findings with those of Fuhrmann et al. (2014) and Karamanavis et al. (2016). Specifically, in Fuhrmann et al. (2014) the authors calculate the relative distance between the 86 GHz core and the $\gamma$ -ray site to be about 2.1 pc, while in Karamanavis et al. (2016), using high-resolution mm-VLBI imaging, the knot most likely connected with the flare in question is identified (knot C3) and its kinematical properties are deduced ( $\beta_{\rm app}\sim 7$ c, jet viewing angle of $2.6^{\circ}$ ).

Having calculated the absolute distance between the jet base and the 86.24 GHz core $(4.0\pm 1.1)$ pc, and with the relative separation between the same core and the $\gamma$ -ray site (2.1 pc), the high-energy emission originates at $(1.9\pm 1.1)$ pc from the jet base. This places it far from the bulk of the broad-line material of the source ( ${\rm R}_{\rm BLR}\approx 0.1$ pc) making external Compton, on a field other than that of the BLR, a very relevant emission mechanism for PKS 1502+106’s $\gamma$ ray emission, with implications on the origin of the seed photon field (see Karamanavis et al. (2016) and references therein).

Acknowledgements.

V.K. thanks L. Fuhrmann, E. Angelakis, T. P. Krichbaum, I. Myserlis, I. Nestoras, J. A. Zensus, and H. Ungerechts for interesting discussions and useful comments; also W. Max-Moerbeck and B. Boccardi for carefully reading the paper. Partially based on observations with the Effelsberg 100-m and the IRAM 30-m telescopes.

Bibliography28

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Marscher and Gear (1985) Marscher, A.P.; Gear, W.K. Models for high-frequency radio outbursts in extragalactic sources, with application to the early 1983 millimeter-to-infrared flare of 3C 273. Ap J 1985 , 298 , 114–127.
2Valtaoja et al. (1992) Valtaoja, E.; Terasranta, H.; Urpo, S.; Nesterov, N.S.; Lainela, M.; Valtonen, M. Five Years Monitoring of Extragalactic Radio Sources - Part Three - Generalized Shock Models and the Dependence of Variability on Frequency. A&A 1992 , 254 , 71.
3Spada et al. (2001) Spada, M.; Ghisellini, G.; Lazzati, D.; Celotti, A. Internal shocks in the jets of radio-loud quasars. MNRAS 2001 , 325 , 1559–1570, [astro-ph/0103424] .
4Camenzind and Krockenberger (1992) Camenzind, M.; Krockenberger, M. The lighthouse effect of relativistic jets in blazars - A geometric origin of intraday variability. A&A 1992 , 255 , 59–62.
5Fuhrmann et al. (2007) Fuhrmann, L.; Zensus, J.A.; Krichbaum, T.P.; Angelakis, E.; Readhead, A.C.S. Simultaneous Radio to (Sub-) mm-Monitoring of Variability and Spectral Shape Evolution of potential GLAST Blazars. The First GLAST Symposium; Ritz, S.; Michelson, P.; Meegan, C.A., Eds., 2007, Vol. 921, American Institute of Physics Conference Series , pp. 249–251, [0704.3944] .
6Angelakis et al. (2010) Angelakis, E.; Fuhrmann, L.; Nestoras, I.; Zensus, J.A.; Marchili, N.; Pavlidou, V.; Krichbaum, T.P. The F-GAMMA program: multi-wavelength AGN studies in the Fermi-GST era. Ar Xiv e-prints 2010 , [ar Xiv:astro-ph.CO/1006.5610] .
7Fuhrmann et al. (2016) Fuhrmann, L.; Angelakis, E.; Zensus, J.A.; Nestoras, I.; Marchili, N.; Pavlidou, V.; Karamanavis, V.; Ungerechts, H.; Krichbaum, T.P.; Larsson, S.; Lee, S.S.; Max-Moerbeck, W.; Myserlis, I.; Pearson, T.J.; Readhead, A.C.S.; Richards, J.L.; Sievers, A.; Sohn, B.W. The F-GAMMA program: Multi-frequency study of Active Galactic Nuclei in the Fermi era. Program description and the first 2.5 years of monitoring. Ar Xiv e-prints 2016 , [ar Xiv:astro-ph.HE/1608.02580]
8Valtaoja et al. (1999) Valtaoja, E.; Lähteenmäki, A.; Teräsranta, H.; Lainela, M. Total Flux Density Variations in Extragalactic Radio Sources. I. Decomposition of Variations into Exponential Flares. Ap JS 1999 , 120 , 95–99.

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Taxonomy

Abstract

keywords:

1 Introduction

2 Non-parametric models and Gaussian processes

2.1 Gaussian processes for regression

2.2 Training the Gaussian process

3 Application to the light curves of the blazar PKS 1502+106

4 Results and discussion

4.1 Quantifying the broadband outburst of PKS 1502+106

4.2 Where do the γ\gammaγ rays come from?

Acknowledgements.

4.2 Where do the $\gamma$ rays come from?