MWA Tied-Array Processing I: Calibration and Beamformation

S. M. Ord; S. E. Tremblay; S. J. McSweeney; N. D. R. Bhat; C. Sobey,; D. A. Mitchell; P. J. Hancock; F. Kirsten

arXiv:1905.01826·astro-ph.IM·August 15, 2019

MWA Tied-Array Processing I: Calibration and Beamformation

S. M. Ord, S. E. Tremblay, S. J. McSweeney, N. D. R. Bhat, C. Sobey,, D. A. Mitchell, P. J. Hancock, F. Kirsten

PDF

TL;DR

This paper details the algorithms and pipeline developed for the Murchison Widefield Array to perform tied-array beam formation, enabling high-resolution pulsar observations and presenting initial polarimetric results.

Contribution

It introduces new algorithms and a processing pipeline for tied-array beam formation in the MWA, enhancing its capabilities for high-time resolution radio astronomy.

Findings

01

Successful formation of tied-array beams from calibrated antenna signals

02

First polarimetric profiles of specific pulsars at 185 MHz using MWA

03

Demonstration of high-time and frequency resolution observations

Abstract

The Murchison Widefield Array is a low-frequency Square Kilometre Array precursor located at the Murchison Radio-astronomy Observatory in Western Australia. Primarily designed as an imaging telescope, but with a flexible signal path, the capabilities of this telescope have recently been extended to include off-line incoherent and tied-array beam formation using recorded antenna voltages. This has provided the capability for high-time and frequency resolution observations, including a pulsar science program. This paper describes the algorithms and pipeline that we have developed to form the tied array beam products from the summation of calibrated signals of the antenna elements, and presents example polarimetric profiles for PSRs J0437-4715 and J1900-2600 at 185 MHz.

Equations113

w_{j}

w_{j}

Δ t_{j} = (w_{j} + L_{j}) / c,

Δ t_{j} = (w_{j} + L_{j}) / c,

ϕ_{j, n} = 2 π Δ t_{j} f_{n} rad

ϕ_{j, n} = 2 π Δ t_{j} f_{n} rad

\frac{d w _{j}}{d t} = - 1 \times (sin (H) X_{j} + cos (H) Y_{j}) cos (δ) \frac{d H}{d t},

\frac{d w _{j}}{d t} = - 1 \times (sin (H) X_{j} + cos (H) Y_{j}) cos (δ) \frac{d H}{d t},

\frac{d ϕ _{j, n}}{d t}

\frac{d ϕ _{j, n}}{d t}

\frac{d H}{d t}

\frac{d ϕ _{j, n}}{d t}

\frac{d ϕ _{j, n}}{d t}

J_{j} = G_{j} D_{j} A_{j} P_{j} T_{j} I_{j} F_{j},

J_{j} = G_{j} D_{j} A_{j} P_{j} T_{j} I_{j} F_{j},

v_{j} = J_{j} e .

v_{j} = J_{j} e .

V_{j k} = ⟨ v_{j} v_{k}^{H} ⟩ = J_{j} ⟨ e e^{H} ⟩ J_{k}^{H} .

V_{j k} = ⟨ v_{j} v_{k}^{H} ⟩ = J_{j} ⟨ e e^{H} ⟩ J_{k}^{H} .

\langle\textsf{{e}}\textsf{{e}}^{H}\rangle=\left(\begin{array}[]{cc}\langle e_{x}e^{*}_{x}\rangle&\langle e_{x}e^{*}_{y}\rangle\\ \langle e_{y}e^{*}_{x}\rangle&\langle e_{y}e^{*}_{y}\rangle\\ \end{array}\right)=\frac{1}{2}\left(\begin{array}[]{cc}I+Q&U+iV\\ U-iV&I-Q\\ \end{array}\right).

\langle\textsf{{e}}\textsf{{e}}^{H}\rangle=\left(\begin{array}[]{cc}\langle e_{x}e^{*}_{x}\rangle&\langle e_{x}e^{*}_{y}\rangle\\ \langle e_{y}e^{*}_{x}\rangle&\langle e_{y}e^{*}_{y}\rangle\\ \end{array}\right)=\frac{1}{2}\left(\begin{array}[]{cc}I+Q&U+iV\\ U-iV&I-Q\\ \end{array}\right).

V_{j k, f_{n}}^{model} = a = 1 \sum N_{C} J_{j, a} P_{j k, a} J_{k, a}^{H} exp {- i ϕ_{j k, a, f_{n}}}

V_{j k, f_{n}}^{model} = a = 1 \sum N_{C} J_{j, a} P_{j k, a} J_{k, a}^{H} exp {- i ϕ_{j k, a, f_{n}}}

V_{j k, f_{n}}^{residual}

V_{j k, f_{n}}^{residual}

V_{j k, a, f_{n}}

V_{j k, a, f_{n}}

V_{j k, a, f_{n}}^{'}

V_{j k, a, f_{n}}^{'}

J_{j, a} P_{j k, a} J_{k, a}^{H} = V_{j k, f_{n}, a}^{'} .

J_{j, a} P_{j k, a} J_{k, a}^{H} = V_{j k, f_{n}, a}^{'} .

J_{j, a}^{n} = E_{j} J_{j, a}^{n - 1}

J_{j, a}^{n} = E_{j} J_{j, a}^{n - 1}

S (E_{j}) = j k, j \neq = k \sum N_{A} ∥ E_{j} J_{j, a} P_{j k, a} J_{k, a}^{H} - V_{j k, a, f_{n}}^{'} ∥^{2},

S (E_{j}) = j k, j \neq = k \sum N_{A} ∥ E_{j} J_{j, a} P_{j k, a} J_{k, a}^{H} - V_{j k, a, f_{n}}^{'} ∥^{2},

∥ A ∥_{F} = tr (A A^{H})

∥ A ∥_{F} = tr (A A^{H})

S (E_{j})

S (E_{j})

(E_{j} J_{j, a} P_{j k, a} J_{k, a}^{H} - V_{j k, f_{n}}^{'})^{H})

S (E_{j}) = tr (j k, j \neq = k \sum N_{A} (E_{j} M_{j k, a} M_{j k, a}^{H} E_{j}^{H} - E_{j} M_{j k, a} V_{j k, a, f_{n}}^{' H}

S (E_{j}) = tr (j k, j \neq = k \sum N_{A} (E_{j} M_{j k, a} M_{j k, a}^{H} E_{j}^{H} - E_{j} M_{j k, a} V_{j k, a, f_{n}}^{' H}

- V_{j k, a, f_{n}}^{'} M_{j k, a}^{H} E_{j}^{H} + V_{j k, a, f_{n}}^{'} V_{j k, a, f_{n}}^{' H}))

\frac{\partial tr ( A B )}{\partial B} = A^{H}

\frac{\partial tr ( A B )}{\partial B} = A^{H}

\frac{\partial S ( E _{j} )}{\partial E _{j}} = k, k \neq = j \sum N_{A} [E_{j} M_{j k, a} M_{j k, a}^{H} - V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] .

\frac{\partial S ( E _{j} )}{\partial E _{j}} = k, k \neq = j \sum N_{A} [E_{j} M_{j k, a} M_{j k, a}^{H} - V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] .

k, k \neq = j \sum N_{A} [E_{j} M_{j k, a} M_{j k, a}^{H}] = k, k \neq = j \sum N_{A} [V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] .

k, k \neq = j \sum N_{A} [E_{j} M_{j k, a} M_{j k, a}^{H}] = k, k \neq = j \sum N_{A} [V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] .

E_{j} = k, k \neq = j \sum N_{A} [V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] \times k, k \neq = j \sum N_{A} M_{j k, a} M_{j k, a}^{H}^{- 1}

E_{j} = k, k \neq = j \sum N_{A} [V_{j k, a, f_{n}}^{'} M_{j k, a}^{H}] \times k, k \neq = j \sum N_{A} M_{j k, a} M_{j k, a}^{H}^{- 1}

J_{k, a}^{n} = ((1 - α) + α E_{j}) J_{j, a}^{n - 1},

J_{k, a}^{n} = ((1 - α) + α E_{j}) J_{j, a}^{n - 1},

U P U^{H} = P

U P U^{H} = P

J_{j, a}

J_{j, a}

D_{j}

J_{j, s} = D_{j} B_{s} .

J_{j, s} = D_{j} B_{s} .

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

\jid

PASA

\jyear2024

\DeclareBibliographyAliaswebpageonline

MWA Tied-Array Processing I: Calibration and Beamformation

S. M. Ord1,2

S. E. Tremblay2,3

S. J. McSweeney2,3

N. D. R. Bhat2,3

C. Sobey2,4

D. A. Mitchell1,3

P. J. Hancock2,3

F. Kirsten5,2

1CSIRO Astronomy and Space Science

PO Box 76

Epping

NSW 1710

Australia

2International Centre for Radio Astronomy Research (ICRAR)

GPO Box U1987

Perth

WA 6845

Australia

3ARC Centre of Excellence for All-sky Astrophysics (CAASTRO)

4CSIRO Astronomy and Space Science

PO Box 1130 Bentley

WA 6102

Australia

5Department of Space

Earth and Environment

Chalmers University of Technology

Onsala Space Observatory

439 92

Onsala

Sweden

Abstract

The Murchison Widefield Array is a low-frequency Square Kilometre Array precursor located at the Murchison Radio–astronomy Observatory in Western Australia. Primarily designed as an imaging telescope, but with a flexible signal path, the capabilities of this telescope have recently been extended to include off-line incoherent and tied-array beam formation using recorded antenna voltages. This has provided the capability for high-time and frequency resolution observations, including a pulsar science program. This paper describes the algorithms and pipeline that we have developed to form the tied array beam products from the summation of calibrated signals of the antenna elements, and presents example polarimetric profiles for PSRs J0437 $-$ 4715 and J1900 $-$ 2600 at 185 MHz.

doi:

10.1017/pas.2024.xxx

keywords:

instrumentation: interferometers, techniques: interferometric, pulsars: general, pulsars: individual (PSR J0437–4715, PSR J1900–2600)

1 Introduction

The Murchison Widefield Array Phase 1 (MWA; Tingay et al. 2013), consists of 2048 dual–polarisation dipole antennas arranged into 128 aperture array ‘tiles’ to form a connected element interferometer. It operates between approximately 80 and 300 MHz and was primarily designed as an imaging telescope. The array has now been upgraded to Phase 2, being described in Wayth et al (2018), which deploys 256 tiles, but with only 128 available at any given time. Phase 2 supports two configurations, one a compact configuration with baselines in general shorter than in Phase 1, and an extended configuration, with baselines substantially longer than in Phase 1. We present the pipeline and algorithms employed to provide array voltage beams using 128 antennas, for high time resolution polarimetric observations of pulsars. This pipeline was originally developed for MWA Phase 1, but is also used in Phase 2 operations.

As an interferometer, the signals from all of the 128 dual-polarisation antennas are brought together in the MWA correlator (Ord et al. 2015), and the cross-power spectrum is measured. The output from the correlator is accumulated for an integration period of at least 0.25 seconds and usually longer. These visibility sets are used to form images of the radio sky. The time resolution of imaging interferometers is typically too coarse for pulsar observations, and the full cross power spectrum is a very large data set for telescopes with a large number of elements. Full correlation of radio arrays is also computationally challenging and early observations of pulsars using radio telescope arrays were performed by summing the elements into phased-array beams; indeed this is how they were serendipitously discovered (Hewish et al. 1968). As the pulsar field matured, observations were preferentially made at higher frequencies with single dishes. Recently, interest in the Epoch of Re-ionisation has driven the design and production of next–generation low–frequency radio arrays with extensive collecting area and full correlation between elements (e.g. the MWA and LOFAR; van Haarlem et al. 2013; and SKA-LOW111 As detailed at http://www.ska.gov.uk in the design phase). Pulsar observations are key science cases for LOFAR and the SKA, and in this paper we outline the beamforming capability that has been developed for the MWA.

For the beamformer the signals from each antenna are recorded by the MWA Voltage Capture System (VCS; Tremblay et al 2015). The VCS system records channelised data – 3072 channels across 30.72 MHz of bandwidth, for each of the 256 inputs. After the signal streams are recorded, they are calibrated and combined into a single, channelised, dual-polarisation voltage, tied-array, pencil beam. In beamforming we are attempting to measure the sky brightness in a single look direction. This does not require that we form the full cross correlation between all elements; we need only form a sum over all the elements, producing a data product that is similar to the single pixel of an interferometric image. This is a much less compute intensive operation than all-to-all correlation. However, the sum has to be performed coherently, and separately for each look-direction, otherwise the contributions from each antenna will not add constructively. This requires precise instrumental calibration to ensure minimum loss of sensitivity.

Initial pulsar observations made with the MWA did not utilise beamforming. It is much simpler to avoid precise calibration and coherent addition and to detect the signal from each antenna and sum this detected product. This is called the incoherent-beam and has been used to perform a census of southern hemisphere pulsars (Xue et al 2017) and to observe the properties of individual pulsars and the intervening interstellar medium (Bhat et al 2014). The disadvantage of this method is that each antenna has such low intrinsic gain that the incoherent sum has a large noise component, resulting in comparatively low sensitivity. The advantage of the incoherent sum is the wide field of view, which may contain many objects simultaneously, and its computational simplicity. We examine the relative performance of the incoherent and coherent sum for the MWA in §3.1.

In order to direct the tied-array beam to a given look direction the outputs of all antennas must be combined, incorporating a compensating delay for beam steering, cable length differences, and the phase of the antenna complex gain (see Figure 1 for a schematic of a 3–element beamformer). In general the delay is not an integer number of discrete samples, so is often applied as both a whole sample delay correction and a residual phase correction. The calibration of the tied array beam utilises the Real Time System (RTS: Mitchell et al 2008) in order to obtain estimates of the complex gain of each constituent antenna.

Following the delay and calibration compensation the individual antenna voltage streams are combined into a single dual-polarisation beam, which can then be packed into two file formats. Either an undetected dual-polarisation format, VDIF (Whitney et al 2009). Or the time series can be detected, formed into Stokes parameters and packed into PSRFITS (Hotan et al 2004) format. All subsequent analysis is performed by the PRESTO (Ransom 2011), DSPSR (van Straten and Bailes 2010) and PSRCHIVE (Hotan et al. 2004) software packages.

The format of this paper is as follows. We first discuss the practical determination and application of the delay corrections. Then we outline the calibration process, which determines the antenna Jones matrices. We then describe the application of the compensating delays and the calibration, and the formation of the beams. We finally present full polarimetric profiles of pulsars obtained using the MWA tied-array beamformer in both coherent and incoherent mode and discuss their properties.

2 Data Acquisition, Array Calibration and Beam Formation

Calibration is the act of determining the relative gain of each constituent antenna. In this context it is performed for the purpose of maintaining coherence in the beamforming process. The discussion of array calibration in this paper follows the Jones matrix formalism popularised for radio astronomy in the series of papers by Hamaker, Sault and Bregman (1996, 1996a, 1996b and 2000, henceforth HSB) and revisited by Smirnov (2011a, 2011b, 2011c). We will repeat the salient points of this description as it provides useful background, but direct the reader to these papers for a more complete treatment. Firstly, we outline the process by which the data streams are acquired and the delay model constructed.

2.1 Data Acquisition

The subsystem that records the baseband data, the so-called Voltage Capture System (VCS), and the data produced by the system are described in detail in Tremblay (2015) and will be outlined here on a cursory level.

The VCS splits off the complex data after they have been channelised into 10 kHz and 100 $\mu$ s samples by a polyphase filterbank. These data pass through a set of 16 servers, each of which possesses 2 x 1.44 TB RAIDs (Redundant Arrays of Independent Disks). The channelised data can be written out to these RAIDs in real-time for $\sim 100$ minutes before reaching their capacity limit. The data are then moved into the MWA data archive hosted by the Pawsey Supercomputing Centre in Perth, Australia.

VCS files each consist of 4-bit + 4-bit complex voltages from every antenna, polarisation, 10 kHz channel, and 100 $\mu$ s; each 242 MB in size. 32 of these files are recorded for each second of MWA-VCS observations, generating a data volume of $\sim$ 28 TB/hour.

2.2 Initial Delay Correction

We implement all of the features of delay tracking and beam formation as a software operation on the recorded telescope voltages. The metadata for an MWA observation are held in the MWA observing database and can be accessed in the form of a FITS table. For the purposes of the delay calculations the data retrieved from this file are; the locations of the antennas, the cable lengths, and the pointing of the telescope. The analysis is simplified by considering a reference position to which all signals will be delayed, then each antenna delay-distance can be considered a baseline, and measured in the same way as the typical interferometric ( $u,v,w$ ) coordinates. The geometric delay-distance, measured in metres, for a given antenna, ${j}$ , is simply the $w$ coordinate as calculated by:

[TABLE]

where $({X_{j}},{Y_{j}},{Z_{j}})$ is the $j^{\mathrm{th}}$ antenna’s position in local geocentric coordinates. The local frame is where $Z$ points north, $X$ points through the equator from the geocentre along the local meridian and $Y$ is east. This is a geocentric, earth fixed coordinate system, except the $X$ axis is pointing toward the local meridian and not Greenwich. $H$ and $\delta$ are the hour angle and declination of the look direction respectively. The time delay, with units of seconds, including the electrical cable length ( $L_{j}$ ) is:

[TABLE]

where $c$ is the speed of light. For all the MWA antenna positions and cable lengths, and with the time resolution of the captured data (100 $\mu\mathrm{s}/\mathrm{sample}$ ), all of the antenna data streams arrive within a sample-time. No whole sample delays are required. It is computationally simpler to express this time delay as a phase shift. To do this firstly we need to consider time measured in samples, rather than seconds. We then need to consider the frequency of the observation, as the phase shift is a function of frequency. In our case, the raw data captured by the VCS system is already channelised, so this correction can be applied per channel. In practice, with time measured in seconds and frequency in Hz all unit conversions cancel for the phase calculation. The phase correction for a channel, $n$ with centre frequency $f_{n}$ of antenna ${j}$ required to compensate for beam steering and cable delays is:

[TABLE]

2.3 Fringe Rate

As the Earth rotates, the projection of each baseline changes with respect to the phase tracking centre. For a phase centre at fixed celestial coordinates, the only time-variable quantity in Equation 1 is the hour angle $H$ . Differentiating Equation 1 with respect to time:

[TABLE]

and the only time-variable quantity in Equation 3 is the $w_{j}$ in $\Delta{t_{j}}$ . The fringe rate is therefore:

[TABLE]

For an east-west (E-W) baseline, the fringe rate simplifies to:

[TABLE]

We assign a reference position, to which all antennas will be delayed, to be near the centre of the array. The maximum “baseline” therefore, in the specific case of the MWA Phase 1, is approximately 1.5 km. Given a maximum frequency of approximately 300 MHz, the maximum fringe rate (with source on the celestial equator, crossing the local meridian) is approximately 0.5 rad $\mathrm{s}^{-1}$ . The MWA Phase 1 antenna distribution is centrally condensed with most antennas being within a few hundred metres; which implies that for the vast majority of baselines the rate is much less than this. The MWA Phase 2 offers an extended layout allowing baselines as long as 5 km. This will approximately double the maximum fringe rate for these antennas to $\sim$ 1 rad $\mathrm{s.}^{-1}$

2.4 Gain Calibration

Each antenna in the array has a complex gain, imparting a phase turn on the incoming voltage stream. This serves to decorrelate the sum of the antenna signals. The gain calibration process is an attempt to determine this instrumental response. Since the work of HSB it is commonplace to describe the instrumental Jones matrix of antenna $j$ as a combination of antenna-based Jones matrices:

[TABLE]

•

G – electrical gains: Usually diagonal; can be direction dependent.

•

D – leakage: how much of one polarisation is detected in the orthogonal polarisation; direction dependent.

•

A – antenna effects: any peculiarities of the antenna that are a-priori known.

•

P – parallactic angle: rotation of the feed with respect to the sky: rotation matrix, time variable.

•

T – tropospheric gain: diagonal and polarisation independent, time variable.

•

I – ionospheric gain: ionospheric opacity, time variable.

•

F – Faraday rotation: due to the ionosphere: rotation matrix; direction dependent, time variable.

In order to compensate for the different antenna gains, they must be calibrated; that is, placed on the same relative, or absolute, amplitude and phase scale. The individual antennas that comprise the MWA are of low intrinsic gain; the sky at low frequencies is dominated by bright diffuse synchrotron radiation, and the antennas do not have calibrated noise diodes to be used as model sources. The antennas cannot therefore be calibrated individually, and must be calibrated as an interferometer. This allows baselines to be combined to increase signal to noise (shorter baselines to be excluded from the sum to reduce the contribution from diffuse structure) and a component model of the sky brightness can be used as the model in the calibration process. However, in order to calibrate the MWA using this method the raw voltages, as captured for beamforming, must be correlated to form visibilities from which the calibration solution can be obtained. The calibration is obtained either via a short calibration scan performed on a nearby calibrator field, or an in-field calibration is used. The FoV of the MWA is very large and a given field contains many sources suitable for calibration (see §2.4.1). A salient feature of the MWA is that it employs a software correlator, which can be run offline on the recorded data, generating exactly the same products as are generated on-line (Ord et al. 2015). In the case where dedicated calibration observations are used, experience has shown that calibration scans with the sources close to transit are generally better, generating higher signal-to-noise ratio tied-array beams (Xue et al submitted). Therefore often a single high quality calibration observation will be made and will serve for all observations during a session, typically a few hours in duration.

Following HSB and Smirnov (2011a,b,c), for convenience we will collapse Equation 8 and write the instantaneous response of an antenna, $j$ , to an incident electric field vector, $\textsf{{e}}=(e_{x},e_{y})^{H}$ , where $H$ is the Hermitian transpose, as the antenna voltage $\textsf{{v}}_{j}$ :

[TABLE]

The action of correlating the voltages from any two antennas ( $j$ and $k$ ), is performed as an outer product between the antenna pairs, performed for all unique pairs, and in each frequency channel. a process which produces the visibility matrix:

[TABLE]

Where $\langle.\rangle$ denotes an average over some time and frequency interval the Jones matrices can be considered constant. The outer-product $\langle\textsf{{e}}\textsf{{e}}^{H}\rangle$ results in the coherency matrix which for a single look-direction has a direct relationship to the Stokes parameters given by:

[TABLE]

Using Equations 10 and 11 we can relate the polarisation state of a calibrator source to the measured visibilities, via the Jones matrices of the antennas.

2.4.1 Antenna Jones Matrix Determination and Calibration

With this framework we will now outline the mechanisms by which the Real-Time-System (RTS) Calibration and Measurement Loop (CML) determines the unknown Jones matrices from the measured visibilities. The CML was specifically designed to incorporate direction dependence and ionospheric corrections into the gain determination. At MWA observing frequencies ionospheric phase shifts are considerable. Since these are variable in time and direction, they must be decoupled from instrumental phase shifts to limit the number of free calibration parameters. The CML assumes that the phases scale linearly with wavelength and, for a given frequency and calibrator, have a linear trend across the relatively small array. The CML can then track a single wavelength-squared-dependent refractive offset for each calibrator source, while building up antenna gain information on slower time scales. The CML starts by finding the brightest calibrators and estimating their apparent flux densities (in the form of coherency matrices) from a source catalogue. The phase at which a calibrator enters the visibility is equivalent to the geometric delay discussed in §2.2; it is calculated as the scalar product between the baseline vector and the look-direction, which includes a component due to catalogue position and the refractive ionospheric offset. Denoting the combined phase shift for calibrator $a$ , at the frequency $f_{n}$ , as $\phi_{jk,a,f_{n}}$ , we can form model visibilities as the sum over calibrators:

[TABLE]

where $\textsf{{P}}_{jk,a}$ is the coherency matrix for calibrator $a$ . We will use this simple summation over calibrator sources for clarity, but in practice we can extend the sky model over more than the $N_{C}$ calibrators, can include a tapering term in $uv$ to convolve a sky component with a known morphology, and can form a given calibrator from a summation over multiple neighbouring components. As the calibration components can be complex structures, the model is generally a function of baseline. The full-sky model is then subtracted from the visibility set.

[TABLE]

The calibrator (or more complex model), $a$ , is then added back in at its current best-fit position. To form a visibility set which is dominated by the calibrator.

[TABLE]

The visibility set is then rotated to the calibrator position and undergoes a small amount of averaging in time and frequency to increase the S/N.

[TABLE]

The measured $\mathsf{{V}}_{{jk,a,f_{n}}}^{\prime}$ can be directly compared to a model of the measured visibility matrix for the source $a$ . In other words, we are attempting to find a gain solution for which $\mathsf{{V}}_{{jk,f_{n}}}^{\mathrm{residual}}\exp\{i\phi_{jk,a,f_{n}}\}=0$ and consequently:

[TABLE]

This is an implementation of the peeling scheme introduced by Noordam et al 2004 and is repeated iteratively for a list of $N_{c}$ calibrators. There are as many implementations of the solution to Equation 16 as there are astronomical software packages. Here we will present a derivation of the gain solution scheme used in the RTS. The scheme is essentially a fully polarimetric implementation of antsol, as described in Bhatnagar and Nityananda (2001) and references therein. Equation 16 can be linearised by assuming that $\textsf{{P}}_{jk,a}$ is known, which by definition is one of the components of the sky model, and all the $\textsf{{J}}_{k,a}^{n-1}$ are known from a starting estimate, or the previous iteration $(n-1)$ . We are solving for the error matrix $\textsf{{E}}_{j}$ , that ideally corrects a previous iteration of the antenna gain - to a better estimate for the current iteration:

[TABLE]

This system is overdetermined and can be solved in a least squared sense by minimising the objective function:

[TABLE]

where all J are from the starting estimate or previous iteration. As this is a full polarimetric treatment all the terms in Equation 18 are $2\,\times\,2$ matrices, therefore minimising this equation in order to find the best set of gain solutions, in a least squares sense, is a non-trivial operation. We have used the symbol for a norm $(\|.\|$ ) as to express the size of the squared difference between the model and the measurements, but not indicated which norm we should use, or how it should be calculated. In the RTS the chosen norm is the Frobenius, or Hilbert-Schmidt norm, $\|.\|_{F}$ .

[TABLE]

So the objective function becomes formally:

[TABLE]

For clarity, we will denote the predicted, or model visibility $\textsf{{J}}_{j,a}\textsf{{P}}_{jk,a}\textsf{{J}}_{k,a}^{H}$ as $\textsf{{M}}_{jk,a}$

[TABLE]

The minimum of Equation 18 is found by differentiating Equation LABEL:eqn:obj2 with respect to the error term $\textsf{{E}}_{j}$ and equating the result to zero. In performing this differentiation three points need to be considered. Firstly, a complex variable and its conjugate are independent. Secondly, we are assuming all $\textsf{{E}}_{j}$ are independent. Finally, in considering the derivative of a $\mathrm{trace}$ that contains a matrix product the following holds:

[TABLE]

Setting this derivative to zero reveals:

[TABLE]

As the summation on the L.H.S. of Equation 24 is over $k$ only, we can move the $\textsf{{E}}_{j}$ outside the summation to obtain:

[TABLE]

This shows that the update matrix for antenna $j$ is formed by the sum of all visibilities containing that antenna, weighted by the model, divided by the square of the weights. Therefore in the case where the model matches the visibilities the update matrix is the identity matrix. With this scheme we start with an estimate for all antenna gains $\textsf{{J}}_{j,a}$ , then via Equation 25 obtain a new estimate of $\textsf{{E}}_{j}$ . We then use this to update our estimate of $\textsf{{J}}_{j,a}$ .

There are multiple cycles of the calibration and measurement loop, we iterate over each gain calibrator. Full gain calibration as described above is only applied to a small number of calibrators, but many can be used to measure ionospheric offsets. In the beamforming implementation, where we only require a good gain estimate in a single direction for the purpose of beamforming the dominant ionospheric shift is accounted for by the bulk offset of the brightest calibrator from its catalogue position. This shift is generally incorporated into the gain solutions. Although in this implementation we only require the solution in a single direction, we benefit from employing multiple calibrators in two ways: firstly in the scheme where we treat each calibrator separately we achieve a better gain solution by more accurately removing the contribution from the contaminating sources. Secondly and more commonly, we use multiple calibrators as components in a field based calibration scheme where the model contributions are summed into a single model before fitting. This has the benefit of increasing the SNR of the model as it includes more flux. The disadvantage is that any errors in the model components are not individually fit for. In the case of multiple calibrators we generally decouple the solutions by not simultaneously calibrating on and subtracting sources, except where complete subtraction of contaminating sources in warranted, as there only so far you can go with this scheme before you start removing too many degrees of freedom from the dataset. There are more sophisticated schemes that can simultaneously solve for multiple directions (SAGEcal as described in Kazemi et al 2011 and Yatawatta et al 2009. In tests we have found that we converge to the same solutions as these more sophisticated schemes, but generally we require more iterations. Our iterations are however cheaper in terms of operations.

2.4.2 Update Rates and Degeneracies

In practice we average the previous iteration with the new prediction to obtain the next estimate.

[TABLE]

This update has been presented by previous authors: HBS, Mitchell et al (2008), and discussed in detail by Salvini and Wijnholds (2014). we have found that this scheme is very stable, provided $\alpha$ is less than 0.5. In the case of low S/N we have found that reducing the update rate is useful in maintaining convergence.

The whole scheme is predicated on the relationship between the calibrator coherency matrix and the visibilities, Equation 16. However it is possible to replace the calibrator $P$ , via a similarity transform of the following form:

[TABLE]

This is possible for only a small set of U and, or, P. For example if U is a rotation and P is unpolarised then Equation 27 clearly holds. This amounts to saying if the calibrator is unpolarised then there is an unconstrained rotation that can be absorbed into the calibration solutions. Note that this is true even if there is a significant amount of instrumental polarisation as it is the diagonal nature of the model that permits the degeneracy. There is another degeneracy, as the calibration is also insensitive the row exchange operator providing the calibrator is unpolarised. Generally, the second case is easier to spot as this increases the value of the off-diagonal terms in the calibration, this can manifest as linear polarisation suddenly transferring between $Q$ and $U$ . We cannot alleviate the first degeneracy without a source of known polarisation, but we can reduce the likelihood of the second by using starting estimates that are close to the actual solutions via a reasonable model of the antenna.

2.5 Beam Formation

The raw voltages for each coarse, 1.28 MHz, channel are retrieved from storage and the formation of the tied array beam proceeds as a pipeline operation after a calibration solution is obtained. Each coarse channel is beamformed independently. A gain solution is obtained for each antenna using the geometric delay to the pointing centre and the calibration solution; every sample from every antenna is multiplied by its gain solution to equalise the antenna gains and phase all antennas to the same direction. Either the Stokes parameters are formed, or the single dual–polarisation tied-array voltage beam is written to disk. The retrieval of the raw data is an implementation-dependent task and can be arranged via discussion with the MWA project (contacted via http://mwatelescope.org). In this section of the paper we will discuss the determination and application of the gain solution and the formation of both the Stokes parameters and the tied-array voltage beam.

2.5.1 Delay Compensation

Firstly, the process outlined in §2.2 is employed: given the arrival time of the voltage sample, and the array layout, the ( $u,v,w$ ) for each antenna relative to a reference antenna are determined. The $w$ being that used in Equations 1 and 2 to determine the geometric delay for each antenna. The relative cable length between the antenna in question and the reference antenna is obtained and associated with the electrical length of this cable difference, this is then turned into a time delay. Both these time delays are combined and converted into a phase shift using Equation 3. This phase correction is determined for each 10 kHz channel and updated every second to track the source.

As has been noted, in §2.3. the geometric delay is changing at a rate of a 0.5 rad $\mathrm{s}^{-1}$ on the longest baseline at the highest frequency, but is typically much slower due to the centrally condensed array layout. As we are only updating the delay compensation every second, the phased array beam will decorrelate to some degree. To minimise this we calculate delays at the centre of each second, which limits the phase error to a maximum of 0.25 rad, in the worst case. This is an acceptable level of phase error: quarter of a radian of phase error corresponds to a 1% drop in signal to noise ratio. The extended configurations may require an increase to this update rate, which can be accommodated.

2.5.2 Gain Compensation

The next step is to incorporate the antenna gains into the delay model. The MWA beamformer can use antenna gain calibrations obtained via a variety of methods. We predominately use solutions obtained by the RTS, as introduced in §2.4, but can also read the calibration information determined by the calibrate tool (Offringa et al 2014) used by the GLEAM survey team (Hurley-Walker et al 2017). These two methods incorporate a level of direction dependence in that they account for the MWA beam model and the ionosphere. We can also utilise more standard MIRIAD and CASA gain tables if required.

The first step in obtaining the antenna gain compensation from an RTS solution is to realise that the measured Jones matrices can be decomposed into the direction independent gain ( $\textsf{{D}}_{j}$ ) and a direction dependent component, common to all solutions. This direction dependent element is the beam model in the reference direction of the calibration solution. This direction is given in the sky model used for the calibration, it may be the position of a single source, or an assumed reference position for a group of components. The essential part is that it is a common component of all solutions, in the terminology of the calibration system it is the alignment matrix, ( $\textsf{{A}}_{\mathrm{a}}$ ).

As the calibration is in the assumed alignment direction, rather than the desired pointing direction, we have to rotate the solution incorporating the direction dependence of the antenna. We first remove the alignment component:

[TABLE]

We can then form the Jones matrix in the desired direction of source, $s$ : which we denote $\textsf{{J}}_{j,\mathrm{s}}$ . This is obtained by post-multiplying by the model beam response for the source direction, $\textsf{{B}}_{\mathrm{s}}$ .

[TABLE]

The model for the antenna used for the observations in this paper is an analytic beam model. The MWA antenna beam is modelled as the sum of 16 dual–polarisation dipoles above an infinite ground screen. Alternative MWA beam models (e.g. Sokolowski et al. 2017) could also be utilised and a comparison of the polarimetric performance of the different models will be presented in future work.

2.5.3 Applying Delay and Gain Compensation

We then apply the delay and gain compensation to each sample. The voltage vector for channel $n$ and antenna $j$ , is generated in response to an incident electric field vector (e), which contains a noise contribution ( $\textsf{{n}}_{j}$ ). In general, the noise contribution completely dominates the incident electric field vector from the source. Note that the beam is being formed on a single time sample, from a single channel.

Firstly the delay compensation is applied to the measured voltage.

[TABLE]

The measured voltage is the incident signal, modified by the instrument Jones matrix plus noise.

[TABLE]

using the solution $\textsf{{J}}_{j}$ as an estimate of $\textsf{{J}}_{\mathrm{actual},j}$ we premultiply by its inverse:

[TABLE]

We then accumulate this product across all $N_{A}$ antennas, assuming we are perfectly calibrated then this sum forms an estimate, $\textsf{{e}}^{\prime}$ , of the true incident electric field e, corrupted by noise $\boldsymbol{\sigma}$ .

[TABLE]

where

[TABLE]

2.5.4 Forming the Stokes Parameters

As shown in Equation 11 the Stokes parameters are formed from the coherency matrix $\langle\textsf{{e}}\textsf{{e}}^{H}\rangle$ . In the presence of noise, a single time sample of the coherency matrix becomes:

[TABLE]

The expectation value, $\langle\cdot\rangle$ , or long running time average of this term will be dominated by the signal ( $\textsf{{e}}\textsf{{e}}^{H}$ ) and the noise squared term ( $\textsf{{$ \boldsymbol{\sigma} $}}\textsf{{$ \boldsymbol{\sigma} $}}^{H}$ ) as the zero mean nature of the thermal noise will reduce terms linear in $\boldsymbol{\sigma}$ to average to zero. If we consider the form of the noise squared term $\boldsymbol{\sigma}$$\boldsymbol{\sigma}^{H}$ :

[TABLE]

The first term in the square brackets on the RHS of Equation 42 is the sum of the autocorrelation of the noise contribution to the voltage in each antenna and the second term is the sum of all the correlated noise. If we assume that the receiver noise is uncorrelated then the second term averages down with time. As the antennas are noise dominated we can approximate this by the antenna autocorrelation. The autocorrelations contain the sky signal too, this is the incoherent beam after all. However, for a large number of antennas the benefit of removing the autocorrelation noise compensates for the loss of signal. This process also has the added benefit that the tied-array beam more clearly matches the single pixel response of the MWA interferometer.

To clarify the equivalence of the tied-array beam and interferometer response when we form beams in this manner consider the following thought experiment. Take a simple two element array, which is phased at zenith, with no calibration errors and equal gains, so no phase terms are required. The phased array sum, V, of a single polarisation, v, from each antenna including noise n is given by:

[TABLE]

recall this voltage sample is detected by the action of $\textsf{{V}}\textsf{{V}}^{H}$ . The product of which is the sum of the following sixteen terms:

[TABLE]

Lets suppose we had access to a complex correlator and instead formed the complex correlation between the two inputs:

[TABLE]

The four terms of C are present in the 16 terms of the tied-array sum, but there are others. Now another four terms can be accounted for by the conjugate of C. Interferometers typically only form half the correlations, but the tied array does not have that luxury and the sum contains both. This information is redundant and does not improve the signal to noise ratio. But that still leaves eight terms that are present in a tied-array sum, but not in a typical interferometric scheme. Consider the auto-correlations of the inputs to the tied array:

[TABLE]

These eight terms are precisely those, that when subtracted from the tied-array sum produce the equivalent to the interferometer output. That is:

[TABLE]

Therefore as we accumulate the individual voltages into the coherent $\textsf{{e}}^{\prime}$ , we also accumulate the individual antenna detected signals $\textsf{{e}}^{\prime}_{j}\textsf{{e}}^{\prime H}_{j}$ for all $j$ , we use these as an estimate of the autocorrelation component of the noise in the beam in the formation of the Stokes parameters. Removing the autocorrelations does not of course eliminate the noise, but does remove a substantial fraction. The remaining noise components are the terms linear in the noise in Equation 40 and the correlated noise in 42 which are reduced by integration in time (and frequency).

The individual Stokes parameters are formed in the following manner:

[TABLE]

3 Example Beamformed Observations

The incoherent beamformer has already been used to study the properties and population of pulsars with the MWA (Xue et al 2017). The coherent beamformer is also beginning to be employed (Bhat et al 2016; Meyers et al 2017; McSweeney et al 2017; Bhat et al 2018) and is undergoing a process of validation. In this paper we present some example pulse profiles to demonstrate the capability of the beamformer; a companion publication, Xue et al (in preparation) will present further validation.

The beamformer generates either full-polarisation PSRFITS search mode files or VDIF format voltage beams. The profiles presented in this paper have been generated using the PSRFITS output pipeline. This format is a time series of detected powers in the Stokes basis, all four Stokes parameters are formed and digitised to 8-bit, with a header conforming to the PSRFITS standard. These data were subsequently folded and dedispersed using the DSPSR software package (van Straten and Bailes 2011) and a timing ephemeris from the pulsar database222Ephemerides found using PSRCAT the pulsar catalogue (v. 1.57) retrieved from http://www.atnf.csiro.au/people/pulsar/psrcat/ (Manchester et al. 2005). Subsequent papers in this series will examine in detail the fidelity of the beamformed data. For the purposes of this paper, we will compare the morphology of the determined polarisation with that already in the literature.

3.1 Coherent vs Incoherent Signal to Noise

The efficiency of incoherent beamforming, i.e. how the signal to noise ratio (S/N) of an observation increases by adding the detected power from the constituent antennas into the sum, is a function of the ratio of sky to antenna noise. This behaviour is summarised in Kudale and Chengular (2017), who compare the sensitivities of coherent and incoherent summation for the Giant Metrewave Radio Telescope (GMRT). They give the S/N obtained in an observation of a source of flux density $S$ , using an incoherent sum of $N_{A}$ antennas, each with a receiver temperature $T_{R}$ and gain $G$ , with an antenna temperature of $T_{A}$ to be:

[TABLE]

from which it can be seen that if the antenna temperature dominates the receiver temperature sufficiently, then the S/N is no longer a function of $\sqrt{N_{A}}$ . Although the MWA is in general sky-dominated, the ratio is not high. However, there are pointing directions, near the Galactic plane for example, where this is not the case.

In the case of a coherent sum, the S/N with perfect phasing is given by:

[TABLE]

and when the source contribution to the system temperature is small ( $GS<<(T_{A}+T_{R})$ ) the S/N reduces to the expected expression:

[TABLE]

Thus, in the case where receiver noise dominates over sky noise, the coherent sum is a factor of $\sqrt{N_{A}}$ better than an incoherent sum. However, in the case where sky noise dominates, the improvement may be as much as a factor of $N_{A}$ . There is one more case to consider, when the source makes a contribution to the noise temperature. This is self-noise and is examined in detail in Kulkarni (1989) and Anantharamaiah et al (1991). In a beamfomer, self-noise is an issue when the source flux is comparable to the System Equivalent Flux Density (SEFD)333SEFD is the flux of a source that, if observed, would double the system temperature. of an antenna divided by the number of antennas in the array. The SEFD of an MWA antenna is approximately $\sim 20$ kJy at 200 MHz, so for self-noise to start to become an issue the source flux would need to be greater than $\sim 200$ Jy, which is easily reached in observations of bright single pulses e.g. Meyers et al (2017, 2018 in preparation), however in observations of pulsed sources the noise measurement is taken when the pulse is not present, thereby mitigating this effect.

The efficiency of the coherent beamforming is also a function of the accuracy of the antenna phasing, which is determined by the quality and applicability of the calibration. As examined in detail by Kudale and Chengular (2017), if the phase error has zero mean and variance $\sigma^{2}$ then Equation 55 becomes:

[TABLE]

Since the MWA is a low–frequency telescope, with comparatively short baselines, the variance of any phase error will be low. However in some calibration schemes a calibration solution is transferred from a dedicated calibration observation. When solutions are transferred the ionospheric refraction will be different and could potentially result in a persistent phase error, reducing the efficiency of the beamforming. It is becoming more common to use ”in-field” calibration where the bulk refractive offset of the field is not an issue, however we do not employ a scheme to determine the accuracy of the calibration solutions obtained.

3.2 Pulsar Observations

3.2.1 PSR J1900–2600

Figure 2 shows pulse profiles for PSR J1900–2600 obtained using an MWA-VCS observation at 184.5 MHz with 30.72 MHz of bandwidth (3072 $\times$ 10 kHz channels) and an integration time of approximately 1500 seconds. The profiles resulting from the incoherent and coherent sum are shown in the upper and lower panels, respectively. The S/N444The pulsar S/N is calculated by the PSRCHIVE package, following the method outlined in Cordes & McLaughlin (2003) of the coherently-formed beam is approximately 240, the incoherent sum of the same observation is only 14. The factor of improvement is 17, much larger than the factor of 10 expected. J1900-2600 is very close to the Galactic plane therefore, as examined in §3.1, the sky temperature likely dominates over the receiver noise, making the incoherent sum less efficient.

This pulsar shows considerable profile evolution as a function of frequency. However, the profile presented here is consistent with that at 243 MHz in Johnston et al (2008).

3.2.2 PSR J0437–4715

PSR J0437–4715 is a very well studied pulsar in the southern hemisphere, and the nearest and brightest millisecond pulsar. However there is no published low–frequency polarisation data for this object. Figure 3 shows the first polarimetric profile of J0437–4715 below 200 MHz. In this case the improvement in S/N between the incoherent and coherent sum is approximately 8, which is comparable with that expected. As the pulsar is in a region of the sky well away from the Galactic plane so the incoherent sum S/N should not be sky dominated. The polarisation profile is consistent with observations made at higher frequencies with the Parkes telescope. Observations have been published at 430, 660 and 732 MHz (Navarro et al. 1997; van Straten 2002 and Dai et al 2015). While the linear polarisation is in good agreement, especially with the lowest frequency observations, the circular polarisation seems to display a sense change which is not evident here. However, given the rapid evolution of the pulse profile as a function of frequency it is not surprising that a true comparison is difficult.

4 Summary

The primary motivation for this paper is to document the algorithms used in the MWA tied-array beamformer. We have described the steps required to calibrate and coherently combine voltages from the 128 antennas of the Murchison Widefield Array. We have also discussed some of the issues inherent in extrapolating signal–to–noise ratios between incoherent and coherent observations. Finally we have presented some representative profiles of low–frequency observations of PSRs J0437–4715 and J1900–2600, and compared them with their published profiles at nearby or higher frequencies. In a subsequent paper in this series we will examine the polarimetric stability and fidelity of the MWA beamformed observations.

Acknowledgements.

This scientific work makes use of the Murchison Radio-astronomy Observatory, operated by CSIRO. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. We acknowledge the Pawsey Supercomputing Centre which is supported by the Western Australian and Australian Governments. This research was conducted by the Australian Research Council Centre of Excellence for All-sky Astrophysics (CAASTRO), through project number CE110001020. FK acknowledges support through the Australian Research Council grant DP140104114

Bibliography38

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] K.. Anantharamaiah et al. “Source-noise in radio synthesis images” In IAU Colloq. 131: Radio Interferometry. Theory, Techniques, and Applications 19 , Astronomical Society of the Pacific Conference Series, 1991, pp. 6–9
2[2] N… Bhat et al. “Scintillation Arcs in Low-frequency Observations of the Timing-array Millisecond Pulsar PSR J 0437-4715” In Astrophysical Journal 818 , 2016, pp. 86 DOI: 10.3847/0004-637X/818/1/86 · doi ↗
3[3] N… Bhat et al. “The Low-frequency Characteristics of PSR J 0437-4715 Observed with the Murchison Wide-field Array” In Astrophysical Journal 791 , 2014, pp. L 32 DOI: 10.1088/2041-8205/791/2/L 32 · doi ↗
4[4] N… Bhat et al. “Observations of Low-frequency Radio Emission from Millisecond Pulsars and Multipath Propagation in the Interstellar Medium” In The Astrophysical Journal Supplement Series 238 , 2018, pp. 1 DOI: 10.3847/1538-4365/aad 37c · doi ↗
5[5] S. Bhatnagar and R. Nityananda “Solving for closure errors due to polarization leakage in radio interferometry of unpolarized sources” In Astronomy and Astrophysics 375 , 2001, pp. 344–350 DOI: 10.1051/0004-6361:20010799 · doi ↗
6[6] J.. Cordes and M.. Mc Laughlin “Searches for Fast Radio Transients” In Astrophysical Journal 596 , 2003, pp. 1142–1154 DOI: 10.1086/378231 · doi ↗
7[7] S. Dai et al. “A study of multifrequency polarization pulse profiles of millisecond pulsars” In Monthly Notices of the Royal Astronomical Society 449 , 2015, pp. 3223–3262 DOI: 10.1093/mnras/stv 508 · doi ↗
8[8] J.. Hamaker “Understanding radio polarimetry. IV. The full-coherency analogue of scalar self-calibration: Self-alignment, dynamic range and polarimetric fidelity” In Astronomy and Astrophysics Supplement Series 143 , 2000, pp. 515–534 DOI: 10.1051/aas:2000337 · doi ↗