Primary beam effects of radio astronomy antennas -- II. Modelling the MeerKAT L-band beam
K. M. B. Asad (1, 2, 3, 4), J. N. Girard (5), M. de Villers (6), T., Ansah-Narh (2), K. Iheanetu (2), O. Smirnov (2, 4), M. G. Santos (3, 4), R., Lehmensiek (6), J. Jonas (2, 4), D. I. L. de Villiers (7), K. Thorat (2, 4,, 8), B. Hugo (2, 4), S. Makhathini (2), G. I. G. Jozsa (2

TL;DR
This paper develops a detailed, sparse model of MeerKAT's L-band primary beam using observations and simulations, enhancing calibration and imaging accuracy for radio astronomy.
Contribution
It introduces a novel Zernike polynomial-based beam model and software tool for improved direction-dependent calibration of MeerKAT antennas.
Findings
Model is more accurate for diagonal beam elements and at lower frequencies.
Provides a software tool (EIDOS) for beam modeling and calibration.
Future improvements expected with more precise measurements and simulations.
Abstract
After a decade of design and construction, South Africa's SKA-MID precursor MeerKAT has begun its science operations. To make full use of the widefield capability of the array, it is imperative that we have an accurate model of the primary beam of its antennas. We have taken available L-band full-polarization 'astro-holographic' observations of three antennas and a generic electromagnetic simulation and created sparse representations of the beams using principal components and Zernike polynomials. The spectral behaviour of the spatial coefficients has been modelled using discrete cosine transform. We have provided the Zernike-based model over a diameter of 10 deg averaged over the beams of three antennas in an associated software tool (EIDOS) that can be useful in direction-dependent calibration and imaging. The model is more accurate for the diagonal elements of the beam Jones matrix…
| Experiment ID | 20181206-0010 |
|---|---|
| Target object | 3C 273 |
| Target RA (J2000) | |
| Target DEC (J2000) | |
| Start time | 6 December, 2018, 10:29:01 SAST |
| Duration | 0.5 hours |
| Time resolution | 1.0 s |
| Scanning antenna | M009, M012, M015 |
| Centre frequency | 1284.0 MHz |
| Bandwidth | 856 MHz |
| Channel width | 0.835937 MHz |
| Number of channels | 1024 |
| Average elevation | |
| Average azimuth |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Primary beam effects of radio astronomy antennas – II. Modelling MeerKAT L-band beams
K. M. B. Asad,1,2,3,4 J. N. Girard,5 M. de Villiers,4 T. Ansah-Narh,2 K. Iheanetu,2 O. Smirnov,2,4 M. G. Santos,3,4 R. Lehmensiek,6 J. Jonas,2,4 D. I. L. de Villiers,7 K. Thorat,2,4,8 B. Hugo,2,4 S. Makhathini2, G. I. G. Jozsa2,4,9 and S. K. Sirothia2,4
1ARGI, Department of Physical Sciences, Independent University, Bangladesh, Bashundhara RA, Dhaka, Bangladesh
2Department of Physics and Electronics, Rhodes University, PO Box 94, Grahamstown, 6140, South Africa
3Department of Physics and Astronomy, University of the Western Cape, Bellville, Cape Town, 7535, South Africa
4South African Radio Astronomy Observatory, 2 Fir Street, Black River Park, Observatory, Cape Town, 7405, South Africa
5AIM, CEA, CNRS, Université Paris-Saclay, Université de Paris, F-91191 Gif-sur-Yvette, France
6 EMSS Antennas, Stellenbosch, South Africa
7 Department of Electrical and Electronic Engineering, Stellenbosch University, Stellenbosch, South Africa
8 Department of Physics, University of Pretoria, Hatfield, Pretoria, 0028, South Africa
9 Argelander-Institut für Astronomie, Auf dem Huügel 71, D-53121 Bonn, Germany E-mail: [email protected]
(Accepted XXX. Received YYY; in original form ZZZ)
Abstract
After a decade of design and construction, South Africa’s SKA-MID precursor MeerKAT has begun its science operations. To make full use of the widefield capability of the array, it is imperative that we have an accurate model of the primary beam of its antennas. We have taken available L-band full-polarization ‘astro-holographic’ observations of three antennas and a generic electromagnetic simulation and created sparse representations of the beams using principal components and Zernike polynomials. The spectral behaviour of the spatial coefficients has been modelled using discrete cosine transform. We have provided the Zernike-based model over a diameter of 10 degrees averaged over the beams of three antennas in an associated software tool (EIDOS) that can be useful in direction-dependent calibration and imaging. The model is more accurate for the diagonal elements of the beam Jones matrix and at lower frequencies. As we get more accurate beam measurements and simulations in the future, especially for the cross-polarization patterns, our pipeline can be used to create more accurate sparse representations of MeerKAT beams.
keywords:
Astronomical instrumentation – Interferometric techniques
††pubyear: 2020††pagerange: Primary beam effects of radio astronomy antennas – II. Modelling MeerKAT L-band beams–A
1 Introduction
Observational radio astronomy is continuing its growth through the construction of new generations of radio telescopes such as LOFAR (van Haarlem et al., 2013), ASKAP (McConnell et al., 2016), MeerKAT (Jonas & MeerKAT Team, 2016) and the upcoming SKA (Braun et al., 2015). The new and upcoming telescopes can offer exquisite sensitivity and resolution and the ability to image large fractions of the sky very quickly which makes them ideal for exploring new science that relies on the detection of extremely weak astrophysical signals. However, to make full use of these enhanced capabilities, observers must improve their calibration techniques and the models of instrumental effects. The first and second generations of calibration strategies (defined in Smirnov, 2011a) can no longer do justice to the new interferometers.
The key science goals of these newcomers will often demand an implementation of more advanced calibration strategies where both direction independent (e. g., receiver electronics) and dependent (e. g., primary beam and ionosphere) effects are taken into account under an unified mathematical formalism such as the radio interferometer measurement equation (RIME: Hamaker et al., 1996; Smirnov, 2011a) which relates the visibilities observed () by a baseline (formed by the antennas and ) to the true sky represented by the ‘brightness matrix’ . It is given by the equation (following Smirnov, 2011a, eqn. 18)
[TABLE]
where , are the coordinates of the baseline in a reference frame oriented toward the observing direction, and are their direction independent (DIE) and dependent systematic effects (DDE), respectively, are the ‘direction cosines’ toward the sky and . In this paper, will be used to denote only the primary beam (hereafter, only beam, which should not be confused with the point spread function of an array), neglecting other DDEs, such as the ionospheric, tropospheric and Faraday rotation effects.
The worrying assumption of traditional self-calibration, i. e. second generation calibration, is that DDEs are trivial, which implies each visibility is a measurement of the sky ‘coherency function’ corrupted by some multiplicative gains. In such a scenario, an interferometer array would measure the Fourier transform of one common sky. This assumption holds only if the DDEs are identical across all antennas and constant in time, which they are not. In the presence of the actual non-trivial DDEs, the observed visibilities would be a convolution of the ‘sky coherency’ and the DDEs (Smirnov, 2011b). In our case, the beam convolves the ideal visibilities with a different kernel per antenna and per time and frequency step and, hence, we get a different -plane for every baseline, time and frequency sample. Correcting for the beam effects is not trivial and still under much experimentation (for an example, see Bhatnagar et al., 2013; Tasse et al., 2018), but a clear pre-requisite step towards this complex task would be making an accurate beam model.
A simple approximate model can be created using, e. g., Gaussian or Jinc functions that can capture the basic features of the beam. However, such models cannot account for asymmetries in the shapes of the main and side lobes and the position and level of far-out sidelobes. Capturing such spectral effects will be crucial to properly characterise weak astrophysical signals. For instance, the foreground cleaning techniques in the intensity mapping of neutral Hydrogen (HI) during the epoch of reionization (Mellema et al., 2013) and also at lower redshifts (Santos et al., 2015) rely on the spectral smoothness of the diffuse Galactic foregrounds, which can be destroyed by such frequency effects if the beam is not properly modelled and corrected for.
More realistic models of the beam, in angular space, time and frequency, can be either phenomenological, based on the appearance of the far-field radiation pattern of an observing antenna, or physical, based on the actual engineering physics of the antenna (described briefly in Jagannathan et al., 2018, section 3.1). One of the most widely used phenomenological methods is ‘astro-holography’ (AH), where a holographic measurement of the far-field pattern of an antenna scanning an astronomical source is taken, by cross-correlating its measured voltages with that of another antenna tracking the same source (for more information see Morris et al., 1988; Cotton, 1994; Harp et al., 2011; Perley, 2016). Measurement of the beam directly through this method is usually called ‘beam holography’ and using the measured beam to model the figure of the reflecting surface of an antenna, through the Fourier relationship between the far-field pattern and the aperture illumination function, is usually called ‘dish holography’. The first use of AH in radio astronomy was aimed at the latter (Scott & Ryle, 1977). However, the basis of most physical approaches is the electromagnetic (EM) simulation of an antenna. EM simulations can predict the beam over a large field of view relatively easily, but it is computationally very expensive to produce EM models for each antenna of an array and for every frequency channel. In addition, it is more difficult to account for the temporal and environmental behaviour of the beam in EM simulations. On the other hand, beam models created from AH observations are more accurate and easier to obtain for multiple frequency channels and time samples, but they are usually restricted to smaller fields of view and angular resolution. Therefore, information from both AH observation and EM simulation can lead us to a better representation of the beam.
This paper is the second in a series of papers dealing with the modelling and effects of the primary beams of radio astronomy antennas taking into account both the physical and phenomenological approaches. The first paper presented the modelling of the Karl G. Jansky Very Large Array (VLA) beam from AH and compared these models to those created from EM simulations and physical considerations (Iheanetu et al., 2019, hereafter ‘Paper I’). It presented two different techniques of AH beam modelling—‘data-driven’ modelling using Principal Component Analysis (PCA) and ‘basis-driven’ modelling using Zernike polynomials (ZP). In this paper, we will present different approaches of modelling available AH measurements and EM simulations of MeerKAT, an SKA-MID precursor array located in South Africa. Here, we call the data-driven approach ‘characteristic’ and the basis-driven approach ‘analytic.’ Besides the PCA and ZP approaches, we will also demonstrate the use of spherical harmonics (SH). The characteristic and analytic basis models created from the AH observations will be compared with each other and also with the EM simulations. We will also present spectral modelling of the spatial coefficients using Discrete Cosine Transform (DCT).
A simpler and analytic primary beam model of MeerKAT is presented by Mauch et al. (2020) based on cosine-tapered field illumination (equation 3 in the paper).111Available at https://pypi.org/project/katbeam. They have compared this model with an azimuthal average of the AH beam of M. de Villiers (in preparation) and found a good match out to degrees from the phase centre. We have used the same AH beams and created a more elaborate asymmetric sparse model based upon it.
Section 2 gives a general introduction to MeerKAT and describes the AH observations and EM simulations used in this work. Section 3 describes the general formalism of the characteristic and analytic approaches of modelling the spatial shape of the beam. Here, we also discuss spectral modelling of the spatial coefficients and compare the different approaches. Finally, we end with the main conclusions of the paper and some remarks about our future work in Section 4. The Zernike-based model presented here is available through the openly accessible tool EIDOS.222https://github.com/ratt-ru/eidos
2 Beam measurement and simulation
MeerKAT is located in the Upper Karoo region of South Africa. It has 64 interlinked receptors among which, 48 are located in a core region of 1 km in diameter centred at South, East. The other 16 are located outside the core giving a maximum baseline of 8 km. Fig. 1 (left) shows a satellite picture of the MeerKAT location and Fig. 1 (right) shows a recent photograph of some of its antennas. The left and right panels of Fig. 2 show the distribution of the receptors outside and inside the core, respectively. We refer the readers to Jonas & MeerKAT Team (2016) and Camilo et al. (2018) for more information about MeerKAT.
Three receivers of MeerKAT are expected to cover three different bands of the radio spectrum, namely the UHF (580–1015 MHz), L (900–1670 MHz) and S (1750–3500 MHz) band. We will only focus on L-band because substantial AH observations have been carried out at these frequencies. Beams of all the 64 antennas have been measured at L-band, but we have the observations of only three antennas available for this work. On the other hand, the EM-simulated beams of MeerKAT have been created from the physical properties of a generic MeerKAT antenna and, hence, it is assumed to be same for all antennas. Because AH and EM beams rely on completely different methods and pipeline, one can be used to check the sanity of another. In the following subsections, we describe the two beams and use one to check the accuracy of the other. Antenna-to-antenna variation of the beam for the three antennas, yet another check of the sanity of the beam, is also discussed.
2.1 Astro-holographic observation
AH observations are available for all 64 antennas of MeerKAT. We have taken one such observation in which the nearest bright quasar 3C 273 (Schmidt, 1963) was scanned using 18 antennas and simultaneously tracked using another 36 antennas for 0.5 hours. The relevant observational parameters are given in Table 1. It is classified as an ‘astro’-holographic observation because it was targeted at an astronomical object; for an example of a holographic observation of the beam of a MeerKAT antenna using a satellite beacon instead of an astronomical source, see Jonas & MeerKAT Team (2016, figure 6).
After calibrating the observed data, the AH beam was extracted for three of the scanning antennas (M009, M012 and M015) using all the available tracking antennas. The raw noisy beam measurements are then de-noised via aperture-plane masking. Aperture-plane masking is performed relying on the Fourier relationship between the primary beam and the aperture illumination function (AIF) of an antenna. The measured beams are Fourier transformed to obtain the AIF, the Fourier modes lying outside the physical aperture plane are masked and, finally, the masked AIF is inverse-Fourier transformed to create a smooth de-noised beam. We have used the de-noised beam for all analyses in this paper.
For AH observations, the correlator is operated as in normal observing mode (while the antenna tracking strategy is necessarily different), and a set of visibilities is generated. The calibration process is broadly similar to that of a normal observation, but some important subtleties need to be taken into account (see, e. g., Paper I for an extended discussion). The data in this work was reduced using a combination of the standard MeerKAT Science Data Processing (SDP) pipeline and the MeerKAT holography tool KATHOLOG (M. de Villiers, in prep).
Before producing models from the AH measurements, we shift the beam centres at each frequency channel independently to remove the frequency-dependent offset, i. e. squint, of the beam centre from the pointing centre. The squint is different for each of the feeds of each antenna and it also varies with frequency. Therefore, for an accurate comparison between the AH and EM datasets, we remove squint from both of them. The squints were calculated by fitting 2D elliptical Gaussians on the beam measurements at each frequency.333The python module gaussfitter is used. The spectral behaviour of the squints corresponding to the element of the AH measurement is shown in Fig. 3. Note that the pointing offset averaged over all frequencies is related to the mechanical pointing error of an antenna, not with its optical properties. Therefore, we have calculated the squint after taking out the average pointing error. The resulting squint varies horizontally from low to middle frequencies of the L-band, but the variation at higher frequencies is in the vertical direction. We store the per-frequency per-antenna per-feed squint values and these can be added to the squint-less beam model at a later stage if desired.
The re-centred squint-less beam dataset has the shape () where is the number of frequency channels, the matrices represent the Jones elements, and correspond to the total number of pixels. The beam centre is always at the pixel position . We then normalise the beams with respect to the centre by dividing the complex Jones images by the complex Jones matrix formed by the central pixel, i. e. where are the pixel coordinates. Finally, we average the beam measurements of the three antennas to create an antenna-averaged beam because the EM model is created for a generic MeerKAT antenna and, hence, it would be more appropriate to compare this model with an averaged AH beam (the antenna-to-antenna variation is discussed in Section 2.3).
We refer to this re-centred normalised three-antenna-averaged AH beam over a diameter of 10 degrees as . It is effectively a squint-less differential beam measurement with respect to the centre. As an example, the squared amplitude of the AH beam at a particular frequency (1070 MHz) is shown in Fig. 4. The main lobe and the first three sidelobes in the diagonal elements and a cloverleaf pattern in the off-diagonal elements are visible in the figure.
The sidelobe levels can be seen more clearly in Fig. 5 where a 1D cut through the centre of the Stokes I beam (average of the diagonal Jones elements) is shown. The asymmetry of the beam is also clear from the difference between the horizontal (red solid line) and vertical (blue solid line) cuts. The first sidelobe level is more than 20 dB below the maximum and the second sidelobe more than 30 dB below. Cuts through the main-diagonal (green solid line) and the anti-diagonal (magenta solid line) of the average cross-polarization pattern is also shown. Cross-polarization power increases with radius and reach a maximum of dB in the central region of the cloverleafs. The corresponding cuts through the VLA beam are shown in the same figure (dashed lines) and we see that the first sidelobe level of VLA beam, at 1070 MHz, is almost an order magnitude higher than that of MeerKAT. However, their cross-polarization levels are comparable.
2.2 Electromagnetic simulation
MeerKAT primary beam has been simulated using the EM simulation software GRASP444https://www.ticra.com/software/grasp taking into account the principles of physical optics (PO) and the physical theory of diffraction (PTD). The GRASP simulations compare well with EM simulations performed using the MLFMM (multilevel fast multipole method) technique of FEKO.555https://altairhyperworks.com/product/FEKO The simulations are available for the whole L-band with a spectral resolution of 5 MHz and over the entire hemisphere. For the purpose of this paper, we restrict ourselves to within a diameter of 10 degrees. The frequency-dependent squints of the simulated beam are shown in Fig. 3 (shades of grey). Squint varies smoothly in a horizontal direction as one goes from low to high frequencies and this trend is similar to the AH measurements. We remove these squints and re-centre the EM models to the same pixel as the centre of the AH measurements and normalise them using the same convention. The re-centred and normalised EM dataset of shape (), where is the number of channels in the EM simulation, is hereafter referred to as .
The EM model at 1070 MHz is shown in the top row of Fig. 6 (top panel, top row) and the residuals after subtracting the model from the corresponding AH measurement is shown in the bottom row. The residuals have been multiplied by 100 for better visualisation. There is a low-level dipolar structure in the residuals and further investigation is needed to identify its cause which is not within the scope of this paper. However, it is certain that the diagonal Jones elements of and are much closer to each other than the off-diagonal elements. The good match in the diagonal and the relatively poor match in the off-diagonal element can be seen more clearly in the bottom panel of Fig. 6, where the radial profiles of the and elements of the AH and EM datasets and the corresponding residuals are shown.
Furthermore, the proximity between the observation and simulation can be quantified as a fractional difference with the normalised root-mean-square error (NRMSE) of the magnitude of the EM model with respect to that of the given AH measurement. We use NRMSE to show the overall (as the images are averaged over 10 degrees) similarity of the two dataset as shown in Fig. 7. It is defined as
[TABLE]
as a function of frequency. The NRMSE of the off-diagonal, i. e. cross-polarization, elements is, on average, around four times higher than that of the diagonal elements and the error increases with frequency. Again, to what extent this discrepancy is due to actual error of the EM models cannot be known for certain because the low-level off-diagonal elements are less well-known in the AH observations and also become more noisy at higher frequencies. The NRMSE of the diagonal elements increases smoothly with frequency, but that of the off-diagonal elements increases rapidly after around 1350 MHz. The outliers in this figure are caused by Radio Frequency Interference (RFI) and not due to any intrinsic effect of the model or the measurement. Note that the NRMSE is averaged over a diameter of 10 degrees and, hence, dominated by the errors near the nulls.
2.3 Accuracy of the beams
The AH observation and EM simulation are completely independent methods. The first derives from data while the other from solving differential equations. Therefore, we can compare them to check their relative consistency at various levels of accuracy. The EM simulation accounts for the mean features of the beam whereas AH observation helps representing the actual characteristics of the antennas. The NRMSE of the diagonal elements of is around 0.1 over 10 degrees which is very low considering the fact that there are as many as three nulls and sidelobes within this diameter; the error would be much lower within the main lobe. The good match is also evident in the radial profiles of the two beams shown in Fig. 6. Therefore, we can safely say that the observation and simulation match well with each other for the diagonal elements.
The case is very different for the off-diagonal (or cross-polarization) elements. The corresponding NRMSE is four times higher and can be more than 1.0 at higher frequencies. As mentioned above, this discrepancy could be due to the fact that the polarized observation was not calibrated properly. Due to the lack of available polarization-calibrated data and the lack of information regarding the accuracy of simulated cross-polarization beam, we are not considering the off-diagonal elements in our sparse beam model described in Section 3.
We use an AH beam averaged over three antennas, but the beam is known to vary from antenna to antenna. The right panel of Fig. 7 shows the NRMSE of the beams of the three antennas with respect to the averaged beam averaged over a diameter of 10 degrees for both the diagonal (below NRMSE = 0.1) and off-diagonal (above NRMSE = 0.1) elements. For the diagonal elements, the antenna-to-antenna NRMSE variation range from 0.001 to 0.007 across the band which is almost an order of magnitude lower than the NRMSE of the simulated beam with respect to (left panel). M009 and M012 show similar variations across the band, but the variations of the M015 antenna is higher. The variations in the off-diagonal terms are greater as expected and this gives us another reason for not trusting the polarization data at this stage. Even though the NRMSE of the M015-beam is almost a factor of 2 higher than the other two antennas, we can consider the averaged beam to be a good approximation for all antennas in this case because the NRMSE for all three antennas is considerably low. However, we caution the readers that we are talking about only three antennas and the scenario might change when all the antennas are considered.
In the following section, we present two different sparse models for the diagonal elements of the beam Jones matrix, one for the AH observation and the other for the EM simulation, both within a field of view of 10 degrees, and compare the two models at every stage. A single model combining information from both observation and simulation might be beneficial in some cases, as discussed in the introduction, but we have kept the models separate for now. Our main motivation behind modelling the two datasets in parallel was to compare the results of the noisy (observation) and noiseless (simulation) cases.
3 Beam modelling
The squint-less, normalised, differential AH measurement and the EM simulation are effectively the starting point for the main body of our work. We model the spatial beamshape using characteristic and analytic basis functions and the spectral behaviour using DCT. This creates a sparse representation of the given beam datasets that can be used in direction dependent calibration and imaging effectively.
3.1 Spatial modelling
The beamshape is modelled using two types of orthonormal bases: ‘characteristic basis’ (derived from PCA) and ‘analytic bases’ (Zernike polynomials and spherical harmonics). The method and results of the two approaches are described in the following two subsections. Here, we will focus on just one frequency channel, centred at MHz, because both and are sampled at this frequency, and the spectral behaviour of the spatial coefficients will be treated in the next section.
3.1.1 Characteristic basis
In this approach, the basis vector (eigenvector or eigenbeam) to describe a beam is created from the AH beam measurement itself, akin to the Characteristic Basis Function Pattern (CBFP) method (e. g. Maaskant & Ivashina, 2012; Mutonkole et al., 2016, and references therein). Following the method described in section 4.3 of Paper I, we have used Principal Component Analysis (PCA) to create such eigenbeams with the help of Singular Value Decomposition (SVD).
As opposed to the analytic bases, it is more appropriate to apply PCA on the spatio-spectral beam-cube, containing data for all the L-band frequency channels of and . We remove the RFI-affected channels from the AH measurement, stack and together and flatten the spatial dimensions to create a 4D array of shape , where is the total number of RFI-free frequency channels, and the total number of pixels in a beam image. For each complex Jones element, the matrix is decomposed into three complex matrices—, a unitary matrix, , a diagonal matrix whose diagonals contain all the singular values, and , a unitary matrix—using SVD.666The python module scipy.linalg.svd is used. Here, is the collection of our basis vectors or principal components (PC), i. e. eigenbeams, and contains the corresponding coefficients (for the relationship between SVD and PCA, see Jolliffe & Cadima, 2016, section 2a). The models for all frequencies of L-band are reconstructed from the eigenbeams and the coefficients as , where contains the strongest coefficients and the corresponding PCs.
The results over the full L-band will be presented in Section 3.2; here we focus on a single channel. The top row of Fig. 8 shows the reconstructed diagonal Jones elements of (first from the left; AH) and (second from the left; EM) where MHz.777In this paper, refers to a model created using the basis function for both AH observation and EM simulation whereas and refer to the AH and EM models created using the basis, individually. The corresponding residuals after subtracting the models from are shown below the models. Note that all the residual images are calculated from the actual amplitudes of the Jones elements, not the squared amplitudes, i. e. power. These models are created using strongest PCs. The NRMSE () of the model created from AH is shown in Fig. 9 as a function of number of modes used. We see that PCA can represent the beam measurements accurately with less than 15 components (red plus markers), but we have used 15 modes for a fair comparison with the models created using analytic bases.
The bottom panels of Fig. 8 show the radial profiles of the and Jones elements of the data (blue dots), the characteristic models (green line) and the residuals (red dots). Radial profiles are created by averaging the beam images in concentric circular annuli. In both the images of the top panels and the radial profiles of the bottom panels, we show the squared amplitude for the sake of consistency. The mean and standard deviation of the residuals are shown above the radial profile plots. If we take the square-root of these values, we see that with 15 PCs, we can reach an rms residual level of for the diagonal Jones elements for both the AH and EM beams.
Along with a faithful representation of the beam, one other advantage of using an orthogonal basis is the sparsity of the representation—the beam can be described using less information. The AH or EM datasets contained parameters for describing the full-Jones beam and although the characteristic basis model now needs parameters to represent the full-bandwidth full-polarization beam, the frequency behaviour of the PC coefficients can be modelled using parameters, as described in Sec. 3.2, further reducing the information-load. However, will still be a large number, especially if a bigger field of view or better resolution is desired. This is where the ‘analytic basis’ approach comes in. It can further sparsify the model by representing the spatial dimensions via 2D analytic functions calculated over unit disks.
3.1.2 Analytic basis
We test two types of analytic bases to represent and —Zernike polynomials (ZP; for an introduction, see Lakshminarayanan & Fleck, 2011, and Appendix A) and spherical harmonics (SH). Unlike PCA, the analytic basis models are created for each AH and EM frequency channel independently. The basic procedure is the same for both ZP and SH. If the analytic bases are denoted by for ZP and SH, respectively, and the corresponding coefficients by , then the decomposition of gives us the coefficients
[TABLE]
where T stands for transpose and + for the Moore-Penrose pseudoinverse.888Calculated using the python module numpy.linalg.pinv. These coefficients can be used to reconstruct a beam model as
[TABLE]
where denotes the strongest among the selected coefficients, and the weakest, and for ZP and SH. The results of the decomposition and reconstruction of the AH and EM datasets using these bases are presented below.
Zernike polynomial model:
ZP models created from the AH () and EM () datasets are shown in the top panels of Fig. 8, the former third from the left and the latter last from the left. Like the PCA model, the 15 strongest modes were used to reconstruct these. We keep using 15 coefficients because, as shown in Fig. 9 and described in more detail below, modelling error cannot be reduced substantially by including more coefficients in the case of a single frequency channel. By comparing the residual images of the two leftmost panels with those of the two rightmost panels of Fig. 8, we immediately see that PCs can represent the data more faithfully with the same number of modes. A comparison of the corresponding radial profiles reveals that the residual levels for and are, on average, an order of magnitude higher than those for and .
NRMSE of (difference between AH measurement and ZP representation) is shown in Fig. 9 (blue plus markers for and blue crosses for for comparison) as a function of number of modes used; , not shown here, follows a similar trend. After using the 15 strongest Zernike modes, the NRMSE of is and that of around 0.3; compare them to the corresponding NRMSE for (red plus and cross)—around 0.02 and 0.2. Comparing the blue and red markers, we see that the NRMSE for and decreases in a similar fashion as more modes are included, but they always maintain a difference of almost an order of magnitude. The difference is lower for the off-diagonal elements as both of them are dominated by noise; we do not include these cross-polarization terms in our final model.
ZP are real functions, therefore, we model both the real and imaginary parts using the real functions and store the resulting complex coefficients. The beam measurement and simulation were decomposed using the first 300 Zernike modes and the 15 strongest ones among them were selected for this particular reconstruction at 1070 MHz. As discussed before, the main reason for using analytic bases is to reduce the amount of information needed to represent the spatial structure. It is, therefore, imperative to look at the shapes of the Zernike modes needed for that representation. However, we want to find the strongest Zernike coefficients needed to model both the beam measurements and simulations for all frequencies and, hence, the shapes of the individual modes is discussed in Section 3.2.
Spherical harmonic model:
Unlike the ZP, the SH are complex valued although their imaginary parts do not have any zero-frequency component and for any order (or degree) , the frequency contains the average of all the harmonics. We decompose the observed complex beam using the first 256 complex SH (with a maximum of 15) and select the strongest coefficients. Unlike PCs and ZPs, SH’s cannot reconstruct the diagonal elements accurately with 15 modes and at least 40 modes are needed for a reasonable representation. Therefore, the the SH model , shown in Fig. 10 (left panel), has been created using 40 coefficients. Even with 40 modes, SH does not perform as well as PC and ZP—the residuals show a dipolar structure in the main lobe of the beam. This precludes using SH for modelling 10-degree beams across a wide bandwidth and, hence, in the next section, we will focus only on ZP and PC.
3.2 Spectral modelling
The two full-bandwidth sparse representations , based on PC, and , based on ZP, can be compressed even further if we model the spectral behaviour of the associated strongest coefficients in each case. PCs are calculated from the combined dataset of the full-bandwidth AH and EM beams, but ZP decomposition is performed at each frequency channel separately. To prepare a dataset for PCA, we removed the RFI-affected channels from AH measurements, as mentioned in Section 3.1.1, and created a new dataset containing only the non-contaminated AH channels and all the EM channels.
We used three different figures of merit to detect RFI-affected channels in the AH measurement: the rms of the beam images, the position of the peak in the diagonal elements and the difference between the energies of the ZP coefficients of adjacent channels. In the RFI-affected channels, rms is comparatively high and the beam is not exactly centred at the central pixel (128,128). The dataset created after removing the unusable channels based on these two criteria was used for PCA. However, these criteria could detect only the worst channels. To identify the remaining lower-level RFI, we used ZP coefficients . ZP decomposition was performed on all channels including the RFI-affected ones because, unlike PC, ZP coefficients of one channel do not depend on those of another. Then, we identified the channels for which as RFI-affected. Finally, we created a comprehensive list of RFI-affected channels and masked those channels in both and the PC coefficients .
3.2.1 Principal components
The most dominant PCs are shown in Fig. 11 as a function of frequency for the diagonal Jones elements of (dots) and (solid lines). AH measurements are available from 856 to 1712 MHz with a resolution of 0.83 MHz, but we show results up to the effective L-band edge, 1670 MHz. EM simulations are available from 900 to 1670 MHz with a resolution of 5 MHz. The top two panels in the figure show the real parts of and the bottom ones the imaginary parts. The first 5 real PCs of the diagonal elements of and match very well and their difference is only visible in the imaginary parts which are at a much lower level.
The amplitude of the strongest PC (red lines in Fig. 11) models the beamwidth as a function of frequency to a large extent. This amplitude decreases smoothly with frequency because beamwidth is proportional to , but it also exhibits a low-level frequency-dependent ripple, more clearly visible in the imarinary part (red lines in and ). The ripple of MeerKAT beam is described in more detail in Section 3.2.4.
3.2.2 Zernike coefficients
In order to select the most dominant Zernike coefficients , we calculate the average of the absolute value of the real and imaginary parts separately over all frequencies for both AH measurement and EM simulation. The 15 strongest Zernike modes selected from these datasets is shown in Fig. 13. The red circles and blue dots show the average energies of the coefficients for and , respectively. We see that the real part of the diagonal elements is mainly represented by the Zernike modes with an angular frequency , i. e. the spherical modes. The spherical modes are symmetric, but the beam has asymmetries which are represented by the astigmatism () and coma () modes. The imaginary part is also represented by the spherical, astigmatism and coma modes, but astigmatism is much more dominant than the spherical modes. The energies of and match very well in the real part as opposed to the imaginary part.
Spectral behaviour of the most dominant Zernike coefficients is presented in Fig. 12. Like Fig. 11, are plotted using dots and using solid lines. Similar to the PCs, the strongest Zernike coefficients for modelling the real part of the diagonal elements match very well between and and the ripple of the beamwidth is also clearly visible in the imaginary parts. Note that higher order spherical modes are needed for modelling higher frequency beams because number of sidelobes within the diameter increases with frequncy.
The spectral shape of the spatial coefficients is modelled using DCT as described below. We discuss spectral compression only for the ZP case because this is the basis we have used in our final models and ZPs can represent the beam using less information. The spectral behaviour of the PCs can be modelled using the same method if needed.
3.2.3 Discrete cosine transform
Only 15 ZP coefficients were needed to model the beam at 1070 MHz as shown in Fig. 8 and 9, but at least 20 coefficients are necessary for the full L-band. Therefore, we model the spectral behaviour of the 20 most dominant spectral coefficients , i. e. for AH and for EM. Before compressing the coefficients, we interpolate999The python module numpy.interp is used to perform a piecewise linear interpolation. their energies, for both AH and EM models, to channels from 900 to 1670 MHz with a resolution of 0.1 MHz. In case of , the coefficients for the RFI-affected channels were reconstructed by interpolation from the RFI-free ones and then the coefficients of channels were used to fill channels via piecewise linear interpolation. In case of , coefficients from the original channels are used to fill the channels.
Interpolated coefficients are then compressed via DCT. First, we decompose the coefficients using DCT of Type II as
[TABLE]
for where when and otherwise. We have seen that the number of resulting DCT coefficients that are more than 10 times higher than the noise level in the decomposition is usually around 30 for and 40 for . Therefore, we decided to keep 40 DCT coefficients for each Jones elements of the AH model and 30 for the EM model. needs more coefficients because it has noise.
If the resulting most dominant DCT coefficients are denoted by , we can reconstruct a de-noised smooth spectral model of the coefficients via Inverse DCT (same as DCT of Type III) as
[TABLE]
for . Both (thick coloured lines) and (thin white lines) are shown in Fig. 14 for both the AH (left panels) and EM (right panels) models. The sixth to tenth most dominant coefficients for modelling and are shown because the first five coefficients, already shown in Fig. 12, have even less error and are not representative of the rest of the coefficients.
From Fig. 14, we see that the smooth reconstruction of the spectral behaviour follows the original energies of the coefficients closely. DCT can also reconstruct the small-scale ripple on the coefficients to some extent, but further work is needed to model the ripple more accurately based on physical considerations.
We need only coefficients to represent the full L-band complex beam model of MeerKAT where is the number of DCT coefficients for modelling the spectral behaviour of each the ZP coefficients. To show the accuracy of the semi-analytic beam models created from these coefficients, provided in EIDOS, we reconstruct the models for 155 channels from 900 to 1670 MHz and calculate their NRMSE with respect to the given measurement and simulation. These errors are also compared with the corresponding error of the PC-based models.
Fig. 15 shows the NRMSE of the PC and ZP-based models for the and elements. The left and right panels show the errors for and , respectively. Although we have not described the modelling of the off-diagonal Jones elements, their NRMSE is included in this plot in order to compare them with the corresponding errors of the diagonal elements. For example, we see that the NRMSE of is almost the same as the NRMSE of which shows that PC models are usually much closer to a given data at the expense of modelling the noise. is almost an order of magnitude higher than that of which also reiterates the results presented in Fig. 9 for a single channel. However, we need much less information to reconstruct beam models using ZP and, hence, only Zernike coefficients are provided in EIDOS. Errors of the diagonal elements increase with frequency for both and because there are more sidelobes at higher frequencies making the modelling less optimal. Off-diagonal elements exhibit comparatively higher error, as expected. The outlier points in the left panel are caused by RFI which is present in the AH measurement, but not in our models.
By comparing the left and right panels of Fig. 15, one can see that PCs model the AH measurement and EM simulation with similar errors (blue and green markers), but ZPs exhibit higher error (red and magenta markers) in modelling the AH dataset. This is because the measurements have low-level noisy structure and PCs are more prone to modelling the noise than ZPs. Even though PCs are more faithful to a given data, representing all the structures and irregularities in an AH observation might not be desirable.
3.2.4 Beamwidth and ripple
As a final test of the ZP-based models, we compare the full width at half-maximum (FWHM or ‘beamwidth’) of the beam models with that of the original datasets as a function of frequency. To calculate the beamwidth, we fit 2D elliptical Gaussian functions to the datasets and models within the region where beam amplitude is more than . The same method was used to calculate squint as described in Section 2.1 and shown in Fig. 3. The resulting beamwidth has a horizontal (semi-major axis) and a vertical (semi-minor axis) component and they vary from the theoretical beamwidth (where denotes wavelength and m, the dish diameter). Fig. 16 shows (for the horizontal width only) as a function of frequency for the Stokes I beams ( element of the Mueller matrix) of the AH measurement (red dots), EM simulation (blue line), and the Zernike models created from the AH (red line) and EM (blue line) datasets. The ripple of is clearly visible here because is smooth. It is more prominent at the lower part of the band. The models follow the original datasets closely, although they diverge more at the upper part of the band. There is a clear division between the lower and upper parts of the L-band in terms of beamwidth and, hence, the band can be conveniently divided into two subbands: one before 1350 MHz, the other after it. If we model the lower and upper subbands separately, the models will be more accurate, albeit at the expense of increasing the amount of required information. For continuum science, a continuous full-bandwidth beam is more desirable, but for emission line studies, e. g. in case of HI intensity mapping, we can use a more accurate model of the beam created from the lower subband.
The ripple of the beamwidth is caused by the interference of the electric field diffracted from the secondary reflector with the electric field of the main beam (de Villiers, 2013). To show the amplitude of the MeerKAT ripple, we subtract a smooth third-order polynomial from calculated from the original AH and EM datasets. The resulting ripple for the lower subband is shown in the inset of Fig. 16 in arcmin units. The top and bottom panels show the horizontal () and vertical () ripples, respectively, and the AH and EM datasets are represented by the red and blue colours, respectively. is significantly larger than which can be interpreted as a frequency-dependent ‘beam squash’ (Heiles et al., 2001), i. e. the beam is not squeezed symmetrically but squashed along a preferred direction. The ripples predicted by the EM simulation match reasonably well with the AH measurements. The match is relatively poor in case of because it is at a lower level and, hence, more affected by the noise of the AH dataset.
4 Primary beam correction
We have provided the Zernike polynomials based model of the primary beam in the EIDOS package. The model is accurate for the diagonal elements of the Jones matrix. It is averaged over three antennas. In spite of these limitations, the model can already be used in primary beam correction of Stokes I images and in direction-dependent calibration. As an example, we show an application of the EIDOS primary beam using the DDFacet101010https://github.com/saopicc/DDFacet (Tasse et al., 2018) imaging software and compare the result with an image produced by WSCLEAN111111https://sourceforge.net/projects/wsclean/ (Offringa et al., 2014) without any model of the primary beam.
Fig. 17 demonstrates the application of the primary beam pattern during deconvolution of a wideband MeerKAT-16 (taken using 16 antennas of the MeerKAT array) image. Panels (b) and (d) show images produced by DDFacet with and without a primary beam model. In both cases, the underlying sky model being fitted during deconvolution is a power-law spectrum121212This uses the SSD deconvolution mode of DDFacet (i. e. two parameters, flux and spectral index, per model pixel). In case (b), this sky model is unable to account for the apparent spectral curvature induced by the primary beam, resulting in clear artefacts around brighter sources. In case (d), the imager is supplied with our MeerKAT primary beam model. The spectral effects of the beam are then accounted for properly, allowing the power-law spectrum sky model to fit the underlying emission better. The remaining artefacts are probably due to the unknown pointing errors. Panels (a) and (c) show images produced by WSCLEAN which deconvolves without any primary beam information. In case (a), the spectral model is a 1st-order polynomial (two parameters per model pixel), which also leaves substantial artefacts. In case (c), the spectral model is a 3rd-order polynomial (four parameters per model pixel), which is able to fully fit the spectral curvature induced by the beam. Although the image quality in case (c) and (d) is comparable, the former uses twice as many degrees of freedom in the deconvolution model (being forced to absorb primary beam effects into the sky model), while the latter is able to recover a more physical sky model, with primary beam effects being explicitly accounted for in the instrumental measurement equation.
5 Conclusion
This is the second in a series of papers dealing with primary beam modelling effects of radio astronomy antennas. The basic formalism was set in Paper I (Iheanetu et al., 2019) in the context of the VLA. In this paper, we extend that formalism and use it to create MeerKAT beam models from astro-holographic (AH; ) observations and EM simulations (). Our aim was to present a pipeline that is able to create sparse representations of the beam, given any dataset, that can be used for direction dependent calibration and imaging. This model can be called ‘eidetic’ because it reconstructs the given datasets faithfully. We have modelled the given (Fig. 4) and (Fig. 6) over a diameter of 10 degrees using characteristic and analytic basis functions, for all frequencies of L-band and without considering the cross-polarization, and compared the results. Here, is averaged over three antennas and is a generic simulation for an ideal antenna.
The diagonal elements of matches well with but their difference increases with frequency (Fig. 6 and 7). We have decomposed the two beam datasets using principal components (PC), Zernike polynomials (ZP) and spherical harmonics (SH). PCs and ZPs can represent the beam measurements and simulations very well with around 15 modes (compare the panels of Fig. 8), but 10-degree-beams cannot be modelled as well using the same number of SH modes (Fig. 10). Therefore, we focus on PCs and ZPs.
The coefficients corresponding to the most dominant PC and ZP modes are modelled in frequency using DCT. Therefore, a full-bandwidth full-polarization 10-degree beam model can be represented using coefficients where is the number of DCT coefficients needed to model the spectral behaviour of each of the PC or ZP coefficients. The first PC models the beamwidth as a function of frequency to a large extent (red line in the diagonal elements of Fig. 11) and the small-scale ripple of the width can be seen clearly in the imaginary part of this component. The spectral behaviour of the most dominant PC (Fig. 11) and ZP (Fig. 12) modes representing (dots) and (solid lines) matches well in the diagonal elements.
Beam models can be represented by analytic basis functions like ZPs using less information than the characteristic bases like PCs because for the latter, we also need to store the eigenbeams. By looking at the 15 most dominant ZP modes (Fig. 13), we see that the diagonal elements of or are modelled mainly by the symmetric spherical modes and the asymmetries in the beam are modelled by the coma and astigmatism modes.
We have been able to represent the spectral behaviour of the coefficients using 40 DCT coefficients for and 30 for (Fig. 14). Before DCT decomposition, the coefficients for the RFI-affected channels in are calculated by interpolating from the RFI-free channels. Through interpolation and DCT, we have calculated coefficients for all frequencies between 900 and 1670 MHz with a resolution of 0.1 MHz.
The resulting beam models calculated from the ZP coefficients have a residual error (after subtracting the model from data) of around corresponding to the power of the diagonal element (Fig. 8). For the electric field, the corresponding residual levels would be around . Comparatively, the PC-based models exhibit lower residuals (Fig. 8). The off-diagonal elements are less accurate and the errors of all the Jones elements increase with frequency, as seen in the trend of the normalised root-mean-square error (NRMSE) (Fig. 9 and 15). However, NRMSE should be used with caution because it is averaged over a diameter of 10 degrees and, hence, mostly dominated by errors in the nulls and noisy regions.
The antenna and frequency dependent squints of the beam (Fig. 3) were taken out before modelling, but they can be reintroduced by re-centring the model beams at a later stage if required. On the other hand, the squash effect (Fig. 16) is already incorporated in the Zernike coefficients; the beamwidth has a ripple of amplitude arcmin in the horizontal and arcmin in the vertical direction and it varies as a function of frequency with a period of MHz.
The ZP and corresponding DCT coefficients for representing and are provided in the EIDOS software tool that can be used to calculate MeerKAT eidetic beam models for any frequency of the L-band. The diagonal Jones terms have been modelled with a high degree of certainty; between the high signal-to-noise ratio of the AH measurements and the excellent match to EM, we do not expect that the diagonal model can be much improved (although further investigation into elevation dependence and antenna-to-antenna differences is certainly merited). The off-diagonal terms, however, need further refinement; the intrinsically high noise of AH suggests that these models can be improved with additional AH observations, and relatively poor match with EM simulation also needs to be further investigated. Therefore, we have not discussed the modelling of the off-diagonal elements in this paper.
The advantage of having a Zernike-based eidetic model of the beam is that it can be calculated quickly and instantaneously during calibration or beam-correction. For example, we have compared the quality of the images produced by applying our eidetic beam on a real observation using the DDFacet imager with that produced by WSCLEAN without any prior information of the beam. We have seen that WSCLEAN can produce images comparable in quality to that of DDFacet (Fig. 17), but only at the expense of twice as many degrees of freedom. Therefore, in spite of the limitations, the beam models given in EIDOS can already be used in primary beam correction of Stokes I images through DDFacet.
Our beam models can be used in MeerKAT-specific pipelines through the radio astronomy package STIMELA.131313https://github.com/SpheMakh/Stimela The VLA beam models presented in Paper I will also be included in this tool. In future papers, we will explore the spatial, spectral and polarization effects of these beams on continuum imaging and intensity mapping.
Acknowledgements
The MeerKAT telescope is operated by the South African Radio Astronomy Observatory (SARAO), which is a facility of the National Research Foundation (NRF), an agency of the Department of Science and Technology (DST) of South Africa. KMBA and MGS acknowledge support from SARAO and NRF (grant 84156) and DILV from NRF grant 75322. The research of OS and KI is supported by the South African Research Chairs Initiative of the DST and NRF. We thank Christopher Jackson Finlay (SARAO) for the parallelization script in EIDOS. We also thank Kavilan Moodley for useful suggestions.
Data availability
The beam models presented in this paper can be created using the EIDOS python package: https://github.com/ratt-ru/eidos. The underlying data is available in the SARAO Archive: https://archive.sarao.ac.za.
Appendix A Zernike polynomials
The Zernike polynomials of order and angular frequency can be written in polar coordinates as (following Born & Wolf, 1999, chapter IX, section 9.2.1)
[TABLE]
where and are non-negative integers with and is always even. The radial polynomials
[TABLE]
and the normalisation is such that for all permissible values of and , .
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Bhatnagar et al. (2013) Bhatnagar S., Rau U., Golap K., 2013, Ap J , 770, 91 · doi ↗
- 2Born & Wolf (1999) Born M., Wolf E., 1999, Principles of Optics , 7 edn
- 3Braun et al. (2015) Braun R., Bourke T., Green J. A., Keane E., Wagg J., 2015, in Advancing Astrophysics with the Square Kilometre Array (AASKA 14). p. 174
- 4Camilo et al. (2018) Camilo F., et al., 2018, Ap J , 856, 180 · doi ↗
- 5Cotton (1994) Cotton W. D., 1994, AIPS Memoranda, 86
- 6Hamaker et al. (1996) Hamaker J. P., Bregman J. D., Sault R. J., 1996, A&AS, 117, 137
- 7Harp et al. (2011) Harp G. R., et al., 2011, IEEE Transactions on Antennas and Propagation , 59, 2004 · doi ↗
- 8Heiles et al. (2001) Heiles C., et al., 2001, PASP , 113, 1247 · doi ↗
