Polca SARA - Full polarization, direction-dependent calibration and   sparse imaging for radio interferometry

Jasleen Birdi; Audrey Repetti; and Yves Wiaux

arXiv:1904.00663·astro-ph.IM·January 8, 2020

Polca SARA - Full polarization, direction-dependent calibration and sparse imaging for radio interferometry

Jasleen Birdi, Audrey Repetti, and Yves Wiaux

PDF

TL;DR

Polca SARA is a novel algorithm for radio interferometry that jointly performs full polarization, direction-dependent calibration, and sparse imaging, with proven convergence and improved imaging quality.

Contribution

It introduces a non-convex optimization method that guarantees global convergence and integrates sophisticated priors for calibration and imaging in radio interferometry.

Findings

01

Superior polarization constraint based imaging performance

02

Importance of calibrating direction-dependent effects and polarization leakage

03

Enhanced imaging quality with the proposed calibration approach

Abstract

New generation of radio interferometers are envisaged to produce high quality, high dynamic range Stokes images of the observed sky from the corresponding under-sampled Fourier domain measurements. In practice, these measurements are contaminated by the instrumental and atmospheric effects that are well represented by Jones matrices, and are most often varying with observation direction and time. These effects, usually unknown, act as a limiting factor in achieving the required imaging performance and thus, their calibration is crucial. To address this issue, we develop a global algorithm, named Polca SARA, aiming to perform full polarization, direction-dependent calibration and sparse imaging by employing a non-convex optimization technique. In contrast with the existing approaches, the proposed method offers global convergence guarantees and flexibility to incorporate sophisticated…

Tables2

Table 1. Table 1: Cygnus A (top two tables) and Hydra A (bottom two tables) results: SNR and Dynamic range values for different considered cases. For each of the reconstructed Stokes image, the shown values are the mean computed over 5 performed simulations for the cases with following considerations- for calibration step: known normalized DIEs, DIE calibration, no off-diagonal Jones terms calibration, and full Jones matrix DDE calibration; and for imaging step: without and with the enforcement of the polarization constraint.

	without polarization constraint				with polarization constraint
	Normalized	DIE	DDE calibration	DDE calibration	Normalized	DIE	DDE calibration	DDE calibration
	DIEs	calibration	w/o off-	of full Jones	DIEs	calibration	w/o off-	of full Jones
			diagonal terms	matrix			diagonal terms	matrix
Stokes $I$	19.4	21.4	34	34.7	18.4	20.2	33.6	37.4
Stokes $Q$	18.6	16.9	20.8	21.2	16.3	15.2	23.4	24.9
Stokes $U$	16.9	18	19.9	22.5	15.6	16.7	21.3	25.2
(a) SNR (in dB) values for Cygnus A
	without polarization constraint				with polarization constraint
	Normalized	DIE	DDE calibration	DDE calibration	Normalized	DIE	DDE calibration	DDE calibration
	DIEs	calibration	w/o off-	of full Jones	DIEs	calibration	w/o off-	of full Jones
			diagonal terms	matrix			diagonal terms	matrix
Stokes $I$	1.04	1.73	10.8	12.8	1.07	1.82	9.64	41.4
Stokes $Q$	3.31	3.43	4.79	4.94	2.59	2.16	6.79	9.61
Stokes $U$	2	2.14	2.62	3.38	0.55	0.87	1.39	19.8
(b) Dynamic range (DR) values for Cygnus A in units of $10^{4}$
	without polarization constraint				with polarization constraint
	Normalized	DIE	DDE calibration	DDE calibration	Normalized	DIE	DDE calibration	DDE calibration
	DIEs	calibration	w/o off-	of full Jones	DIEs	calibration	w/o off-	of full Jones
			diagonal terms	matrix			diagonal terms	matrix
Stokes $I$	18.3	20.6	24.9	25	18.3	20.7	31.2	33.2
Stokes $Q$	17.8	18.6	11.3	11.4	17.9	18.8	19.7	20.7
Stokes $U$	10.3	11.5	8.09	9.07	10.9	12.2	12.6	19.8
(c) SNR (in dB) values for Hydra A
	without polarization constraint				with polarization constraint
	Normalized	DIE	DDE calibration	DDE calibration	Normalized	DIE	DDE calibration	DDE calibration
	DIEs	calibration	w/o off-	of full Jones	DIEs	calibration	w/o off-	of full Jones
			diagonal terms	matrix			diagonal terms	matrix
Stokes $I$	0.82	1.54	2.35	2.45	0.82	1.49	6.43	15.9
Stokes $Q$	2.01	2.21	0.23	0.23	0.68	0.93	1.01	1.19
Stokes $U$	2.44	2.53	0.21	0.28	0.31	0.5	0.48	2.7
(d) Dynamic range (DR) values for Hydra A in units of $10^{4}$

Table 2. Table 2: Important variable notations and configurable algorithmic parameters for Polca SARA method (Algorithm 1 ).

$𝗦_{0}$	Stokes matrix containing the high amplitude,
	thresholded part of $\bar{𝗦}$ obtained with 1GC
$𝗘$	Stokes matrix containing the unknown part of $\bar{𝗦}$
${\bar{𝗗}}_{t, α}$	DDEs for antenna $α$ and time instant $t$
${\bar{𝗨}}_{α, 1} = {\bar{𝗨}}_{α, 2} = {\bar{𝗨}}_{α}$	Compact-support Fourier kernels corresponding
	to ${\bar{𝗗}}_{t, α}$
${\bar{𝗨}}_{1} (and {\bar{𝗨}}_{2})$	Concatenation of matrices ${({\bar{𝗨}}_{α, 1})}_{1 ⩽ α ⩽ n_{a}}$
	(and ${({\bar{𝗨}}_{α, 2})}_{1 ⩽ α ⩽ n_{a}})$ )
$K and P$	Support sizes in spatial and temporal
	Fourier domains of DDEs, respectively
$L^{(i)} and J^{(i)}$	Number of inner-loop FB iterations for DDEs and
	images updates, respectively
$𝜼 = [η_{1}, η_{2}, η_{3}]$	Regularization parameters for the images
$γ$	Regularization parameter for the DDEs
$ε_{𝖴} and ε_{𝖤}$	Stopping criterion for the DDEs and images
	update, respectively
$ε_{0}$	Global stopping criterion for Algorithm 1

Equations88

\displaystyle{\bm{\mathsf{Y}}}_{t,\alpha,\beta,\nu}=\int\overline{\bm{D}}_{t,\alpha,\nu}(\bm{l})\,\mathcal{L}\big{(}\overline{\bm{S}}_{t,\nu}(\bm{l})\big{)}\,\overline{\bm{D}}_{t,\beta,\nu}^{\dagger}(\bm{l}){\mathsf{e}}^{-2{\mathsf{i}}\pi\bm{u}_{t,\alpha,\beta}\cdot\bm{l}}d^{2}\bm{l},

\displaystyle{\bm{\mathsf{Y}}}_{t,\alpha,\beta,\nu}=\int\overline{\bm{D}}_{t,\alpha,\nu}(\bm{l})\,\mathcal{L}\big{(}\overline{\bm{S}}_{t,\nu}(\bm{l})\big{)}\,\overline{\bm{D}}_{t,\beta,\nu}^{\dagger}(\bm{l}){\mathsf{e}}^{-2{\mathsf{i}}\pi\bm{u}_{t,\alpha,\beta}\cdot\bm{l}}d^{2}\bm{l},

[{\mathsf{Y}}_{t,\alpha,\beta}]_{ii^{\prime}}=\displaystyle\sum_{j,j^{\prime}=1}^{2}\sum_{n=-N/2}^{N/2-1}\big{[}\overline{{\bm{\mathsf{D}}}}_{t,\alpha}(n)\big{]}_{ij}\big{[}\mathcal{L}\big{(}\overline{{\bm{\mathsf{S}}}}(n)\big{)}\big{]}_{jj^{\prime}}\big{[}\overline{{\bm{\mathsf{D}}}}^{*}_{t,\beta}(n)\big{]}_{j^{\prime}i^{\prime}}\\ {\mathsf{e}}^{-2{\mathsf{i}}\pi(k_{t,\alpha,\beta})\frac{n}{N}}+[{\Omega}_{t,\alpha,\beta}]_{ii^{\prime}},

Y = ϕ (S, U_{1}, U_{2}) + Ω,

Y = ϕ (S, U_{1}, U_{2}) + Ω,

E, U_{1}, U_{2} minimize h (E, U_{1}, U_{2}) + r (E) + p (U_{1}, U_{2}),

E, U_{1}, U_{2} minimize h (E, U_{1}, U_{2}) + r (E) + p (U_{1}, U_{2}),

[{\mathb{y}}_{t,\alpha,\beta}]_{q}=\displaystyle\sum_{z=1}^{4}\sum_{n=-N/2}^{N/2-1}[\overline{{\bm{\mathsf{M}}}}_{t,\alpha,\beta}(n)]_{qz}\big{[}\mathcal{\widetilde{L}}\big{(}\overline{{\bm{\mathsf{S}}}}(n)\big{)}\big{]}_{z}\\ {\mathsf{e}}^{-2{\mathsf{i}}\pi(k_{t,\alpha,\beta})\frac{n}{N}}+[{\omega}_{t,\alpha,\beta}]_{q}.

[{\mathb{y}}_{t,\alpha,\beta}]_{q}=\displaystyle\sum_{z=1}^{4}\sum_{n=-N/2}^{N/2-1}[\overline{{\bm{\mathsf{M}}}}_{t,\alpha,\beta}(n)]_{qz}\big{[}\mathcal{\widetilde{L}}\big{(}\overline{{\bm{\mathsf{S}}}}(n)\big{)}\big{]}_{z}\\ {\mathsf{e}}^{-2{\mathsf{i}}\pi(k_{t,\alpha,\beta})\frac{n}{N}}+[{\omega}_{t,\alpha,\beta}]_{q}.

y = Φ (\overline{S}) + ω,

y = Φ (\overline{S}) + ω,

Φ (\overline{S}) = G F Z L (\overline{S}),

Φ (\overline{S}) = G F Z L (\overline{S}),

S minimize r (S) subject to ∥Φ (S) - y ∥_{2} ⩽ ϵ,

S minimize r (S) subject to ∥Φ (S) - y ∥_{2} ⩽ ϵ,

\underset{\begin{subarray}{c}{{\bm{\mathsf{S}}}}\end{subarray}}{\mathrm{minimize}}\;\;\bar{h}\big{(}{\bm{\mathsf{S}}}\big{)}+r({\bm{\mathsf{S}}}),

\underset{\begin{subarray}{c}{{\bm{\mathsf{S}}}}\end{subarray}}{\mathrm{minimize}}\;\;\bar{h}\big{(}{\bm{\mathsf{S}}}\big{)}+r({\bm{\mathsf{S}}}),

(\forall S \in R^{2 \times 2 N}) \overset{ˉ}{h} (S) = \frac{1}{2} ∥Φ (S) - y ∥_{2}^{2} .

(\forall S \in R^{2 \times 2 N}) \overset{ˉ}{h} (S) = \frac{1}{2} ∥Φ (S) - y ∥_{2}^{2} .

r (S) = g (S) + r^{'} (S),

r (S) = g (S) + r^{'} (S),

\displaystyle g({\bm{\mathsf{S}}})=\|{\bm{\Psi}}^{\dagger}\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}\|_{{\bm{\mathsf{W}}},1}=\displaystyle\sum_{i=1}^{4}\sum_{j=1}^{J}{\mathsf{W}}_{j,i}\big{|}[{\bm{\Psi}}^{\dagger}\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}]_{j,i}\big{|}\,,

\displaystyle g({\bm{\mathsf{S}}})=\|{\bm{\Psi}}^{\dagger}\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}\|_{{\bm{\mathsf{W}}},1}=\displaystyle\sum_{i=1}^{4}\sum_{j=1}^{J}{\mathsf{W}}_{j,i}\big{|}[{\bm{\Psi}}^{\dagger}\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}]_{j,i}\big{|}\,,

\mathbb{P}=\big{\{}\widetilde{{\bm{\mathsf{S}}}}=\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}\in{\mathbb{R}}^{N\times 4}\,\big{|}\,(\forall n\in\{1,\ldots,N\})\\ -\widetilde{{\mathsf{S}}}_{n,1}+\|\widetilde{{\bm{\mathsf{S}}}}_{n,2:4}\|_{2}\leqslant 0\big{\}}.

\mathbb{P}=\big{\{}\widetilde{{\bm{\mathsf{S}}}}=\mathcal{R}\big{(}{\bm{\mathsf{S}}}\big{)}\in{\mathbb{R}}^{N\times 4}\,\big{|}\,(\forall n\in\{1,\ldots,N\})\\ -\widetilde{{\mathsf{S}}}_{n,1}+\|\widetilde{{\bm{\mathsf{S}}}}_{n,2:4}\|_{2}\leqslant 0\big{\}}.

r^{'} (S) = ι_{P} (S) .

r^{'} (S) = ι_{P} (S) .

ι_{P} (S) = {0, if S \in P, + \infty, otherwise .

ι_{P} (S) = {0, if S \in P, + \infty, otherwise .

{\bm{\mathsf{Y}}}_{t,\alpha,\beta}=\mathcal{D}_{t}(\overline{{\bm{\mathsf{U}}}}_{\alpha})\,\mathcal{X}_{t,\alpha,\beta}\big{(}{\bm{\mathsf{F}}}\,\widetilde{{\bm{\mathsf{Z}}}}\,\mathcal{\widetilde{L}}(\overline{{\bm{\mathsf{S}}}})\big{)}\,\mathcal{D}_{t}^{\prime}(\overline{{\bm{\mathsf{U}}}}_{\beta})+\bm{\Omega}_{t,\alpha,\beta},

{\bm{\mathsf{Y}}}_{t,\alpha,\beta}=\mathcal{D}_{t}(\overline{{\bm{\mathsf{U}}}}_{\alpha})\,\mathcal{X}_{t,\alpha,\beta}\big{(}{\bm{\mathsf{F}}}\,\widetilde{{\bm{\mathsf{Z}}}}\,\mathcal{\widetilde{L}}(\overline{{\bm{\mathsf{S}}}})\big{)}\,\mathcal{D}_{t}^{\prime}(\overline{{\bm{\mathsf{U}}}}_{\beta})+\bm{\Omega}_{t,\alpha,\beta},

\underset{\begin{subarray}{c}{{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}}\end{subarray}}{\mathrm{minimize}}\;\;\widetilde{h}\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}+p({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}),

\underset{\begin{subarray}{c}{{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}}\end{subarray}}{\mathrm{minimize}}\;\;\widetilde{h}\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}+p({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}),

\displaystyle\widetilde{h}\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}

\displaystyle\widetilde{h}\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}

= 1 ⩽ α ⩽ n_{a} \sum \frac{1}{2} ∥ G_{α, 2} (U_{α, 2}) - Y_{α} ∥_{F}^{2},

p\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}=\gamma\|{\bm{\mathsf{U}}}_{1}-{\bm{\mathsf{U}}}_{2}\|_{F}^{2}+\iota_{\mathbb{D}}\big{(}{\bm{\mathsf{U}}}_{1}\big{)}+\iota_{\mathbb{D}}\big{(}{\bm{\mathsf{U}}}_{2}\big{)},

p\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}=\gamma\|{\bm{\mathsf{U}}}_{1}-{\bm{\mathsf{U}}}_{2}\|_{F}^{2}+\iota_{\mathbb{D}}\big{(}{\bm{\mathsf{U}}}_{1}\big{)}+\iota_{\mathbb{D}}\big{(}{\bm{\mathsf{U}}}_{2}\big{)},

\underset{\begin{subarray}{c}{{\bm{\mathsf{E}}},\,{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}}\end{subarray}}{\mathrm{minimize}}\;\;h\big{(}{\bm{\mathsf{E}}},{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}+r({\bm{\mathsf{E}}})+p\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)},

\underset{\begin{subarray}{c}{{\bm{\mathsf{E}}},\,{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}}\end{subarray}}{\mathrm{minimize}}\;\;h\big{(}{\bm{\mathsf{E}}},{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}+r({\bm{\mathsf{E}}})+p\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)},

\displaystyle h\big{(}{\bm{\mathsf{E}}},{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}

\displaystyle h\big{(}{\bm{\mathsf{E}}},{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}

= 1 ⩽ α ⩽ n_{a} \sum \frac{1}{2} ∥ G_{α, 1} (U_{α, 1}) - Y_{α} ∥_{F}^{2},

= 1 ⩽ α ⩽ n_{a} \sum \frac{1}{2} ∥ G_{α, 2} (U_{α, 2}) - Y_{α} ∥_{F}^{2},

r (E) = g (E) + ι_{P} (S_{0} + E) + ι_{K} (E),

r (E) = g (E) + ι_{P} (S_{0} + E) + ι_{K} (E),

\mathbb{K}=\big{\{}{\bm{\mathsf{E}}}\in{\mathbb{R}}^{2\times 2N}|(\forall n\in{\mathbb{S}}_{0})\,\,{\mathsf{E}}(n)\in[-\vartheta\,{\mathsf{S}}_{0}(n),\vartheta\,{\mathsf{S}}_{0}(n)]\,\big{\}},

\mathbb{K}=\big{\{}{\bm{\mathsf{E}}}\in{\mathbb{R}}^{2\times 2N}|(\forall n\in{\mathbb{S}}_{0})\,\,{\mathsf{E}}(n)\in[-\vartheta\,{\mathsf{S}}_{0}(n),\vartheta\,{\mathsf{S}}_{0}(n)]\,\big{\}},

\mathbb{K}^{\prime}=\big{\{}{\bm{\mathsf{E}}}\in{\mathbb{R}}^{2\times 2N}|(\forall n\in{\mathbb{S}}_{0})\,\,{\mathsf{E}}(n)\in[-\vartheta\,{\mathsf{S}}_{0}(n),\vartheta\,{\mathsf{S}}_{0}(n)]\,,\\ (\forall n\not\in{\mathbb{S}}_{0})\,\,[{\mathsf{E}}(n)]_{1,1}\geqslant 0\big{\}}.

\mathbb{K}^{\prime}=\big{\{}{\bm{\mathsf{E}}}\in{\mathbb{R}}^{2\times 2N}|(\forall n\in{\mathbb{S}}_{0})\,\,{\mathsf{E}}(n)\in[-\vartheta\,{\mathsf{S}}_{0}(n),\vartheta\,{\mathsf{S}}_{0}(n)]\,,\\ (\forall n\not\in{\mathbb{S}}_{0})\,\,[{\mathsf{E}}(n)]_{1,1}\geqslant 0\big{\}}.

r (E) = g (E) + ι_{K^{'}} (E) .

r (E) = g (E) + ι_{K^{'}} (E) .

prox_{f} (S) = V \in R^{2 \times 2 N} argmin f (V) + \frac{1}{2} ∥ V - S ∥_{F}^{2} .

prox_{f} (S) = V \in R^{2 \times 2 N} argmin f (V) + \frac{1}{2} ∥ V - S ∥_{F}^{2} .

P_{E} (S) = V \in E argmin ∥ V - S ∥_{F}^{2} .

P_{E} (S) = V \in E argmin ∥ V - S ∥_{F}^{2} .

\Gamma_{q}^{(i)}=\big{(}\zeta_{q,\alpha}^{(i)}\,\bm{1}_{2\times 2K\times P}\big{)}_{1\leqslant\alpha\leqslant n_{a}},

\Gamma_{q}^{(i)}=\big{(}\zeta_{q,\alpha}^{(i)}\,\bm{1}_{2\times 2K\times P}\big{)}_{1\leqslant\alpha\leqslant n_{a}},

0 < σ^{(i)} < 1/∥Φ ∥_{2},

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Polca SARA - Full polarization, direction-dependent calibration and sparse imaging for radio interferometry

Jasleen Birdi,1 Audrey Repetti,1,2 and Yves Wiaux1

1 Institute of Sensors, Signals and Systems, Heriot-Watt University, Edinburgh EH14 4AS, UK

2 Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh EH14 4AS, UK E-mail: [email protected]

(Accepted XXX. Received YYY; in original form ZZZ)

Abstract

New generation of radio interferometers are envisaged to produce high quality, high dynamic range Stokes images of the observed sky from the corresponding under-sampled Fourier domain measurements. In practice, these measurements are contaminated by the instrumental and atmospheric effects that are well represented by Jones matrices, and are most often varying with observation direction and time. These effects, usually unknown, act as a limiting factor in achieving the required imaging performance and thus, their calibration is crucial. To address this issue, we develop a global algorithm, named Polca SARA, aiming to perform full polarization, direction-dependent calibration and sparse imaging by employing a non-convex optimization technique. In contrast with the existing approaches, the proposed method offers global convergence guarantees and flexibility to incorporate sophisticated priors to regularize the imaging as well as the calibration problem. Thus, we adapt a polarimetric imaging specific method, enforcing the physical polarization constraint along with a sparsity prior for the sought images. We perform extensive simulation studies of the proposed algorithm. While indicating the superior performance of polarization constraint based imaging, the obtained results also highlight the importance of calibrating for direction-dependent effects as well as for off-diagonal terms (denoting polarization leakage) in the associated Jones matrices, without inclusion of which the imaging quality deteriorates.

keywords:

techniques: interferometric – techniques: polarimetric – techniques: image processing – techniques: calibration

††pubyear: 2019††pagerange: Polca SARA - Full polarization, direction-dependent calibration and sparse imaging for radio interferometry–2

1 Introduction

In radio interferometry (RI), the acquired measurements are related to the Fourier coefficients of the sky brightness distribution under observation (Thompson et al., 2001). These sampled frequency points in the Fourier domain are dictated by the separations between the antenna pairs constituting the interferometer. A finite number of antennas within an interferometer leads to under-sampling of the Fourier domain associated with the sought images. Given such measurements, the task to recover the images of interest can be performed by using a suitable imaging technique in RI. In fact, very often, the target radio sources generate polarized radiations (Ginzburg & Syrovatskii, 1965). Study of these polarized emissions is crucial for various scientific goals (e.g. to determine the magnetic field distributions around the source and along the signal’s propagation path (Pacholczyk, 1970)), offering a way to probe the details in addition to those provided by the total intensity alone. For this purpose, Stokes parameters- $I,Q,U$ and $V$ provide a representation of the sky intensity distribution, where the first parameter denotes the total intensity whereas the others characterize the polarization state of the radio emissions. In particular, while the linear polarization is described by the Stokes parameters $Q$ and $U$ , the circular polarization is denoted by Stokes $V$ . Conventionally speaking, although the main focus has been to develop algorithms for total intensity imaging, for the purpose of imaging these parameters, referred to as polarimetric imaging, the same techniques as developed for Stokes $I$ imaging have been applied for imaging each of the Stokes parameters.

In practice, rather than providing an accurate representation of the true sky distribution, the RI measurements are often corrupted both by the atmospheric effects encountered by the radio emissions en route to the receiving antennas and by the antenna-based errors. These measurement-contaminating effects are represented in the form of Jones matrices (Smirnov, 2011). Only in the cases when these effects are absent or known accurately, image recovery amounts to use of any RI imaging algorithm. In practical cases, where these effects are unknown, these need to be estimated along with imaging to achieve good quality reconstruction. Calibration is this process of estimation of the Jones matrices. Moreover, these time-variable calibration terms in general also exhibit direction dependency, thereby encompassing both direction-independent and -dependent effects (DIEs and DDEs, respectively). For instance, while the antenna-based gains denote the DIEs, the antenna primary beam pattern, atmospheric phase delays etc. exhibit spatial variation within the field of view and thus constitute DDEs. Historically, only the DIE calibration has been performed (Thompson et al., 2001). However, the calibration of DDEs has become a crucial issue especially for the new generation radio telescopes including the LOw Frequency ARray (LOFAR) (http://www.lofar.org) and the upcoming Square Kilometre Array (SKA) (https://www.skatelescope.org), aiming to produce sky images at unprecedented resolution with high sensitivity. Without efficient DDE calibration, these telescopes cannot be exploited to their fullest potential. The incorporation of the DDEs in the calibration process is also essential to produce high quality images without limiting their dynamic ranges (Bhatnagar et al., 2013). Thus, the resultant problem of image recovery from the acquired corrupted data consists in estimating both the sought images and the DDE calibration terms. We now review the imaging and calibration techniques in RI, which can then be combined to solve the overall problem.

In the context of RI imaging techniques, the standard clean algorithm implements a non-linear, greedy approach to perform deconvolution in an iterative manner (Högbom, 1974). Working pixelwise, clean implicitly assumes sparsity of the sought image, removing at each iteration, a fraction of the maximum intensity pixel convolved with the dirty beam from the computed residual image. Over the past years, many variants of this celebrated algorithm have been developed, for instance, its multi-scale version (Cornwell, 2008). In addition to these clean based approaches, recent developments in the field of compressive sensing (CS) applied for astronomical imaging have given birth to many imaging algorithms, particularly for radio interferometry (Wiaux et al., 2009; Carrillo et al., 2012; Garsden et al., 2015; Onose et al., 2016). Leveraging optimization framework, these methods aim to solve the underlying image recovery problem by enforcing sparsity of the image of interest in some suitable domain. As a matter of fact, these techniques have shown the potential to surpass the image reconstruction quality obtained by clean based approaches (Carrillo et al., 2014; Onose et al., 2016; Pratley & Johnston-Hollitt, 2016; Onose et al., 2017; Dabbech et al., 2018). While the aforementioned methods have been developed mainly for Stokes $I$ imaging, as previously mentioned these can be extended to polarimetric imaging by following the same imaging approach for all the Stokes parameters. In the context of sparsity regularized approaches, one such extension for polarimetric imaging has been presented in Akiyama et al. (2017), promoting sparsity of each of the Stokes parameters using $\ell_{1}$ norm combined with total variation (TV) regularization (Rudin et al., 1992). A point worth noting here is that the above mentioned approaches, whether clean and its variants or the sparsity regularized method, solve for the Stokes parameters totally independently. However, these images are physically linked via polarization constraint, that is the polarized intensity cannot be greater than the total intensity. Within the CS framework, this constraint has been exploited by a recently proposed approach, namely Polarized SARA, estimating jointly the Stokes parameters (Birdi et al., 2018a, b). This approach has been shown to provide better reconstruction quality in comparison with the case when the constraint is not accounted for.

Concerning the calibration problem, it can be formulated as a non-linear least squares problem and to solve it, most of the existing solvers employ gradient-based techniques like the Levenberg-Marquardt (LM) algorithm (Moré, 1978). But it suffers from its high computational cost. To overcome this issue, an efficient and fast solver named StEFCal has been proposed in Salvini & Wijnholds (2014), which relies on an alternating direction implicit method. However, it solves only for the DIEs. In the context of DDE calibration, several methods have been proposed in the last years. In particular, the Source Peeling and Atmospheric Modeling (SPAM) method proposed in Intema et al. (2009) models the ionosphere as a phase screen, whose parameters are estimated by fitting the calibrated phases obtained from peeling (Noordam, 2004) with the modelled phases, using the LM algorithm. Subsequently, this model can be used for the prediction of correction phases in any arbitrary desired direction within the field of view. To apply these corrections, SPAM relies on a facet-based approach, applying the DDE calibration solution for the facet centre, defined by a bright source or approximate centre of a cluster of closely located bright sources, to the whole facet. This scheme works under the assumption that the DDEs show smooth variation over the field of view, thereby having (approximately) constant value across a given facet. This approach has analogies with the facet calibration which consists in partitioning the sky into a number of facets and solving for the DDE solution of the facet centre using a self-calibration (SelfCal) loop, which in turn is applied to the whole facet (Van Weeren et al., 2016). Despite some similarities, these two approaches are different in the way they obtain the calibration solutions, i.e. unlike SPAM, facet calibration does not make use of global phase screen model. In contrast to these approaches solving for the calibration solutions source by source (or in terms of facets), SAGE calibration algorithm solves for the calibration solutions for all sources at once, employing an extension of the Expectation Maximization algorithm (Yatawatta et al., 2009; Kazemi et al., 2011). It therefore provides increased convergence speed in comparison with the traditional LM solvers. It is to be noted that within a given iteration, SAGE algorithm requires to define partitioning over the unknown Jones parameter space depending on the source characteristics. Another approach for DDE calibration arises from the convolutional nature of the DDEs. More precisely, the multiplication of the DDEs in the image domain can be equivalently seen as their convolution in the Fourier domain. The A-Projection algorithm makes use of the latter characteristic and offers a way to correct for those known DDEs whose associated Jones matrix are (approximately) unitary, by embedding them into the forward (degridding) and backward (gridding) operators (Bhatnagar et al., 2013). While this approach was only developed for the correction of known DDEs, it has been incorporated in the Pointing SelfCal algorithm to solve for antenna pointing errors (Bhatnagar & Cornwell, 2017).

In order to estimate the sky model while having imperfect knowledge of calibration terms, any of the above mentioned calibration algorithms is generally combined with clean (or its variants) for image recovery step. As such, the global algorithm fails to have any convergence guarantees. The first step in the direction of addressing this issue has been made by the joint calibration and imaging algorithm proposed in Repetti et al. (2017). This algorithm leverages non-convex optimization techniques to present a global algorithm which alternates between the estimation of the image of interest and the DDEs, while benefitting from the convergence guarantees. In addition to this, unlike the approaches requiring sky partitioning, which may not always be best achieved in an automatic manner, this technique works on the whole image all at once with minimal user intervention. However, this approach has been developed only for Stokes $I$ imaging and calibration, without dealing with the full polarization model. As a matter of fact, in the case of full Stokes imaging and calibration, even the global algorithm consisting of combination of clean based imaging and any of the previously mentioned calibration techniques, do not adopt any polarimetric imaging specific approach. These factors raise the scope for the development of a globally convergent algorithm which not only incorporates full polarization model, but also uses advanced approaches specific for full Stokes imaging to produce good quality images.

The work presented here lies within this scope and we propose Polca SARA - a full polarization, calibration and sparse imaging algorithm with proven convergence guarantees. In particular, we build our method on the approach developed in Repetti et al. (2017), and later on extended in Repetti & Wiaux (2017) and Thouvenin et al. (2018), assuming spatial and temporal smoothness of the DDEs. On the one hand, we generalize this approach to the full polarization model, developing an algorithm alternating between the estimation of the DDEs and the Stokes images. On the other hand, thanks to the underlying non-convex optimization technique, the proposed approach can deal with sophisticated priors suited to the images under consideration as well as to the DDEs. Leveraging this flexibility, we adapt Polarized SARA method, specifically designed for Stokes imaging, to be used for imaging step in the considered case.

The remainder of the article is organised as follows. Section 2 describes the full polarization model encountered in RI. The proposed approach to solve the underlying joint calibration and imaging problem is presented briefly in Section 3. A more formal and mathematical presentation of the proposed approach is provided in later sections. Particularly, the imaging problem to be solved is posed in Section 4, followed by the introduction of the calibration problem in Section 5. A blend of these problems to give the resultant joint calibration and imaging problem and the proposed algorithm to solve it are detailed in Section 6. To assess the performance of the proposed algorithm, we perform various tests which are presented in Section 7. We give the concluding remarks and discuss about possible directions of future work in Section 8.

2 Full polarization observation model

The radio interferometric measurements, termed as visibilities, are acquired by antenna pairs. Within $n_{a}$ number of total antennas, consider an antenna pair $(\alpha,\beta)\in\{1,\ldots,n_{a}\}^{2}$ , at time instant $t\in\{1,\ldots,T\}$ , associated with the baseline components $(u_{t,\alpha,\beta},v_{t,\alpha,\beta},w_{t,\alpha,\beta})$ expressed in observation wavelength units. Here, $\bm{u}_{t,\alpha,\beta}=(u_{t,\alpha,\beta},v_{t,\alpha,\beta})$ represents the components in the plane normal to the target source direction, while $w_{t,\alpha,\beta}$ is the component in the source direction. Furthermore, let $\overline{\bm{S}}=\bigl{[}\begin{smallmatrix}I&Q\\ U&V\end{smallmatrix}\bigr{]}$ encompasses the Stokes parameters of the target sky area, described in terms of the direction cosines $(l,m,n)$ with $\bm{l}=(l,m)$ lying in the tangential plane to the celestial sphere and $n(\bm{l})=\sqrt{1-l^{2}-m^{2}}$ lying in the line of sight. Then, the measurements ${\bm{\mathsf{Y}}}_{t,\alpha,\beta,\nu}\in{\mathbb{C}}^{2\times 2}$ made by the antenna pair $(\alpha,\beta)$ at time instant $t$ and observation frequency $\nu$ is given by the radio interferometric measurement equation (RIME) (Hamaker et al., 1996; Smirnov, 2011) as follows

[TABLE]

where $(\cdot)^{\dagger}$ denotes the Hermitian conjugate of its argument and $\mathcal{L}\big{(}\overline{\bm{S}}_{t,\nu}(\bm{l})\big{)}=\overline{\bm{B}}_{t,\nu}(\bm{l})$ is the brightness matrix formed by the application of the linear operator $\mathcal{L}$ on the components of the Stokes matrix $\overline{\bm{S}}$ . For instance, considering linear feeds, the brightness matrix is given by $\overline{\bm{B}}_{t,\nu}(\bm{l})=\biggl{[}\begin{smallmatrix}I+Q&U+\text{i}V\\ U-\text{i}V&I-Q\end{smallmatrix}\biggr{]}_{t,\nu}(\bm{l})$ , whereas for circular feeds, it reads as $\overline{\bm{B}}_{t,\nu}(\bm{l})=\biggl{[}\begin{smallmatrix}I+V&U+\text{i}Q\\ U-\text{i}Q&I-V\end{smallmatrix}\biggr{]}_{t,\nu}(\bm{l})$ . Furthermore, in equation (1), for every antenna $\alpha\in\{1,\ldots,n_{a}\}$ , the direction-independent and -dependent effects (DIEs and DDEs, respectively) at time $t$ and frequency $\nu$ are encoded in the Jones matrix $\overline{\bm{D}}_{t,\alpha,\nu}$ . It is worth mentioning that equation (1) presents a general case where the Stokes images and the DIEs/DDEs can vary with respect to the observation frequency and time. However, in the current work, we consider the Stokes images without having any time or frequency dependence. In addition, for the DIEs/DDEs, we deal with their temporal dependency considered at a single observation frequency. Hence, hereafter, we take $\overline{\bm{S}}_{t,\nu}(\bm{l})=\overline{\bm{S}}(\bm{l})$ and drop the frequency index from the other variables.

To recover the Stokes parameters from the given measurements, we formulate the inverse problem (1) in the discrete domain. It amounts to sampling the continuous variables such that the field of view is discretized into a grid, with the vectorized form of this grid being represented by the index $n\in\{-N/2,\ldots,N/2-1\}$ . Within this representation, we thus have the Stokes matrix $\overline{{\bm{\mathsf{S}}}}\in{\mathbb{R}}^{2\times 2N}$ , and for each antenna $\alpha$ and time instant $t$ , the DDEs $\overline{{\bm{\mathsf{D}}}}_{t,\alpha}\in{\mathbb{C}}^{2\times 2N}$ . As illustrated in Fig. 1, these matrices can also be seen as $2\times 2$ block matrices111For any such matrix $\overline{{\bm{\mathsf{S}}}}$ , $\overline{{\bm{\mathsf{S}}}}(n)$ denotes the $2\times 2$ matrix consisting of the $n^{\text{th}}$ elements of each of the blocks in the parent block matrix $\overline{{\bm{\mathsf{S}}}}$ . Furthermore, for $(i,i^{\prime})\in\{1,2\}^{2}$ , $[\overline{{\bm{\mathsf{S}}}}(n)]_{ii^{\prime}}$ refers to the $n^{\text{th}}$ element of the row vector block contained in the $i^{\text{th}}$ row and ${i^{\prime}}^{\text{th}}$ column of the argument block matrix., whose each block is a row vector of dimension $N$ . In particular, for the Stokes matrix $\overline{{\bm{\mathsf{S}}}}=\bigl{[}\begin{smallmatrix}\overline{\bm{s}}_{1}&\overline{\bm{s}}_{2}\\ \overline{\bm{s}}_{3}&\overline{\bm{s}}_{4}\end{smallmatrix}\bigr{]}$ , the elements $\overline{\bm{s}}_{1},\overline{\bm{s}}_{2},\overline{\bm{s}}_{3}$ and $\overline{\bm{s}}_{4}$ , each of size $N$ , are respectively the discretizations of the Stokes parameters $I,Q,U$ and $V$ . On the other hand, for the DDEs, for each index $n$ , we have the so called $2\times 2$ Jones matrix $\overline{{\bm{\mathsf{D}}}}_{t,\alpha}(n)\in{\mathbb{C}}^{2\times 2}$ . The diagonal elements of this matrix correspond to the antenna’s complex voltage pattern, whereas the off-diagonal elements account for the polarization leakage terms arising from instrumental leakage, transmission effects, etc (Bhatnagar et al., 2013). Moreover, if for every $(i,i^{\prime})\in\{1,2\}^{2}$ , $[\overline{{\bm{\mathsf{D}}}}_{t,\alpha}]_{ii^{\prime}}=\delta_{t,\alpha}\bm{1}_{N}$ with $\delta_{t,\alpha}\in{\mathbb{C}}$ and $\bm{1}_{N}$ being $N$ dimensional unitary row vector, then $\overline{{\bm{\mathsf{D}}}}_{t,\alpha}$ reduces to a DIE.

Following the introduced notations, each component of the visibility matrix ${\bm{\mathsf{Y}}}_{t,\alpha,\beta}$ , indexed by $[\cdot]_{ii^{\prime}}$ with $(i,i^{\prime})\in\{1,2\}^{2}$ , at discrete spatial frequency $k_{t,\alpha,\beta}$ can be represented as

[TABLE]

where $(\cdot)^{*}$ stands for complex conjugation of its argument and the visibilities are corrupted by an additive Gaussian noise ${\bm{\Omega}}_{t,\alpha,\beta}\in{\mathbb{C}}^{2\times 2}$ . For $n\in\{-N/2,\ldots,N/2-1\}$ , as for the continuous version (1), $\mathcal{L}\big{(}\overline{{\bm{\mathsf{S}}}}(n)\big{)}=\overline{{\bm{\mathsf{B}}}}(n)\in{\mathbb{C}}^{2\times 2}$ is the brightness matrix.

3 A brief overview of the proposed method

While a more formal description of the proposed approach is presented in the later sections, this section is designed to give an intuitive explanation of the approach. Particularly, the description in this section is targeted to facilitate readers to understand the results and conclusion sections without delving into the mathematical details provided in Sections 4- 6.

Given the measurements of the form (2), the aim is to recover the images of interest. The problem of image recovery becomes more challenging when the DDEs $\overline{{\bm{\mathsf{D}}}}$ are unknown and need to be estimated along with the images. The current work is devoted to this more challenging yet practical case, for which we propose a joint calibration and imaging algorithm to solve the underlying problem. The resultant approach is dubbed as Polca SARA.

In order to calibrate for the DDEs, we consider their smooth variation both across the field of view and in time, implying that they are spatially and temporally band-limited (Repetti et al., 2017; Thouvenin et al., 2018). Exploiting this characteristic, we instead represent the DDEs $\overline{{\bm{\mathsf{D}}}}$ by their compact-support kernels $\overline{{\bm{\mathsf{U}}}}$ in the spatial and temporal Fourier domain. Then, only these non-zero Fourier coefficients need to be estimated. It reduces significantly the dimension of the underlying problem. Furthermore, we note that while equation (2) is linear with respect to $\overline{{\bm{\mathsf{S}}}}$ (when the DDEs are kept fixed), it is non-linear with respect to the DDEs. To counteract this issue, we draw on the notion of bi-linearity proposed in Salvini & Wijnholds (2014) and Repetti et al. (2017) in the context of RI calibration. It consists in introducing ${\bm{\mathsf{U}}}_{1}={\bm{\mathsf{U}}}_{2}={\bm{\mathsf{U}}}$ such that problem (2) becomes bi-linear with respect to the DDEs. The resultant problem can then be written as

[TABLE]

where $\phi$ is the tri-linear measurement operator, mapping the images of interest coupled with the DDE Fourier kernels ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ to the acquired visibilities ${\bm{\mathsf{Y}}}$ . On the imaging front, we further consider that calibration transfer has been performed and thus, the Jones matrices in equation (2) can be taken as identity as a first instance, without any directional dependency. Doing so, we first solve the problem to obtain an estimation ${\bm{\mathsf{S}}}^{\prime}$ of the unknown images. Nevertheless, the used measurement model being devoid of DDEs, the estimated images are likely to contain artefacts. We thus instead keep a thresholded version ${\bm{\mathsf{S}}}_{0}$ of ${\bm{\mathsf{S}}}^{\prime}$ , considering that ${\bm{\mathsf{S}}}={\bm{\mathsf{S}}}_{0}+{\bm{\mathsf{E}}}$ , where ${\bm{\mathsf{E}}}$ is unknown and needs to be estimated.

In view of the discussion above, we propose to estimate the variables of interest by solving the following problem:

[TABLE]

where the data-fidelity term $h$ is given by a least-squares criterion (equations (22)-(24)) and the regularization terms are defined both for the images ( $r$ in equation (25)) and the DDEs ( $p$ in equation (20)).

For the images of interest, we exploit the compressive sensing framework (Candès et al., 2006) and consider $\ell_{1}$ regularization term to promote sparsity of the sought images in a suitable dictionary. More precisely, inspired by the good performance of the sparsity averaging reweighted analysis (SARA) prior for Stokes $I$ imaging in Carrillo et al. (2013); Carrillo et al. (2014) and for full Stokes imaging in Birdi et al. (2018b), we employ this prior consisting in promoting average sparsity over a concatenation of Dirac basis and first eight Daubechies wavelet bases. It is coupled with a reweighting scheme, solving sequentially the $\ell_{1}$ regularized problems (Candès et al., 2008). Although the current work does not deal with the reweighting procedure, it can be easily incorporated. In practice, other sophisticated priors injecting additional information about the images of interest can also be taken into account in the regularization term, eg. polarization constraint in Birdi et al. (2018b). In the current work, we consider both the cases without and with the enforcement of the polarization constraint. For the DDEs, the regularization term $p$ is defined to impose values of the Jones matrices’ elements to lie within specified bounds as well as to control the distance between ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ which ideally should be equal. Additional details for the image (resp. DDEs) estimation are provided in Section 4 (resp. Section 5).

The resultant problem (4) is solved using a block-coordinate forward-backward approach (Chouzenoux et al., 2016). As the name suggests, this iterative procedure relies on a forward-backward (FB) scheme to update each of the blocks/variables of interest $({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2},{\bm{\mathsf{E}}})$ . The FB scheme consists in treating the smooth term with a gradient step and the non-smooth term via a proximity step. The global method is detailed in Section 6. For a better understanding, the overall algorithmic structure is depicted in Fig. 2. It has similarities with the traditional self-calibration method in the sense that it alternates between the imaging and calibration steps. Yet the two approaches are different in terms of their underlying principle and implementation. First, the imaging and calibration steps in our approach use the same optimization toolbox and hence, the global algorithm benefits from convergence guarantees. Second, to ensure this global convergence, each of these updates is obtained by performing only a finite number of iterations and not until convergence. Moreover, these updates must follow a cyclic rule, that is each variable must be updated at least once every given number of iterations. To this end, $L_{cyc}$ global iterations are performed to update the DDEs $({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2})$ , alternating between $L$ FB iterations on ${\bm{\mathsf{U}}}_{1}$ and $L$ FB iterations on ${\bm{\mathsf{U}}}_{2}$ . After $L_{cyc}$ global iterations, the Stokes images ${\bm{\mathsf{E}}}$ are updated by executing $J$ FB iterations.

The results obtained by implementing Polca SARA are presented in Section 7. Particularly, we compare between the following cases: on the calibration front- considering normalized DIEs (i.e. identity Jones matrices), DIE calibration, Jones matrices DDE calibration without considering off-diagonal terms, and full Jones matrices DDE calibration; and on the imaging front- in the absence and presence of the polarization constraint. While the calibration cases are chosen to judge the importance of calibrating for the off-diagonal terms as well as for the DDEs, the two imaging cases are to test the efficiency of the polarization constraint in the context of joint calibration and imaging. Furthermore, concerning these imaging cases, when the polarization constraint is not imposed, the proximity step for the update of ${\bm{\mathsf{E}}}$ is evaluated using Algorithm 2, whereas in the case of enforcing this constraint, Algorithm 3 is employed to compute the corresponding proximity step.

A detailed mathematical formulation of the proposed approach is presented in the next sections, starting with the description of the imaging problem followed by the calibration problem and finally combining the two. On the other hand, with the intuitive description provided in this section at disposal, the reader might jump directly to the simulations and results section (Section 7) averting the mathematical details.

4 Imaging problem

The case of having either pre-calibrated data or knowledge of DIEs/DDEs beforehand amounts to solving the problem only for image recovery. The RI literature is brimming with such methods aiming to estimate the images of interest from the underlying imaging problem. Nevertheless, most of these methods were initially proposed mainly for Stokes $I$ imaging, including clean based methods (Högbom, 1974; Bhatnagar & Cornwell, 2004; Cornwell, 2008) and CS techniques (Wiaux et al., 2009; Carrillo et al., 2014; Garsden et al., 2015; Onose et al., 2016; Onose et al., 2017). In order to apply these methods for full polarization imaging, these can be extended to the estimation of each of the Stokes parameters totally independently using the same techniques as developed for Stokes $I$ imaging. Such an extension of CS based Stokes $I$ imaging to full polarization imaging has been presented in Akiyama et al. (2017), solving independently for the sought images. A more evolved approach aiming to jointly estimate the Stokes parameters has been very recently proposed in Birdi et al. (2018a, b). In this section, we aim to summarize this work which was proposed in the context of DIEs, and generalize it to the DDEs case, which will be useful for the understanding of the joint imaging and calibration method developed in Section 6.

In this context, a reformulation of equation (2) is more commonly used where the measurements made by an antenna pair $(\alpha,\beta)$ at time instant $t$ are represented by a vector ${\mathb{y}}_{t,\alpha,\beta}\in{\mathbb{C}}^{4}$ , whose each element $q\in\{1,\ldots,4\}$ is given by

[TABLE]

with $\bm{\omega}_{t,\alpha,\beta}\in{\mathbb{C}}^{4}$ being a realization of additive Gaussian noise, and $\mathcal{\widetilde{L}}\big{(}\overline{{\bm{\mathsf{S}}}}(n)\big{)}$ operates on the Stokes matrix to produce the brightness matrix $\overline{{\bm{\mathsf{B}}}}(n)$ , followed by its vectorization. Furthermore, $\overline{{\bm{\mathsf{M}}}}_{t,\alpha,\beta}=(\overline{{\bm{\mathsf{D}}}}_{t,\alpha}\otimes\overline{{\bm{\mathsf{D}}}}_{t,\beta}^{\dagger})\in{\mathbb{C}}^{4\times 4N}$ is the Mueller matrix formed by the outer product of the Jones matrices for antennas $\alpha$ and $\beta$ at time instant $t$ , and in turn, this matrix can be seen as a $4\times 4$ block matrix with each block given by $N$ dimensional row vector. Let us note that, for the Fourier transformed Mueller matrix, denoted by $\widehat{\overline{{\bm{\mathsf{M}}}}}_{t,\alpha,\beta}$ , each block is obtained by the complex convolution of the Fourier transforms of the corresponding Jones matrices (Bhatnagar et al., 2013).

Using equation (5), the overall measurement model can be written as

[TABLE]

where $\bm{y}\in{\mathbb{C}}^{4M}$ is the resultant measurement vector corrupted by the noise vector $\bm{\omega}\in{\mathbb{C}}^{4M}$ . The measurement operator $\Phi$ mapping the images of interest to the acquired visibilities is modelled as

[TABLE]

that can be explained as follows. The operation ${\mathcal{\widetilde{L}}}\big{(}\overline{{\bm{\mathsf{S}}}}\big{)}$ generates the brightness vector $\bm{b}\in{\mathbb{C}}^{4N}$ that needs to be Fourier transformed at the sampled spatial frequencies. To evaluate these Fourier transforms in a computationally efficient manner, we make use of the non-uniform fast Fourier Transform that relies on interpolation of the Fourier coefficients from discrete to continuous domain (Fessler & Sutton, 2003). In this context, the zero-padding matrix $\widetilde{{\bm{\mathsf{Z}}}}\in{\mathbb{C}}^{\gamma 4N\times 4N}$ oversamples the images in $\bm{b}$ , by factor $\gamma$ in each dimension and also accounts for the scale factors to pre-compensate for the interpolation convolution kernels. A fast Fourier transform operator ${\bm{\mathsf{F}}}\in{\mathbb{C}}^{\gamma 4N\times\gamma 4N}$ is then applied to compute the 2D Fourier transform of each of these oversampled images. To interpolate these discrete Fourier coefficients to the continuous frequency points, the compact support convolution kernels (the so called degridding kernels) are embedded in the matrix ${\bm{\mathsf{G}}}\in{\mathbb{C}}^{4M\times\gamma 4N}$ . More precisely, the matrix ${\bm{\mathsf{G}}}$ is modelled to take into account the degridding kernels as well as the components of the Fourier transformed Mueller matrix. In particular, each of the four rows of ${\bm{\mathsf{G}}}$ associated with a frequency $k_{t,\alpha,\beta}$ contains the convolution of the corresponding elements of $\widehat{\overline{{\bm{\mathsf{M}}}}}_{t,\alpha,\beta}$ centred on $k_{t,\alpha,\beta}$ with the degridding kernel. Then, the application of this matrix on the Fourier transformed over-sampled images produce the measurements. When working in the absence of the DDEs, a direct correspondence of the measurement operator in equation (7) can be noticed with the measurement operator used for Stokes $I$ imaging (Onose et al., 2017; Pratley et al., 2017) and more recently, for full Stokes imaging (Birdi et al., 2018b).

It is to be mentioned here that most of the imaging techniques in RI literature rely on taking the Jones matrices and hence Mueller matrix as identity in equation (5). In other words, they work under the assumption of absence of any calibration errors or availability of pre-calibrated data. However, in presence of calibration errors, the imaging techniques should account for these terms within their models. Working in this direction, we can generalize the model proposed in Birdi et al. (2018b) and employ it to solve inverse problem (6). The corresponding imaging strategy leverages the CS theory. In particular, CS based methods offer a way from optimization point of view to solve the underlying inverse problem. The main idea is to obtain the estimation of the sought images by solving a minimization problem of the form (Birdi et al., 2018a, b)

[TABLE]

where $\epsilon$ is related to the norm of the additive noise, $\|\cdot\|_{2}$ denotes the $\ell_{2}$ norm of its argument vector. The term $r({\bm{\mathsf{S}}})$ is the regularization term, injecting prior information in the reconstruction process, whereas the data fidelity is ensured by the constraint $\|\Phi({\bm{\mathsf{S}}})-\bm{y}\|_{2}\leqslant\epsilon$ , implying that the residual is situated within an $\ell_{2}$ ball whose radius is determined by $\epsilon$ . In the current work, due to technical assumptions related to the proposed joint imaging and calibration algorithm (Chouzenoux et al., 2016), we propose to solve an unconstrained version of problem (8), given by

[TABLE]

where $\bar{h}({\bm{\mathsf{S}}})$ is the data fidelity term. Using this formulation and assuming that the additive noise is i.i.d. Gaussian, $\bar{h}({\bm{\mathsf{S}}})$ is given by a least squares criterion

[TABLE]

This formulation has been introduced in Repetti et al. (2017) in the particular case of joint Stokes $I$ imaging and calibration. In order to incorporate various polarimetric imaging specific prior informations in the regularization function, Birdi et al. (2018b) proposed to use a hybrid regularization term of the form

[TABLE]

where the function $g\colon{\mathbb{R}}^{2\times 2N}\rightarrow]-\infty,+\infty]$ imposes sparsity of the underlying Stokes images and the function $r^{\prime}\colon{\mathbb{R}}^{2\times 2N}\rightarrow]-\infty,+\infty]$ constrains the domains of the argument images as per some physical constraints. In particular, the first term is inspired by the CS framework which proposes to exploit the sparsity of the target images in some dictionary ${\bm{\Psi}}\in{\mathbb{C}}^{N\times J}$ (Candès et al., 2006; Donoho, 2006). The choice of this dictionary is dependent on the images under scrutiny. To give an intuitive idea, images which are already sparse, for instance point sources images, are well represented by dirac basis taking ${\bm{\Psi}}$ to be identity. Otherwise, sparsity can be imposed in some other domain, such as TV regularization for piece-wise constant images (Rudin et al., 1992; Wiaux et al., 2010; Akiyama et al., 2017), wavelet basis for smooth images (Mallat, 2009), etc. In particular for astronomical images, a collection of wavelet bases is shown to be a good candidate for sparsifying dictionary ${\bm{\Psi}}$ (Carrillo et al., 2012, 2014), both for Stokes $I$ imaging (Onose et al., 2016; Onose et al., 2017) and specifically for polarization imaging in Birdi et al. (2018a, b). Formally, for any matrix $\widetilde{{\bm{\mathsf{S}}}}\in{\mathbb{R}}^{N\times 4}$ , the sparse representation of the sought images in a chosen dictionary ${\bm{\Psi}}$ is given by ${\bm{\Psi}}^{\dagger}\widetilde{{\bm{\mathsf{S}}}}$ . The regularization function is then defined to impose sparsity of this term. This can be achieved by using the $\ell_{0}$ pseudo-norm, counting the number of non-zero components of its argument (Donoho et al., 1995). A more common approach is to use $\ell_{1}$ norm which overcomes the problem of non-convexity of the $\ell_{0}$ norm (Chen et al., 2001). An even better approximation of the $\ell_{0}$ norm is provided by the (re)weighted $\ell_{1}$ norm, which tends to diminish the magnitude dependency of the $\ell_{1}$ norm (Candès et al., 2008; Carrillo et al., 2012). The (re)weighted $\ell_{1}$ norm is given by

[TABLE]

where the subscripts $j$ and $i$ in the notation $[\cdot]_{j,i}$ stand respectively for the row and column indices of the argument matrix, and $\mathcal{R}\colon{\mathbb{R}}^{2\times 2N}\rightarrow{\mathbb{R}}^{N\times 4}$ is the operator consisting in placing the four Stokes images contained in the matrix ${\bm{\mathsf{S}}}$ in four columns, whereas its adjoint $\mathcal{R}^{\dagger}:{\mathbb{R}}^{N\times 4}\rightarrow{\mathbb{R}}^{2\times 2N}$ do the contrary, i.e. storing the four images in the rows of a $2\times 2$ block matrix. In equation (12), ${\bm{\mathsf{W}}}\in{\mathbb{R}}_{+}^{J\times 4}$ is the weighting matrix. In the case when this matrix is chosen to be identity, the $\ell_{1}$ regularization term is obtained. Another useful piece of information which can be used to regularize the problem is the polarization constraint to be satisfied by the recovered Stokes images (Birdi et al., 2018b). This constraint comes from a physical point of view, that the polarized intensity should be smaller or at most equal to the total intensity and mathematically, can be described by the following set:

[TABLE]

As can be noticed, this constraint also imposes implicitly the positivity of the total intensity image (Stokes $I$ ). The enforcement of this constraint amounts to incorporation of the indicator function of the set $\mathbb{P}$ in the regularization function, i.e.

[TABLE]

The indicator function of any such set $\mathbb{P}$ is defined as

[TABLE]

5 Calibration problem

In practice, the DDEs are often unknown and need to be estimated. In this section, we formulate the calibration problem to be solved. We assume that the DDEs exhibit a smooth variation both across the field of view and in time. It implies that DDEs are band-limited spatially as well as temporally. This is enforced by considering compact-support kernels of the DDEs in both spatial (Repetti & Wiaux, 2017; Repetti et al., 2017) and temporal Fourier domains (Thouvenin et al., 2018). More specifically, for each antenna $\alpha\in\{1,\ldots,n_{a}\}$ , the DDEs $(\overline{{\bm{\mathsf{D}}}}_{t,\alpha})_{1\leqslant t\leqslant T}$ are represented by the Fourier kernels $\overline{{\bm{\mathsf{U}}}}_{\alpha}\in{\mathbb{C}}^{2\times 2K\times P}$ , where $K$ and $P$ are the support sizes in spatial and temporal Fourier domain, respectively, with $K\ll N,P\ll T$ . Then, the task is to estimate only non-zero Fourier coefficients of the DDEs, thereby reducing the dimension of the underlying problem significantly. Analogous to the calibration inverse problem in Repetti et al. (2017), problem (2) can be reformulated as

[TABLE]

where $\mathcal{D}_{t}\colon{\mathbb{C}}^{2\times 2K\times P}\rightarrow{\mathbb{C}}^{2\times 2N}$ is the operator acting on $\overline{{\bm{\mathsf{U}}}}_{\alpha}$ to give a sparse matrix $\mathcal{D}_{t}(\overline{{\bm{\mathsf{U}}}}_{\alpha})$ containing the compact support kernels in $\widehat{\overline{{\bm{\mathsf{D}}}}}_{t,\alpha}$ , flipped and centred in zero spatial frequency. Similarly, the operator $\mathcal{D}_{t}^{\prime}\colon{\mathbb{C}}^{2\times 2K\times P}\rightarrow{\mathbb{C}}^{2N\times 2}$ is defined such that $\mathcal{D}_{t}^{\prime}(\overline{{\bm{\mathsf{U}}}}_{\beta})$ is a sparse matrix consisting of the compact support kernels in $\widehat{\overline{{\bm{\mathsf{D}}}}}^{\dagger}_{t,\beta}$ centred in zero spatial frequency. Finally, $\mathcal{X}_{t,\alpha,\beta}\big{(}{\bm{\mathsf{F}}}\,\widetilde{{\bm{\mathsf{Z}}}}\,\mathcal{\widetilde{L}}(\overline{{\bm{\mathsf{S}}}})\big{)}\in{\mathbb{C}}^{2N\times 2N}$ is a $2\times 2$ block matrix, with each block of size $N\times N$ . Each row/ column of such a block consists of a shifted version of the Fourier transform of the corresponding image in reshaped brightness vector $\widetilde{\mathcal{L}}(\overline{{\bm{\mathsf{S}}}})$ , mimicking the convolution operation. Moreover, to account for the continuous sampled frequencies, these Fourier transforms are convolved with the degridding kernels centred at the associated frequency $k_{t,\alpha,\beta}$ .

It can be observed that problem (16) is non-linear with respect to the compact-support kernels $(\overline{{\bm{\mathsf{U}}}}_{\alpha})_{1\leqslant\alpha\leqslant n_{a}}$ . Following the approach proposed in Salvini & Wijnholds (2014) and Repetti et al. (2017), we linearize it by introducing the matrices $\overline{{\bm{\mathsf{U}}}}_{\alpha,1}$ and $\overline{{\bm{\mathsf{U}}}}_{\alpha,2}$ such that $\overline{{\bm{\mathsf{U}}}}_{\alpha,1}=\overline{{\bm{\mathsf{U}}}}_{\alpha,2}=\overline{{\bm{\mathsf{U}}}}_{\alpha}$ for every $\alpha\in\{1,\ldots,n_{a}\}$ . Using this strategy, the problem becomes bi-linear and then the objective is to estimate both the matrices $\overline{{\bm{\mathsf{U}}}}_{1}=(\overline{{\bm{\mathsf{U}}}}_{\alpha,1})_{1\leqslant\alpha\leqslant n_{a}}\in{\mathbb{C}}^{2\times 2K\times P\times n_{a}}$ and $\overline{{\bm{\mathsf{U}}}}_{2}=(\overline{{\bm{\mathsf{U}}}}_{\alpha,2})_{1\leqslant\alpha\leqslant n_{a}}\in{\mathbb{C}}^{2\times 2K\times P\times n_{a}}$ , where $\overline{{\bm{\mathsf{U}}}}_{1}$ (resp. $\overline{{\bm{\mathsf{U}}}}_{2}$ ) concatenates the non-zero Fourier coefficients $\overline{{\bm{\mathsf{U}}}}_{\alpha,1}$ (resp. $\overline{{\bm{\mathsf{U}}}}_{\alpha,2}$ ) for all antennas. The estimation of these matrices is achieved by solving the following minimization problem for the DDE calibration:

[TABLE]

where $\widetilde{h}$ is the least-squares data fidelity term and $p$ is the regularization function for ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ . The data fidelity term for DDE calibration reads as

[TABLE]

where $\|\cdot\|_{F}$ stands for the Frobenious norm of the matrix argument, and considering $t$ and $\beta$ taking all the values respectively in the ranges $\{1,\ldots,T\}$ and $\{1,\ldots,n_{a}\}$ , the following hold: ${\bm{\mathsf{Y}}}_{\alpha}=({\bm{\mathsf{Y}}}_{t,\alpha,\beta})_{t,\beta\neq\alpha}$ and the operator $\mathcal{G}_{\alpha,1}({\bm{\mathsf{U}}}_{\alpha,1})$ in equation (18) generates a concatenation of the terms of the form $\mathcal{D}_{t}({{\bm{\mathsf{U}}}}_{\alpha,1})\,\mathcal{X}_{t,\alpha,\beta}\big{(}{\bm{\mathsf{F}}}\,\widetilde{{\bm{\mathsf{Z}}}}\,\mathcal{\widetilde{L}}({{\bm{\mathsf{S}}}})\big{)}\,\mathcal{D}_{t}^{\prime}({{\bm{\mathsf{U}}}}_{\beta,2})$ . Similarly, $\mathcal{G}_{\alpha,2}({\bm{\mathsf{U}}}_{\alpha,2})$ in equation (19) consists in the operation $\mathcal{D}_{t}({{\bm{\mathsf{U}}}}_{\beta,1})\,\mathcal{X}_{t,\beta,\alpha}\big{(}{\bm{\mathsf{F}}}\,\widetilde{{\bm{\mathsf{Z}}}}\,\mathcal{\widetilde{L}}({{\bm{\mathsf{S}}}})\big{)}\,\mathcal{D}_{t}^{\prime}({{\bm{\mathsf{U}}}}_{\alpha,2})$ to produce the corresponding measurements.

On the other hand, the regularization term $p$ in problem (17) is given by

[TABLE]

where $\gamma>0$ is the regularization parameter and the first term in equation (20) controls the distance between the matrices ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ , thereby imposing the constraint that these two matrices should be equal. The set $\mathbb{D}$ is defined to constrain the values of the Fourier coefficients of the DDEs to lie within the specified bounds. In particular, $\mathbb{D}$ is defined such that for each $t\in\{1,\ldots,T\}$ , the Fourier kernels stored in diagonal terms of $\widehat{{{\bm{\mathsf{D}}}}}_{t,\alpha}$ have the central coefficients ( $[\widehat{{{\bm{\mathsf{D}}}}}_{t,\alpha}(0)]_{11}\,\text{and}\,[\widehat{{{\bm{\mathsf{D}}}}}_{t,\alpha}(0)]_{22}$ ) belonging to an $\ell_{\infty}$ complex ball centred in 1 with radius $\theta_{1}>0$ , whereas the rest of the coefficients belong to an $\ell_{\infty}$ complex ball centred in 0 with radius $\theta_{1}$ . On the other hand, for the Fourier kernels in the off-diagonal terms, the central and the other coefficients are assumed to be contained in $\ell_{\infty}$ balls centred in 0 with radius $\theta_{2}$ and $\theta_{3}$ , respectively. Intuitively, this can be understood as follows. Since we work in the scenario after the calibration transfer is performed, the zero spatial frequency coefficient of the DDEs (i.e. the DIEs) encoded in the diagonal terms of the Jones matrices are normalized to 1, thus lying in a complex neighbourhood of $1+\text{i}0$ . Moreover, while the central coefficient in the spatial Fourier domain represents the mean gain, the higher order spatial frequencies characterize the gain variations across the field of view with respect to this mean gain. Therefore, these coefficients have smaller values in comparison with the central coefficient. Lastly, concerning the off-diagonal terms which encompass the polarization leakage, their values are usually much smaller than the diagonal terms.

6 Proposed calibration $\&$ imaging approach

The current work deals with a practical case when neither the Stokes images nor the DDEs associated with the antennas are known. In the particular case when the sky is considered to be unpolarized, only Stokes $I$ imaging needs to be performed and the $2\times 2$ Jones matrices are often replaced by a scalar value. In this scenario, Repetti et al. (2017) recently proposed a method to perform joint calibration and imaging. Motivated by the good performance obtained by this flexible method with proven convergence guarantees, we extend this approach for full polarization model and propose a joint calibration and imaging algorithm to solve the problem under scrutiny. This leads to considering a global minimization problem aiming to solve for the Stokes parameters ${\bm{\mathsf{S}}}$ and the calibration matrices ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ . This non-convex problem can be cast by combining the minimization problems (9) and calibration (17) which were proposed in the earlier sections solely for Stokes imaging and DDE calibration, respectively.

Given the non-convexity of the underlying minimization problem, choice of initialization is crucial. In this context, we exploit the fact that the calibration transfer has been performed, to obtain an initial estimate of the Stokes images, as suggested in Repetti & Wiaux (2017) (although for Stokes $I$ imaging only). Such first imaging step consists in considering the Jones matrices as identity and solving the associated minimization problem (9).

Let ${\bm{\mathsf{S}}}^{\prime}$ be the Stokes parameters estimated by solving problem (9). Since these are obtained ignoring the DDEs, in general, these may contain artefacts. Therefore, we instead use a thresholded version of ${\bm{\mathsf{S}}}^{\prime}$ , denoted by ${\bm{\mathsf{S}}}_{0}$ , which contains only the high amplitude coefficients of ${\bm{\mathsf{S}}}^{\prime}$ . With this first approximation of the images at hand, the original unknown image can be seen as a sum of ${\bm{\mathsf{S}}}_{0}$ and ${\bm{\mathsf{E}}}\in{\mathbb{R}}^{2\times 2N}$ , where the latter is unknown and need to be estimated.

Finally, we propose to define the estimates $({\bm{\mathsf{E}}},{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2})$ as solutions to the following global, non-convex minimization problem:

[TABLE]

where $h$ is the least squares data fidelity term associated with the data model, $r$ and $p$ are the regularization terms for the image and the DDEs, respectively. In particular, with ${\bm{\mathsf{S}}}={\bm{\mathsf{S}}}_{0}+{\bm{\mathsf{E}}}$ , the data fidelity term is given by the least squares criterion as proposed in equations (10), (18) and (19), i.e.

[TABLE]

where operator $\Phi$ in equation (22) is formed using fixed values of $({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2})$ . Similarly, $\mathcal{G}_{\alpha,1}({\bm{\mathsf{U}}}_{\alpha,1})$ in equation (23) (and $\mathcal{G}_{\alpha,2}({\bm{\mathsf{U}}}_{\alpha,2})$ in equation (24), resp.) is determined by fixed $({\bm{\mathsf{U}}}_{\beta,2},{\bm{\mathsf{E}}})$ ( $({\bm{\mathsf{U}}}_{\beta,1},{\bm{\mathsf{E}}})$ , resp.) with $\beta\in\{1,\ldots,n_{a}\}$ and $\beta\neq\alpha$ . While estimating the Stokes parameters, equation (22) is employed as the data fidelity term. In particular, keeping $({\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2})$ fixed, the convexity of this term with respect to ${\bm{\mathsf{E}}}$ can be noticed. In the same manner, equation (23) ((24), resp.) is chosen while updating ${\bm{\mathsf{U}}}_{1}$ ( ${\bm{\mathsf{U}}}_{2}$ , resp.) which is convex with respect to ${\bm{\mathsf{U}}}_{1}$ ( ${\bm{\mathsf{U}}}_{2}$ , resp.) keeping the other two variables fixed. It is then straightforward to see that the non-convex function $h$ is in fact convex for each of the variables while fixing the others.

Concerning the regularization terms, the function $p\big{(}{\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}\big{)}$ is given from equation (20), whereas the function $r({\bm{\mathsf{E}}})$ for the images is associated with the priors introduced in equations (11), (12) and (14). In particular, one can choose whether to take the polarization constraint into account or not. In the former case, the regularization term boils down to

[TABLE]

where $g({\bm{\mathsf{E}}})=\sum_{i=1}^{4}\eta_{i}\|\big{(}{\bm{\Psi}}^{\dagger}\mathcal{R}({\bm{\mathsf{S}}}_{0}+{\bm{\mathsf{E}}})\big{)}_{\colon,i}\|_{1}$ denotes the SARA prior considering ${\bm{\Psi}}$ to be a concatenation of Dirac basis and first eight Daubechies wavelets (Carrillo et al., 2014; Birdi et al., 2018b). Here, we have taken the weighting matrix to be equal to identity, i.e. without adapting the reweighting scheme and, for every $i\in\{1,2,3,4\}$ , $\eta_{i}>0$ is the regularization parameter. The set $\mathbb{K}$ is chosen to take into account the errors that might appear on the estimated non-zero coefficients of ${\bm{\mathsf{S}}}_{0}$ . Formally, it is defined as

[TABLE]

where ${\mathbb{S}}_{0}$ is the support of ${\bm{\mathsf{S}}}_{0}$ . The parameter $\vartheta\in[0,1]$ is chosen according to the error percentage considered on ${\bm{\mathsf{S}}}_{0}$ .

On the other hand, in the absence of the polarization constraint, the positivity of Stokes $I$ image needs to be imposed explicitly, that can be accounted for by replacing the set $\mathbb{K}$ in equation (25) by the set $\mathbb{K}^{\prime}$ , defined as

[TABLE]

In such a case, the function $r$ reads as

[TABLE]

6.1 Proposed algorithm

In order to solve problem (21), we observe that it has a block-variable structure with ${\bm{\mathsf{U}}}_{1},{\bm{\mathsf{U}}}_{2}$ and ${\bm{\mathsf{E}}}$ being the three blocks constituting the problem. On top of it, although the global problem is non-convex, it is convex with respect to each of these blocks. Leveraging this block-variable structure, we propose to use an iterative algorithm based on a block-coordinate forward-backward approach (Chouzenoux et al., 2016) to solve problem (21). It consists in alternating between the estimation of the DDEs and the Stokes images. In turn, for each of these estimations, FB iterations are employed, i.e. the smooth terms are dealt by their gradient, whereas the non-smooth terms are managed by their proximity operators. Formally, given a proper, lower-semicontinuous function $f\colon{\mathbb{R}}^{2\times 2N}\rightarrow]-\infty,+\infty]$ , its proximity operator (Moreau, 1965) at ${\bm{\mathsf{S}}}\in{\mathbb{R}}^{2\times 2N}$ is defined as

[TABLE]

In the particular case of $f$ being an indicator function of a set ${\mathbb{E}}$ , the proximity operator reduces to commonly known projection operator $\mathcal{P}_{{\mathbb{E}}}$ , defined as

[TABLE]

In light of the discussion above, we present the proposed algorithm as Algorithm 1. It consists of a global loop and inner iteration loops. At each iteration of the global loop, indexed by $i\in{\mathbb{N}}$ (step 2), we choose either to update the DDEs or the image following an essentially cyclic rule, that is each of the variables must be updated at least once within a given finite number of iterations. For every $i^{\text{th}}$ iteration, this is taken care by the choice of number of inner loop iterations $L^{(i)}\in{\mathbb{N}}$ and $J^{(i)}\in{\mathbb{N}}$ to update the DDEs and the image, respectively. To be more precise, in the former case, each of the calibration matrices ${\bm{\mathsf{U}}}_{1}^{(i)}$ (step 7) and ${\bm{\mathsf{U}}}_{2}^{(i)}$ (step 11) are updated by performing $L^{(i)}$ number of FB iterations in the inner loop, using the images estimated at the previous iterate. When the images are chosen to be updated in the global loop, the updated DDEs from the previous iterate are used to estimate the image in step 18, executing $J^{(i)}$ FB iterations. The overall algorithm can then be understood by splitting it into two parts: Calibration and Imaging. These two parts are explained in what follows.

Calibration:

It comprises of the estimation of the matrices ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ . In this case, while the data fidelity term $h$ is differentiable, the regularization term $p$ consist of both smooth and non-smooth terms. Thus, in each $\ell^{\text{th}}$ FB iteration to estimate either of these matrices, the gradient step for the differentiable terms is coupled with the proximity operator of the non-smooth term, as shown in steps 7 and 11. In particular, for the gradient step, for every $q\in\{1,2\}$ , the step size $\Gamma_{q}^{(i)}\in{\mathbb{R}}^{2\times 2K\times P\times n_{a}}$ is chosen as

[TABLE]

where $\bm{1}_{R}$ is a matrix of ones of dimension $R$ , and $\zeta_{q,\alpha}^{(i)}$ is given by $0<\zeta_{q,\alpha}^{(i)}<1/(\gamma+\upsilon_{q,\alpha}^{(i)}),$ with $\upsilon_{q,\alpha}^{(i)}$ denoting the Lipschitz constant of the partial derivative of $h$ with respect to ${\bm{\mathsf{U}}}_{q,\alpha}^{(i)}$ .

Furthermore, since in this case the non-smooth term is the indicator function of the set $\mathbb{D}$ , as explained previously, this reduces to performing projection on this set, $\mathcal{P}_{\mathbb{D}}$ which basically ensures that the values of the estimated DDEs Fourier coefficients lie within the bounds specified earlier.

Imaging:

This step updates the Stokes images while using the DDEs estimates from the previous iterate. As shown in step 18, it involves computing the gradient of the data fidelity term, followed by the proximity operator of the regularization function $r$ . In this case, the step size $\sigma^{(i)}$ for the gradient step is chosen such that it satisfies

[TABLE]

where $\|\Phi\|_{2}$ computes the spectral norm of $\Phi$ , and $\Phi$ is generated using the updated values $({\bm{\mathsf{U}}}_{1}^{(i+1)},{\bm{\mathsf{U}}}_{2}^{(i+1)})$ .

Regarding the proximity step, the regularization term $r({\bm{\mathsf{E}}})$ is a hybrid term incorporating a mixture of prior information. Particularly, based on the choice of inclusion or exclusion of the polarization constraint, the computation of the proximity operator differs. For comparison purposes, here we consider both the cases.

Regularization without polarization constraint:

When we work in the absence of the polarization constraint, the regularization term $r({\bm{\mathsf{E}}})$ is given by equation (28). Then, the proximity operator of $r$ evaluated at any point ${\bm{\mathsf{R}}}\in{\mathbb{R}}^{2\times 2N}$ amounts to

[TABLE]

which does not have an explicit formulation and thus requires sub-iterations for its computation. One such possibility is to employ dual forward-backward algorithm (Combettes & Pesquet, 2011; Combettes et al., 2011), as presented in Algorithm 2.

In particular, step 3 of Algorithm 2 performs the projection onto the set $\mathbb{K}^{\prime}$ . It is followed by the computation of the proximity operator of function $g$ in step 5, which in the current case of $g$ being the $\ell_{1}$ norm, corresponds to the soft-thresholding operator (Chaux et al., 2007). This is evaluated as

[TABLE]

where $\varepsilon^{(i)}=\mu^{-1}\sigma^{(i)}$ and the soft-thresholding operation is a component-wise operation such that $\widetilde{{\bm{\mathsf{R}}}}\in{\mathbb{R}}^{J\times 4}$ is defined for every $q\in\{1,2,3,4\}$ and $j\in\{1,\ldots,J\}$ as

[TABLE]

In essence, this operation sets the elements smaller than a threshold to zero, whereas shrinks all the other elements by this value. Therefore, it promotes sparsity of its argument variable.

Regularization with polarization constraint:

In the case when the polarization constraint is to be enforced, the regularization term is given from equation (25) and the associated proximity operator at a point ${\bm{\mathsf{R}}}$ is given by

[TABLE]

The evaluation of this operator requires projection onto the set $\mathbb{P}$ , that does not have a closed form solution. To circumvent this difficulty, we have recently proposed a method, named Polarized SARA (Birdi et al., 2018a, b), that enforces this constraint leveraging the epigraphical projection techniques (Chierchia et al., 2015). In this context, with the introduction of an auxiliary variable ${\bm{\mathsf{Z}}}\in{\mathbb{R}}^{N\times 2}$ , the polarization constraint set is splitted into simpler, easily manageable constraint sets, thereby performing the projection onto these sets. Formally, it corresponds to the following reformulation of problem (36):

[TABLE]

In order to impose the constraints (37b)- (37d), we make use of the indicator functions of the corresponding sets. In particular, the constraints (37b) and (37c) are enforced using indicator functions of the epigraphs222The epigraph of a proper, lower semi-continuous function $\psi\colon\mathbb{R}^{N}\to]-\infty,+\infty]$ corresponds to $\operatorname{epi}\psi=\left\{(\bm{z},\kappa)\in\mathbb{R}^{N}\times\mathbb{R}\,\middle|\,\psi(\bm{z})\leqslant\kappa\right\}.$ $\mathbb{E}_{1},\mathbb{E}_{2}$ respectively of the functions $h_{1}$ and $h_{2}$ . On the other hand, the set $\mathbb{V}=\left\{\bm{\mathsf{Z}}\in\mathbb{R}^{N\times 2}\middle|\,\big{(}\forall n\in\{1,\ldots,N\}\big{)}\;{{\mathsf{Z}}}_{n,1}+{{\mathsf{Z}}}_{n,2}\leqslant 0\right\}$ is used to impose the constraint (37d). For further details, we refer the reader to Birdi et al. (2018b).

Doing so, the proximity operator (37a) results into

[TABLE]

For the computation of this proximity operator, we employ Polarized SARA method which is based on primal-dual forward-backward algorithm (Condat, 2013; Vũ, 2013; Pesquet & Repetti, 2015). Particularly, leveraging the flexibility and parallelizability offered by the primal-dual algorithms, we can easily adapt Polarized SARA method to solve our underlying problem (6.1). For the sake of completeness, we also present this adapted version in Algorithm 3. It comprises of solving for the variables of interest, the primal variables, along with the associated auxiliary variables, the dual variables. In this context, the update of the primal variables is similar to the forward-backward strategy, wherein the gradient of the differentiable terms is followed by their proximity step. For the current case, it implies projection onto the set $\mathbb{K}$ (step 4) and ${\mathbb{V}}$ (step 5) for the update of variables $\widetilde{{\bm{\mathsf{E}}}}$ and ${\bm{\mathsf{Z}}}$ , respectively. It is to be noted that these updates incorporate an additive term based on the corresponding dual variables. More precisely, the update of $\widetilde{{\bm{\mathsf{E}}}}$ involves the variables ${\bm{\mathsf{A}}}\in{\mathbb{R}}^{J\times 4}$ and ${\bm{\mathsf{C}}}\in{\mathbb{R}}^{N\times 4}$ related to the sparsity prior and the epigraphical constraints, respectively. Similarly, step 5 comprises of the variable ${\bm{\mathsf{B}}}\in{\mathbb{R}}^{N\times 2}$ associated with the epigraphical constraints. These dual variables are in turn updated by the computation of their associated proximity operators, that is the soft-thresholding operator for ${\bm{\mathsf{A}}}$ (step 9) using the definition in equations (34) and (35), and the projections onto the sets ${\mathbb{E}}_{1}$ and ${\mathbb{E}}_{2}$ for the variables ${\bm{\mathsf{C}}}$ and ${\bm{\mathsf{B}}}$ . A detailed description of these updates is provided in Birdi et al. (2018b).

6.2 Convergence properties

The convergence properties of the proposed joint calibration and imaging algorithm for full polarization model (Algorithm 1) can be deduced from Chouzenoux et al. (2016). Particularly, if a finite number of iterations are performed to update each of the variables, that is for every $i^{\text{th}}$ iteration, choosing $L^{(i)}$ and $J^{(i)}$ to be finite in Algorithm 1, and if the

•

blocks $({\bm{\mathsf{E}}}^{(i)},{\bm{\mathsf{U}}}_{1}^{(i)},{\bm{\mathsf{U}}}_{2}^{(i)})$ in Algorithm 1 are estimated at least once in every finite number of given iterations,

•

step sizes $\Gamma_{1}^{(i)}$ and $\Gamma_{2}^{(i)}$ for the gradient steps while updating ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ , respectively are chosen as per equation (31), and

•

step size $\sigma^{(i)}$ for the update of ${\bm{\mathsf{E}}}$ is chosen as per equation (32),

then Algorithm 1 is guaranteed to converge to a critical point $({\bm{\mathsf{E}}}^{\prime},{\bm{\mathsf{U}}}_{1}^{\prime},{\bm{\mathsf{U}}}_{2}^{\prime})$ of the underlying objective function given in equation (21). Additionally, iteration-by-iteration, the value of this objective function decreases.

6.3 Computational complexity

The computational complexity of Algorithm 1 can be analyzed from its two sub-parts: Calibration and Imaging. In the former case, the update of matrix ${\bm{\mathsf{U}}}_{1}$ in steps 6 - 8 (and ${\bm{\mathsf{U}}}_{2}$ in steps 10 - 12) is performed in parallel for $n_{a}$ antennas, using forward-backward (FB) iterations. In this case, the gradient evaluation of the data fidelity term in equation (23) (and in (24)) is required. It involves the degridding operation while generating the matrix $\mathcal{X}_{t,\beta,\alpha}\big{(}\,{{\bm{\mathsf{F}}}}\,\widetilde{{\bm{\mathsf{Z}}}}\widetilde{\mathcal{L}}({{\bm{\mathsf{S}}}_{0}}+{\bm{\mathsf{E}}})\big{)}$ for all $t\in\{1,\ldots,T\}$ and $\beta\in\{1,\ldots,n_{a}\}$ , with $\beta\neq\alpha$ , that turns out to be the most expensive step. However, it is to be noted that for each global calibration iteration, this matrix needs to be computed only once before the inner FB iterations (both for ${\bm{\mathsf{U}}}_{1}$ and ${\bm{\mathsf{U}}}_{2}$ updates).

Regarding the imaging step, the FB strategy presents the computationally most demanding part of the algorithm. In particular, in step 18, the gradient computation (forward step) of the data fidelity term (equation (22)) consists in performing the application of the operator $\Phi$ , followed by its adjoint operator. Furthermore, the evaluation of the proximity operator (backward step) involves sub-iterations, performed either by Algorithm 2 or Algorithm 3. While this iterative process adds to the computational cost, the heaviest steps within either of the algorithms is the application of the wavelet transform operator ${\bm{\Psi}}$ consisting of 8 wavelet bases and Dirac basis to impose sparsity. This can in turn be implemented in a parallel fashion, for each sparsity basis as well as for each of the underlying images.

7 Simulations and results

In this section, we describe the various simulations conducted using a matlab implementation of the proposed algorithm to assess its performance.

To simulate the data sets, we consider VLA antenna’s A configuration. It consists of $n_{a}=27$ antennas with each antenna pair acquiring measurements at $T=200$ snapshots, considering linear feeds. While performing the non-uniform Fourier transform at these sampled frequencies, we make use of the Kaiser-Bessel kernels of size $5\times 5$ for interpolation (Fessler & Sutton, 2003). Furthermore, we perform tests on two sets of model images of size $N=256\times 256$ . The two sets correspond to the images of the radio galaxies Cygnus A and Hydra A, shown respectively in first and second columns of Fig. 3. For each set, Stokes $I$ , $Q$ and $U$ images ( $V=0$ due to negligible circular polarization) are displayed from top to bottom.

Regarding the DDE Fourier kernels, for every antenna $\alpha\in\{1,\ldots,n_{a}\}$ and at each time instant $t\in\{1,\ldots,T\}$ , these kernels are generated randomly in the Fourier domain, with a spatial Fourier support of $K=5\times 5$ and a temporal Fourier support of size $P=3\times 3$ . Furthermore, the values of the Fourier coefficients are chosen as per the discussion in Section 5. More precisely, for the diagonal terms in the Jones matrix, the central coefficients that correspond to the Fourier coefficients of the DIEs have their real and imaginary parts lying in the interval $[1-\theta_{1},1+\theta_{1}]$ and $[-\theta_{1},\theta_{1}]$ , respectively, whereas the other frequency coefficients (related to the DDEs Fourier coefficients) belong to $[-\theta_{1},\theta_{1}]$ with $\theta_{1}=5\times 10^{-2}$ . On the other hand, the central and all the rest coefficients in the off-diagonal terms are considered belonging to $[-\theta_{2},\theta_{2}]$ and $[-\theta_{3},\theta_{3}]$ , respectively. Here we choose $(\theta_{2},\theta_{3})=(5\times 10^{-2},5\times 10^{-4})$ . The chosen values are in line with the VLA characteristics, wherein the leakage terms (i.e. the off-diagonal terms) are $\sim 10^{-2}$ lower in peak amplitude compared to the diagonal terms, representing a leakage of around $5\%$ of the diagonal terms (Bhatnagar et al., 2013).

7.1 Comparisons performed

For an assessment of the proposed algorithm, we perform an extensive study on a varied number of cases based on choices made both on calibration and imaging fronts. From the calibration viewpoint, we compare between different generations of calibration schemes, i.e. 1GC, 2GC and 3GC. Particularly, we compare the results obtained by DDE calibration with those of considering only DIEs. Furthermore, to determine the importance of off-diagonal terms in the Jones matrices, we run tests with and without calibrating for these terms. On the other hand, from the imaging perspective, it is interesting to analyze the performance of different regularizations for the Stokes parameters. Thus, we consider the cases with and without enforcement of the polarization constraint. A blend of these approaches leads to the following list of tests:

(1)

Imaging with normalized DIEs (1GC): Working under the assumption of calibration transfer, the first step is to consider the Jones matrices as identity, without any directional dependency. It is important to mention here that this is the usual case considered by imaging algorithms in RI, either ignoring the calibration effects or relying on the data provided after the calibration transfer. In such a scenario, only imaging step is performed to obtain an estimation of the Stokes images ${\bm{\mathsf{S}}}$ . Furthermore, in this imaging problem, we can consider two different regularizations and hence two sub-cases: Imaging with normalized DIEs (1.a) without and (1.b) with the enforcement of the polarization constraint.

(2)

Joint DIE calibration and imaging (2GC): It involves using the proposed algorithm for calibration and imaging. While calibrating, only the DIEs, i.e. considering support size $K=1$ for Fourier kernels, are solved for. A thresholded version of the images estimated from (1.a) and (1.b) are used respectively to initialize the problems while working in the absence and presence of the polarization constraint. This corresponds to two sub-cases: DIE calibration and imaging (2.a) without and (2.b) with the polarization constraint.

(3)

Joint DDE calibration and imaging (3GC-polarized): This approach follows the proposed algorithm for calibrating the DDEs in conjunction with the imaging step. Similar to (2), the images are initialized from thresholded versions of (1.a) and (1.b), resulting into four cases:

(3.a)

DDE calibration for the Jones matrices, excluding the off-diagonal terms & imaging without the polarization constraint: It corresponds to the case when the calibration steps are applied only to update the diagonal terms of the Jones matrices, without accounting for the polarization constraint in the imaging step.

(3.b)

DDE calibration for the Jones matrices, excluding the off-diagonal terms & imaging with the polarization constraint: Here, the calibration strategy is the same as (3.a), whereas the polarization constraint is considered in the imaging step.

(3.c)

DDE calibration for the whole Jones matrix, including the off-diagonal terms & imaging without the polarization constraint: It consists in calibrating for the full Jones matrix, but without enforcing the polarization constraint in the imaging step.

(3.d)

DDE calibration for the whole Jones matrix, including the off-diagonal terms & imaging with the polarization constraint: This approach comprises of full Jones matrix calibration and imposing the polarization constraint in the imaging step.

7.2 Simulation settings

As previously mentioned, one of the key points for convergence of the proposed algorithm is to update the variables at least once in every finite number of given iterations. To ensure this, within the global loop in Algorithm 1, we perform $L_{\text{cyc}}=2$ global iterations for the DDEs (i.e. steps 5 to 14 are iterated twice) followed by an update of the Stokes images (steps 16 to 21). Nevertheless, after a certain number of global loops, it may happen that the DDE updates stabilize in fewer than $L_{\text{cyc}}$ iterations. Thus, to avoid unnecessary computation, we define a stopping criterion for the DDE updates as the relative variation between the consecutive estimates of DDEs to be less than a threshold $\varepsilon_{\mathsf{U}}$ , that is

[TABLE]

where we set $\varepsilon_{\mathsf{U}}=10^{-5}$ . In other words, if this criterion is met, then no more DDE iterations are performed, and the algorithm leaps to the imaging step using the estimated DDEs. The estimation of the Stokes images adopts a similar strategy wherein iterations in the inner imaging loop are stopped and the algorithm resorts to the calibration step once the following stopping criterion is satisfied:

[TABLE]

with $\varepsilon_{\mathsf{E}}=10^{-5}$ . Furthermore, in order to ensure stopping of the global algorithm at convergence, we define a stopping criterion as the relative variation between the values of the objective function at consecutive iterates to be less than a threshold $\varepsilon_{0}$ , that is

[TABLE]

where $\varphi^{(i)}$ denotes the objective function value for the $i^{\text{th}}$ iteration computed using the updated values $({\bm{\mathsf{E}}}^{(i+1)},{\bm{\mathsf{U}}}_{1}^{(i+1)},{\bm{\mathsf{U}}}_{2}^{(i+1)})$ and $\varepsilon_{0}$ is fixed to $2\times 10^{-2}$ . Regarding the choice of number of iterations $L^{(i)}$ and $J^{(i)}$ for calibration and imaging inner loops, respectively, we choose $L^{(i)}=5$ when DDE updates need to be performed and $J^{(i)}=100$ when imaging step is carried out. Furthermore, for each of the above mentioned cases, we run 5 simulations varying the DIEs/DDEs and noise realizations.

For quantitative comparison of the results obtained by different tests, we use signal-to-noise ratio (SNR) as a metric. Given an original image $\overline{\bm{s}}\in{\mathbb{R}}^{N}$ , the SNR of the reconstructed image $\bm{s}^{\prime}$ with respect to $\overline{\bm{s}}$ is defined as

[TABLE]

where $\upsilon^{\prime}=\underset{\upsilon>0}{\operatorname{argmin}}\,\,\|\upsilon\bm{s}^{\prime}-\overline{\bm{s}}\|_{2}^{2}$ accounts for the ambiguity problem in the underlying blind deconvolution problem. Furthermore, the values of the regularization parameters $\bm{\eta}=[\eta_{1},\eta_{2},\eta_{3}]$ and $\gamma$ are chosen to maximize the SNR.

We also report the dynamic range (DR) obtained for the reconstructed images. For every image contained in the estimated Stokes matrix ${\bm{\mathsf{S}}}^{\prime}={\bm{\mathsf{S}}}_{0}+{\bm{\mathsf{E}}}^{\prime}$ , with $(i,j)\in\{1,2\}^{2}$ it is defined as follows:

[TABLE]

7.3 Results and analysis

We now present the results obtained by conducting tests for the various cases mentioned before. In terms of quantitative comparison between the different cases performed, Table 1 provides the obtained SNR and dynamic range values for both the sets of images. In each case, the mean values evaluated over the 5 performed simulations are shown. From the calibration front, it can be noticed that joint DDE calibration and imaging leads to a significant improvement in the reconstruction quality, both in terms of SNR and DR, in comparison with either of the cases when only imaging step is performed considering normalized DIEs or jointly calibrating for DIEs and imaging. In particular, the case of normalized DIEs considers Jones matrices to be identity, whereas the DIE calibration scheme also accounts for the DIEs in the off-diagonal Jones terms and is thus further affected by the estimation of these terms. In fact, these terms lead to flux leakage from one Stokes parameter to others and if not perfectly calibrated, can lead to error propagation. It particularly affects the low amplitude Stokes $Q$ and $U$ images. This can be seen from the smaller SNR values of Stokes $Q$ image than that obtained by the case of normalized DIEs (Table 1(a)). Another observation that can be made for these two cases is regarding the higher DR values obtained for Stokes $Q$ and $U$ images than that of Stokes $I$ image. This non-physical feature can be attributed to the usage of incomplete measurement model (i.e. without the incorporation of the DDEs), severely affecting the underlying non-convex minimization problem giving inaccurate solutions. The aforementioned observations for these cases further indicate the necessity of taking direction-dependent terms into account and performing their calibration along with imaging for a better reconstruction quality. Second, comparing the results obtained with and without calibrating for the off-diagonal terms in the Jones matrices shows better SNR and DR values are achieved in the former case, thereby demonstrating the significance of calibrating for the off-diagonal Jones terms to recover high quality, high dynamic range images.

On the other hand, on the imaging front, it is interesting to note that when the image recovery is performed either using only normalized DIEs or calibrating for DIEs, i.e. having inexact knowledge of the Jones matrices and neglecting the DDEs, regularizing the underlying problem with the polarization constraint may not be very effective. Indeed, in such a case, not enforcing the constraint can produce slightly better results, as seen for Cygnus A images. Nevertheless, when the DDE calibration scheme is incorporated, the result analysis highlights the superior performance of this constraint, recovering images with much higher SNR and DR compared to the case where it is not imposed. Especially in the case of model images for Hydra A, the enforcement of the polarization constraint leads to an appreciable improvement of $\sim 9-10$ dBs in SNR of Stokes $Q$ and $U$ images (Table 1(c)) and around one order of magnitude in DR (Table 1(d)). It should be noted that Hydra A images have lesser amplitude than the first set of images (Cygnus A). This is true specifically for Stokes $Q$ and $U$ images which are around one order of magnitude lower in amplitude than the corresponding Stokes $I$ image and hence, difficult to be recovered. In this respect, the obtained results emphasize the crucial role played by suitably chosen regularization prior in enhancing the reconstruction quality. In fact, if the problem is not well regularized (i.e. the polarization constraint is not enforced), the solution of the underlying non-convex joint DDE calibration and imaging problem may stuck in a local minimum leading to poor reconstruction quality. The good performance of the polarization constraint to recover these images shows suitability of this prior for full polarization imaging. This validates the findings of Birdi et al. (2018b) and further extends them to the case of joint calibration and imaging.

The above listed observations are further supported by the visual inspection of the recovered images and the associated absolute error images, shown in Figs. 4-9. While Figs. 4, 5 and 6 display the Cygnus A images for Stokes $I$ , $Q$ and $U$ , respectively, the corresponding images for Hydra A are shown in Figs. 7, 8 and 9. The error images are obtained by computing the absolute difference between the ground truth and reconstructed images. In each of these figures, the recovered images (first and third columns) followed by their absolute error images (second and fourth columns) are shown. The shown images correspond to the reconstructions (and the associated error images) obtained when imaging is performed with: (second row) normalized DIEs, (1.a) without polarization constraint and (1.b) with polarization constraint, and (third row) DIE calibration (2.a) without and (2.b) with the enforcement of the polarization constraint. Similarly, the reconstructed images (and the associated error images) for the case of joint DDE calibration and imaging, excluding the off-diagonal terms and performing full Jones matrix calibration are presented respectively in third and fourth rows, both for (3.a) (and (3.c)) without polarization constraint and (3.b) (and (3.d)) with polarization constraint. It can be observed that joint DDE calibration and imaging offers remarkable advantage over considering only DIEs, mitigating the artefacts occurring because of the calibration errors. Moreover, calibration of the off-diagonal terms is crucial for high dynamic range imaging by diminishing the diffused calibration artefacts in the background. In addition to these remarks, the quality of reconstruction is promoted by considering the polarization constraint in the reconstruction process. While the enforcement of this constraint yields lesser residual in the error images, a careful examination of the recovered images also indicates that this prior is able to produce highly resolved images with finer details as opposed to the case when the constraint is not taken into account.

8 Conclusions

We have presented a joint calibration and imaging technique taking into account the full polarization model for radio interferometry. The proposed technique, dubbed Polca SARA, unifies the estimation of the DDEs for the full Jones matrix and the Stokes images of interest within a global algorithmic structure, exploiting the same optimization framework for both calibration and imaging. In particular, it solves the underlying non-convex minimization problem employing a block-coordinate forward-backward algorithm, thereby following a forward-backward scheme for estimation of each of the variables. The matlab code of the proposed method will soon be made available on GitHub (https://basp-group.github.io/Polca-SARA/).

While our approach is shipped with convergence guarantees, it can also be adapted to incorporate suitable regularization priors for the variables under consideration. Thanks to this flexibility, for the imaging step in the global algorithm, we have employed an approach specifically developed for full Stokes imaging enforcing the physical polarization constraint (Birdi et al., 2018b). These features offered by our method are in contrast with the existing calibration and imaging algorithms in RI which (i) do not benefit from global convergence and (ii) use in fact Stokes $I$ imaging based techniques even for polarimetric imaging. The significance of the latter remark is further highlighted by the results obtained from various numerical simulations performed, achieving better SNR and higher dynamic range with the enforcement of the polarization constraint. Furthermore, in terms of the calibration, the results have shown the importance of calibrating for full Jones matrix, including DDEs and off-diagonal terms, to mitigate the artefacts appearing otherwise in the reconstructed images. Calibration of off-diagonal terms in Jones matrices, denoting the polarization leakage, has been shown to be particularly crucial to obtain high dynamic range images.

Although in this first presentation of the proposed algorithm we have dealt with simulated data, the good performance obtained by this approach unveil it as a promising joint calibration and imaging technique in RI. Keeping this in mind, we plan to investigate its performance on real RI data sets in near future. We further note that in practice, the polarization angle deduced from reconstructed Stokes $Q$ and $U$ images can suffer from a unitary rotational ambiguity in the measurement model (Hamaker, 2000; Smirnov, 2011). A way to tackle this problem could be to add in the minimization problem some prior information about this rotation, that can be obtained from external calibration.

Another possible direction of work is to extend the proposed framework to account for hyperspectral data. While on the calibration part it would involve incorporation of DIEs/DDEs variation with observation frequency (Dabbech et al., 2019), the imaging part would need to be extended for wide-band polarimetric imaging, leveraging the adaptability offered by the currently developed approach.

Acknowledgements

We would like to thank R. Perley from NRAO for providing the two sets of images used for simulation studies. This work was supported by the UK Engineering and Physical Sciences Research Council (EPSRC, grant EP/M011089/1).

Appendix A Table of notations

A list of useful notations and algorithmic parameters to be configured are provided in Table 2.

Bibliography53

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1Akiyama et al. (2017) Akiyama K., et al., 2017, Ap J, 838, 1
2Bhatnagar & Cornwell (2004) Bhatnagar S., Cornwell T. J., 2004, A&A, 426, 747
3Bhatnagar & Cornwell (2017) Bhatnagar S., Cornwell T., 2017, The Astronomical Journal, 154, 197
4Bhatnagar et al. (2013) Bhatnagar S., Rau U., Golap K., 2013, Ap J, 770, 91
5Birdi et al. (2018 a) Birdi J., Repetti A., Wiaux Y., 2018 a, in 2018 IEEE 10th Sensor Array and Multichannel Signal Processing Workshop (SAM). IEEE, pp 465–469
6Birdi et al. (2018 b) Birdi J., Repetti A., Wiaux Y., 2018 b, MNRAS, 478, 4442
7Candès et al. (2006) Candès E., Romberg J., Tao T., 2006, IEEE Trans. Inf. Theory, 52, 489
8Candès et al. (2008) Candès E., Wakin M., Boyd S., 2008, J. Fourier Anal. Appl., 14, 877

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Polca SARA - Full polarization, direction-dependent calibration and sparse imaging for radio interferometry

Abstract

keywords:

1 Introduction

2 Full polarization observation model

3 A brief overview of the proposed method

4 Imaging problem

5 Calibration problem

6 Proposed calibration &\&& imaging approach

6.1 Proposed algorithm

Calibration:

Imaging:

Regularization without polarization constraint:

Regularization with polarization constraint:

6.2 Convergence properties

6.3 Computational complexity

7 Simulations and results

7.1 Comparisons performed

7.2 Simulation settings

7.3 Results and analysis

8 Conclusions

Acknowledgements

Appendix A Table of notations

6 Proposed calibration $\&$ imaging approach