Source counts and confusion at 72-231 MHz in the MWA GLEAM survey

T. M. O. Franzen; T. Vernstrom; C. A. Jackson; N. Hurley-Walker; R. D. Ekers; G. Heald; N. Seymour; and S. V. White

arXiv:1812.00666·astro-ph.GA·February 20, 2019

TL;DR

This paper presents accurate source counts from the GLEAM survey at 72-231 MHz, revealing discrepancies with models and highlighting confusion noise issues, with implications for future low-frequency radio surveys.

Contribution

It provides the most accurate low-frequency source counts to date from GLEAM, compares them with models, and discusses confusion noise limitations and improvements in data processing.

Findings

01

Source counts are more accurate due to large sky coverage and sensitivity to extended emission.

02

No flattening of the average spectral index (alpha approx. -0.8) is seen towards lower frequencies for sources brighter than 0.5 Jy at 154 MHz.

03

Sidelobe confusion dominates both thermal noise and classical confusion at frequencies above ~100 MHz.

Abstract

The GaLactic and Extragalactic All-sky MWA survey (GLEAM) is a radio continuum survey at 72-231 MHz of the whole sky south of declination +30 deg, carried out with the Murchison Widefield Array (MWA). In this paper, we derive source counts from the GLEAM data at 200, 154, 118 and 88 MHz, to a flux density limit of 50, 80, 120 and 290 mJy respectively, correcting for ionospheric smearing, incompleteness and source blending. These counts are more accurate than other counts in the literature at similar frequencies as a result of the large area of sky covered and this survey's sensitivity to extended emission missed by other surveys. At S_154MHz > 0.5 Jy, there is no evidence of flattening in the average spectral index (alpha approx. -0.8 where S proportional to nu^alpha) towards the lower frequencies. We demonstrate that the SKA Design Study (SKADS) model by Wilman et al. (2008)…

Tables

Table 1. Summary of sky regions excised from the GLEAM survey used in the analyses of this paper. The top row indicates the total surveyed area in GLEAM. The GLEAM catalogue area covers 24,831 $\mathrm{deg}^{2}$ and consists of the total surveyed area excluding the regions listed in the middle rows. The peeled sources are Hydra A, Pictor A, Hercules A, Virgo A, Crab, Cygnus A and Cassiopeia A; their positions are listed in Hurley-Walker et al. (2017).

Description	Region	Area ( $\deg^{2}$ )
Total surveyed area	$Dec < + 30^{\circ}$	30,940
Galactic plane	Absolute Galactic latitude $< 10^{\circ}$	4,776
Ionospherically distorted	$0^{\circ} < Dec < + 30^{\circ}$ & $22^{h} < RA < 0^{h}$	859
Centaurus A	$13^{h} 25^{m} 28^{s} - 43^{\circ} 01^{'} 09^{''}$ , $r = 9^{\circ}$	254
Sidelobe reflection of Cen A	$13^{h} 07^{m} < RA < 13^{h} 53^{m}$ & $20^{\circ} < Dec < + 30^{\circ}$	104
Large Magellanic Cloud	$05^{h} 23^{m} 35^{s} - 69^{\circ} 45^{'} 22^{''}$ , $r = {5.5}^{\circ}$	95
Small Magellanic Cloud	$00^{h} 52^{m} 38^{s} - 72^{\circ} 48^{'} 01^{''}$ , $r = {2.5}^{\circ}$	20
Peeled sources	Radius of 10 arcmin	$< 1$
GLEAM catalogue area (region A)		24,831
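
As a rough consistency check on Table 1 (not part of the original analysis), the sketch below recomputes the areas that follow from simple spherical geometry and redoes the subtraction. It assumes the circular regions are ideal spherical caps and takes the Galactic plane, ionospheric and sidelobe areas directly from the table; the function names are illustrative only.

```python
import math

SQ_DEG_PER_SR = (180.0 / math.pi) ** 2  # square degrees in one steradian


def cap_area_deg2(radius_deg):
    """Solid angle of a spherical cap with the given angular radius, in deg^2."""
    return 2.0 * math.pi * (1.0 - math.cos(math.radians(radius_deg))) * SQ_DEG_PER_SR


def area_south_of_dec_deg2(dec_limit_deg):
    """Sky area south of a declination limit, in deg^2."""
    return 2.0 * math.pi * (1.0 + math.sin(math.radians(dec_limit_deg))) * SQ_DEG_PER_SR


total = area_south_of_dec_deg2(30.0)  # Dec < +30 deg
cen_a = cap_area_deg2(9.0)            # Centaurus A, r = 9 deg
lmc = cap_area_deg2(5.5)              # Large Magellanic Cloud, r = 5.5 deg
smc = cap_area_deg2(2.5)              # Small Magellanic Cloud, r = 2.5 deg

# Excised areas as tabulated (deg^2); the peeled sources (< 1 deg^2) are ignored here.
excised = [4776, 859, 254, 104, 95, 20]
remaining = 30940 - sum(excised)

print(round(total), round(cen_a), round(lmc), round(smc), remaining)
```

The geometric areas round to the tabulated 30,940, 254, 95 and 20 $\mathrm{deg}^{2}$, and the subtraction leaves 24,832 $\mathrm{deg}^{2}$, within 1 $\mathrm{deg}^{2}$ of the quoted catalogue area once the peeled sources and rounding are allowed for.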

Table 2. Source finding statistics in region A, covering 24,831 $\mathrm{deg}^{2}$. For the 5$\sigma$ detection threshold, and PSF major and minor axes, we quote the mean and standard deviation. Sources are classified as extended as described in Section 4.1.

Property	$ν = 200$ MHz	$ν = 154$ MHz	$ν = 118$ MHz	$ν = 88$ MHz
5 $σ$ detection threshold (mJy/bm)	$56 \pm 37$	$84 \pm 45$	$137 \pm 68$	$265 \pm 112$
Number of sources	307,455	254,072	195,821	131,250
Percentage extended	7.3	7.3	6.3	6.0
PSF major axis (arcsec)	$144 \pm 16$	$176 \pm 24$	$229 \pm 29$	$313 \pm 36$
PSF minor axis (arcsec)	$132 \pm 5$	$159 \pm 6$	$209 \pm 8$	$287 \pm 12$
Source density ( $\deg^{- 2}$ )	12.4	10.2	7.9	5.3
Number of beams/source	49	40	30	24

Table 3. Region B used to measure the source counts.

RA range	Dec range	Combined area ( $\deg^{2}$ )
$10^{h} 00^{m} < α < 12^{h} 30^{m}$	$- 40^{\circ} < δ < - 10^{\circ}$	6,516.2
$21^{h} 00^{m} < α < 06^{h} 15^{m}$	$- 60^{\circ} < δ < - 10^{\circ}$	

Table 4. Key parameters recorded for three different runs of wsclean on a 154 MHz snapshot image (see text for details). The theoretical noise limit is 12.7 mJy/beam.

wsclean version	Image size (pixels)	Number of CLEAN iterations	$σ_{obs}$ (mJy/beam)	Processing time (hours)
1.10	4000	$\approx 25,000$	26.3	5.0
2.5	4000	$\approx 193,000$	18.6	1.2
2.5	6000	$\approx 500,000$	14.7	10.1
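
To make the comparison in Table 4 concrete, the following sketch expresses each observed rms as a multiple of the 12.7 mJy/beam theoretical limit. The second function estimates the excess noise under the assumption, which is ours and not stated in the table, that the excess adds in quadrature with the theoretical noise.

```python
import math

THEORETICAL_NOISE = 12.7  # mJy/beam, from the Table 4 caption

# (wsclean version, image size in pixels, observed rms in mJy/beam), from Table 4
runs = [("1.10", 4000, 26.3), ("2.5", 4000, 18.6), ("2.5", 6000, 14.7)]


def noise_ratio(sigma_obs, sigma_theory=THEORETICAL_NOISE):
    """Observed rms as a multiple of the theoretical noise limit."""
    return sigma_obs / sigma_theory


def excess_noise(sigma_obs, sigma_theory=THEORETICAL_NOISE):
    """Excess rms in mJy/beam, ASSUMING it adds in quadrature with the theoretical noise."""
    return math.sqrt(sigma_obs**2 - sigma_theory**2)


for version, size, sigma_obs in runs:
    print(version, size, round(noise_ratio(sigma_obs), 2), round(excess_noise(sigma_obs), 1))
```

Under these assumptions the three runs sit at roughly 2.07, 1.46 and 1.16 times the theoretical limit, so the last configuration comes within about 16 per cent of it.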

Table 5. Euclidean normalised differential source counts for GLEAM at 200, 154, 118 and 88 MHz. The bin centre corresponds to the mean flux density of all sources in the bin. The quoted counts are corrected for incompleteness, Eddington bias and source blending as described in the text; the correction factor for each bin is provided for reference.

Frequency (MHz)	Bin start $S$ (Jy)	Bin end $S$ (Jy)	Bin centre $S$ (Jy)	Raw number of sources, $N$	Euclidean normalised counts ( ${Jy}^{3 / 2} {sr}^{- 1}$ )	Correction factor	Region
200	0.044	0.055	0.0493	13864	$378 \pm 8$	$1.10 \pm 0.02$	B
	0.055	0.069	0.0616	12919	$465 \pm 10$	$1.06 \pm 0.02$	B
	0.069	0.086	0.0771	11339	$575 \pm 12$	$1.04 \pm 0.02$	B
	0.086	0.107	0.0959	10210	$711 \pm 16$	$1.02 \pm 0.02$	B
	0.107	0.134	0.1199	28801	$802 \pm 15$	$1.14 \pm 0.02$	A
	0.134	0.168	0.1501	26880	$965 \pm 19$	$1.06 \pm 0.02$	A
	0.168	0.210	0.1879	23025	$1137 \pm 23$	$1.03 \pm 0.02$	A
	0.210	0.262	0.2343	19690	$1342 \pm 28$	$1.01 \pm 0.02$	A
	0.262	0.328	0.2928	16810	$1541 \pm 14$	$0.99 \pm 0.01$	A
	0.328	0.410	0.3664	13791	$1774 \pm 18$	$0.98 \pm 0.01$	A
	0.410	0.512	0.4571	11041	$1976 \pm 21$	$0.98 \pm 0.01$	A
	0.512	0.640	0.5712	8721	$2178 \pm 27$	$0.98 \pm 0.01$	A
	0.640	0.800	0.7120	6786	$2353 \pm 33$	$0.98 \pm 0.01$	A
	0.800	1.000	0.8912	5190	$2494 \pm 39$	$0.97 \pm 0.01$	A
	1.000	1.250	1.1160	4009	$2736 \pm 46$	$0.98 \pm 0.01$	A
	1.250	1.560	1.3909	2971	$2812 \pm 55$	$0.97 \pm 0.01$	A
	1.560	1.950	1.7417	2094	$2796 \pm 65$	$0.98 \pm 0.01$	A
	1.950	2.440	2.1702	1520	$2825 \pm 76$	$0.99 \pm 0.01$	A
	2.440	3.050	2.7023	1124	$2848 \pm 89$	$0.97 \pm 0.01$	A
	3.050	3.820	3.3927	701	$2561 \pm 109$	$1.00 \pm 0.02$	A
	3.820	4.770	4.2372	461	$2361 \pm 116$	$1.00 \pm 0.02$	A
	4.770	5.960	5.2773	333	$2314 \pm 129$	$0.98 \pm 0.01$	A
	5.960	7.450	6.5870	232	$2269 \pm 152$	$0.99 \pm 0.01$	A
	7.450	9.310	8.2935	152	$2165 \pm 181$	$1.01 \pm 0.02$	A
	9.310	11.600	10.3344	101	$2001 \pm 199$	-	A
	11.600	14.600	13.0278	61	$1646 \pm 211$	-	A
	14.600	18.200	16.0634	55	$2088 \pm 282$	-	A
	18.200	22.700	20.4399	25	$1387 \pm 277$	-	A
	22.700	28.400	24.6357	12	$838 \pm 242$	-	A
	28.400	56.800	41.3552	20	$1024 \pm 229$	-	A
	56.800	113.700	75.7906	3	$348_{- 189}^{+ 341}$	-	A
154	0.069	0.086	0.0772	11193	$601 \pm 12$	$1.09 \pm 0.02$	B
	0.086	0.107	0.0959	10216	$760 \pm 16$	$1.09 \pm 0.02$	B
	0.107	0.134	0.1198	9409	$909 \pm 20$	$1.04 \pm 0.02$	B
	0.134	0.168	0.1502	27363	$1071 \pm 20$	$1.15 \pm 0.02$	A
	0.168	0.210	0.1879	25309	$1285 \pm 26$	$1.05 \pm 0.02$	A
	0.210	0.262	0.2346	22288	$1529 \pm 32$	$1.01 \pm 0.02$	A
	0.262	0.328	0.2930	19560	$1800 \pm 16$	$0.99 \pm 0.01$	A
	0.328	0.410	0.3664	16366	$2099 \pm 20$	$0.98 \pm 0.01$	A
	0.410	0.512	0.4577	13462	$2420 \pm 27$	$0.98 \pm 0.01$	A
	0.512	0.640	0.5713	10764	$2674 \pm 31$	$0.97 \pm 0.01$	A
	0.640	0.800	0.7136	8553	$2933 \pm 37$	$0.96 \pm 0.01$	A
	0.800	1.000	0.8906	6629	$3194 \pm 44$	$0.97 \pm 0.01$	A
	1.000	1.250	1.1148	5097	$3465 \pm 53$	$0.98 \pm 0.01$	A
	1.250	1.560	1.3935	3806	$3578 \pm 62$	$0.96 \pm 0.01$	A
	1.560	1.950	1.7365	2849	$3783 \pm 76$	$0.99 \pm 0.01$	A
	1.950	2.440	2.1738	2022	$3690 \pm 88$	$0.97 \pm 0.01$	A
	2.440	3.050	2.7123	1501	$3869 \pm 106$	$0.98 \pm 0.01$	A
	3.050	3.820	3.3869	1106	$3932 \pm 124$	$0.98 \pm 0.01$	A
	3.820	4.770	4.2445	651	$3311 \pm 138$	$0.98 \pm 0.01$	A
	4.770	5.960	5.3171	457	$3263 \pm 158$	$0.99 \pm 0.01$	A
	5.960	7.450	6.6146	316	$3097 \pm 178$	$0.98 \pm 0.01$	A
	7.450	9.310	8.2454	223	$3053 \pm 209$	$0.99 \pm 0.01$	A
	9.310	11.600	10.3656	133	$2656 \pm 230$	-	A
	11.600	14.600	12.8897	104	$2733 \pm 268$	-	A
	14.600	18.200	16.4515	56	$2257 \pm 302$	-	A
	18.200	22.700	19.7413	49	$2492 \pm 356$	-	A
	22.700	28.400	25.2264	30	$2223 \pm 406$	-	A
	28.400	56.800	40.6881	24	$1180 \pm 241$	-	A
	56.800	113.700	75.3991	7	$803_{- 296}^{+ 434}$	-	A
118	0.107	0.134	0.1202	9139	$994 \pm 20$	$1.16 \pm 0.02$	B
	0.134	0.168	0.1500	8434	$1226 \pm 26$	$1.12 \pm 0.02$	B
	0.168	0.210	0.1880	7383	$1478 \pm 32$	$1.09 \pm 0.02$	B
	0.210	0.262	0.2351	21261	$1720 \pm 31$	$1.19 \pm 0.02$	A
	0.262	0.328	0.2932	20304	$2068 \pm 41$	$1.09 \pm 0.02$	A
	0.328	0.410	0.3665	18032	$2440 \pm 51$	$1.03 \pm 0.02$	A
	0.410	0.512	0.4577	15535	$2862 \pm 62$	$1.00 \pm 0.02$	A
	0.512	0.640	0.5717	13234	$3309 \pm 35$	$0.98 \pm 0.01$	A
	0.640	0.800	0.7140	10612	$3629 \pm 46$	$0.96 \pm 0.01$	A
	0.800	1.000	0.8913	8608	$4101 \pm 56$	$0.96 \pm 0.01$	A
	1.000	1.250	1.1157	6688	$4477 \pm 69$	$0.96 \pm 0.01$	A
	1.250	1.560	1.3932	4998	$4599 \pm 76$	$0.94 \pm 0.01$	A
	1.560	1.950	1.7374	3797	$4961 \pm 90$	$0.97 \pm 0.01$	A
	1.950	2.440	2.1728	2868	$5141 \pm 105$	$0.95 \pm 0.01$	A
	2.440	3.050	2.7193	1973	$5031 \pm 121$	$0.96 \pm 0.01$	A
	3.050	3.820	3.3995	1500	$5398 \pm 148$	$0.98 \pm 0.01$	A
	3.820	4.770	4.2397	1043	$5263 \pm 175$	$0.98 \pm 0.01$	A
	4.770	5.960	5.2943	627	$4298 \pm 184$	$0.96 \pm 0.01$	A
	5.960	7.450	6.6425	463	$4670 \pm 238$	$1.00 \pm 0.02$	A
	7.450	9.310	8.3012	305	$4243 \pm 259$	$0.99 \pm 0.02$	A
	9.310	11.600	10.3228	210	$4150 \pm 286$	-	A
	11.600	14.600	13.0154	138	$3716 \pm 316$	-	A
	14.600	18.200	16.1551	83	$3197 \pm 351$	-	A
	18.200	22.700	20.4050	70	$3867 \pm 462$	-	A
	22.700	28.400	24.7647	37	$2618 \pm 430$	-	A
	28.400	56.800	35.5961	50	$1759 \pm 249$	-	A
	56.800	113.700	75.5865	13	$1500_{- 410}^{+ 543}$	-	A
88	0.262	0.328	0.2929	5937	$2413 \pm 52$	$1.15 \pm 0.02$	B
	0.328	0.410	0.3666	5329	$3038 \pm 68$	$1.14 \pm 0.02$	B
	0.410	0.512	0.4578	4802	$3585 \pm 85$	$1.07 \pm 0.02$	B
	0.512	0.640	0.5733	14344	$3992 \pm 81$	$1.08 \pm 0.02$	A
	0.640	0.800	0.7149	12701	$4580 \pm 99$	$1.01 \pm 0.02$	A
	0.800	1.000	0.8933	10527	$5158 \pm 69$	$0.98 \pm 0.01$	A
	1.000	1.250	1.1148	8722	$5791 \pm 83$	$0.96 \pm 0.01$	A
	1.250	1.560	1.3943	6864	$6362 \pm 98$	$0.95 \pm 0.01$	A
	1.560	1.950	1.7392	5309	$6653 \pm 116$	$0.93 \pm 0.01$	A
	1.950	2.440	2.1731	3947	$6937 \pm 125$	$0.94 \pm 0.01$	A
	2.440	3.050	2.7171	2947	$7211 \pm 154$	$0.93 \pm 0.01$	A
	3.050	3.820	3.3946	2131	$7448 \pm 190$	$0.96 \pm 0.01$	A
	3.820	4.770	4.2569	1537	$7604 \pm 220$	$0.95 \pm 0.01$	A
	4.770	5.960	5.3082	1061	$7474 \pm 250$	$0.98 \pm 0.01$	A
	5.960	7.450	6.5933	658	$6216 \pm 261$	$0.95 \pm 0.01$	A
	7.450	9.310	8.2753	471	$6297 \pm 344$	$0.95 \pm 0.03$	A
	9.310	11.600	10.3467	317	$6218 \pm 367$	$0.99 \pm 0.02$	A
	11.600	14.600	12.9277	227	$6010 \pm 399$	-	A
	14.600	18.200	16.2550	139	$5437 \pm 461$	-	A
	18.200	22.700	20.1788	81	$4352 \pm 484$	-	A
	22.700	28.400	25.7641	67	$5235 \pm 640$	-	A
	28.400	56.800	36.8540	86	$3300 \pm 356$	-	A
	56.800	113.700	78.1978	15	$1884_{- 480}^{+ 624}$	-	A
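
The entries in Table 5 can be approximately reproduced from the raw numbers. The sketch below assumes that the Euclidean normalised count in a bin is $S^{2.5}\,N/(\Delta S\,\Omega)$ evaluated at the bin centre and multiplied by the tabulated correction factor, with $\Omega$ the area of region A (24,831 $\mathrm{deg}^{2}$) or region B (6,516.2 $\mathrm{deg}^{2}$). This is our reading of the table, not a description of the authors' pipeline, and the function name is illustrative.

```python
import math

SR_PER_SQ_DEG = (math.pi / 180.0) ** 2  # steradians in one square degree


def euclidean_normalised_count(n_raw, s_lo, s_hi, s_centre, area_deg2, correction=1.0):
    """Return S^2.5 dN/dS in Jy^1.5 sr^-1 for a single flux density bin."""
    area_sr = area_deg2 * SR_PER_SQ_DEG
    dn_ds = n_raw / ((s_hi - s_lo) * area_sr)  # sources per Jy per sr
    return correction * s_centre**2.5 * dn_ds


# 200 MHz, 0.262-0.328 Jy bin, region A (values copied from Table 5)
count_a = euclidean_normalised_count(16810, 0.262, 0.328, 0.2928, 24831.0, 0.99)

# 200 MHz, 0.107-0.134 Jy bin, region A
count_b = euclidean_normalised_count(28801, 0.107, 0.134, 0.1199, 24831.0, 1.14)

print(round(count_a), round(count_b))
```

Both values land within about 1 per cent of the tabulated 1541 and 802 ${Jy}^{3/2}\,{sr}^{-1}$; the residual difference is consistent with the bin edges being rounded in the table.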

Equations

R = \frac{a_{\mathrm{psf}} b_{\mathrm{psf}}}{a_{\mathrm{rst}} b_{\mathrm{rst}}},

S_{\mathrm{w}} = \sum_{i=1}^{N} w_{i} S_{0} \left(\frac{\nu_{i}}{\nu_{0}}\right)^{\alpha},

\frac{S_{0}}{S_{\mathrm{w}}} = \left[\sum_{i=1}^{N} w_{i} \left(\frac{\nu_{i}}{\nu_{0}}\right)^{\alpha}\right]^{-1}.

\ln\left(\frac{S}{S_{\mathrm{peak}}}\right) > 2 \sqrt{\left(\frac{\sigma_{S}}{S}\right)^{2} + \left(\frac{\sigma_{S_{\mathrm{peak}}}}{S_{\mathrm{peak}}}\right)^{2}}.

\chi^{2} = \sum_{i=1}^{N} w_{i} \left[n_{i,200} - y\, n_{154}\left(\frac{S_{i,200}}{x}\right)\right]^{2},

w_{i} = \begin{cases} \left[\sigma_{n_{i,200}}^{2} + \sigma_{n_{154}(S_{i,200}/x)}^{2}\right]^{-1} & \mathrm{if}~\frac{S_{i,200}}{x} > 0.5~\mathrm{Jy}, \\ 0 & \mathrm{otherwise}. \end{cases}

\sigma_{t} = \frac{2 k_{B} T}{A_{\mathrm{eff}} \epsilon_{c}} \frac{1}{\sqrt{\tau B n_{p} N (N - 1)}},

R(x)\, dx = \int_{\Omega} \frac{dN}{dS}\left(\frac{x}{B(\theta, \phi)}\right) B(\theta, \phi)^{-1}\, d\Omega\, dx,

P(D) = F^{-1}[p(\omega)],

p(\omega) = \exp\left[\int_{0}^{\infty} R(x) \exp(i \omega x)\, dx - \int_{0}^{\infty} R(x)\, dx\right].

\log_{10}\left(S^{2.5} \frac{dN}{dS}\right) = \sum_{i=0}^{5} a_{i} \left[\log_{10}(S)\right]^{i},

P_{c}(D) * P_{n}(D) = F^{-1}\left[p(\omega) \exp\left(\frac{-\sigma_{t}^{2} \omega^{2}}{2}\right)\right].

\frac{\sum_{i=1}^{5} A_{i}\, P_{c}(D) * P_{n,i}(D)}{\sum_{i=1}^{5} A_{i}},

\sigma_{s} \approx \sigma_{c} B_{\mathrm{rms}} \frac{\Omega_{P}}{\Omega_{B}},

\frac{\sigma_{s,\mathrm{Phase\,1}}}{\sigma_{s,\mathrm{Phase\,2}}} \gtrsim \frac{\sigma_{c,\mathrm{Phase\,1}}}{\sigma_{c,\mathrm{Phase\,2}}}.

Full text

PASA

Source counts and confusion at 72–231 MHz in the MWA GLEAM survey

T. M. O. Franzen1,2,3

T. Vernstrom4

C. A. Jackson1,5,3

N. Hurley-Walker1

R. D. Ekers1

G. Heald2

N. Seymour1

and S. V. White1

1International Centre for Radio Astronomy Research, Curtin University, Bentley, WA 6102, Australia

2CSIRO Astronomy and Space Science, PO Box 1130, Bentley WA 6102, Australia

3ASTRON, Netherlands Institute for Radio Astronomy, Oude Hoogeveensedijk 4, 7991 PD, Dwingeloo, The Netherlands

4Dunlap Institute for Astronomy and Astrophysics, University of Toronto, ON, M5S 3H4, Canada

5ARC Centre of Excellence for All-sky Astrophysics (CAASTRO)

Abstract

The GaLactic and Extragalactic All-sky MWA survey (GLEAM) is a radio continuum survey at 72–231 MHz of the whole sky south of declination $+30^{\circ}$ , carried out with the Murchison Widefield Array (MWA). In this paper, we derive source counts from the GLEAM data at 200, 154, 118 and 88 MHz, to a flux density limit of 50, 80, 120 and 290 mJy respectively, correcting for ionospheric smearing, incompleteness and source blending. These counts are more accurate than other counts in the literature at similar frequencies as a result of the large area of sky covered and this survey’s sensitivity to extended emission missed by other surveys. At $S_{154~{}\mathrm{MHz}}>0.5$ Jy, there is no evidence of flattening in the average spectral index ( $\alpha\approx-0.8$ where $S\propto\nu^{\alpha}$ ) towards the lower frequencies. We demonstrate that the SKA Design Study (SKADS) model by Wilman et al. significantly underpredicts the observed 154 MHz GLEAM counts, particularly at the bright end. Using deeper LOFAR counts and the SKADS model, we find that sidelobe confusion dominates the thermal noise and classical confusion at $\nu\gtrsim 100$ MHz due to both the limited CLEANing depth and undeconvolved sources outside the field-of-view. We show that we can approach the theoretical noise limit using a more efficient and automated CLEAN algorithm.

keywords:

galaxies: active — galaxies: statistics — radio continuum: galaxies — surveys — techniques: image processing

1 Introduction

Differential radio source counts are important because they constrain the nature and evolution of extragalactic sources, and unlike luminosity functions, do not require redshifts. They have to date been best studied at 1.4 GHz. At the highest flux densities ( $S\gtrsim 10$ Jy), the 1.4-GHz Euclidean normalised differential counts, $\frac{dN}{dS}S^{2.5}$ , show a flattened region, as expected in a static, non-evolving (‘Euclidean’) Universe. Below $\sim 10$ Jy, the counts rise with decreasing flux density followed by a plateau and then a steep fall. This bulge is recognised (Longair, 1966) as an indicator of cosmic evolution, in which radio-luminous sources undergo greater evolution in comoving space density than their less-luminous counterparts. Condon & Mitchell (1984) and Windhorst et al. (1985) found that the source count slope flattens around 1 mJy, suggesting a new population of radio sources at low flux densities. This new population is now widely thought to consist predominantly of star-forming galaxies with an admixture of radio-quiet AGN (e.g. Jackson & Wall, 1999; Massardi et al., 2010; de Zotti et al., 2010).

Our knowledge of the low-frequency sky ( $\nu\lesssim 200$ MHz) is poor compared with that at 1.4 GHz, and consequently information about the low-frequency counts is more limited. Low-frequency surveys are particularly sensitive to sources with steep synchrotron spectra. They are not biased by relativistic beaming effects and favour older emission originating from the extended lobes of radio galaxies rather than emission from the core (Wall, 1994). They therefore give a complementary view to $\sim$ GHz surveys.

As well as contributing to our understanding of extragalactic source populations, low frequency counts are useful for the interpretation of Epoch of Reionisation (EoR) data, in which foreground radio sources are a critical contaminant. A number of methods to model and subtract the foreground contamination from EoR data have been explored (see e.g. Morales & Hewitt, 2004; Chapman et al., 2012; Trott, Wayth & Tingay, 2012; Carroll et al., 2016). Higher resolution radio data at a similar frequency to the EoR observations can be used to directly subtract extragalactic radio sources from the EoR data while extrapolation of the known source counts can be used to model and statistically suppress sources to fainter flux densities.

Survey observations over the past few years with instruments such as the Giant Metrewave Radio Telescope (GMRT; Swarup, 1991), the Low Frequency Array (LOFAR; van Haarlem et al., 2013) and the Murchison Widefield Array (MWA; Tingay et al., 2013) have provided a wealth of new information about the low-frequency sky. Recent all-sky low frequency surveys include the VLA Low-frequency Sky Survey Redux at 74 MHz (VLSSr; Lane et al., 2014), the Multifrequency Snapshot Sky Survey at 120–180 MHz (MSSS; Heald et al., 2015), the Tata Institute of Fundamental Research GMRT Sky Survey at 150 MHz (TGSS; Intema et al., 2016) and the Galactic and Extragalactic All-sky MWA survey at 72–231 MHz (GLEAM; Wayth et al., 2015). Among these surveys, GLEAM has the widest fractional bandwidth and highest surface brightness sensitivity. The survey covers the entire sky south of Dec $+30^{\circ}$ at an angular resolution of $\approx 2.5$ arcmin at 200 MHz and is complete to $S_{200~{}\mathrm{MHz}}=50$ mJy in the deepest regions.

Much deeper and higher resolution surveys at 150 MHz covering a few tens of square degrees exist using LOFAR (Hardcastle et al., 2016; Mahony et al., 2016; Williams et al., 2016). The deepest of these by Williams et al. (2016) reaches an rms sensitivity of $\approx 120~{}\mu$ Jy/beam. These surveys have detected a flattening in the counts below $\approx 10$ mJy which is thought to be associated with the rise of the low flux density star-forming galaxies and radio-quiet AGN, as seen at e.g. 1.4 GHz below $\approx 1$ mJy. The ongoing LOFAR Two-metre Sky Survey (LoTSS; Shimwell et al., 2017) at 120–168 MHz will eventually cover the entire northern sky to an rms sensitivity of $\approx 100~{}\mu$ Jy/beam.

The Square Kilometre Array Design Study (SKADS) Semi-Empirical Extragalactic Simulated Sky by Wilman et al. (2008) is in wide use to facilitate predictions for the SKA sky and optimise its design and observing programmes. These models are also a valuable tool in the interpretation of existing radio surveys. The latest low frequency counts provide an opportunity to compare the model predictions and identify any deficiencies.

The confusion noise in low-frequency interferometric images is dependent on the source counts. Classical confusion occurs when the source density is so high that sources cannot be clearly resolved by the array; the image fluctuations are due to the sum of all sources in the main lobe of the synthesised beam. Sidelobe confusion introduces additional noise into an image due to the combined sidelobes of undeconvolved sources. Other basic sources of error in radio interferometric images include the system noise and calibration artefacts. It is important to analyse the relative contribution of these noise terms to assess whether enhancements in the data processing have the potential to further reduce the noise. This is also essential for statistically interpreting survey data below the source detection threshold.

Franzen et al. (2016) derive the 154 MHz source counts using MWA pointed observations of an EoR field covering $570~{}\mathrm{deg}^{2}$ , centred at J2000 $\alpha=03^{\mathrm{h}}30^{\mathrm{m}}$ , $\delta=-28^{\circ}00^{\prime}$ . The image has an angular resolution of 2.3 arcmin and the rms noise in the centre of the image is 4–5 mJy/beam. Using deeper GMRT source counts down to $S_{\mathrm{153~{}MHz}}=6$ mJy, they estimate the classical confusion noise to be $\approx 1.7$ mJy/beam from a $P(D)$ analysis (Scheuer, 1957). They argue that the image is limited by sidelobe confusion but they do not investigate the underlying causes of the sidelobe confusion.

In this paper, we derive the source counts to higher precision using the GLEAM survey, covering $24,831~{}\mathrm{deg}^{2}$ , at 200, 154, 118 and 88 MHz, allowing tight constraints on bright radio source population models. We analyse any change in the shape of the source counts with frequency and compare them with the SKADS model. We use the LOFAR counts by Williams et al. (2016) together with the SKADS model to derive the classical confusion noise across the entire GLEAM frequency range. We quantify the excess background noise in GLEAM and demonstrate that it is primarily caused by sidelobe confusion. We identify which aspects of the data processing contribute to sidelobe confusion and show how the sidelobe confusion can be improved. Finally, we discuss confusion limits for future MWA Phase 2 observations with the angular resolution improved by a factor of two.

2 GLEAM observing, imaging, and source finding

We refer the reader to Wayth et al. (2015) and Hurley-Walker et al. (2017) for details of the survey strategy and data reduction methods for the GLEAM year 1 extragalactic catalogue respectively. In this section, we highlight the points salient to this paper.

The GLEAM survey was conducted using Phase 1 of the MWA, which consisted of 128 16-crossed-pair-dipole tiles, distributed over an area $\approx 3$ km in diameter. The whole sky south of Dec $+30^{\circ}$ was surveyed using meridian drift scan observations. The sky was divided into seven declination strips and one declination strip was covered in a given night. The observing was broken into a series of 2 min scans in five frequency bands (72–103, 103–134, 139–170, 170–200 and 200–231 MHz), cycling through the five frequency bands in 10 min.

Each 2 min snapshot observation was imaged separately using wsclean (Offringa et al., 2014), a w-stacking deconvolution algorithm which appropriately handles the w term for widefield imaging. For imaging purposes, the 30.72 MHz bandwidth was split into four 7.68 MHz sub-bands. The final image products consist of 20 Stokes $I$ 7.68 MHz sub-band mosaics spanning 72–231 MHz as well as four deep wide-band mosaics covering 170–231, 139–170, 103–134 and 72–103 MHz, formed by combining the 7.68 MHz sub-band mosaics.

The source finder aegean (Hancock et al., 2012; Hancock, Trott & Hurley-Walker, 2018) was run on the 170–231 MHz image to create a blind source catalogue centred at 200 MHz. The catalogue was filtered to exclude areas within $10^{\circ}$ of the Galactic plane and other areas affected by poor ionospheric conditions or containing bright, extended sources such as Centaurus A (see Table 1 for details). The filtered catalogue covers an area of $24,831~{}\mathrm{deg}^{2}$ , hereafter referred to as region A, and contains 307,455 components above $5\sigma$ , where $\sigma$ is the rms noise. It is estimated to be 90 per cent complete at $S_{200~{}\mathrm{MHz}}=170$ mJy. In order to provide spectral information across the full frequency range, the priorised fitting mode of aegean was used to perform flux density estimates across the 20 7.68-MHz sub-bands. The catalogue provides both peak and integrated flux densities. The peak flux densities were corrected for ionospheric smearing as outlined below. The three lowest frequency wide-band images were not used to provide measurements for the catalogue.

The GLEAM flux densities are tied to the flux density scale of Baars et al. (1977). Overall, the GLEAM catalogue is consistent with Baars et al. to within 8 per cent for 90 per cent of the survey area, where the difference is primarily caused by uncertainty in the MWA primary beam model.

2.1 Correcting peak flux densities for ionospheric smearing

Ionospheric perturbations cause sources to be smeared out in the final, mosaicked images. The magnitude of the effect is proportional to $\nu^{-2}$ , where $\nu$ is the frequency. Consequently, at any map position, the actual point spread function (PSF) is larger than the restoring beam by a certain amount, depending on the degree of ionospheric smearing. Hurley-Walker et al. (2017) used sources known to be unresolved in higher resolution radio surveys to sample the shape of the PSF across each of the mosaics. Maps of the variation of $a_{\mathrm{psf}}$ , $b_{\mathrm{psf}}$ and $pa_{\mathrm{psf}}$ were produced, where $a_{\mathrm{psf}}$ , $b_{\mathrm{psf}}$ and $pa_{\mathrm{psf}}$ are the major and minor axes and position angle of the PSF respectively.

The increase in area of the PSF resulting from ionospheric smearing is given by

$$R=\frac{a_{\mathrm{psf}}b_{\mathrm{psf}}}{a_{\mathrm{rst}}b_{\mathrm{rst}}},$$

where $a_{\mathrm{rst}}$ and $b_{\mathrm{rst}}$ are the major and minor axes of the restoring beam respectively. Sources detected in the 170–231 MHz image have a mean value of $R$ of 1.14, with a standard deviation of 0.04, and in regions worst affected by ionospheric smearing, $R$ reaches 1.44. Ionospheric smearing not only increases the source area by a factor of $R$ but also reduces the peak flux density by the same factor, while integrated flux densities are preserved. In order to restore the peak flux density of the sources, the images were multiplied by $R$ . In the catalogue, integrated flux densities were normalised with respect to the position-dependent PSF to ensure that, for bright point sources, peak and integrated flux densities agree.
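
The peak flux density correction described above can be sketched in a few lines. The axis values used here are hypothetical, chosen only so that $R$ comes out close to the quoted mean of 1.14, and the function names are illustrative rather than taken from the GLEAM pipeline.

```python
def psf_area_ratio(a_psf, b_psf, a_rst, b_rst):
    """R: ratio of the PSF area to the restoring-beam area (all axes in the same units)."""
    return (a_psf * b_psf) / (a_rst * b_rst)


def corrected_peak(peak_smeared, r):
    """Undo the peak suppression caused by smearing by multiplying by R."""
    return peak_smeared * r


# Hypothetical PSF and restoring-beam axes in arcsec, for illustration only.
r = psf_area_ratio(a_psf=144.0, b_psf=132.0, a_rst=135.0, b_rst=123.5)

# A point source with a true peak of 1 Jy/beam is observed at 1/R after smearing.
true_peak = 1.0
observed_peak = true_peak / r
restored_peak = corrected_peak(observed_peak, r)

print(round(r, 3), round(observed_peak, 3), round(restored_peak, 3))
```

Because the integrated flux density is unchanged by smearing, only the peak values need this rescaling.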

3 Source finding at 154, 118 and 88 MHz

Since a statistically complete sample is required to measure the counts at any frequency, we cannot use the sub-band measurements quoted in the GLEAM catalogue, obtained from the priorised fitting, to measure the counts. In order to derive the counts at frequencies below 200 MHz, we use the wide-band images covering 139–170, 103–134 and 72–103 MHz, centred at 154, 118 and 88 MHz respectively.

We create a blind source catalogue at each of these frequencies following a similar procedure to that employed by Hurley-Walker et al. (2017). We first use bane (Hancock, Trott & Hurley-Walker, 2018) to remove the background structure and estimate the rms noise across the image. The ‘box’ parameter defining the angular scale on which the rms and background are evaluated is set to 20 times the synthesised beam size. We then run the source finder aegean using a 5 $\sigma$ detection threshold. The integrated flux densities are normalised using the PSF map at the relevant frequency. Sources lying within areas flagged from the GLEAM catalogue (see Table 1) are excluded. The number of sources detected at each frequency and other source finding statistics are given in Table 2.

The mosaics used to create the source catalogues have a relatively large fractional bandwidth; the 88 MHz mosaic has the largest fractional bandwidth of $\approx 0.35$ . For any source with a non-zero spectral index, there is a discrepancy between the average flux density integrated over the band, $S_{\mathrm{w}}$ , and the monochromatic flux density, $S_{0}$ , at the central frequency, $\nu_{0}$ , for two reasons. Firstly, most sources are better described by a power-law slope across the band than a simple linear slope. $S_{\mathrm{w}}$ will always exceed $S_{\mathrm{0}}$ for a source with a power-law slope. The magnitude of this effect increases with fractional bandwidth and for a source with an increasingly non-flat spectrum. The second cause of the discrepancy is the inverse noise-squared weighting applied to the 7.68 MHz sub-band mosaics: in practice, the noise in the 7.68 MHz sub-band mosaics decreases slightly with frequency, causing more weight to be assigned to higher frequency mosaics. For a source with $\alpha<0$ , where $\alpha$ is the spectral index ( $S\propto\nu^{\alpha}$ ), these two effects go in opposite directions: $S_{\mathrm{w}}$ increases as a result of the power-law slope of sources across the band and decreases as a result of the weighting scheme adopted in the mosaicking.

For sources detected in each of the wide-band mosaics, we calculate the required flux density correction factor, $S_{0}/S_{\mathrm{w}}$ . At any position in the mosaic,

$$S_{\mathrm{w}}=\sum_{i=1}^{N}w_{i}S_{0}\left(\frac{\nu_{i}}{\nu_{0}}\right)^{\alpha},$$

where $w_{i}$ is the weight assigned to the $i^{\mathrm{th}}$ sub-band, normalised such that $\Sigma_{i=1}^{N}w_{i}=1.0$ , $\nu_{i}$ is the central frequency of the $i^{\mathrm{th}}$ sub-band and $N$ is the number of 7.68 MHz sub-bands. The flux density correction factor is given by

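$$\frac{S_{0}}{S_{\mathrm{w}}}=\left[\sum_{i=1}^{N}w_{i}\left(\frac{\nu_{i}}{\nu_{0}}\right)^{\alpha}\right]^{-1}.$$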

We produce simulated images of the flux density correction factor using the mosaicking software swarp (Bertin et al., 2002) assuming $\alpha=-0.8$ , the typical spectral index of GLEAM sources between 76 and 227 MHz. Using these images we extract the correction factor for sources detected in each of the wide-band images. We find that the mean $\pm$ standard deviation of the correction factor in the 200, 154, 118 and 88 MHz mosaics is $1.000\pm 0.009$ , $1.003\pm 0.001$ , $1.007\pm 0.002$ and $1.002\pm 0.004$ respectively. Given that the correction factors deviate from unity by less than 1 per cent, we ignore them.
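
The size of the correction factor $S_{0}/S_{\mathrm{w}}$ can be illustrated with a short calculation for a single sky position. The sketch assumes four evenly spaced 7.68 MHz sub-bands starting at 72 MHz, takes $\nu_{0}$ as their mean frequency, and uses two hypothetical weight sets; none of these values are taken from the GLEAM mosaics.

```python
def correction_factor(sub_band_freqs, weights, nu0, alpha):
    """S_0 / S_w for a power-law source, given sub-band centre frequencies and weights."""
    total = sum(weights)
    norm = [w / total for w in weights]  # enforce sum(w_i) = 1
    s_w_over_s0 = sum(w * (nu / nu0) ** alpha for w, nu in zip(norm, sub_band_freqs))
    return 1.0 / s_w_over_s0


# ASSUMED sub-band centres (MHz): four 7.68 MHz sub-bands starting at 72 MHz.
freqs = [72.0 + 7.68 * (i + 0.5) for i in range(4)]
nu0 = sum(freqs) / len(freqs)

# Equal weights: only the power-law curvature across the band matters.
equal = correction_factor(freqs, [1.0, 1.0, 1.0, 1.0], nu0, alpha=-0.8)

# HYPOTHETICAL weights that favour the higher-frequency (lower-noise) sub-bands.
weighted = correction_factor(freqs, [0.8, 0.95, 1.05, 1.2], nu0, alpha=-0.8)

# A flat-spectrum source needs no correction.
flat = correction_factor(freqs, [1.0, 1.0, 1.0, 1.0], nu0, alpha=0.0)

print(round(equal, 4), round(weighted, 4), flat)
```

With equal weights the factor falls slightly below unity, while weighting towards higher frequencies pushes it back above unity, which illustrates why the two effects largely cancel for steep-spectrum sources.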

4 Determining the source counts

We measure the source counts at 200 MHz using the wide-band flux densities quoted in the GLEAM catalogue and at 154, 118 and 88 MHz using the catalogues compiled in Section 3. At each frequency, the vast majority of sources are point-like due to the large beam size. For unresolved sources, peak flux densities will be significantly more accurate than integrated flux densities at low signal-to-noise ratio (SNR), because more free parameters are required to measure an integrated flux density using Gaussian fitting. The peak flux densities are also already corrected for ionospheric smearing, as outlined in Section 2.1. Therefore, in measuring the counts, we only use integrated flux densities for sources which are significantly resolved and use peak flux densities for the remaining sources. We distinguish between point-like and extended sources as described in Section 4.1.

The rms noise varies substantially across the survey due to varying observational data quality and the presence of image artefacts originating from bright sources and the Galactic Plane. It increases at lower frequency and becomes less Gaussian as the classical confusion noise becomes more dominant. The counts must be corrected for both incompleteness and Eddington bias (Eddington, 1913) close to the survey detection limit. Incompleteness causes the counts to be underestimated close to the detection limit, while the Eddington bias makes it more likely for noise to scatter sources above the detection limit than to scatter them below it due to the steepness of source counts, consequently boosting the counts in the faintest bins. The magnitude of the Eddington bias only depends on the SNR and the source count slope (Hogg & Turner, 1998).

The number of synthesised beams per source is often used as a measure of confusion as it indicates the typical separation of sources at the survey cut-off limit. The number of beams per source at each frequency is indicated in Table 2. It is only 24 at the lowest frequency, indicating that the average separation between sources is $\sqrt{24}\approx 5$ beams. Vernstrom et al. (2016) used simulated images to investigate the effect of confusion on the source-fitting accuracy for the source finders aegean and obit (Cotton & Uson, 2008). Similar results were obtained for both source finders: sources separated by less than the beam size were fitted as a single source up to 95 per cent of the time, while the total flux density of the sources was, on average, conserved. Thus the effect of confusion is either to prevent a source from being detected or boost its flux density, which may, in turn, significantly bias the counts. In Section 4.2, we use Monte Carlo simulations to investigate the effect of incompleteness, Eddington bias and source blending on the counts.

Conversely, sources (i.e. physical entities associated with a host galaxy) of the largest angular size may also be broken up into multiple components in GLEAM. In measuring the source counts, physically related components should be counted as a single source and their flux densities summed together. In Section 4.3, we show that, given the large beam size, the source counts are well approximated as counts of components.

4.1 Classifying sources as point-like or extended

We use the method described in Franzen et al. (2015) to identify extended sources based on the ratio of integrated flux density, $S$ , to peak flux density $S_{\mathrm{peak}}$ . Assuming that the uncertainties on $S$ and $S_{\mathrm{peak}}$ ( $\sigma_{S}$ and $\sigma_{S_{\mathrm{peak}}}$ respectively) are independent, to detect source extension at the 2 $\sigma$ level, we require

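$$\ln\left(\frac{S}{S_{\mathrm{peak}}}\right)>2\sqrt{\left(\frac{\sigma_{S}}{S}\right)^{2}+\left(\frac{\sigma_{S_{\mathrm{peak}}}}{S_{\mathrm{peak}}}\right)^{2}}.$$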

We take $\sigma_{S_{\mathrm{peak}}}$ and $\sigma_{S}$ as the sum in quadrature of the Gaussian parameter fitting uncertainties returned by aegean, which account for the local noise, and the GLEAM internal flux density calibration error. The latter is estimated to be 2 per cent at $-72^{\circ}\leq\mathrm{Dec}<18.5^{\circ}$ and 3 per cent at $\mathrm{Dec}<-72^{\circ}$ and $\mathrm{Dec}\geq 18.5^{\circ}$ (Hurley-Walker et al., 2017). For bright sources, where the 2 per cent calibration error dominates, a source with $\frac{S}{S_{\mathrm{peak}}}>1.06$ is considered to be extended.
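The classification can be sketched in a few lines, assuming the 2 $\sigma$ criterion takes the logarithmic form $\ln(S/S_{\mathrm{peak}})>2\sqrt{(\sigma_{S}/S)^{2}+(\sigma_{S_{\mathrm{peak}}}/S_{\mathrm{peak}})^{2}}$ . The function names are hypothetical and the fitting errors would in practice come from aegean.

```python
import math

def total_error(fit_err, flux, cal_frac=0.02):
    """Sum in quadrature of the fitting error and the fractional calibration error."""
    return math.hypot(fit_err, cal_frac * flux)

def is_extended(s_int, s_peak, sigma_int, sigma_peak, n_sigma=2.0):
    """True if ln(S/S_peak) exceeds n_sigma times its propagated uncertainty."""
    log_ratio = math.log(s_int / s_peak)
    sigma_log = math.hypot(sigma_int / s_int, sigma_peak / s_peak)
    return log_ratio > n_sigma * sigma_log

# Bright-source limit: only the 2 per cent calibration error contributes,
# so the threshold on S/S_peak is exp(2 * sqrt(2) * 0.02), i.e. about 1.06.
threshold = math.exp(2.0 * math.sqrt(2.0) * 0.02)
```

In the bright-source limit the threshold evaluates to about 1.06, which matches the value quoted in the text.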

Table 2 gives the fraction of sources classified as extended at each frequency. Fig. 1 shows $\frac{S}{S_{\mathrm{peak}}}$ as a function of SNR for all sources detected at 200 MHz. 7.3 per cent of sources are classified as extended at this frequency, where the beam size ( $\approx 2.5$ arcmin) is smallest; these are highlighted in red.

Investigations using higher resolution (45 arcsec) radio images from the NRAO VLA Sky Survey (NVSS; Condon et al., 1998) at 1.4 GHz show that a large fraction of resolved sources in GLEAM are, in fact, artefacts of source confusion or noise fluctuations: we randomly select 50 sources classified as extended at 200 MHz in the region of sky covered by NVSS, i.e. at $\mathrm{Dec}>-40^{\circ}$ . We find that 39 of the sources are resolved into multiple components in NVSS. Of these 39 sources, only 16 are likely to be genuinely extended because the NVSS components have similar peak flux densities and there is extended emission linking the components; the remaining 23 sources probably appear extended as a result of source blending. An example of each of these cases is shown in Fig. 2.

4.2 Correcting the counts for incompleteness, Eddington bias and source blending

We conduct Monte Carlo simulations to quantify the effect of incompleteness, Eddington bias and source blending on the counts. Our approach is to inject synthetic point sources with a range of flux densities into the wide-band images using aeres from the aegean package. We then use exactly the same source-finding procedure as described in Section 3 to detect the simulated sources and measure their flux densities. The corrections to the counts as a function of flux density are obtained from the ratio of the injected count to the measured count of the simulated sources.

The major and minor axes of the simulated sources are set to $a_{\mathrm{psf}}$ and $b_{\mathrm{psf}}$ respectively, which are obtained from the PSF map at the relevant frequency. The simulated sources lie at random positions within region A but we set a minimum separation of 20 arcmin ( $\approx 4$ times the beam size at the lowest frequency) between simulated sources to avoid them affecting each other. A simulated source may lie too close to a real ( $>5\sigma$ ) source to be detected separately. In such situations, if the recovered source is closer to the simulated source than the real source, the simulated source is considered to be detected, otherwise not. Thus we account for source confusion in the counts in this analysis.
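The final step of the simulation, turning injected and recovered flux densities into a correction factor per bin, might look like the sketch below. The function name and the toy inputs are assumptions for illustration; the source injection (aeres) and source finding (aegean) stages are not reproduced here.

```python
import numpy as np

def count_correction(injected_flux, recovered_flux, bin_edges):
    """Ratio of injected to measured counts in each flux density bin.

    Bins with no recovered sources are returned as NaN.
    """
    n_inj, _ = np.histogram(injected_flux, bins=bin_edges)
    n_rec, _ = np.histogram(recovered_flux, bins=bin_edges)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(n_rec > 0, n_inj / n_rec, np.nan)

# Toy example: half of the faint sources are missed, all bright ones recovered.
injected = np.array([0.1, 0.1, 0.1, 0.1, 1.0, 1.0])
recovered = np.array([0.1, 0.1, 1.0, 1.0])
correction = count_correction(injected, recovered, [0.05, 0.5, 5.0])
```

A factor above 1 indicates incompleteness in that bin, while a factor below 1 indicates that blending or Eddington bias has boosted the measured count.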

It is important to ensure that the flux density distribution of the simulated sources is as realistic as possible and extends to well below the $5\sigma$ detection limit ( $\gtrsim 50$ mJy/beam at 154 MHz). This is because the Eddington bias is dependent on the slope of the counts and causes the flux densities of sources with low SNRs to be biased high, boosting the number of sources detected in the faintest bins. The flux density distribution of the simulated sources at 154 MHz is based on the following source count model: above 33 mJy, we use a $3^{\mathrm{rd}}$ order polynomial fit to 154 MHz counts from a 12 hour pointed MWA observation of an EoR field, covering $570~{}\mathrm{deg}^{2}$ (Franzen et al., 2016). Between 6 and 33 mJy, deep 153 MHz GMRT counts from Williams, Intema & Röttgering (2013) and Intema et al. (2011) are well represented by a power law of slope $\gamma=0.96$ , where $S^{2.5}\frac{dN}{dS}=kS^{\gamma}$ . We therefore set $\gamma=0.96$ in this flux density range. A total of 40,000 flux densities ranging between 6 mJy and 15 Jy are drawn randomly from the source count model. We extrapolate the simulated source flux densities to 200, 118 and 88 MHz assuming $\alpha=-0.8$ , as indicated by the typical spectral index seen in GLEAM.
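For the power-law part of the model, flux densities can be drawn by inverse transform sampling, since $S^{2.5}\frac{dN}{dS}=kS^{\gamma}$ implies $\frac{dN}{dS}\propto S^{\gamma-2.5}$ . The sketch below covers only the 6 to 33 mJy power-law segment and the spectral index scaling; the polynomial segment above 33 mJy is not reproduced, and the function names are hypothetical.

```python
import numpy as np

def sample_power_law_counts(n, s_min, s_max, gamma, rng):
    """Draw n flux densities from dN/dS proportional to S**(gamma - 2.5)."""
    beta = 2.5 - gamma   # dN/dS proportional to S**(-beta); beta != 1 is assumed
    k = 1.0 - beta
    u = rng.random(n)    # uniform deviates in [0, 1)
    # Invert the cumulative distribution of a truncated power law.
    return (s_min**k + u * (s_max**k - s_min**k)) ** (1.0 / k)

def scale_flux(s, nu_from_mhz, nu_to_mhz, alpha=-0.8):
    """Extrapolate flux densities between frequencies with S proportional to nu**alpha."""
    return s * (nu_to_mhz / nu_from_mhz) ** alpha

rng = np.random.default_rng(0)
s_154 = sample_power_law_counts(40000, 0.006, 0.033, 0.96, rng)  # Jy
s_200 = scale_flux(s_154, 154.0, 200.0)
```

With $\alpha=-0.8$ , each 154 MHz flux density is multiplied by about 0.81 to obtain its 200 MHz value.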

The simulations are repeated 40 times to improve statistics. The solid lines in Fig. 3 show the mean source counts correction factor in region A, $c_{\mathrm{A}}$ , in each of the wide-band images. The effects of both incompleteness and confusion are clearly evident. The sharp increase in the correction factor at low flux density is due to incompleteness. As expected, the survey becomes incomplete at a higher flux density in the lower frequency images. Source blending causes the correction factor to fall below 1.0 at higher flux densities. At 200 MHz, despite the large beam size of $\approx 2.5$ arcmin, the number of beams per source (49) is low enough for confusion not to strongly affect the counts, which are only overestimated by up to 2–3 per cent. As expected, the effect worsens at lower frequency due to the lower number of beams per source: at 88 MHz, the number of beams per source is 24 and the counts are overestimated by up to 7 per cent as a result of confusion.

From visual inspection of the rms noise maps, we identify areas within region A where the rms noise is well below average at zenith angles $\lessapprox 30$ deg, covering in total 6,516.2 $\mathrm{deg}^{2}$ . The lines of RA and Dec bounding this region, hereafter referred to as region B, are given in Table 3. The dashed lines in Fig. 3 show the correction factor in region B, $c_{\mathrm{B}}$ . In region B, the counts start becoming incomplete at a flux density about a factor of two lower than in region A at all frequencies. The counts are measured in region A in flux density bins where $c_{\mathrm{A}}\leq 1.2$ . If $c_{\mathrm{A}}>1.2$ and $c_{\mathrm{B}}\leq 1.2$ , the counts are measured in region B. We do not measure the counts in bins where $c_{\mathrm{B}}>1.2$ as the correction factor rises sharply with decreasing flux density in these bins and becomes unreliable.
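The bin-by-bin choice of region described above amounts to a simple decision rule, sketched here with a hypothetical function name:

```python
def choose_region(c_a, c_b, threshold=1.2):
    """Return the region used for a flux density bin, or None if the bin is dropped."""
    if c_a <= threshold:
        return "A"   # region A is complete enough in this bin
    if c_b <= threshold:
        return "B"   # fall back to the lower-noise region B
    return None      # correction factor too large to be reliable
```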

4.3 Complex sources

We report counts of components rather than counts for integrated sources. The magnitude of the difference between the two will depend on the beam size and the intrinsic angular source size distribution. White et al., in prep., are analysing a subset of the GLEAM catalogue in detail to study the nature and evolution of the bright end of the low frequency population. The GLEAM 4 Jy sample is a statistically complete sample of 1845 sources with $S_{151\mathrm{MHz}}>4.0$ Jy, covering region A. Only 44 (2.4 per cent) of the sources are resolved into multiple components, where the beam size is $\approx 2.5$ arcmin. Multi-component sources are identified through visual inspection of higher resolution radio images from NVSS, the Sydney University Molonglo Sky Survey (SUMSS; Mauch et al., 2007) and the Faint Images of the Radio Sky at Twenty Centimetres (FIRST; Becker, White & Helfand, 1995) survey. The likelihood of a source showing complex structure increases with flux density above 4 Jy due to the increasing fraction of objects at very low redshifts, as shown in the bottom panel of Fig. 4. No multi-component sources are detected in the highest flux density bin ( $57-114$ Jy) but it only contains 5 sources, 3 of which are extended in GLEAM and resolved into multiple components in NVSS/SUMSS.

We use the GLEAM 4 Jy sample to measure both the source and component counts at $S_{151\mathrm{MHz}}>4.0$ Jy. We find that the component and source counts agree within the Poisson uncertainties, as shown in the top panel of Fig. 4, given the small fraction of sources which are resolved into multiple components. Windhorst, Mathis & Neuschaefer (1990) found that, below $S_{1.4\mathrm{GHz}}=3$ Jy, the median angular size of radio galaxies, $\theta_{\mathrm{med}}$ , decreases continuously towards fainter flux densities, with $\theta_{\mathrm{med}}\propto(S_{1.4~{}\mathrm{GHz}})^{0.3}$ . Assuming that a similar relation holds at lower frequency, we expect our multi-frequency component counts to be a good approximation of the counts for integrated sources.

Finally, we note that the following bright, complex sources were peeled from the GLEAM data and subsequently lie outside region A: Hydra A, Pictor A, Hercules A, Virgo A, Crab, Cygnus A and Cassiopeia A. Centaurus A also lies outside region A. From measurements over 60–1400 MHz available via the NASA/IPAC Extragalactic Database (NED; http://ned.ipac.caltech.edu/), these sources are all brighter than 100 Jy at 200, 154, 118 and 88 MHz. Since our highest source count bin does not exceed 100 Jy at any of these frequencies, the exclusion of these sources does not bias our source count measurements.

4.4 Analysis of the GLEAM source counts

The corrected GLEAM differential source counts are shown in Fig. 5, while the source count data are provided in Table 5. Uncertainties on the counts are propagated from Poisson errors on the number of sources per bin and the errors on the correction factors derived in Section 4.2. The Poisson error on $N$ is approximated as $\sqrt{N}$ in all bins with $N\geq 20$ . In bins with $N<20$ , we use approximate expressions for 84 per cent confidence upper and lower limits based on Poisson statistics by Gehrels (1986).
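A minimal sketch of the Poisson error treatment is given below. It assumes the simple single-sided 84 per cent approximations from Gehrels (1986), namely $N+\sqrt{N+3/4}+1$ for the upper limit and $N\left(1-\frac{1}{9N}-\frac{1}{3\sqrt{N}}\right)^{3}$ for the lower limit; the text does not state which of Gehrels' expressions were adopted, so this choice is an assumption.

```python
import math

def poisson_limits(n, n_switch=20):
    """Return (lower, upper) 1-sigma limits on a count of n sources."""
    if n >= n_switch:
        err = math.sqrt(n)  # Gaussian approximation for well-populated bins
        return n - err, n + err
    # Small-number regime: approximate 84 per cent confidence limits.
    upper = n + math.sqrt(n + 0.75) + 1.0
    if n == 0:
        lower = 0.0
    else:
        lower = n * (1.0 - 1.0 / (9.0 * n) - 1.0 / (3.0 * math.sqrt(n))) ** 3
    return lower, upper
```

For small $N$ the limits are asymmetric about $N$, which is why the $\sqrt{N}$ approximation is restricted to well-populated bins.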

The bulge due to source evolution is clearly evident at all four frequencies given the large areal sky coverage and the range of flux densities sampled. A detailed comparison of the shape of the multi-frequency counts is undertaken in Section 5.

In Fig. 6, we compare the GLEAM counts with other counts in the literature at a similar frequency covering more than $100~{}\mathrm{deg}^{2}$ : the 154 MHz counts by Franzen et al. (2016), 7C counts at 151 MHz by McGilchrist et al. (1990) and Hales et al. (2007) and TGSS First Alternative Data Release (ADR1) counts at 150 MHz by Intema et al. (2016). The 7C and TGSS counts are extrapolated to 154 MHz assuming $\alpha=-0.8$ .

The GLEAM counts are generally in excellent agreement with the other counts. We note that GLEAM and TGSS are on different flux density scales, with TGSS on the scale of Scaife & Heald (2012). There is, however, a flux density dependent offset between the GLEAM and TGSS counts. While the ratio of TGSS to GLEAM counts lies close to 1.0 at a few Jy, it decreases to $\approx 0.9$ below $\sim 1$ Jy. This is consistent with a $\approx 6$ per cent decrease in the mean ratio of TGSS to GLEAM flux densities below $\sim 1$ Jy and may be due to missing low surface brightness emission in TGSS. The TGSS observations have a far less centrally concentrated $uv$ coverage than the GLEAM observations. At 154 MHz, GLEAM has a resolution of $\approx 3$ arcmin while TGSS has a resolution of 25 by $25/\cos(\delta-19^{\circ})$ arcsec.

Source counts below 100 MHz are comparatively sparse. In Fig. 7, we compare the 88 MHz GLEAM counts with the VLSSr counts at 74 MHz, placed on the Baars et al. (1977) flux density scale (Lane et al., 2014); 62 MHz counts from LOFAR observations of the 3C295 and Boötes fields, covering $36~{}\mathrm{deg}^{2}$ (van Weeren et al., 2014); and 93.75 MHz counts from a 12 hour pointed observation with the 21 Centimetre Array (21CMA) of a $25~{}\mathrm{deg}^{2}$ region of sky coincident with the North Celestial Pole (Zheng et al., 2016). The GLEAM counts, which cover the largest area of sky, show good agreement with the other counts extrapolated to 88 MHz with $\alpha=-0.8$ . Below $\sim 1$ Jy, the GLEAM counts lie very slightly (2–3 per cent) above the VLSSr counts but this is sensitive to the spectral index used in the extrapolation. We note that VLSSr has a resolution of 75 arcsec as compared to the GLEAM resolution of $\approx 5$ arcmin at 88 MHz.

5 Investigating changes in the source count shape with frequency

In this section, we analyse any change in the shape of the GLEAM counts with frequency and the dependence of the spectral index on flux density and frequency. We also show that the behaviour of the counts is broadly consistent with the typical spectra of sources across the MWA band.

The solid line in Fig. 5 is a weighted least squares $5^{\mathrm{th}}$ order polynomial fit to the GLEAM 154 MHz counts. We extrapolate the 200, 118 and 88 MHz GLEAM counts to 154 MHz assuming various spectral indices and divide the extrapolated counts by the 154 MHz source count fit calculated above, as shown in Fig. 8.

We find that, at $S_{154~{}\mathrm{MHz}}\gtrsim 0.5$ Jy, there is no significant change in the shape of the counts at the four frequencies. We calculate the value of $\alpha$ which minimises the $\chi^{2}$ difference between the counts at each of the three pairs of frequencies. When computing $\chi^{2}$ , we exclude the region of the 154 MHz source count fit below 0.5 Jy. For example, for the 154–200 MHz source count pair,

[TABLE]

where

[TABLE]

$n_{i,200}$ is the Euclidean normalised source count in the $i^{\mathrm{th}}$ bin at 200 MHz, $\sigma_{n_{i,200}}$ is the error on $n_{i,200}$ , $n_{154}\left(\frac{S_{i,200}}{x}\right)$ is the 154 MHz source count fit above evaluated at $\frac{S_{i,200}}{x}$ , $S_{i,200}$ is the central flux density of the $i^{\mathrm{th}}$ bin at 200 MHz, $x=(200/154)^{\alpha}$ and $y=x^{1.5}$ . For the 154–200, 154–118 and 154–88 MHz source count pairs, $\chi^{2}$ is minimised with $\alpha=$ –0.75, –0.77 and –0.79 respectively. Thus there is no strong dependence of the spectral index on frequency.
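The factor $y=x^{1.5}$ arises because scaling every flux density by $x$ multiplies $S^{2.5}$ by $x^{2.5}$ and $dN/dS$ by $x^{-1}$ . The sketch below implements a $\chi^{2}$ search over $\alpha$ built from the quantities defined above, assuming the predicted count in each bin is $y\,n_{154}(S_{i,200}/x)$ ; the function names and the toy 154 MHz fit are hypothetical, not the paper's polynomial.

```python
import numpy as np

def chi2_alpha(alpha, s_bins, n_obs, n_err, n154_fit, nu_mhz=200.0, nu_ref_mhz=154.0):
    """Chi-squared between observed counts and the scaled 154 MHz count fit."""
    x = (nu_mhz / nu_ref_mhz) ** alpha  # flux density scaling factor
    y = x ** 1.5                        # scaling of the Euclidean normalised count
    n_pred = y * n154_fit(s_bins / x)
    return np.sum(((n_obs - n_pred) / n_err) ** 2)

def best_alpha(s_bins, n_obs, n_err, n154_fit, alphas):
    """Return the trial spectral index that minimises chi-squared."""
    chi2 = [chi2_alpha(a, s_bins, n_obs, n_err, n154_fit) for a in alphas]
    return alphas[int(np.argmin(chi2))]

# Toy check: build 200 MHz counts from a made-up fit with alpha = -0.8
# and confirm that the grid search recovers it.
toy_fit = lambda s: 3000.0 * s**0.3 / (1.0 + s)
s_bins = np.logspace(-0.3, 1.0, 10)
x_true = (200.0 / 154.0) ** -0.8
n_obs = x_true**1.5 * toy_fit(s_bins / x_true)
alpha_hat = best_alpha(s_bins, n_obs, 0.05 * n_obs, toy_fit, np.linspace(-1.2, -0.4, 81))
```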

At $S_{154~{}\mathrm{MHz}}<0.5$ Jy, it becomes hard to discriminate between different spectral indices given the steep slope of the counts. There is, however, tentative evidence that a flatter spectral index provides a better match between the 154–200 and 154–118 MHz source count pairs.

Hurley-Walker et al. (2017) calculated the 76–227 MHz spectral indices of sources in the GLEAM catalogue using the 7.68 MHz sub-band flux densities. For the spectral index of a source to be quoted in the catalogue, the source must have a positive flux density in each of the 20 sub-bands (this is not always the case at low SNR) and the spectrum must be well fit by a power-law. From the completeness maps presented in Hurley-Walker et al., in region B, the GLEAM catalogue is 90 per cent complete at $S_{200~{}\mathrm{MHz}}=60$ mJy. Of the 84,003 sources with $S_{200~{}\mathrm{MHz}}>60$ mJy in region B, 75,905 (90.4 per cent) have measured spectral indices in the GLEAM catalogue. Fig. 9 shows the spectral index distribution for these sources. The distribution is roughly symmetric about the median value of –0.79 but there is a positive tail which extends to $\alpha\approx 0.5$ .

The top panel of Fig. 10 shows the median spectral index, $\alpha_{\mathrm{med}}$ , as a function of $S_{200~{}\mathrm{MHz}}$ . Sources which are missing from the spectral index sample because they are not well fit by a power law are represented by the red histogram in the bottom panel. These sources include compact steep-spectrum (CSS) sources with a peak in their spectra across the MWA band, hypothesized to be the precursors to massive radio galaxies, and are studied in detail in Callingham et al. (2017).

Above 0.5 Jy, we find that there is no significant change in the median spectral index, $\alpha_{\mathrm{med}}$ , with flux density, whereas $\alpha_{\mathrm{med}}$ flattens from $\approx-0.85$ to $\approx-0.75$ between 0.5 and 0.1 Jy. We caution that $\alpha_{\mathrm{med}}$ is biased towards steep values below 0.1 Jy. Indeed, a substantial fraction of sources have no measured spectral indices in bins below 0.1 Jy because they do not have positive flux densities in all sub-bands; the negative flux densities mostly occur in lower frequency sub-bands due to the low SNR (see black histogram in bottom panel of Fig. 10). This probably explains the steepening in $\alpha_{\mathrm{med}}$ with decreasing flux density below 0.1 Jy.

Spectral flattening towards lower frequencies is expected for some sources due to absorption effects including synchrotron self-absorption and thermal absorption of a synchrotron power-law component. Spectral ageing, which causes the spectrum to steepen towards higher frequencies, may introduce additional curvature in the source spectrum.

Given the weak dependence of the median redshift of radio galaxies on flux density (see e.g. Condon, 1993), the flux-density range 0.1–0.5 Jy is expected to correspond to the least-luminous radio galaxies. By studying a number of complete samples of radio sources at frequencies close to 151 MHz with good coverage of the luminosity-redshift plane, Blundell, Rawlings & Willott (1999) found an anti-correlation between the rest-frame spectral index at low frequency and the source luminosity. This correlation is understood to arise through the steepening of the injection spectrum of particles by radiative losses in the enhanced magnetic fields of the hotspots of sources with more powerful jets. It is possible that the spectral flattening observed for GLEAM sources in this flux density range also results from this effect.

At $S_{154~{}\mathrm{MHz}}>0.5$ Jy, we find no evidence of any flattening in the average spectral index with decreasing frequency. Van Weeren et al. (2014) measured source counts at 34, 46 and 62 MHz down to 136, 72 and 51 mJy respectively, from LOFAR observations of the 3C295 and Boötes fields, covering a few tens of square degrees (their 62 MHz counts are displayed in Fig. 7 of this paper). They found that (1) the 62 MHz counts are in good agreement with 153 MHz GMRT and 74 MHz VLA counts, scaling with $\alpha=-0.7$ ; (2) the 34 MHz counts fall significantly below the extrapolated counts from 74 and 153 MHz with $\alpha=-0.7$ . Instead, $\alpha=-0.5$ provides a better match to the 34 MHz counts.

6 Comparison with SKADS Simulated Skies

The SKADS model by Wilman et al. (2008) gives radio flux densities at 151 MHz, 610 MHz, 1.4 GHz, 4.86 GHz and 18 GHz, down to 10 nJy, in a sky area of $20\times 20~{}\mathrm{deg}^{2}$ , and includes four distinct source types: FRI and FRII sources, radio-quiet AGN and star-forming galaxies. We compare observed counts at 154 MHz covering over 5 orders of magnitude in flux density with the source count prediction from the simulated database. We use the 154 MHz GLEAM counts, the deeper MWA EoR counts in the flux density range 30–75 mJy and 150 MHz LOFAR counts by Williams et al. (2016), extrapolated to 154 MHz with $\alpha=-0.8$ .

In the top panel of Fig. 11, we see that the 151 MHz SKADS model lies within the scatter of the observations except at $S\gtrsim 50$ mJy, where it underpredicts the measured counts by an amount that grows with flux density. The GLEAM counts provide a very stringent test above this flux density given their high precision. The model underpredicts the number of sources by $\approx 50$ per cent at $\approx 2$ Jy. Since the model only covers $400~{}\mathrm{deg}^{2}$ , the source population is too poorly sampled above this flux density to perform a precise comparison.

Mauch et al. (2013) compared 325 MHz counts from a GMRT survey of the Herschel-ATLAS/GAMA fields with the SKADS model and found a similar result, albeit to a lower significance. They determined the 325 MHz simulated flux density by calculating the power-law spectral index between 151 and 610 MHz. Their measured counts, which sample the flux density range 10–200 mJy, tend to lie slightly above the simulated counts above $S_{325~{}\mathrm{MHz}}\approx 50$ mJy.

We find that the model is statistically in much better agreement with the data at high flux density after multiplying the simulated flux densities by 1.2, as shown in the bottom panel of Fig. 11. The fit is also somewhat improved at the low flux density end sampled by LOFAR although the data points have larger error bars making it harder to assess the model’s accuracy.

Mauch et al. suggest that the simulated flux densities at low frequency could be too low as a result of excessive spectral curvature implemented in the model. However, it is difficult to see how this is possible: radio-loud AGN dominate the source population at $S_{154~{}\mathrm{MHz}}>50$ mJy in the model. The overwhelming majority of these sources have power-law spectra between 154 MHz and 1.4 GHz, as the emission is lobe-dominated.

At the bright end, the model is based on a compilation of source counts at 151 MHz by Willott et al. (2001). The GLEAM counts provide much tighter constraints. The model is also based on the 151 MHz luminosity function of high-luminosity radio galaxies by Willott et al. They chose to fit a Schechter luminosity function, whose exponential high-luminosity cutoff is likely too sharp to describe radio galaxies.

Fig. 12 shows the fraction of each source type as a function of $S_{154~{}\mathrm{MHz}}$ as predicted by the SKADS model, after rescaling the simulated flux densities. According to the model, FRII sources are dominant above $\sim 500$ mJy, FRI sources in the flux density range $\sim 1-500$ mJy and star-forming galaxies below $\sim 1$ mJy.

7 Noise and confusion properties of GLEAM mosaics

Fig. 13 shows the mean rms noise, measured using bane, in the narrow- and wide-band mosaics in a circular region within 8.5 deg of the Chandra Deep Field-South (CDFS) at J2000 $\alpha=03^{\mathrm{h}}30^{\mathrm{m}}$ , $\delta=-28^{\circ}00^{\prime}$ , hereafter referred to as region C; this region lies close to zenith (i.e. at $\delta=-26.7^{\circ}$ ) and 55 deg from the Galactic Plane.

We derive the expected thermal noise in this cold region of extragalactic sky. We then use our knowledge of the low-frequency source counts below the flux densities sampled by GLEAM to derive the theoretical noise limit, accounting for both the thermal noise and classical confusion, and compare it with the measured rms noise.

7.1 Estimating the thermal noise

Since no circular polarisation is expected from extragalactic sources, Stokes $V$ images should provide a good measure of the thermal noise. We download all narrow-band, uniformly-weighted Stokes $V$ snapshot images contributing to region C from the GLEAM Data Centre (http://mwa-web.icrar.org/gleam/q/form), originating from four different declination strips ( $-13^{\circ}$ , $-27^{\circ}$ , $-40^{\circ}$ and $-55^{\circ}$ ). We verify that the rms noise in Stokes $V$ images from the Dec $-27^{\circ}$ strip is in good agreement with the theoretical prediction.

The naturally-weighted, point-source sensitivity of the MWA, in Jy/beam, is given by

[TABLE]

where $k_{\mathrm{B}}$ is the Boltzmann constant, $T$ the system temperature in K, $A_{\mathrm{eff}}$ the effective area of each antenna tile in $\mathrm{m}^{2}$ , $N$ the number of antenna tiles, $\epsilon_{\mathrm{c}}$ the correlator efficiency, $\tau$ the integration time in seconds, $B$ the bandwidth in Hz and $n_{\mathrm{p}}$ the number of polarisations (Tingay et al., 2013).

The system temperature is given by $T=T_{\mathrm{sky}}+T_{\mathrm{rec}}$ , where $T_{\mathrm{sky}}$ is the sky temperature and $T_{\mathrm{rec}}$ the receiver temperature. Wayth et al. (2015) present measurements of the average sky temperature for pointings at different declinations and LSTs at multiple GLEAM frequencies. From this information, we obtain $T_{\mathrm{sky}}\approx 228~{}\mathrm{K}~{}(\nu/150~{}\mathrm{MHz})^{-2.53}$ at the location of the CDFS. Following Wayth et al. (2015), we set $T_{\mathrm{rec}}=50$ K except at $\nu>200$ MHz, where we set $T_{\mathrm{rec}}=80$ K; laboratory measurements by Sutinjo et al., in preparation, indicate that $T_{\mathrm{rec}}\approx 80$ K at $\nu>200$ MHz. We set $B=0.75\times 7.68$ MHz given a 25 per cent reduction in the bandwidth due to flagged edge channels. We set the remaining parameters as follows: $A_{\mathrm{eff}}=21.5~{}\mathrm{m}^{2}$ , $N=128$ , $\epsilon_{\mathrm{c}}=1.0$ , $\tau=2$ min and $n_{\mathrm{p}}=2$ . We also account for a 2.1-fold loss in sensitivity due to uniform weighting (Wayth et al., 2015). We find that the theoretical prediction agrees within 25 per cent with the Stokes $V$ noise measurements across the entire frequency range (see Fig. 14).
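The parameter values above can be combined into a quick numerical estimate. The sketch assumes the standard radiometer expression for an $N$-element array, $2k_{\mathrm{B}}T/\left(A_{\mathrm{eff}}\epsilon_{\mathrm{c}}\sqrt{N(N-1)B\tau n_{\mathrm{p}}}\right)$ , converted to Jy with a factor of $10^{26}$ ; the exact form used for the published prediction may differ in detail, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def sky_temperature(freq_mhz):
    """Sky temperature in K at the CDFS, from the power law quoted in the text."""
    return 228.0 * (freq_mhz / 150.0) ** -2.53

def thermal_noise_jy(freq_mhz, a_eff=21.5, n_tiles=128, eff=1.0, tau_s=120.0,
                     bandwidth_hz=0.75 * 7.68e6, n_pol=2, weighting_loss=2.1):
    """Uniformly weighted snapshot thermal noise in Jy/beam (assumed radiometer form)."""
    t_rec = 80.0 if freq_mhz > 200.0 else 50.0  # receiver temperature in K
    t_sys = sky_temperature(freq_mhz) + t_rec
    samples = n_tiles * (n_tiles - 1) * bandwidth_hz * tau_s * n_pol
    natural = 1e26 * 2.0 * K_B * t_sys / (a_eff * eff * math.sqrt(samples))
    return weighting_loss * natural  # 2.1-fold loss from uniform weighting

sigma_154 = thermal_noise_jy(154.0)
```

Under these assumptions the estimate for a 2 min narrow-band snapshot at 154 MHz is roughly 15 mJy/beam, and the noise rises towards lower frequencies as the sky temperature increases.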

We combine all the Stokes $V$ snapshot images to produce narrow- and wide-band Stokes $V$ mosaics, following the procedure described in Hurley-Walker et al. (2017) for Stokes $I$ . We measure the mean rms noise in region C of each Stokes $V$ mosaic. The blue horizontal bars in Fig. 13 show our thermal noise estimates for the narrow- and wide-band mosaics.

7.2 Estimating the theoretical noise limit

Given a source count model and beam size, we use the method of probability of deflection (Scheuer, 1957) to derive the exact shape of the source $P(D)$ distribution, $P_{\mathrm{c}}(D)$ , that is the probability distribution of pixel values resulting from all sources present in the image. We then estimate the rms classical confusion noise, $\sigma_{\mathrm{c}}$ , from the core width of this distribution.

A detailed explanation of the equations used to derive the $P_{\mathrm{c}}(D)$ can be found in Vernstrom et al. (2014). Briefly, we calculate the mean number of pixels per steradian with observed intensities between $x$ and $x+dx$ ,

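$$R(x)\,dx=\int_{\Omega}\frac{dN}{dS}\left(\frac{x}{B(\theta,\phi)}\right)B(\theta,\phi)^{-1}\,d\Omega\,dx,$$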

where $dN/dS$ is the differential source count and $x=SB(\theta,\phi)$ is the image response to a point source of flux density $S$ at a point in the synthesised beam where the relative gain is $B(\theta,\phi)$ . The predicted $P_{\mathrm{c}}(D)$ distribution is then computed from the Fourier Transform of $R(x)$ , such that

[TABLE]

where

[TABLE]
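For the circular Gaussian beam assumed later in this section, the integral for $R(x)$ can be carried out analytically. As a sketch, take the standard form $R(x)=\int_{\Omega}\frac{dN}{dS}\left(\frac{x}{B}\right)B^{-1}d\Omega$ and write $\rho$ for the angular offset from the beam centre, $\theta_{\mathrm{FWHM}}$ for the beam FWHM, $\Omega_{\mathrm{b}}$ for the beam solid angle and $N(>x)$ for the integral count. These symbols are introduced here for illustration, and the result assumes that the Gaussian beam extends to infinity. With

$$B(\rho)=\exp\left(-4\ln 2\,\frac{\rho^{2}}{\theta_{\mathrm{FWHM}}^{2}}\right),\qquad\Omega_{\mathrm{b}}=\frac{\pi\theta_{\mathrm{FWHM}}^{2}}{4\ln 2},$$

the solid angle element $2\pi\rho\,d\rho$ has magnitude $\Omega_{\mathrm{b}}\,dB/B$ , with $B$ running from 0 to 1 over the whole beam, so that

$$R(x)=\Omega_{\mathrm{b}}\int_{0}^{1}\frac{dN}{dS}\left(\frac{x}{B}\right)\frac{dB}{B^{2}}=\frac{\Omega_{\mathrm{b}}}{x}\int_{x}^{\infty}\frac{dN}{dS}\,dS=\frac{\Omega_{\mathrm{b}}\,N(>x)}{x}.$$

The second equality follows from the substitution $S=x/B$ . In this idealised case, $R(x)$ depends on the source counts only through the integral count above $x$ .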

The black curve in Fig. 15 is a weighted least squares $5^{\mathrm{th}}$ order polynomial fit to the 154 MHz GLEAM counts and the 150 MHz counts by Williams et al. (2016), extrapolated to 154 MHz with $\alpha=-0.8$ . The polynomial fit is given by

$$\log_{10}\left(S^{2.5}\frac{dN}{dS}\right)=\sum_{i=0}^{5}a_{i}\left[\log_{10}(S)\right]^{i},$$

where $a_{0}=3.52$ , $a_{1}=0.307$ , $a_{2}=-0.388$ , $a_{3}=-0.0404$ , $a_{4}=0.0351$ and $a_{5}=0.00600$ . The fit is valid over the flux density range 1 mJy–75 Jy.
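The fit can be evaluated numerically as sketched below, assuming it is a polynomial in $\log_{10}S$ for $\log_{10}\left(S^{2.5}\frac{dN}{dS}\right)$ with $S$ in Jy and the Euclidean normalised counts in $\mathrm{Jy}^{1.5}\,\mathrm{sr}^{-1}$ . The units and the function names are assumptions made for illustration.

```python
import numpy as np

# Coefficients a_0 ... a_5 quoted in the text.
A = [3.52, 0.307, -0.388, -0.0404, 0.0351, 0.00600]

def euclidean_counts_154(s_jy):
    """Euclidean normalised 154 MHz counts, S^2.5 dN/dS, from the polynomial fit."""
    log_s = np.log10(np.asarray(s_jy, dtype=float))
    # polyval evaluates A[0] + A[1]*x + ... + A[5]*x**5.
    log_n = np.polynomial.polynomial.polyval(log_s, A)
    return 10.0 ** log_n

def differential_counts_154(s_jy):
    """Differential counts dN/dS obtained by removing the S^2.5 normalisation."""
    s = np.asarray(s_jy, dtype=float)
    return euclidean_counts_154(s) * s ** -2.5
```

The fit should only be evaluated within its stated range of validity, 1 mJy to 75 Jy.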

Since no 154 MHz source count data are available below $\approx 1~{}\mathrm{mJy}$ , we use the 151 MHz SKADS model count after multiplying the simulated flux densities by 1.2 (see blue curve in Fig. 15). We choose to apply this flux density scaling factor as the model is then in better agreement with the observed counts above 1 mJy, as shown in Section 6. At 1 mJy, there is minimal discontinuity between the rescaled SKADS model and the above polynomial fit to the observed counts. Our preferred model, source count model A, consists of our polynomial fit to the observed counts above 1 mJy and the rescaled SKADS model below 1 mJy.

Below a few mJy, the LOFAR counts have relatively large uncertainties and the 151 MHz SKADS model, displayed as the red curve in Fig. 15, lies significantly below the LOFAR counts. There is minimal discontinuity between our polynomial fit to the observed counts and the SKADS model at 10 mJy. We therefore consider a second model, source count model B, consisting of the polynomial fit above 10 mJy and the SKADS model below 10 mJy.

In Section 5, we showed that a spectral index scaling of $\approx-0.8$ provides a good match between the GLEAM counts at $S_{154~{}\mathrm{MHz}}>0.5$ Jy. It is not clear whether this continues to be the case at lower flux densities. We extrapolate the models to other frequencies with $\alpha=-0.6$ , –0.8 and –1.0 in order to gauge the effect of spectral indices flatter and steeper than –0.8 on $\sigma_{\mathrm{c}}$ .

In calculating $P_{\mathrm{c}}(D)$ , we assume that the beam is a circular Gaussian with a full width at half-maximum (FWHM) $\theta=\sqrt{a_{\mathrm{psf,mean}}b_{\mathrm{psf,mean}}}$ , where $a_{\mathrm{psf,mean}}$ and $b_{\mathrm{psf,mean}}$ are the mean values of $a_{\mathrm{psf}}$ and $b_{\mathrm{psf}}$ in region C of the PSF map, respectively. This accounts for the increase in area of the PSF, resulting from ionospheric smearing.

The black curve in Fig. 16 shows the $P_{\mathrm{c}}(D)$ distribution that we derive in the wide-band image at 139–170 MHz using source count model A, where $\theta=2.6$ arcmin. The width of the distribution is measured by dividing the interquartile range by 1.349, i.e. the rms for a Gaussian distribution, obtaining $\sigma_{\mathrm{c}}=3.6$ mJy/beam.
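The core-width measurement can be sketched for a distribution tabulated on a uniform grid of deflections. The function name and the Gaussian test distribution are illustrative assumptions; a real $P_{\mathrm{c}}(D)$ would be skewed, which is why a robust estimator of this kind is preferred to the formal rms.

```python
import numpy as np

def core_width_from_pdf(d, pdf):
    """Interquartile range of a tabulated distribution divided by 1.349."""
    cdf = np.cumsum(pdf)
    cdf = cdf / cdf[-1]                       # normalise to unit total probability
    q25, q75 = np.interp([0.25, 0.75], cdf, d)  # deflections at the two quartiles
    return (q75 - q25) / 1.349

# Sanity check on a Gaussian with rms 3.6 (arbitrary units).
d = np.linspace(-20.0, 20.0, 4001)
width = core_width_from_pdf(d, np.exp(-0.5 * (d / 3.6) ** 2))
```

For a Gaussian the estimator returns the rms, since the interquartile range of a Gaussian is approximately 1.349 times its standard deviation.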

To account for the thermal noise, $\sigma_{\mathrm{t}}$ , $P_{\mathrm{c}}(D)$ must be convolved with the thermal noise distribution, $P_{\mathrm{n}}(D)$ , represented as a Gaussian with rms $\sigma_{\mathrm{t}}$ . The convolution of $P_{\mathrm{c}}(D)$ with $P_{\mathrm{n}}(D)$ can be expressed as

[TABLE]

Our thermal noise estimate in region C of the 139–170 MHz mosaic is 2.7 mJy/beam. The red curve in Fig. 16 is a Gaussian centred on zero with a standard deviation of 2.7 mJy/beam, representing $P_{\mathrm{n}}(D)$ , while the blue curve is the convolution of $P_{\mathrm{c}}(D)$ with $P_{\mathrm{n}}(D)$ . The blue curve has a core width of 4.8 mJy/beam, and we take this to be the theoretical noise limit, $\sigma_{\mathrm{lim}}$ .
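As a quick consistency check, if both $P_{\mathrm{c}}(D)$ and $P_{\mathrm{n}}(D)$ were Gaussian, their widths would add in quadrature:

$$\sqrt{\sigma_{\mathrm{c}}^{2}+\sigma_{\mathrm{t}}^{2}}=\sqrt{3.6^{2}+2.7^{2}}~\mathrm{mJy/beam}=4.5~\mathrm{mJy/beam}.$$

This is close to, but slightly below, the 4.8 mJy/beam core width of the convolved distribution. The difference presumably reflects the non-Gaussian shape of $P_{\mathrm{c}}(D)$ , for which a core width based on the interquartile range does not combine exactly in quadrature.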

We follow this procedure to derive $\sigma_{\mathrm{c}}$ and $\sigma_{\mathrm{lim}}$ for the narrow- and wide-band mosaics at all frequencies. We derive $\sigma_{\mathrm{c}}$ and $\sigma_{\mathrm{lim}}$ using both 154 MHz source count models and $\alpha=-0.6$ , –0.8 and –1.0 to extrapolate the models to other frequencies. The range of $\sigma_{\mathrm{c}}$ and $\sigma_{\mathrm{lim}}$ values is displayed in Fig. 13. We find that, at 154 MHz, $\sigma_{\mathrm{c}}$ changes by no more than 3 per cent depending on the source count model adopted. Varying the spectral index has a greater effect on $\sigma_{\mathrm{c}}$ at the upper and lower ends of the GLEAM frequency range.

7.3 Excess background noise

Fig. 13 reveals that the rms noise is a factor of $\approx 2$ to 3 higher than $\sigma_{\mathrm{lim}}$ in the narrow-band mosaics. In the wide-band mosaics, the rms noise is a factor of $\approx 2$ higher than $\sigma_{\mathrm{lim}}$ at the three highest frequencies, while it is only $\approx 25$ per cent higher than $\sigma_{\mathrm{lim}}$ at the lowest frequency. The lowest frequency wide-band mosaic is limited by classical confusion since $\sigma_{\mathrm{c}}$ is a factor of $\approx 4$ higher than $\sigma_{\mathrm{t}}$.

8 Origin of excess background noise in GLEAM images

Possible causes of the excess background noise in GLEAM images include sidelobe confusion, calibration errors, background emission from the Galactic Plane and extended sources not included in the source count model used to derive $\sigma_{\mathrm{c}}$ . We analyse the noise contribution in a GLEAM snapshot image at 139–170 MHz with a beam size of 2.4 arcmin, lying close to the CDFS; the image is displayed in Fig. 17. We then predict the visibilities for the measurement set using a realistic distribution of point sources and image the simulated $uv$ data using exactly the same parameters in wsclean as those used to image the real data. By comparing the $P(D)$ distributions in the real and simulated images, we show that the excess background noise is primarily caused by confusion from sidelobes of the ideal synthesized beam. Finally, we attempt to approach the theoretical noise limit using an improved deconvolution method.

8.1 Noise properties of a real GLEAM snapshot image at 139–170 MHz

We use the method described in Section 7.2 to calculate $P_{\mathrm{c}}(D)*P_{\mathrm{n}}(D)$ within the half-power contour of the primary beam. We derive $P_{\mathrm{c}}(D)$ given the beam size of 2.4 arcmin and assuming source count model A.

Fig. 18 shows the rms noise map of the Stokes $V$ image. The thermal noise in the centre of the field is 8 mJy/beam but varies by a factor of two across the field given the primary beam response. It follows that the thermal noise distribution cannot be well approximated as a single Gaussian. To address this problem, we divide the region into five concentric annuli such that the thermal noise varies by no more than 20 per cent in each annulus. The thermal noise in the $i^{\mathrm{th}}$ annulus, $\sigma_{\mathrm{t},i}$, is taken as the mean rms noise of the Stokes $V$ image in that annulus. $P_{\mathrm{c}}(D)*P_{\mathrm{n}}(D)$ is then taken as

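$$P_{\mathrm{c}}(D)*P_{\mathrm{n}}(D)=\frac{\sum_{i=1}^{5}A_{i}\,\left[P_{\mathrm{c}}(D)*P_{\mathrm{n},i}(D)\right]}{\sum_{i=1}^{5}A_{i}},$$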

where $P_{\mathrm{n},i}(D)$ is a Gaussian of width $\sigma_{\mathrm{t},i}$ representing the thermal noise distribution in the $i^{\mathrm{th}}$ annulus and $A_{i}$ is the area of the $i^{\mathrm{th}}$ annulus.
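The area weighting over annuli can be sketched as a Gaussian mixture. The example below is illustrative only: the annulus radii and noise values are hypothetical (chosen so that the noise doubles from 8 to 16 mJy/beam across the field), and the function names are not from the paper. Because convolution is linear, convolving $P_{\mathrm{c}}(D)$ with this mixture is equivalent to area-weighting the individual convolutions with each $P_{\mathrm{n},i}(D)$.

```python
import numpy as np

def gaussian_pdf(d, sigma):
    """Zero-mean Gaussian probability density with rms sigma."""
    return np.exp(-0.5 * (d / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def area_weighted_noise_pdf(d, sigmas, areas):
    """Area-weighted mixture of per-annulus Gaussian noise distributions."""
    weights = np.asarray(areas, dtype=float)
    weights = weights / weights.sum()
    pdf = np.zeros_like(d)
    for w, s in zip(weights, sigmas):
        pdf += w * gaussian_pdf(d, s)
    return pdf

# HYPOTHETICAL annuli: five equal-width rings, arbitrary radius units.
edges = np.linspace(0.0, 1.0, 6)
areas = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2)
sigmas = np.array([8.0, 9.5, 11.0, 13.0, 16.0])  # mJy/beam, illustrative only

d = np.linspace(-200.0, 200.0, 8001)             # mJy/beam
step = d[1] - d[0]
p_n = area_weighted_noise_pdf(d, sigmas, areas)

total_probability = p_n.sum() * step
mixture_rms = np.sqrt((d ** 2 * p_n).sum() * step)
```

The mixture has heavier wings than any single Gaussian of the same rms, which is why a single-Gaussian noise model is inadequate when the noise varies strongly across the field.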

The observed $P(D)$ distribution within the half-power contour of the primary beam, $P_{\mathrm{obs}}(D)$, is compared with $P_{\mathrm{c}}(D)*P_{\mathrm{n}}(D)$ in Fig. 19. The theoretical noise limit obtained from the core width of $P_{\mathrm{c}}(D)*P_{\mathrm{n}}(D)$ is 12.7 mJy/beam. In comparison, the core width of $P_{\mathrm{obs}}(D)$ is $\sigma_{\mathrm{obs}}=26.3$ mJy/beam, roughly twice the theoretical limit.

8.2 Simulations to investigate origin of excess background noise

The steps in simulating the image are as follows:

(1) We simulate a catalogue of point sources at 154 MHz, drawing flux densities randomly between 1 mJy and 70 Jy from source count model A. The sources lie at random positions within 40 deg from the field centre; this region is large enough to encompass the first sidelobe of the primary beam.

(2) From the simulated catalogue, we generate an image of the sky brightness distribution. Each simulated source is modelled as a $\delta$ function at the pixel closest to the source position. If more than one source is assigned to the same pixel, the flux densities of the sources are summed together. To account for the primary beam attenuation, the model image is multiplied by the primary beam response.

(3) We use the ‘-predict’ option in wsclean to predict the visibilities for the measurement set from the model image.

(4) We image the simulated $uv$ data using exactly the same parameters in wsclean as those used to image the real data. The real image was CLEANed to 150 mJy/beam; we ensure that the simulated image is CLEANed to the same flux density threshold.

(5) We add 8 mJy/beam rms Gaussian noise to the simulated image to account for the thermal noise.

(6) We divide the simulated image by the primary beam response.

In step 4, the simulated $uv$ data are imaged using a cell size of 32.7 arcsec, which corresponds to approximately one quarter of the synthesised beam size. Image pixelation effects coupled to the CLEAN deconvolution representation of the sky as a set of $\delta$ functions can limit the dynamic range of interferometric images (Cotton & Uson, 2008). In order to account for this effect in the simulations, sources must be placed at various positions between cells in the simulated image. We achieve this by employing a slightly different cell size for the model image in step 2, which is used to simulate the $uv$ data.
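Steps 1 and 2 of the list above can be sketched as follows. This is a simplified illustration, not the simulation used in the paper: a single power law $n(S)\propto S^{-\gamma}$ with $\gamma=2$ stands in for source count model A, positions are drawn on a small flat pixel grid rather than over a 40 deg radius, and the circular Gaussian primary beam is hypothetical. All function names are illustrative.

```python
import numpy as np

def draw_flux_densities(n, s_min, s_max, gamma, rng):
    """Inverse-transform sampling from n(S) ~ S**-gamma on [s_min, s_max].

    Valid for gamma != 1.
    """
    a = 1.0 - gamma
    u = rng.random(n)
    return (s_min ** a + u * (s_max ** a - s_min ** a)) ** (1.0 / a)

def make_model_image(x_pix, y_pix, flux, npix):
    """Place each source as a delta function at the nearest pixel.

    np.add.at accumulates correctly when several sources share a pixel.
    """
    image = np.zeros((npix, npix))
    ix = np.rint(x_pix).astype(int)
    iy = np.rint(y_pix).astype(int)
    inside = (ix >= 0) & (ix < npix) & (iy >= 0) & (iy < npix)
    np.add.at(image, (iy[inside], ix[inside]), flux[inside])
    return image

def gaussian_primary_beam(npix, fwhm_pix):
    """HYPOTHETICAL circular Gaussian beam with unit peak at the image centre."""
    y, x = np.indices((npix, npix))
    centre = (npix - 1) / 2.0
    r2 = (x - centre) ** 2 + (y - centre) ** 2
    return np.exp(-4.0 * np.log(2.0) * r2 / fwhm_pix ** 2)

rng = np.random.default_rng(1)
npix = 256
flux = draw_flux_densities(5000, 1e-3, 70.0, 2.0, rng)   # Jy, 1 mJy to 70 Jy
x = rng.uniform(0, npix - 1, flux.size)
y = rng.uniform(0, npix - 1, flux.size)

sky = make_model_image(x, y, flux, npix)                  # true sky model
apparent = sky * gaussian_primary_beam(npix, fwhm_pix=100.0)
```

The `apparent` image plays the role of the model from which visibilities would be predicted in step 3. As noted above, the real simulation uses a slightly different cell size for the model image than for the final image so that sources do not all fall on pixel centres.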

We find that the $P(D)$ distribution in the simulated image, $P_{\mathrm{sim}}(D)$ , is remarkably similar to $P_{\mathrm{obs}}(D)$ , as shown in Fig. 19. The core width of $P_{\mathrm{sim}}(D)$ , $\sigma_{\mathrm{sim}}=24.0$ mJy/beam, is only $\approx$ 9 per cent lower than $\sigma_{\mathrm{obs}}$ . Since the simulated image contains no calibration artefacts, this suggests that the excess background noise in the snapshot image is primarily due to sidelobe confusion.

We repeat the simulations using source count model B but this makes negligible difference to $\sigma_{\mathrm{sim}}$ . Calibration artefacts may explain the slightly higher noise level in the real image, as well as residual sidelobes from Fornax A ( $S_{154~{}\mathrm{MHz}}=750$ Jy; McKinley et al. 2015), which are clearly visible in the real image.

8.3 Improving the deconvolution

The GLEAM snapshot referred to at the beginning of Section 8 was imaged using wsclean v1.10. The pixel size was set to $32.7\times 32.7~{}\mathrm{arcsec}^{2}$ and the image size to $4000\times 4000$ pixels, such that the image encompasses the $\approx 10$ per cent level of the primary beam. The snapshot was imaged down to the first negative CLEAN component. The rms noise of this initial image, $\sigma=50$ mJy/beam, was measured and the snapshot was re-imaged down to a CLEAN threshold of $3\sigma$ (150 mJy/beam). In practice, this CLEANing strategy generally leaves significant residual emission undeconvolved.

We re-image the snapshot using wsclean v2.5, which is more efficient for large images thanks to the implementation of the Clark CLEAN algorithm (Clark, 1980). In minor CLEAN cycles, CLEAN components are subtracted from the image using only the central portion of the PSF and only the largest residuals are searched. This is sufficient to find the CLEAN components providing that the synthesised beam is well behaved; the accuracy of the subtraction is improved during major CLEAN cycles where the FT of the CLEAN components is subtracted from the residual visibility data.

Using wsclean v2.5, we CLEAN the entire image to 3 $\sigma$ , construct a mask from the identified CLEAN components and continue CLEANing with the mask to $1\sigma$ . This is conducted in an automated fashion using the ‘auto-mask’ and ‘auto-threshold’ parameters. It is not necessary to provide wsclean with an estimate of $\sigma$ as the algorithm automatically calculates the standard deviation of the residual image before the start of every major CLEAN cycle, which it then uses to set the CLEAN threshold. This is desirable since, in practice, the noise can drop considerably after the first few major CLEAN cycles as the image quality improves. The use of a mask permits CLEANing down to the noise level.
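The strategy described above can be sketched as a wsclean command line assembled in Python. The image size, cell size and the $3\sigma$/$1\sigma$ levels come from the text; the iteration cap, major-cycle gain, output name and measurement set path are assumptions added for illustration, and the flag spellings should be checked against the installed wsclean version. `build_wsclean_args` is a hypothetical helper.

```python
def build_wsclean_args(ms_path, image_size=4000, cell_arcsec=32.7,
                       mask_sigma=3.0, threshold_sigma=1.0,
                       niter=1_000_000, mgain=0.85, name="snapshot"):
    """Assemble a wsclean argument list for auto-masked deep CLEANing.

    niter, mgain and name are ASSUMED values, not taken from the paper.
    """
    return [
        "wsclean",
        "-name", name,
        "-size", str(image_size), str(image_size),
        "-scale", f"{cell_arcsec}asec",
        "-niter", str(niter),                       # assumed iteration cap
        "-mgain", str(mgain),                       # assumed; enables major cycles
        "-auto-mask", str(mask_sigma),              # build mask at 3 sigma
        "-auto-threshold", str(threshold_sigma),    # continue to 1 sigma in mask
        ms_path,
    ]

args = build_wsclean_args("snapshot.ms")            # hypothetical path
command_line = " ".join(args)
```

The list can be passed to `subprocess.run(args, check=True)` on a system where wsclean is installed; building it separately makes it easy to vary the image size, as done later in this section.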

The total number of CLEAN iterations using wsclean v2.5 is $\approx 190,000$ while it is only $\approx 25,000$ using wsclean v1.10. Despite the much larger number of CLEAN iterations, the processing time for wsclean v2.5 is $\approx$ 4 times shorter. The $P(D)$ distributions obtained using the two versions of wsclean are compared with the theoretical noise limit in Fig. 20. There is a $\approx 29$ per cent reduction in $\sigma_{\mathrm{obs}}$ using wsclean v2.5.

We investigate whether the noise can be reduced further by increasing the size of the region being CLEANed. We re-run wsclean v2.5 increasing the image size from $4000\times 4000$ to $6000\times 6000$ pixels. The imaged field-of-view now encompasses the first null of the primary beam. The resulting $P(D)$ distribution is displayed in Fig. 20. There is a further $\approx 21$ per cent reduction in $\sigma_{\mathrm{obs}}$ , which is now only $\approx 15$ per cent above $\sigma_{\mathrm{lim}}$ .
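As a rough consistency check on these percentages, assume that both reductions apply to the value $\sigma_{\mathrm{obs}}=26.3$ mJy/beam from Section 8.1 and that $\sigma_{\mathrm{lim}}=12.7$ mJy/beam is unchanged (Table 4 holds the actual measured values):

$$26.3\times(1-0.29)\approx 18.7~\mathrm{mJy/beam},\qquad 18.7\times(1-0.21)\approx 14.8~\mathrm{mJy/beam},\qquad \frac{14.8}{12.7}\approx 1.16.$$

Under these assumptions the final noise is about 16 per cent above $\sigma_{\mathrm{lim}}$, in line with the quoted $\approx 15$ per cent given the rounding of the individual percentages.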

The results of this analysis are summarised in Table 4. We conclude that both the limited CLEANing depth and far-field sources that have not been deconvolved contribute significantly to the sidelobe confusion in GLEAM. The Clark optimisation is highly effective for large MWA images, permitting deeper CLEANing. We recommend adopting this technique in future, together with auto-masking and deeper CLEAN thresholds, to fully exploit MWA survey images.

9 Prospects for MWA phase 2

Since the GLEAM survey observations were carried out, the MWA has been upgraded with the addition of a further 128 tiles, 56 of which lie on baselines up to $\approx 6$ km, roughly improving the array resolution by a factor of two (Wayth et al., 2018). The correlator capacity was not increased in Phase 2 of the MWA, so it is still only possible to correlate 128 tiles. In this section, we give an overview of how we expect $\sigma_{\mathrm{c}}$ and the rms sidelobe confusion noise, $\sigma_{\mathrm{s}}$ , to change for MWA Phase 2 observations.

We use the miriad (Sault et al., 1995) task uvgen to simulate an image of the MWA Phase 2 PSF for a 2 min snapshot with a central frequency of 154 MHz and a bandwidth of 30.72 MHz, using a uniform weighting scheme. We fit a Gaussian to the main lobe of the synthesised beam; the geometric average of the major and minor axes of the fitted Gaussian is 1.15 arcmin. We use the method described in Section 7.2 to derive $\sigma_{\mathrm{c}}$ as a function of frequency for MWA Phase 2, setting the beam size $\theta_{\mathrm{Phase~{}2}}=1.15~{}\mathrm{arcmin}~{}(\nu/154~{}\mathrm{MHz})^{-1}$ . We find that $\sigma_{\mathrm{c,~{}Phase~{}1}}/\sigma_{\mathrm{c,~{}Phase~{}2}}$ varies from $\approx 5$ at the high end of the band to $\approx 7$ at the low end of the band, as shown in the top panel of Fig. 21. The top panel of Fig. 21 also includes estimates of $\sigma_{\mathrm{c}}$ for larger, hypothetical arrays with maximum baselines of 9, 12 and 18 km, where we set the beam size to $\frac{2}{3}\theta_{\mathrm{Phase~{}2}}$ , $\frac{1}{2}\theta_{\mathrm{Phase~{}2}}$ and $\frac{1}{3}\theta_{\mathrm{Phase~{}2}}$ , respectively.
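The beam-size scaling used here can be written as a short function. This is a sketch under the assumption, implied by the factors $\frac{2}{3}$, $\frac{1}{2}$ and $\frac{1}{3}$ above, that the beam size scales inversely with maximum baseline relative to an approximate 6 km Phase 2 baseline; the names are hypothetical.

```python
PHASE2_FWHM_ARCMIN_154 = 1.15   # fitted main-lobe FWHM at 154 MHz (from the text)
PHASE2_MAX_BASELINE_KM = 6.0    # approximate Phase 2 maximum baseline

def beam_fwhm_arcmin(nu_mhz, max_baseline_km=PHASE2_MAX_BASELINE_KM):
    """Beam FWHM scaling as 1/frequency and, by assumption, 1/baseline."""
    theta_phase2 = PHASE2_FWHM_ARCMIN_154 * (nu_mhz / 154.0) ** -1
    return theta_phase2 * PHASE2_MAX_BASELINE_KM / max_baseline_km

# Beam sizes at 154 MHz for Phase 2 and the three hypothetical arrays.
beams_154 = {b: beam_fwhm_arcmin(154.0, b) for b in (6.0, 9.0, 12.0, 18.0)}
```

Each beam size can then be fed into the $P(D)$ calculation of Section 7.2 to obtain the corresponding $\sigma_{\mathrm{c}}$.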

The classical confusion noise at 154 MHz as a function of beam size is displayed in the bottom panel of Fig. 21. We fit the function $\sigma_{\mathrm{c}}=a\theta^{b}$ in three different $\theta$ ranges and find that $b$ drops with decreasing $\theta$, with $b=2.61$ for $\theta=2.0-4.0$ arcmin, $b=2.18$ for $\theta=1.0-2.0$ arcmin and $b=1.83$ for $\theta=0.5-1.0$ arcmin. Condon (1974) showed that for a power-law differential source count $n(S)=kS^{-\gamma}$, $\sigma_{\mathrm{c}}\propto\theta^{\frac{2}{\gamma-1}}$. The flattening of the 154 MHz Euclidean normalised differential counts below $\approx 10$ mJy (corresponding to an increase in $\gamma$) can therefore explain the drop in $b$ with decreasing $\theta$.
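The link between the fitted exponent and the count slope can be made explicit by inverting Condon's relation, $\gamma=1+2/b$. The sketch below also shows one common way to fit $\sigma_{\mathrm{c}}=a\theta^{b}$, a straight-line fit in log-log space; this is an assumption about method, since the fitting procedure is not specified above, and the function names are hypothetical.

```python
import numpy as np

def fit_power_law(theta, sigma_c):
    """Least-squares fit of sigma_c = a * theta**b in log-log space."""
    b, log_a = np.polyfit(np.log(theta), np.log(sigma_c), 1)
    return np.exp(log_a), b

def gamma_from_b(b):
    """Invert Condon's relation b = 2 / (gamma - 1)."""
    return 1.0 + 2.0 / b

# Effective count slopes implied by the three fitted exponents in the text.
implied_gamma = {b: round(gamma_from_b(b), 2) for b in (2.61, 2.18, 1.83)}

# Self-check of the fit on synthetic data with a known exponent.
theta = np.linspace(1.0, 2.0, 20)
a_fit, b_fit = fit_power_law(theta, 0.5 * theta ** 2.18)
```

The implied effective slopes are $\gamma\approx 1.77$, 1.92 and 2.09 for the three ranges, i.e. $\gamma$ increases as the beam shrinks and fainter sources dominate the confusion.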

Bowman, Morales & Hewitt (2009) derive an expression for the variance in the intensity of a dirty sky map assuming that the primary and synthesised beams are described by top-hat functions, such that the response is defined to be one within a region of diameter $\Theta_{\mathrm{P}}$ in the case of the primary beam and within a region of diameter $\Theta_{\mathrm{B}}$ in the case of the synthesised beam. Outside this region, the response is taken to be zero for the primary beam and a constant value of $B_{\mathrm{rms}}\ll 1$ for the synthesised beam, representing the standard deviation of the synthesised beam sidelobes. With these simplifications,

[TABLE]

where $\Omega_{\mathrm{B}}\approx\Theta_{\mathrm{B}}^{2}$ is the solid angle of the synthesised beam and $\Omega_{\mathrm{P}}\approx\Theta_{\mathrm{P}}^{2}$ is the solid angle of the primary beam.

For the MWA, $B_{\mathrm{rms}}$ varies strongly with distance, $d$ , from the beam centre. We measure $B_{\mathrm{rms}}$ in MWA Phase 1 and 2 PSF images as a function of $d$ (see Fig. 22). Since $B_{\mathrm{rms,~{}Phase~{}1}}\gtrapprox 2~{}B_{\mathrm{rms,~{}Phase~{}2}}$ , $\Omega_{\mathrm{P,~{}Phase~{}1}}=\Omega_{\mathrm{P,~{}Phase~{}2}}$ and $\Omega_{\mathrm{B,~{}Phase~{}1}}\approx 4~{}\Omega_{\mathrm{B,~{}Phase~{}2}}$ ,

[TABLE]

We therefore expect that $\sigma_{\mathrm{s,~{}Phase~{}1}}/\sigma_{\mathrm{s,~{}Phase~{}2}}\gtrapprox 5$ across the MWA frequency range, assuming that the MWA Phase 1 and 2 images are CLEANed to the same flux density threshold. We must also consider that MWA Phase 2 images will take longer to image because of the increased resolution. The calibration of MWA Phase 2 data will be more challenging and, depending on the ionospheric conditions, direction-dependent calibration techniques will probably be required to reach the theoretical noise limit (see e.g. Offringa et al. 2016, Rioja, Dodson & Franzen, submitted).
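The scaling of $\sigma_{\mathrm{s}}$ quoted above can be sketched as follows. As an assumption, take the sidelobe confusion variance in the top-hat approximation to have the form $\sigma_{\mathrm{s}}^{2}\approx B_{\mathrm{rms}}^{2}\,(\Omega_{\mathrm{P}}/\Omega_{\mathrm{B}})\,\sigma_{\mathrm{c}}^{2}$, i.e. $\Omega_{\mathrm{P}}/\Omega_{\mathrm{B}}$ independent synthesised-beam areas, each contributing a confusion variance $\sigma_{\mathrm{c}}^{2}$ attenuated by $B_{\mathrm{rms}}^{2}$. With equal primary beam solid angles,

$$\frac{\sigma_{\mathrm{s,~Phase~1}}}{\sigma_{\mathrm{s,~Phase~2}}}\approx\frac{B_{\mathrm{rms,~Phase~1}}}{B_{\mathrm{rms,~Phase~2}}}\sqrt{\frac{\Omega_{\mathrm{B,~Phase~2}}}{\Omega_{\mathrm{B,~Phase~1}}}}\,\frac{\sigma_{\mathrm{c,~Phase~1}}}{\sigma_{\mathrm{c,~Phase~2}}}\gtrsim 2\times\frac{1}{2}\times\frac{\sigma_{\mathrm{c,~Phase~1}}}{\sigma_{\mathrm{c,~Phase~2}}}=\frac{\sigma_{\mathrm{c,~Phase~1}}}{\sigma_{\mathrm{c,~Phase~2}}}.$$

Under this assumed form, the classical confusion ratio of $\approx 5$ to 7 found above carries over directly to a sidelobe confusion ratio of $\gtrsim 5$.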

10 Summary and future work

GLEAM is a contiguous 72–231 MHz survey of the entire sky south of declination $+30^{\circ}$ and has the widest fractional bandwidth and highest surface brightness sensitivity among low radio frequency surveys. We have determined the GLEAM source counts at 200, 154, 118 and 88 MHz to a flux density limit of 50, 80, 120 and 290 mJy respectively, to high precision. The 200 MHz counts are based on the GLEAM extragalactic catalogue by Hurley-Walker et al. (2017). From the three lowest 30.72 MHz sub-band images of GLEAM, we have constructed additional, statistically complete source samples at 154, 118 and 88 MHz to measure the counts at these frequencies.

The counts at 154 and 88 MHz are overall in good agreement with other counts in the literature at a similar frequency. The 151 MHz SKADS model significantly underpredicts the 154 MHz GLEAM counts at $S\gtrsim 50$ mJy. The cause of the discrepancy is unclear. The model is based on the 151 MHz luminosity function of high-luminosity radio galaxies by Willott et al. (2001), which in turn was determined using measurements of the local radio luminosity function (LRLF) for AGN. Since no measurements of the LRLF for AGN were available at 151 MHz, Willott et al. used the LRLF for AGN at 1.4 GHz by Cotton & Condon (1998) and made a simple shift in radio power assuming $\alpha=-0.8$ . They also chose to fit a Schechter luminosity function, whose exponential high-luminosity cutoff is likely too sharp to describe radio galaxies. We find that the model is statistically in much better agreement with the data after multiplying the simulated flux densities by 1.2.

At $S_{154~{}\mathrm{MHz}}>0.5$ Jy, there is no discernible change in the shape of the counts at the four frequencies: a spectral index scaling of $\approx-0.8$ provides a good match between the counts. The spectra of individual sources show, on average, a slight but significant flattening of $\delta\alpha_{76}^{227}\approx 0.1$ between 0.5 and 0.1 Jy.

We might have expected to see a change in the source count shape with frequency due to spectral curvature of generations of sources at different redshifts. The fact that GLEAM is overwhelmingly dominated by sources with steep, power-law spectra indicates that there is no simple way of tracing ageing or evolution of the bright source population from this set of frequencies.

The low-frequency emission from star-forming galaxies remains largely unstudied. Detailed measurements of their spectra are important for understanding the physical processes which contribute to the radio emission from star formation. They can also be used to construct more accurate low frequency source counts, which will be invaluable for planning deep low-frequency surveys with future facilities. Galvin et al. (2018) measured the radio spectra of 19 luminous infrared galaxies (LIRGs) at $0.067<z<0.227$ using GLEAM and Australia Telescope Compact Array (ATCA) follow-up observations at 2.1–45 GHz. They found that many of the sources exhibit low-frequency turnovers in their spectra which can be attributed, in large part, to free-free absorption. Deep LOFAR observations in small-area fields are also probing the low frequency behaviour of star-forming galaxies. The LoTSS is expected to detect hundreds of thousands of star-forming galaxies, primarily at lower redshifts but extending out to $z\geq 1$ .

Although GLEAM is overwhelmingly dominated by radio-loud AGN, the SKADS model predicts that GLEAM contains $375\pm 80$ local ( $z<0.1$ ) star-forming galaxies with $S_{200~{}\mathrm{MHz}}>50$ mJy in region B, covering $\approx 6500~{}\mathrm{deg}^{2}$ . In a future paper, we will cross-match the GLEAM catalogue with nearby optical samples to determine the LRLF for both AGN and star-forming galaxies at 154 MHz. We will correlate the local radio sample with higher frequency surveys including NVSS and SUMSS to characterise the typical spectra of these two populations. We also plan to investigate changes in the spectral behaviour of AGN with respect to radio morphology and luminosity.

Using deep 150 MHz LOFAR counts by Williams et al. (2016) and the SKADS model, we have conducted a $P(D)$ analysis to derive the classical confusion noise in GLEAM images. While the images are limited by classical confusion below $\approx 100$ MHz, the rms noise is a factor of $\approx 2$ higher than the theoretical noise limit, accounting for both the thermal noise and classical confusion, at higher frequencies. By analysing a synthetic snapshot image containing a realistic distribution of point sources, we have demonstrated that the excess background noise is primarily due to confusion from sidelobes of the ideal synthesized beam. We have shown that we can approach the theoretical noise limit using the Clark CLEAN algorithm implemented in wsclean, along with deeper deconvolution and larger image size to encompass the first null of the primary beam.

For the MWA Phase 2 array with the angular resolution improved by a factor of two, we anticipate that both the classical and sidelobe confusion noise will drop by a factor of $\approx 5$ at the high end of the band. Deep pointed observations of the Galaxy and Mass Assembly (GAMA; Driver et al., 2009) 23 field, centred at Dec $-32.5^{\circ}$ , have been made with MWA Phase 2 (Seymour et al., in preparation) at 72–231 MHz with the goal of producing a radio luminosity function and investigating its dependence on MWA in-band spectral index. This work will demonstrate the ‘deep’ imaging quality which MWA Phase 2 can provide and will include an investigation of the factors which affect the noise.

Acknowledgements.

This scientific work makes use of the Murchison Radio-astronomy Observatory, operated by CSIRO. We acknowledge the Wajarri Yamatji people as the traditional owners of the Observatory site. Support for the operation of the MWA is provided by the Australian Government (NCRIS), under a contract to Curtin University administered by Astronomy Australia Limited. We thank the anonymous referee for helpful comments, which have substantially improved this paper. We acknowledge the Pawsey Supercomputing Centre which is supported by the Western Australian and Australian Governments. CAJ thanks the Department of Science, Office of Premier & Cabinet, WA for their support through the Western Australian Fellowship Program.

Appendix A Source count data

The 200, 154, 118 and 88 MHz source count data presented in this paper are provided in Table 5.

Bibliography

Baars J. W. M., Genzel R., Pauliny-Toth I. I. K., Witzel A., 1977, A&A, 61, 99
Becker R. H., White R. L., Helfand D. J., 1995, ApJ, 450, 559
Bertin E., Mellier Y., Radovich M., Missonnier G., Didelon P., Morin B., 2002, in Astronomical Society of the Pacific Conference Series, Vol. 281, Astronomical Data Analysis Software and Systems XI, Bohlender D. A., Durand D., Handley T. H., eds., p. 228
Blundell K. M., Rawlings S., Willott C. J., 1999, AJ, 117, 677
Bowman J. D., Morales M. F., Hewitt J. N., 2009, ApJ, 695, 183
Callingham J. R. et al., 2017, ApJ, 836, 174
Carroll P. A. et al., 2016, MNRAS, 461, 4151
Chapman E. et al., 2012, MNRAS, 423, 2518