Gaussian Multiple Access Channels with One-Bit Quantizer at the Receiver

Borzoo Rassouli; Morteza Varasteh; Deniz Gunduz

arXiv:1703.02324·cs.IT·October 17, 2018


TL;DR

This paper investigates the capacity region of a Gaussian multiple access channel with a one-bit ADC at the receiver, revealing that optimal inputs are discrete and providing bounds on their complexity.

Contribution

It establishes that optimal input distributions are discrete and derives bounds on their number of mass points for channels with one-bit quantization.

Findings

1. Optimal input distributions are discrete.

2. Upper bounds on the number of mass points are derived.

3. Capacity region characterization under one-bit quantization.

Abstract

The capacity region of a two-transmitter Gaussian multiple access channel (MAC) under average input power constraints is studied, when the receiver employs a zero-threshold one-bit analog-to-digital converter (ADC). It is proved that the input distributions of the two transmitters that achieve the boundary points of the capacity region are discrete. Based on the position of a boundary point, upper bounds on the number of the mass points of the corresponding distributions are derived.

Equations

Y_{i} = \Gamma(X_{1,i} + X_{2,i} + Z_{i}), \quad i \in [1:n],

\Gamma(x) = \begin{cases} 1 & x \geq 0 \\ 0 & x < 0 \end{cases}.

p(0|x_{1},x_{2}) = 1 - p(1|x_{1},x_{2}) = Q(x_{1}+x_{2}),

P_{e}^{(n)} = \mbox{Pr}\{(\hat{W}_{1},\hat{W}_{2}) \neq (W_{1},W_{2})\}

\frac{1}{n}\sum_{i=1}^{n} x_{j,i}^{2}(w_{j}) \leq P_{j}, \quad \forall w_{j} \in [1:2^{nR_{j}}],\ j \in \{1,2\},

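R_{1} \leq I(X_{1};Y|X_{2},U), \quad R_{2} \leq I(X_{2};Y|X_{1},U), \quad R_{1}+R_{2} \leq I(X_{1},X_{2};Y|U)

g_{1}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u)) = I(X_{1};Y|X_{2},U=u)

g_{2}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u)) = I(X_{2};Y|X_{1},U=u)

g_{3}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u)) = I(X_{1},X_{2};Y|U=u)

g_{4}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u)) = \mathds{E}[X_{1}^{2}|U=u]

g_{5}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u)) = \mathds{E}[X_{2}^{2}|U=u]

r_{k} = \int_{\mathcal{U}} g_{k}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u), \quad k \in [1:5]

R_{1} \leq I(X_{1};Y|X_{2}), \quad R_{2} \leq I(X_{2};Y|X_{1}), \quad R_{1}+R_{2} \leq I(X_{1},X_{2};Y)

R_{1} \leq I(X_{1};Y|X_{2},U), \quad R_{2} \leq I(X_{2};Y|X_{1},U), \quad R_{1}+R_{2} \leq I(X_{1},X_{2};Y|U)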

\max_{(R_{1},R_{2})\in\mathscr{R}_{1}} R_{1}+R_{2} = \max_{F_{X_{1}}F_{X_{2}}:\ \mathds{E}[X_{j}^{2}]\leq 1,\ \mathds{E}[X_{j}]=0,\ j=1,2} I(X_{1}+X_{2};Y)

\leq \max_{F_{X}(x):\ \mathds{E}[X^{2}]\leq 2,\ \mathds{E}[X]=0} I(X;Y)

\leq \max_{F_{X}(x):\ \mathds{E}[X^{2}]\leq 2} I(X;Y)

= 1 - H_{b}(Q(\sqrt{2}))

\leq \max_{F_{U}F_{X_{1}|U}F_{X_{2}|U}:\ \mathds{E}[X_{j}^{2}]\leq 1,\ \mathds{E}[X_{j}]=0,\ j=1,2} I(X_{1},X_{2};Y|U)

= \max_{(R_{1},R_{2})\in\mathscr{C}} R_{1}+R_{2},

d_{L}(F,G) = \inf\{\epsilon>0\ |\ F(x-\epsilon)-\epsilon \leq G(x) \leq F(x+\epsilon)+\epsilon,\ \forall x\in\mathbb{R}\}.

F_{\tilde{X}}(x) = pp^{\prime}s(x+2\sqrt{2}) + (p\hat{*}p^{\prime})s(x+\sqrt{2}) + (1-2(pp^{\prime}+p\hat{*}p^{\prime}))s(x) + (p\hat{*}p^{\prime})s(x-\sqrt{2}) + pp^{\prime}s(x-2\sqrt{2}),

d_{L}(F_{\tilde{X}},F^{*}_{X}) = \max\{pp^{\prime},\ \frac{1}{2}-pp^{\prime}-p\hat{*}p^{\prime}\} = \frac{1}{2}-pp^{\prime}-p\hat{*}p^{\prime}.

\min_{p,p^{\prime}\leq\frac{1}{4}} d_{L}(F_{\tilde{X}},F^{*}_{X}) = \frac{3}{16}.

(n_{1},n_{2}) = \begin{cases} (3,5) & l_{P}<-1 \\ (3,3) & l_{P}=-1 \\ (5,3) & l_{P}>-1 \end{cases}.

\Omega = \{F_{U,X_{1},X_{2}}\ |\ U\in\mathcal{U},\ X_{1}-U-X_{2},\ \mathds{E}[X_{j}^{2}]\leq P_{j},\ j=1,2\},

(R_{1}^{b},R_{2}^{b}) = \arg\max_{(R_{1},R_{2})\in\mathscr{C}(P_{1},P_{2})} R_{1}+\lambda R_{2},

\max_{(R_{1},R_{2})\in\mathscr{C}(P_{1},P_{2})} R_{1}+\lambda R_{2} = \begin{cases} \max I(X_{1};Y|X_{2},U)+\lambda I(X_{2};Y|U) & 0<\lambda\leq 1 \\ \max I(X_{2};Y|X_{1},U)+\lambda I(X_{1};Y|U) & \lambda>1 \end{cases},

I_{\lambda}(F_{X_{1}}F_{X_{2}}) = \begin{cases} I(X_{1};Y|X_{2})+\lambda I(X_{2};Y) & 0<\lambda\leq 1 \\ I(X_{2};Y|X_{1})+\lambda I(X_{1};Y) & \lambda>1 \end{cases}.

Full text

Gaussian Multiple Access Channels with One-Bit Quantizer at the Receiver

Borzoo Rassouli, Morteza Varasteh and Deniz Gündüz

Borzoo Rassouli, Morteza Varasteh and Deniz Gündüz are with the Intelligent Systems and Networks group of Department of Electrical and Electronics, Imperial College London, United Kingdom. Emails: {b.rassouli12, m.varasteh12, d.gunduz}@imperial.ac.uk.

Abstract

The capacity region of a two-transmitter Gaussian multiple access channel (MAC) under average input power constraints is studied, when the receiver employs a zero-threshold one-bit analog-to-digital converter (ADC). It is proved that the input distributions of the two transmitters that achieve the boundary points of the capacity region are discrete. Based on the position of a boundary point, upper bounds on the number of the mass points of the corresponding distributions are derived.

Index Terms:

Gaussian multiple access channel, one-bit quantizer, capacity region. (This work has been presented partially in [1].)

I Introduction

The energy consumption of an analog-to-digital converter (ADC) (measured in Joules/sample) grows exponentially with its resolution (in bits/sample) [2], [3]. When the available power is limited, for example, for mobile devices with limited battery capacity, or for wireless receivers that operate on limited energy harvested from ambient sources [4], the receiver circuitry may be constrained to operate with low resolution ADCs. The presence of a low-resolution ADC, in particular a one-bit ADC at the receiver, alters the channel characteristics significantly. Such a constraint not only limits the fundamental bounds on the achievable rate, but it also changes the nature of the communication and modulation schemes approaching these bounds. For example, in a real additive white Gaussian noise (AWGN) channel under an average power constraint on the input, if the receiver is equipped with a $K$ -bin (i.e., $\log_{2}K$ -bit) ADC front end, it is shown in [5] that the capacity-achieving input distribution is discrete with at most $K+1$ mass points. This is in contrast with the optimality of the Gaussian input distribution when the receiver has infinite resolution.

Especially with the adoption of massive multiple-input multiple-output (MIMO) receivers and millimeter wave (mmWave) technology enabling communication over large bandwidths, communication systems with limited-resolution receiver front ends are becoming of practical importance. Accordingly, there has been growing research interest in understanding both the fundamental information-theoretic limits and the design of practical communication protocols for systems with finite-resolution ADC front ends. In [6], the authors show that for a Rayleigh fading channel with a one-bit ADC and perfect channel state information at the receiver (CSIR), quadrature phase shift keying (QPSK) modulation is capacity-achieving. In the case of no CSIR, [7] shows that QPSK modulation is optimal when the signal-to-noise ratio (SNR) is above a certain threshold, which depends on the coherence time of the channel, while for SNRs below this threshold, on-off QPSK achieves the capacity. For the point-to-point MIMO channel with a one-bit ADC front end at each receive antenna and perfect CSIR, [8] shows that QPSK is optimal at very low SNRs, while with perfect channel state information at the transmitter (CSIT), upper and lower bounds on the capacity are provided in [9].

To the best of our knowledge, the existing literature on communications with low-resolution ADCs focuses exclusively on point-to-point systems. Our goal in this paper is to understand the impact of low-resolution ADCs on the capacity region of a multiple access channel (MAC). In particular, we consider a two-transmitter Gaussian MAC with a one-bit quantizer at the receiver. The inputs to the channel are subject to average power constraints. We show that any point on the boundary of the capacity region is achieved by discrete input distributions. Based on the slope of the tangent line to the capacity region at a boundary point, we propose upper bounds on the cardinality of the support of these distributions.

The paper is organized as follows. Section II introduces the system model. In Section III, the capacity region of a general two-transmitter memoryless MAC under average input power constraints is investigated. Through an example, it is shown that when there is an average input power constraint, it is necessary in general to consider the capacity region with the auxiliary random variable $U$ . The main result of the paper is also presented in Section III, and a detailed proof is given in Section IV. Finally, Section V concludes the paper.

Notations. Random variables are denoted by capital letters, while their realizations are denoted by lower case letters. $F_{X}(x)$ denotes the cumulative distribution function (CDF) of random variable $X$ . The conditional probability mass function (pmf) $p_{Y|X_{1},X_{2}}(y|x_{1},x_{2})$ will be written as $p(y|x_{1},x_{2})$ . For integers $m\leq n$ , we have $[m:n]=\{m,m+1,\ldots,n\}$ . For $0\leq t\leq 1$ , $H_{b}(t)\triangleq-t\log_{2}t-(1-t)\log_{2}(1-t)$ denotes the binary entropy function. The unit-step function is denoted by $s(\cdot)$ .

II System model and preliminaries

We consider a two-transmitter memoryless Gaussian MAC (as shown in Figure 1) with a one-bit quantizer $\Gamma$ at the receiver front end. Transmitter $j=1,2$ encodes its message $W_{j}$ into a codeword $X_{j}^{n}$ and transmits it over the shared channel. The signal received by the decoder is given by

$Y_{i}=\Gamma(X_{1,i}+X_{2,i}+Z_{i}),\quad i\in[1:n],$

where $\{Z_{i}\}_{i=1}^{n}$ is an independent and identically distributed (i.i.d.) Gaussian noise process, also independent of the channel inputs $X_{1}^{n}$ and $X_{2}^{n}$ with $Z_{i}\sim\mathcal{N}(0,1),i\in[1:n]$ . $\Gamma$ represents the one-bit ADC operation given by

$\Gamma(x)=\begin{cases}1&x\geq 0\\ 0&x<0\end{cases}.$

This channel can be modelled by the triplet $\left(\mathcal{X}_{1}\times\mathcal{X}_{2},p(y|x_{1},x_{2}),\mathcal{Y}\right)$ , where $\mathcal{X}_{1},\mathcal{X}_{2}$ ( $=\mathbb{R}$ ) and $\mathcal{Y}$ ( $=\{0,1\}$ ), respectively, are the alphabets of the inputs and the output. The conditional pmf of the channel output $Y$ conditioned on the channel inputs $X_{1}$ and $X_{2}$ (i.e. $p(y|x_{1},x_{2})$ ) is characterized by

$p(0|x_{1},x_{2})=1-p(1|x_{1},x_{2})=Q(x_{1}+x_{2}),\qquad (1)$

where $Q(x)\triangleq\frac{1}{\sqrt{2\pi}}\int_{x}^{+\infty}e^{-\frac{t^{2}}{2}}dt$ .
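The channel law above can be illustrated with a minimal Python sketch. The function names (`q_func`, `quantize`, `p_zero`, `empirical_p_zero`) are hypothetical and not taken from the paper; the sketch only assumes unit-variance Gaussian noise and a zero-threshold quantizer, as in the model, and compares the closed-form transition probability $Q(x_{1}+x_{2})$ with a Monte Carlo estimate.

```python
import math
import random


def q_func(x: float) -> float:
    """Gaussian tail probability Q(x) = Pr{N(0,1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def quantize(x: float) -> int:
    """Zero-threshold one-bit ADC: returns 1 if x >= 0 and 0 otherwise."""
    return 1 if x >= 0 else 0


def p_zero(x1: float, x2: float) -> float:
    """Transition probability p(0 | x1, x2) = Q(x1 + x2) for unit-variance noise."""
    return q_func(x1 + x2)


def empirical_p_zero(x1: float, x2: float, trials: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo estimate of Pr{Y = 0} for fixed channel inputs (x1, x2)."""
    rng = random.Random(seed)
    zeros = sum(1 - quantize(x1 + x2 + rng.gauss(0.0, 1.0)) for _ in range(trials))
    return zeros / trials
```

Because the output depends on the inputs only through the sum $x_{1}+x_{2}$, the receiver cannot distinguish which transmitter contributed which part of the signal; the sketch makes this explicit by passing only the sum to `q_func`.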

Upon receiving the sequence $Y^{n}$ , the decoder finds the estimates $(\hat{W}_{1},\hat{W}_{2})$ of the messages.

A $(2^{nR_{1}},2^{nR_{2}},n)$ code for this channel consists of (as in [10]):

• two message sets $[1:2^{nR_{1}}]$ and $[1:2^{nR_{2}}]$ ,

• two encoders, where encoder $j=1,2$ assigns a codeword $x_{j}^{n}(w_{j})$ to each message $w_{j}\in[1:2^{nR_{j}}]$ , and

• a decoder that assigns estimates $(\hat{w}_{1},\hat{w}_{2})\in[1:2^{nR_{1}}]\times[1:2^{nR_{2}}]$ or an error message to each received sequence $y^{n}$ .

We assume that the message pair $(W_{1},W_{2})$ is uniformly distributed over $[1:2^{nR_{1}}]\times[1:2^{nR_{2}}]$ . The average probability of error is defined as

$P_{e}^{(n)}=\mbox{Pr}\{(\hat{W}_{1},\hat{W}_{2})\neq(W_{1},W_{2})\}.\qquad (2)$

Average power constraints are imposed on the channel inputs as

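$\frac{1}{n}\sum_{i=1}^{n}x_{j,i}^{2}(w_{j})\leq P_{j},\quad\forall w_{j}\in[1:2^{nR_{j}}],\ j\in\{1,2\},\qquad (3)$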

where $x_{j,i}(w_{j})$ denotes the $i^{\mbox{th}}$ element of the codeword $x_{j}^{n}(w_{j})$ .
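As a small illustration of the per-codeword nature of this constraint, the following sketch checks whether every codeword of a codebook meets a given power budget. The helper name `satisfies_power_constraint` and the list-of-lists codebook representation are assumptions made for this example only.

```python
def satisfies_power_constraint(codebook, power, tol=1e-12):
    """Return True if every codeword x^n satisfies (1/n) * sum_i x_i^2 <= power.

    `codebook` is a list of codewords, each a non-empty list of real symbols.
    The constraint must hold for each message separately, not on average
    over the messages.
    """
    return all(
        sum(x * x for x in codeword) / len(codeword) <= power + tol
        for codeword in codebook
    )
```

For example, a codebook containing the codeword `[2.0, 0.0, 0.0]` violates a budget of 1 even if all other codewords use far less power.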

A rate pair $(R_{1},R_{2})$ is said to be achievable for this channel if there exists a sequence of $(2^{nR_{1}},2^{nR_{2}},n)$ codes (satisfying the average power constraints (3)) such that $\lim_{n\to\infty}P_{e}^{(n)}=0$ . The capacity region $\mathscr{C}(P_{1},P_{2})$ of this channel is the closure of the set of achievable rate pairs $(R_{1},R_{2})$ .

III Main results

Proposition 1. The capacity region $\mathscr{C}(P_{1},P_{2})$ of a two-transmitter memoryless MAC with average power constraints $P_{1}$ and $P_{2}$ is the set of non-negative rate pairs $(R_{1},R_{2})$ that satisfy

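$\begin{aligned}R_{1}&\leq I(X_{1};Y|X_{2},U),\\ R_{2}&\leq I(X_{2};Y|X_{1},U),\\ R_{1}+R_{2}&\leq I(X_{1},X_{2};Y|U),\end{aligned}\qquad (4)$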

for some $F_{U}(u)F_{X_{1}|U}(x_{1}|u)F_{X_{2}|U}(x_{2}|u)$ , such that $\mathds{E}[X_{j}^{2}]\leq P_{j},\ j=1,2.$ Also, it is sufficient to consider $|\mathcal{U}|\leq 5$ .

Proof.

The capacity region of the discrete memoryless (DM) MAC with input cost constraints has been addressed in Exercise 4.8 of [10]. If the input alphabets are not discrete, the capacity region is still the same because: 1) the converse remains the same if the inputs are from a continuous alphabet; 2) the region is achievable by coded time sharing and the discretization procedure (see Remark 3.8 in [10]). Therefore, it is sufficient to show the cardinality bound $|\mathcal{U}|\leq 5$ .

Let $\mathscr{P}$ be the set of all product distributions (i.e., of the form $F_{X_{1}}(x_{1})F_{X_{2}}(x_{2})$ ) on $\mathbb{R}^{2}$ . Let $\mathbf{g}:\mathscr{P}\to\mathbb{R}^{5}$ be a vector-valued mapping defined element-wise as

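$\begin{aligned}g_{1}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))&=I(X_{1};Y|X_{2},U=u),\\ g_{2}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))&=I(X_{2};Y|X_{1},U=u),\\ g_{3}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))&=I(X_{1},X_{2};Y|U=u),\\ g_{4}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))&=\mathds{E}[X_{1}^{2}|U=u],\\ g_{5}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))&=\mathds{E}[X_{2}^{2}|U=u].\end{aligned}\qquad (5)$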

Let $\mathscr{G}\subset\mathbb{R}^{5}$ be the image of $\mathscr{P}$ under the mapping $\mathbf{g}$ (i.e., $\mathscr{G}=\mathbf{g}(\mathscr{P})$ ). Given an arbitrary $(U,X_{1},X_{2})\sim F_{U}F_{X_{1}|U}F_{X_{2}|U}$ , we obtain the vector $\mathbf{r}$ as

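$\begin{aligned}r_{1}&=I(X_{1};Y|X_{2},U)=\int_{\mathcal{U}}g_{1}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u),\\ r_{2}&=I(X_{2};Y|X_{1},U)=\int_{\mathcal{U}}g_{2}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u),\\ r_{3}&=I(X_{1},X_{2};Y|U)=\int_{\mathcal{U}}g_{3}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u),\\ r_{4}&=\mathds{E}[X_{1}^{2}]=\int_{\mathcal{U}}g_{4}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u),\\ r_{5}&=\mathds{E}[X_{2}^{2}]=\int_{\mathcal{U}}g_{5}(F_{X_{1}|U}(\cdot|u)F_{X_{2}|U}(\cdot|u))\,dF_{U}(u).\end{aligned}$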

Therefore, $\mathbf{r}$ is in the convex hull of $\mathscr{G}\subset\mathbb{R}^{5}$ . By Carathéodory’s theorem [11], $\mathbf{r}$ can be written as a convex combination of 6 ( $=5+1$ ) or fewer points in $\mathscr{G}$ , which implies that it is sufficient to consider $|\mathcal{U}|\leq 6$ . Since $\mathscr{P}$ is a connected set (Footnote 1) and the mapping $\mathbf{g}$ is continuous (Footnote 2), $\mathscr{G}$ is a connected subset of $\mathbb{R}^{5}$ . Therefore, connectedness of $\mathscr{G}$ refines the cardinality of $U$ to $|\mathcal{U}|\leq 5$ (Footnote 3). ∎

Footnote 1: $\mathscr{P}$ is the product of two connected sets, therefore, it is connected. Each of the sets in this product is connected because of being a convex vector space.

Footnote 2: This is a direct result of the continuity of the channel transition probability.

Footnote 3: This refinement of the cardinality is due to the connected version of Carathéodory’s theorem as mentioned in [11, p.267], which is originally due to [12, p.35-36].

Lemma 1. For the boundary points of $\mathscr{C}(P_{1},P_{2})$ that are not sum-rate optimal, it is sufficient to have $|\mathcal{U}|\leq 4$ .

Proof.

Any point on the boundary of the capacity region that does not maximize $R_{1}+R_{2}$ , is either of the form $(I(X_{1};Y|X_{2},U),I(X_{2};Y|U))$ or $(I(X_{1};Y|U),I(X_{2};Y|X_{1},U))$ for some $F_{U}F_{X_{1}|U}F_{X_{2}|U}$ that satisfies $\mathds{E}[X_{j}^{2}]\leq P_{j},j=1,2.$ In other words, it is one of the corner points of the corresponding pentagon in (4). As in the proof of Proposition 1, define the mapping $\mathbf{g}:\mathscr{P}\to\mathbb{R}^{4}$ , where $g_{1}$ and $g_{2}$ are the coordinates of this boundary point conditioned on $U=u$ , and $g_{3}$ , $g_{4}$ are the same as $g_{4}$ and $g_{5}$ in (5), respectively. The sufficiency of $|\mathcal{U}|\leq 4$ in this case follows similarly to the proof of Proposition 1. ∎

When there is no input cost constraint, the capacity region of the MAC can be characterized either through the convex hull operation as in [10, Theorem 4.2], or with the introduction of an auxiliary random variable as in [10, Theorem 4.3]. The following remark states that when there are input cost constraints, the characterization of the capacity region requires an auxiliary random variable in general.

Remark 2. Let $(X_{1},X_{2})\sim F_{X_{1}}(x_{1})F_{X_{2}}(x_{2})$ , such that $\mathds{E}[X_{j}^{2}]\leq P_{j},j=1,2$ . Let $\mathscr{R}(P_{1},P_{2})$ denote the set of non-negative rate pairs $(R_{1},R_{2})$ such that

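$\begin{aligned}R_{1}&\leq I(X_{1};Y|X_{2}),\\ R_{2}&\leq I(X_{2};Y|X_{1}),\\ R_{1}+R_{2}&\leq I(X_{1},X_{2};Y).\end{aligned}$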

Let $\mathscr{R}_{1}(P_{1},P_{2})$ be the convex closure of $\bigcup_{F_{X_{1}}F_{X_{2}}}\mathscr{R}(P_{1},P_{2})$ , where the union is over all product distributions that satisfy the average power constraints.

Let $\mathscr{R}_{2}(P_{1},P_{2})$ be the set of non-negative rate pairs $(R_{1},R_{2})$ such that

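$\begin{aligned}R_{1}&\leq I(X_{1};Y|X_{2},U),\\ R_{2}&\leq I(X_{2};Y|X_{1},U),\\ R_{1}+R_{2}&\leq I(X_{1},X_{2};Y|U),\end{aligned}$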

for some $F_{U}(u)F_{X_{1}|U}(x_{1}|u)F_{X_{2}|U}(x_{2}|u)$ that satisfies $\mathds{E}[X_{j}^{2}|U=u]\leq P_{j},j=1,2,\ \forall u$ .

It can be verified that $\mathscr{R}_{1}(P_{1},P_{2})=\mathscr{R}_{2}(P_{1},P_{2})$ . By comparing $\mathscr{R}_{2}(P_{1},P_{2})$ to the capacity region $\mathscr{C}(P_{1},P_{2})$ , we can conclude that $\mathscr{R}_{2}(P_{1},P_{2})\subseteq\mathscr{C}(P_{1},P_{2})$ . This follows from the fact that in the region $\mathscr{R}_{2}(P_{1},P_{2})$ , the average power constraint $\mathds{E}[X_{j}^{2}|U=u]\leq P_{j}$ holds for every realization of the auxiliary random variable $U$ , which is a stronger condition than $\mathds{E}[X_{j}^{2}]\leq P_{j}$ used in the capacity region. The following example shows that $\mathscr{R}_{1}(P_{1},P_{2})$ and $\mathscr{R}_{2}(P_{1},P_{2})$ can be strictly smaller than $\mathscr{C}(P_{1},P_{2})$ .

Consider the same Gaussian MAC with one-bit quantizer at the receiver (as depicted in Figure 1) with the following changes: i) $\mathcal{X}_{1}=\mathcal{X}_{2}=\{-\sqrt{2},0,\sqrt{2}\}$ , ii) Besides the average power constraints of $P_{1}=P_{2}=1$ , we also impose the constraint that the inputs should have a zero mean, i.e. $\mathds{E}[X_{j}]=0,\ j=1,2.$ The capacity region of this channel is the set of non-negative rate pairs $(R_{1},R_{2})$ such that (4) holds for some $F_{U}(u)F_{X_{1}|U}(x_{1}|u)F_{X_{2}|U}(x_{2}|u)$ which satisfies $\mathds{E}[X_{j}^{2}]\leq P_{j},\ \mathds{E}[X_{j}]=0,\ j=1,2.$ Also, let $\mathscr{R}_{1}$ be the rate region in Remark 2 with the additional constraints $\mathds{E}[X_{j}]=0,\ j=1,2.$

In order to show that $\mathscr{R}_{1}$ can be strictly smaller than the capacity region, we show that there exists a point in the capacity region which is not in $\mathscr{R}_{1}$ . We have,

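$\begin{aligned}\max_{(R_{1},R_{2})\in\mathscr{R}_{1}}R_{1}+R_{2}&=\max_{F_{X_{1}}F_{X_{2}}:\ \mathds{E}[X_{j}^{2}]\leq 1,\ \mathds{E}[X_{j}]=0,\ j=1,2}I(X_{1}+X_{2};Y)&&(6)\\ &\leq\max_{F_{X}(x):\ \mathds{E}[X^{2}]\leq 2,\ \mathds{E}[X]=0}I(X;Y)&&(7)\\ &\leq\max_{F_{X}(x):\ \mathds{E}[X^{2}]\leq 2}I(X;Y)\\ &=1-H_{b}(Q(\sqrt{2}))&&(8)\\ &\leq\max_{F_{U}F_{X_{1}|U}F_{X_{2}|U}:\ \mathds{E}[X_{j}^{2}]\leq 1,\ \mathds{E}[X_{j}]=0,\ j=1,2}I(X_{1},X_{2};Y|U)&&(9)\\ &=\max_{(R_{1},R_{2})\in\mathscr{C}}R_{1}+R_{2},\end{aligned}$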

where (6) is due to the fact that $X_{1}+X_{2}$ is a function of the pair $(X_{1},X_{2})$ , and the following Markov chain holds: $(X_{1},X_{2})\to X_{1}+X_{2}\to Y$ . In (7), we use the inequality $\mathds{E}[(X_{1}+X_{2})^{2}]=\mathds{E}[X_{1}^{2}]+\mathds{E}[X_{2}^{2}]\leq 2$ , since $X_{1}$ and $X_{2}$ are independent and zero mean. Also, the channel from $X$ to $Y$ is characterized by the conditional distribution $p_{Y|X}(y|x)\sim\mbox{Bern}(Q(x))$ . Equality in (8) is due to [5], where the maximum is shown to be achieved by the CDF $F^{*}_{X}(x)=\frac{1}{2}s(x+\sqrt{2})+\frac{1}{2}s(x-\sqrt{2})$ , where $s(\cdot)$ is the unit step function. Let $U\sim\mbox{Bern}(\frac{1}{2})$ , $F_{X_{1}|U}(x|1)=F_{X_{2}|U}(x|0)=\frac{1}{2}s(x+\sqrt{2})+\frac{1}{2}s(x-\sqrt{2})$ and $F_{X_{1}|U}(x|0)=F_{X_{2}|U}(x|1)=s(x)$ . For this joint distribution on $(U,X_{1},X_{2})$ , we have $\mathds{E}[X_{j}]=0,\ \mathds{E}[X_{j}^{2}]\leq 1,\ j=1,2$ , and $I(X_{1},X_{2};Y|U)=1-H_{b}(Q(\sqrt{2}))$ , which results in (9).
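The time-sharing distribution described above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: discrete inputs are represented as hypothetical `{value: probability}` dictionaries, and `joint_mi`, `conditional_mi` and `average_power` are helper names introduced here. It evaluates $I(X_{1},X_{2};Y|U)$ and the average powers for $U\sim\mbox{Bern}(\frac{1}{2})$ with one transmitter idle and the other sending $\pm\sqrt{2}$ in each state.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def h_b(t):
    """Binary entropy in bits, with H_b(0) = H_b(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1.0 - t) * math.log2(1.0 - t)


def joint_mi(pmf1, pmf2):
    """I(X1, X2; Y) for independent discrete inputs and p(0|x1,x2) = Q(x1 + x2)."""
    pairs = [(x1 + x2, a * b) for x1, a in pmf1.items() for x2, b in pmf2.items()]
    p0 = sum(w * q_func(s) for s, w in pairs)             # Pr{Y = 0}
    h_cond = sum(w * h_b(q_func(s)) for s, w in pairs)    # H(Y | X1, X2)
    return h_b(p0) - h_cond


ROOT2 = math.sqrt(2.0)
ANTIPODAL = {-ROOT2: 0.5, ROOT2: 0.5}   # (1/2) s(x + sqrt(2)) + (1/2) s(x - sqrt(2))
IDLE = {0.0: 1.0}                       # s(x)

# Entries are (Pr{U = u}, F_{X1|U=u}, F_{X2|U=u}) for u = 0 and u = 1.
TIME_SHARING = [(0.5, IDLE, ANTIPODAL), (0.5, ANTIPODAL, IDLE)]


def conditional_mi(scheme):
    """I(X1, X2; Y | U) as the U-average of the per-state mutual information."""
    return sum(pu * joint_mi(f1, f2) for pu, f1, f2 in scheme)


def average_power(scheme, user):
    """E[X_user^2] averaged over U; `user` is 1 or 2."""
    return sum(pu * sum(p * x * x for x, p in fs[user - 1].items())
               for pu, *fs in scheme)
```

In each state of $U$ the active transmitter uses power 2, yet the average over $U$ equals 1, which is exactly what the constraint $\mathds{E}[X_{j}^{2}]\leq 1$ permits and what a per-realization constraint would forbid.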

In what follows, it is proved that the inequality in (7) is strict. In other words, the sum rate of $1-H_{b}(Q(\sqrt{2}))$ cannot be obtained by any rate pair in $\mathscr{R}_{1}$ , while it belongs to the capacity region. Let $\tilde{X}=X_{1}+X_{2}$ , where $X_{1}$ and $X_{2}$ are two zero-mean independent random variables on $\mathcal{X}_{1}$ ( $=\mathcal{X}_{2}$ ) satisfying the average power constraint $\mathds{E}[X_{j}^{2}]\leq 1,j=1,2$ . We show that the minimum Lévy distance between $F^{*}_{X}(x)$ and all the distributions $F_{\tilde{X}}(x)$ (induced by $F_{X_{1}}F_{X_{2}}$ ) is bounded away from zero. (The Lévy distance between two distributions $F,G:\mathbb{R}\to[0,1]$ is defined as $d_{L}(F,G)=\inf\{\epsilon>0|F(x-\epsilon)-\epsilon\leq G(x)\leq F(x+\epsilon)+\epsilon,\ \forall x\in\mathbb{R}\}.$ ) Since $\mathds{E}[X_{j}]=0$ and $\mathds{E}[X_{j}^{2}]\leq 1,\ j=1,2$ , the distribution of $X_{1}$ is $F_{X_{1}}(x)=ps(x+\sqrt{2})+(1-2p)s(x)+ps(x-\sqrt{2})$ with $p\leq\frac{1}{4}$ . The same applies to $F_{X_{2}}$ with parameter $p^{\prime}$ ( $\leq\frac{1}{4}$ ). The distribution of $\tilde{X}$ induced by $F_{X_{1}}F_{X_{2}}$ is given by

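$F_{\tilde{X}}(x)=pp^{\prime}s(x+2\sqrt{2})+(p\hat{*}p^{\prime})s(x+\sqrt{2})+(1-2(pp^{\prime}+p\hat{*}p^{\prime}))s(x)+(p\hat{*}p^{\prime})s(x-\sqrt{2})+pp^{\prime}s(x-2\sqrt{2}),$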

where $p\hat{*}p^{\prime}\triangleq p(1-2p^{\prime})+p^{\prime}(1-2p)$ resembles a convolution operation. Let $\tilde{\mathscr{F}}$ be the set of all distributions on $\tilde{X}$ obtained in this way. It can be easily verified that (see Figure 2) for any given $p,p^{\prime}\leq\frac{1}{4}$ , the Lévy distance between $F_{\tilde{X}}$ and $F^{*}_{X}$ is

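$d_{L}(F_{\tilde{X}},F^{*}_{X})=\max\{pp^{\prime},\ \frac{1}{2}-pp^{\prime}-p\hat{*}p^{\prime}\}=\frac{1}{2}-pp^{\prime}-p\hat{*}p^{\prime}.$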

Subsequently,

$\min_{p,p^{\prime}\leq\frac{1}{4}}d_{L}(F_{\tilde{X}},F^{*}_{X})=\frac{3}{16}.$
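The Lévy distance in this example can also be approximated numerically from its definition. The following sketch is only a rough check under stated assumptions: the CDFs are evaluated on a finite grid, the infimum over $\epsilon$ is located by bisection, and all function names (`step_cdf`, `levy_distance`, `sum_atoms`, `closed_form`) are hypothetical. It compares the numerical value with the expression $\frac{1}{2}-pp^{\prime}-p\hat{*}p^{\prime}$.

```python
import math

ROOT2 = math.sqrt(2.0)


def step_cdf(atoms):
    """Right-continuous CDF of a discrete law given as [(location, mass), ...]."""
    def cdf(x):
        return sum(mass for loc, mass in atoms if loc <= x)
    return cdf


def levy_distance(F, G, lo=-6.0, hi=6.0, grid=1201, iters=30):
    """Grid-based bisection approximation of the Levy distance between F and G."""
    xs = [lo + (hi - lo) * k / (grid - 1) for k in range(grid)]

    def feasible(eps):
        # F(x - eps) - eps <= G(x) <= F(x + eps) + eps, checked on the grid only.
        return all(F(x - eps) - eps <= G(x) + 1e-12 and
                   G(x) <= F(x + eps) + eps + 1e-12 for x in xs)

    a, b = 0.0, 1.0   # eps = 1 is always feasible for CDFs
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if feasible(mid):
            b = mid
        else:
            a = mid
    return b


def sum_atoms(p, q):
    """Atoms of X1 + X2 for symmetric three-point inputs with parameters p and q."""
    conv = p * (1 - 2 * q) + q * (1 - 2 * p)
    return [(-2 * ROOT2, p * q), (-ROOT2, conv), (0.0, 1 - 2 * (p * q + conv)),
            (ROOT2, conv), (2 * ROOT2, p * q)]


TARGET = [(-ROOT2, 0.5), (ROOT2, 0.5)]   # atoms of the optimal single-user input


def closed_form(p, q):
    """The expression 1/2 - p q - (p conv q) for the Levy distance."""
    return 0.5 - p * q - (p * (1 - 2 * q) + q * (1 - 2 * p))
```

The closed form simplifies to $\frac{1}{2}-p-p^{\prime}+3pp^{\prime}$, which is decreasing in each parameter on $[0,\frac{1}{4}]$, so its minimum is attained at $p=p^{\prime}=\frac{1}{4}$.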

This shows that there is a neighborhood of $F^{*}_{X}$ whose intersection with $\tilde{\mathscr{F}}$ is empty. Note that any neighborhood with radius less than $\frac{3}{16}$ has this property. Combined with the facts that the mutual information is continuous and that $F^{*}_{X}$ is the unique solution (this is due to the strict convexity of $H_{b}(Q(\sqrt{\cdot}))$ , which is used in Jensen’s inequality in [5]), this proves that the inequality in (7) is strict. Therefore, $\mathscr{R}_{1}$ ( $=\mathscr{R}_{2}$ ) is smaller than the capacity region in general.
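The size of the gap in this example can be estimated with a short numerical sketch. It is an illustration only: it searches a finite grid of the parameters $p,p^{\prime}\in[0,\frac{1}{4}]$ rather than proving anything, and the names `product_sum_rate` and `best_product_sum_rate` are introduced here for convenience. Since the sum rate is a linear functional, taking the convex closure in $\mathscr{R}_{1}$ does not increase its maximum, so the search over product distributions suffices.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def h_b(t):
    """Binary entropy in bits, with H_b(0) = H_b(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1.0 - t) * math.log2(1.0 - t)


ROOT2 = math.sqrt(2.0)


def product_sum_rate(p, p_prime):
    """I(X1, X2; Y) for independent zero-mean inputs on {-sqrt(2), 0, sqrt(2)}.

    Pr{X1 = +sqrt(2)} = Pr{X1 = -sqrt(2)} = p, and likewise p_prime for X2.
    """
    pmf1 = {-ROOT2: p, 0.0: 1.0 - 2.0 * p, ROOT2: p}
    pmf2 = {-ROOT2: p_prime, 0.0: 1.0 - 2.0 * p_prime, ROOT2: p_prime}
    pairs = [(x1 + x2, a * b) for x1, a in pmf1.items() for x2, b in pmf2.items()]
    p0 = sum(w * q_func(s) for s, w in pairs)
    return h_b(p0) - sum(w * h_b(q_func(s)) for s, w in pairs)


def best_product_sum_rate(steps=50):
    """Largest sum rate found on a (steps + 1) x (steps + 1) grid of (p, p')."""
    grid = [0.25 * k / steps for k in range(steps + 1)]
    return max(product_sum_rate(p, q) for p in grid for q in grid)


TIME_SHARING_RATE = 1.0 - h_b(q_func(ROOT2))   # value attained with the auxiliary U
```

Comparing `best_product_sum_rate()` with `TIME_SHARING_RATE` shows a clear separation between the two values, in line with the strict inequality argued above.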

The main result of this paper is provided in the following theorem. It bounds the cardinality of the support set of the capacity achieving distributions.

Theorem 1. Let $P$ be an arbitrary point on the boundary of the capacity region $\mathscr{C}(P_{1},P_{2})$ of the memoryless MAC with a one-bit ADC front end (as shown in Figure 1). (The results remain valid if the one-bit ADC has a non-zero threshold.) $P$ is achieved by a distribution in the form of $F^{P}_{U}(u)F^{P}_{X_{1}|U}(x_{1}|u)F^{P}_{X_{2}|U}(x_{2}|u)$ . Also, let $l_{P}$ be the slope of the line tangent to the capacity region at this point. For any $u\in\mathcal{U}$ , the conditional input distributions $F^{P}_{X_{1}|U}(x_{1}|u)$ and $F^{P}_{X_{2}|U}(x_{2}|u)$ have at most $n_{1}$ and $n_{2}$ points of increase, respectively (a point $Z$ is said to be a point of increase of a distribution if for any open set $\Omega$ containing $Z$ , we have $\mbox{Pr}\{\Omega\}>0$ ), where

$(n_{1},n_{2})=\begin{cases}(3,5)&l_{P}<-1\\ (3,3)&l_{P}=-1\\ (5,3)&l_{P}>-1\end{cases}.$

Proof.

The proof is provided in Section IV. ∎

Proposition 1, Lemma 1 and Theorem 1 establish upper bounds on the number of mass points of the distributions that achieve a boundary point. The significance of this result is that once it is known that the optimal inputs are discrete with at most a certain number of mass points, the capacity region along with the optimal distributions can be obtained via computer programs.

IV Proof of Theorem 1

In order to show that the boundary points of the capacity region are achieved, it is sufficient to show that the capacity region is a closed set, i.e., it includes all of its limit points.

Let $\mathcal{U}$ be a set with $|\mathcal{U}|\leq 5$ , and $\Omega$ be defined as

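$\Omega=\big\{F_{U,X_{1},X_{2}}\big|\ U\in\mathcal{U},\ X_{1}-U-X_{2},\ \mathds{E}[X_{j}^{2}]\leq P_{j},\ j=1,2\big\},$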

which is the set of all CDFs on the triplet $(U,X_{1},X_{2})$ , where $U$ is drawn from $\mathcal{U}$ , and the Markov chain $X_{1}-U-X_{2}$ and the corresponding average power constraints hold.

In Appendix A, it is proved that $\Omega$ is a compact set. Since a continuous mapping preserves compactness, the capacity region is compact. Since the capacity region is a subset of $\mathds{R}^{2}$ , it is closed and bounded (note that a subset of $\mathds{R}^{k}$ is compact if and only if it is closed and bounded [13]). Therefore, any point $P$ on the boundary of the capacity region is achieved by a distribution denoted by $F^{P}_{U}(u)F^{P}_{X_{1}|U}(x_{1}|u)F^{P}_{X_{2}|U}(x_{2}|u)$ .

Since the capacity region is a convex space, it can be characterized by its supporting hyperplanes. In other words, any point on the boundary of the capacity region, denoted by $(R_{1}^{b},R_{2}^{b})$ , can be written as

$(R_{1}^{b},R_{2}^{b})=\arg\max_{(R_{1},R_{2})\in\mathscr{C}(P_{1},P_{2})}R_{1}+\lambda R_{2},\qquad (12)$

for some $\lambda>0$ .

Any rate pair $(R_{1},R_{2})\in\mathscr{C}(P_{1},P_{2})$ must lie within a pentagon defined by (4) for some $F_{U}F_{X_{1}|U}F_{X_{2}|U}$ that satisfies the power constraints. Therefore, due to the structure of the pentagon, the problem of finding the boundary points is equivalent to the following maximization problem.

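$\max_{(R_{1},R_{2})\in\mathscr{C}(P_{1},P_{2})}R_{1}+\lambda R_{2}=\begin{cases}\max I(X_{1};Y|X_{2},U)+\lambda I(X_{2};Y|U)&0<\lambda\leq 1\\ \max I(X_{2};Y|X_{1},U)+\lambda I(X_{1};Y|U)&\lambda>1\end{cases},\qquad (13)$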

where on the right hand side (RHS) of (13), the maximizations are over all $F_{U}F_{X_{1}|U}F_{X_{2}|U}$ that satisfy the power constraints. It is obvious that when $\lambda=1$ , the two lines in (13) are the same, which results in the sum capacity.

For any product of distributions $F_{X_{1}}F_{X_{2}}$ and the channel in (1), let $I_{\lambda}$ be defined as

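$I_{\lambda}(F_{X_{1}}F_{X_{2}})=\begin{cases}I(X_{1};Y|X_{2})+\lambda I(X_{2};Y)&0<\lambda\leq 1\\ I(X_{2};Y|X_{1})+\lambda I(X_{1};Y)&\lambda>1\end{cases}.\qquad (14)$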

With this definition, (13) can be rewritten as

[TABLE]

where the second maximization is over product distributions of the form $p_{U}(u)F_{X_{1}|U}(x_{1}|u)F_{X_{2}|U}(x_{2}|u),\ |\mathcal{U}|\leq 5$ , such that

[TABLE]

Proposition 2. For a given $F_{X_{1}}$ and any $\lambda>0$ , $I_{\lambda}(F_{X_{1}}F_{X_{2}})$ is a concave, continuous and weakly differentiable function of $F_{X_{2}}$ . In the statement of this Proposition, $F_{X_{1}}$ and $F_{X_{2}}$ could be interchanged.

Proof.

The proof is provided in Appendix B. ∎

Proposition 3. Let $P_{1}^{\prime},P_{2}^{\prime}$ be two arbitrary non-negative real numbers. For the following problem

[TABLE]

the optimal inputs $F_{X_{1}}^{*}$ and $F_{X_{2}}^{*}$ , which are not unique in general, have the following properties,

(i) The support sets of $F_{X_{1}}^{*}$ and $F_{X_{2}}^{*}$ are bounded subsets of $\mathbb{R}$ .

(ii) $F_{X_{1}}^{*}$ and $F_{X_{2}}^{*}$ are discrete distributions that have at most $n_{1}$ and $n_{2}$ points of increase, respectively, where

[TABLE]

Proof.

We start with the proof of the first claim. Assume that $0<\lambda\leq 1$ , and $F_{X_{2}}$ is given. Consider the following optimization problem:

[TABLE]

Note that $I^{*}_{F_{X_{2}}}<+\infty$ , since for any $\lambda>0$ , from (14),

[TABLE]

From Proposition 2, $I_{\lambda}$ is a continuous, concave function of $F_{X_{1}}$ . Also, the set of all CDFs with bounded second moment (here, $P_{1}^{\prime}$ ) is convex and compact (the compactness follows from [14, Appendix I]; the only difference is in using Chebyshev’s inequality instead of Markov’s inequality). Therefore, the supremum in (16) is achieved by a distribution $F_{X_{1}}^{*}$ . Since for any $F_{X_{1}}(x)=s(x-x_{0})$ with $|x_{0}|^{2}<P_{1}^{\prime}$ , we have $\mathds{E}[X_{1}^{2}]<P_{1}^{\prime}$ , the Lagrangian theorem and the Karush-Kuhn-Tucker conditions state that there exists a $\theta_{1}\geq 0$ such that

[TABLE]

Furthermore, the supremum in (17) is achieved by $F_{X_{1}}^{*}$ , and

[TABLE]

Lemma 2. The Lagrangian multiplier $\theta_{1}$ is nonzero.

Proof.

Having a zero Lagrangian multiplier means the power constraint is inactive. In other words, if $\theta_{1}=0$ , (16) and (17) imply that

[TABLE]

We prove that (19) does not hold by showing that its left hand side (LHS) is strictly less than 1, while its RHS equals 1. The details are provided in Appendix C. ∎

$I_{\lambda}(F_{X_{1}}F_{X_{2}})$ ( $0<\lambda\leq 1$ ) can be written as

[TABLE]

where we have defined

[TABLE]

and

[TABLE]

$p(y;F_{X_{1}}F_{X_{2}})$ is nothing but the pmf of $Y$ with the emphasis that it has been induced by $F_{X_{1}}$ and $F_{X_{2}}$ . Likewise, $p(y;F_{X_{1}}|x_{2})$ is the conditional pmf $p(y|x_{2})$ when $X_{1}$ is drawn according to $F_{X_{1}}$ . From (20), $\tilde{i}_{\lambda}(x_{1};F_{X_{1}}|F_{X_{2}})$ can be considered as the density of $I_{\lambda}$ over $F_{X_{1}}$ when $F_{X_{2}}$ is given. $i_{\lambda}(x_{2};F_{X_{2}}|F_{X_{1}})$ can be interpreted in a similar way.

Note that (17) is an unconstrained optimization problem over the set of all CDFs. Since $\int x^{2}dF_{X_{1}}(x)$ is linear and weakly differentiable in $F_{X_{1}}$ , the objective function in (17) is concave and weakly differentiable. Hence, a necessary condition for optimality of $F_{X_{1}}^{*}$ is

[TABLE]

Furthermore, (24) can be verified to be equivalent to

[TABLE]

The justifications of (24), (25) and (26) are provided in Appendix D.

In what follows, we prove that in order to satisfy (26), $F_{X_{1}}^{*}$ must have a bounded support by showing that the LHS of (26) goes to $-\infty$ with $x_{1}$ . The following lemma is useful in the sequel for taking the limit processes inside the integrals.

Lemma 3. Let $X_{1}$ and $X_{2}$ be two independent random variables satisfying $\mathds{E}[X_{1}^{2}]\leq P_{1}^{\prime}$ and $\mathds{E}[X_{2}^{2}]\leq P_{2}^{\prime}$ , respectively ( $P_{1}^{\prime},P_{2}^{\prime}\in[0,+\infty)$ ). Considering the conditional pmf in (1), the following inequalities hold.

[TABLE]

Proof.

The proof is provided in Appendix E. ∎

Note that

[TABLE]

where (30) is due to Lebesgue dominated convergence theorem [13] and (27), which permit the interchange of the limit and the integral; (31) is due to the following

[TABLE]

since $p(0|x_{1},x_{2})=Q(x_{1}+x_{2})$ goes to zero when $x_{1}\to+\infty$ and $p_{Y}(y;F_{X_{1}}^{*}F_{X_{2}})\ (y=0,1)$ is bounded away from zero by (85) ; and (32) is obtained from (85) in Appendix E. Furthermore,

[TABLE]

where (33) is due to Lebesgue dominated convergence theorem along with (29) and (90) in Appendix E; (34) is from (28) and convexity of $\log Q(\alpha+\sqrt{t})$ in $t$ when $\alpha\geq 0$ (see Appendix F).
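The convexity of $\log Q(\alpha+\sqrt{t})$ in $t$ for $\alpha\geq 0$, which is proved in the paper's Appendix F, can be sanity-checked numerically with second differences. This is only a finite-grid check under assumed parameters (the range of $t$, the step size and the function names `log_q` and `min_second_difference` are choices made for this sketch), not a substitute for the proof.

```python
import math


def log_q(x):
    """Natural logarithm of the Gaussian tail probability Q(x)."""
    return math.log(0.5 * math.erfc(x / math.sqrt(2.0)))


def min_second_difference(alpha, t_max=25.0, step=0.05):
    """Smallest central second difference of t -> log Q(alpha + sqrt(t)).

    The difference f(t - h) - 2 f(t) + f(t + h) is evaluated at t = h, 2h, ...
    while t + h <= t_max; a convex function gives non-negative values.
    """
    def f(t):
        return log_q(alpha + math.sqrt(t))

    count = int(round(t_max / step))
    return min(f((k - 1) * step) - 2.0 * f(k * step) + f((k + 1) * step)
               for k in range(1, count))
```

A non-negative result for several values of $\alpha\geq 0$ is consistent with the convexity that justifies the Jensen-type step in (34).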

Therefore, from (32) and (34),

[TABLE]

Using a similar approach, we can also obtain

[TABLE]

From (35), (36) and the fact that $\theta_{1}>0$ (see Lemma 2), the LHS of (25) goes to $-\infty$ when $|x_{1}|\to+\infty$ . Since any point of increase of $F_{X_{1}}^{*}$ must satisfy (25) with equality, and $I_{F_{X_{2}}}^{*}\geq 0$ , it is proved that $F_{X_{1}}^{*}$ has a bounded support, i.e., $X_{1}\in[A_{1},A_{2}]$ for some $A_{1},A_{2}\in\mathbb{R}$ . (Note that $A_{1}$ and $A_{2}$ are determined by the choice of $F_{X_{2}}$ .)

Similarly, for a given $F_{X_{1}}$ , the optimization problem

[TABLE]

boils down to the following necessary condition

[TABLE]

for the optimality of $F_{X_{2}}^{*}$ . However, there are two main differences between (38) and (26). First is the difference between $i_{\lambda}$ and $\tilde{i}_{\lambda}$ . Second is the fact that we do not claim $\theta_{2}$ to be nonzero, since the approach used in Lemma 2 cannot be readily applied to $\theta_{2}$ . Nonetheless, the boundedness of the support of $F_{X_{2}}^{*}$ can be proved by inspecting the behaviour of the LHS of (38) when $|x_{2}|\to+\infty$ .

In what follows, i.e., from (39) to (44), we prove that the support of $F^{*}_{X_{2}}$ is bounded by showing that (38) does not hold when $|x_{2}|$ is above a certain threshold. The first term on the LHS of (38) is $i_{\lambda}(x_{2};F_{X_{2}}^{*}|F_{X_{1}})$ . From (23) and (27), it can be easily verified that

[TABLE]

From (39), if $\theta_{2}>0$ , the LHS of (38) goes to $-\infty$ with $|x_{2}|$ , which proves that $X_{2}^{*}$ is bounded.

For the possible case of $\theta_{2}=0$ , in order to show that (38) does not hold when $|x_{2}|$ is above a certain threshold, we rely on the boundedness of $X_{1}$ . Since $F_{X_{1}}$ has a bounded support, we denote this support, without loss of generality, by $[-A_{1},A_{2}]$ , where $A_{1},A_{2}$ are some non-negative real numbers. Then, we prove that $i_{\lambda}$ approaches its limit in (39) from below. In other words, there is a real number $K$ such that $i_{\lambda}(x_{2};F_{X_{2}}^{*}|F_{X_{1}})<-\lambda\log p_{Y}(1;F_{X_{1}}F_{X_{2}}^{*})$ when $x_{2}>K$ , and $i_{\lambda}(x_{2};F_{X_{2}}^{*}|F_{X_{1}})<-\lambda\log p_{Y}(0;F_{X_{1}}F_{X_{2}}^{*})$ when $x_{2}<-K$ . This establishes the boundedness of $X_{2}^{*}$ . In what follows, we only show the former, i.e., the case $x_{2}\to+\infty$ . The latter, i.e., $x_{2}\to-\infty$ , follows similarly, and it is omitted for the sake of brevity.

By rewriting $i_{\lambda}$ , we have

[TABLE]

It is obvious that the first term in the RHS of (40) approaches $-\lambda\log p_{Y}(1;F_{X_{1}}F_{X_{2}}^{*})$ from below when $x_{2}\to+\infty$ , since $p(1;F_{X_{1}}|x_{2})\leq 1$ . It is also obvious that the remaining terms go to zero when $x_{2}\to+\infty$ . Hence, it is sufficient to show that the second line of (40) approaches zero from below, which is proved by using the following lemma.

Lemma 4. Let $X_{1}$ be distributed on $[-A_{1},A_{2}]$ according to $F_{X_{1}}(x_{1})$ . We have

[TABLE]

Proof.

The proof is provided in Appendix G. ∎

From (41), we can write

[TABLE]

where $\gamma(x_{2})\leq 1$ (due to concavity of $H_{b}(\cdot)$ ), and $\gamma(x_{2})\to 1$ when $x_{2}\to+\infty$ (due to (41)). Also, from the fact that $\lim_{x\to 0}\frac{H_{b}(x)}{cx}=+\infty\ (c>0)$ , we have

[TABLE]

where $\eta(x_{2})>0$ and $\eta(x_{2})\to+\infty$ when $x_{2}\to+\infty$ . From (42) and (43), the second line of (40) becomes

[TABLE]

Since $\gamma(x_{2})\to 1$ and $\eta(x_{2})\to+\infty$ as $x_{2}\to+\infty$ , there exists a real number $K$ such that $1-\gamma(x_{2})+\frac{\lambda}{\eta(x_{2})}-\lambda<0$ when $x_{2}>K$ . Therefore, the second line of (40) approaches zero from below, which proves that the support of $X_{2}^{*}$ is bounded away from $+\infty$ . As mentioned before, a similar argument holds when $x_{2}\to-\infty$ . This proves that $X_{2}^{*}$ has a bounded support.
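The limit $\lim_{x\to 0}\frac{H_{b}(x)}{cx}=+\infty$ used before (43) can be checked numerically. The following is a minimal sketch, not part of the original proof; the helper names `binary_entropy` and `entropy_ratio` are illustrative, entropy is measured in bits, and $c=1$ is an arbitrary choice.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits; H_b(0) = H_b(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    # log1p keeps the (1 - p) term accurate when p is tiny.
    return -(p * math.log(p) + (1.0 - p) * math.log1p(-p)) / math.log(2.0)


def entropy_ratio(x, c=1.0):
    """Ratio H_b(x) / (c x) for x in (0, 1) and a constant c > 0."""
    return binary_entropy(x) / (c * x)


# Evaluate the ratio as x decreases towards zero (x = 1e-1, ..., 1e-8).
ratios = [entropy_ratio(10.0 ** (-k)) for k in range(1, 9)]
```

On this grid the ratio grows roughly like $\log_{2}(1/x)$, which is consistent with $\eta(x_{2})\to+\infty$ as the argument of $H_{b}$ vanishes.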

Remark 3. We remark here that the order of showing the boundedness of the supports is important. First, for a given $F_{X_{2}}$ (not necessarily bounded), it is proved that $F_{X_{1}}^{*}$ is bounded. Then, for a given bounded $F_{X_{1}}$ , it is shown that $F_{X_{2}}^{*}$ is also bounded. The order is reversed when $\lambda>1$ , and it follows the same steps as in the case of $\lambda\leq 1$ . Therefore, it is omitted.

We next prove the second claim in Proposition 3. We assume that $0<\lambda<1$ , and a bounded $F_{X_{1}}$ is given. We already know that for a given bounded $F_{X_{1}}$ , $F_{X_{2}}^{*}$ has a bounded support denoted by $[A_{1},A_{2}]$ . Therefore,

[TABLE]

where $\mathscr{S}_{2}$ denotes the set of all probability distributions on the Borel sets of $[A_{1},A_{2}]$ . Let $p_{0}^{*}=p_{Y}(0;F_{X_{1}}F_{X_{2}}^{*})$ denote the probability of the event $Y=0$ , induced by $F_{X_{2}}^{*}$ and the given $F_{X_{1}}$ . Also, let $P_{2}^{*}$ denote the second moment of $X_{2}$ under $F_{X_{2}}^{*}$ . The set

[TABLE]

is the intersection of $\mathscr{S}_{2}$ with two hyperplanes. (Note that $\mathscr{S}_{2}$ is convex and compact.) We can write

[TABLE]

Note that having $F_{X_{2}}\in\mathscr{F}_{2}$ , the objective function in (47) becomes

[TABLE]

Since the linear part is continuous and $\mathscr{F}_{2}$ is compact, the objective function in (47) attains its maximum at an extreme point of $\mathscr{F}_{2}$ , which, by Dubins’ theorem, is a convex combination of at most three extreme points of $\mathscr{S}_{2}$ . (The continuity of the linear part follows similarly to the continuity arguments in Appendix B. The compactness is due to the closedness of the intersecting hyperplanes in $\mathscr{F}_{2}$ , since a closed subset of a compact set is compact [13]. The hyperplanes are closed due to the continuity of $x_{2}^{2}$ and $p(0|x_{2})$ (see (66)).) Since the extreme points of $\mathscr{S}_{2}$ are the CDFs having only one point of increase in $[A_{1},A_{2}]$ , we conclude that given any bounded $F_{X_{1}}$ , $F_{X_{2}}^{*}$ has at most three mass points.

Now, assume that an arbitrary $F_{X_{2}}$ is given with at most three mass points denoted by $\{x_{2,i}\}_{i=1}^{3}$ . It is already known that the support of $F_{X_{1}}^{*}$ is bounded, which is denoted by $[A_{1}^{\prime},A_{2}^{\prime}]$ . Let $\mathscr{S}_{1}$ denote the set of all probability distributions on the Borel sets of $[A_{1}^{\prime},A_{2}^{\prime}]$ . The set

[TABLE]

is the intersection of $\mathscr{S}_{1}$ with four hyperplanes. (Note that here, since we know $\theta_{1}\neq 0$ , the optimal input attains its maximum power of $P_{1}^{\prime}$ .) In a similar way,

[TABLE]

and having $F_{X_{1}}\in\mathscr{F}_{1}$ , the objective function in (50) becomes

[TABLE]

Therefore, given any $F_{X_{2}}$ with at most three points of increase, $F_{X_{1}}^{*}$ has at most five mass points.

When $\lambda=1$ , the second term on the RHS of (51) disappears, which means that $\mathscr{F}_{1}$ could be replaced by

[TABLE]

where $\tilde{p}_{0}^{*}=p_{Y}(0;F_{X_{1}}^{*}F_{X_{2}})$ is the probability of the event $Y=0$ , which is induced by $F_{X_{1}}^{*}$ and the given $F_{X_{2}}$ . Since the number of intersecting hyperplanes has been reduced to two, it is concluded that $F_{X_{1}}^{*}$ has at most three points of increase. ∎

Remark 4. Note that the order of showing the discreteness of the support sets is also important. First, for a given bounded $F_{X_{1}}$ (not necessarily discrete), it is proved that $F_{X_{2}}^{*}$ is discrete with at most three mass points. Then, for a given discrete $F_{X_{2}}$ with at most three mass points, it is shown that $F_{X_{1}}^{*}$ is also discrete with at most five mass points when $\lambda<1$ , and at most three mass points when $\lambda=1$ . When $\lambda>1$ , the order is reversed and the proof follows the same steps as in the case of $\lambda<1$ . Therefore, it is omitted.

V Conclusion

We have studied the capacity region of a two-transmitter Gaussian MAC under average input power constraints and one-bit ADC front end at the receiver. We have shown that an auxiliary random variable is necessary for characterizing the capacity region in general. We have derived an upper bound on the cardinality of this auxiliary variable, and proved that the distributions that achieve the boundary points of the capacity region are finite and discrete.

Appendix A

Since $|\mathcal{U}|\leq 5$ , we assume $\mathcal{U}=\{0,1,2,3,4\}$ without loss of generality, since what matters in the evaluation of the capacity region is the mass probability of the auxiliary random variable $U$ , not its actual values.

In order to show the compactness of $\Omega$ , we adopt a general form of the approach in [14].

First, we show that $\Omega$ is tight. (A set of probability distributions $\Theta$ defined on $\mathds{R}^{k}$ , i.e., a set of CDFs $F_{X_{1},X_{2},\ldots,X_{k}}$ , is said to be tight if, for every $\epsilon>0$ , there is a compact set $K_{\epsilon}\subset\mathds{R}^{k}$ such that [15]

$\mbox{Pr}\big\{(X_{1},X_{2},\ldots,X_{k})\in\mathds{R}^{k}\backslash K_{\epsilon}\big\}<\epsilon,\ \forall F_{X_{1},X_{2},\ldots,X_{k}}\in\Theta.$ )

Choose $T_{j}$ , $j=1,2$ , such that $T_{j}>\sqrt{\frac{2P_{j}}{\epsilon}}$ . Then, from Chebyshev’s inequality,

[TABLE]

Let $K_{\epsilon}=[0,4]\times[-T_{1},T_{1}]\times[-T_{2},T_{2}]\subset\mathds{R}^{3}$ . It is obvious that $K_{\epsilon}$ is a closed and bounded subset of $\mathds{R}^{3}$ , and therefore, compact. With this choice of $K_{\epsilon}$ , we have

[TABLE]

where (53) is due to (52). Hence, $\Omega$ is tight.
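The Chebyshev step behind the choice $T_{j}>\sqrt{2P_{j}/\epsilon}$ can be illustrated as follows. This is a minimal sketch under stated assumptions: the three-point input, the values of `power` and `eps`, and the `slack` factor are hypothetical and chosen only for illustration.

```python
import math


def tail_threshold(power, eps, slack=1.01):
    """Return some T with T > sqrt(2 * power / eps); slack > 1 is arbitrary."""
    return slack * math.sqrt(2.0 * power / eps)


def second_moment(points, probs):
    """E[X^2] for a discrete distribution."""
    return sum(p * x * x for x, p in zip(points, probs))


def tail_probability(points, probs, t):
    """Pr{|X| > t} for a discrete distribution."""
    return sum(p for x, p in zip(points, probs) if abs(x) > t)


# Hypothetical input: rare large amplitudes, second moment equal to P = 1.
points, probs = [-10.0, 0.0, 10.0], [0.005, 0.99, 0.005]
power, eps = 1.0, 0.1
t = tail_threshold(power, eps)
tail = tail_probability(points, probs, t)
chebyshev_bound = second_moment(points, probs) / t ** 2
```

With this choice of $T_{j}$ , the bound $P_{j}/T_{j}^{2}$ is below $\epsilon/2$ for each input, so the two tails together contribute less than $\epsilon$ outside $K_{\epsilon}$ .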

From Prokhorov’s theorem [15, p.318], a set of probability distributions is tight if and only if it is relatively sequentially compact. (A subset of a topological space is relatively compact if its closure is compact.) This means that for every sequence of CDFs $\{F_{n}\}$ in $\Omega$ , there exists a subsequence $\{F_{n_{k}}\}$ that is weakly convergent to a CDF $F_{0}$ , which is not necessarily in $\Omega$ . The weak convergence of $\{F_{n}\}$ to $F$ (also written as $F_{n}(x)\stackrel{w}{\to}F(x)$ ) is equivalent to

$\lim_{n\to\infty}\int_{\mathbb{R}}\psi(x)dF_{n}(x)=\int_{\mathbb{R}}\psi(x)dF(x),$

(54)

for all continuous and bounded functions $\psi(\cdot)$ on $\mathbb{R}$ . Note that $F_{n}(x)\stackrel{w}{\to}F(x)$ if and only if $d_{L}(F_{n},F)\to 0$ . If we can show that this $F_{0}$ is also an element of $\Omega$ , then the proof is complete, since $\Omega$ is then sequentially compact, and therefore, compact. (Compactness and sequential compactness are equivalent in metric spaces. Note that $\Omega$ is a metric space with the Lévy distance.)

Assume a sequence of distributions $\{F_{n}(\cdot,\cdot,\cdot)\}$ in $\Omega$ that converges weakly to $F_{0}(\cdot,\cdot,\cdot)$ . In order to show that this limiting distribution is also in $\Omega$ , we need to show that both the average power constraints and the Markov chain ( $X_{1}-U-X_{2}$ ) are preserved under $F_{0}$ . The preservation of the second moment follows similarly to the argument in [14, Appendix I]. In other words, since $x^{2}$ is continuous and bounded below, from [16, Theorem 4.4.4]

[TABLE]

Therefore, the second moments are preserved under the limiting distribution $F_{0}$ .

For the preservation of the Markov chain $X_{1}-U-X_{2}$ , we need the following proposition.

Proposition 4. Assume a sequence of distributions $\{F_{n}(\cdot,\cdot)\}$ over the pair of random variables $(X,Y)$ that converges weakly to $F_{0}(\cdot,\cdot)$ . Also, assume that $Y$ has a finite support, i.e., $\mathcal{Y}=\{1,2,\ldots,|\mathcal{Y}|\}$ . Then, the sequence of conditional distributions (conditioned on $Y$ ) converges weakly to the limiting conditional distribution (conditioned on $Y$ ), i.e.,

[TABLE]

Proof.

The proof is by contradiction. If (56) is not true, then there exists $y^{\prime}\in\mathcal{Y}$ , such that $F_{n}(\cdot|y^{\prime})\stackrel{w}{\nrightarrow}F_{0}(\cdot|y^{\prime})$ . This means, from the definition of weak convergence, that there exists a bounded continuous function of $x$ , denoted by $g_{y^{\prime}}(x)$ , such that

[TABLE]

Let $f(x,y)$ be a bounded continuous function that satisfies

[TABLE]

With this choice of $f(x,y)$ , we have

[TABLE]

which violates the assumption of the weak convergence of $F_{n}(\cdot,\cdot)$ to $F_{0}(\cdot,\cdot)$ . Therefore, (56) holds. ∎

Since $\{F_{n}(\cdot,\cdot,\cdot)\}$ in $\Omega$ converges weakly to $F_{0}(\cdot,\cdot,\cdot)$ and $\mathcal{U}$ is finite, from Proposition 4, we have

[TABLE]

where it is obvious that the arguments are $x_{1}$ and $x_{2}$ . Since $F_{n}\in\Omega$ , we have $F_{n}(x_{1},x_{2}|u)=F_{n}(x_{1}|u)F_{n}(x_{2}|u)\ \forall u\in\mathcal{U}$ . Also, since the convergence of the joint distribution implies the convergence of the marginals, we have [17], [18, Theorem 2.7],

[TABLE]

which states that under the limiting distribution $F_{0}$ , the Markov chain $X_{1}-U-X_{2}$ is preserved. Alternatively, this could be proved by the lower semicontinuity of the mutual information as follows.

$I_{F_{0}}(X_{1};X_{2}|U=u)\leq\liminf_{n\to\infty}I_{F_{n}}(X_{1};X_{2}|U=u)$

(62)

$=0,\ \forall u\in\mathcal{U},$

(63)

where $I_{F}$ denotes the mutual information under distribution $F$ . The last equality is from the conditional independence of $X_{1}$ and $X_{2}$ given $U=u$ under $F_{n}$ . Therefore, $I_{F_{0}}(X_{1};X_{2}|U=u)=0,\ \forall u\in\mathcal{U}$ , which is equivalent to (61). This completes the proof of the compactness of $\Omega$ .

Appendix B Proof of Proposition 2

B-A Concavity

When $0<\lambda\leq 1$ , we have

[TABLE]

For a given $F_{X_{1}}$ , $H(Y)$ is a concave function of $F_{X_{2}}$ , while $H(Y|X_{2})$ and $H(Y|X_{1},X_{2})$ are linear in $F_{X_{2}}$ . Therefore, $I_{\lambda}$ is a concave function of $F_{X_{2}}$ . For a given $F_{X_{2}}$ , $H(Y)$ and $H(Y|X_{2})$ are concave functions of $F_{X_{1}}$ , while $H(Y|X_{1},X_{2})$ is linear in $F_{X_{1}}$ . Since $(1-\lambda)\geq 0$ , $I_{\lambda}$ is a concave function of $F_{X_{1}}$ . The same reasoning applies to the case $\lambda>1$ .
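The concavity argument above can be spot-checked for discrete inputs. The sketch below assumes that (64) reads $I_{\lambda}=\lambda H(Y)+(1-\lambda)H(Y|X_{2})-H(Y|X_{1},X_{2})$ , which is consistent with the three entropy terms and the factor $(1-\lambda)$ named in the text; the function names and the input distributions are hypothetical, and entropies are in bits.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hb(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def i_lambda(lam, dist1, dist2):
    """Assumed form: lam*H(Y) + (1-lam)*H(Y|X2) - H(Y|X1,X2).

    dist1 and dist2 are lists of (mass point, probability) pairs.
    """
    def p0_given_x2(x2):
        # Pr{Y = 0 | X2 = x2}, averaged over X1.
        return sum(a * q_func(x1 + x2) for x1, a in dist1)

    h_y = hb(sum(b * p0_given_x2(x2) for x2, b in dist2))
    h_y_x2 = sum(b * hb(p0_given_x2(x2)) for x2, b in dist2)
    h_y_x1x2 = sum(a * b * hb(q_func(x1 + x2))
                   for x1, a in dist1 for x2, b in dist2)
    return lam * h_y + (1.0 - lam) * h_y_x2 - h_y_x1x2


def mix(dist_a, dist_b, w):
    """Convex combination w*dist_a + (1-w)*dist_b of two discrete laws."""
    return ([(x, w * p) for x, p in dist_a]
            + [(x, (1.0 - w) * p) for x, p in dist_b])


# Hypothetical inputs chosen only for illustration.
f_x1 = [(-1.0, 0.5), (1.0, 0.5)]
f_a, f_b = [(-2.0, 0.5), (2.0, 0.5)], [(0.5, 1.0)]
lam, w = 0.5, 0.3
lhs = i_lambda(lam, f_x1, mix(f_a, f_b, w))
rhs = w * i_lambda(lam, f_x1, f_a) + (1.0 - w) * i_lambda(lam, f_x1, f_b)
```

Concavity in $F_{X_{2}}$ corresponds to `lhs >= rhs` for every mixing weight `w` in $[0,1]$ ; a single pair of distributions is only a sanity check, not a proof.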

B-B Continuity

When $\lambda\leq 1$ , the continuity of the three terms on the RHS of (64) is investigated. Let $\{F_{X_{2},n}\}$ be a sequence of distributions which is weakly convergent to $F_{X_{2}}$ . For a given $F_{X_{1}}$ , we have

[TABLE]

where (65) is due to the fact that the $Q$ function can be dominated by 1, which is an absolutely integrable function over $F_{X_{1}}$ . Therefore, $p(y;F_{X_{1}}|x_{2})$ is continuous in $x_{2}$ , and combined with the weak convergence of $\{F_{X_{2},n}\}$ , we can write

[TABLE]

This allows us to write

[TABLE]

which proves the continuity of $H(Y)$ in $F_{X_{2}}$ . $H(Y|X_{2}=x_{2})$ is a bounded ( $\in[0,1]$ ) continuous function of $x_{2}$ , since it is a continuous function of $p(y;F_{X_{1}}|x_{2})$ , and the latter is continuous in $x_{2}$ (see (66)). Therefore,

[TABLE]

which proves the continuity of $H(Y|X_{2})$ in $F_{X_{2}}$ . In a similar way, it can be verified that $\int H(Y|X_{1}=x_{1},X_{2}=x_{2})dF_{X_{1}}(x_{1})$ is a bounded and continuous function of $x_{2}$ which guarantees the continuity of $H(Y|X_{1},X_{2})$ in $F_{X_{2}}$ , since

[TABLE]

Therefore, for a given $F_{X_{1}}$ , $I_{\lambda}$ is a continuous function of $F_{X_{2}}$ . The case in which the roles of $F_{X_{1}}$ and $F_{X_{2}}$ are exchanged, as well as the case $\lambda>1$ , can be addressed similarly, and both are omitted for the sake of brevity.

B-C Weak Differentiability

For a given $F_{X_{1}}$ , the weak derivative of $I_{\lambda}$ at $F_{X_{2}}^{0}$ is given by

[TABLE]

if the limit exists. It can be verified that

[TABLE]

where $i_{\lambda}$ has been defined in (23). In a similar way, for a given $F_{X_{2}}$ , the weak derivative of $I_{\lambda}$ at $F_{X_{1}}^{0}$ is

[TABLE]

where $\tilde{i}_{\lambda}$ has been defined in (22). The case $\lambda>1$ can be addressed similarly.

Appendix C Proof of Lemma 2

We have

[TABLE]

where (70) is from the non-negativity of mutual information and the assumption that $0<\lambda\leq 1$ ; (71) is justified because the $Q$ function is monotonically decreasing and the sign of the inputs does not affect the average power constraints, so that $X_{1}$ and $X_{2}$ can be assumed non-negative (or alternatively non-positive) without loss of optimality; in (72), we use the fact that $Q\left(\sqrt{x_{1}^{2}}+\sqrt{x_{2}^{2}}\right)\leq\frac{1}{2}$ , and for $t\in[0,\frac{1}{2}]$ , $H_{b}(t)\geq t$ ; (73) is based on the convexity and monotonicity of the function $Q(\sqrt{u}+\sqrt{v})$ in $(u,v)$ , which is shown in Appendix F. Therefore, the LHS of (19) is strictly less than 1.

Since $X_{2}$ has a finite second moment ( $\mathds{E}[X_{2}^{2}]\leq P_{2}^{\prime}$ ), from Chebyshev’s inequality, we have

[TABLE]

Fix $M>0$ and consider $X_{1}\sim F_{X_{1}}(x_{1})=\frac{1}{2}[s(x_{1}+2M)+s(x_{1}-2M)]$ . By this choice of $F_{X_{1}}$ , we get

[TABLE]

where (78) is due to (75) and the fact that $H(Y|X_{2}=x_{2})=H_{b}(\frac{1}{2}Q(2M+x_{2})+\frac{1}{2}Q(-2M+x_{2}))$ is minimized over $[-M,M]$ at $x_{2}=M$ (or, alternatively at $x_{2}=-M$ ), and $H(Y|X_{1},X_{2}=x_{2})=\frac{1}{2}H_{b}(Q(2M+x_{2}))+\frac{1}{2}H_{b}(Q(-2M+x_{2}))$ is maximized at $x_{2}=0$ . (78) shows that $I_{\lambda}$ ( $\leq 1$ ) can become arbitrarily close to 1 given that $M$ is large enough. Hence, its supremum over all distributions $F_{X_{1}}$ is 1. This means that (19) cannot hold, and $\theta_{1}\neq 0$ .
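The behaviour of the two-point input $X_{1}\in\{-2M,+2M\}$ can be explored numerically. The sketch below is illustrative only: it fixes $X_{2}$ to a point mass at $x_{2}$ (an assumption made here for simplicity, in which case $H(Y)=H(Y|X_{2})$ and the value does not depend on $\lambda$ ), and the function names are hypothetical. Entropies are in bits.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hb(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def h_y_given_x2(m, x2):
    """H(Y | X2 = x2) when X1 is -2M or +2M with probability 1/2 each."""
    return hb(0.5 * q_func(2.0 * m + x2) + 0.5 * q_func(-2.0 * m + x2))


def h_y_given_x1_x2(m, x2):
    """H(Y | X1, X2 = x2) for the same two-point X1."""
    return 0.5 * hb(q_func(2.0 * m + x2)) + 0.5 * hb(q_func(-2.0 * m + x2))


def i_lambda_point_mass(m, x2=0.0):
    """I_lambda when X2 is a point mass at x2 (illustrative assumption)."""
    return h_y_given_x2(m, x2) - h_y_given_x1_x2(m, x2)


# The value approaches 1 as M grows.
values = [i_lambda_point_mass(m) for m in (0.5, 1.0, 2.0, 3.0)]

# H(Y | X2 = x2) over a grid on [-M, M], here with M = 2.
grid = [-2.0 + 0.1 * k for k in range(41)]
h_on_grid = [h_y_given_x2(2.0, x2) for x2 in grid]
```

On the grid, the smallest value of $H(Y|X_{2}=x_{2})$ occurs at the endpoints $x_{2}=\pm M$ , in line with the statement above.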

Appendix D Justification of (24), (25) and (26)

Let $X$ be a vector space, and $Z$ be a real-valued function defined on a convex domain $D\subset X$ . Suppose that $x^{*}$ maximizes $Z$ on $D$ , and that $Z$ is Gateaux differentiable (weakly differentiable) at $x^{*}$ . Then, from [19, Th.2, p.178],

[TABLE]

where $Z^{\prime}(x)|_{x^{*}}$ is the weak derivative of $Z$ at $x^{*}$ .

From (69), we have the weak derivative of $I_{\lambda}$ at $F^{*}_{X_{1}}$ as

[TABLE]

Now, the derivation of (24) is immediate by inspecting that the weak derivative of the objective of (17) at $F^{*}_{X_{1}}$ is given by

[TABLE]

Letting (81) be lower than or equal to zero (as in (79)) results in (24).

The equivalence of (24) to (25) and (26) follows similarly to the proof of Corollary 1 in [20, p.210].

Appendix E Proof of Lemma 3

(27) is obtained as follows.

[TABLE]

where (82) is due to the fact that the binary entropy function is upper bounded by 1. (83) is justified as follows.

[TABLE]

where (85) is based on the convexity and monotonicity of the function $Q(\sqrt{u}+\sqrt{v})$ , which is shown in Appendix F.

(28) is obtained as follows.

[TABLE]

where (86) is due to convexity of $Q(\alpha+\sqrt{x})$ in $x$ for $\alpha\geq 0$ .

(29) is obtained as follows.

[TABLE]

where (87) is from $p(y|x_{1},x_{2})\leq 1$ ; and (88) is from (86) and (85).

Note that (88) is integrable with respect to $F_{X_{2}}$ due to the concavity of $-\log Q(\alpha+\sqrt{x})$ in $x$ for $\alpha\geq 0$ , as shown in Appendix F. In other words,

[TABLE]

Appendix F Two convex functions

Let $f(x)=\log Q(a+\sqrt{x})$ for $x,a\geq 0$ . We have,

[TABLE]

and

[TABLE]

where $\phi(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}$ . Note that

[TABLE]

where (92) and (93) are, respectively, due to $\phi(x)>xQ(x)$ and $(1+x^{2})Q(x)>x\phi(x)$ ( $x>0$ ). Therefore,

[TABLE]

which makes the second derivative in (91) positive, and proves the (strict) convexity of $f(x)$ .
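Both the $Q$-function inequalities and the resulting convexity can be spot-checked on a grid. This is a minimal numerical sketch rather than a proof: the grid, the step `h`, the values of $a$ , and the function names are arbitrary illustrative choices, and the natural logarithm is used (the base does not affect convexity).

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def log_q_shifted(x, a):
    """f(x) = log Q(a + sqrt(x)) for x, a >= 0."""
    return math.log(q_func(a + math.sqrt(x)))


def second_difference(x, a, h=1e-2):
    """Central second difference of f; positive values indicate convexity."""
    return (log_q_shifted(x + h, a) - 2.0 * log_q_shifted(x, a)
            + log_q_shifted(x - h, a))


xs = [0.1 * k for k in range(1, 61)]  # grid on (0, 6]

# phi(x) > x Q(x) and (1 + x^2) Q(x) > x phi(x) for x > 0.
bounds_hold = all(phi(x) > x * q_func(x)
                  and (1.0 + x * x) * q_func(x) > x * phi(x) for x in xs)

# Discrete convexity check of f for a few shifts a >= 0.
convex_on_grid = all(second_difference(x, a) > 0.0
                     for a in (0.0, 0.5, 2.0) for x in xs)
```

Both flags evaluate to true on this grid, which agrees with the strict convexity of $f(x)$ established above.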

Let $f(u,v)=Q(\sqrt{u}+\sqrt{v})$ for $u,v\geq 0$ . By simple differentiation, the Hessian matrix of $f$ is

[TABLE]

It can be verified that $\mbox{det}(\mathbf{H})>0$ and $\mbox{trace}(\mathbf{H})>0$ . Therefore, both eigenvalues of $\mathbf{H}$ are positive, which makes the matrix positive definite. Hence, $Q(\sqrt{u}+\sqrt{v})$ is (strictly) convex in $(u,v)$ .
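The positive definiteness of the Hessian can also be checked numerically. In the sketch below, the entries are obtained by direct differentiation using $Q^{\prime}(s)=-\phi(s)$ with $s=\sqrt{u}+\sqrt{v}$ ; they are written out here as our own derivation and should agree with $\mathbf{H}$ above. The function names and the grid are hypothetical.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hessian(u, v):
    """Entries (f_uu, f_uv, f_vv) of the Hessian of Q(sqrt(u) + sqrt(v)).

    Derived here by direct differentiation for u, v > 0.
    """
    s = math.sqrt(u) + math.sqrt(v)
    f_uu = phi(s) * (s / (4.0 * u) + 1.0 / (4.0 * u ** 1.5))
    f_vv = phi(s) * (s / (4.0 * v) + 1.0 / (4.0 * v ** 1.5))
    f_uv = phi(s) * s / (4.0 * math.sqrt(u * v))
    return f_uu, f_uv, f_vv


def det_and_trace(u, v):
    """Determinant and trace of the 2x2 Hessian at (u, v)."""
    f_uu, f_uv, f_vv = hessian(u, v)
    return f_uu * f_vv - f_uv ** 2, f_uu + f_vv


grid = [0.25 * k for k in range(1, 21)]  # grid on (0, 5]
positive_definite = all(min(det_and_trace(u, v)) > 0.0
                        for u in grid for v in grid)
```

A positive determinant together with a positive trace means both eigenvalues are positive at every grid point, matching the argument in the text.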

Appendix G Proof of Lemma 4

Let $A\triangleq\max\{A_{1},A_{2}\}$ .

It is obvious that

[TABLE]

Therefore, we can write

[TABLE]

for some $\beta\in[0,1].$ Note that $\beta$ is a function of $x_{2}$ . Also, due to concavity of $H_{b}(\cdot)$ , we have

[TABLE]

From the fact that

[TABLE]

we can also write

[TABLE]

where (96) and (98) have been used in (99). (97) and (99) are depicted in Figure 3.

From (96) and (99), we have

[TABLE]

Let

[TABLE]

This minimizer satisfies the following equality

[TABLE]

Therefore, we can write

[TABLE]

where (103) is from the definition in (101); (104) is from the expansion of (102), and $H_{b}^{\prime}(t)=\log(\frac{1-t}{t})$ is the derivative of the binary entropy function; (105) is due to the fact that $H_{b}^{\prime}(t)$ is a decreasing function.

Applying L’Hôpital’s rule multiple times, we obtain

[TABLE]

From (100), (105) and (106), (41) is proved. Note that the boundedness of $X_{1}$ is crucial in the proof. In other words, the fact that $Q(x_{2}-A)\to 0$ as $x_{2}\to+\infty$ is the very result of $A<+\infty$ .
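The role of bounded support can be illustrated with the quantity $\gamma(x_{2})$ of Section IV. The sketch below assumes that $\gamma(x_{2})$ is the ratio of the averaged entropy $\int H_{b}(Q(x_{1}+x_{2}))dF_{X_{1}}(x_{1})$ to the entropy of the average $H_{b}\big(\int Q(x_{1}+x_{2})dF_{X_{1}}(x_{1})\big)$ , which is our reading of how (41) is used in (42). The two-point $F_{X_{1}}$ on $\{-A,+A\}$ with $A=1$ and the function names are hypothetical.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hb(p):
    """Binary entropy in bits; log1p keeps tiny arguments accurate."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log1p(-p)) / math.log(2.0)


def gamma_ratio(x2, a):
    """E[H_b(Q(X1 + x2))] / H_b(E[Q(X1 + x2)]) for X1 uniform on {-a, +a}."""
    q_minus, q_plus = q_func(x2 - a), q_func(x2 + a)
    averaged_entropy = 0.5 * hb(q_minus) + 0.5 * hb(q_plus)
    entropy_of_average = hb(0.5 * q_minus + 0.5 * q_plus)
    return averaged_entropy / entropy_of_average


# The ratio stays below 1 (Jensen) and creeps up towards 1 as x2 grows.
ratios = [gamma_ratio(x2, 1.0) for x2 in (5.0, 10.0, 20.0, 30.0)]
```

The convergence is slow (the gap shrinks roughly like the reciprocal of $\log(1/Q(x_{2}-A))$ ), and it relies on $A$ being finite, as noted above.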

Bibliography

[1] B. Rassouli, M. Varasteh, and D. Gündüz, “Capacity region of a one-bit quantized Gaussian multiple access channel,” in IEEE Int’l Sym. Inf. Theory (ISIT), Aachen, Germany, Jun. 2017, pp. 2633–2637.
[2] R. Walden, “Analog-to-digital converter survey and analysis,” IEEE Journal on Selected Areas in Comm., vol. 17, no. 4, pp. 539–550, Apr. 1999.
[3] B. Murmann, “ADC performance survey,” CoRR, vol. abs/1404.7736, 1997-2014. [Online]. Available: http://web.stanford.edu/~murmann/adcsurvey.html
[4] D. Gunduz, K. Stamatiou, N. Michelusi, and M. Zorzi, “Designing intelligent energy harvesting communication systems,” IEEE Comm. Magazine, vol. 52, pp. 210–216, January 2014.
[5] J. Singh, O. Dabeer, and U. Madhow, “On the limits of communication with low-precision analog-to-digital conversion at the receiver,” IEEE Trans. Communications, vol. 57, no. 12, pp. 3629–3639, December 2009.
[6] S. Krone and G. Fettweis, “Fading channels with 1-bit output quantization: Optimal modulation, ergodic capacity and outage probability,” in Proc. IEEE Inf. Theory Workshop (ITW), Aug. 2010, pp. 1–5.
[7] A. Mezghani and J. Nossek, “Analysis of Rayleigh-fading channels with 1-bit quantized output,” IEEE Int. Sym. Inf. Theory, pp. 260–264, Jul. 2008.
[8] ——, “On ultra-wideband MIMO systems with 1-bit quantized outputs: Performance analysis and input optimization,” IEEE Int. Sym. Inf. Theory, pp. 1286–1289, Jun. 2007.