Error Exponent Bounds for the Bee-Identification Problem

Anshoo Tandon; Vincent Y. F. Tan; Lav R. Varshney

arXiv:1905.07868·cs.IT·June 5, 2019

TL;DR

This paper introduces the bee-identification problem, defines its error exponent, and derives upper and lower bounds on that exponent. The bounds show that joint decoding outperforms separate decoding, and the upper and lower bounds meet as the rate approaches zero.

Contribution

It formally defines the bee-identification problem, introduces error exponent bounds, and demonstrates the superiority of joint decoding over separate decoding.

Findings

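01. Joint decoding of barcodes gives a significantly better error exponent than separate decoding followed by permutation inference.

02. For low rates, the lower bound obtained using typical random codes (TRC) is strictly better than the corresponding bound obtained using a random code ensemble (RCE).

03. As the rate approaches zero, the upper bound meets the TRC lower bound with joint barcode decoding, so the bounds are tight in that regime.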

Abstract

Consider the problem of identifying a massive number of bees, uniquely labeled with barcodes, using noisy measurements. We formally introduce this “bee-identification problem”, define its error exponent, and derive efficiently computable upper and lower bounds for this exponent. We show that joint decoding of barcodes provides a significantly better exponent compared to separate decoding followed by permutation inference. For low rates, we prove that the lower bound on the bee-identification exponent obtained using typical random codes (TRC) is strictly better than the corresponding bound obtained using a random code ensemble (RCE). Further, as the rate approaches zero, we prove that the upper bound on the bee-identification exponent meets the lower bound obtained using TRC with joint barcode decoding.

Equations

$$\Pr\{\tilde{{\boldsymbol{c}}}_{\pi(i)}\mid{\boldsymbol{c}}_{i},\pi\}=p^{d_{i}}(1-p)^{n-d_{i}}$$

$$\Pr\{\tilde{C}_{\pi}\mid C,\pi\}=\prod_{i=1}^{m}\Pr\{\tilde{{\boldsymbol{c}}}_{\pi(i)}\mid{\boldsymbol{c}}_{i},\pi\}$$

$$D(\phi(\tilde{C}_{\pi}),\pi^{-1})=D(\nu,\pi^{-1})\triangleq\begin{cases}1,&\text{if }\nu\neq\pi^{-1},\\0,&\text{if }\nu=\pi^{-1}.\end{cases}$$

$$D(C,p,\phi)\triangleq\mathbb{E}_{\pi}\Big[\mathbb{E}\big[D(\phi(\tilde{C}_{\pi}),\pi^{-1})\big]\Big]$$

$$D(C,p,\phi)=\Pr\big\{\phi(\tilde{C}_{\pi})\neq\pi^{-1}\big\}=\Pr\left\{\nu\neq\pi^{-1}\right\}$$

$$\underline{D}(n,R,p)\triangleq\min_{C,\phi}D(C,p,\phi)$$

$$E_{\underline{D}}(R,p)=\liminf_{n\to\infty}\frac{-1}{n}\log\underline{D}(n,R,p)$$

$$\underline{D}(n,R,p)\leq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}D(C,p,\phi)$$

$$D(C,p,\phi)\leq\sum_{j=1}^{m}\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}$$

$$\underline{D}(n,R,p)\leq\sum_{j=1}^{m}\sum_{C\in{\mathscr{C}}(n,R)}\frac{\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}}{|{\mathscr{C}}(n,R)|}$$

$$P(n,R,p)\triangleq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}$$

$$\underline{D}(n,R,p)\leq m\,P(n,R,p)$$

$$E_{\underline{D}}(R,p)\geq|R_{0}(p)-2R|^{+}$$

$$R_{0}(p)\triangleq 1-\log\left(1+\sqrt{4p(1-p)}\right)$$

$$E_{\mathrm{r}}(R,p)=\begin{cases}R_{0}(p)-R,&0\leq R\leq R_{\mathrm{cr}}(p),\\D\left(\delta_{{\mathrm{GV}}}(R)\,\|\,p\right),&R_{\mathrm{cr}}(p)<R\leq 1-H(p).\end{cases}$$

$$D(x\,\|\,y)\triangleq x\log\frac{x}{y}+(1-x)\log\frac{1-x}{1-y}$$

$$E_{\underline{D}}(R,p)\geq|E_{\mathrm{r}}(R,p)-R|^{+}$$

$$\Pr\{\pi_{0}\to\sigma\}\triangleq\Pr\left\{\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\sigma})\leq\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\pi_{0}})\right\}$$

$$D(C,p,\phi)\leq\sum_{\sigma\in S_{m},\,\sigma\neq\pi_{0}}\Pr\{\pi_{0}\to\sigma\}$$

$$P_{{\mathrm{RCE}},\sigma}\triangleq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}\Pr\{\pi_{0}\to\sigma\}$$

$$\underline{D}(n,R,p)\leq\sum_{\sigma\in S_{m},\,\sigma\neq\pi_{0}}P_{{\mathrm{RCE}},\sigma}$$

$$\Pr\{{\boldsymbol{c}}_{\hat{\imath}}\to{\boldsymbol{c}}_{\hat{\jmath}}\}\leq 2^{-d\,\alpha_{p}}$$

$$\alpha_{p}\triangleq-\log\sqrt{4p(1-p)}$$

$$\Pr\{\pi_{0}\to\sigma\}\leq 2^{-d_{\sigma}\alpha_{p}}$$

$$\Pr\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d\right\}\leq 2^{-n(1-H(d/n))}$$

$$P_{{\mathrm{RCE}},(\hat{\imath}~\hat{\jmath})}\leq\sum_{d=0}^{n}2^{-n\left(1-H(d/n)+2(d/n)\alpha_{p}\right)}$$

$$\hat{\delta}_{p}\triangleq\frac{4p(1-p)}{1+4p(1-p)}$$

$$2^{-n\left(1-H(d/n)+2(d/n)\alpha_{p}\right)}\leq 2^{-n\left(1-H(\hat{\delta}_{p})+2\hat{\delta}_{p}\alpha_{p}\right)}$$

$$P_{{\mathrm{RCE}},\sigma}\leq 2^{-n\left(1-H(\hat{\delta}_{p})+2\hat{\delta}_{p}\alpha_{p}-c_{n}\right)}$$

Full text

Anshoo Tandon, Vincent Y. F. Tan, and Lav R. Varshney

A. Tandon is with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583 (email: [email protected]). V. Y. F. Tan is with the Department of Electrical and Computer Engineering, and with the Department of Mathematics, National University of Singapore, Singapore (email: [email protected]). L. R. Varshney is with the Coordinated Science Laboratory and the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (email: [email protected]).

I Introduction

Consider a group of $m$ different bees, in which each bee is tagged with a unique barcode for identification purposes in order to understand interaction patterns in honeybee social networks [1]. Assume that a camera is employed to picture the beehive to study the interactions among bees. The image output (see Fig. 1) can be considered as a noisy and unordered set of $m$ barcodes. We formally pose the problem of bee-identification from a beehive image as an information-theoretic problem (Sec. I-B).

The bee-identification problem has applications in identification of warehouse products (labeled with unique RFID barcodes) using wide-area sensors. Other applications include package-distribution to recipients from a batch of deliveries with noisy address labels, and similar “bipartite matching” settings. It also has potential applications in identification of the mapping between signals and their meaning in “alien communication” with extraterrestrials, and also in learning communication protocols among robots, via the use of pilot signals going through the alphabet.

We consider the scenario where the barcode for each bee is represented as a binary vector of length $n$ , and the bee barcodes are collected in a codebook $C$ comprising $m$ rows and $n$ columns, with each row corresponding to a bee barcode. As shown in Fig. 2, the channel first permutes the rows of $C$ with a random permutation $\pi$ to produce $C_{\pi}$ . The entries of $C_{\pi}$ are then subjected to noise (corresponding to a binary symmetric channel (BSC) with crossover probability $p$ ), and the channel output is denoted $\tilde{C}_{\pi}$ . We assume that the decoder has knowledge of codebook $C$ , and its task is to recover the row-permutation $\pi$ introduced by the channel. Note that the permutation $\pi$ directly ascertains the identity of all the bees.
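The channel described above can be simulated in a few lines. The following is a minimal sketch, not the authors' code: the function name `bee_channel` and the list-of-lists codebook representation are illustrative assumptions, and the convention that row $\pi(i)$ of $C_{\pi}$ equals the $i$-th barcode follows the formulation in Sec. I-B.

```python
import random

def bee_channel(codebook, p, rng):
    """Permute the rows of `codebook`, then pass every bit through a BSC(p).

    Returns (pi, noisy), where pi[i] is the output row that carries barcode i
    (hypothetical representation chosen for this sketch).
    """
    m = len(codebook)
    pi = list(range(m))
    rng.shuffle(pi)                       # uniformly random row permutation
    permuted = [None] * m
    for i, row in enumerate(codebook):
        permuted[pi[i]] = row             # row pi(i) of C_pi equals c_i
    # Flip each bit independently with probability p.
    noisy = [[bit ^ (rng.random() < p) for bit in row] for row in permuted]
    return pi, noisy

rng = random.Random(0)
codebook = [[rng.randint(0, 1) for _ in range(8)] for _ in range(4)]
pi, noisy = bee_channel(codebook, 0.1, rng)
```

The decoder sees only `codebook` and `noisy`; recovering `pi` from them is the bee-identification task.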

I-A Related Work

In a related work motivated by an Internet of Things (IoT) setting, the identification of users in strongly asynchronous massive access channels was studied [2]. The identification of the underlying distributions of a set of observed sequences (where each sequence is generated i.i.d. by a distinct distribution) was analyzed in [3]. The bee-identification problem, on the other hand, allows codebooks where all barcode sequences are generated using the same underlying distribution.

In another related work [4], the fundamental limits of data storage via unordered DNA molecules were investigated. Here, a DNA molecule corresponds to an $\ell$ -length sequence over an alphabet of size 4, and the information is written onto $m$ DNA molecules stored in an unordered way. The storage capacity results in [4] were extended to noisy settings in [5] where the channel adds noise and randomly permutes the $m$ DNA molecules used to store information. The capacity results are obtained under the scenario where the length, $\ell$ , of each DNA molecule grows with $m$ . Although the effective channel in [5] is closely related to the bee-identification channel in Fig. 2, we note that the fundamental problem in [5] is to quantify the data storage capacity, while the main issue in the bee-identification problem is the identification of the row-permutation induced by the channel.

Data communication over permutation channels with impairments was analyzed in [6]. The authors of [6] presented bounds on the size of optimal codes over a finite input alphabet, when the channel randomly permutes the letters of the input sequence in addition to causing impairments such as insertions, deletions, and substitutions. The effective channel for the bee-identification problem (see Fig. 2) differs from the communication channel in [6] in two aspects: (i) The input to the channel in the bee-identification problem is the entire codebook, not just a codeword belonging to the codebook. (ii) The channel in Fig. 2 only permutes the rows of the codebook, but does not permute the letters within a row.

I-B Bee-Identification Problem Formulation

The channel output is a row-permuted and noisy version of the codebook. If $\pi$ denotes a given permutation of $m$ -letters, then the channel first permutes the $m$ rows of codebook $C$ , based on $\pi$ , to produce $C_{\pi}$ (see Fig. 2). Therefore, if $j=\pi(i)$ and the $i$ -th row of codebook $C$ is denoted ${\boldsymbol{c}}_{i}=[c_{i,1}~{}c_{i,2}~{}\cdots~{}c_{i,n}]$ , then the $j$ -th row of $C_{\pi}$ is equal to ${\boldsymbol{c}}_{i}$ . The channel then applies noise on the permuted codebook $C_{\pi}$ to produce $\tilde{C}_{\pi}$ , where noise is modeled by a BSC with crossover probability $p$ , denoted BSC( $p$ ), with $0<p<0.5$ . If $j=\pi(i)$ , and $\tilde{{\boldsymbol{c}}}_{\pi(i)}$ denotes the $j$ -th row of $\tilde{C}_{\pi}$ , then

$$\Pr\{\tilde{{\boldsymbol{c}}}_{\pi(i)}\mid{\boldsymbol{c}}_{i},\pi\}=p^{d_{i}}(1-p)^{n-d_{i}},\tag{1}$$

where ${d_{i}}\triangleq\mathrm{d_{H}}(\tilde{{\boldsymbol{c}}}_{\pi(i)},{\boldsymbol{c}}_{i})$ denotes the Hamming distance between vectors $\tilde{{\boldsymbol{c}}}_{\pi(i)}$ and ${\boldsymbol{c}}_{i}$ . Let $\mathcal{M}\triangleq\{1,2,\ldots,m\}$ , and let the decoder correspond to a function $\phi$ which takes $\tilde{C}_{\pi}$ as an input and produces a map $\nu:\mathcal{M}\to\mathcal{M}$ where $\nu(k)$ corresponds to the index of the transmitted codeword which produced the received word $\tilde{{\boldsymbol{c}}}_{k}$ , for $1\leq k\leq m$ . In effect, the bee-identification problem is that the decoder has to recover the row-permutation $\pi$ introduced by the channel, by using the knowledge of codebook $C$ and the channel output $\tilde{C}_{\pi}$ .

I-C Bee-Identification Error Exponent

The indicator for the bee-identification error is defined as

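$$D(\phi(\tilde{C}_{\pi}),\pi^{-1})=D(\nu,\pi^{-1})\triangleq\begin{cases}1,&\text{if }\nu\neq\pi^{-1},\\0,&\text{if }\nu=\pi^{-1}.\end{cases}$$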

For a given codebook $C$ and decoding function $\phi$ , the expected bee-identification error probability over the BSC( $p$ ) is

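$$D(C,p,\phi)\triangleq\mathbb{E}_{\pi}\Big[\mathbb{E}\big[D(\phi(\tilde{C}_{\pi}),\pi^{-1})\big]\Big],\tag{2}$$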

where the inner expectation is over the distribution of $\tilde{C}_{\pi}$ given $C$ and $\pi$ (see (1)), and the outer expectation is over a uniform distribution of $\pi$ over all $m$ -letter permutations. Note that (2) can be equivalently expressed as

$$D(C,p,\phi)=\Pr\big\{\phi(\tilde{C}_{\pi})\neq\pi^{-1}\big\}=\Pr\left\{\nu\neq\pi^{-1}\right\}.\tag{3}$$

For a given $R>0$ , let the number of barcodes $m$ scale exponentially with blocklength $n$ as $m=2^{nR}$ . Now, for given values of $n$ and $R$ , define the minimum expected bee-identification error probability as

$$\underline{D}(n,R,p)\triangleq\min_{C,\phi}D(C,p,\phi),\tag{4}$$

where the minimum is over all codebooks $C$ of size $2^{nR}\times n$ , and all decoding functions $\phi$ .

Define the exponent corresponding to the minimum expected bee-identification error probability, $E_{\underline{D}}(R,p)$ , as

$$E_{\underline{D}}(R,p)=\liminf_{n\to\infty}\frac{-1}{n}\log\underline{D}(n,R,p).\tag{5}$$

We introduce some notation that is used in the rest of the paper. We will denote $f(n)\doteq g(n)$ when $\lim_{n\to\infty}n^{-1}\log\left(f(n)/g(n)\right)=0$ . Similarly, we write $f(n)~{}\dot{\leq}~{}g(n)$ (respectively, $f(n)~{}\dot{\geq}~{}g(n)$ ) if $\limsup_{n\to\infty}n^{-1}\log\left(f(n)/g(n)\right)\leq 0$ (respectively, $\geq 0$ ).

I-D Our Contributions

The “bee-identification problem” is introduced and the corresponding bee-identification exponent $E_{\underline{D}}(R,p)$ is analyzed in this paper. In particular, we provide the following explicit bounds on this exponent.

• A lower bound on $E_{\underline{D}}(R,p)$ using a random code ensemble (RCE) with independent barcode decoding (Sec. II-A) and joint barcode decoding (Sec. II-B).

• A lower bound on $E_{\underline{D}}(R,p)$ using typical random codes (TRC) with independent barcode decoding (Sec. III-A) and joint barcode decoding (Sec. III-B).

• An upper bound on $E_{\underline{D}}(R,p)$ which is applicable to all possible codebook designs (Sec. IV).

We show that joint decoding of barcodes provides a significantly better exponent compared to separate decoding followed by learning the permutation. For low rates, we prove that the lower bound obtained using TRC is strictly better than the corresponding bound obtained using RCE. Further, as the rate approaches zero, we prove that the upper bound meets the lower bound obtained using TRC with joint barcode decoding.

II Random Code Ensemble

In this section, we present lower bounds on $E_{\underline{D}}(R,p)$ using an RCE [7]. Let ${\mathscr{C}}(n,R)$ denote the set of all binary matrices with $m=2^{nR}$ rows and $n$ columns. Assume that codebook $C$ is uniformly distributed over ${\mathscr{C}}(n,R)$ . It is immediate from the definition of $\underline{D}(n,R,p)$ (4) that

$$\underline{D}(n,R,p)\leq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}D(C,p,\phi),\tag{6}$$

where the expression on the right denotes the average performance using RCE. We proceed by quantifying this expression when the decoding function $\phi$ corresponds to: (i) independent barcode decoding (Sec. II-A), and (ii) joint barcode decoding (Sec. II-B). The main results in this section are as follows: we present explicit lower bounds on $E_{\underline{D}}(R,p)$ using independent barcode decoding (Thm. 1) and joint barcode decoding (Thm. 2). It is shown (Prop. 2) that the bee-identification exponent obtained using joint barcode decoding is strictly better than the corresponding exponent obtained with independent barcode decoding.

II-A Independent Decoding for Each Barcode

Here, we analyze a naïve decoding strategy where each barcode is decoded independently. In this case, for $1\leq j\leq m$ , the decoder picks $\tilde{{\boldsymbol{c}}}_{j}$ , the $j$ -th row of $\tilde{C}_{\pi}$ , and then decodes it to $\nu(j)=\operatorname*{arg\,min}_{k}\mathrm{d_{H}}(\tilde{{\boldsymbol{c}}}_{j},{\boldsymbol{c}}_{k})$ . If there is more than one codeword at the same minimum Hamming distance from $\tilde{{\boldsymbol{c}}}_{j}$ , then any one of the corresponding codeword indices is chosen at random. From (3) and the union bound, we have

$$D(C,p,\phi)\leq\sum_{j=1}^{m}\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}.\tag{7}$$

Combining (6) and (7), we get

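$$\underline{D}(n,R,p)\leq\sum_{j=1}^{m}\sum_{C\in{\mathscr{C}}(n,R)}\frac{\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}}{|{\mathscr{C}}(n,R)|}.\tag{8}$$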

Now define

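$$P(n,R,p)\triangleq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}.\tag{9}$$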

Note that $P(n,R,p)$ is independent of index $j$ due to the averaging over the ensemble of codebooks uniformly distributed over ${\mathscr{C}}(n,R)$ . For $i=\pi^{-1}(j)$ , the expression for $P(n,R,p)$ corresponds to the probability of error when the $i$ -th codeword is transmitted over BSC( $p$ ). From (8) and (9), we get

$$\underline{D}(n,R,p)\leq m\,P(n,R,p).\tag{10}$$

The following theorem uses (10) to present an explicit lower bound on $E_{\underline{D}}(R,p)$ .

Theorem 1.

We have

$$E_{\underline{D}}(R,p)\geq|R_{0}(p)-2R|^{+},\tag{11}$$

where $|x|^{+}\triangleq\max(0,x)$ , and

$$R_{0}(p)\triangleq 1-\log\left(1+\sqrt{4p(1-p)}\right).\tag{12}$$

Proof:

It is well known that the random coding exponent over BSC( $p$ ), defined as $E_{\mathrm{r}}(R,p)\triangleq\liminf_{n\to\infty}(1/n)\log\left(1/P(n,R,p)\right)$ , is given by [8, 7]

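$$E_{\mathrm{r}}(R,p)=\begin{cases}R_{0}(p)-R,&0\leq R\leq R_{\mathrm{cr}}(p),\\D\left(\delta_{{\mathrm{GV}}}(R)\,\|\,p\right),&R_{\mathrm{cr}}(p)<R\leq 1-H(p),\end{cases}\tag{13}$$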

where $H(\cdot)$ denotes the binary entropy function, $\delta_{{\mathrm{GV}}}(R)$ is the Gilbert-Varshamov (GV) distance [7] defined as the value of $\delta$ in the interval $[0,0.5]$ with $H(\delta)=1-R$ , and $R_{\mathrm{cr}}(p)$ is the critical rate given by $R_{\mathrm{cr}}(p)=1-H\left(\frac{\sqrt{p}}{\sqrt{p}+\sqrt{1-p}}\right)$ , and

$$D(x\,\|\,y)\triangleq x\log\frac{x}{y}+(1-x)\log\frac{1-x}{1-y}.$$

Using the fact that $m=2^{nR}$ , and combining (5), (10), and the definition of $E_{\mathrm{r}}(R,p)$ , we get

$$E_{\underline{D}}(R,p)\geq|E_{\mathrm{r}}(R,p)-R|^{+}.\tag{14}$$

Now, using explicit numerical computation, it can be shown that $R_{0}(p)\leq 2R_{\mathrm{cr}}(p)$ . The proof is complete by combining (13), (14), and noting that $|E_{\mathrm{r}}(R,p)-R|^{+}=0$ when $R\geq R_{\mathrm{cr}}(p)$ because $E_{\mathrm{r}}(R,p)$ is a decreasing function of $R$ . ∎
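The numerical claim $R_{0}(p)\leq 2R_{\mathrm{cr}}(p)$ used in the proof can be spot-checked on a grid. The sketch below is only illustrative: it assumes base-2 logarithms and the cutoff-rate form $R_{0}(p)=1-\log(1+\sqrt{4p(1-p)})$ , and the function names are hypothetical.

```python
from math import log2, sqrt

def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def cutoff_rate(p):
    """R_0(p) = 1 - log2(1 + sqrt(4 p (1 - p))) (assumed form)."""
    return 1 - log2(1 + sqrt(4 * p * (1 - p)))

def critical_rate(p):
    """R_cr(p) = 1 - H(sqrt(p) / (sqrt(p) + sqrt(1 - p)))."""
    return 1 - binary_entropy(sqrt(p) / (sqrt(p) + sqrt(1 - p)))

def independent_rce_bound(R, p):
    """Lower bound of Theorem 1: |R_0(p) - 2R|^+."""
    return max(0.0, cutoff_rate(p) - 2 * R)

# Check R_0(p) <= 2 R_cr(p) for p = 0.01, 0.02, ..., 0.49.
holds = all(cutoff_rate(k / 100) <= 2 * critical_rate(k / 100) + 1e-12
            for k in range(1, 50))
```

A grid check of this kind is not a proof, but it makes the margin visible: the two sides get very close as $p$ approaches $0.5$ .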

The lower bound on $E_{\underline{D}}(R,p)$ given by (11) was obtained by applying a naïve decoding strategy where each barcode was decoded independently. In the next subsection, we analyze the bee-identification exponent using joint barcode decoding.
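The independent decoding rule of this subsection can be sketched as follows. This is a simplified illustration with hypothetical function names; unlike the rule in the text, ties are broken by taking the lowest index instead of a random choice.

```python
def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def decode_independently(codebook, received):
    """Return nu, where nu[j] is the index of the codeword nearest to row j.

    Each received row is decoded on its own, so nu need not be a permutation.
    """
    nu = []
    for row in received:
        distances = [hamming(row, c) for c in codebook]
        nu.append(distances.index(min(distances)))  # lowest index on ties
    return nu

codebook = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
clean_guess = decode_independently(
    codebook, [[1, 1, 1, 1, 1, 0], [0, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0]])
collision = decode_independently(
    codebook, [[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]])
```

In the second call two received rows map to the same barcode, which shows why independent decoding ignores the constraint that the answer must be a permutation.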

II-B Joint Decoding of Barcodes

Let $S_{m}$ denote the set of permutations of $\{1,\ldots,m\}$ . For joint maximum likelihood (ML) decoding of barcodes, the decoding function $\phi$ takes the noisy row-permuted codebook $\tilde{C}_{\pi}$ as input, and produces permutation $\nu=\rho^{-1}$ as output, where $\rho=\operatorname*{arg\,min}_{\sigma\in S_{m}}\mathrm{d_{H}}(\tilde{C}_{\pi},C_{\sigma})$ , and $\mathrm{d_{H}}(\tilde{C}_{\pi},C_{\sigma})\triangleq|\{(i,j):\tilde{C}_{\pi}(i,j)\neq C_{\sigma}(i,j),1\leq i\leq m,1\leq j\leq n\}|$ . We aim to provide bounds on $\Pr\{\nu\neq\pi^{-1}\}=\Pr\{\rho\neq\pi\}$ .

For any two permutations $\pi_{1},\pi_{2}\in S_{m}$ , the sets of distances $\{\mathrm{d_{H}}(\tilde{C}_{\pi_{1}},C_{\sigma})\}_{\sigma\in S_{m}}$ and $\{\mathrm{d_{H}}(\tilde{C}_{\pi_{2}},C_{\sigma})\}_{\sigma\in S_{m}}$ are equal. Therefore, the performance of the joint ML decoder is independent of the channel permutation $\pi$ , and we assume, without loss of generality, that the permutation induced by the channel is the identity permutation, denoted $\pi_{0}$ .
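For very small $m$ , the joint ML rule can be written as a brute-force search over all permutations. The sketch below is an illustration under stated assumptions, not an efficient decoder: the function names are hypothetical, and the search is carried out directly over $\nu=\rho^{-1}$ , using the fact that $\mathrm{d_{H}}(\tilde{C}_{\pi},C_{\sigma})$ is the sum of the row distances between $\tilde{{\boldsymbol{c}}}_{k}$ and ${\boldsymbol{c}}_{\sigma^{-1}(k)}$ .

```python
from itertools import permutations

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def decode_jointly(codebook, received):
    """Return the permutation nu minimizing the total Hamming distance.

    nu[k] is the index of the barcode assigned to received row k.
    The search visits all m! permutations, so it is only usable for tiny m.
    """
    m = len(codebook)
    best_nu, best_cost = None, None
    for nu in permutations(range(m)):
        cost = sum(hamming(received[k], codebook[nu[k]]) for k in range(m))
        if best_cost is None or cost < best_cost:
            best_nu, best_cost = list(nu), cost
    return best_nu

codebook = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
received = [[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
nu = decode_jointly(codebook, received)
```

Because the cost is a sum of per-row distances, the same minimization is an instance of the linear assignment problem, so a polynomial-time assignment solver could replace the exhaustive loop for larger $m$ .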

For a given codebook $C$ at the transmitter, let $\tilde{C}_{\pi_{0}}$ denote the received noisy codebook at the output of the effective channel, and for $\sigma\in S_{m}$ with $\sigma\neq\pi_{0}$ , we define

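$$\Pr\{\pi_{0}\to\sigma\}\triangleq\Pr\left\{\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\sigma})\leq\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\pi_{0}})\right\},$$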

where the event $\{\pi_{0}\to\sigma\}$ is said to occur if $\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\sigma})\leq\mathrm{d_{H}}(\tilde{C}_{\pi_{0}},C_{\pi_{0}})$ . From (3), we have

$$D(C,p,\phi)\leq\sum_{\sigma\in S_{m},\,\sigma\neq\pi_{0}}\Pr\{\pi_{0}\to\sigma\},\tag{15}$$

where (15) follows from the union bound. Now define

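$$P_{{\mathrm{RCE}},\sigma}\triangleq\frac{1}{|{\mathscr{C}}(n,R)|}\sum_{C\in{\mathscr{C}}(n,R)}\Pr\{\pi_{0}\to\sigma\},\tag{16}$$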

which denotes the probability of the event $\{\pi_{0}\to\sigma\}$ , averaged over the ensemble of random binary codebooks. Using (6), (15), and (16), we get

$$\underline{D}(n,R,p)\leq\sum_{\sigma\in S_{m},\,\sigma\neq\pi_{0}}P_{{\mathrm{RCE}},\sigma}.\tag{17}$$

Now consider two codewords ${\boldsymbol{c}}_{\hat{\imath}}$ , ${\boldsymbol{c}}_{\hat{\jmath}}$ at distance $d$ from each other. Given that ${\boldsymbol{c}}_{\hat{\imath}}$ is transmitted over BSC( $p$ ), the probability that the Hamming distance of the received word from ${\boldsymbol{c}}_{\hat{\jmath}}$ is not more than its distance from ${\boldsymbol{c}}_{\hat{\imath}}$ is [7]

$$\Pr\{{\boldsymbol{c}}_{\hat{\imath}}\to{\boldsymbol{c}}_{\hat{\jmath}}\}\leq 2^{-d\,\alpha_{p}},$$

where

$$\alpha_{p}\triangleq-\log\sqrt{4p(1-p)}.\tag{18}$$

Therefore, for a given codebook $C=C_{\pi_{0}}$ and permutation $\sigma\in S_{m}$ with $\sigma\neq\pi_{0}$ , if $d_{\sigma}\triangleq\mathrm{d_{H}}(C_{\pi_{0}},C_{\sigma})$ , then

$$\Pr\{\pi_{0}\to\sigma\}\leq 2^{-d_{\sigma}\alpha_{p}}.\tag{19}$$

In the following, we quantify $P_{{\mathrm{RCE}},\sigma}$ for different $\sigma\in S_{m}$ , via (16) and (19).

II-B1 $\sigma$ is a transposition

We first consider the case where $\sigma$ is a transposition, i.e. a permutation that interchanges only two indices. For indices $\hat{\imath},\hat{\jmath}$ , with $1\leq\hat{\imath}<\hat{\jmath}\leq m$ , the Hamming distance between codewords ${\boldsymbol{c}}_{\hat{\imath}}$ and ${\boldsymbol{c}}_{\hat{\jmath}}$ in a random codebook satisfies [7]

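$$\Pr\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d\right\}\leq 2^{-n(1-H(d/n))}.\tag{20}$$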

When $\sigma=(\hat{\imath}~{}\hat{\jmath})$ is the permutation that only transposes indices $\hat{\imath}$ and $\hat{\jmath}$ , then $\mathrm{d_{H}}\left(C_{\pi_{0}},C_{(\hat{\imath}~{}\hat{\jmath})}\right)=2d$ if and only if $\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d$ . Thus, it follows from (20) that $\Pr\left\{\mathrm{d_{H}}\left(C_{\pi_{0}},C_{(\hat{\imath}~{}\hat{\jmath})}\right)=2d\right\}\leq 2^{-n(1-H(d/n))}$ . Further, when $\mathrm{d_{H}}\left(C_{\pi_{0}},C_{(\hat{\imath}~{}\hat{\jmath})}\right)=2d$ , we have $\Pr\{\pi_{0}\to(\hat{\imath}~{}\hat{\jmath})\}\leq 2^{-2d\,\alpha_{p}}$ . Therefore, the probability $P_{{\mathrm{RCE}},(\hat{\imath}~{}\hat{\jmath})}$ can be characterized using (16), (19), and (20) as

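$$P_{{\mathrm{RCE}},(\hat{\imath}~\hat{\jmath})}\leq\sum_{d=0}^{n}2^{-n\left(1-H(d/n)+2(d/n)\alpha_{p}\right)}.\tag{21}$$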

If $\delta=d/n$ is treated as a continuous variable, then the exponent $E_{2}(\delta)\triangleq 1-H(\delta)+2\delta\alpha_{p}$ is a convex function with a unique minimum at $\delta=\hat{\delta}_{p}$ where

$$\hat{\delta}_{p}\triangleq\frac{4p(1-p)}{1+4p(1-p)}.$$

Therefore, for $0\leq d\leq n$ , we have

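$$2^{-n\left(1-H(d/n)+2(d/n)\alpha_{p}\right)}\leq 2^{-n\left(1-H(\hat{\delta}_{p})+2\hat{\delta}_{p}\alpha_{p}\right)}.\tag{22}$$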

Now, if we define $c_{n}\triangleq\left(\log(n+1)\right)/n$ , then it follows from (21) that

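$$P_{{\mathrm{RCE}},\sigma}\leq 2^{-n\left(1-H(\hat{\delta}_{p})+2\hat{\delta}_{p}\alpha_{p}-c_{n}\right)}.\tag{23}$$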

Further, we have $1-H(\hat{\delta}_{p})+2(\hat{\delta}_{p})\alpha_{p}=R_{1}(p)$ , where

$$R_{1}(p)\triangleq 1-\log\left(1+4p(1-p)\right).\tag{24}$$

Hence, it follows from (23) and (24) that

$$P_{{\mathrm{RCE}},\sigma}\leq 2^{-n\left(R_{1}(p)-c_{n}\right)},\tag{25}$$

where $\sigma$ is a transposition.
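The minimization over $\delta$ in the transposition case can be checked numerically. The sketch below assumes base-2 logarithms, $\alpha_{p}=-\log\sqrt{4p(1-p)}$ , $\hat{\delta}_{p}=4p(1-p)/(1+4p(1-p))$ and $R_{1}(p)=1-\log(1+4p(1-p))$ ; the function names are illustrative only.

```python
from math import log2, sqrt

def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def alpha(p):
    """alpha_p = -log2 sqrt(4 p (1 - p)) (assumed form)."""
    return -log2(sqrt(4 * p * (1 - p)))

def E2(delta, p):
    """Exponent E_2(delta) = 1 - H(delta) + 2 delta alpha_p."""
    return 1 - binary_entropy(delta) + 2 * delta * alpha(p)

def delta_hat(p):
    """Claimed minimizer of E_2."""
    z = 4 * p * (1 - p)
    return z / (1 + z)

def R1(p):
    """Claimed minimum value of E_2."""
    return 1 - log2(1 + 4 * p * (1 - p))

p = 0.1
grid_min = min(E2(d / 1000, p) for d in range(1001))   # coarse grid over [0, 1]
gap_at_minimizer = abs(E2(delta_hat(p), p) - R1(p))
```

On this grid the smallest value of `E2` stays at or above `R1(p)`, and the value at `delta_hat(p)` agrees with `R1(p)` up to floating-point error.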

II-B2 $\sigma$ is a product (composition) of disjoint transpositions

We now consider the case where $\sigma=\sigma_{1}\sigma_{2}$ , where $\sigma_{1}$ and $\sigma_{2}$ are disjoint transpositions with $\sigma_{1}=(i~{}j)$ and $\sigma_{2}=(\hat{\imath}~{}\hat{\jmath})$ . As the codewords in a random codebook are independent, then using (20), we have $\Pr\left\{\{\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j})=d_{1}\}\cap\{\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d_{2}\}\right\}\leq\prod_{i=1}^{2}2^{-n(1-H(d_{i}/n))}$ . Further, if $\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j})=d_{1}$ and $\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d_{2}$ , then $\mathrm{d_{H}}\left(C_{\pi_{0}},C_{\sigma}\right)=2(d_{1}+d_{2})$ , and $\Pr\{\pi_{0}\to\sigma\}\leq 2^{-2(d_{1}+d_{2})\alpha_{p}}$ . Therefore, if $\sigma$ is a product of two disjoint transpositions, then

[TABLE]

In general, when $\sigma$ is a product of $s$ disjoint transpositions, the above argument can be readily extended to show that

[TABLE]

Now, define

[TABLE]

where $R_{0}(p)$ and $R_{1}(p)$ are defined in (12) and (24), respectively. As $2\lambda_{p}\leq R_{1}(p)$ , it follows from (26) that

[TABLE]

We remark that when $\sigma$ is just a transposition, then from (25) we have $P_{{\mathrm{RCE}},\sigma}\leq 2^{-n\left(R_{1}(p)-c_{n}\right)}\leq 2^{-n2\left(\lambda_{p}-c_{n}\right)}$ , which is only a special case of (27) with $s=1$ .

II-B3 $\sigma$ is a $k$ -cycle with $k>2$

Let $\sigma\in S_{m}$ be a $k$ -cycle $(i_{1}~{}i_{2}~{}\cdots~{}i_{k})$ where $i_{l+1}=\sigma(i_{l})$ for $1\leq l\leq k-1$ , and $i_{1}=\sigma(i_{k})$ . We will apply the following proposition towards characterizing $P_{{\mathrm{RCE}},\sigma}$ .

Proposition 1.

Let $\mathbb{F}_{2^{n}}$ denote the space of all $n$ -length binary vectors. Let ${\boldsymbol{c}}_{1},{\boldsymbol{c}}_{2},\ldots,{\boldsymbol{c}}_{k}$ be $k>2$ $\mathrm{i.i.d.}$ random vectors, uniformly distributed over $\mathbb{F}_{2^{n}}$ , and let $d_{1},d_{2},\ldots,d_{k-1}$ be given non-negative integers. Then the following holds

[TABLE]

Proof:

See Appendix A. ∎

For a given codebook $C$ , if $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}$ for $1\leq l\leq k-1$ , and $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k}},{\boldsymbol{c}}_{i_{1}})=d_{k}$ , then $\mathrm{d_{H}}(C_{\pi_{0}},C_{\sigma})=\sum_{l=1}^{k}d_{l}$ , and we have

[TABLE]

Further, if codebook $C$ is uniformly distributed over ${\mathscr{C}}(n,R)$ ,

[TABLE]

where (30) follows from (28). Combining (29) and (30),

[TABLE]

If $\delta=d_{l}/n$ is treated as a continuous variable, then the exponent $E_{1}(\delta)\triangleq 1-H(\delta)+\delta\alpha_{p}$ is a convex function with a unique minimum at $\delta=\tilde{\delta}_{p}$ , where

[TABLE]

We have

[TABLE]

and therefore

[TABLE]

where $c_{n}=\left(\log(n+1)\right)/n$ . Combining (31) and (33),

[TABLE]

As $2k/3\leq k-1$ for $k>2$ , we have $k\lambda_{p}\leq 2kR_{0}(p)/3\leq(k-1)R_{0}(p)$ , and it follows from (34) that

[TABLE]

The above equation has been derived for the case where $\sigma$ is a $k$ -cycle with $k>2$ . However, a transposition is just a $k$ -cycle with $k=2$ , and from the remark following (27), it follows that (35) holds even for $k=2$ .

II-B4 General $\sigma\in S_{m}$ with $\sigma\neq\pi_{0}$

It is well known that any permutation $\sigma\neq\pi_{0}$ can be written as a product (composition) of $t$ disjoint cycles, for $t\geq 1$ [9]. Consider a given $\sigma$ which is a product of $t$ disjoint cycles of length $k_{1},\ldots,k_{t}$ , respectively, where $k_{i}\geq 2$ for $1\leq i\leq t$ . Then, we can extend the result in (35) to obtain

[TABLE]

II-B5 Putting it all together

For $1\leq j\leq m$ , if we define

[TABLE]

then (17) can be equivalently expressed as

[TABLE]

Note that the set $\Sigma_{1}$ is empty, as the Hamming distance between two distinct permutations is at least two. The set $\Sigma_{2}$ consists of all transpositions and $|\Sigma_{2}|=\binom{m}{2}\leq 2^{n(2R)}$ . For all $\sigma\in\Sigma_{2}$ , the value of $P_{{\mathrm{RCE}},\sigma}$ is given by (25), and combining this with (38), we get

[TABLE]

For a given $j>2$ , if $\sigma\in\Sigma_{j}$ , then from (36) it follows that $P_{{\mathrm{RCE}},\sigma}\leq 2^{-nj\left(\lambda_{p}-c_{n}\right)}$ . For $j\geq 2$ , the size of the set $\Sigma_{j}$ satisfies $|\Sigma_{j}|<\prod_{i=0}^{j-1}(m-i)<2^{njR}$ . If we define

[TABLE]

then we have $P_{{\mathrm{RCE}},\Sigma_{j}}\leq\beta_{n}^{j}$ . Now, if $R<\lambda_{p}$ , then because $c_{n}=o(1)$ , there exists $N$ such that for $n\geq N$ , we have $R<\lambda_{p}-c_{n}$ and hence $\beta_{n}<1$ . Therefore, for $n\geq N$ ,

[TABLE]

As $\beta_{n}\to 0$ and $c_{n}\to 0$ when $n\to\infty$ , it follows from (41) that

[TABLE]

Combining (39), (40), and (42), for $R<\lambda_{p}$ ,

[TABLE]

Comparing (17) with (43), we observe that the error probability $\underline{D}(n,R,p)$ is dominated by $P_{{\mathrm{RCE}},\sigma}$ terms for $\sigma$ corresponding to $k$ -cycles with $k=2$ and $k=3$ . The next theorem presents an explicit lower bound for $E_{\underline{D}}(R,p)$ when the decoder jointly decodes all the barcodes using a maximum likelihood approach.
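The counting facts used in this subsection can be verified by brute force for small $m$ . The sketch below assumes that $\Sigma_{j}$ collects the permutations at Hamming distance $j$ from the identity, i.e. those that move exactly $j$ indices, which matches the description of $\Sigma_{1}$ and $\Sigma_{2}$ above; the function names are hypothetical.

```python
from itertools import permutations
from math import comb, prod

def moved_points(sigma):
    """Number of indices i with sigma(i) != i."""
    return sum(1 for i, s in enumerate(sigma) if s != i)

def count_by_distance(m):
    """counts[j] = number of permutations of {0, ..., m-1} moving exactly j indices."""
    counts = [0] * (m + 1)
    for sigma in permutations(range(m)):
        counts[moved_points(sigma)] += 1
    return counts

m = 4
counts = count_by_distance(m)
# Upper bound prod_{i=0}^{j-1} (m - i) on the size of each set, for j >= 2.
bounds = [prod(m - i for i in range(j)) for j in range(m + 1)]
```

For $m=4$ this gives no permutation moving a single index, $\binom{4}{2}=6$ transpositions, 8 three-cycles, and 9 permutations that move all four indices.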

Theorem 2.

We have

$$E_{\underline{D}}(R,p)\geq|\eta_{p}(R)|^{+},\tag{44}$$

where $\eta_{p}(R)\triangleq\min\left\{R_{1}(p)-2R,\,2R_{0}(p)-3R\right\}$ .

Proof:

If $R<\lambda_{p}$ , then $R_{1}(p)\geq 2\lambda_{p}>2R$ . Therefore, from (43) it follows that if $R<\lambda_{p}$ , then $E_{\underline{D}}(R,p)$ is lower bounded by $\min\left\{R_{1}(p)-2R,\ 3\lambda_{p}-3R\right\}=\eta_{p}(R)$ . Further, note that $\eta_{p}(R)>0$ if and only if $R<\lambda_{p}$ . ∎

The following proposition shows that the lower bound (44) (obtained using joint decoding of barcodes) is strictly better than the bound given by (11) (obtained with independent decoding of barcodes) in the interval where it is positive.

Proposition 2.

When $R_{0}(p)>2R$ and $0<p<0.5$ , then we have the strict inequality

[TABLE]

Proof:

When $0<p<0.5$ , we have $0<4p(1-p)<\sqrt{4p(1-p)}<1$ , and hence $R_{1}(p)>R_{0}(p)$ . If $R_{0}(p)>2R$ , then $2R_{0}(p)-3R=2(R_{0}(p)-2R)+R>R_{0}(p)-2R$ . The proof is complete by combining these observations with the definition of $\eta_{p}(R)$ . ∎

Note that $|\eta_{p}(R)|^{+}=0$ for $R\geq 0.5$ , because in this case $\eta_{p}(R)\leq R_{1}(p)-2R\leq R_{1}(p)-1\leq 0$ . In the following section, we present improved lower bounds on $E_{\underline{D}}(R,p)$ by analyzing typical random codebooks.
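The comparison between the two RCE bounds can be tabulated numerically. The sketch below is illustrative and rests on assumptions: base-2 logarithms, $R_{0}(p)=1-\log(1+\sqrt{4p(1-p)})$ , $R_{1}(p)=1-\log(1+4p(1-p))$ , and the reading of the joint-decoding bound as $|\eta_{p}(R)|^{+}$ ; the function names are hypothetical.

```python
from math import log2, sqrt

def R0(p):
    """R_0(p) = 1 - log2(1 + sqrt(4 p (1 - p))) (assumed form)."""
    return 1 - log2(1 + sqrt(4 * p * (1 - p)))

def R1(p):
    """R_1(p) = 1 - log2(1 + 4 p (1 - p)) (assumed form)."""
    return 1 - log2(1 + 4 * p * (1 - p))

def eta(R, p):
    """eta_p(R) = min{R_1(p) - 2R, 2 R_0(p) - 3R}."""
    return min(R1(p) - 2 * R, 2 * R0(p) - 3 * R)

def joint_rce_bound(R, p):
    """|eta_p(R)|^+, the bound with joint decoding (assumed reading)."""
    return max(0.0, eta(R, p))

def independent_rce_bound(R, p):
    """|R_0(p) - 2R|^+, the bound with independent decoding."""
    return max(0.0, R0(p) - 2 * R)

p = 0.1
table = [(R / 100, independent_rce_bound(R / 100, p), joint_rce_bound(R / 100, p))
         for R in range(0, 20, 5)]
```

Each row of `table` holds the rate followed by the independent and joint bounds, so the gap described in Proposition 2 can be read off directly for a chosen $p$ .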

III Typical Random Code

TRCs are known, in general, to provide higher error exponents than RCE over a BSC [7, 10]. Roughly speaking, TRCs are characterized by the property that their relative minimum distance is at least $\delta_{{\mathrm{GV}}}(2R)$ . Formally, for $0\leq R<0.5$ , $0<\epsilon<\delta_{{\mathrm{GV}}}(2R)$ , and indices $1\leq\hat{\imath}<\hat{\jmath}\leq m=2^{nR}$ , the Hamming distance between codewords ${\boldsymbol{c}}_{\hat{\imath}}$ and ${\boldsymbol{c}}_{\hat{\jmath}}$ in a TRC satisfies [7]

[TABLE]

where $\delta=d/n$ , $\overline{\delta}\triangleq\delta_{{\mathrm{GV}}}(2R)+\epsilon$ , and $\underline{\delta}\triangleq\delta_{{\mathrm{GV}}}(2R)-\epsilon$ .

Let ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ denote the set of all codebooks of size $2^{nR}\times n$ , with the property that the Hamming distance between a pair of codewords ${\boldsymbol{c}}_{i}$ and ${\boldsymbol{c}}_{j}$ satisfies the relation $n\underline{\delta}<\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j})<n(1-\underline{\delta})$ for all $i\neq j$ . Note that if codebook $C$ is uniformly distributed over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ , then the Hamming distance between a pair of distinct codewords satisfies (45). It is immediate from (4) that

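$$\underline{D}(n,R,p)\leq\frac{1}{|{\mathscr{C}}_{{\mathrm{TRC}}}(n,R)|}\sum_{C\in{\mathscr{C}}_{{\mathrm{TRC}}}(n,R)}D(C,p,\phi),\tag{46}$$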

where the expression on the right denotes the average performance using TRCs.

In this section we provide lower bounds on the bee-identification exponent $E_{\underline{D}}(R,p)$ using TRCs. The case where each barcode is decoded independently is analyzed in Sec. III-A while joint barcode decoding is analyzed in Sec. III-B. It is shown that these lower bounds on $E_{\underline{D}}(R,p)$ using TRCs outperform the corresponding bounds for RCEs when the rate is smaller than a certain threshold.

III-A Independent Decoding of Barcodes

With independent barcode decoding, the decoder picks $\tilde{{\boldsymbol{c}}}_{j}$ , the $j$ -th row of $\tilde{C}_{\pi}$ , and then assigns $\nu(j)=\operatorname*{arg\,min}_{k}\mathrm{d_{H}}(\tilde{{\boldsymbol{c}}}_{j},{\boldsymbol{c}}_{k})$ , for $1\leq j\leq m$ . From the union bound, we have $D(C,p,\phi)\leq\sum_{j=1}^{m}\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}$ , and using (46) we get

[TABLE]
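As an illustration of the independent decoding rule, the following minimal sketch decodes each received row to its nearest codeword. It assumes the observation model in which row $\pi(i)$ of $\tilde{C}_{\pi}$ is codeword ${\boldsymbol{c}}_{i}$ passed through BSC( $p$ ); the helper names (`observe`, `decode_independent`) and the toy codebook are hypothetical.

```python
import random

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def bsc(word, p, rng):
    """Flip each bit independently with probability p."""
    return [bit ^ int(rng.random() < p) for bit in word]

def observe(codebook, perm, p, rng):
    """Assumed model: row perm[i] of the output is codeword i sent over BSC(p)."""
    noisy = [None] * len(codebook)
    for i, word in enumerate(codebook):
        noisy[perm[i]] = bsc(word, p, rng)
    return noisy

def decode_independent(noisy, codebook):
    """nu[j] = index of the codeword nearest to row j (ties go to the lowest index)."""
    m = len(codebook)
    return [min(range(m), key=lambda k: hamming(row, codebook[k])) for row in noisy]

rng = random.Random(0)
codebook = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0],
            [0, 0, 0, 1, 1, 1], [1, 1, 1, 1, 1, 1]]
perm = [2, 0, 3, 1]                                   # toy channel permutation
inverse = [perm.index(j) for j in range(len(perm))]   # pi^{-1}
noisy = observe(codebook, perm, p=0.0, rng=rng)       # noiseless sanity check
nu = decode_independent(noisy, codebook)
print("decoded correctly:", nu == inverse)
```

Note that nothing forces the output of `decode_independent` to be a permutation: two noisy rows may be mapped to the same codeword, which is one source of the gap to joint decoding.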

Let $P_{{\mathrm{TRC}}}(n,R,p)\triangleq\sum_{C\in{\mathscr{C}}_{{\mathrm{TRC}}}(n,R)}\frac{\Pr\left\{\nu(j)\neq\pi^{-1}(j)\right\}}{|{\mathscr{C}}_{{\mathrm{TRC}}}(n,R)|}$ . Note that $P_{{\mathrm{TRC}}}(n,R,p)$ is independent of the index $j$ due to the symmetry resulting from averaging over codebooks uniformly distributed over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ . For $i=\pi^{-1}(j)$ , the expression for $P_{{\mathrm{TRC}}}(n,R,p)$ corresponds to the probability of error when the $i$ -th codeword is transmitted. From (47), we get

[TABLE]

The following theorem uses (48) to present an explicit lower bound on $E_{\underline{D}}(R,p)$ when the rate is smaller than a certain threshold.

Theorem 3.

We have

[TABLE]

where $\alpha_{p}$ is defined in (18), and

[TABLE]

Proof:

It is known that for $0\leq R<R_{{\mathrm{TRC}}}(p)\leq 0.5$ , the error exponent using a TRC over BSC( $p$ ), defined as $E_{{\mathrm{TRC}}}(R,p)\triangleq\liminf_{n\to\infty}(1/n)\log\left(1/P_{{\mathrm{TRC}}}(n,R,p)\right)$ , is given by [7]

[TABLE]

Using the fact that $m=2^{nR}$ , and combining (5) and (48) with the definition of $E_{{\mathrm{TRC}}}(R,p)$ , we get

[TABLE]

The proof is completed by applying (51) in (52). ∎

It is well known that $E_{{\mathrm{TRC}}}(R,p)>E_{\mathrm{r}}(R,p)$ for $0\leq R<R_{{\mathrm{TRC}}}(p)$ [7]. This implies that the lower bound on $E_{\underline{D}}(R,p)$ for TRC given by (49) is strictly better than the corresponding bound for RCE given by (11) when $0\leq R<R_{{\mathrm{TRC}}}(p)$ . The next subsection provides a more refined bound on $E_{\underline{D}}(R,p)$ by analyzing joint decoding of barcodes using TRCs.

III-B Joint Decoding of Barcodes

With joint barcode decoding, the decoder takes the noisy row-permuted codebook $\tilde{C}_{\pi}$ as input, and produces the permutation $\nu=\rho^{-1}$ as output, where $\rho=\operatorname*{arg\,min}_{\sigma\in S_{m}}\mathrm{d_{H}}(\tilde{C}_{\pi},C_{\sigma})$ . As in Sec. II-B, we assume, without loss of generality, that the permutation induced by the channel is the identity permutation $\pi_{0}$ . For a given codebook $C$ , we have $D(C,p,\phi)\leq\sum_{\sigma\in S_{m},\sigma\neq\pi_{0}}\Pr\{\pi_{0}\to\sigma\}$ . If we define

[TABLE]

where the expectation is over a uniform distribution of the codebook over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ , then we have

[TABLE]

In the following, we quantify $P_{{\mathrm{TRC}},\sigma}$ for different $\sigma\in S_{m}$ , in order to bound $\underline{D}(n,R,p)$ via (54).
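The joint decoding rule stated at the start of this subsection can be sketched by exhaustive search for very small $m$ . The sketch assumes that row $\sigma(i)$ of $C_{\sigma}$ is codeword ${\boldsymbol{c}}_{i}$ , so that minimizing $\mathrm{d_{H}}(\tilde{C}_{\pi},C_{\sigma})$ over $\sigma$ and returning $\nu=\rho^{-1}$ is the same as searching directly for the permutation $\nu$ minimizing $\sum_{j}\mathrm{d_{H}}(\tilde{{\boldsymbol{c}}}_{j},{\boldsymbol{c}}_{\nu(j)})$ ; the function name and the toy data are hypothetical.

```python
from itertools import permutations

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def decode_joint(noisy, codebook):
    """Exhaustive search over all m! permutations (only feasible for tiny m).

    Returns nu with nu[j] = codeword index assigned to received row j.
    """
    m = len(codebook)
    dist = [[hamming(noisy[j], codebook[k]) for k in range(m)] for j in range(m)]
    best = min(permutations(range(m)),
               key=lambda nu: sum(dist[j][nu[j]] for j in range(m)))
    return list(best)

# Toy case: row 0 came from codeword 1 with three bit flips, row 1 from codeword 0.
codebook = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
noisy = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 0, 0]]
print(decode_joint(noisy, codebook))
```

In this toy case row 0 is equidistant from both codewords, so a row-by-row nearest-codeword rule cannot resolve it on its own, whereas the permutation constraint does: codeword 0 is already the exact match for row 1.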

III-B1 $\sigma$ is a transposition

If $\sigma=(\hat{\imath}~{}\hat{\jmath})$ is the permutation that only transposes indices $\hat{\imath}$ and $\hat{\jmath}$ , and $\mathrm{d_{H}}({\boldsymbol{c}}_{\hat{\imath}},{\boldsymbol{c}}_{\hat{\jmath}})=d$ , then $\mathrm{d_{H}}\left(C_{\pi_{0}},C_{(\hat{\imath}~{}\hat{\jmath})}\right)=2d$ , and we have

[TABLE]

When $C$ is uniformly distributed over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ and $n\underline{\delta}\leq d\leq n(1-\underline{\delta})$ , then

[TABLE]

where (56) follows from (45). Combining (53), (55), and (56), we get

[TABLE]

If $\delta=d/n$ is treated as a continuous variable, then the exponent $E_{2}(\delta)=1-H(\delta)+2\delta\alpha_{p}$ is a convex function of $\delta$ with a unique minimum at $\hat{\delta}_{p}$ defined in (22). If we define

[TABLE]

then for $0\leq R<\hat{R}_{p}$ , we have

[TABLE]

The exponent $E_{2}(\delta)$ increases monotonically in $\delta$ for $\delta\geq\hat{\delta}_{p}$ . Therefore, if $0\leq R<\hat{R}_{p}$ and $\epsilon<\delta_{{\mathrm{GV}}}(2R)-\hat{\delta}_{p}$ , the exponent in (57) is minimized for $d=n\underline{\delta}$ , and we have

[TABLE]

where $c_{n}=\left(\log(n+1)\right)/n$ .
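The convexity and monotonicity claims for $E_{2}(\delta)$ can be checked numerically. The sketch below assumes $\alpha_{p}=-\log_{2}\big(2\sqrt{p(1-p)}\big)$ , which is not restated in this section but reproduces the value $\alpha_{p}=2.33$ quoted for $p=0.01$ in Sec. V. Under that assumption, setting the derivative of $E_{2}$ to zero gives the stationary point $x/(1+x)$ with $x=4p(1-p)$ ; this closed form is derived here and should be compared against (22) rather than taken as its definition.

```python
import math

def binary_entropy(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def alpha(p):
    """Assumed form of alpha_p (see lead-in): -log2(2*sqrt(p(1-p)))."""
    return -math.log2(2.0 * math.sqrt(p * (1.0 - p)))

def e2(delta, p):
    """E_2(delta) = 1 - H(delta) + 2*delta*alpha_p, as defined in the text."""
    return 1.0 - binary_entropy(delta) + 2.0 * delta * alpha(p)

def argmin_convex(f, lo=0.0, hi=1.0, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

p = 0.01
delta_hat_numeric = argmin_convex(lambda d: e2(d, p))
x = 4.0 * p * (1.0 - p)
delta_hat_closed = x / (1.0 + x)  # stationary point under the assumed alpha_p
print(delta_hat_numeric, delta_hat_closed)
```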

III-B2 $\sigma$ is a $k$ -cycle

We now consider the case where $\sigma$ is a $k$ -cycle with $k\geq 3$ . We will apply the following proposition towards characterizing $P_{{\mathrm{TRC}},\sigma}$ .

Proposition 3.

Let ${\boldsymbol{c}}_{i_{1}},{\boldsymbol{c}}_{i_{2}},\ldots,{\boldsymbol{c}}_{i_{k}}$ be $k$ distinct rows in codebook $C$ , and let $d_{l}$ satisfy $n\underline{\delta}\leq d_{l}\leq n\left(1-\underline{\delta}\right)$ for $1\leq l\leq k-1$ . Let $Q_{{\mathrm{TRC}}}\left\{\bigcap_{l=1}^{k-1}\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\right\}\right\}$ denote the probability $\Pr\left\{\bigcap_{l=1}^{k-1}\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\right\}\right\}$ when $C$ is uniformly distributed over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ . Then, we have

[TABLE]

where

[TABLE]

and $Q_{{\mathrm{RCE}}}\left\{\bigcap_{i=1}^{m}\{{\boldsymbol{c}}_{i}=\gamma_{i}\}\right\}$ denotes the probability $\Pr\left\{\bigcap_{i=1}^{m}\{{\boldsymbol{c}}_{i}=\gamma_{i}\}\right\}$ when $C$ is uniformly distributed over ${\mathscr{C}}(n,R)$ .

Proof:

See Appendix B. ∎

Now, given that $\sigma=(i_{1}~{}i_{2}~{}\cdots~{}i_{k})$ and $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}$ for $1\leq l\leq k-1$ , and $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k}},{\boldsymbol{c}}_{i_{1}})=d_{k}$ , we have $\mathrm{d_{H}}(C_{\pi_{0}},C_{\sigma})=\sum_{l=1}^{k}d_{l}$ , and therefore

[TABLE]
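The identities $\mathrm{d_{H}}(C_{\pi_{0}},C_{(\hat{\imath}~{}\hat{\jmath})})=2d$ for a transposition and $\mathrm{d_{H}}(C_{\pi_{0}},C_{\sigma})=\sum_{l=1}^{k}d_{l}$ for a $k$ -cycle can be confirmed on a random codebook. The sketch assumes that row $\sigma(i)$ of $C_{\sigma}$ equals row $i$ of $C$ ; the opposite convention gives the same sums. All helper names are hypothetical.

```python
import random

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def cycle(m, indices):
    """Permutation of range(m) sending indices[l] to indices[l+1], cyclically."""
    sigma = list(range(m))
    for l, i in enumerate(indices):
        sigma[i] = indices[(l + 1) % len(indices)]
    return sigma

def permute_rows(codebook, sigma):
    """Assumed convention: row sigma[i] of the result is row i of the codebook."""
    out = [None] * len(codebook)
    for i, row in enumerate(codebook):
        out[sigma[i]] = row
    return out

def distance_after_cycle(codebook, indices):
    """d_H between the codebook and its row-permuted version, summed over rows."""
    permuted = permute_rows(codebook, cycle(len(codebook), indices))
    return sum(hamming(x, y) for x, y in zip(codebook, permuted))

rng = random.Random(1)
m, n = 6, 16
codebook = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]

idx = [4, 0, 2]  # the 3-cycle (i_1 i_2 i_3) = (4 0 2)
d = [hamming(codebook[idx[l]], codebook[idx[(l + 1) % 3]]) for l in range(3)]
print(distance_after_cycle(codebook, idx), "==", sum(d))
```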

If $d_{0}\triangleq n\underline{\delta}$ , then combining (60) and (62), we get

[TABLE]

where, for $1\leq l\leq k-1$ , we have

[TABLE]

The function $E_{1}(\delta)=1-H(\delta)+\delta\alpha_{p}$ is a convex function of $\delta$ , and has a unique minimum that occurs at $\tilde{\delta}_{p}$ defined in (32). From (50) we observe that $R_{{\mathrm{TRC}}}(p)=0.5(1-H(\tilde{\delta}_{p}))$ . Thus, if $R<R_{{\mathrm{TRC}}}(p)$ , then we have $\delta_{{\mathrm{GV}}}(2R)>\tilde{\delta}_{p}$ . Further, $E_{1}(\delta)$ is an increasing function of $\delta$ for $\delta\geq\tilde{\delta}_{p}$ , and so if $R<R_{{\mathrm{TRC}}}(p)$ and $\epsilon<\delta_{{\mathrm{GV}}}(2R)-\tilde{\delta}_{p}$ , the exponent in (64) is minimized when $d_{l}=d_{0}=n\underline{\delta}$ . Thus, we have

[TABLE]

Combining (63), (65), and (66), for $0\leq R<R_{{\mathrm{TRC}}}(p)$ ,

[TABLE]

where $\sigma$ is a $k$ -cycle with $k>2$ . As $k<2(k-1)$ for $k>2$ , it follows from (67) that

[TABLE]

Recall that $\hat{\delta}_{p}$ and $\hat{R}_{p}$ are given by (22) and (58), respectively. As $x/(1+x)$ is an increasing function of $x$ , and $0<p<0.5$ , it follows that $\hat{\delta}_{p}<\tilde{\delta}_{p}<0.5$ , which implies that $R_{{\mathrm{TRC}}}(p)<\hat{R}_{p}$ . Note that a transposition is simply a $k$ -cycle with $k=2$ , and comparing (59) with (68) we observe that the relation given by (68) holds even when $k=2$ .
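The ordering $\hat{\delta}_{p}<\tilde{\delta}_{p}<0.5$ and $R_{{\mathrm{TRC}}}(p)<\hat{R}_{p}$ can be tabulated for a few crossover probabilities. This is a sketch under stated assumptions: $\alpha_{p}=-\log_{2}\big(2\sqrt{p(1-p)}\big)$ , so that the minimizers of $E_{2}$ and $E_{1}$ are $x/(1+x)$ with $x=4p(1-p)$ and $x=2\sqrt{p(1-p)}$ , respectively, and $\hat{R}_{p}=0.5(1-H(\hat{\delta}_{p}))$ by analogy with the expression for $R_{{\mathrm{TRC}}}(p)$ given above. Definitions (22), (32), and (58) are not reproduced in this section, so these closed forms should be checked against them.

```python
import math

def binary_entropy(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def hat_delta(p):
    """Assumed minimizer of E_2: x/(1+x) with x = 4p(1-p)."""
    x = 4.0 * p * (1.0 - p)
    return x / (1.0 + x)

def tilde_delta(p):
    """Assumed minimizer of E_1: x/(1+x) with x = 2*sqrt(p(1-p))."""
    x = 2.0 * math.sqrt(p * (1.0 - p))
    return x / (1.0 + x)

def r_trc(p):
    """R_TRC(p) = 0.5 * (1 - H(tilde_delta_p)), as stated in the text."""
    return 0.5 * (1.0 - binary_entropy(tilde_delta(p)))

def r_hat(p):
    """Assumed form of (58), by analogy: 0.5 * (1 - H(hat_delta_p))."""
    return 0.5 * (1.0 - binary_entropy(hat_delta(p)))

for p in (0.001, 0.01, 0.05, 0.1, 0.2):
    print(f"p={p}: hat_delta={hat_delta(p):.4f} tilde_delta={tilde_delta(p):.4f} "
          f"R_TRC={r_trc(p):.4f} R_hat={r_hat(p):.4f}")
```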

III-B3 $\sigma$ is a product (composition) of two disjoint cycles

We now consider the case where $\sigma=\sigma_{1}\sigma_{2}$ , where $\sigma_{1}$ and $\sigma_{2}$ are disjoint cycles of length $k_{1}$ and $k_{2}$ , respectively. Let $\sigma_{1}=(i_{1}~{}i_{2}~{}\cdots~{}i_{k_{1}})$ and $\sigma_{2}=(i_{k_{1}+1}~{}i_{k_{1}+2}~{}\cdots~{}i_{k_{1}+k_{2}})$ . If $d_{0}\leq d_{l}\leq n-d_{0}$ for $1\leq l\leq k_{1}+k_{2}$ , then a straightforward extension of Prop. 3 shows that the probability $\Pr\Big\{\bigcap_{l=1}^{k_{1}-1}\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\}\bigcap\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k_{1}}},{\boldsymbol{c}}_{i_{1}})=d_{k_{1}}\right\}\bigcap_{l=k_{1}+1}^{k_{1}+k_{2}-1}\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\right\}\bigcap\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k_{1}+k_{2}}},{\boldsymbol{c}}_{i_{k_{1}+1}})=d_{k_{1}+k_{2}}\right\}\Big\}$ is upper bounded by

[TABLE]

Further, for a given codebook $C$ , with $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l},\,1\leq l\leq k_{1}-1$ , $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k_{1}}},{\boldsymbol{c}}_{i_{1}})=d_{k_{1}}$ , $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l},\,k_{1}+1\leq l\leq k_{1}+k_{2}-1$ , $\mathrm{d_{H}}({\boldsymbol{c}}_{i_{k_{1}+k_{2}}},{\boldsymbol{c}}_{i_{k_{1}+1}})=d_{k_{1}+k_{2}}$ , we have $\mathrm{d_{H}}(C_{\pi_{0}},C_{\sigma})=\sum_{l=1}^{k_{1}+k_{2}}d_{l}$ , and therefore

[TABLE]

Combining (69) and (70), we can upper bound $P_{{\mathrm{TRC}},\sigma}$ by

[TABLE]

The above expression can be equivalently written as

[TABLE]

where $\zeta_{l}$ and $\eta_{k}$ are given by (64) and (65), respectively. Now, applying (65), (66) in (72) for $0\leq R<R_{{\mathrm{TRC}}}(p)$ , we get

[TABLE]

where $\sigma=(i_{1}~{}i_{2}~{}\cdots~{}i_{k_{1}})(i_{k_{1}+1}~{}i_{k_{1}+2}~{}\cdots~{}i_{k_{1}+k_{2}})$ . As $k_{1}\geq 2$ and $k_{2}\geq 2$ , we have $2(k_{1}+k_{2}-2)\geq k_{1}+k_{2}$ , and therefore for $~{}0\leq R<R_{{\mathrm{TRC}}}(p)$ , we have

[TABLE]

III-B4 General $\sigma\in S_{m}$ with $\sigma\neq\pi_{0}$

If permutation $\sigma$ is a product of $r$ disjoint cycles of length $k_{1},\ldots,k_{r}$ , respectively, then similar to (68), (74), we have for $0\leq R<R_{{\mathrm{TRC}}}(p)$ ,

[TABLE]
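The inequalities $k<2(k-1)$ and $2(k_{1}+k_{2}-2)\geq k_{1}+k_{2}$ used in the previous subsections are the cases $r=1$ and $r=2$ of one counting fact, which also covers the general case. As a short worked step, writing $s=\sum_{t=1}^{r}k_{t}$ for the number of indices moved by $\sigma$ (the symbol $s$ is borrowed from Sec. III-B5):

```latex
% Every nontrivial cycle has length k_t >= 2, hence s >= 2r.
\sum_{t=1}^{r} 2\,(k_{t}-1) \;=\; 2s-2r \;\geq\; 2s-s \;=\; s \;=\; \sum_{t=1}^{r} k_{t},
\qquad\text{since } k_{t}\geq 2 \;\Rightarrow\; s\geq 2r .
```

Equality holds exactly when every cycle is a transposition, which suggests why products of transpositions are the dominant error events in the union bound.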

III-B5 Putting it all together

For $1\leq j\leq m$ , if we define $P_{{\mathrm{TRC}},\Sigma_{j}}\triangleq\sum_{\sigma\in\Sigma_{j}}P_{{\mathrm{TRC}},\sigma}$ , where $\Sigma_{j}$ is given by (37), then (54) can be equivalently expressed as

[TABLE]

If $\sigma$ is a product of $r$ disjoint cycles of length $k_{1},\ldots,k_{r}$ , respectively, and $s=\sum_{t=1}^{r}k_{t}$ , then $\sigma$ belongs to the set $\Sigma_{s}$ , and $P_{{\mathrm{TRC}},\sigma}$ is given by (75). Equivalently, for a given $j\geq 2$ , if $\sigma\in S_{m}$ belongs to the set $\Sigma_{j}$ , then for $0\leq R<R_{{\mathrm{TRC}}}(p)$ ,

[TABLE]

The size of $\Sigma_{j}$ satisfies $|\Sigma_{j}|<\prod_{i=0}^{j-1}(m-i)<2^{njR}$ . Therefore, for $0\leq R<R_{{\mathrm{TRC}}}(p)$ , we have

[TABLE]
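The bound $|\Sigma_{j}|<\prod_{i=0}^{j-1}(m-i)<2^{njR}=m^{j}$ can be checked by brute force for a small $m$ . The sketch assumes that $\Sigma_{j}$ , defined in (37), is the set of permutations in $S_{m}$ that move exactly $j$ indices, which is consistent with the statement above that a product of disjoint cycles of total length $s$ belongs to $\Sigma_{s}$ .

```python
from itertools import permutations
from math import factorial, prod

def moved_points_histogram(m):
    """hist[j] = number of permutations of range(m) with exactly j non-fixed points."""
    hist = [0] * (m + 1)
    for sigma in permutations(range(m)):
        hist[sum(sigma[i] != i for i in range(m))] += 1
    return hist

def falling_factorial(m, j):
    """m (m-1) ... (m-j+1)."""
    return prod(m - i for i in range(j))

m = 6
hist = moved_points_histogram(m)
for j in range(2, m + 1):
    print(j, hist[j], falling_factorial(m, j), m ** j)
```

Each printed row lists $j$ , the assumed $|\Sigma_{j}|$ , the falling factorial, and $m^{j}$ , in that order.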

Now, if we define $\beta_{n}\triangleq 2^{-n\left(0.5(1-H(\underline{\delta}))-R+\underline{\delta}\alpha_{p}-c_{n}\right)}$ , then (78) can be equivalently expressed as $P_{{\mathrm{TRC}},\Sigma_{j}}\leq(1/\alpha_{n})\beta_{n}^{j}$ . As $c_{n}=o(1)$ , there exists $\hat{N}$ such that for $n\geq\hat{N}$ , we have $c_{n}<0.5(1-H(\underline{\delta}))-R+\underline{\delta}\alpha_{p}$ and hence $\beta_{n}<1$ . Therefore, for $n\geq\hat{N}$ and $0\leq R<R_{{\mathrm{TRC}}}(p)$ , we have

[TABLE]

where (79) follows because $\alpha_{n}\to 1$ as $n\to\infty$ [7], (80) follows because $\beta_{n}=o(1)$ , and (81) follows because $c_{n}=o(1)$ . Note that $\underline{\delta}=\delta_{{\mathrm{GV}}}(2R)-\epsilon$ , and so $\lim_{\epsilon\to 0}\underline{\delta}=\delta_{{\mathrm{GV}}}(2R)$ and $\lim_{\epsilon\to 0}\left(1-H(\underline{\delta})-2R+2\underline{\delta}\alpha_{p}\right)=2\delta_{{\mathrm{GV}}}(2R)\alpha_{p}$ . As $\epsilon$ can be made arbitrarily small, it follows from (81) that for $0\leq R<R_{{\mathrm{TRC}}}(p)$ , we have

[TABLE]
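The limit $\lim_{\epsilon\to 0}\left(1-H(\underline{\delta})-2R+2\underline{\delta}\alpha_{p}\right)=2\delta_{{\mathrm{GV}}}(2R)\alpha_{p}$ stated above can be observed numerically. The sketch assumes the standard definition of $\delta_{{\mathrm{GV}}}$ (solution of $H(\delta)=1-R$ on $[0,0.5]$ ) and $\alpha_{p}=-\log_{2}\big(2\sqrt{p(1-p)}\big)$ ; the function names are illustrative.

```python
import math

def binary_entropy(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def delta_gv(rate, iters=200):
    """Assumed standard GV distance: solve H(delta) = 1 - rate on [0, 0.5]."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < 1.0 - rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def alpha(p):
    """Assumed form of alpha_p: -log2(2*sqrt(p(1-p)))."""
    return -math.log2(2.0 * math.sqrt(p * (1.0 - p)))

def exponent_with_slack(R, p, eps):
    """1 - H(d) - 2R + 2*d*alpha_p evaluated at d = delta_GV(2R) - eps."""
    d = delta_gv(2.0 * R) - eps
    return 1.0 - binary_entropy(d) - 2.0 * R + 2.0 * d * alpha(p)

R, p = 0.1, 0.01
limit = 2.0 * delta_gv(2.0 * R) * alpha(p)
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, exponent_with_slack(R, p, eps), limit)
```

For rates below $R_{{\mathrm{TRC}}}(p)$ the values approach the limit from below as $\epsilon$ shrinks, in line with $E_{2}(\delta)$ being increasing on the relevant range.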

The following theorem encapsulates the main result of this subsection on bounding the bee-identification exponent, $E_{\underline{D}}(R,p)$ , using joint decoding for TRC.

Theorem 4.

We have

[TABLE]

Proof:

Follows from (5) and (82). ∎

We note that the above lower bound for $E_{\underline{D}}(R,p)$ using TRCs with joint barcode decoding is twice the corresponding bound obtained using independent barcode decoding (see (49)). The following proposition shows that the lower bound given by Thm. 4 using TRC is strictly better than the corresponding bound using RCE (see Thm. 2) for $0\leq R<R_{{\mathrm{TRC}}}(p)$ .

Proposition 4.

The lower bound on $E_{\underline{D}}(R,p)$ in (83) obtained for TRC is strictly better than the corresponding bound in (44) obtained for RCE when $0\leq R<R_{{\mathrm{TRC}}}(p)$ .

Proof:

It is known that $E_{{\mathrm{TRC}}}(R,p)>E_{\mathrm{r}}(R,p)$ when $0\leq R<R_{{\mathrm{TRC}}}(p)$ [7]. Further, using explicit numerical computation, it can be shown that $2R_{0}(p)\geq R_{1}(p)+2R_{{\mathrm{TRC}}}(p)$ . Therefore, it follows that for $0\leq R<R_{{\mathrm{TRC}}}(p)$ , we have

[TABLE]

∎

The next section presents an explicit upper bound for $E_{\underline{D}}(R,p)$ which applies to all possible codebook designs.

IV Upper Bound on the Bee-Identification Exponent

This section presents an upper bound on the bee-identification exponent $E_{\underline{D}}(R,p)$ . Towards this, we define the following optimum minimum distance metrics

[TABLE]

For any given codebook $C\in\mathscr{C}(n,R)$ , we show that there exists a set $\mathscr{I}_{C}$ consisting of pairs of codeword indices $(i,j)$ , $i\neq j$ , with the following properties:

(i) If $(i,j)\in\mathscr{I}_{C}$ , then $\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j})\leq d^{*}(n,R-\frac{1}{n})$ .

(ii) If $(i,j)$ and $(\hat{\imath},\hat{\jmath})$ are distinct pairs in $\mathscr{I}_{C}$ , then $\hat{\imath}\neq i,\hat{\imath}\neq j$ and $\hat{\jmath}\neq i,\hat{\jmath}\neq j$ .

(iii) The size of the set $\mathscr{I}_{C}$ is at least $m/4$ .

A set satisfying the above properties can be constructed iteratively as follows.

• Step 1: For a given codebook $C\in\mathscr{C}(n,R)$ , initialize $\mathscr{I}_{C}$ to be the empty set and let $\mathcal{T}=C$ .

• Step 2: As $\mathcal{T}$ contains at least $m/2$ codewords, there exist ${\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j}\in\mathcal{T}$ , with $i\neq j$ , satisfying $\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j})\leq d^{*}(n,R-\frac{1}{n})$ . Add the pair $(i,j)$ to $\mathscr{I}_{C}$ , and let $\mathcal{T}=\mathcal{T}\setminus\{{\boldsymbol{c}}_{i},{\boldsymbol{c}}_{j}\}$ .

• Step 3: If $|\mathscr{I}_{C}|<m/4$ , then go to Step 2, else stop.
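The steps above can be sketched as a greedy procedure. Step 2 only asserts that a sufficiently close pair exists; the sketch below makes the specific (assumed) choice of removing a closest remaining pair each time, which meets the distance requirement whenever some pair does. The function name and the random test codebook are hypothetical.

```python
import random
from itertools import combinations

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def build_pairs(codebook):
    """Greedy construction of disjoint index pairs, at least m/4 of them."""
    m = len(codebook)
    remaining = set(range(m))   # Step 1: T = C, pair set empty
    pairs = []
    while 4 * len(pairs) < m and len(remaining) >= 2:   # Step 3 stopping rule
        # Step 2 (assumed choice): take a closest pair among remaining codewords.
        i, j = min(combinations(sorted(remaining), 2),
                   key=lambda ij: hamming(codebook[ij[0]], codebook[ij[1]]))
        pairs.append((i, j))
        remaining -= {i, j}
    return pairs

rng = random.Random(7)
m, n = 16, 12
codebook = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]
pairs = build_pairs(codebook)
print(pairs)
```

Because at most $m/4$ pairs are removed before the procedure stops, at least $m/2$ codewords remain available at every visit to Step 2, which is what the existence argument requires.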

Let the receiver employ ML decoding, and interpret each pair $(i,j)\in\mathscr{I}_{C}$ as a transposition $\sigma=(i~{}j)$ that interchanges indices $i$ and $j$ . Let $A_{(i,j)}$ denote the error event that the receiver incorrectly decodes the channel induced permutation to transposition $(i~{}j)$ (instead of the identity permutation $\pi_{0}$ ), i.e. $A_{(i,j)}=\{\pi_{0}\to(i~{}j)\}$ . Then, the bee-identification error probability $D(C,p,\phi)$ can be lower bounded as

[TABLE]

Using de Caen’s lower bound on the probability of a union [11], the expression on the right side in (84) can itself be lower bounded by

[TABLE]

where $(a)$ follows because events $A_{(i,j)}$ and $A_{(\hat{\imath},\hat{\jmath})}$ are independent when $(\hat{\imath},\hat{\jmath})\neq(i,j)$ . Now

[TABLE]

where $(b)$ follows from the fact that $\mathrm{d_{H}}(C_{\pi_{0}},C_{(i,j)})\leq 2\,d^{*}(n,R-\frac{1}{n})$ for $(i,j)\in\mathscr{I}_{C}$ , and $(c)$ follows because $|\mathscr{I}_{C}|\geq m/4$ . If $R_{{\mathrm{UB}}}(p)\triangleq\sup\{R:2\delta^{*}(R)\alpha_{p}>R\}$ , then combining (84), (85), (86), and noting that $x/(1+x)$ increases with $x$ , we have

[TABLE]

As (87) is true for all $C\in\mathscr{C}(n,R)$ , we have

[TABLE]
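De Caen's inequality, invoked above, states that $\Pr\{\bigcup_{i}A_{i}\}\geq\sum_{i}\Pr\{A_{i}\}^{2}/\sum_{j}\Pr\{A_{i}\cap A_{j}\}$ , where the inner sum includes $j=i$ . When the events are pairwise independent, each term reduces to $\Pr\{A_{i}\}/(1+\sum_{j\neq i}\Pr\{A_{j}\})$ , which is presumably the form behind step $(a)$ . The following sketch checks the general inequality with exact arithmetic on a toy uniform probability space; the events are arbitrary and unrelated to any codebook.

```python
import random
from fractions import Fraction

def de_caen_lower_bound(events, omega_size):
    """De Caen's lower bound on Pr(union) for events given as sets of outcomes.

    Outcomes are assumed equally likely on a space of size omega_size.
    """
    def prob(s):
        return Fraction(len(s), omega_size)

    total = Fraction(0)
    for a in events:
        if not a:
            continue  # empty events contribute nothing
        denom = sum(prob(a & b) for b in events)  # includes b == a
        total += prob(a) ** 2 / denom
    return total

rng = random.Random(3)
omega_size = 60
events = [set(rng.sample(range(omega_size), 10)) for _ in range(8)]
union_prob = Fraction(len(set().union(*events)), omega_size)
bound = de_caen_lower_bound(events, omega_size)
print(float(bound), "<=", float(union_prob))
```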

The value $\delta^{*}(R)$ can be upper bounded as [12, 13]

[TABLE]

The following theorem provides an upper bound on the bee-identification exponent $E_{\underline{D}}(R,p)$ .

Theorem 5.

We have

[TABLE]

Proof:

Follows immediately from (88) and (89). ∎

The following corollary shows that $E_{\underline{D}}(R,p)$ can be explicitly characterized with a rather simple expression when rate $R$ tends to zero.

Corollary 1.

We have

[TABLE]

Proof:

As $\lim_{R\to 0}\delta_{{\mathrm{LP}}}(R)=0.5$ , we have from (90) that

[TABLE]

On the other hand, we have $\lim_{R\to 0}\delta_{{\mathrm{GV}}}(R)=0.5$ and so it follows from (83) that

[TABLE]

The proof is completed by using (92) and (93). ∎

The above corollary shows that the lower bound on $E_{\underline{D}}(R,p)$ given by (83), and the upper bound on $E_{\underline{D}}(R,p)$ given by (90) become tight as $R\to 0$ .
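The two limits used in the proof, $\lim_{R\to 0}\delta_{{\mathrm{GV}}}(R)=0.5$ and $\lim_{R\to 0}\delta_{{\mathrm{LP}}}(R)=0.5$ , can be observed numerically. This sketch rests on two assumptions that should be checked against (89) and the cited references: $\delta_{{\mathrm{GV}}}$ is the standard Gilbert-Varshamov distance, and $\delta_{{\mathrm{LP}}}(R)=\frac{1}{2}-\sqrt{h(1-h)}$ with $h=H^{-1}(R)$ , i.e. the first linear-programming (MRRW) bound.

```python
import math

def binary_entropy(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def entropy_inverse(y, iters=200):
    """Solution h in [0, 0.5] of H(h) = y, by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def delta_gv(rate):
    """Assumed standard GV relative distance."""
    return entropy_inverse(1.0 - rate)

def delta_lp(rate):
    """Assumed form of delta_LP: first MRRW linear-programming bound."""
    h = entropy_inverse(rate)
    return 0.5 - math.sqrt(h * (1.0 - h))

for R in (1e-1, 1e-2, 1e-4, 1e-6):
    print(R, 2.0 * delta_gv(2.0 * R), 2.0 * delta_lp(R))
```

Both printed columns tend to 1 as $R\to 0$ , so that twice either relative distance multiplied by $\alpha_{p}$ tends to $\alpha_{p}$ .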

V A Numerical Example

Fig. 3 plots different bounds for the bee-identification exponent $E_{\underline{D}}(R,p)$ . The explicit lower bound for RCE with independent decoding (ID) (respectively, joint decoding (JD)) is given by (11) (respectively, (44)). The performance with JD is seen to be much better than with ID. When $0\leq R<R_{{\mathrm{TRC}}}(p)$ , the explicit lower bound for TRC with ID (respectively, JD) is given by (49) (respectively, (83)). As shown in Prop. 4, the lower bound obtained using TRC with joint decoding is better than the corresponding bound using RCE. The upper bound is given by (90) and holds for all possible codebook designs. Further, as shown in Cor. 1, it is observed from Fig. 3 that $\lim_{R\to 0}E_{\underline{D}}(R,p)=\alpha_{p}=2.33$ for $p=0.01$ .
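The quoted value $\alpha_{p}=2.33$ at $p=0.01$ can be reproduced with a one-line computation, assuming $\alpha_{p}=-\log_{2}\big(2\sqrt{p(1-p)}\big)$ ; definition (18) is not restated in this section, so this form is an assumption that happens to match the quoted number.

```python
import math

def alpha(p):
    """Assumed form of alpha_p: -log2(2*sqrt(p(1-p)))."""
    return -math.log2(2.0 * math.sqrt(p * (1.0 - p)))

print(round(alpha(0.01), 2))  # prints 2.33
```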

VI Discussion

We introduced the information-theoretic “bee-identification problem” which arises naturally in different massive identification settings. We derived explicit upper and lower bounds on the bee-identification exponent, and showed that joint decoding of barcodes provides a significantly better exponent than separate decoding followed by permutation inference. For low rates, we showed that the lower bound on the bee-identification exponent obtained using TRC is strictly better than the corresponding bound obtained using RCE. Moreover, when the rate approaches zero, we showed that the upper bound on the bee-identification exponent coincides with the lower bound obtained using TRC with joint barcode decoding.

Relative to independent decoding of barcodes, the performance improvement with joint decoding comes at the cost of increased computational complexity. For joint decoding, an exhaustive search entails comparing the received noisy and permuted version of the codebook with all $m!$ row-permutations of the codebook. This may be computationally prohibitive even for moderate values of blocklength $n$ when $m$ scales exponentially with $n$ . In practice, intermediate performance between the extremes of independent decoding and joint decoding may be achieved with manageable complexity using ideas from generalized minimum distance decoding [14]. In particular, the decoding process may proceed in two steps. The first step involves independent decoding of each barcode, where an erasure is declared if the distance from the received noisy barcode to the nearest barcode in the codebook exceeds a threshold. The second step fixes the codebook row-indices corresponding to the un-erased barcodes, and then decodes the erased barcodes by jointly comparing their received noisy versions to different row-permutations of the codebook corresponding to the non-fixed indices. This results in a significant reduction in complexity when only a few barcodes are declared as erasures in the first step. Therefore, we have a tradeoff between performance and complexity via an appropriate choice of the distance threshold parameter for declaring an erasure.
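The two-step procedure described above can be sketched as follows. This is one possible reading rather than a specified algorithm: the text does not say what to do when two un-erased rows decode to the same codeword, so the sketch makes the additional assumption that such colliding rows are also treated as erasures. The function name, the threshold value, and the toy data are hypothetical.

```python
from itertools import permutations

def hamming(a, b):
    """Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b))

def decode_two_step(noisy, codebook, threshold):
    """Return (nu, number_of_erasures); nu[j] is the codeword index for row j."""
    m = len(codebook)
    dist = [[hamming(noisy[j], codebook[k]) for k in range(m)] for j in range(m)]

    # Step 1: independent decoding; erase rows whose nearest codeword is too far.
    guess = {}
    for j in range(m):
        k = min(range(m), key=lambda c: dist[j][c])
        if dist[j][k] <= threshold:
            guess[j] = k
    # Assumption (not in the text): rows colliding on one codeword are erased too.
    counts = {}
    for k in guess.values():
        counts[k] = counts.get(k, 0) + 1
    fixed = {j: k for j, k in guess.items() if counts[k] == 1}

    # Step 2: exhaustive joint decoding over erased rows and unused codewords.
    erased = [j for j in range(m) if j not in fixed]
    used = set(fixed.values())
    free = [k for k in range(m) if k not in used]
    best = min(permutations(free),
               key=lambda assign: sum(dist[j][k] for j, k in zip(erased, assign)))

    nu = [None] * m
    for j, k in fixed.items():
        nu[j] = k
    for j, k in zip(erased, best):
        nu[j] = k
    return nu, len(erased)

codebook = [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]]
noisy = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 0, 0]]
print(decode_two_step(noisy, codebook, threshold=1))
```

The second step searches over $e!$ assignments when $e$ rows are erased, so the threshold directly controls the cost: a small threshold erases more rows and moves the decoder toward full joint decoding, while a large threshold moves it toward independent decoding.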

The work in this paper may be extended by considering different variants of the bee-identification error metric, for instance, where error is flagged only when the fraction of incorrectly decoded barcodes exceeds a threshold. Another interesting scenario for future analysis is the problem formulation where some of the $m$ rows in codebook $C$ are deleted, due to some bees being outside the hive when taking the picture.

Appendix A Proof of Prop. 1

Proof:

Let $\gamma_{k-1},\tilde{\gamma}_{k-1}\in\mathbb{F}_{2^{n}}$ , and $\Delta\triangleq\gamma_{k-1}\oplus\tilde{\gamma}_{k-1}$ , where $\oplus$ denotes modulo-2 addition. Then, $\Pr\{\mathrm{d_{H}}(\gamma_{k-1},{\boldsymbol{c}}_{k})=d_{k-1}\}=\Pr\{\mathrm{d_{H}}(\tilde{\gamma}_{k-1},{\boldsymbol{c}}_{k}\oplus\Delta)=d_{k-1}\}\overset{(\mathrm{i})}{=}\Pr\{\mathrm{d_{H}}(\tilde{\gamma}_{k-1},{\boldsymbol{c}}_{k})=d_{k-1}\}$ , where $(\mathrm{i})$ follows from the fact that for a given $\Delta$ , the distribution of ${\boldsymbol{c}}_{k}\oplus\Delta$ is the same as the distribution of ${\boldsymbol{c}}_{k}$ . This implies that $\Pr\{\mathrm{d_{H}}({\boldsymbol{c}}_{k-1},{\boldsymbol{c}}_{k})=d_{k-1}|{\boldsymbol{c}}_{k-1}=\gamma_{k-1}\}\overset{(\mathrm{ii})}{=}\Pr\{\mathrm{d_{H}}({\boldsymbol{c}}_{k-1},{\boldsymbol{c}}_{k})=d_{k-1}\}$ . Then $\Pr\{\bigcap_{i=1}^{k-1}\{\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{i+1})=d_{i}\}\}$ can be expressed as

[TABLE]

where ${\boldsymbol{1}}_{\{\cdot\}}$ denotes the indicator function, and $(\mathrm{iii})$ follows from $(\mathrm{ii})$ . Recursively applying (94), we get

[TABLE]

Now, (28) follows from the fact that $\Pr\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i},{\boldsymbol{c}}_{i+1})=d_{i}\right\}\leq 2^{-n(1-H(d_{i}/n))}$ when ${\boldsymbol{c}}_{i}$ and ${\boldsymbol{c}}_{i+1}$ are uniformly distributed over $\mathbb{F}_{2^{n}}$ [7]. ∎
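The final step relies on the standard estimate $\binom{n}{d}2^{-n}\leq 2^{-n(1-H(d/n))}$ for the probability that two independent, uniformly distributed binary $n$ -vectors are at Hamming distance exactly $d$ . A short numerical check, with an arbitrary illustrative blocklength:

```python
from math import comb, log2

def binary_entropy(x):
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1.0 - x) * log2(1.0 - x)

def exact_probability(n, d):
    """Pr{d_H(c, c') = d} for independent uniform c, c' in {0,1}^n."""
    return comb(n, d) / 2 ** n

def entropy_bound(n, d):
    """Upper bound 2^{-n(1 - H(d/n))}."""
    return 2.0 ** (-n * (1.0 - binary_entropy(d / n)))

n = 40
worst_ratio = max(exact_probability(n, d) / entropy_bound(n, d) for d in range(n + 1))
print("largest exact/bound ratio:", worst_ratio)
```

The ratio equals 1 at $d=0$ and $d=n$ and is smaller elsewhere, reflecting the polynomial factor lost in the exponential bound.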

Appendix B Proof of Prop. 3

Proof:

For $1\leq i\leq m=2^{nR}$ , let ${\boldsymbol{c}}_{i}$ denote the $i$ -th row of codebook $C$ . Let $\mathbb{F}_{2^{n}}$ denote the space of all $n$ -length binary vectors, and let $\gamma_{i}\in\mathbb{F}_{2^{n}}$ for $1\leq i\leq m$ . Let $Q_{{\mathrm{TRC}}}\left\{\bigcap_{i=1}^{m}\{{\boldsymbol{c}}_{i}=\gamma_{i}\}\right\}$ denote the probability $\Pr\left\{\bigcap_{i=1}^{m}\{{\boldsymbol{c}}_{i}=\gamma_{i}\}\right\}$ when $C$ is uniformly distributed over ${\mathscr{C}}_{{\mathrm{TRC}}}(n,R)$ . Then, we have

[TABLE]

where ${\boldsymbol{1}}_{\{\cdot\}}$ denotes the indicator function. Further, let $Q_{{\mathrm{RCE}}}\left\{\bigcap_{l=1}^{k-1}\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\right\}\right\}$ denote the probability $\Pr\left\{\bigcap_{l=1}^{k-1}\left\{\mathrm{d_{H}}({\boldsymbol{c}}_{i_{l}},{\boldsymbol{c}}_{i_{l+1}})=d_{l}\right\}\right\}$ when codebook $C$ is uniformly distributed over ${\mathscr{C}}(n,R)$ . Then,

[TABLE]

where $(a)$ follows from (95), and $(b)$ follows from Prop. 1.

∎

Acknowledgement

The authors acknowledge discussions with Ting-Yi Wu and Tim Gernat on the bee-identification problem formulation.

Bibliography

[1] T. Gernat et al., “Automated monitoring of behavior reveals bursty interaction patterns and rapid spreading dynamics in honeybee social networks,” Proc. Nat. Acad. Sci. U.S.A., vol. 115, no. 7, pp. 1433–1438, Feb. 2018.
[2] S. Shahi, D. Tuninetti, and N. Devroye, “The strongly asynchronous massive access channel,” Jul. 2018, arXiv:1807.09934 [cs.IT].
[3] S. Shahi, D. Tuninetti, and N. Devroye, “On identifying a massive number of distributions,” in Proc. 2018 IEEE Int. Symp. Inf. Theory, Jun. 2018, pp. 331–335.
[4] R. Heckel, I. Shomorony, K. Ramchandran, and D. N. C. Tse, “Fundamental limits of DNA storage systems,” in Proc. 2017 IEEE Int. Symp. Inf. Theory, Jun. 2017, pp. 3130–3134.
[5] I. Shomorony and R. Heckel, “Capacity results for the noisy shuffling channel,” Feb. 2019, arXiv:1902.10832 [cs.IT].
[6] M. Kovačević and V. Y. F. Tan, “Codes in the space of multisets – coding for permutation channels with impairments,” IEEE Trans. Inform. Theory, vol. 64, no. 7, pp. 5156–5169, Jul. 2018.
[7] A. Barg and G. D. Forney, “Random codes: minimum distances and error exponents,” IEEE Trans. Inform. Theory, vol. 48, no. 9, pp. 2568–2573, Sep. 2002.
[8] R. G. Gallager, Information Theory and Reliable Communication. New York: John Wiley and Sons, 1968.
