On Coded Caching with Correlated Files

Kai Wan; Daniela Tuninetti; Mingyue Ji; Giuseppe Caire

arXiv:1901.05732·cs.IT·June 15, 2021


TL;DR

This paper investigates the limits of coded caching with correlated files, deriving bounds and optimal schemes under uncoded cache placement, and introduces interference alignment techniques to improve load efficiency.

Contribution

It provides new bounds and optimal caching schemes for correlated files with uncoded placement, and develops interference alignment methods to enhance load reduction.

Findings

1. Optimal caching schemes under specific demand and correlation conditions.

2. A two-phase interference alignment delivery scheme within a factor of 2 of optimal.

3. Reduction in load compared to existing schemes for multi-file requests.

Abstract

This paper studies the fundamental limits of the shared-link coded caching problem with correlated files, where a server with a library of $N$ files communicates with $K$ users who can locally cache $M$ files. Given an integer $r \in [N]$ , correlation is modeled as follows: each r-subset of files contains a unique common block. The tradeoff between the cache size and the average transmitted load is considered. First, a converse bound under the constraint of uncoded cache placement (i.e., each user directly stores a subset of the library bits) is derived. Then, a caching scheme for the case where every user demands a distinct file (possible for $N \geq K$ ) is shown to be optimal under the constraint of uncoded cache placement. This caching scheme is further proved to be decodable and optimal under the constraint of uncoded cache placement when (i) $K r M \leq 2 N$ or $K r M \geq (K - 1) N$ or…


Equations206

F_{i} = \{W_{S} : S \subseteq [N], |S| = r, i \in S\}, \forall i \in [N],

R^{\star}(M, s) := \min_{Z} E_{d \in D_{s}}[R(d, Z)], \forall s \in [\min\{K, N\}],

R^{\star}(M) := \min_{Z} E_{d \in [N]^{K}}[R(d, Z)].

W_{S} = \{W_{S, V} : V \subseteq [K]\}, \forall S \subseteq [N] : |S| = r,

(\frac{N t}{K r}, c_{t}^{s})_{u,conv}, \forall t \in [0 : K],

c_{t}^{s} := \frac{\sum_{j \in [\min\{s, N - r + 1, K - t\}]} \binom{N - j}{r - 1} \binom{K - j}{t}}{\binom{N - 1}{r - 1} \binom{K}{t}}.

(\frac{N t}{K r}, E_{d \in [N]^{K}}[c_{t}^{N_{e}(d)}])_{u,conv}, \forall t \in [0 : K].

Block subdivision: \forall S \subseteq [N] : |S| = r let

W_{S} = \{W_{S, V} : \forall V \subseteq [K] : |V| = t\}.

Placement Phase: \forall k \in [K] let

Z_{k} = \{W_{S, V} : \forall S \subseteq [N] : |S| = r, \forall V \subseteq [K] : |V| = t, k \in V\}.

\forall j \in [\min\{N_{e}(d), N - r + 1, K - t\}],

\forall J \subseteq [K] \setminus \{u_{1}, \dots, u_{j}\} : |J| = t,

\forall B \subseteq [N] \setminus \{d_{u_{1}}, \dots, d_{u_{j}}\} : |B| = r - 1,

send C_{J \cup \{u_{j}\}, B} as defined in \eqref{eq:CJi general}.

\forall j \in [\min\{N_{e}(d), N - r + 1, K - t\}],

\forall q \in [j + 1 : \min\{N - r + 2, K - t + 1, N_{e}(d)\}],

\forall J \subseteq [K] \setminus \{u_{1}, \dots, u_{q}\} : |J| = t - 1, J \cap \{u_{q + 1}, \dots, u_{N_{e}(d)}\} \neq \emptyset,

\forall B \subseteq [N] \setminus \{d_{u_{1}}, \dots, d_{u_{q}}\} : |B| = r - 2, B \cap N([K]) \neq \emptyset,

send C_{J \cup \{u_{j}, u_{q}\}, B} as defined in \eqref{eq:CJi general}.

(\frac{N t}{K r}, c_{t}^{s} + e_{t}^{s})_{u,ach}, \forall t \in [0 : K],

e_{t}^{s} := \frac{\sum_{j \in [\min\{s, N - r + 1, K - t\}]} \sum_{q = j + 1}^{\min\{N - r + 2, K - t + 1, s\}} (\binom{N - q}{r - 2} - \binom{N - s}{r - 2}) (\binom{K - q}{t - 1} - \binom{K - s}{t - 1})}{\binom{N - 1}{r - 1} \binom{K}{t}}.

(\frac{N t}{K r}, E_{d \in [N]^{K}}[c_{t}^{N_{e}(d)} + e_{t}^{N_{e}(d)}])_{u,ach}, \forall t \in [0 : K].

M = \sum_{\ell \in [N]} \frac{N t_{\ell} p_{\ell}}{K \ell}.

R = \sum_{\ell \in [N]} p_{\ell} c_{t_{\ell}}^{K}.

R = \sum_{\ell_{1} \in \{1, 2, N - 1, N\}} p_{\ell_{1}} c_{t_{\ell_{1}}}^{s} + \sum_{\ell_{2} \in [3 : N - 2]} (p_{\ell_{2}} c_{t_{\ell_{2}}}^{s} + p_{\ell_{2}} e_{t_{\ell_{2}}}^{s} \mathbb{1}_{t_{\ell_{2}} \notin \{0, 1, 2, K - 1, K\}}),

\bigcup_{k \in [\min\{N_{e}(d), N - r + 1\}]} \bigcup_{S \subseteq [N] \setminus \{d_{u_{1}}, \dots, d_{u_{k - 1}}\} : |S| = r, d_{u_{k}} \in S} \bigcup_{V \subseteq [K] \setminus \{u_{1}, \dots, u_{k}\}} W_{S, V},

R_{u}^{\star}(M, N_{e}(d)) \geq \sum_{k \in [\min\{N_{e}(d), N - r + 1\}]} \sum_{S \subseteq [N] \setminus \{d_{u_{1}}, \dots, d_{u_{k - 1}}\} : |S| = r, d_{u_{k}} \in S} \sum_{V \subseteq [K] \setminus \{u_{1}, \dots, u_{k}\}} \frac{|W_{S, V}|}{B},

s \in [\min\{K, N\}]

R_{u}^{\star}(M, s)

\sum_{k \in [\min(s, N - r + 1)]} \sum_{S \subseteq [N] \setminus \{d_{u_{1}}, \dots, d_{u_{k - 1}}\} : |S| = r, d_{u_{k}} \in S} \sum_{t = 0}^{K - k} \sum_{V \subseteq [K] \setminus \{u_{1}, \dots, u_{k}\} : |V| = t} \frac{|W_{S, V}|}{B}.

R_{u}^{\star}(M, s)

= \sum_{t = 0}^{K} \frac{\sum_{j \in [\min\{s, N - r + 1, K - t\}]} \binom{N - j}{r - 1} \binom{K - j}{t}}{\binom{N - 1}{r - 1} \binom{K}{t}} \cdot \frac{\sum_{S \subseteq [N] : |S| = r} \sum_{V \subseteq [K] : |V| = t} r |W_{S, V}|}{N B}

= \sum_{t = 0}^{K} c_{t}^{s} \cdot x_{t},

x_{t} := \sum_{S \subseteq [N] : |S| = r} \sum_{V \subseteq [K] : |V| = t} \frac{r |W_{S, V}|}{N B},

c_{t}^{s} := \frac{\sum_{j \in [\min\{s, N - r + 1, K - t\}]} \binom{N - j}{r - 1} \binom{K - j}{t}}{\binom{N - 1}{r - 1} \binom{K}{t}}, (as already defined in \eqref{eq:ct}),

x_{0} + x_{1} + \dots + x_{K} = 1, (file size constraint),

x_{1} + 2 x_{2} + \dots + t x_{t} + \dots + K x_{K} \leq \frac{K M r}{N}, (memory size constraint),

R_{u}^{\star}(M, s) \geq Conv(c_{t}^{s}).

R_{u}^{\star}(M)

F_{1}


Full text

On the Fundamental Limits of Coded Caching with Correlated Files

Kai Wan, Daniela Tuninetti, Mingyue Ji, and Giuseppe Caire

A short version of this paper was presented at the 2019 IEEE International Symposium on Information Theory.

K. Wan and G. Caire are with the Electrical Engineering and Computer Science Department, Technische Universität Berlin, 10587 Berlin, Germany (e-mail: [email protected]; [email protected]). The work of K. Wan and G. Caire was partially funded by the European Research Council under the ERC Advanced Grant N. 789190, CARENET. D. Tuninetti is with the Electrical and Computer Engineering Department, University of Illinois at Chicago, Chicago, IL 60607, USA (e-mail: [email protected]). The work of D. Tuninetti was supported in part by NSF Award 1527059. M. Ji is with the Electrical and Computer Engineering Department, University of Utah, Salt Lake City, UT 84112, USA (e-mail: [email protected]). The work of M. Ji was supported in part by NSF Awards 1817154 and 1824558.

Abstract

This paper studies the fundamental limits of the shared-link coded caching problem with correlated files, where a server with a library of ${\mathsf{N}}$ files communicates with ${\mathsf{K}}$ users who can locally cache ${\mathsf{M}}$ files. Given an integer ${\mathsf{r}}\in[{\mathsf{N}}]$ , correlation is modeled as follows: each ${\mathsf{r}}$ -subset of files contains a unique common block. The tradeoff between the cache size and the average transmitted load is considered. First, a converse bound under the constraint of uncoded cache placement (i.e., each user directly stores a subset of the library bits) is derived. Then, a caching scheme for the case where every user demands a distinct file (possible for ${\mathsf{N}}\geq{\mathsf{K}}$ ) is shown to be optimal under the constraint of uncoded cache placement. This caching scheme is further proved to be decodable and optimal under the constraint of uncoded cache placement when (i) ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\leq 2{\mathsf{N}}$ or ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\geq({\mathsf{K}}-1){\mathsf{N}}$ or ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ for every demand type (i.e., when the demanded files are not necessarily distinct), and (ii) when the number of distinct demanded files is no larger than four. Finally, a two-phase delivery scheme based on interference alignment is shown to be optimal to within a factor of $2$ under the constraint of uncoded cache placement for every possible demand. As a by-product, the proposed interference alignment scheme is shown to reduce the (worst-case or average) load of state-of-the-art schemes for the coded caching problem where the users can request multiple files.

I Introduction

A cache is a network component that leverages the device memory to transparently store data so that future requests for that data can be served faster. A caching system operates in two phases: i) cache placement phase: content is pushed into each cache without knowledge of future demands; ii) delivery phase: after each user has made its request and according to the cache contents, the server transmits coded packets in order to satisfy the user demands. The goal is to minimize the number of transmitted bits (or load or rate).

Information theoretic coded caching was originally proposed by Maddah-Ali and Niesen (MAN) in [1] for a shared-link caching system containing a server with a library of ${\mathsf{N}}$ equal-length files, which is connected to ${\mathsf{K}}$ users through a noiseless shared link; each user can store ${\mathsf{M}}$ files in its local cache. Each user demands one file in the delivery phase. The MAN scheme uses a combinatorial design in the placement phase such that during delivery multicast messages simultaneously satisfy the demands of different users. Under the constraint of uncoded cache placement (i.e., each user directly caches a subset of the library bits) and for worst-case load, the MAN scheme was proved to be optimal when ${\mathsf{N}}\geq{\mathsf{K}}$ [2]. On the observation that some MAN linear combinations are redundant if there exist files demanded by several users, the authors in [3] improved the MAN delivery scheme and achieved the optimal worst-case load under the constraint of uncoded cache placement for any ${\mathsf{K}}$ . The same authors proved in [4] that the multiplicative gap between the optimal caching scheme with uncoded cache placement and any caching scheme with coded cache placement is at most $2$ .

The coded caching strategy was also extended to numerous different models, such as decentralized systems [5], device-to-device (D2D) systems [6], topological networks [7, 8, 9], etc. The above works assume that all the files in the library are independent. However, in practice there may be some overlaps among different files (e.g., videos, image streams, etc.). In this work, we consider such a coded caching problem with correlated sources, as originally proposed in [10], where different files have common parts. In the following, we review the literature on coded caching with correlated sources and introduce the main contributions of this paper.

I-A Past Work

Coded Caching with Correlated Sources

In [10], the authors modeled correlation by assuming that each subset of files has an exclusive common part, which is independent of the common parts of other subsets of files. By treating the delivery phase as an index coding problem with multiple requests, the authors in [10] proposed a delivery scheme based on graph coloring. In [11], a caching scheme based on Gray-Wyner source coding was proposed for two-file ${\mathsf{K}}$ -user systems and three-file two-user systems.

In [12], the caching problem with correlated files, where the length of the common part among each $\ell\in\{1,\dots,{\mathsf{N}}\}$ files (referred to as an ‘ $\ell$ -block’) is the same, was considered; each file contains $\binom{{\mathsf{N}}-1}{\ell-1}$ $\ell$ -blocks. By using the MAN cache placement to store each $\ell$ -block at the user sides, [12] proposed a delivery phase which contains ${\mathsf{N}}$ steps. In step $\ell$ only $\ell$ -blocks are transmitted; thus there are $\binom{{\mathsf{N}}-1}{\ell-1}$ rounds for the transmission of step $\ell$ . Each round is treated as a MAN caching problem with ${\mathsf{K}}$ users, each of which should decode exactly one $\ell$ -block. The authors then used the caching scheme in [3] to transmit packets for each round. The caching schemes in [11] and [12] were extended in [13] and [14] to caching problems with correlated files where the correlation is dynamic and the channel is a Gaussian broadcast channel, respectively.

Coded Caching with Multiple Requests

The caching problem with correlated files in [12] is a special case of coded caching with multiple requests considered in [15], where the library contains ${\mathsf{N}}$ equal-length and independent files and each user demands ${\mathsf{L}}$ files from the library. With the MAN placement, dividing the delivery phase into ${\mathsf{L}}$ rounds, where in each round the MAN scheme in [1] is used to let each user decode one file, achieves a worst-case load that is order optimal to within a factor of $18$ [15]. By further tightening the converse bound, this order optimality factor was reduced to $11$ in [16]. Instead of using the MAN scheme in each round, the authors in [17] proposed to use the caching scheme in [3] to leverage the multicast opportunities.

In addition, by considering all the ${\mathsf{L}}$ rounds jointly, the authors in [17] generate an overall transmission coding matrix; if this coding matrix is not full-rank, a full-rank sub-matrix is used instead. This delivery scheme was proved to be optimal under the constraint of the MAN placement for demands with ${\mathsf{K}}\leq 4$ , ${\mathsf{M}}={\mathsf{N}}/{\mathsf{K}}$ , and ${\mathsf{L}}=2$ , with the exception of one demand for ${\mathsf{K}}=3$ and three demands for ${\mathsf{K}}=4$ .

Coded caching with multiple requests, where each user demands a different number of files, was considered in [18, 19, 20, 21]. The caching schemes in [18, 19, 20] are based on the round-division strategy described above, while the one in [21] considered the small memory size regime and used Maximum Distance Separable (MDS) coded cache placement.

Most of the existing works divide the multi-request problem into a sequence of single-request problems (as in [12, 15, 16, 17]). There are three main limitations in dividing the delivery into single-request problems, namely (1) the same file may exist in different rounds and this round-division method may miss some multicast opportunities, (2) even if there is no file overlap across different rounds, this round-division method still cannot fully leverage the multicast opportunities (as illustrated in Example V-A), and (3) finding the best division of the users’ demands into ${\mathsf{L}}$ groups is computationally hard.

I-B Contributions

If one directly considers the most general problem of correlated files, it is very challenging to make general optimality statements. In this paper, we consider a symmetric version of the problem, for which we propose a novel interference alignment based delivery scheme, which jointly serves users’ demands instead of dividing the delivery into single-request problems. The considered model is the following simplification of the model in [12]: we fix ${\mathsf{r}}\in[{\mathsf{N}}]$ and assume each file only contains ${\mathsf{r}}$ -blocks (see Section II). Our main contributions are as follows:

1. We derive a converse bound on the minimal average load among all possible demands under the constraint of uncoded cache placement. We leverage the acyclic index coding converse bound as in [22, 23].

2. By jointly serving the users’ demands, we propose a caching scheme for the case where every user demands a distinct file and whose load matches our proposed converse bound under the constraint of uncoded cache placement.

3. By combining the above achievable scheme with an interference alignment idea, we then propose a two-phase delivery scheme for general demands, where the first sub-phase is the same as the one for the distinct demand case and the additional second sub-phase is used to align interference at the various users. The two-phase interference alignment delivery scheme is proved to be order optimal within a factor of $2$ for any demand type.

4. By further cancelling interference, we prove that the second sub-phase in the above two-phase delivery is not necessary, thus resulting in exact optimality under the constraint of uncoded cache placement, (i) for any demand type if either ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\leq 2{\mathsf{N}}$ or ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\geq({\mathsf{K}}-1){\mathsf{N}}$ or ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ , and (ii) when the number of distinct demanded files is no larger than four.

5. As a by-product, a modification of our proposed interference alignment scheme is optimal under the constraint of MAN placement for the four cases left open in [17] for the caching problem with multiple requests.

I-C Paper Organization

The rest of the paper is organized as follows. The system model for the considered coded caching problem with correlated sources is given in Section II. In Section III, our main results and some numerical evaluations are presented. The proof of the proposed converse bound can be found in Section IV, and that of the proposed achievable schemes in Section V. Section VI concludes the paper. The proofs of some auxiliary results can be found in the Appendix.

I-D Notation Convention

Calligraphic symbols denote sets, bold symbols denote vectors, and sans-serif symbols denote system parameters. We use $|\cdot|$ to represent the cardinality of a set or the length of a vector; $[a:b]:=\left\{a,a+1,\ldots,b\right\}$ and $[n]:=\left\{1,2,\ldots,n\right\}$ ; $\oplus$ represents bit-wise XOR.

II System Model

In a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, a server has access to a library of ${\mathsf{N}}\in\mathbb{N}$ files (each of which contains ${\mathsf{B}}\in\mathbb{N}$ iid bits) denoted by $\{F_{1},\cdots,F_{\mathsf{N}}\}$ . The server is connected to ${\mathsf{K}}\in\mathbb{N}$ users through a shared error-free link. Each file is composed of $\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}$ independent and equal-length blocks, where ${\mathsf{r}}\in[{\mathsf{N}}]$ ; we denote

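$$F_{i}=\{W_{{\mathcal{S}}}:{\mathcal{S}}\subseteq[{\mathsf{N}}],\ |{\mathcal{S}}|={\mathsf{r}},\ i\in{\mathcal{S}}\},\quad\forall i\in[{\mathsf{N}}],\tag{1}$$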

where the block $W_{{\mathcal{S}}}$ represents the exclusive common part across the files indexed by ${\mathcal{S}}$ . Hence, in the whole library there are $\binom{{\mathsf{N}}}{{\mathsf{r}}}$ independent blocks, each of which has ${\mathsf{B}}/\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}$ bits. A coded caching scheme has two phases: placement and delivery.
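The block structure above can be checked by brute force. The following is a minimal sketch, not part of the original paper: the helper name `build_library` and the representation of a block $W_{{\mathcal{S}}}$ by the tuple ${\mathcal{S}}$ are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def build_library(N, r):
    """Enumerate the blocks W_S (one per r-subset S of [N]) and the files F_i.

    A block is represented only by its index set S (a sorted tuple);
    file i is the list of all blocks whose index set contains i.
    """
    blocks = list(combinations(range(1, N + 1), r))
    files = {i: [S for S in blocks if i in S] for i in range(1, N + 1)}
    return blocks, files

N, r = 5, 3
blocks, files = build_library(N, r)

# The library holds C(N, r) blocks and every file holds C(N-1, r-1) of them.
assert len(blocks) == comb(N, r)
assert all(len(f) == comb(N - 1, r - 1) for f in files.values())
# Every block is common to exactly r files.
assert all(sum(S in f for f in files.values()) == r for S in blocks)
```

Since a file of ${\mathsf{B}}$ bits is split evenly over its $\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}$ blocks, the block length ${\mathsf{B}}/\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}$ follows directly from the second count.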

Placement Phase

During the cache placement phase, user $k\in[{\mathsf{K}}]$ stores information about the ${\mathsf{N}}$ files in its cache of size $\mathsf{MB}$ bits, where ${\mathsf{M}}\in[0,{\mathsf{N}}/{\mathsf{r}}]$ . This phase is done without knowledge of users’ demands. We denote the content in the cache of user $k\in[{\mathsf{K}}]$ by $Z_{k}$ and let ${\mathbf{Z}}:=(Z_{1},\ldots,Z_{{\mathsf{K}}})$ .

Delivery Phase

During the delivery phase, user $k\in[{\mathsf{K}}]$ demands file $d_{k}\in[{\mathsf{N}}]$ . The demand vector ${\mathbf{d}}:=(d_{1},\ldots,d_{{\mathsf{K}}})$ is revealed to all nodes. Given $({\mathbf{d}},{\mathbf{Z}})$ , the server broadcasts a message $X({\mathbf{d}},{\mathbf{Z}})$ of ${\mathsf{B}}{\mathsf{R}}({\mathbf{d}},{\mathbf{Z}})$ bits to all users. User $k\in[{\mathsf{K}}]$ must recover its desired file $F_{d_{k}}$ from $Z_{k}$ and $X({\mathbf{d}},{\mathbf{Z}})$ .

Load

For each demand vector ${\mathbf{d}}$ , we define ${\mathcal{N}}({\mathcal{T}}):=\{d_{k}:k\in{\mathcal{T}}\}$ as the set of demanded files by users in ${\mathcal{T}}$ , where ${\mathcal{T}}\subseteq[{\mathsf{K}}]$ . A demand vector ${\mathbf{d}}$ is said to be of type $\mathcal{D}_{N_{\textup{e}}({\mathbf{d}})}$ if it has $N_{\textup{e}}({\mathbf{d}}):=|{\mathcal{N}}([{\mathsf{K}}])|$ distinct entries. Based on the uniform demand distribution, the objective is to determine the optimal average load among all demands of the same type, that is

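$${\mathsf{R}}^{\star}({\mathsf{M}},s):=\min_{{\mathbf{Z}}}\ \mathbb{E}_{{\mathbf{d}}\in\mathcal{D}_{s}}[{\mathsf{R}}({\mathbf{d}},{\mathbf{Z}})],\quad\forall s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}],\tag{2}$$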

and the optimal average load among all possible demands is

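$${\mathsf{R}}^{\star}({\mathsf{M}}):=\min_{{\mathbf{Z}}}\ \mathbb{E}_{{\mathbf{d}}\in[{\mathsf{N}}]^{{\mathsf{K}}}}[{\mathsf{R}}({\mathbf{d}},{\mathbf{Z}})].\tag{3}$$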

Note that, with an abuse of notation, ${\mathsf{R}}^{\star}({\mathsf{M}})\not=\mathbb{E}_{s}[{\mathsf{R}}^{\star}({\mathsf{M}},s)]$ in general, unless the same cache placement policy optimizes the load in (2) for all $s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}]$ .
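To make the notion of demand type concrete, the sketch below counts, for small parameters, how many demand vectors ${\mathbf{d}}\in[{\mathsf{N}}]^{{\mathsf{K}}}$ have $N_{\textup{e}}({\mathbf{d}})=s$ distinct entries. The function name `demand_type_counts` is an illustrative assumption; under the uniform demand distribution the ratio `counts[s] / N**K` is the probability of type $\mathcal{D}_{s}$.

```python
from collections import Counter
from itertools import product

def demand_type_counts(N, K):
    """Map s -> number of demand vectors d in [N]^K with N_e(d) = s."""
    counts = Counter()
    for d in product(range(1, N + 1), repeat=K):
        counts[len(set(d))] += 1   # N_e(d) = number of distinct demanded files
    return dict(counts)

counts = demand_type_counts(N=3, K=3)
# 3 vectors with one distinct file, 18 with two, 6 with three (27 in total).
assert counts == {1: 3, 2: 18, 3: 6}
```

The enumeration is exponential in ${\mathsf{K}}$ and is only meant for small sanity checks.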

Uncoded Cache Placement

The cache placement policy is said to be uncoded if each user directly copies some library bits into its cache. Under the constraint of uncoded cache placement, each block $W_{{\mathcal{S}}}$ can be partitioned into sub-blocks as

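$$W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:{\mathcal{V}}\subseteq[{\mathsf{K}}]\},\quad\forall{\mathcal{S}}\subseteq[{\mathsf{N}}]:|{\mathcal{S}}|={\mathsf{r}},\tag{4}$$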

where $W_{{\mathcal{S}},{\mathcal{V}}}$ represents the bits of $W_{{\mathcal{S}}}$ which are exclusively cached by users indexed by ${\mathcal{V}}$ . The optimal loads under the constraint of uncoded cache placement are denoted by ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}},s)$ and ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}})$ and are defined similarly to (2) and (3), respectively.

Special Cases

Our model reduces to the MAN coded caching problem with average load when ${\mathsf{r}}=1$ , and to the case of a library with a single file when ${\mathsf{r}}={\mathsf{N}}$ . Both cases are either solved exactly or to within a factor of $2$ in [4].

Relation to the More General Coded Caching Problem with Correlated Sources

In this paper, in order to make fundamental progress on the problem of caching correlated content, we simplified the model in [12] as follows. In [12] a certain parameter $\ell$ ranges from zero to the number of files in the system (each $\ell_{1}$ files have a common part, each $\ell_{2}$ files also have a common part, etc.), while in our model $\ell$ is fixed to a single value ${\mathsf{r}}$ . Our model is thus a special case of the one in [12]. With our model, however, we can make conclusive statements (either exact capacity results, or capacity to within a constant multiplicative gap) that remained elusive in [12].

Relation to the Coded Caching Problem with Multiple Requests

If we identify the ${{\mathsf{N}}\choose{\mathsf{r}}}$ independent blocks as files of a library, and allow each cache-equipped user to request ${{\mathsf{N}}-1\choose{\mathsf{r}}-1}$ such blocks/files, the considered caching problem with correlated sources relates to the symmetric caching problem with multiple requests considered in [15], where ‘symmetric’ means that each user requests the same number of files. There is however a subtle difference between our model and the one in [15]: in our model one file corresponds to ${{\mathsf{N}}-1\choose{\mathsf{r}}-1}$ distinct blocks, thus our model corresponds to the one in [15] under the constraint that a user has multiple but distinct requests. Moreover, here we consider the average load as our performance metric, while in [15] the authors used the worst-case load. Because of these differences, the results in this paper are not special cases of the results in [15].

The relationship between the two problems can also be explained as follows. For the case of multiple requests, assume that the ${{\mathsf{N}}\choose{\mathsf{r}}}$ independent files are equally popular. Each such independent file then appears, on average, the same number of times over the ensemble of all possible multiple request configurations. We construct ${\mathsf{N}}$ such multiple request configurations, each of which is formed by ${{\mathsf{N}}-1\choose{\mathsf{r}}-1}$ independent files (in fact, each multiple request configuration corresponds to a “file” in the correlated file library of our problem). It follows that each independent file appears on average ${\mathsf{N}}{{\mathsf{N}}-1\choose{\mathsf{r}}-1}/{{\mathsf{N}}\choose{\mathsf{r}}}={\mathsf{r}}$ times in the ensemble of possible multiple request configurations. If instead of random multiple requests we consider the deterministic symmetric case, where the possible multiple request configurations are all and only those for which each independent file appears exactly ${\mathsf{r}}$ times (and not on average ${\mathsf{r}}$ times), we have the exact equivalence of our problem with the case of multiple requests of independent files. With this interpretation, the results in this paper also shed light on the relevant and intricate problem of how to optimally handle the case where each user makes a sequence of requests of independent files (blocks). The presence of repeated elements in such a sequence of requests is a ‘fundamental’ aspect of caching (also in practice), where one needs to devise schemes that take advantage of previous requests and do not send the same content multiple times.
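For completeness, the multiplicity ${\mathsf{r}}$ used above follows from expanding the binomial coefficients:

$$\frac{{\mathsf{N}}\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}}{\binom{{\mathsf{N}}}{{\mathsf{r}}}}={\mathsf{N}}\cdot\frac{({\mathsf{N}}-1)!}{({\mathsf{r}}-1)!\,({\mathsf{N}}-{\mathsf{r}})!}\cdot\frac{{\mathsf{r}}!\,({\mathsf{N}}-{\mathsf{r}})!}{{\mathsf{N}}!}={\mathsf{r}}.$$

As an illustrative instance (the numbers are chosen here and do not appear in the original text), for ${\mathsf{N}}=4$ and ${\mathsf{r}}=2$ there are $\binom{4}{2}=6$ blocks and each of the $4$ files contains $\binom{3}{1}=3$ of them, so each block is counted $4\cdot 3/6=2$ times.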

III Main Results and Numerical Evaluations

In this section, we state our main results and present numerical evaluations of the proposed converse and achievable bounds. We shall use the subscripts “u,conv” and “u,ach” for converse (conv) and achievable (ach) bounds, respectively, under the constraint of uncoded cache placement (u).

III-A Converse Bound

Inspired by [23], we use the acyclic index coding converse bound from [22] to derive the following converse bound under the constraint of uncoded cache placement for our problem. The proof can be found in Section IV.

Theorem 1 (Converse).

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}},s),\ s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}],$ is lower bounded by the lower convex envelope of the following $({\mathsf{M}},{\mathsf{R}})$ pairs

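$$\left(\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}},\ c^{s}_{t}\right)_{\mathrm{u,conv}},\quad\forall t\in[0:{\mathsf{K}}],\tag{5}$$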

where

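$$c^{s}_{t}:=\frac{\sum_{j\in[\min\{s,{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]}\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}}{\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}}.\tag{6}$$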

In addition, ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}})$ is lower bounded by the lower convex envelope of the following $({\mathsf{M}},{\mathsf{R}})$ pairs

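$$\left(\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}},\ \mathbb{E}_{{\mathbf{d}}\in[{\mathsf{N}}]^{{\mathsf{K}}}}\big[c^{N_{\textup{e}}({\mathbf{d}})}_{t}\big]\right)_{\mathrm{u,conv}},\quad\forall t\in[0:{\mathsf{K}}].\tag{7}$$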

Theorem 1 for ${\mathsf{r}}=1$ recovers the converse result for the MAN scheme under uncoded placement in [3], in particular the worst-case load is obtained for $s=\min\{{\mathsf{K}},{\mathsf{N}}\}$ in (5), while the average load under uniform demands is given by (7). Theorem 1 for ${\mathsf{r}}={\mathsf{N}}$ recovers the converse result for the MAN scheme with a single file, that is, $c^{s}_{t}=1-t/{\mathsf{K}}\Longleftrightarrow{\mathsf{R}}^{\star}({\mathsf{M}})=1-{\mathsf{M}}$ for ${\mathsf{M}}\in[0,1]$ .
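The two special cases can be checked numerically. The sketch below evaluates the corner-point load $c^{s}_{t}=\sum_{j\in[\min\{s,{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]}\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}\big/\big(\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}\big)$ with exact rational arithmetic; the helper names `binom` and `c` are illustrative assumptions, and binomial coefficients with out-of-range arguments are taken to be zero.

```python
from fractions import Fraction
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def c(N, K, r, t, s):
    """Converse corner point c_t^s at memory size M = N t / (K r)."""
    top = sum(binom(N - j, r - 1) * binom(K - j, t)
              for j in range(1, min(s, N - r + 1, K - t) + 1))
    return Fraction(top, binom(N - 1, r - 1) * binom(K, t))

# r = N (single file): c_t^s = 1 - t/K for every demand type s.
N, K = 3, 4
assert all(c(N, K, N, t, s) == 1 - Fraction(t, K)
           for t in range(K + 1) for s in range(1, min(N, K) + 1))

# r = 1 and s = K <= N: the sum collapses to the MAN load (K - t)/(t + 1).
N, K = 5, 4
assert all(c(N, K, 1, t, K) == Fraction(K - t, t + 1) for t in range(K + 1))
```

The second check relies on the hockey-stick identity $\sum_{j=1}^{{\mathsf{K}}-t}\binom{{\mathsf{K}}-j}{t}=\binom{{\mathsf{K}}}{t+1}$.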

III-B Achievable Scheme

Let ${\mathsf{M}}=\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}}$ for some integer $t\in[0:{\mathsf{K}}]$ . Recall that we denoted by $N_{\textup{e}}({\mathbf{d}})$ the number of distinct files in the demand vector ${\mathbf{d}}$ , and by ${\mathcal{L}}({\mathbf{d}})=\{u_{1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}$ the set of chosen leader users. We propose the following achievable scheme, which is analyzed in Section V.

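Block subdivision: for all ${\mathcal{S}}\subseteq[{\mathsf{N}}]$ with $|{\mathcal{S}}|={\mathsf{r}}$ , let $W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:\forall{\mathcal{V}}\subseteq[{\mathsf{K}}]:|{\mathcal{V}}|=t\}$ .

Placement Phase: for all $k\in[{\mathsf{K}}]$ , let $Z_{k}=\{W_{{\mathcal{S}},{\mathcal{V}}}:\forall{\mathcal{S}}\subseteq[{\mathsf{N}}]:|{\mathcal{S}}|={\mathsf{r}},\ \forall{\mathcal{V}}\subseteq[{\mathsf{K}}]:|{\mathcal{V}}|=t,\ k\in{\mathcal{V}}\}$ .

Delivery Phase, sub-phase 1: for all $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , for all ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j}\}$ with $|{\mathcal{J}}|=t$ , and for all ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ with $|{\mathcal{B}}|={\mathsf{r}}-1$ , send the multicast message $C_{{\mathcal{J}}\cup\{u_{j}\},{\mathcal{B}}}$ .

Delivery Phase, sub-phase 2: for all $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , for all $q\in[j+1:\min\{{\mathsf{N}}-{\mathsf{r}}+2,{\mathsf{K}}-t+1,N_{\textup{e}}({\mathbf{d}})\}]$ , for all ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{q}\}$ with $|{\mathcal{J}}|=t-1$ and ${\mathcal{J}}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}\neq\emptyset$ , and for all ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\}$ with $|{\mathcal{B}}|={\mathsf{r}}-2$ and ${\mathcal{B}}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ , send the multicast message $C_{{\mathcal{J}}\cup\{u_{j},u_{q}\},{\mathcal{B}}}$ . (8)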
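As a sanity check on the memory size ${\mathsf{M}}=\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}}$ , the following sketch counts what one user stores under a MAN-type placement. It assumes, for illustration, that every block is split into $\binom{{\mathsf{K}}}{t}$ equal-length sub-blocks $W_{{\mathcal{S}},{\mathcal{V}}}$ with $|{\mathcal{V}}|=t$ and that user $k$ caches those with $k\in{\mathcal{V}}$ ; the function name `cache_size_in_files` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def cache_size_in_files(N, K, r, t, k=1):
    """Cache occupancy of user k in units of one file (B bits).

    Assumes each block W_S is split evenly over the t-subsets V of [K]
    and that user k stores every sub-block W_{S,V} with k in V.
    """
    block_len = Fraction(1, comb(N - 1, r - 1))   # one block, in files
    subblock_len = block_len / comb(K, t)         # one sub-block, in files
    stored = sum(1
                 for _S in combinations(range(1, N + 1), r)
                 for V in combinations(range(1, K + 1), t)
                 if k in V)
    return stored * subblock_len

# The occupancy equals M = N t / (K r) for every user and every t.
N, K, r = 5, 3, 3
assert all(cache_size_in_files(N, K, r, t, k) == Fraction(N * t, K * r)
           for t in range(K + 1) for k in range(1, K + 1))
```

For $t={\mathsf{K}}$ the occupancy is ${\mathsf{N}}/{\mathsf{r}}$ , which is the largest memory size allowed by the system model, since the whole library then fits in each cache.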

In the rest of this section we analyze the scheme in (8) in various settings of increasing order of complexity. Since the scheme is highly combinatorial, we shall start with the case that is the simplest to analyze and that brings to bear some of the key ideas. We shall then show that a similar analysis also applies to more complex scenarios. In the following, optimality is understood under the constraint of uncoded cache placement. We have:

• In Section III-C we show that the general scheme in (8) with only the first delivery sub-phase allows each leader user to decode its desired file. We also show the first sub-phase alone is exactly optimal when the users request different files, that is, all users are leaders, which is possible when $N_{\textup{e}}({\mathbf{d}})={\mathsf{K}}\leq{\mathsf{N}}$ .

• In Section III-D we show that the scheme in (8), with both delivery sub-phases, can satisfy every user regardless of the demand type, where the transmissions in sub-phase 2 are used to cancel the interferences experienced by the non-leader users. We also show its optimality to within a factor of $2$ for any demand type.

• In Section III-E we show that for some cases (such as, for example, the case of small or large memory size), each non-leader can re-construct the transmitted multicast messages in sub-phase 2 by performing linear combinations of the transmitted multicast messages in sub-phase 1, that is, sub-phase 2 is redundant. For these cases, we show exact optimality.

• In Section III-F we show how the scheme in (8) can be used for other caching problems, by either offering simpler codes for the delivery phase than those known in the literature, or by providing an optimal scheme outperforming known state-of-the-art schemes.

In Section III-G we finally give some numerical evaluations of the proposed bounds.

III-C Optimality of (8) for demand type $s={\mathsf{K}}\leq{\mathsf{N}}$

Here we consider the case where each user makes a distinct request, which requires ${\mathsf{K}}\leq{\mathsf{N}}$ and demand type $s={\mathsf{K}}=N_{\textup{e}}({\mathbf{d}})$ . We propose a caching scheme where we jointly serve the users’ demands. Existing methods approach the problem by serving requests in multiple rounds [12, 15, 16, 17], where each round is a MAN scheme with users having a single request. Our scheme here is as in (8), but where only the first sub-phase of the delivery phase takes place. In particular, for ${\mathsf{M}}=\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}}$ and $s=N_{\textup{e}}({\mathbf{d}})={\mathsf{K}}\leq{\mathsf{N}}$ , our proposed delivery phase contains $\min\{{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ steps, where in each step we transmit multicast messages to satisfy one leader user at a time. After all steps are done, the remaining users (who are also leaders, since here we consider a distinct request for each user) can also recover their desired file. The achieved load is presented in the following theorem, whose proof can be found in Section V-B.

Theorem 2 (Optimality for Distinct Requests).

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, the loads in Theorem 1 are achievable under the constraint of uncoded cache placement when ${\mathsf{N}}\geq{\mathsf{K}}=s$ by the scheme in (8) with only the first delivery sub-phase.
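The load of the first delivery sub-phase for distinct demands can be reproduced by enumerating the index pairs of the multicast messages. The sketch below is illustrative rather than the authors' implementation: it assumes that leader $u_{j}$ is user $j$ demanding file $j$ (a relabelling that distinct demands permit), that step $j$ sends one message per pair $({\mathcal{J}},{\mathcal{B}})$ with ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j}\}$ , $|{\mathcal{J}}|=t$ , and ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , $|{\mathcal{B}}|={\mathsf{r}}-1$ , and that each message has the length of one sub-block, ${\mathsf{B}}/\big(\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}\big)$ bits. The function name `subphase1_load` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def subphase1_load(N, K, r, t):
    """Load (in files) of delivery sub-phase 1 for distinct demands, s = K <= N."""
    assert K <= N, "distinct demands need K <= N"
    messages = 0
    for j in range(1, min(K, N - r + 1, K - t) + 1):
        users_left = range(j + 1, K + 1)   # [K] \ {u_1, ..., u_j}
        files_left = range(j + 1, N + 1)   # [N] \ {d_{u_1}, ..., d_{u_j}}
        for _J in combinations(users_left, t):
            for _B in combinations(files_left, r - 1):
                messages += 1              # one multicast message per (J, B)
    # Each message is assumed to be one sub-block long.
    return Fraction(messages, comb(N - 1, r - 1) * comb(K, t))

# r = 1 recovers the MAN load (K - t)/(t + 1) for distinct demands.
assert subphase1_load(N=4, K=4, r=1, t=1) == Fraction(3, 2)
```

Under these assumptions the message count in step $j$ is $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ , so the total matches the numerator of $c^{s}_{t}$ with $s={\mathsf{K}}$ .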

III-D Performance of (8) for any demand type

We analyze here the scheme in (8) with two sub-phases in the delivery phase, and show that it is able to satisfy general demands. The main ingredients of the scheme are as follows. We pick a leader user among the users demanding the same file; in the first delivery sub-phase, we generate multicast messages as in Theorem 2 so that each leader user can recover its desired file by the end of this sub-phase; in the second delivery sub-phase, we transmit some additional multicast messages so that each non-leader user can cancel all non-intended (aligned interference) sub-blocks from all received multicast messages and thus can eventually recover its desired file. The achieved load is presented in the following theorem, whose proof can be found in Section V-C.

Theorem 3 (Interference-Alignment Based Delivery Scheme).

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, an achievable memory-load tradeoff for any $s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}]$ is given by the lower convex envelope of the following $({\mathsf{M}},{\mathsf{R}})$ pairs

[TABLE]

where $c^{s}_{t}$ is defined in (6) and

[TABLE]

In addition, ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}})$ is upper bounded by the lower convex envelope of the following $({\mathsf{M}},{\mathsf{R}})$ pairs

[TABLE]

By comparing the converse bound in Theorem 1 and the achievable bound in Theorem 3, we have the following result, whose proof can be found in Section V-E.

Theorem 4 (Order Optimality for Theorem 3).

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, the loads in Theorem 3 are order optimal to within a factor of $2$ under the constraint of uncoded cache placement.

III-E Optimality of (8) for ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ or $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ or $s\in[\min\{{\mathsf{N}},{\mathsf{K}},4\}]$

In Theorem 3, $c^{s}_{t}$ in (6) is the load of the first delivery sub-phase while $e^{s}_{t}$ in (10) is the load of the second delivery sub-phase. Hence, compared to the converse bound in Theorem 1, $e^{s}_{t}$ is the term leading to the sub-optimality. In Theorem 2, where we showed exact optimality for distinct demands, the second sub-phase was not needed. We investigate here other cases where the second sub-phase is not needed. In particular, we show cases where each non-leader user can re-construct the multicast messages sent in sub-phase 2 by linearly combining multicast messages sent in sub-phase 1. Since in these cases the second delivery sub-phase is not necessary, we obtain the following exact optimality result proved in Section V-F.

Theorem 5 (Exact Optimality for Some Cases).

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, we have that ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}},s)$ and ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}})$ are equal to the lower convex envelopes of $\left(\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}},c^{s}_{t}\right)$ and of $\left(\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}},\mathbb{E}_{{\mathbf{d}}\in[{\mathsf{N}}]^{{\mathsf{K}}}}\left[c^{N_{\textup{e}}({\mathbf{d}})}_{t}\right]\right)$ , respectively, where $c^{s}_{t}$ is defined in (6), in the following cases:

Case 1 (small or large file correlation): when ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ , where optimality holds for any $s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}]$ and any $t\in[0:{\mathsf{K}}]$ ;

Case 2 (small or large cache size): when $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ , where optimality holds for any $s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}]$ and any ${\mathsf{r}}\in[{\mathsf{N}}]$ ;

Case 3 (small number of distinct requests): when $s\in[\min\{{\mathsf{K}},{\mathsf{N}},4\}]$ , where optimality holds for any ${\mathsf{r}}\in[{\mathsf{N}}]$ and any $t\in[0:{\mathsf{K}}]$ . In this case, no claim can be made on ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}})$ as only some values of $s$ are exactly characterized.

From Theorem 5 we immediately have the following corollary, which can be proved straightforwardly by noting that Theorem 5.Case 3 covers all possible values of $s$ when $\min({\mathsf{N}},{\mathsf{K}})\leq 4$ .

Corollary 1.

For a $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files, the loads in Theorem 1 are achievable under the constraint of uncoded cache placement when $\min({\mathsf{N}},{\mathsf{K}})\leq 4$ by the scheme in (8) with only the first delivery sub-phase.
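The corner points of Theorem 5 and their lower convex envelope can be tabulated numerically. The following is a minimal sketch, not the authors' code. It assumes that $c^{s}_{t}$ in (6), which is not reproduced in this section, equals the number of sub-phase-1 multicast messages times the normalized sub-block length, as suggested by Section IV-B and by the example in Section V-A. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def c(N, K, r, s, t):
    """Assumed form of c^s_t in (6), in units of the file size B."""
    steps = min(s, N - r + 1, K - t)
    num = sum(comb(N - j, r - 1) * comb(K - j, t) for j in range(1, steps + 1))
    return Fraction(num, comb(N - 1, r - 1) * comb(K, t))

def corner_points(N, K, r, s):
    """Pairs (M, R) = (N t / (K r), c^s_t) for t = 0, ..., K."""
    return [(Fraction(N * t, K * r), c(N, K, r, s, t)) for t in range(K + 1)]

def lower_convex_envelope(points):
    """Lower hull of the points (lower half of Andrew's monotone chain)."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop hull[-1] if it lies on or above the segment hull[-2] -> p.
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def load_at(M, hull):
    """Envelope value at memory size M (memory sharing = linear interpolation)."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= M <= x2:
            return y1 + (y2 - y1) * (M - x1) / (x2 - x1)
    raise ValueError("M outside the range of the corner points")
```

For instance, `load_at(Fraction(1, 4), lower_convex_envelope(corner_points(4, 4, 2, 4)))` interpolates between the corner points at $t=0$ and $t=1$ of the $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}})=(4,4,2)$ system.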

Remark 1 (Average and Worst-case Loads).

Our proposed caching scheme and our optimality results directly characterize the optimal worst-case load (and not just the average load), because the worst-case load is the case of demand type $s=\min\{{\mathsf{N}},{\mathsf{K}}\}$ . We note that past works only aimed to design schemes that minimize the worst-case load, such as those in [12] (for caching with correlated sources) and in [15, 16] (for caching with multiple requests). Order optimality results (to within factors $11$ and $18$ ) on the worst-case load were derived in [15, 16] for caching with multiple requests; to the best of our knowledge, no order optimality results are known specifically for caching with correlated sources. Therefore, a major contribution of this paper, besides sharpening existing results for the worst-case load, is the derivation of (exact or order) optimality results on the average load for any demand type and over all possible demands, under the constraint of uncoded cache placement. $\square$

III-F Extensions

Our scheme can be used in models other than the one considered in this paper. Examples are as follows.

Extension to the More General Coded Caching Problem with Correlated Sources

As already mentioned, in this paper we simplified the model in [12] by fixing the parameter $\ell$ in [12] to be equal to ${\mathsf{r}}$ (as opposed to letting it vary within a range). If $\ell$ is in a range as considered in [12], we can extend our results by constructing a caching scheme that “memory-shares” among the proposed schemes in Theorems 2, 3, and 5 as follows.

•

Library. Assume that the length of each block $W_{{\mathcal{S}}}$ , where ${\mathcal{S}}\subseteq[{\mathsf{N}}]$ and $|{\mathcal{S}}|=\ell$ , is ${\mathsf{p}}_{\ell}{\mathsf{B}}/\binom{{\mathsf{N}}}{\ell}$ , where ${\mathsf{p}}_{\ell}\in[0,1]$ and $\sum_{\ell\in[{\mathsf{N}}]}{\mathsf{p}}_{\ell}=1$ . The values $({\mathsf{p}}_{\ell}:\ell\in[{\mathsf{N}}])$ are assumed to be fixed system parameters.

•

Placement. Choose integers $t_{\ell}\in[{\mathsf{K}}]$ for $\ell\in[{\mathsf{N}}]$ . We partition block $W_{{\mathcal{S}}}$ into $\binom{{\mathsf{K}}}{t_{|{\mathcal{S}}|}}$ equal-length sub-blocks and denote $W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:{\mathcal{V}}\subseteq[{\mathsf{K}}],|{\mathcal{V}}|=t_{|{\mathcal{S}}|}\}$ . User $k\in[{\mathsf{K}}]$ caches sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ if $k\in{\mathcal{V}}$ , which requires a cache of size

[TABLE]

•

Delivery. For demand vector ${\mathbf{d}}$ , if $N_{\textup{e}}({\mathbf{d}})={\mathsf{K}}$ (i.e., each user demands a distinct file), we use the caching scheme in Theorem 2 and the achieved load is

[TABLE]

If $s=N_{\textup{e}}({\mathbf{d}})<{\mathsf{K}}$ we have two cases; if either $\ell\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ or $t_{\ell}\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ , we use the caching scheme in Theorem 5 to encode all blocks; otherwise, we use the caching scheme in Theorem 3; the achieved load is

[TABLE]

where $\mathbbm{1}$ is the indicator function, where $\mathbbm{1}_{\textrm{event}}=1$ if event is true and $\mathbbm{1}_{\textrm{event}}=0$ otherwise.

•

The achievable memory-load tradeoff is the lower convex envelope of the above points for all possible ${\mathbf{t}}:=(t_{\ell}:\ell\in[{\mathsf{N}}])$ .

Extension to the Coded Caching Problem with Multiple Requests

In this paper, differently from most of the existing work that divides the multi-request problem into a sequence of single-request problems, we can modify our proposed interference alignment scheme for the caching problem with correlated sources to address the caching problem with multiple requests. By doing so, we can give an optimal scheme for the four cases that were left open in [17] for the caching problem with multiple requests, where the setting includes up to four users and where each user demands at most two files. The details of how to modify our proposed interference alignment scheme (so as to account for the lack of symmetry of the multi-request problem) are given in Appendix G.

Extension to Distributed Computation

When ${\mathsf{N}}={\mathsf{K}}$ and each user demands a distinct file, the $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})$ shared-link caching problem with correlated files is related to the distributed computation problem in [24]. The only difference is that in [24] the link is D2D (i.e., workers/users communicate with each other without a central master/server), as opposed to the shared-link case considered here. In [24], the authors proposed an optimal scheme that requires exchanging messages whose symbols are from a large finite field. In contrast, for the shared-link caching problem, the optimal scheme proposed in this paper uses simpler operations, namely operations over the binary field.

III-G Numerical Evaluations

In Figs. 1 and 2 we plot the load vs the cache size for demand types $\mathcal{D}_{20}$ (left subfigure) and $\mathcal{D}_{10}$ (right subfigure) for the $({\mathsf{N}},{\mathsf{K}})=(20,40)$ shared-link caching problem with correlated files.

In Fig. 1 we consider the case ${\mathsf{r}}=2$ , for which our proposed scheme (with only the first delivery sub-phase) is optimal as stated in Theorem 5. We also compare the average loads achieved by our scheme in Theorem 5 with that of the suboptimal scheme in [12].

In Fig. 2 we consider the case ${\mathsf{r}}=3$ . When $t={\mathsf{K}}{\mathsf{M}}{\mathsf{r}}/{\mathsf{N}}\in\{0,1,2,39,40\}$ , only the first sub-phase of the delivery scheme is necessary and the resulting load is optimal as stated in Theorem 5. For other values of the parameter $t$ , we use both sub-phases of the delivery scheme as in Theorem 3. Fig. 2 shows that our proposed achievable scheme outperforms the scheme in [12].

In Fig. 3 we plot the average load over all possible demands for the $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}})=(30,30,5)$ shared-link caching problem with correlated files, obtained by Monte-Carlo simulation with $10^{6}$ iterations. In each iteration, the demand of each user is generated independently and uniformly at random over the library. When $t={\mathsf{K}}{\mathsf{M}}{\mathsf{r}}/{\mathsf{N}}\in\{0,1,2,29,30\}$ (i.e., $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ as in Theorem 5.Case 2) or $N_{\textup{e}}({\mathbf{d}})\leq 4$ , only the first sub-phase of the delivery scheme is necessary and the resulting load is optimal as stated in Theorem 5. For other values of the parameters $t$ and $N_{\textup{e}}({\mathbf{d}})$ , we use both sub-phases of the delivery scheme as in Theorem 3. Fig. 3 shows that our proposed achievable scheme outperforms the scheme in [12].
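The Monte-Carlo averaging over demands can be sketched as follows. This is only an illustration under stated assumptions: it estimates $\mathbb{E}_{{\mathbf{d}}}[c^{N_{\textup{e}}({\mathbf{d}})}_{t}]$ at a single corner point, using the same assumed closed form for $c^{s}_{t}$ in (6) as suggested by Section IV-B. It does not include the sub-phase-2 term $e^{s}_{t}$ of Theorem 3, so it reproduces the achieved load only where Theorem 5 applies. The function names are hypothetical.

```python
import random
from fractions import Fraction
from math import comb

def c(N, K, r, s, t):
    """Assumed form of c^s_t in (6), in units of the file size B."""
    steps = min(s, N - r + 1, K - t)
    num = sum(comb(N - j, r - 1) * comb(K - j, t) for j in range(1, steps + 1))
    return Fraction(num, comb(N - 1, r - 1) * comb(K, t))

def average_corner_load(N, K, r, t, iterations=10_000, seed=0):
    """Monte-Carlo estimate of E_d[c^{N_e(d)}_t] for i.i.d. uniform demands."""
    rng = random.Random(seed)
    by_type = {}                      # cache of c^s_t per demand type s
    total = Fraction(0)
    for _ in range(iterations):
        demand = [rng.randrange(N) for _ in range(K)]
        s = len(set(demand))          # number of distinct demanded files N_e(d)
        if s not in by_type:
            by_type[s] = c(N, K, r, s, t)
        total += by_type[s]
    return total / iterations
```

The estimate always lies between $c^{1}_{t}$ and $c^{\min\{{\mathsf{N}},{\mathsf{K}}\}}_{t}$, since the assumed $c^{s}_{t}$ is non-decreasing in $s$.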

IV Converse Bound

IV-A Proof of Theorem 1

The delivery phase with uncoded cache placement is equivalent to a multicast index coding problem [25]. Such a problem can be represented on a directed graph. In this graph, each sub-block demanded but not cached by a user is a node; a directed edge exists from node $a$ to node $b$ if the user demanding the sub-block represented by node $b$ has the sub-block represented by node $a$ in its cache. As in [23], we use the acyclic index coding converse bound from [22] to lower bound the number of transmitted bits needed to satisfy all the nodes/users in this index coding problem as follows.

For a demand vector ${\mathbf{d}}$ , with $N_{\textup{e}}({\mathbf{d}})$ distinct demands, we draw a graph where each sub-block demanded but not cached by some of these $N_{\textup{e}}({\mathbf{d}})$ users is a node. We then consider a permutation of these $N_{\textup{e}}({\mathbf{d}})$ users, denoted by $\mathbf{u}=(u_{1},u_{2},...,u_{N_{\textup{e}}({\mathbf{d}})})$ . The set of sub-blocks

[TABLE]

does not contain a directed cycle. This can be seen as follows, similarly to [2, Lemma 1]. We classify the sub-blocks/nodes in the set (15) into levels. More precisely, we say that sub-block/node $W_{{\mathcal{S}},{\mathcal{V}}}$ is in level $i$ if ${\mathcal{S}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{i-1}}\}$ , $d_{u_{i}}\in{\mathcal{S}}$ and ${\mathcal{V}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{i}\}$ . It is easy to see each node in level $i$ is a sub-block that is demanded but not-cached by user $u_{i}$ that did not appear in any of the levels with lower index, and corresponds to a user in the index coding problem that has the same side information as user $u_{i}$ in our caching problem (i.e., each node in level $i$ only knows the nodes $W_{{\mathcal{S}},{\mathcal{V}}}$ where $u_{i}\in{\mathcal{V}}$ ). So each node in level $i$ knows neither the nodes in the same level, nor the nodes in the higher levels. As a result, the proposed set in (15) does not contain a directed cycle.
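The level argument above can be checked by brute force on small instances. The sketch below is an illustration only: it builds the levels exactly as described in the text, restricted (as an assumption made for concreteness) to the symmetric placement with $|{\mathcal{V}}|=t$, and verifies that user $u_{i}$ caches no node of level $i$ or higher, so that every edge points from a lower level to a strictly higher one. Function names are hypothetical.

```python
from itertools import combinations
from math import comb

def acyclic_levels(N, K, r, t, demand, leaders):
    """Level i holds the nodes (S, V) with d_{u_i} in S,
    S avoiding d_{u_1..u_{i-1}}, and V avoiding u_1..u_i.
    demand maps user -> file (1-indexed); leaders is the permutation u."""
    levels = []
    for i, u in enumerate(leaders, start=1):
        earlier_files = {demand[v] for v in leaders[:i - 1]}
        free_files = [f for f in range(1, N + 1)
                      if f not in earlier_files and f != demand[u]]
        free_users = [k for k in range(1, K + 1) if k not in leaders[:i]]
        levels.append([(frozenset(B) | {demand[u]}, frozenset(V))
                       for B in combinations(free_files, r - 1)
                       for V in combinations(free_users, t)])
    return levels

def has_no_backward_edge(levels, leaders):
    """True if user u_i caches no node in level i or any higher level."""
    return all(leaders[i] not in V
               for i in range(len(levels))
               for level in levels[i:]
               for (_, V) in level)
```

For the demand vector $(1,2,3,4)$ with $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},t)=(4,4,2,1)$, the level sizes are $9,4,1,0$, i.e., $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ for level $j$.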

By the acyclic index coding converse bound, the number of transmitted bits is not less than the total number of bits of the sub-blocks in the set in (15), that is,

[TABLE]

where $|W_{{\mathcal{S}},{\mathcal{V}}}|$ represents the length of $W_{{\mathcal{S}},{\mathcal{V}}}$ in bits.

[TABLE]

As in [3], we can lower bound (17e) by using Jensen’s inequality and the monotonicity of $\textrm{Conv}(c^{s}_{t})$ (i.e., the convex lower envelope of $c^{s}_{t}$ in terms of $t$ ),

[TABLE]

By considering all the demand types, and from (18), we also have

[TABLE]

Since $c^{s}_{t}$ is convex in $t$ , we can change the order of the expectation and the “Conv” in (19). This proves the converse bound in Theorem 1.

Notice that we could also use Fourier-Motzkin elimination to eliminate the parameters $\{x_{t}\}_{t\in[0:{\mathsf{K}}]}$ in (17e) and derive the bound in (18), as done in [23].

IV-B Discussion

We conclude this section with some observations on the proposed converse bound, which we shall use as a guideline to design our achievable schemes.

The corner points of our converse bound are of the form $\left(\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}},c^{s}_{t}\right)$ , where $c^{s}_{t}$ is defined in (6), which may suggest the following placement. We partition each block $W_{{\mathcal{S}}}$ into $\binom{{\mathsf{K}}}{t}$ equal-length sub-blocks of length $\frac{{\mathsf{B}}}{\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}}$ and indicate $W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:{\mathcal{V}}\subseteq[{\mathsf{K}}],|{\mathcal{V}}|=t\}$ . Each user $k\in[{\mathsf{K}}]$ stores the sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ if $k\in{\mathcal{V}}$ . Hence, user $k\in[{\mathsf{K}}]$ caches $\frac{{\mathsf{B}}\binom{{\mathsf{K}}-1}{t-1}\binom{{\mathsf{N}}}{{\mathsf{r}}}}{\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}}=\frac{{\mathsf{B}}{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}}$ bits in total.

We will use this interpretation to design the file partitioning and the cache placement of our proposed caching schemes, which is the same as in [12].

If the above placement is used, each sub-block is cached by $t$ users. In the proof of Theorem 1, for each demand ${\mathbf{d}}$ , we choose a set of leader users (each demanding a different file) and consider a permutation $\mathbf{u}=(u_{1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})})$ of these $N_{\textup{e}}({\mathbf{d}})$ leader users. For the permutation $\mathbf{u}$ , we find an acyclic set of $\sum_{j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]}\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ sub-blocks, and lower bound the load by the total length of these sub-blocks. In addition, in this acyclic set of sub-blocks, there are $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ sub-blocks desired by user $u_{j}$ where $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ ; these sub-blocks are neither cached nor desired by any user $u_{j_{1}}$ where $j_{1}<j$ . This may suggest a delivery scheme with $\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ steps, where in Step $j$ we transmit $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ linear combinations such that each linear combination contains one of the $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ sub-blocks desired by user $u_{j}$ , and thus at the end of this step user $u_{j}$ is satisfied.

We will use this interpretation to design the first sub-phase of our general delivery scheme in (8), which we shall introduce next in Section V.

V Achievable Scheme

In this section, we analyze the achievable scheme in (8) and prove the statements of Theorems 2, 3 and 5. Notice that when ${\mathsf{r}}\in\{1,{\mathsf{N}}\}$ , the considered problem is equivalent to the MAN problem (solved under the constraint of uncoded cache placement in [3]). Hence, the novelty of our schemes is for ${\mathsf{r}}\in[2:{\mathsf{N}}-1]$ . The scheme we propose was summarized in (8); Theorems 2 and 5 only use the first sub-phase of the delivery, while Theorem 3 uses both sub-phases.

The rest of this section is organized as follows. In Section V-A we give an example of the first sub-phase of the proposed delivery scheme in (8); the objective is to highlight how the multicast messages sent in sub-phase 1 enable all leaders to decode their desired file. Then in Section V-B we show which user can decode which sub-block after receiving the multicast messages in sub-phase 1, regardless of the demand type. In Section V-C we show that every user can decode its desired message by also receiving the multicast messages sent in sub-phase 2. In Section V-D we give an example of the second sub-phase of the proposed delivery scheme in (8). In Section V-E we prove the order optimality result in Theorem 4 for the general case. Finally, in Section V-F we prove the exact optimality results in Theorem 5 by observing that each non-leader can re-construct the packets of sub-phase 2 by performing linear combinations of the packets received in sub-phase 1.

V-A An example of (8) with only sub-phase 1 for the delivery scheme

First, we study an example where ${\mathsf{N}}\geq{\mathsf{K}}$ and where each user demands a distinct file (i.e., $s={\mathsf{K}}$ ). In particular, we consider the $({\mathsf{N}},{\mathsf{K}},{\mathsf{r}},{\mathsf{M}})=(4,4,2,1/2)$ shared-link caching problem with correlated files. There are $\binom{{\mathsf{N}}}{{\mathsf{r}}}=6$ blocks denoted as $W_{\{1,2\}}$ , $W_{\{1,3\}}$ , $W_{\{1,4\}}$ , $W_{\{2,3\}}$ , $W_{\{2,4\}}$ , and $W_{\{3,4\}}$ . The files are

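$F_{1}=\{W_{\{1,2\}},W_{\{1,3\}},W_{\{1,4\}}\}$ ,

$F_{2}=\{W_{\{1,2\}},W_{\{2,3\}},W_{\{2,4\}}\}$ ,

$F_{3}=\{W_{\{1,3\}},W_{\{2,3\}},W_{\{3,4\}}\}$ ,

$F_{4}=\{W_{\{1,4\}},W_{\{2,4\}},W_{\{3,4\}}\}$ .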

Block Subdivision

Here $t=\frac{{\mathsf{K}}{\mathsf{M}}{\mathsf{r}}}{{\mathsf{N}}}=1$ . We partition each block into $\binom{{\mathsf{K}}}{t}=4$ equal-length sub-blocks and denote $W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:{\mathcal{V}}\subseteq[{\mathsf{K}}],|{\mathcal{V}}|=t=1\}=\{W_{{\mathcal{S}},\{k\}}:k\in[{\mathsf{K}}]\}$ . Hence, each sub-block contains ${\mathsf{B}}/\Big{(}\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}\Big{)}={\mathsf{B}}/12$ bits.

Placement Phase

The cache placement is inspired by the converse bound (see discussion in Section IV-B). User $k\in[{\mathsf{K}}]$ caches $W_{{\mathcal{S}},{\mathcal{V}}}$ if $k\in{\mathcal{V}}$ , that is, $Z_{k}=\{W_{{\mathcal{S}},\{k\}},\forall{\mathcal{S}}\subseteq[{\mathsf{N}}]:|{\mathcal{S}}|={\mathsf{r}}=2\}$ .
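The block subdivision and placement can be enumerated explicitly to confirm the cache size. The following is a minimal sketch (function names are hypothetical) that lists the sub-blocks $W_{{\mathcal{S}},{\mathcal{V}}}$ stored by each user and measures the cache in units of the file size ${\mathsf{B}}$, using the sub-block length ${\mathsf{B}}/\big(\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}\binom{{\mathsf{K}}}{t}\big)$ stated above.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def placement(N, K, r, t):
    """Return {k: set of (S, V) cached by user k}, with k in V."""
    blocks = [frozenset(S) for S in combinations(range(1, N + 1), r)]
    user_sets = [frozenset(V) for V in combinations(range(1, K + 1), t)]
    return {k: {(S, V) for S in blocks for V in user_sets if k in V}
            for k in range(1, K + 1)}

def cache_size_in_files(N, K, r, t):
    """Cache size per user, in units of B, under the placement above."""
    sub_block_len = Fraction(1, comb(N - 1, r - 1) * comb(K, t))
    sizes = {len(Z) * sub_block_len for Z in placement(N, K, r, t).values()}
    if len(sizes) != 1:
        raise AssertionError("placement is expected to be symmetric across users")
    return sizes.pop()
```

In the example, each user stores $6$ sub-blocks of ${\mathsf{B}}/12$ bits each, which equals ${\mathsf{M}}{\mathsf{B}}={\mathsf{B}}/2$ bits.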

Delivery Phase

Assume ${\mathbf{d}}=(1,2,3,4)$ , which has $N_{\textup{e}}({\mathbf{d}})=4$ distinct demanded files. Pick one user demanding a distinct file, and refer to it as the leader among those users demanding the same file. Since each user has a distinct request in this example, each user is a leader, and the leader set is $[4]$ . Consider a permutation ${\mathsf{u}}$ of the leaders, say ${\mathsf{u}}=(1,2,3,4)$ .

Our proposed first sub-phase of the general delivery scheme contains $\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}=3$ steps; after the $j^{\textrm{th}}$ step, the $j^{\textrm{th}}$ element/leader in the permutation can decode its desired file; after finishing all steps, the remaining leaders can also decode their desired file. We next describe, one by one, the three steps in the delivery phase for this example, where in each step we send multicast messages of the type

[TABLE]

where ${\mathcal{N}}({\mathcal{J}})$ is the set of demanded files by the users in ${\mathcal{J}}$ . In plain words, the multicast message $C_{{\mathcal{T}},{\mathcal{H}}}$ in (20) is the binary sum of each sub-block desired by one user in ${\mathcal{J}}$ and known by all the other users in ${\mathcal{J}}$ . Note that, when ${\mathsf{r}}=1$ (in which case our model reduces to the MAN system in [1]), $C_{{\mathcal{T}},\emptyset}$ in (20) is equivalent to the MAN multicast message

[TABLE]

Delivery Sub-Phase 1.Step $1$ . In this step we aim to satisfy leader user $u_{1}=1$ , who misses three sub-blocks of each of the three blocks that make up the first file, that is, user $1$ must recover nine sub-blocks. Each time we consider one set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]$ where $|{\mathcal{J}}|=t+1=2$ and $u_{1}\in{\mathcal{J}}$ (recall that $u_{1}=1$ ), and one set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}}\}$ (recall that $d_{u_{1}}=1$ ) where $|{\mathcal{B}}|={\mathsf{r}}-1=1$ .

[TABLE]

From (22) and its cached content, user $u_{1}=1$ can recover $W_{\{1,2\}}$ , $W_{\{1,3\}}$ , and $W_{\{1,4\}}$ . User $u_{1}=1$ is satisfied after this first step (i.e., it has recovered the missing nine sub-blocks from the nine received multicast messages in the first step).

Let us then focus on user $u_{2}=2$ . User $2$ can directly recover $W_{\{1,2\},\{1\}}$ from (22a), $W_{\{2,3\},\{1\}}$ from (22b), $W_{\{2,4\},\{1\}}$ from (22c). Since user $2$ has recovered $W_{\{2,3\},\{1\}}$ , it then can recover $W_{\{1,2\},\{3\}}$ from (22d). Since user $2$ has recovered $W_{\{2,4\},\{1\}}$ , it then can recover $W_{\{1,2\},\{4\}}$ from (22g). In conclusion, after Step $1$ , user $2$ can recover $W_{\{1,2\}}$ and also recover $W_{\{2,3\},\{1\}}$ and $W_{\{2,4\},\{1\}}$ . User $u_{2}=2$ after this first step still misses four sub-blocks, namely $\{W_{\{2,3\},\{k\}},W_{\{2,4\},\{k\}}:k\in[3:4]\}$ .

Similarly to user $u_{2}=2$ , each user $k\in\{2,3,4\}$ can recover $W_{{\mathcal{S}}}$ where $\{d_{u_{1}},d_{k}\}\subseteq{\mathcal{S}}$ , and can also recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ and $u_{1}\in{\mathcal{V}}_{1}$ , after Step $1$ . Each of these users still misses four sub-blocks.

Delivery Sub-Phase 1.Step $2$ . In this step we aim to satisfy leader user $u_{2}=2$ . Each time we consider one set of users ${\mathcal{J}}\subseteq([{\mathsf{K}}]\setminus\{u_{1}\})$ where $|{\mathcal{J}}|=t+1$ and $u_{2}\in{\mathcal{J}}$ , and one set of files ${\mathcal{B}}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},d_{u_{2}}\})$ where $|{\mathcal{B}}|={\mathsf{r}}-1=1$ (recall that $u_{1}=d_{u_{1}}=1,u_{2}=d_{u_{2}}=2$ ).

[TABLE]

From (23) user $u_{2}=2$ can recover the desired sub-blocks that were not recovered from Step $1$ . User $u_{2}=2$ is satisfied after this second step (i.e., it has recovered the missing four sub-blocks from the four received multicast messages in the second step).

Let us then focus on user $u_{3}=3$ . User $3$ can directly recover $W_{\{2,3\},\{2\}}$ from (23a) and $W_{\{3,4\},\{2\}}$ from (23b). Since user $3$ has recovered $W_{\{3,4\},\{2\}}$ , it then can recover $W_{\{2,3\},\{4\}}$ from (23c). User $u_{3}=3$ after this second step still misses $W_{\{3,4\},\{4\}}$ .

Similarly to user $u_{3}=3$ , at the end of Step $2$ , each user $k\in\{3,4\}$ can recover $W_{{\mathcal{S}}}$ where $d_{k}\in{\mathcal{S}}$ , $\{d_{u_{1}},d_{u_{2}}\}\cap{\mathcal{S}}\neq\emptyset$ , and also recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ and $\{u_{1},u_{2}\}\cap{\mathcal{V}}_{1}\neq\emptyset$ . Each of these users still misses one sub-block.
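The bookkeeping of which sub-blocks a not-yet-served user still misses after Step $j$ can be reproduced mechanically from the recovery rules just stated. The sketch below is an illustration under those rules only (it does not simulate the multicast messages themselves, and its function name is hypothetical): a desired sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ is counted as known if it is cached, if ${\mathcal{S}}$ contains a file demanded by one of $u_{1},\ldots,u_{j}$, or if ${\mathcal{V}}$ contains one of $u_{1},\ldots,u_{j}$.

```python
from itertools import combinations

def missing_after_step(N, K, r, t, demand, leaders, k, j):
    """Sub-blocks (S, V) desired by user k and still unknown after Step j.
    Intended for users whose leader appears after position j in `leaders`."""
    served_users = set(leaders[:j])
    served_files = {demand[u] for u in served_users}
    missing = []
    for S in combinations(range(1, N + 1), r):
        if demand[k] not in S:
            continue                          # block not part of user k's file
        for V in combinations(range(1, K + 1), t):
            cached = k in V
            whole_block_known = bool(served_files & set(S))
            via_served_user = bool(served_users & set(V))
            if not (cached or whole_block_known or via_served_user):
                missing.append((S, V))
    return missing
```

With `demand = {1: 1, 2: 2, 3: 3, 4: 4}` and `leaders = [1, 2, 3, 4]`, this returns four sub-blocks for each of users $2,3,4$ after Step $1$, and the single sub-blocks $W_{\{3,4\},\{4\}}$ and $W_{\{3,4\},\{3\}}$ for users $3$ and $4$, respectively, after Step $2$.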

Delivery Sub-Phase 1.Step $3$ . In this step we aim to satisfy leader user $u_{3}=3$ . Each time we consider one set of users ${\mathcal{J}}\subseteq([{\mathsf{K}}]\setminus\{u_{1},u_{2}\})$ where $|{\mathcal{J}}|=t+1$ and $u_{3}\in{\mathcal{J}}$ , and one set of files ${\mathcal{B}}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},d_{u_{2}},d_{u_{3}}\})$ where $|{\mathcal{B}}|={\mathsf{r}}-1=1$ (recall that $u_{1}=d_{u_{1}}=1,u_{2}=d_{u_{2}}=2,u_{3}=d_{u_{3}}=3$ ). Hence, at this point there is one possibility, ${\mathcal{J}}=\{3,4\}$ and ${\mathcal{B}}=\{4\}$ , for which we transmit

[TABLE]

From (24), user $3$ can recover $W_{\{3,4\},\{4\}}$ , and user $4$ can recover $W_{\{3,4\},\{3\}}$ . Hence, at the end of this third step, users $3$ and $4$ are satisfied (i.e., they recovered the missing sub-block from the received multicast message in the third step).

Performance

Based on the above placement and delivery scheme, all users are able to decode their desired blocks. We sent $\sum_{j}\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}=14$ linear combinations, each of length ${\mathsf{B}}/12$ bits. So the load is $7/6$ , which coincides with the converse bound in Theorem 1 for $s=4$ .

Comparison with state-of-the-art ‘round-division’ schemes

Let us then consider the round-division methods in [12, 15, 16, 17]. It is obvious that if there exists some sub-block appearing in different rounds, a round-division strategy that treats each round as an independent MAN caching problem with a single request may miss some multicast opportunities. Here we show that a round-division strategy is sub-optimal even if we can divide users’ demands into multiple rounds such that there does not exist any sub-block appearing in different rounds. More precisely, since each user demands $3$ blocks, we can divide the delivery into the following three rounds:

•

Round 1: In the first round, users $1$ and $2$ demand $W_{\{1,2\}}$ , and users $3$ and $4$ demand $W_{\{3,4\}}$ . This is equivalent to the MAN caching problem with $4$ users and $2$ files. By using the optimal caching scheme under the constraint of uncoded cache placement in [3], we need to transmit $\binom{{\mathsf{K}}}{t+1}-\binom{{\mathsf{K}}-2}{t+1}=\binom{4}{2}-\binom{2}{2}=5$ linear combinations, each of which contains ${\mathsf{B}}/12$ bits, in order to satisfy these requests.

•

Round 2: In the second round, users $1$ and $3$ demand $W_{\{1,3\}}$ , and users $2$ and $4$ demand $W_{\{2,4\}}$ . By using the caching scheme in [3], we need to transmit $5$ linear combinations to satisfy these requests.

•

Round 3: In the third round, users $1$ and $4$ demand $W_{\{1,4\}}$ , and users $2$ and $3$ demand $W_{\{2,3\}}$ . By using the caching scheme in [3], we need to transmit $5$ linear combinations to satisfy these requests.

Hence, by this round-division strategy, the load is $15/12>7/6$ , which is strictly sub-optimal. In conclusion, in order to achieve optimality in this example, we need to jointly serve users’ demands (as proposed in this paper) in order to fully leverage all multicast opportunities.
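The comparison can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and uses hypothetical function names: the joint scheme sends $\sum_{j}\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ messages, while each round is charged the message count $\binom{{\mathsf{K}}}{t+1}-\binom{{\mathsf{K}}-n}{t+1}$ of the scheme in [3] for $n$ distinct requested files per round ($n=2$ in every round here).

```python
from fractions import Fraction
from math import comb

def joint_messages(N, K, r, t, s):
    """Number of multicast messages sent in sub-phase 1 for demand type s."""
    steps = min(s, N - r + 1, K - t)
    return sum(comb(N - j, r - 1) * comb(K - j, t) for j in range(1, steps + 1))

def round_messages(K, t, n_distinct):
    """Messages of the scheme in [3] for one round with n_distinct requested files."""
    return comb(K, t + 1) - comb(K - n_distinct, t + 1)

def example_loads():
    N, K, r, t = 4, 4, 2, 1
    sub_block_len = Fraction(1, comb(N - 1, r - 1) * comb(K, t))   # = 1/12 of B
    joint = joint_messages(N, K, r, t, s=4) * sub_block_len
    rounds = 3 * round_messages(K, t, n_distinct=2) * sub_block_len
    return joint, rounds
```

Calling `example_loads()` gives the two loads of this example, $7/6$ for the joint scheme and $15/12=5/4$ for the round-division strategy.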

V-B Proof of Theorem 2

Here we shall prove that after the first sub-phase of the delivery scheme in (8) every leader user is able to decode its desired file (as in the example in Section V-A), and that the load of the first sub-phase matches the load of the converse bound in (6). Thus, for the case where every user is a leader (i.e., every user demands a distinct file, as in the example in Section V-A), we have proved the exact optimality under the constraint of uncoded cache placement of the proposed achievable scheme as claimed in Theorem 2.

Decodability after delivery sub-phase 1

We need to establish which user can decode which sub-block at each step of delivery sub-phase 1. The following Lemma 1, which is proved by induction in Appendix A, describes the decoding procedure for delivery sub-phase 1 for general demands. Lemma 1 is the most technical (i.e., highly combinatorial) contribution in this paper.

Lemma 1 (Decoding after sub-phase 1).

In the first sub-phase of the proposed delivery scheme in (8) with leader set $\{u_{1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}$ , in Step $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , for each set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j-1}\}$ such that $|{\mathcal{J}}|=t+1$ and $u_{j}\in{\mathcal{J}}$ , and for each set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ where $|{\mathcal{B}}|={\mathsf{r}}-1$ , we transmit $C_{{\mathcal{J}},{\mathcal{B}}}$ as defined in (20). (Footnote 1: here the set in the first subscript of $C_{{\mathcal{J}},{\mathcal{B}}}$ is written in an equivalent but slightly different form compared to (8f).) Note that by construction (i.e., $|{\mathcal{B}}|={\mathsf{r}}-1$ and $d_{u_{j}}\notin{\mathcal{B}}$ ), $C_{{\mathcal{J}},{\mathcal{B}}}$ contains only one sub-block desired by user $u_{j}$ (which is $W_{\{d_{u_{j}}\}\cup{\mathcal{B}},{\mathcal{J}}\setminus\{u_{j}\}}$ ), while all other sub-blocks are cached by user $u_{j}$ .

Let $u_{g(i)}$ represent the leader user who demands file $i\in{\mathcal{L}}({\mathbf{d}})$ . At the end of the first delivery sub-phase, we have:

1. For a $({\mathcal{J}},{\mathcal{B}})$ , each user in ${\mathcal{J}}$ can recover all the sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ .

2. At the end of Step $j\in[\min\{g(d_{k}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , user $k\in[{\mathsf{K}}]$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ if $d_{k}\in{\mathcal{S}}_{1}$ and $\{u_{1},\ldots,u_{j}\}\cap{\mathcal{V}}_{1}\neq\emptyset$ .

3. At the end of Step $j\in[\min\{g(d_{q})-1,{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , user $q\in[{\mathsf{K}}]$ can recover $W_{{\mathcal{S}}}$ if $d_{q}\in{\mathcal{S}}$ and $\{d_{u_{1}},\ldots,d_{u_{j}}\}\cap{\mathcal{S}}\neq\emptyset$ .
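The index sets $({\mathcal{J}},{\mathcal{B}})$ used in Lemma 1 can be enumerated directly, which also confirms the per-step message count. The following is a minimal sketch with hypothetical function names. It enumerates only the indices of the multicast messages; the messages $C_{{\mathcal{J}},{\mathcal{B}}}$ themselves are defined in (20) and are not constructed here.

```python
from itertools import combinations
from math import comb

def subphase1_index_sets(N, K, r, t, demand, leaders):
    """Yield (j, J, B) for every multicast message C_{J,B} of sub-phase 1.
    demand maps user -> file (1-indexed); leaders is the permutation u."""
    steps = min(len(leaders), N - r + 1, K - t)
    for j in range(1, steps + 1):
        u_j = leaders[j - 1]
        other_users = [k for k in range(1, K + 1) if k not in leaders[:j]]
        served_files = {demand[u] for u in leaders[:j]}
        free_files = [f for f in range(1, N + 1) if f not in served_files]
        for rest in combinations(other_users, t):      # J = rest + {u_j}
            for B in combinations(free_files, r - 1):  # B avoids d_{u_1..u_j}
                yield j, frozenset(rest) | {u_j}, frozenset(B)

def messages_per_step(N, K, r, t, demand, leaders):
    """Count the multicast messages sent in each step."""
    counts = {}
    for j, _, _ in subphase1_index_sets(N, K, r, t, demand, leaders):
        counts[j] = counts.get(j, 0) + 1
    return counts
```

For the example of Section V-A the counts are $9$, $4$ and $1$ for Steps $1$, $2$ and $3$, matching $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$.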

Decodability for leader users after sub-phase 1

We use Lemma 1 to show that every leader user is able to recover its demanded file after delivery sub-phase 1. Indeed, for any system parameters, for leader user $u_{p}$ , where $p\in[N_{\textup{e}}({\mathbf{d}})]$ , we have:

•

Case $p\leq\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ .

From Lemma 1.Item 3 , user $u_{p}$ can recover $W_{{\mathcal{S}}}$ , where $d_{u_{p}}\in{\mathcal{S}}$ and $\{d_{u_{1}},\dots,d_{u_{p-1}}\}\cap{\mathcal{S}}\neq\emptyset$ , at the end of Step $p-1$ .

In addition, from Lemma 1.Item 2 , user $u_{p}$ can also recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ , where $d_{u_{p}}\in{\mathcal{S}}_{1}$ and $\{u_{1},\ldots,u_{p-1}\}\cap{\mathcal{V}}_{1}\neq\emptyset$ , at the end of Step $p-1$ .

Hence, user $u_{p}$ still needs to recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ , where $d_{u_{p}}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{p-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{u_{1},\ldots,u_{p}\}\cap{\mathcal{V}}_{2}=\emptyset$ , at the end of Step $p-1$ (recall that $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $u_{p}\in{\mathcal{V}}_{2}$ is cached by user $u_{p}$ ). Such a $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ appears in $C_{{\mathcal{V}}_{2}\cup\{u_{p}\},{\mathcal{S}}_{2}\setminus\{d_{u_{p}}\}},$ which is sent in Step $p$ . Hence, from Lemma 1.Item 1 , user $u_{p}$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ at the end of Step $p$ .

•

Case $N_{\textup{e}}({\mathbf{d}})>\min\{{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ and $\min\{{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}<p\leq N_{\textup{e}}({\mathbf{d}})$ .

We distinguish two cases:

–

If ${\mathsf{N}}-{\mathsf{r}}+1\leq{\mathsf{K}}-t$ , it can be seen that for each desired block of user $u_{p}$ (assumed to be $W_{{\mathcal{S}}}$ ), we have ${\mathcal{S}}\cap\{d_{u_{1}},\ldots,d_{u_{{\mathsf{N}}-{\mathsf{r}}+1}}\}\neq\emptyset$ , and thus from Lemma 1.Item 3 , user $u_{p}$ can recover $W_{{\mathcal{S}}}$ .

–

If ${\mathsf{N}}-{\mathsf{r}}+1>{\mathsf{K}}-t$ , it can be seen that for each desired sub-block of user $u_{p}$ (assumed to be $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ ), we have ${\mathcal{V}}_{1}\cap\{u_{1},\ldots,u_{{\mathsf{N}}-{\mathsf{r}}+1}\}\neq\emptyset$ , and thus from Lemma 1.Item 2 , user $u_{p}$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ .

This proves that each leader can recover its demanded file after sub-phase 1.

Load of sub-phase 1

This proposed sub-phase 1 of the delivery scheme contains $\binom{{\mathsf{N}}-j}{{\mathsf{r}}-1}\binom{{\mathsf{K}}-j}{t}$ multicast messages in Step $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ , which follows the intuition from the proof of our converse bound (see discussion in Section IV-B). Thus, by summing over all steps in sub-phase 1, we get that the load of this delivery sub-phase matches the load of the converse bound in (6).
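
As a numerical sanity check on the count above, the following minimal Python sketch sums the per-step message counts and normalizes by the file size. The normalization is an assumption made for illustration: each file is taken to consist of $\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}$ equal-size blocks, each split into $\binom{{\mathsf{K}}}{t}$ equal-size sub-blocks, and the load is measured in units of one file. The function name `subphase1_load` is hypothetical.

```python
from fractions import Fraction
from math import comb


def subphase1_load(N: int, K: int, t: int, r: int, s: int) -> Fraction:
    """Load of delivery sub-phase 1 in units of one file.

    Step j sends C(N-j, r-1) * C(K-j, t) multicast messages, each with the
    size of one sub-block. Assumption: a file has C(N-1, r-1) blocks, each
    split into C(K, t) equal sub-blocks.
    """
    steps = min(s, N - r + 1, K - t)
    messages = sum(comb(N - j, r - 1) * comb(K - j, t) for j in range(1, steps + 1))
    subblocks_per_file = comb(N - 1, r - 1) * comb(K, t)
    return Fraction(messages, subblocks_per_file)


# Parameters of the example in Section V-D: (N, K, t, r) = (5, 10, 3, 3), s = 5.
print(subphase1_load(5, 10, 3, 3, 5))  # 707/720
```

For these parameters the result agrees with the converse-bound value $707/720$ quoted in Section V-D.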

Optimality for the case of distinct demands

From the above reasoning, when all users are leaders, that is for the case ${\mathsf{N}}\geq{\mathsf{K}}$ and demand type $s={\mathsf{K}}$ , the claim of Theorem 2 is proved, i.e., every user is satisfied at the end of sub-phase 1, whose load matches the converse bound.

V-C Proof of Theorem 3

Here we shall prove that after the two sub-phases of the delivery scheme in (8) every user is able to decode its desired file. This requires showing that after the second sub-phase the demands of all non-leader users are satisfied. Sub-phase 2 of the delivery scheme in (8) is a form of interference alignment.

The block split and the cache placement phase are as described in (8). The delivery phase contains two sub-phases, where the first sub-phase is the same as in Section V-B, and the second sub-phase is such that each non-leader can align or cancel the non-demanded sub-blocks and eventually decode its demanded file. We specify next what each user can decode at the end of each step.

First Delivery Sub-Phase

In Step $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ of the first sub-phase, for each set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j-1}\}$ where $|{\mathcal{J}}|=t+1$ and $u_{j}\in{\mathcal{J}}$ , and each set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ where $|{\mathcal{B}}|={\mathsf{r}}-1$ , we transmit $C_{{\mathcal{J}},{\mathcal{B}}}$ as defined in (20). As shown in Section V-B, at the end of this sub-phase, each leader can recover its desired file.

In addition, from Lemma 1, recalling that $u_{g(i)}$ represents the leader user who demands file $i$ , each non-leader user $k\in[{\mathsf{K}}]\setminus{\mathcal{L}}({\mathbf{d}})$ can decode $W_{{\mathcal{S}}}$ , where $d_{k}\in{\mathcal{S}}$ and $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}\neq\emptyset$ , and can decode $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ and $\{u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{1}\neq\emptyset$ .

The non-leader users are thus not yet satisfied, so we proceed to send further multicast messages in sub-phase 2.

Second Delivery Sub-Phase

The second sub-phase also contains $\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ steps. In Step $j$ , we consider each integer $q\in[j+1:\min\{{\mathsf{N}}-{\mathsf{r}}+2,{\mathsf{K}}-t+1,N_{\textup{e}}({\mathbf{d}})\}]$ in turn. For each ${\mathcal{J}}^{\prime}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{q}\}$ where $|{\mathcal{J}}^{\prime}|=t-1$ and ${\mathcal{J}}^{\prime}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}\neq\emptyset$ , and each ${\mathcal{B}}^{\prime}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\}$ where $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2$ and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ , we transmit $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ as defined in (20). We describe next how each non-leader user can recover its demanded file by combining the multicast messages from both sub-phases. The decoding is rather involved, so we break down the key steps into lemmas that are proved in the Appendices.

In Step $j$ of the second sub-phase, the transmitted multicast message $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ by construction satisfies ${\mathcal{J}}^{\prime}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}\neq\emptyset$ ; however, non-leader user $k$ also needs the multicast messages for which ${\mathcal{J}}^{\prime}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}=\emptyset$ . It is proved in Appendix B that each user $k$ who demands $F_{d_{u_{j}}}$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ , where ${\mathcal{J}}^{\prime}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}=\emptyset$ , by using previously received multicast messages, as formalized in the next lemma.

Lemma 2.

In Step $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ of sub-phase 2, for any integer $q\in[j+1:\min\{{\mathsf{N}}-{\mathsf{r}}+2,{\mathsf{K}}-t+1,N_{\textup{e}}({\mathbf{d}})\}]$ , each ${\mathcal{J}}^{\prime}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}$ where $|{\mathcal{J}}^{\prime}|=t-1$ , and each ${\mathcal{B}}^{\prime}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\}$ where $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2$ and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ , user $k$ who demands $d_{u_{j}}$ can obtain $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ by making linear combinations of already received multicast messages.

The following Lemma 3, whose proof is in Appendix C, specifies some properties of the linear combinations $C_{{\mathcal{T}},{\mathcal{H}}}$ defined in (20).

Lemma 3 (Properties of function $C_{{\mathcal{T}},{\mathcal{H}}}$ defined in (20)).

For each ${\mathcal{J}}\subseteq[{\mathsf{K}}]$ where $|{\mathcal{J}}|=t+1$ , and each ${\mathcal{B}}\subseteq[{\mathsf{N}}]$ where $|{\mathcal{B}}|={\mathsf{r}}-1$ , we have

[TABLE]

for any $i\in{\mathcal{B}}$ where $u_{g(i)}\notin{\mathcal{J}}$ . In addition, for each ${\mathcal{J}}_{1}\subseteq[{\mathsf{K}}]$ where $|{\mathcal{J}}_{1}|=t+1$ , and each ${\mathcal{B}}_{1}\subseteq[{\mathsf{N}}]$ where $|{\mathcal{B}}_{1}|={\mathsf{r}}-1$ and ${\mathcal{N}}({\mathcal{J}}_{1})\cap{\mathcal{B}}_{1}\neq\emptyset$ , we have

[TABLE]

for any $i_{1}\in{\mathcal{N}}({\mathcal{J}}_{1})\cap{\mathcal{B}}_{1}$ .

From Lemma 3, we prove the following Lemma 4 (whose proof is in Appendix D), which is the key result for our interference alignment based delivery scheme. Recall that ${\mathcal{L}}({\mathbf{d}})$ denotes the set of leader users.

Lemma 4 (Interference alignment lemma).

For each $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ and each $i\in\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , any non-leader $k\in[{\mathsf{K}}]\setminus{\mathcal{L}}({\mathbf{d}})$ can reconstruct $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i\}}$ where ${\mathcal{J}}_{2}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j}\}$ , $|{\mathcal{J}}_{2}|=t$ , ${\mathcal{B}}_{2}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , $|{\mathcal{B}}_{2}|={\mathsf{r}}-2$ , and ${\mathcal{N}}({\mathcal{J}}_{2}\cap{\mathcal{L}})\setminus{\mathcal{B}}_{2}\neq\emptyset$ .

Lemma 4 can be understood as follows. After the first sub-phase, the remaining sub-blocks to be decoded for each non-leader $k\in[{\mathsf{K}}]\setminus{\mathcal{L}}({\mathbf{d}})$ are $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ . In Step $g(d_{k})$ of the first sub-phase, every transmitted message $C_{{\mathcal{J}},{\mathcal{B}}}$ satisfies $d_{u_{g(d_{k})}}\notin{\mathcal{B}}$ by construction. Using Lemma 4, we show that user $k$ can also reconstruct $C_{{\mathcal{J}}^{\prime},{\mathcal{B}}^{\prime}}$ where $d_{u_{g(d_{k})}}\in{\mathcal{B}}^{\prime}$ . Since $d_{u_{g(d_{k})}}\in{\mathcal{B}}^{\prime}$ , each sub-block in $C_{{\mathcal{J}}^{\prime},{\mathcal{B}}^{\prime}}$ is either desired or cached by user $k$ , who demands $F_{d_{k}}$ . In other words, in order to reconstruct $C_{{\mathcal{J}}^{\prime},{\mathcal{B}}^{\prime}}$ , we align or cancel the interference at user $k$ . By induction, all sub-blocks except one in $C_{{\mathcal{J}}^{\prime},{\mathcal{B}}^{\prime}}$ have already been recovered or cached by user $k$ , so that it can recover the remaining sub-block. The details of the decodability proof are presented in Appendix E. An example of how the interference alignment scheme works is given in Section V-D.

Performance

As we showed in Section V-B, in the first sub-phase we transmit $c^{s}_{t}$ bits, with $s=N_{\textup{e}}({\mathbf{d}})$ . In Step $j\in[\min\{s,{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}],\ s=N_{\textup{e}}({\mathbf{d}}),$ of the second sub-phase, the number of transmitted bits is

[TABLE]

Hence, by summing the number of transmitted bits in each step of sub-phase 2 and the number of transmitted bits in sub-phase 1, the load equals $e^{s}_{t}+c^{s}_{t}$ as defined in (6) and (10), with $s=N_{\textup{e}}({\mathbf{d}})$ .

This concludes the proof of Theorem 3.

V-D An example of sub-phase 2 in (8)

We will use the following example to illustrate our interference alignment scheme.

Consider an $({\mathsf{N}},{\mathsf{K}},{\mathsf{M}},{\mathsf{r}})=(5,10,1/2,3)$ shared-link caching problem with correlated files. There are $\binom{{\mathsf{N}}}{{\mathsf{r}}}=10$ blocks, $W_{{\mathcal{S}}}$ where ${\mathcal{S}}\subseteq[5]$ and $|{\mathcal{S}}|={\mathsf{r}}=3$ . The files are

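$F_{1}=\{W_{\{1,2,3\}},W_{\{1,2,4\}},W_{\{1,2,5\}},W_{\{1,3,4\}},W_{\{1,3,5\}},W_{\{1,4,5\}}\},$

$F_{2}=\{W_{\{1,2,3\}},W_{\{1,2,4\}},W_{\{1,2,5\}},W_{\{2,3,4\}},W_{\{2,3,5\}},W_{\{2,4,5\}}\},$

$F_{3}=\{W_{\{1,2,3\}},W_{\{1,3,4\}},W_{\{1,3,5\}},W_{\{2,3,4\}},W_{\{2,3,5\}},W_{\{3,4,5\}}\},$

$F_{4}=\{W_{\{1,2,4\}},W_{\{1,3,4\}},W_{\{1,4,5\}},W_{\{2,3,4\}},W_{\{2,4,5\}},W_{\{3,4,5\}}\},$

$F_{5}=\{W_{\{1,2,5\}},W_{\{1,3,5\}},W_{\{1,4,5\}},W_{\{2,3,5\}},W_{\{2,4,5\}},W_{\{3,4,5\}}\}.$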
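
The library structure of this example can be enumerated mechanically from the definition $F_{i}=\{W_{{\mathcal{S}}}:{\mathcal{S}}\subseteq[{\mathsf{N}}],|{\mathcal{S}}|={\mathsf{r}},i\in{\mathcal{S}}\}$ . The short Python sketch below is only an illustration; the variable names `blocks` and `files` are ours.

```python
from itertools import combinations

N, r = 5, 3

# One block W_S per r-subset S of the file indices [N] = {1, ..., N}.
blocks = [frozenset(S) for S in combinations(range(1, N + 1), r)]

# File F_i collects every block whose index set contains i.
files = {i: [S for S in blocks if i in S] for i in range(1, N + 1)}

for i, content in files.items():
    print(f"F_{i}:", [sorted(S) for S in content])
```

Each file contains $\binom{{\mathsf{N}}-1}{{\mathsf{r}}-1}=6$ blocks, and each block is common to exactly ${\mathsf{r}}=3$ files.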

Placement Phase

Here $t=\frac{{\mathsf{K}}{\mathsf{M}}{\mathsf{r}}}{{\mathsf{N}}}=3$ . We partition each block into $\binom{{\mathsf{K}}}{t}=120$ equal-length sub-blocks and denote $W_{{\mathcal{S}}}=\{W_{{\mathcal{S}},{\mathcal{V}}}:{\mathcal{V}}\subseteq[{\mathsf{K}}],|{\mathcal{V}}|=t=3\}$ . Each user $k\in[{\mathsf{K}}]$ caches $W_{{\mathcal{S}},{\mathcal{V}}}$ if $k\in{\mathcal{V}}$ .
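
The placement above can be checked numerically. The following Python sketch, written for this example only, recomputes $t$ , counts the sub-blocks per block, and verifies that a user's cache holds exactly ${\mathsf{M}}=1/2$ of a file. It assumes equal-size sub-blocks; the helper name `cache` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N, K, r = 5, 10, 3
M = Fraction(1, 2)

t = M * K * r / N  # t = KMr/N
assert t.denominator == 1
t = int(t)

blocks = list(combinations(range(1, N + 1), r))
subsets = list(combinations(range(1, K + 1), t))  # index sets V of the sub-blocks


def cache(k):
    """Sub-blocks (S, V) stored by user k: all those with k in V."""
    return [(S, V) for S in blocks for V in subsets if k in V]


# Cache occupancy in units of one file (assumes equal-size sub-blocks).
subblocks_per_file = comb(N - 1, r - 1) * comb(K, t)
occupancy = Fraction(len(cache(1)), subblocks_per_file)
print(t, len(subsets), occupancy)  # 3 120 1/2
```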

Delivery Phase

Assume ${\mathbf{d}}=(1,2,3,4,5,1,2,3,4,5)$ , which has $N_{\textup{e}}({\mathbf{d}})=5$ distinct demanded files. We choose as leaders the users in ${\mathbf{u}}=(1,2,3,4,5)$ .

First delivery sub-phase

In Step $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]=[3]$ of the first sub-phase, for each set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus[j-1]$ where $|{\mathcal{J}}|=t+1=4$ and $j\in{\mathcal{J}}$ , and each set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus[j]$ where $|{\mathcal{B}}|={\mathsf{r}}-1=2$ , we transmit $C_{{\mathcal{J}},{\mathcal{B}}}$ .
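
The index pairs $({\mathcal{J}},{\mathcal{B}})$ of the first sub-phase can be listed explicitly for this example. The sketch below is a minimal enumeration under the example's convention that the leaders are users $1,\ldots,5$ ; it lists index sets only and does not build the messages $C_{{\mathcal{J}},{\mathcal{B}}}$ themselves. The function name `subphase1_indices` is hypothetical.

```python
from itertools import combinations

N, K, t, r = 5, 10, 3, 3
n_e = 5  # number of distinct demanded files; leaders are users 1..5


def subphase1_indices(j):
    """Index pairs (J, B) of the messages C_{J,B} sent in Step j."""
    users = [k for k in range(1, K + 1) if k >= j]   # [K] \ [j-1]
    files = [i for i in range(1, N + 1) if i > j]    # [N] \ [j]
    return [
        (J, B)
        for J in combinations(users, t + 1)
        if j in J
        for B in combinations(files, r - 1)
    ]


steps = range(1, min(n_e, N - r + 1, K - t) + 1)
counts = {j: len(subphase1_indices(j)) for j in steps}
print(counts)  # {1: 504, 2: 168, 3: 35}
```

The three counts sum to $707$ multicast messages, each of the size of one sub-block.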

At the end of the first sub-phase, as shown in Section V-B, each leader user can recover its desired file.

For the non-leaders, we focus on user $6$ . From Lemma 1, user $6$ can decode $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $1\in{\mathcal{S}}_{1}$ and $1\in{\mathcal{V}}_{1}$ . Hence, after the first sub-phase, user $6$ still needs to recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $1\in{\mathcal{S}}_{2}$ and $\{1,6\}\cap{\mathcal{V}}_{2}=\emptyset$ . Similarly, each non-leader user $k\in[6:10]$ still needs to recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{1,\ldots,d_{k}-1\}\cap{\mathcal{S}}_{2}=\emptyset$ , and $\{k,1,\ldots,d_{k}\}\cap{\mathcal{V}}_{2}=\emptyset$ (recall that $g(d_{k})=d_{k}$ in this example).
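
To make the conditions above concrete, the following Python sketch lists, for each non-leader in this example, the sub-blocks that satisfy the stated conditions and hence are still missing after the first sub-phase. It is an illustration of the index conditions only; the function name `remaining_after_subphase1` is hypothetical.

```python
from itertools import combinations

N, K, t, r = 5, 10, 3, 3
d = {k: (k - 1) % N + 1 for k in range(1, K + 1)}  # d = (1,2,3,4,5,1,2,3,4,5)


def remaining_after_subphase1(k):
    """Sub-blocks (S2, V2) that non-leader k still misses after sub-phase 1.

    Conditions from the text: d_k in S2, {1,...,d_k - 1} disjoint from S2,
    and {k, 1,...,d_k} disjoint from V2 (leaders are users 1..5, g(d_k) = d_k).
    """
    earlier = set(range(1, d[k]))
    excluded_users = {k} | set(range(1, d[k] + 1))
    return [
        (S, V)
        for S in combinations(range(1, N + 1), r)
        if d[k] in S and not earlier & set(S)
        for V in combinations(range(1, K + 1), t)
        if not excluded_users & set(V)
    ]


for k in range(6, 11):
    print(k, len(remaining_after_subphase1(k)))
```

For user $6$ the list contains $6\cdot\binom{8}{3}=336$ sub-blocks: six blocks containing file $1$ , each with $\binom{8}{3}$ index sets ${\mathcal{V}}_{2}$ avoiding users $1$ and $6$ .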

Second delivery sub-phase

In Step $j\in[3]$ of the second sub-phase, for each $q\in[j+1:4]$ , each ${\mathcal{J}}^{\prime}\subseteq[{\mathsf{K}}]\setminus[q]$ where $|{\mathcal{J}}^{\prime}|=t-1=2$ and ${\mathcal{J}}^{\prime}\cap[q+1:5]\neq\emptyset$ , and each ${\mathcal{B}}^{\prime}\subseteq[{\mathsf{N}}]\setminus[q]$ where $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2=1$ , we transmit $C_{{\mathcal{J}}^{\prime}\cup\{j,q\},{\mathcal{B}}^{\prime}}$ .

We now prove the decodability of user $6$ . Since leader $u_{g(d_{6})}=1$ also demands $F_{1}$ , we show the decodability of user $6$ by induction: for each $j\in[g(d_{6})+1:N_{\textup{e}}({\mathbf{d}})]=[2:5]$ , we prove that user $6$ can recover each desired sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{u_{j}}\in{\mathcal{S}}_{2}$ or $u_{j}\in{\mathcal{V}}_{2}$ .

We start from $j=2$ . In the following, we show that user $6$ can recover $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ where $\{1,6\}\cap{\mathcal{V}}_{2}=\emptyset$ by interference alignment decoding (e.g., $W_{\{1,2,3\},\{2,3,4\}}$ , $W_{\{1,2,3\},\{2,3,5\}}$ , $W_{\{1,2,3\},\{2,4,5\}}$ , and $W_{\{1,2,3\},\{3,4,5\}}$ ). A similar argument applies to every non-leader user $k\in[6:10]$ .

We first focus on $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ where $\{g(d_{6}),6\}\cap{\mathcal{V}}_{2}=\{1,6\}\cap{\mathcal{V}}_{2}=\emptyset$ and $u_{j}=2\in{\mathcal{V}}_{2}$ , e.g., $W_{\{1,2,3\},\{2,3,4\}}$ . In Step $1$ of the first sub-phase, user $6$ receives

[TABLE]

By summing (28) and (29), we can obtain

[TABLE]

which shows the property in (26) in Lemma 4. It can be seen that, by summing (28) and (29), we cancel the interference from the sub-blocks of $W_{\{2,3,4\}}$ at user $6$ . From Lemma 1, user $6$ can decode $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $1\in{\mathcal{S}}_{1}$ and $1\in{\mathcal{V}}_{1}$ . In addition, in

[TABLE]

which is transmitted in Step $2$ of the first sub-phase, user $6$ caches all except $W_{\{1,3,4\},\{2,3,4\}}$ such that it can recover $W_{\{1,3,4\},\{2,3,4\}}$ by directly reading off. Hence, user $6$ has decoded all except $W_{\{1,2,3\},\{2,3,4\}}$ in (30) such that it can recover $W_{\{1,2,3\},\{2,3,4\}}$ .

By similar steps, for each desired sub-block $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ where $\{1,6\}\cap{\mathcal{V}}_{2}=\emptyset$ and $u_{j}=2\in{\mathcal{V}}_{2}$ , user $6$ first reconstructs $C_{{\mathcal{V}}_{2}\cup\{1\},\{1,2,3\}\setminus\{2\}}$ and then recovers $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ from $C_{{\mathcal{V}}_{2}\cup\{1\},\{1,2,3\}\setminus\{2\}}$ .

We then focus on $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ where $(\{g(d_{6}),6\}\cup\{u_{j}\})\cap{\mathcal{V}}_{2}=\{1,2,6\}\cap{\mathcal{V}}_{2}=\emptyset$ , e.g., $W_{\{1,2,3\},\{3,4,5\}}$ . In Step $1$ of the first sub-phase, user $6$ receives

[TABLE]

In Step $1$ of the second sub-phase (with $j=1$ , $q=2$ , ${\mathcal{J}}^{\prime}=\{4,5\}$ , ${\mathcal{B}}^{\prime}=\{3\}$ ), user $6$ receives

[TABLE]

By summing (32)-(36), we have

[TABLE]

which shows the property in (25) in Lemma 4. Hence, by (38), user $6$ can reconstruct $C_{\{2,3,4,5\},\{1,3\}}$ while cancelling the interferences in (32)-(36), coinciding with Lemma 4. We then focus on each sub-block in $C_{\{2,3,4,5\},\{1,3\}}$ . $W_{\{1,2,3\},\{2,4,5\}}$ can be recovered by user $6$ as we showed previously for $W_{\{1,2,3\},\{2,3,4\}}$ . For $W_{\{1,3,4\},\{2,4,5\}}$ , in

[TABLE]

which is transmitted in Step $2$ of the first sub-phase, user $6$ caches all except $W_{\{1,3,4\},\{2,4,5\}}$ such that it can recover $W_{\{1,3,4\},\{2,4,5\}}$ by directly reading off. Similarly, user $6$ can recover $W_{\{1,3,4\},\{2,3,5\}}$ , $W_{\{1,3,5\},\{2,4,5\}}$ , and $W_{\{1,3,5\},\{2,3,4\}}$ from Step $2$ of the first sub-phase by directly reading off. Hence, in $C_{\{2,3,4,5\},\{1,3\}}$ , user $6$ has recovered all except $W_{\{1,2,3\},\{3,4,5\}}$ such that user $6$ can recover $W_{\{1,2,3\},\{3,4,5\}}$ .

Finally, we consider $W_{\{1,2,3\},\{3,9,10\}}$ , where $d_{9}=4$ and $d_{10}=5$ . Notice that $C_{\{1,2,9,10\},\{3\}}$ is not transmitted in the second sub-phase, because neither user $9$ nor user $10$ is a leader, which violates the constraint on the transmissions of the second sub-phase ( ${\mathcal{J}}^{\prime}\cap[q+1:5]\neq\emptyset$ with $q=2$ and ${\mathcal{J}}^{\prime}=\{9,10\}$ ). However, if user $6$ can reconstruct $C_{\{1,2,9,10\},\{3\}}$ , then by the same decoding procedure as for $W_{\{1,2,3\},\{3,4,5\}}$ , user $6$ can recover $W_{\{1,2,3\},\{3,9,10\}}$ . So in the following, we prove that user $6$ can reconstruct $C_{\{1,2,9,10\},\{3\}}$ , as described in Lemma 2.

Notice that $C_{\{1,2,4,10\},\{2,3\}}$ and $C_{\{1,2,4,10\},\{3\}}$ are transmitted in Step $1$ of the first and second sub-phases, respectively. Hence, user $6$ can obtain

[TABLE]

On the RHS of (40), $W_{\{1,3,4\},\{2,4,10\}}$ and $W_{\{1,3,5\},\{2,4,10\}}$ can be recovered by user $6$ from $C_{\{2,4,6,10\},\{3,4\}}$ and $C_{\{2,4,6,10\},\{3,5\}}$ transmitted in Step $2$ of the first sub-phase, respectively (by directly reading off). $W_{\{1,3,4\},\{1,2,10\}}$ and $W_{\{1,3,5\},\{1,2,4\}}$ can be recovered by user $6$ because they are cached by user $1$ and thus we can use Lemma 1.Item 2. Hence, from (40), user $6$ can recover

[TABLE]

Similarly, user $6$ can recover

[TABLE]

from $C_{\{1,2,4,5\},\{2,3\}}\oplus C_{\{1,2,4,5\},\{3\}}$ and $C_{\{1,2,8,5\},\{2,3\}}\oplus C_{\{1,2,8,5\},\{3\}}$ , respectively. By summing (41)-(43), user $6$ can obtain

[TABLE]

Similar to (40), we have

[TABLE]

On the RHS of (45), $C_{\{1,2,9,10\},\{2,3\}}$ is transmitted in Step $1$ of the first sub-phase. In addition, $W_{\{1,3,4\},\{2,9,10\}}$ and $W_{\{1,3,5\},\{2,9,10\}}$ can be recovered by user $6$ from $C_{\{2,6,9,10\},\{3,4\}}$ and $C_{\{2,6,9,10\},\{3,5\}}$ transmitted in Step $2$ of the first sub-phase, respectively (by directly reading off). $W_{\{1,3,4\},\{1,2,10\}}$ and $W_{\{1,3,5\},\{1,2,9\}}$ can be recovered by user $6$ because they are cached by user $1$ and thus we can use Lemma 1.Item 2 . We also proved in (44) that $W_{\{3,4,5\},\{1,2,10\}}\oplus W_{\{3,4,5\},\{1,2,9\}}$ can be recovered by user $6$ . Hence, user $6$ can reconstruct $C_{\{1,2,9,10\},\{3\}}$ and thus it can recover $W_{\{1,2,3\},\{3,9,10\}}$ .

By similar steps, for each desired sub-block $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ where $\{1,2,6\}\cap{\mathcal{V}}_{2}=\emptyset$ , user $6$ first reconstructs $C_{{\mathcal{V}}_{2}\cup\{2\},\{1,2,3\}\setminus\{2\}}$ and then recovers $W_{\{1,2,3\},{\mathcal{V}}_{2}}$ from $C_{{\mathcal{V}}_{2}\cup\{2\},\{1,2,3\}\setminus\{2\}}$ .

Hence, we prove that user $6$ can recover $W_{\{1,2,3\}}$ . Similarly, we can prove user $6$ can recover $W_{{\mathcal{S}}_{2}}$ where $\{d_{k},d_{u_{j}}\}=\{1,2\}\subseteq{\mathcal{S}}_{2}$ .

For each sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{6}=1\in{\mathcal{S}}_{2}$ , $d_{u_{j}}=2\notin{\mathcal{S}}_{2}$ , $6\notin{\mathcal{V}}_{2}$ , and $u_{j}=2\in{\mathcal{V}}_{2}$ , user $6$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ from $C_{{\mathcal{V}}_{2}\cup\{6\},{\mathcal{S}}_{2}\setminus\{1\}}$ by directly reading off. Hence, we finish the proof of the decodability of user $6$ for $j=2$ .

By induction, user $6$ can recover its other desired blocks with the same decoding procedure.

Performance

The achieved load is $31/30\approx 1.033$ while the converse bound in Theorem 1 is $707/720\approx 0.982$ and the achieved load in [12] is $7/6\approx 1.167$ .
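
The three values can be compared with exact arithmetic. The Python sketch below uses only the numbers quoted above and is meant as a quick illustration of the gap in this example.

```python
from fractions import Fraction

achieved = Fraction(31, 30)      # proposed scheme with two sub-phases
converse = Fraction(707, 720)    # converse bound of Theorem 1
baseline = Fraction(7, 6)        # scheme in [12]

print(float(achieved / converse))   # multiplicative gap to the converse bound
print(float(baseline - achieved))   # load saved relative to [12]
```

The ratio is $744/707\approx 1.052$ , well within the factor of $2$ , and the saving with respect to [12] is $2/15\approx 0.133$ .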

V-E Proof of Theorem 4

For type $s\in[\min\{{\mathsf{K}},{\mathsf{N}}\}]$ and each corner point ${\mathsf{M}}=\frac{{\mathsf{N}}t}{{\mathsf{K}}{\mathsf{r}}}$ where $t\in[0:{\mathsf{K}}]$ , from Theorem 3, we can achieve the load

[TABLE]

where (46d) follows from Pascal’s triangle. Hence, from (46f) and the converse bound in Theorem 1, the proposed caching scheme in Theorem 3 is order optimal to within a factor of $2$ under the constraint of uncoded cache placement for demand type $s$ .

Similarly, we can prove that the average load among all possible demands in Theorem 3 is order optimal to within a factor of $2$ under the constraint of uncoded cache placement.

V-F Proof of Theorem 5

From the proof of the decodability in Appendix E, we have the following observations (Observations 2 and 3 are proved in Appendix E), which will help us to further reduce the load for some special cases:

Observation 1: when ${\mathsf{r}}=2$ , the transmission of the second sub-phase does not exist because $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2=0$ and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ cannot hold simultaneously. When $t=1$ , the transmission of the second sub-phase does not exist because $|{\mathcal{J}}^{\prime}|=t-1=0$ and ${\mathcal{J}}^{\prime}\cap\{u_{q+1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\}\neq\emptyset$ cannot hold simultaneously. In other words, each non-leader can recover its desired file from the first sub-phase if ${\mathsf{r}}=2$ or $t=1$ .

Observation 2: for a non-leader $k$ , to decode $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ , if there is no user in ${\mathcal{V}}_{2}$ whose demanded file is in $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}$ , user $k$ only needs to use the transmission of the first sub-phase, Step $g(d_{k})$ of the second sub-phase and Step $g(d_{k})$ in Lemma 2.

Observation 3: for a non-leader $k$ , to decode $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ , $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ , and $(\cup_{k_{1}\in{\mathcal{V}}_{2}}\{d_{k_{1}}\})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})=\emptyset$ , user $k$ only needs the transmission of the first sub-phase.
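
Observation 1 can be checked by enumerating the index sets of the second sub-phase as described in Section V-C. The Python sketch below is a minimal enumeration for $t\geq 1$ and ${\mathsf{r}}\geq 2$ ; it lists index pairs only, not the messages of (20), and the number of pairs is not claimed to equal the load. The function name `subphase2_indices` and its argument layout are hypothetical.

```python
from itertools import combinations


def subphase2_indices(N, K, t, r, d, leaders):
    """Index pairs (J' + {u_j, u_q}, B') of sub-phase 2, for t >= 1 and r >= 2.

    d maps user -> demanded file; leaders is the ordered tuple (u_1, ..., u_s).
    """
    s = len(leaders)
    demanded = set(d.values())                      # N([K])
    out = []
    for j in range(1, min(s, N - r + 1, K - t) + 1):
        for q in range(j + 1, min(N - r + 2, K - t + 1, s) + 1):
            first_q = set(leaders[:q])
            later = set(leaders[q:])                # {u_{q+1}, ..., u_s}
            used_files = {d[u] for u in first_q}
            users = [k for k in range(1, K + 1) if k not in first_q]
            files = [i for i in range(1, N + 1) if i not in used_files]
            for Jp in combinations(users, t - 1):
                if not later & set(Jp):             # J' must contain a later leader
                    continue
                for Bp in combinations(files, r - 2):
                    if demanded & set(Bp):          # B' must contain a demanded file
                        J = tuple(sorted(Jp + (leaders[j - 1], leaders[q - 1])))
                        out.append((J, Bp))
    return out


d = {k: (k - 1) % 5 + 1 for k in range(1, 11)}      # demands of Section V-D
leaders = (1, 2, 3, 4, 5)
print(len(subphase2_indices(5, 10, 1, 3, d, leaders)))  # t = 1: prints 0
print(len(subphase2_indices(5, 10, 3, 2, d, leaders)))  # r = 2: prints 0
```

With $t=1$ the only candidate ${\mathcal{J}}^{\prime}$ is the empty set, which contains no leader; with ${\mathsf{r}}=2$ the only candidate ${\mathcal{B}}^{\prime}$ is the empty set, which contains no demanded file. In both cases the enumeration is empty, as stated in Observation 1.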

In the following, we will show that if ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ or $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ or $s\in[\min\{{\mathsf{K}},{\mathsf{N}},4\}]$ , the transmission of the second sub-phase is not needed. Notice that the transmitted load of the first sub-phase coincides with the proposed converse bound in Theorem 1. Hence, for the above cases, the transmission of the first sub-phase is optimal under the constraint of uncoded cache placement.

When ${\mathsf{r}}\in\{1,{\mathsf{N}}\}$ , the considered problem is equivalent to the MAN caching problem, and the first sub-phase is equivalent to the caching scheme in [3], which is optimal under the constraint of uncoded cache placement.

When $t\in\{0,{\mathsf{K}}\}$ , it is trivial to achieve the optimality by transmitting all demanded files or nothing.

When ${\mathsf{r}}=2$ or $t=1$ , as shown in Observation 1, each non-leader can recover its desired files from the transmission of the first sub-phase.

When $t={\mathsf{K}}-1$ , there is only one step in the first sub-phase. From Lemma 1.Item 2 , it can be seen that any non-leader can recover its desired blocks from Step $1$ of the first sub-phase. Hence, the second sub-phase is not necessary.

We now consider ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$ and let each non-leader user $k$ recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ , by the transmission of the first sub-phase. The main reason that the first sub-phase is enough for these two cases is that the messages of Step $g(d_{k})$ of the second sub-phase can be reconstructed by user $k$ from the first sub-phase. Consider one message $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ which is transmitted in the second sub-phase. Notice that ${\mathcal{B}}^{\prime}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\}$ . If $t=2$ , we have $|{\mathcal{J}}^{\prime}|=1$ . If ${\mathsf{r}}={\mathsf{N}}-1$ , we have $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2={\mathsf{N}}-3$ . Hence, for the case ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$ , all interferences in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ to user $k$ , who demands $F_{d_{u_{g(d_{k})}}}$ , are from one block (assuming this block is $W_{{\mathcal{B}}^{\prime}\cup\{i\}}$ ). Hence, the binary sum of these interferences is equal to the sum of the interferences in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{i\}}$ or $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ . It will be proved in Appendix F that user $k$ can recover this sum of interferences from the first sub-phase and then it can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ .

Hence, from Observation 2, user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ if there is no user in ${\mathcal{V}}_{2}$ whose demanded file is in $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}$ . It will be proved in Appendix F that, if there is some user in ${\mathcal{V}}_{2}$ whose demanded file is in $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}$ , for the case ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$ , user $k$ can also recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ from the reconstruction of Step $g(d_{k})$ of the second sub-phase.

In conclusion, for the cases where ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ or $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ , we have proved that each user can recover its desired file from the first delivery sub-phase alone. Comparing the converse bound in Theorem 1 and the achieved load (given in Section V-F), we have the optimality for Case 1 where ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ . The optimality for Case 2, where either ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}/{\mathsf{N}}\leq 2$ or ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}/{\mathsf{N}}\geq{\mathsf{K}}-1$ , is due to the fact that in the converse bound (7), $c^{N_{\textup{e}}({\mathbf{d}})}_{t}$ is convex in terms of $t$ and that our proposed scheme is optimal when $t\in\{0,1,2,{\mathsf{K}}-1,{\mathsf{K}}\}$ .

Finally, we prove the optimality of ${\mathsf{R}}^{\star}_{\mathrm{u}}({\mathsf{M}},s)$ for Case 3 where $s\in[\min\{{\mathsf{K}},{\mathsf{N}},4\}]$ . We consider the following two cases.

$\min\{{\mathsf{K}},{\mathsf{N}}\}\leq 4$ . Theorem 5.Case 1 covers all possible values of ${\mathsf{r}}$ when $3\geq{\mathsf{N}}-1$ , and Theorem 5.Case 2 covers all possible values of ${\mathsf{M}}$ when $3\geq{\mathsf{K}}-1$ . Hence, when $\min\{{\mathsf{K}},{\mathsf{N}}\}\leq 4$ , optimality follows from these two cases.

$\min\{{\mathsf{K}},{\mathsf{N}}\}>4$ . In this case, $s=|{\mathcal{N}}([{\mathsf{K}}])|\leq 4$ . For each subset of files ${\mathcal{T}}\subseteq[{\mathsf{N}}]\setminus{\mathcal{N}}([{\mathsf{K}}])$ where ${\mathsf{r}}-4\leq|{\mathcal{T}}|<{\mathsf{r}}$ , we can gather all blocks $W_{{\mathcal{S}}}$ where ${\mathcal{S}}\subseteq[{\mathsf{N}}]$ , $|{\mathcal{S}}|={\mathsf{r}}$ , ${\mathcal{S}}\setminus{\mathcal{N}}([{\mathsf{K}}])={\mathcal{T}}$ . The proposed first delivery sub-phase on these blocks is equivalent to the first delivery sub-phase for ${\mathcal{N}}_{\text{eq}}([{\mathsf{K}}])={\mathsf{N}}_{\text{eq}}=s$ , ${\mathsf{K}}_{\text{eq}}={\mathsf{K}}$ , ${\mathsf{r}}_{\text{eq}}={\mathsf{r}}-|{\mathcal{T}}|$ , and $t_{\text{eq}}=t$ . Since we proved the decodability of the proposed first delivery sub-phase for the system including up to $4$ files, we can prove the blocks in this group can be recovered by the demanding users. Hence, we prove that each user can recover its desired file from the first delivery sub-phase.
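
The grouping of blocks by ${\mathcal{T}}={\mathcal{S}}\setminus{\mathcal{N}}([{\mathsf{K}}])$ can be illustrated with a small enumeration. The Python sketch below uses illustrative values that are not taken from the paper ( ${\mathsf{N}}=7$ , ${\mathsf{r}}=4$ , demanded files $\{1,2,3,4\}$ ) and only shows the partition and the resulting ${\mathsf{r}}_{\text{eq}}$ ; it does not run the delivery scheme.

```python
from collections import defaultdict
from itertools import combinations
from math import comb

N, r = 7, 4                      # illustrative values, not from the paper
demanded = {1, 2, 3, 4}          # N([K]); here s = 4
s = len(demanded)

groups = defaultdict(list)
for S in combinations(range(1, N + 1), r):
    T = frozenset(S) - demanded   # undemanded part of the block index set
    if len(T) < r:                # block contains at least one demanded file
        groups[T].append(S)

for T, members in sorted(groups.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    r_eq = r - len(T)             # equivalent correlation parameter for this group
    print(sorted(T), "r_eq =", r_eq, "blocks:", len(members))
```

Each group with a given ${\mathcal{T}}$ contains $\binom{s}{{\mathsf{r}}-|{\mathcal{T}}|}$ blocks, which is the number of blocks in a library of $s$ files with correlation parameter ${\mathsf{r}}_{\text{eq}}={\mathsf{r}}-|{\mathcal{T}}|$ .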

As a result, we prove when $s\in[\min\{{\mathsf{K}},{\mathsf{N}},4\}]$ , each user can recover its desired file from the first delivery sub-phase, and thus we prove the optimality for Theorem 5.Case 3.

VI Conclusions

In this paper, we studied the coded caching problem with correlated sources. We proposed a converse bound under the constraint of uncoded cache placement and a two-phase delivery scheme. For any demand type, under the constraint of uncoded cache placement, our caching scheme is optimal to within a factor of $2$ . For the case where each user has a distinct request, or the case with any demand type with either ${\mathsf{r}}\in\{1,2,{\mathsf{N}}-1,{\mathsf{N}}\}$ or ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\leq 2{\mathsf{N}}$ or ${\mathsf{K}}{\mathsf{r}}{\mathsf{M}}\geq({\mathsf{K}}-1){\mathsf{N}}$ or $s\in[\min\{{\mathsf{K}},{\mathsf{N}},4\}]$ , the second sub-phase is not necessary and thus the proposed scheme is optimal under the constraint of uncoded cache placement. As a by-product, we also showed that the proposed scheme reduces the load of existing schemes for the caching problem with multiple requests.

Appendix A Proof of Lemma 1

For a given demand vector ${\mathbf{d}}$ , let $s:=N_{\textup{e}}({\mathbf{d}})$ , $j_{\max}:=\min\{s,{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}$ , and order the leader users as $(u_{1},\ldots,u_{s})$ . Recall that in step $j\in[j_{\max}]$ of delivery sub-phase 1 of the scheme in (8) we satisfy the demand of leader user $u_{j}$ as follows: for each set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j-1}\}$ such that $|{\mathcal{J}}|=t+1$ and $u_{j}\in{\mathcal{J}}$ , and for each set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ such that $|{\mathcal{B}}|={\mathsf{r}}-1$ , we transmit the multicast message in (20), which we re-write as

[TABLE]

where we introduced the superscript $j$ to indicate the leader user for whom the multicast message $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ has been “designed,” by which we mean that by construction (i.e., $d_{u_{j}}\notin{\mathcal{B}}$ ), $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ in (47) contains only one sub-block desired by user $u_{j}$ (which is $W_{{\mathcal{B}}\cup\{d_{u_{j}}\},{\mathcal{J}}\setminus\{u_{j}\}}$ ), while all other sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ are in its cache. Based on this observation, we introduce the following terminology:

Directly read off. The observation made for leader user $u_{j}$ actually holds for every user $k\in{\mathcal{J}}$ such that $d_{k}\not\in{\mathcal{B}}$ (i.e., term in (47a)). Thus, we say that user $k$ ‘directly reads off’ its desired sub-block $W_{{\mathcal{B}}\cup\{d_{k}\},{\mathcal{J}}\setminus\{k\}}$ from the multicast message $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ . Here we use “directly” to mean that it is enough to remove the contribution of cached sub-blocks in order to recover a desired sub-block.

Indirectly read off. For user $k\in{\mathcal{J}}$ such that $d_{k}\in{\mathcal{B}}$ , its desired sub-blocks appear in $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ as the linear combination $\underset{i\in{\mathcal{N}}({\mathcal{J}})\backslash{\mathcal{B}}}{\oplus}W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k\}}$ (i.e., term in (47b)). Evidently, in (47b), the user who desires file $i\not\in{\mathcal{B}}\cup\{d_{u_{j}}\}$ is in ${\mathcal{J}}$ and is not $u_{j}$ , thus $W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k\}}$ can be ‘directly read off’ from $C_{{\mathcal{J}},{\mathcal{B}}\backslash\{d_{k}\}\cup\{i\}}^{j}$ . Thus, we say that user $k$ ‘indirectly reads off’ its desired sub-block $W_{{\mathcal{B}}\cup\{d_{u_{j}}\},{\mathcal{J}}\setminus\{k\}}$ from the multicast message $C_{{\mathcal{J}},{\mathcal{B}}}^{j}$ . Here we use “indirectly” to mean that it is not enough to remove the contribution of cached sub-blocks in order to recover a desired sub-block, but in addition one has to remove the contribution of sub-blocks that have been ‘directly read off’ from some other multicast messages.
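
The two notions can be illustrated on the example of Section V-D. The Python sketch below builds the labels of the sub-blocks appearing in one first-sub-phase message, under the assumption that the message has exactly the two-term structure described for (47a) and (47b); the precise form of (20) is not reproduced here, and the function names `message_subblocks` and `unknown_to` are hypothetical.

```python
N, K, t, r = 5, 10, 3, 3
d = {k: (k - 1) % N + 1 for k in range(1, K + 1)}  # d = (1,2,3,4,5,1,2,3,4,5)


def message_subblocks(J, B):
    """Labels (S, V) of the sub-blocks XORed in C_{J,B}.

    Assumption: the message has the two-term form described for (47a)-(47b).
    """
    J, B = frozenset(J), frozenset(B)
    demanded_in_J = {d[k] for k in J}            # N(J)
    terms = set()
    for k in J:
        V = J - {k}
        if d[k] not in B:                        # term (47a)
            terms.add((B | {d[k]}, V))
        else:                                    # term (47b)
            for i in demanded_in_J - B:
                terms.add((B | {i}, V))
    return terms


def unknown_to(k, terms):
    """Sub-blocks of the message that user k does not cache (k not in V)."""
    return {(S, V) for (S, V) in terms if k not in V}


terms = message_subblocks({1, 2, 3, 4}, {2, 3})  # a Step-1 message, leader u_1 = 1
for k in (1, 2):
    print(k, [(sorted(S), sorted(V)) for (S, V) in unknown_to(k, terms)])
```

Under this assumption, user $1$ (with $d_{1}\notin{\mathcal{B}}$ ) misses a single sub-block and can directly read it off, whereas user $2$ (with $d_{2}\in{\mathcal{B}}$ ) misses two sub-blocks of the same message and needs another message to separate them, which is the indirect case.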

Lemma 1 is proved by induction.

A-A Step $1$

Lemma 1.Item 1

We focus on one set of users ${\mathcal{J}}\subseteq[{\mathsf{K}}]$ where $|{\mathcal{J}}|=t+1$ and $u_{1}\in{\mathcal{J}}$ , and one set of files ${\mathcal{B}}\subseteq[{\mathsf{N}}]\setminus\{d_{u_{1}}\}$ where $|{\mathcal{B}}|={\mathsf{r}}-1$ . We will prove that from Step $1$ , each user $k_{1}\in{\mathcal{J}}$ can recover all sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ . We consider two cases:

•

$d_{k_{1}}\notin{\mathcal{B}}$ : in $C_{{\mathcal{J}},{\mathcal{B}}}$ user $k_{1}$ caches all sub-blocks except $W_{{\mathcal{B}}\cup\{d_{k_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ . Hence, user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{d_{k_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ by directly reading off.

•

$d_{k_{1}}\in{\mathcal{B}}$ : in $C_{{\mathcal{J}},{\mathcal{B}}}$ user $k_{1}$ caches all sub-blocks except $W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k_{1}\}}$ , where $i\in{\mathcal{N}}({\mathcal{J}})\setminus{\mathcal{B}}$ .

–

If $i\neq d_{u_{1}}$ , user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k_{1}\}}$ from $C_{{\mathcal{J}},({\mathcal{B}}\cup\{i\})\setminus\{d_{k_{1}}\}}$ by directly reading off, for the same reason as in the previous case.

–

If $i=d_{u_{1}}$ , we have just shown that user $k_{1}$ can recover all sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ except $W_{{\mathcal{B}}\cup\{d_{u_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ ; hence, user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{d_{u_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ by indirectly reading off.

In conclusion, user $k_{1}$ can recover all sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ .

Hence, we proved Lemma 1.Item 1 for Step $1$ .

Lemma 1.Item 2

Now for each user $k\in[{\mathsf{K}}]$ , if $k=u_{1}$ , it can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ and $u_{1}\in{\mathcal{V}}_{1}$ , from its cache. Hence, in the following, we will prove any user $k\in([{\mathsf{K}}]\setminus\{u_{1}\})$ can recover each $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ , $u_{1}\in{\mathcal{V}}_{1}$ and $k\notin{\mathcal{V}}_{1}$ , from Step $1$ . We consider two cases:

•

$d_{u_{1}}\notin{\mathcal{S}}_{1}$ . We can see that $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ appears in $C_{{\mathcal{V}}_{1}\cup\{k\},{\mathcal{S}}_{1}\setminus\{d_{k}\}}$ . By Lemma 1.Item 1 for Step $1$ , we prove that user $k$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ .

•

$d_{u_{1}}\in{\mathcal{S}}_{1}$ . We can see that $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ appears in $C_{{\mathcal{V}}_{1}\cup\{k\},{\mathcal{S}}_{1}\setminus\{d_{u_{1}}\}}$ . By Lemma 1.Item 1 for Step $1$ , we prove that user $k$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ .

Hence, we proved Lemma 1.Item 2 for Step $1$ .

Lemma 1.Item 3

We then focus on one user $q$ whose demanded file is in $[{\mathsf{N}}]\setminus\{d_{u_{1}}\}$ , and one sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $\{d_{q},d_{u_{1}}\}\subseteq{\mathcal{S}}_{2}$ and $\{u_{1},q\}\cap{\mathcal{V}}_{2}=\emptyset$ . In $C_{{\mathcal{V}}_{2}\cup\{u_{1}\},{\mathcal{S}}_{2}\setminus\{d_{u_{1}}\}}$ , all sub-blocks are desired by user $q$ while only one of them is desired by user $u_{1}$ (which is $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ ) and the others are cached by user $u_{1}$ . From Lemma 1.Item 2 for Step $1$ , user $q$ has recovered all desired sub-blocks which are cached by user $u_{1}$ , and thus user $q$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ from $C_{{\mathcal{V}}_{2}\cup\{u_{1}\},{\mathcal{S}}_{2}\setminus\{d_{u_{1}}\}}$ . Hence, we proved Lemma 1.Item 3 for Step $1$ .

In summary, we proved Lemma 1 for Step $1$ .

A-B Step $j$

We focus on one $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ and assume that Lemma 1 holds for the first $j-1$ steps. In the following, we prove that Lemma 1 holds for Step $j$ .

Lemma 1.Item 1

We focus on one set of users ${\mathcal{J}}\subseteq([{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j-1}\})$ where $|{\mathcal{J}}|=t+1$ and $u_{j}\in{\mathcal{J}}$ , and one set of files ${\mathcal{B}}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\})$ where $|{\mathcal{B}}|={\mathsf{r}}-1$ . We will prove that from the transmission until Step $j$ , each user $k_{1}\in{\mathcal{J}}$ can recover all sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ . We consider two cases:

•

$d_{k_{1}}\notin{\mathcal{B}}$ . In this case, in $C_{{\mathcal{J}},{\mathcal{B}}}$ user $k_{1}$ caches all sub-blocks except $W_{{\mathcal{B}}\cup\{d_{k_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ . Hence, user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{d_{k_{1}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ by directly reading off.

•

$d_{k_{1}}\in{\mathcal{B}}$ . In this case, $d_{k_{1}}\notin\{d_{u_{1}},\ldots,d_{u_{j}}\}$ . In $C_{{\mathcal{J}},{\mathcal{B}}}$ user $k_{1}$ caches all sub-blocks except $W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k_{1}\}}$ , where $i\in{\mathcal{N}}({\mathcal{J}})\setminus{\mathcal{B}}$ .

–

If $i\in\{d_{u_{1}},\ldots,d_{u_{j-1}}\}$ , by the induction assumption, user $k_{1}$ has already recovered the whole block $W_{{\mathcal{B}}\cup\{i\}}$ .

–

If $i\notin\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{i\},{\mathcal{J}}\setminus\{k_{1}\}}$ from $C_{{\mathcal{J}},({\mathcal{B}}\cup\{i\})\setminus\{d_{k_{1}}\}}$ transmitted in Step $j$ by directly reading off.

–

If $i=d_{u_{j}}$ , in $C_{{\mathcal{J}},{\mathcal{B}}}$ user $k_{1}$ has cached or recovered all sub-blocks except $W_{{\mathcal{B}}\cup\{d_{u_{j}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ . Hence, user $k_{1}$ can recover $W_{{\mathcal{B}}\cup\{d_{u_{j}}\},{\mathcal{J}}\setminus\{k_{1}\}}$ by indirectly reading off.

In conclusion, user $k_{1}$ can recover all sub-blocks in $C_{{\mathcal{J}},{\mathcal{B}}}$ , and thus we proved Lemma 1.Item 1 for Step $j$ .

Lemma 1.Item 2

Now for each user $k\in[{\mathsf{K}}]$ where $d_{k}\notin\{d_{u_{1}},\dots,d_{u_{j-1}}\}$ , if $k=u_{j}$ , it can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ and $u_{j}\in{\mathcal{V}}_{1}$ , from its cache. Hence, in the following, we will prove any user $k\in([{\mathsf{K}}]\setminus\{u_{j}\})$ where $d_{k}\notin\{d_{u_{1}},\dots,d_{u_{j-1}}\}$ , can recover each $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ where $d_{k}\in{\mathcal{S}}_{1}$ , $u_{j}\in{\mathcal{V}}_{1}$ and $\{k,u_{1},\ldots,u_{j-1}\}\cap{\mathcal{V}}_{1}=\emptyset$ , at the end of Step $j$ . We consider two cases:

•

$d_{u_{j}}\notin{\mathcal{S}}_{1}$ . We can see that $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ appears in $C_{{\mathcal{V}}_{1}\cup\{k\},{\mathcal{S}}_{1}\setminus\{d_{k}\}}$ transmitted in Step $j$ . By Lemma 1.Item 1 for Step $j$ , we prove that user $k$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ .

•

$d_{u_{j}}\in{\mathcal{S}}_{1}$ . We can see that $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ appears in $C_{{\mathcal{V}}_{1}\cup\{k\},{\mathcal{S}}_{1}\setminus\{d_{u_{j}}\}}$ transmitted in Step $j$ . By Lemma 1.Item 1 for Step $j$ , we prove that user $k$ can recover $W_{{\mathcal{S}}_{1},{\mathcal{V}}_{1}}$ .

Hence, we proved Lemma 1.Item 2 for Step $j$ .

Lemma 1.Item 3

We then focus on one user $q$ whose demanded file is in $[{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , and one sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $\{d_{q},d_{u_{j}}\}\subseteq{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{j-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ , and $\{q,u_{1},\ldots,u_{j}\}\cap{\mathcal{V}}_{2}=\emptyset$ . In $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ transmitted in Step $j$ , all sub-blocks are desired by user $q$ while only one of them is desired by user $u_{j}$ (which is $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ ) and the others are cached by user $u_{j}$ . From Lemma 1.Item 2 for Step $j$ , user $q$ has recovered all desired sub-blocks which are cached by user $u_{j}$ , and thus user $q$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ from $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ . Hence, we proved Lemma 1.Item 3 for Step $j$ .

In conclusion, we proved Lemma 1.

Appendix B Proof of Lemma 2

In Step $j$ , we focus on one ${\mathcal{J}}^{\prime}\subseteq([{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{N_{\textup{e}}({\mathbf{d}})}\})$ where $|{\mathcal{J}}^{\prime}|=t-1$ , and one ${\mathcal{B}}^{\prime}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\})$ where $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2$ and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ , and in the following we prove $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ can be recovered by each user $k$ demanding $F_{d_{u_{j}}}$ .

Given ${\mathcal{J}}^{\prime}$ , we define a family of sets $\mathbb{S}({\mathcal{J}}^{\prime})\ni{\mathcal{J}}^{\prime}$ as follows. We divide the users in ${\mathcal{J}}^{\prime}$ into groups, where each group corresponds to one file in ${\mathcal{N}}({\mathcal{J}}^{\prime})$ and contains all users in ${\mathcal{J}}^{\prime}$ demanding this file. Each time, in each group that does not correspond to a file in $\{d_{u_{j}},d_{u_{q}}\}$ , we choose at most one user and replace it by the leader who demands the file corresponding to this group. For example, let ${\mathcal{J}}^{\prime}=\{5,6,7,8\}$ where $d_{u_{j}}=1$ , $d_{u_{q}}=2$ , $d_{5}=d_{6}=3$ , $d_{7}=4$ , and $d_{8}=2$ . The leader user demanding $F_{3}$ is user $3$ while the leader user demanding $F_{4}$ is user $4$ . We may choose user $5$ in the first group and replace it by user $3$ , and choose user $7$ in the second group and replace it by user $4$ . Hence, we have the set of users $\{3,4,6,8\}\in\mathbb{S}({\mathcal{J}}^{\prime})$ . Similarly, in this example we have

[TABLE]
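Under the replacement rule as stated above, the family for this example can be enumerated mechanically. The following is a minimal sketch, not the authors' construction verbatim: the function name `swap_family` and its arguments are hypothetical, and it assumes that "one or zero user" is chosen independently in every eligible group.

```python
from itertools import product

def swap_family(j_prime, demand, leader_of, excluded_files):
    """Enumerate all sets obtained from j_prime by replacing, in each
    eligible group, at most one user by the leader of that group's file."""
    groups = {}
    for user in sorted(j_prime):
        groups.setdefault(demand[user], []).append(user)
    # Per group: None means "replace nobody"; otherwise the user to replace.
    options = []
    for file, users in groups.items():
        if file in excluded_files:
            options.append([(file, None)])
        else:
            options.append([(file, None)] + [(file, u) for u in users])
    family = set()
    for choice in product(*options):
        members = set(j_prime)
        for file, replaced in choice:
            if replaced is not None:
                members.remove(replaced)
                members.add(leader_of[file])
        family.add(frozenset(members))
    return family

# Data of the example: d_5 = d_6 = 3, d_7 = 4, d_8 = 2; leaders 3 and 4.
demand = {5: 3, 6: 3, 7: 4, 8: 2}
leader_of = {3: 3, 4: 4}
family = swap_family({5, 6, 7, 8}, demand, leader_of, excluded_files={1, 2})
for members in sorted(sorted(s) for s in family):
    print(members)
```

With these inputs the group of file $3$ offers three choices (nobody, user $5$, user $6$), the group of file $4$ offers two, and the group of file $2$ is excluded, so the enumeration yields $3\times 2=6$ sets, including ${\mathcal{J}}^{\prime}$ itself and $\{3,4,6,8\}$.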

For each ${\mathcal{J}}\in\mathbb{S}({\mathcal{J}}^{\prime})$ , with a slight abuse of notation, we let

[TABLE]

In other words, ${\mathcal{Q}}_{{\mathcal{J}}}$ is obtained from $C_{{\mathcal{J}}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ by removing all sub-blocks that belong to blocks desired by user $u_{j}$ or $u_{q}$ .

For each ${\mathcal{J}}\in\mathbb{S}({\mathcal{J}}^{\prime})$ , by the definitions, we have

[TABLE]

In (49), if $k_{3}\neq u_{j}$ , $W_{{\mathcal{S}},{\mathcal{J}}\cup\{u_{j},u_{q}\}\setminus\{k_{3}\}}$ is cached by user $u_{j}$ and, from Lemma 1.Item 2, user $k$ can recover $W_{{\mathcal{S}},{\mathcal{J}}\cup\{u_{j},u_{q}\}\setminus\{k_{3}\}}$ . We then focus on $k_{3}=u_{j}$ . Since $d_{u_{q}}\notin{\mathcal{S}}$ , by Remark 3, it can be seen that $W_{{\mathcal{S}},{\mathcal{J}}\cup\{u_{j},u_{q}\}\setminus\{k_{3}\}}$ can be recovered by user $k$ . Hence, user $k$ can reconstruct the RHS of (49).

For each ${\mathcal{J}}_{1}\in\mathbb{S}({\mathcal{J}}^{\prime})$ where ${\mathcal{J}}_{1}\neq{\mathcal{J}}^{\prime}$ , since there exists at least one leader in ${\mathcal{J}}_{1}$ , it can be seen that $C_{{\mathcal{J}}_{1}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ and $C_{{\mathcal{J}}_{1}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ are transmitted in Step $j$ of the first and second sub-phases, respectively. Hence, user $k$ can reconstruct ${\mathcal{Q}}_{{\mathcal{J}}_{1}}$ from (49).

At the end of this proof, we will prove the following equation.

[TABLE]

In (50), all the messages except ${\mathcal{Q}}_{{\mathcal{J}}^{\prime}}$ have been recovered by user $k$ , so user $k$ can reconstruct ${\mathcal{Q}}_{{\mathcal{J}}^{\prime}}$ . In addition, $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ is transmitted in Step $j$ of the first sub-phase. Hence, from (49), user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{j},u_{q}\},{\mathcal{B}}^{\prime}}$ .

Finally, we prove (50). We focus on one sub-block in (50) and assume that $W_{{\mathcal{S}},{\mathcal{V}}}$ is in ${\mathcal{Q}}_{{\mathcal{J}}}$ , where it is desired by user $k_{5}$ . Hence, ${\mathcal{V}}={\mathcal{J}}\cup\{u_{j},u_{q}\}\setminus\{k_{5}\}$ and $d_{k_{5}}\notin\{d_{u_{j}},d_{u_{q}}\}$ . By the construction of $\mathbb{S}({\mathcal{J}}^{\prime})$ , there exists only one user in ${\mathcal{J}}^{\prime}\cup\{u_{g(d_{k_{5}})}\}$ demanding $d_{k_{5}}$ who is not in ${\mathcal{J}}$ . We assume this user is $k_{6}$ . It can be seen that $W_{{\mathcal{S}},{\mathcal{V}}}$ , desired by user $k_{6}$ , is also in ${\mathcal{Q}}_{{\mathcal{J}}_{2}}$ where ${\mathcal{J}}_{2}={\mathcal{J}}\cup\{k_{6}\}\setminus\{k_{5}\}$ . In addition, except ${\mathcal{J}}$ and ${\mathcal{J}}_{2}$ , there does not exist any other ${\mathcal{J}}_{3}\in\mathbb{S}({\mathcal{J}}^{\prime})$ such that ${\mathcal{Q}}_{{\mathcal{J}}_{3}}$ contains $W_{{\mathcal{S}},{\mathcal{V}}}$ (because ${\mathcal{V}}\setminus\{u_{j},u_{q}\}={\mathcal{J}}\setminus\{k_{5}\}$ cannot be a subset of ${\mathcal{J}}_{3}$ ). Hence, $W_{{\mathcal{S}},{\mathcal{V}}}$ appears exactly twice in (50), which proves (50).
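The pairing argument can be sanity-checked on the example ${\mathcal{J}}^{\prime}=\{5,6,7,8\}$ used earlier in this appendix. The sketch below is an illustration under assumptions: it hard-codes the six sets that the replacement rule yields for that example, and it only checks the user-set part of the argument (how many members of the family contain ${\mathcal{J}}\setminus\{k_{5}\}$), not the file sets ${\mathcal{S}}$ . The names `family`, `excluded` and `containing_sets` are hypothetical.

```python
# Family obtained from J' = {5, 6, 7, 8} by the replacement rule
# (leaders: user 3 for file 3, user 4 for file 4).
family = [frozenset(s) for s in (
    {5, 6, 7, 8}, {3, 6, 7, 8}, {3, 5, 7, 8},
    {4, 5, 6, 8}, {3, 4, 6, 8}, {3, 4, 5, 8},
)]
demand = {3: 3, 4: 4, 5: 3, 6: 3, 7: 4, 8: 2}
excluded = {1, 2}  # the files d_{u_j} = 1 and d_{u_q} = 2

def containing_sets(rest):
    """Members of the family that contain the user set `rest`."""
    return [j3 for j3 in family if rest <= j3]

# For every J in the family and every user k5 in J whose demand is not
# excluded, count the members of the family that contain J \ {k5}.
counts = {}
for j in family:
    for k5 in j:
        if demand[k5] in excluded:
            continue
        counts[(j, k5)] = len(containing_sets(j - {k5}))

print(set(counts.values()))
```

Every count equals two in this example, which matches the claim that each sub-block is shared by exactly ${\mathcal{J}}$ and ${\mathcal{J}}_{2}$ and therefore cancels under XOR.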

Appendix C Proof of Lemma 3

C-A Proof of (25)

To prove (25), it is equivalent to prove that

[TABLE]

where we assume ${\mathcal{R}}={\mathcal{J}}\cup\{u_{g(i)}\}$ . Since $u_{g(i)}\notin{\mathcal{J}}$ and $|{\mathcal{J}}|=t+1$ , we have $|{\mathcal{R}}|=t+2$ . Any $C_{{\mathcal{T}},{\mathcal{H}}}$ in (51) must satisfy ${\mathcal{T}}\subseteq{\mathcal{R}}$ and $|{\mathcal{R}}\setminus{\mathcal{T}}|=1$ . For the user in $({\mathcal{R}}\setminus{\mathcal{T}})$ , its desired file is in ${\mathcal{H}}$ . In addition, if $C_{{\mathcal{T}}_{1},{\mathcal{H}}_{1}}$ and $C_{{\mathcal{T}}_{2},{\mathcal{H}}_{2}}$ are two distinct messages in (51), we can see that ${\mathcal{T}}_{1}\neq{\mathcal{T}}_{2}$ .

We focus on one sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ in (51) and assume that $C_{{\mathcal{T}},{\mathcal{H}}}$ contains $W_{{\mathcal{S}},{\mathcal{V}}}$ . This directly implies that ${\mathcal{S}}\subseteq{\mathcal{N}}({\mathcal{T}})\cup{\mathcal{H}}$ , that ${\mathcal{T}}\supseteq{\mathcal{V}}$ with $|{\mathcal{T}}\setminus{\mathcal{V}}|=1$ , and that the user in ${\mathcal{T}}\setminus{\mathcal{V}}$ (assumed to be user $k^{\prime}$ ) desires the sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ . In addition, since $k^{\prime}\in{\mathcal{T}}\subseteq{\mathcal{R}}$ and $|{\mathcal{R}}\setminus{\mathcal{T}}|=1$ , assuming $k_{1}\in({\mathcal{R}}\setminus{\mathcal{T}})$ , we have $d_{k_{1}}\in{\mathcal{H}}$ and thus $W_{{\mathcal{S}},{\mathcal{V}}}$ is also desired by user $k_{1}$ . Hence, it can be seen that $C_{{\mathcal{V}}\cup\{k_{1}\},{\mathcal{H}}\setminus\{d_{k_{1}}\}\cup\{d_{k^{\prime}}\}}$ is also in (51), and $W_{{\mathcal{S}},{\mathcal{V}}}$ desired by user $k_{1}$ is in $C_{{\mathcal{V}}\cup\{k_{1}\},{\mathcal{H}}\setminus\{d_{k_{1}}\}\cup\{d_{k^{\prime}}\}}$ . Except $C_{{\mathcal{T}},{\mathcal{H}}}$ and $C_{{\mathcal{V}}\cup\{k_{1}\},{\mathcal{H}}\setminus\{d_{k_{1}}\}\cup\{d_{k^{\prime}}\}}$ , there does not exist any other $C_{{\mathcal{T}}_{1},{\mathcal{H}}_{1}}$ in (51) containing $W_{{\mathcal{S}},{\mathcal{V}}}$ because there is no other ${\mathcal{T}}_{1}\subseteq{\mathcal{R}}$ where $|{\mathcal{T}}_{1}|=|{\mathcal{R}}|-1$ and ${\mathcal{V}}\subseteq{\mathcal{T}}_{1}$ (noticing that ${\mathcal{V}}\subseteq{\mathcal{R}}$ and $|{\mathcal{V}}|=|{\mathcal{R}}|-2$ ).

In conclusion, each sub-block in (51) appears twice in (51), and thus we prove (51).
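The counting step at the end of the argument, namely that a set ${\mathcal{V}}\subseteq{\mathcal{R}}$ with $|{\mathcal{V}}|=|{\mathcal{R}}|-2$ has exactly two supersets of size $|{\mathcal{R}}|-1$ inside ${\mathcal{R}}$ , can be checked by brute force. This is a minimal sketch assuming $t=3$ and users labelled $1,\ldots,5$ (both hypothetical choices); the helper name `supersets_one_below` is also an assumption.

```python
from itertools import combinations

def supersets_one_below(r_set, v_set):
    """All T with V ⊆ T ⊆ R and |T| = |R| - 1."""
    size = len(r_set) - 1
    return [set(c) for c in combinations(sorted(r_set), size) if v_set <= set(c)]

t = 3
R = set(range(1, t + 3))  # |R| = t + 2

# For every V ⊆ R with |V| = t, count the admissible sets T.
counts = [len(supersets_one_below(R, set(v)))
          for v in combinations(sorted(R), t)]
print(counts)
```

Each admissible ${\mathcal{T}}$ is ${\mathcal{R}}$ minus one of the two users in ${\mathcal{R}}\setminus{\mathcal{V}}$ , so the count is two for every ${\mathcal{V}}$ regardless of the value of $t$ .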

C-B Proof of (26)

To prove (26), it is equivalent to prove that

[TABLE]

If $C_{{\mathcal{J}}_{1},{\mathcal{H}}}$ appears in (52), since $i_{1}\in{\mathcal{N}}_{{\mathcal{J}}_{1}}\cap{\mathcal{B}}_{1}$ , we have $({\mathcal{B}}_{1}\setminus\{i_{1}\})\subseteq{\mathcal{H}}$ and $|{\mathcal{H}}\setminus({\mathcal{B}}_{1}\setminus\{i_{1}\})|=1$ . For the file in ${\mathcal{H}}\setminus({\mathcal{B}}_{1}\setminus\{i_{1}\})$ , it is also in $({\mathcal{N}}_{{\mathcal{J}}_{1}}\setminus{\mathcal{B}}_{1})\cup\{i_{1}\}$ .

We focus on one sub-block $W_{{\mathcal{S}},{\mathcal{V}}}$ in (52) and assume that $C_{{\mathcal{J}}_{1},{\mathcal{H}}}$ contains $W_{{\mathcal{S}},{\mathcal{V}}}$ . This directly implies that ${\mathcal{H}}\subseteq{\mathcal{S}}$ and $|{\mathcal{S}}\setminus{\mathcal{H}}|=1$ (we assume the file in ${\mathcal{S}}\setminus{\mathcal{H}}$ is $i^{\prime}$ ). In addition, we have $({\mathcal{B}}_{1}\setminus\{i_{1}\})\subseteq{\mathcal{H}}$ and $|{\mathcal{H}}\setminus({\mathcal{B}}_{1}\setminus\{i_{1}\})|=1$ (we assume the file in ${\mathcal{H}}\setminus({\mathcal{B}}_{1}\setminus\{i_{1}\})$ is $i_{2}$ ). As described before, $i_{2}$ is in $({\mathcal{N}}_{{\mathcal{J}}_{1}}\setminus{\mathcal{B}}_{1})\cup\{i_{1}\}$ and thus file $i_{2}$ is demanded by some user in ${\mathcal{J}}_{1}$ . Hence, it can be seen that $W_{{\mathcal{S}},{\mathcal{V}}}$ is also in $C_{{\mathcal{J}}_{1},{\mathcal{H}}\setminus\{i_{2}\}\cup\{i^{\prime}\}}$ . Except $C_{{\mathcal{J}}_{1},{\mathcal{H}}}$ and $C_{{\mathcal{J}}_{1},{\mathcal{H}}\setminus\{i_{2}\}\cup\{i^{\prime}\}}$ , there does not exist any other $C_{{\mathcal{J}}_{1},{\mathcal{H}}_{1}}$ in (52) containing $W_{{\mathcal{S}},{\mathcal{V}}}$ because there is no other ${\mathcal{H}}_{1}\subseteq{\mathcal{S}}$ where $({\mathcal{B}}_{1}\setminus\{i_{1}\})\subseteq{\mathcal{H}}_{1}$ and $|{\mathcal{H}}_{1}|=|{\mathcal{S}}|-1$ (noticing that $({\mathcal{B}}_{1}\setminus\{i_{1}\})\subseteq{\mathcal{S}}$ and $|{\mathcal{B}}_{1}\setminus\{i_{1}\}|=|{\mathcal{S}}|-2$ ).

In conclusion, each sub-block in (52) appears twice in (52), and thus we prove (52).

Appendix D Proof of Lemma 4

We prove Lemma 4 by induction on $j$ .

$j=1$ . By (26) in Lemma 3 (with $i_{1}=d_{u_{1}}$ ), we have

[TABLE]

where each $C_{{\mathcal{J}}_{2}\cup\{u_{1}\},{\mathcal{B}}_{2}\cup\{i_{2}\}}$ is transmitted in Step $1$ of the first sub-phase. Hence, each user can reconstruct $C_{{\mathcal{J}}_{2}\cup\{u_{1}\},{\mathcal{B}}_{2}\cup\{d_{u_{1}}\}}$ .

$j\in[2:\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ . We first focus on $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i\}}$ where $i\in\{d_{u_{1}},\ldots,d_{u_{j-1}}\}$ . By (25) in Lemma 3, we have

[TABLE]

If $d_{k}\in\{d_{u_{1}},\ldots,d_{u_{j-1}}\}$ , each user can reconstruct $C_{{\mathcal{J}}_{2}\cup\{u_{j},u_{g(i)}\}\setminus\{k\},{\mathcal{B}}_{2}\cup\{d_{k}\}}$ by the induction assumption; else if $d_{k}\notin{\mathcal{B}}_{2}$ , $C_{{\mathcal{J}}_{2}\cup\{u_{j},u_{g(i)}\}\setminus\{k\},{\mathcal{B}}_{2}\cup\{d_{k}\}}$ is transmitted in Step $g(i)$ of the first sub-phase; else, we consider $d_{k}\in{\mathcal{B}}_{2}$ . Since ${\mathcal{N}}({\mathcal{J}}_{2}\cap{\mathcal{L}})\setminus{\mathcal{B}}_{2}\neq\emptyset$ (recall that ${\mathcal{L}}$ is the set of leaders), we can see that there exists one leader in ${\mathcal{J}}_{2}\setminus\{k\}$ whose demanded file is not in ${\mathcal{B}}_{2}$ . Thus in this case, $C_{{\mathcal{J}}_{2}\cup\{u_{j},u_{g(i)}\}\setminus\{k\},{\mathcal{B}}_{2}\cup\{d_{k}\}}$ is transmitted in Step $g(i)$ of the second sub-phase.

We then focus on $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{d_{u_{j}}\}}$ . By (26) in Lemma 3 (with $i_{1}=d_{u_{j}}$ ), we have

[TABLE]

In (55), if $i_{2}\in\{d_{u_{1}},\ldots,d_{u_{j-1}}\}$ , we have proved $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i_{2}\}}$ can be reconstructed by each user from (54); otherwise, $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i_{2}\}}$ is transmitted in Step $j$ of the first sub-phase.

Remark 2.

Notice that to prove Lemma 4 the transmission in the second sub-phase is only used when there exists some user in ${\mathcal{J}}_{2}\cup\{u_{j}\}$ whose demanded file is in ${\mathcal{B}}_{2}$ (i.e., $d_{k}\in{\mathcal{B}}_{2}$ in (54)). Hence, if ${\mathcal{N}}({\mathcal{J}}_{2})\cap{\mathcal{B}}_{2}=\emptyset$ , to reconstruct $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i\}}$ , each user only needs the transmission in the first sub-phase.

Formally, for each $j\in[\min\{N_{\textup{e}}({\mathbf{d}}),{\mathsf{N}}-{\mathsf{r}}+1,{\mathsf{K}}-t\}]$ and each $i\in\{d_{u_{1}},\ldots,d_{u_{j}}\}$ , any non-leader $k\in[{\mathsf{K}}]$ can reconstruct $C_{{\mathcal{J}}_{2}\cup\{u_{j}\},{\mathcal{B}}_{2}\cup\{i\}}$ where ${\mathcal{J}}_{2}\subseteq[{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{j}\}$ , $|{\mathcal{J}}_{2}|=t$ , ${\mathcal{B}}_{2}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{j}}\})$ , $|{\mathcal{B}}_{2}|={\mathsf{r}}-2$ , and ${\mathcal{N}}({\mathcal{J}}_{2})\cap{\mathcal{B}}_{2}=\emptyset$ , from the transmission of the first sub-phase.

Appendix E Proof of Decodability of the General Scheme in Section V-C

Now we are ready to prove the decodability of each non-leader $k$ . In other words, we want to prove that it can decode $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ (in Lemma 1 we showed that the other desired sub-blocks could be decoded by user $k$ from transmission of the first sub-phase). We consider two cases, $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|>1$ and $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|=1$ .

E-A $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|>1$

Among all desired sub-blocks in this case, we prove by induction that, for each $j\in[g(d_{k})+1:N_{\textup{e}}({\mathbf{d}})]$ , user $k$ can recover its desired sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{u_{j}}\in{\mathcal{S}}_{2}$ or $u_{j}\in{\mathcal{V}}_{2}$ .

Induction on $j=g(d_{k})+1$ . We consider three cases:

•

$u_{j}\in{\mathcal{V}}_{2}$ and $d_{u_{j}}\notin{\mathcal{S}}_{2}$ . In $C_{{\mathcal{V}}_{2}\cup\{k\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ transmitted in Step $j$ of the first sub-phase, user $k$ caches all sub-blocks except $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ and thus it can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ by directly reading off.

•

$u_{j}\in{\mathcal{V}}_{2}$ and $d_{u_{j}}\in{\mathcal{S}}_{2}$ . Since $u_{j}\in{\mathcal{V}}_{2}$ , from Lemma 4 it can be seen that user $k$ can reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ .

In $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ , all sub-blocks are desired by user $k$ . In addition, all sub-blocks desired by user $k$ which are cached by user $u_{g(d_{k})}$ can be recovered by user $k$ from Lemma 1.Item 2.

The sub-blocks in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ which are not cached by user $u_{g(d_{k})}$ are all cached by user $u_{j}$ (because $u_{j}\in{\mathcal{V}}_{2}$ ). For any file $i\in{\mathcal{N}}({\mathcal{V}}_{2})\setminus({\mathcal{S}}_{2}\setminus\{d_{u_{j}}\})$ , the sub-block $W_{{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}\cup\{i\},{\mathcal{V}}_{2}}$ is in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ and is desired (and not cached) by user $u_{g(d_{k})}$ . If $i\neq d_{u_{j}}$ , since $d_{u_{j}}\notin({\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}\cup\{i\})$ and $u_{j}\in{\mathcal{V}}_{2}$ , we proved in the first case that $W_{{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}\cup\{i\},{\mathcal{V}}_{2}}$ can be recovered by user $k$ ; otherwise, the sub-block $W_{{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}\cup\{i\},{\mathcal{V}}_{2}}$ is $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ . Hence, in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ , only the sub-block $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ has not yet been recovered by user $k$ , so user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ .

•

$u_{j}\notin{\mathcal{V}}_{2}$ and $d_{u_{j}}\in{\mathcal{S}}_{2}$ . We first prove that user $k$ can reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ . From (25) in Lemma 3, we have

[TABLE]

For each $k_{2}\in({\mathcal{V}}_{2}\cup\{u_{j}\})$ in (56),

–

if $k_{2}=u_{j}$ , we have

[TABLE]

which is transmitted in Step $g(d_{k})$ of the first sub-phase;

–

if $k_{2}\neq u_{j}$ and $d_{k_{2}}\notin\{d_{u_{1}},\ldots,d_{u_{g(d_{k})}}\}$ , it can be seen that $C_{{\mathcal{V}}_{2}\cup\{u_{j},u_{g(d_{k})}\}\setminus\{k_{2}\},({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}}$ is transmitted either in Step $g(d_{k})$ of the first sub-phase (if $|({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}|={\mathsf{r}}-1$ ) or Step $g(d_{k})$ of the second sub-phase (if $|({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}|={\mathsf{r}}-2$ and $({\mathcal{V}}_{2}\setminus\{k_{2}\})\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ ) or Step $g(d_{k})$ in Lemma 2 (if $|({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}|={\mathsf{r}}-2$ and $({\mathcal{V}}_{2}\setminus\{k_{2}\})\cap{\mathcal{N}}([{\mathsf{K}}])=\emptyset$ );

–

if $k_{2}\neq u_{j}$ and $d_{k_{2}}\in\{d_{u_{1}},\ldots,d_{u_{g(d_{k})}}\}$ , by Lemma 4, $C_{{\mathcal{V}}_{2}\cup\{u_{j},u_{g(d_{k})}\}\setminus\{k_{2}\},({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}}$ can be reconstructed by user $k$ .

Hence, user $k$ can recover each message on the RHS of (56) and thus it can reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ .

In $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ , all sub-blocks are desired by user $k$ . For each $k^{\prime}\in({\mathcal{V}}_{2}\cup\{u_{j}\})$ , if $k^{\prime}\neq u_{j}$ , the desired sub-blocks in $C_{{\mathcal{V}}_{2}\cup\{u_{j}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ by user $k^{\prime}$ are stored by user $u_{j}$ , which can be recovered by user $k$ from the transmission of the first sub-phase (as we proved above for the case $u_{j}\in{\mathcal{V}}_{2}$ and $d_{u_{j}}\notin{\mathcal{S}}_{2}$ ). If $k^{\prime}=u_{j}$ , the desired sub-block by user $k^{\prime}$ is $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ . Hence, user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ .

Induction on $j\in[g(d_{k})+1:N_{\textup{e}}({\mathbf{d}})]$ . If there exists $j^{\prime}\in[g(d_{k})+1:j-1]$ , where $u_{j^{\prime}}\in{\mathcal{V}}_{2}$ or $d_{u_{j^{\prime}}}\in{\mathcal{S}}_{2}$ , by the induction assumption, user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ ; otherwise, a similar argument, which again distinguishes the three cases above and uses the induction assumption, shows that user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ (we omit the details for brevity).

Remark 3.

If there exists one leader in ${\mathcal{V}}_{2}$ (assumed to be $k^{\prime}$ ) such that $d_{k^{\prime}}\notin{\mathcal{S}}_{2}$ , we can prove that user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ without using Lemma 2. More precisely, we focus on the case $u_{j}\notin{\mathcal{V}}_{2}$ and $d_{u_{j}}\in{\mathcal{S}}_{2}$ , which is the only case where Lemma 2 may be needed. In (56), for each $k_{2}\in{\mathcal{V}}_{2}$ where $d_{k_{2}}\notin\{d_{u_{1}},\ldots,d_{u_{g(d_{k})}}\}$ , if $k_{2}\neq k^{\prime}$ , it can be seen that $({\mathcal{V}}_{2}\setminus\{k_{2}\})\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ and thus Lemma 2 is not needed; otherwise, we have $k_{2}=k^{\prime}$ and $|({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k^{\prime}}\}|={\mathsf{r}}-1$ , so that Lemma 2 is not needed either.

E-B $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|=1$

We can gather all blocks $W_{{\mathcal{S}}}$ where ${\mathcal{S}}\cap([{\mathsf{N}}]\setminus{\mathcal{N}}([{\mathsf{K}}]))={\mathcal{S}}_{2}\cap([{\mathsf{N}}]\setminus{\mathcal{N}}([{\mathsf{K}}]))$ . The transmission for these blocks is equivalent to the MAN caching problem in [1]. Thus, from the transmission of the first sub-phase on these blocks, which is equivalent to the optimal caching scheme in [3], each non-leader can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ .

E-C Proof of Observations

If $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|=1$ , it has been proved that only the first sub-phase is needed. Hence, in the following we consider $|{\mathcal{S}}_{2}\cap{\mathcal{N}}([{\mathsf{K}}])|>1$ . We focus on each non-leader $k$ and the induction Step $j\in[g(d_{k})+1:N_{\textup{e}}({\mathbf{d}})]$ in the proof of the decodability in Appendix E-A.

Proof of Observation 1

The proof of Observation 1 is trivial and directly given in Section V-F.

Proof of Observation 2

Other steps of the second sub-phase may be needed only when we use Lemma 4 to show that user $k$ can reconstruct

[TABLE]

where $d_{k_{2}}=d_{k}$ . However, in induction Step $g(d_{k})$ of the proof of Lemma 4 with $i=d_{k}$ , the first sub-phase and Step $g(d_{k})$ of the second sub-phase are only needed if there is no user in ${\mathcal{V}}_{2}$ whose demanded file is in $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}$ . Hence, we prove Observation 2.

Proof of Observation 3

•

If $u_{j}\in{\mathcal{V}}_{2}$ and $d_{u_{j}}\notin{\mathcal{S}}_{2}$ , from the proof in Appendix E-A, only the first sub-phase is needed;

•

if $u_{j}\in{\mathcal{V}}_{2}$ and $d_{u_{j}}\in{\mathcal{S}}_{2}$ , user $k$ needs to reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ . Since $(\cup_{k_{1}\in{\mathcal{V}}_{2}}\{d_{k_{1}}\})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})=\emptyset$ , from Remark 2 we can see that $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{u_{j}}\}}$ can be reconstructed by user $k$ from the transmission of the first sub-phase;

•

Lastly we focus on $u_{j}\notin{\mathcal{V}}_{2}$ and $d_{u_{j}}\in{\mathcal{S}}_{2}$ . In (56),

–

if $k_{2}=u_{j}$ , only the first sub-phase is needed;

–

if $k_{2}\neq u_{j}$ and $d_{k_{2}}\notin\{d_{u_{1}},\ldots,d_{u_{g(d_{k})}}\}$ , since $(\cup_{k_{1}\in{\mathcal{V}}_{2}}\{d_{k_{1}}\})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})=\emptyset$ , we have $|({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}|={\mathsf{r}}-1$ and thus we only need the first sub-phase;

–

if $k_{2}\neq u_{j}$ and $d_{k_{2}}\in\{d_{u_{1}},\ldots,d_{u_{g(d_{k})}}\}$ , user $k$ should reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{j},u_{g(d_{k})}\}\setminus\{k_{2}\},({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}}$ . Again, since $(\cup_{k_{1}\in{\mathcal{V}}_{2}}\{d_{k_{1}}\})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})=\emptyset$ , from Remark 2 we can see that user $k$ can reconstruct $C_{{\mathcal{V}}_{2}\cup\{u_{j},u_{g(d_{k})}\}\setminus\{k_{2}\},({\mathcal{S}}_{2}\setminus\{d_{u_{j}},d_{k}\})\cup\{d_{k_{2}}\}}$ from the first sub-phase.

Hence, we prove Observation 3.

Appendix F Proof of the Decodability for ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$

We now consider ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$ and prove that each non-leader $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where $d_{k}\in{\mathcal{S}}_{2}$ , $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $\{k,u_{1},\ldots,u_{g(d_{k})}\}\cap{\mathcal{V}}_{2}=\emptyset$ , by the transmission of the first sub-phase. If ${\mathcal{N}}({\mathcal{V}}_{2})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})=\emptyset$ , by Observation 3, user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ from the first sub-phase. Hence, in the following, we focus on ${\mathcal{N}}({\mathcal{V}}_{2})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})\neq\emptyset$ . We consider two cases, ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}=\emptyset$ and ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\neq\emptyset$ .

F-A ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}=\emptyset$

In the following, we prove that for each integer $q\in[g(d_{k})+1:\min\{{\mathsf{N}}-{\mathsf{r}}+2,{\mathsf{K}}-t+1,N_{\textup{e}}({\mathbf{d}})\}]$ , user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ from the first sub-phase, where ${\mathcal{J}}^{\prime}\subseteq([{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{q}\})$ , $|{\mathcal{J}}^{\prime}|=t-1$ , ${\mathcal{B}}^{\prime}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\})$ , $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2$ , and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ .

If there is no user in ${\mathcal{J}}^{\prime}$ whose demand is in $[{\mathsf{N}}]\setminus(\{d_{k},d_{u_{q}}\}\cup{\mathcal{B}}^{\prime})$ , it can be seen that all sub-blocks in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ are from $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},d_{u_{q}}\}}$ . Hence, we have $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}=C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ , which is transmitted in Step $g(d_{k})$ of the first sub-phase. Hence, in the following, we consider that there exists some user in ${\mathcal{J}}^{\prime}$ whose demand is in $[{\mathsf{N}}]\setminus(\{d_{k},d_{u_{q}}\}\cup{\mathcal{B}}^{\prime})$ .

For the case $t=2$ , we have $|{\mathcal{J}}^{\prime}|=1$ and we assume user $k^{\prime}$ is in ${\mathcal{J}}^{\prime}$ . For the case ${\mathsf{r}}={\mathsf{N}}-1$ , we have $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2={\mathsf{N}}-3$ .

Hence, when ${\mathsf{r}}={\mathsf{N}}-1$ or $t=2$ , all interferences in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ to user $k$ , who demands $F_{d_{u_{g(d_{k})}}}$ , are from one block (we denote this block by $W_{{\mathcal{B}}^{\prime}\cup\{i\}}$ , where $i=d_{k^{\prime}}$ for $t=2$ and, with a slight abuse of notation, $i=[{\mathsf{N}}]\setminus({\mathcal{B}}^{\prime}\cup\{d_{k},d_{u_{q}}\})$ for ${\mathsf{r}}={\mathsf{N}}-1$ ). The sum of the interferences in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ is

[TABLE]

•

if $i\notin\{d_{u_{1}},\ldots,d_{u_{q-1}}\}$ , we can see that

[TABLE]

In (58), $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ is transmitted in Step $g(d_{k})$ of the first sub-phase.

–

If $k_{3}\neq u_{g(d_{k})}$ , the sub-block $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},i\},{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\}\setminus\{k_{3}\}}$ is desired by user $k$ and cached by user $u_{g(d_{k})}$ . Thus by Lemma 1.Item 2 , user $k$ can recover this sub-block from the transmission of the first sub-phase;

–

if $k_{3}=u_{g(d_{k})}$ , $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},i\},{\mathcal{J}}^{\prime}\cup\{u_{q}\}}$ can be recovered by user $k$ from $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q},k\}\setminus\{k_{3}\},{\mathcal{B}}^{\prime}\cup\{i\}}$ transmitted in Step $q$ of the first sub-phase, where in $C_{{\mathcal{J}}^{\prime}\cup\{u_{q},k\},{\mathcal{B}}^{\prime}\cup\{i\}}$ user $k$ caches all except $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},i\},{\mathcal{J}}^{\prime}\cup\{u_{q}\}}$ such that it can recover this sub-block.

Hence, user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ from the transmission of the first sub-phase;

•

if $i\in\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}$ , by Lemma 1.Item 3 , we can see that each sub-block $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},i\},{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\}\setminus\{k_{3}\}}$ in (58) is from $W_{{\mathcal{B}}^{\prime}\cup\{d_{k},i\}}$ , which can be recovered by user $k$ from the first sub-phase. Hence, user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ from the transmission of the first sub-phase;

•

if $i\in\{d_{u_{g(d_{k})+1}},\ldots,d_{u_{q}}\}$ , for each user $k_{4}\in{\mathcal{J}}^{\prime}\cup\{u_{q}\}$ where $d_{k_{4}}\neq d_{k}$ , we focus on $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q},u_{g(i)}\}\setminus\{k_{4}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ which is transmitted in Step $g(d_{k})$ of the first sub-phase. In $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q},u_{g(i)}\}\setminus\{k_{4}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ , since we have $|{\mathcal{J}}^{\prime}|=1$ or $|{\mathcal{B}}^{\prime}|={\mathsf{N}}-3$ , it can be seen that all sub-blocks are from either $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},d_{k}\}}$ or $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},i\}}$ , and are cached by either user $u_{g(d_{k})}$ or user $u_{g(i)}$ .

By Lemma 1.Item 2 , user $k$ can recover the desired sub-block cached by user $u_{g(d_{k})}$ from the first sub-phase. Each sub-block of $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},d_{k}\}}$ cached by user $u_{g(i)}$ and not by $u_{g(d_{k})}$ (assumed to be $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},d_{k}\},{\mathcal{V}}^{\prime}}$ ), can be recovered by user $k$ from $C_{{\mathcal{V}}^{\prime}\cup\{k\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ (transmitted in Step $g(i)$ of the first sub-phase), because all sub-blocks in $C_{{\mathcal{V}}^{\prime}\cup\{k\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ except $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},d_{k}\},{\mathcal{V}}^{\prime}}$ are cached by user $k$ . Hence, in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q},u_{g(i)}\}\setminus\{k_{4}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ , user $k$ can recover all sub-blocks of $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},d_{k}\}}$ .

So user $k$ can recover the sum of the sub-blocks of $W_{{\mathcal{B}}^{\prime}\cup\{d_{u_{q}},i\}}$ in $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q},u_{g(i)}\}\setminus\{k_{4}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ from the transmission of the first sub-phase,

[TABLE]

By a proof similar to that of (51) and (52), we can prove that

[TABLE]

from the fact that each sub-block in (60) appears twice in (60). Hence, user $k$ can recover $I$ from the transmission of the first sub-phase. In addition, by the definition, we have

[TABLE]

where $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{i\}}$ and $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}\cup\{d_{u_{q}}\}}$ are transmitted in Step $g(d_{k})$ of the first sub-phase. Hence, user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ from the transmission of the first sub-phase.

In conclusion, we prove that from the transmission of the first sub-phase, user $k$ can reconstruct $C_{{\mathcal{J}}^{\prime}\cup\{u_{g(d_{k})},u_{q}\},{\mathcal{B}}^{\prime}}$ , for each integer $q\in[g(d_{k})+1:\min\{{\mathsf{N}}-{\mathsf{r}}+2,{\mathsf{K}}-t+1,N_{\textup{e}}({\mathbf{d}})\}]$ , each ${\mathcal{J}}^{\prime}\subseteq([{\mathsf{K}}]\setminus\{u_{1},\ldots,u_{q}\})$ where $|{\mathcal{J}}^{\prime}|=t-1$ , and each ${\mathcal{B}}^{\prime}\subseteq([{\mathsf{N}}]\setminus\{d_{u_{1}},\ldots,d_{u_{q}}\})$ , where $|{\mathcal{B}}^{\prime}|={\mathsf{r}}-2$ and ${\mathcal{B}}^{\prime}\cap{\mathcal{N}}([{\mathsf{K}}])\neq\emptyset$ .

Hence, from Observation 2, user $k$ can recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ where ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}=\emptyset$ , from the transmission of the first sub-phase.

F-B ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\neq\emptyset$

For the case $t=2$ , since ${\mathcal{N}}({\mathcal{V}}_{2})\cap({\mathcal{S}}_{2}\setminus\{d_{k}\})\neq\emptyset$ and ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\neq\emptyset$ where $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $|{\mathcal{V}}_{2}|=2$ , it can be seen that $|{\mathcal{N}}({\mathcal{V}}_{2})\setminus{\mathcal{S}}_{2}|=1$ and $d_{k}\notin({\mathcal{N}}({\mathcal{V}}_{2})\setminus{\mathcal{S}}_{2})$ . For the case ${\mathsf{r}}={\mathsf{N}}-1$ , since ${\mathcal{N}}({\mathcal{V}}_{2})\cap\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\neq\emptyset$ where $\{d_{u_{1}},\ldots,d_{u_{g(d_{k})-1}}\}\cap{\mathcal{S}}_{2}=\emptyset$ and $|{\mathcal{S}}_{2}|={\mathsf{r}}={\mathsf{N}}-1$ , it can also be seen that $|{\mathcal{N}}({\mathcal{V}}_{2})\setminus{\mathcal{S}}_{2}|=1$ and $d_{k}\notin({\mathcal{N}}({\mathcal{V}}_{2})\setminus{\mathcal{S}}_{2})$ .

Hence, when $t=2$ or ${\mathsf{r}}={\mathsf{N}}-1$ , the interferences in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ (transmitted in Step $g(d_{k})$ of the first sub-phase) to user $k$ are all from the block $W_{{\mathcal{S}}_{2}\cup\{i\}\setminus\{d_{k}\}}$ , where $i$ is the element in ${\mathcal{N}}({\mathcal{V}}_{2})\setminus{\mathcal{S}}_{2}$ . The sum of the interferences in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ is

[TABLE]

For each user $k_{4}\in{\mathcal{V}}_{2}$ where $d_{k_{4}}\neq d_{k}$ , we focus on $C_{{\mathcal{V}}_{2}\setminus\{k_{4}\}\cup\{u_{g(d_{k})},u_{g(i)}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ which is transmitted in Step $i$ of the first sub-phase. In $C_{{\mathcal{V}}_{2}\setminus\{k_{4}\}\cup\{u_{g(d_{k})},u_{g(i)}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ , since $|{\mathcal{V}}_{2}|=2$ or $|{\mathcal{S}}_{2}|={\mathsf{N}}-1$ , it can be seen that all sub-blocks are from either $W_{{\mathcal{S}}_{2}}$ or $W_{{\mathcal{S}}_{2}\setminus\{d_{k}\}\cup\{i\}}$ . Each sub-block from $W_{{\mathcal{S}}_{2}}$ is either cached by user $u_{g(d_{k})}$ or by user $u_{g(i)}$ , which can be recovered by user $k$ from the first sub-phase, by Lemma 1.Item 2 . Hence, user $k$ can recover the sum of the sub-blocks of $W_{{\mathcal{S}}_{2}\setminus\{d_{k}\}\cup\{i\}}$ in $C_{{\mathcal{V}}_{2}\setminus\{k_{4}\}\cup\{u_{g(d_{k})},u_{g(i)}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ as follows,

[TABLE]

By a proof similar to that of (51) and (52), we can prove that

[TABLE]

from the fact that each sub-block in (64) appears twice in (64). Hence, user $k$ can reconstruct the sum of all interferences $I^{\prime}$ in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ . The other sub-blocks in $C_{{\mathcal{V}}_{2}\cup\{u_{g(d_{k})}\},{\mathcal{S}}_{2}\setminus\{d_{k}\}}$ are from the block $W_{{\mathcal{S}}_{2}}$ , which is desired by user $k$ . In addition, all these sub-blocks are cached by user $u_{g(d_{k})}$ except $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ . By Lemma 1.Item 2 , from the transmission of the first sub-phase user $k$ can recover the sub-blocks of $W_{{\mathcal{S}}_{2}}$ which are cached by user $u_{g(d_{k})}$ . Hence, user $k$ can also recover $W_{{\mathcal{S}}_{2},{\mathcal{V}}_{2}}$ in the first sub-phase.
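The cancellation step invoked for (60) and (64) rests on a generic property of sums over GF(2): a sub-block that appears twice in an XOR sum vanishes. The following minimal sketch illustrates only this property. The sub-blocks are random placeholders indexed by pairs of hypothetical users, not the actual sub-blocks of the scheme, and the names `xor`, `make_partials` and `xor_all` are introduced here for illustration.

```python
import os
from functools import reduce
from itertools import combinations


def xor(a, b):
    """Bitwise XOR of two equal-length byte strings (addition over GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_partials(n_users=4, size=8):
    """Build one placeholder sub-block per pair of users and, for each user a,
    the XOR of all sub-blocks whose index pair contains a."""
    users = list(range(n_users))
    blocks = {frozenset(p): os.urandom(size) for p in combinations(users, 2)}
    partials = []
    for a in users:
        terms = [blocks[frozenset((a, b))] for b in users if b != a]
        partials.append(reduce(xor, terms))
    return partials


def xor_all(items):
    """XOR of a non-empty list of equal-length byte strings."""
    return reduce(xor, items)


partials = make_partials()
# Each sub-block {a, b} appears in exactly two partial sums (those of a and b),
# so the XOR of all partial sums is the all-zero string.
total = xor_all(partials)
# Equivalently, any one partial sum is the XOR of the remaining ones.
rebuilt_first = xor_all(partials[1:])
```

Here `total` is all zeros and `rebuilt_first` equals `partials[0]`, which is the sense in which a sum can be reconstructed from other sums once every cross term appears twice.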

Appendix G Codes for Extension to Caching with Multiple Requests

For the caching problem with multiple requests considered in [17], where each user demands ${\mathsf{L}}$ uncorrelated and equal-length files, the delivery scheme proposed in [17] was proved to be optimal under the constraint of the MAN placement for most demands with ${\mathsf{K}}\leq 4$ , ${\mathsf{M}}={\mathsf{N}}/{\mathsf{K}}$ , and ${\mathsf{L}}=2$ , except one demand for ${\mathsf{K}}=3$ and three demands for ${\mathsf{K}}=4$ . Unlike the problem considered in this paper, the demands are not generally symmetric in the caching problem with multiple requests. Hence, for the caching problem with multiple requests, we pick a set of leaders such that each leader has at least one demanded file that is not demanded by any other leader, and the union of the files demanded by the leaders equals the union of the files demanded by all users. In addition, the number of leaders should be as small as possible. We can then extend the proposed scheme for $t=1$ to achieve optimality for those four exceptional demands, by satisfying the demands of the leaders one after another while simultaneously aligning the interferences to the non-leaders.
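The leader-selection rule above can be sketched as a brute-force search. This is only an illustration under our own assumptions: the function name `pick_leaders`, the representation of demands as sets of file indices, and the tie-breaking (the first valid set in lexicographic order) are not specified in the text.

```python
from itertools import combinations


def pick_leaders(demands):
    """Return a smallest set of users that (i) jointly demand every demanded file
    and (ii) each demand at least one file demanded by no other leader.

    demands: dict mapping a user index to the set of file indices it demands.
    """
    users = sorted(demands)
    all_files = set().union(*demands.values())
    for size in range(1, len(users) + 1):
        for cand in combinations(users, size):
            covered = set().union(*(demands[u] for u in cand))
            if covered != all_files:
                continue  # condition (i) fails
            has_own_file = True
            for u in cand:
                rest = set().union(*(demands[v] for v in cand if v != u))
                if not (demands[u] - rest):
                    has_own_file = False  # condition (ii) fails for leader u
                    break
            if has_own_file:
                return cand
    raise ValueError("no leader set found (empty demands?)")


# The four exceptional demands treated in this appendix (file F_i written as i).
d7 = {1: {1, 2}, 2: {1, 3}, 3: {2, 3}}
d15 = {1: {1, 2}, 2: {1, 3}, 3: {2, 3}, 4: {4, 5}}
d17 = {1: {1, 2}, 2: {1, 3}, 3: {1, 4}, 4: {2, 3}}
d20 = {1: {1, 2}, 2: {1, 2}, 3: {1, 3}, 4: {2, 3}}
leaders = [pick_leaders(d) for d in (d7, d15, d17, d20)]
```

For these four demands the search returns the user sets $\{1,2\}$ , $\{1,2,4\}$ , $\{3,4\}$ and $\{1,3\}$ , which are the sets underlying the leader permutations used in the examples of this appendix. The order in which the leaders are served is a separate choice that this sketch does not make.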

1. $d_{1}=\{F_{1},F_{2}\}$ , $d_{2}=\{F_{1},F_{3}\}$ , and $d_{3}=\{F_{2},F_{3}\}$ (case $D_{7}$ in [17]). We use the MAN placement and divide each file $F_{i}$ where $i\in[{\mathsf{N}}]$ into $\binom{{\mathsf{K}}}{t}$ non-overlapping and equal-length subfiles, $F_{i}=\{F_{i,{\mathcal{W}}}:{\mathcal{W}}\subseteq[{\mathsf{K}}],|{\mathcal{W}}|=t\}$ , where $t={\mathsf{K}}{\mathsf{M}}/{\mathsf{N}}=1$ . It can be seen that this case is equivalent to our considered $({\mathsf{N}},{\mathsf{K}},{\mathsf{M}},{\mathsf{r}})=(3,3,1,2)$ shared-link caching problem with correlated files. Hence, we can directly use the proposed delivery phase in this paper to transmit the linear combinations (with leader permutation $(1,2)$ )

[TABLE]

Hence, the load is $5/3$ , which coincides with the converse bound under the constraint of MAN placement in [17], while the caching scheme proposed in [17] achieves $2$ .

2. $d_{1}=\{F_{1},F_{2}\}$ , $d_{2}=\{F_{1},F_{3}\}$ , $d_{3}=\{F_{2},F_{3}\}$ , and $d_{4}=\{F_{4},F_{5}\}$ (case $D^{\prime}_{15}$ in [17]). It can be seen that if we only focus on the demands of users $1,2,3$ , the problem is equivalent to our considered $({\mathsf{N}},{\mathsf{K}},{\mathsf{M}},{\mathsf{r}})=(3,3,1,2)$ shared-link caching problem with correlated files. In addition, the files demanded by user $4$ are independent of any file demanded by users $1,2,3$ . Hence, we first satisfy the demands of user $4$ and then use the codes for our considered $({\mathsf{N}},{\mathsf{K}},{\mathsf{M}},{\mathsf{r}})=(3,3,1,2)$ shared-link caching problem with correlated files. Thus we transmit (with leader permutation $(4,1,2)$ )

[TABLE]

Hence, the load is $11/4$ , which coincides with the converse bound under the constraint of MAN placement in [17], while the caching scheme proposed in [17] achieves $3$ .

3. $d_{1}=\{F_{1},F_{2}\}$ , $d_{2}=\{F_{1},F_{3}\}$ , $d_{3}=\{F_{1},F_{4}\}$ , and $d_{4}=\{F_{2},F_{3}\}$ (case $D^{\prime}_{17}$ in [17]). We choose the leader set as $\{3,4\}$ and the chosen permutation is $(3,4)$ . Inspired by the proposed scheme for $t=1$ , the delivery contains two steps, where in the first and second steps we satisfy the demands of users $3$ and $4$ , respectively.

In Step $1$ , we first let user $3$ recover $F_{1}$ . For each user $k\in\{1,2,4\}$ , if $F_{1}$ is demanded by user $k$ , we transmit $F_{1,\{k\}}\oplus F_{1,\{3\}}$ ; otherwise, we pick one demanded file by user $k$ which is not $F_{4}$ (assumed to be $F_{i}$ ), and transmit $F_{1,\{k\}}\oplus F_{i,\{3\}}$ .

We then let user $3$ recover $F_{4}$ . For each user $k\in\{1,2,4\}$ , if $F_{4}$ is demanded by user $k$ , we transmit $F_{4,\{k\}}\oplus F_{4,\{3\}}$ ; otherwise, we pick one demanded file by user $k$ which is not $F_{i}$ nor $F_{1}$ (assumed to be $F_{i^{\prime}}$ ), and transmit $F_{4,\{k\}}\oplus F_{i^{\prime},\{3\}}$ .
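Step $1$ as described above can be sketched as follows. This is a minimal illustration under stated assumptions: a subfile $F_{f,\{k\}}$ is written as the pair `(f, k)`, the names `serve_leader` and `decoded_by` are ours, and where the text says to pick one file we arbitrarily take the smallest eligible index.

```python
def serve_leader(demands, leader):
    """XOR pairs sent while the leader recovers its files, one file at a time.

    Under the MAN placement with t = 1, user k caches the subfile (f, k)
    of every file f.
    """
    others = [k for k in sorted(demands) if k != leader]
    used = {k: set() for k in others}  # files already paired with user k
    sent = []
    for f in sorted(demands[leader]):
        for k in others:
            if f in demands[k]:
                partner = f
            else:
                # Assumption: smallest file demanded by k that the leader does
                # not demand and that has not been paired with k yet.
                partner = min(demands[k] - demands[leader] - used[k])
            used[k].add(partner)
            sent.append(((f, k), (partner, leader)))
    return sent


def decoded_by(sent, user):
    """Subfiles a user learns: the half of each XOR that it does not cache."""
    out = set()
    for a, b in sent:
        if a[1] == user and b[1] != user:
            out.add(b)
        elif b[1] == user and a[1] != user:
            out.add(a)
    return out


demands = {1: {1, 2}, 2: {1, 3}, 3: {1, 4}, 4: {2, 3}}  # case D'_17
sent = serve_leader(demands, leader=3)
```

With this tie-breaking, Step $1$ consists of six XORs. User $3$ obtains $F_{1,\{k\}}$ and $F_{4,\{k\}}$ for every $k\in\{1,2,4\}$ , and each non-leader user $k$ obtains $F_{i,\{3\}}$ for both of its demanded files, which is the alignment the rule is designed to achieve.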

In this way, we transmit in the two steps (with leader permutation $(3,4)$ )

[TABLE]

Hence, the load is $10/4$ , which coincides with the converse bound under the constraint of MAN placement in [17], while the caching scheme proposed in [17] achieves $11/4$ .

4. $d_{1}=\{F_{1},F_{2}\}$ , $d_{2}=\{F_{1},F_{2}\}$ , $d_{3}=\{F_{1},F_{3}\}$ , and $d_{4}=\{F_{2},F_{3}\}$ (case $D^{\prime}_{20}$ in [17]). It can be seen that this case is equivalent to our considered $({\mathsf{N}},{\mathsf{K}},{\mathsf{M}},{\mathsf{r}})=(3,4,1,2)$ shared-link caching problem with correlated files. Hence, we can directly use the proposed delivery phase in this paper to transmit the linear combinations (with leader permutation $(1,3)$ )

[TABLE]

Hence, the load is $2$ , which coincides with the converse bound under the constraint of MAN placement in [17], while the proposed caching scheme in [17] achieves $9/4$ .
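The four loads quoted in this appendix can be compared with those of the scheme in [17] by a short piece of exact arithmetic. The sketch below assumes that loads are measured in files and that one subfile has size $1/\binom{{\mathsf{K}}}{t}$ of a file with $t=1$ ; the helper name `saving_in_subfiles` is ours.

```python
from fractions import Fraction
from math import comb

# (case, K, load achieved above, load of the scheme in [17]), as quoted in the text
cases = [
    ("D_7", 3, Fraction(5, 3), Fraction(2)),
    ("D'_15", 4, Fraction(11, 4), Fraction(3)),
    ("D'_17", 4, Fraction(10, 4), Fraction(11, 4)),
    ("D'_20", 4, Fraction(2), Fraction(9, 4)),
]


def saving_in_subfiles(K, new_load, old_load, t=1):
    """Load reduction expressed in units of one subfile (1/binom(K, t) of a file)."""
    return (old_load - new_load) * comb(K, t)


savings = {name: saving_in_subfiles(K, new, old) for name, K, new, old in cases}
```

Under these assumptions the reduction equals exactly one subfile-sized transmission in each of the four cases, that is, $1/3$ of a file for ${\mathsf{K}}=3$ and $1/4$ of a file for ${\mathsf{K}}=4$ .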

Bibliography


[1] M. A. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. Infor. Theory, vol. 60, no. 5, pp. 2856–2867, May 2014.
[2] K. Wan, D. Tuninetti, and P. Piantanida, “On the optimality of uncoded cache placement,” in IEEE Infor. Theory Workshop (ITW), Sep. 2016.
[3] Q. Yu, M. A. Maddah-Ali, and S. Avestimehr, “The exact rate-memory tradeoff for caching with uncoded prefetching,” IEEE Trans. Infor. Theory, vol. 64, no. 2, pp. 1281–1296, Feb. 2018.
[4] Q. Yu, M. A. Maddah-Ali, and S. Avestimehr, “Characterizing the rate-memory tradeoff in cache networks within a factor of 2,” in IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2017.
[5] M. A. Maddah-Ali and U. Niesen, “Decentralized coded caching attains order-optimal memory-rate tradeoff,” IEEE/ACM Trans. Networking, vol. 23, no. 4, pp. 1029–1040, Aug. 2015.
[6] M. Ji, G. Caire, and A. Molisch, “Fundamental limits of caching in wireless D2D networks,” IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 849–869, 2016.
[7] N. Karamchandani, U. Niesen, M. A. Maddah-Ali, and S. Diggavi, “Hierarchical coded caching,” IEEE Trans. Infor. Theory, vol. 62, no. 6, pp. 3212–3229, Jun. 2016.
[8] M. Ji, A. M. Tulino, J. Llorca, and G. Caire, “Caching in combination networks,” in 49th Asilomar Conf. on Sig., Sys. and Comp., Nov. 2015.

