Spatial Soft-Core Caching

Derya Malak; Muriel Médard; Edmund Yeh

arXiv:1901.11102·cs.IT·February 1, 2019

TL;DR

This paper introduces a decentralized spatial soft-core cache placement policy for wireless networks that improves cache efficiency and supports proximity-based applications by balancing cache diversity and size constraints.

Contribution

The paper presents a novel spatial soft-core cache placement method that achieves high cache efficiency and can be tuned for cache size constraints in wireless networks.

Findings

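01 For a given cache hit probability, independent placement can require more than 180% extra cache storage relative to SSCC.

02 Hard-core placement can require more than 100% extra cache storage relative to SSCC.

03 SSCC promotes item diversity and reciprocation among nodes, which suits proximity-based applications such as D2D and P2P networking.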

Abstract

We propose a decentralized spatial soft-core cache placement (SSCC) policy for wireless networks. SSCC yields a spatially balanced sampling via negative dependence across caches, and can be tuned to satisfy cache size constraints with high probability. Given a desired cache hit probability, we compare the 95% confidence intervals of the required cache sizes for independent placement, hard-core placement and SSCC policies. We demonstrate that in terms of the required cache storage size, SSCC can provide up to more than 180% and 100% gains with respect to the independent and hard-core placement policies, respectively. SSCC can be used to enable proximity-based applications such as device-to-device communications and peer-to-peer networking as it promotes the item diversity and reciprocation among the nodes.

Equations

$$F(Z)=\mathbb{E}_{\mathcal{I}}\left[\sum_{k=1}^{\infty}w_{k}\Big(1-\prod_{k^{\prime}=1}^{k}\big(1-z_{p_{k^{\prime}}\mathcal{I}}\big)\Big)\right],$$

$$z_{x_{k}i}=\mathbbm{1}\{i\in{\rm Cache}(x_{k})\}=\mathbbm{1}\{x_{k}\in\Phi_{th,i}\}.$$

$$N=\mathbb{E}[C(x_{k})]=\sum_{i}p(x_{k},m_{k}^{(i)},v_{k}^{(i)},\Phi),\quad x_{k}\in\Phi.$$

$$p(x,m,v,\Phi)=p_{0}\prod_{(y,n,w)\in\tilde{\Phi},\,y\neq x}\big[1-\mathbbm{1}\{v\geq w\}f(||x-y||,m,n)\big]$$

$$f_{c}(r,m,n)=\exp\big(-c\lfloor r-m-n\rfloor_{+}\big),\quad r\geq 0.$$

$$\lambda_{th}=\lambda p_{0}\int_{\mathbb{R}}\int_{\mathbb{R}}\exp\Big(-\lambda\int_{\mathbb{R}}F_{\nu_{n}}(w)\int_{\mathbb{R}^{2}}f(||x||,m,n)\,{\rm d}x\,\mu({\rm d}n)\Big)\,\nu_{m}({\rm d}w)\,\mu({\rm d}m),$$

$$H(r)=\mathbb{P}\big(\operatorname*{R_{\sf Sph}}\leq r\,\big|\,\operatorname*{R_{\sf Sph}}>0\big),\quad r\geq 0,$$

$$\mathbb{E}_{\pi}[F(Z)]=\mathbb{E}_{\mathcal{I}}\big[\operatorname*{H_{\pi,\mathcal{I}}}(\operatorname*{R_{\sf c}})\big],$$

$$F(Z)=\sum_{i}p_{r}(i)\,\mathbbm{1}\big(\Phi_{th,i}(B)>0\big).$$

$$\mathbb{P}\big(\Phi_{th,i}(B)>0\big)=\mathbb{P}\big(\operatorname*{R_{\sf Sph}}\leq\operatorname*{R_{\sf c}}\,\big|\,\operatorname*{R_{\sf Sph}}>0\big),$$

$$\mathrm{Var}_{\pi}\!\left[F(Z)\right]=\sum_{i}p_{r}^{2}(i)\,\operatorname*{H_{\pi,i}}(\operatorname*{R_{\sf c}})\big(1-\operatorname*{H_{\pi,i}}(\operatorname*{R_{\sf c}})\big)$$

$$\operatorname*{H_{\pi}}(R)=1-\exp\Big(-\int_{0}^{R}2\pi r\lambda\,\eta_{\pi}(r,\delta)\,{\rm d}r\Big),$$

$$\eta_{\pi}(r,\delta)=\mathbb{P}\big(x\in\Phi_{th}\,\big|\,\Phi_{th}\cap B_{x_{0}}(r)=\emptyset,\ x_{0}\in\Phi\big).$$

$$\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})=\int_{\mathbb{R}}\int_{0}^{1}\exp\Big(-u\lambda\int_{\mathbb{R}}\int_{\mathbb{R}^{2}}h(||x||,m,n)\,{\rm d}x\,\mu({\rm d}n)\Big)\,{\rm d}u\,\mu({\rm d}m),$$

$$\operatorname*{\eta_{\sf MatII}}(r,\delta)=\frac{1-e^{-\lambda(\pi\delta^{2}-l_{2}(r,\delta))}}{\lambda(\pi\delta^{2}-l_{2}(r,\delta))}.$$

$$\begin{aligned}
\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})&=\int_{\mathbb{R}}\int_{0}^{1}\exp\Big(-u\lambda\int_{\mathbb{R}}\big(\pi(m+n)^{2}-l_{2}(r,n)\big)\,\mu({\rm d}n)\Big)\,{\rm d}u\,\mu({\rm d}m)\\
&=\mathbb{E}_{m}\Big[\mathbb{E}_{U}\big[e^{-Uq(\lambda,r,m)}\big]\Big]=\mathbb{E}_{m}\left[\frac{1-e^{-q(\lambda,r,m)}}{q(\lambda,r,m)}\right],
\end{aligned}$$

$$\mathbb{P}(C(x)>C)\leq\exp\left(-\frac{(C-N)^{2}}{{\rm Var}[C(x)]+\frac{1}{3}(C-N)}\right),$$


Full text

Derya Malak and Muriel Médard

Research Laboratory of Electronics, MIT, Cambridge, MA, USA

Email: {deryam, medard}@mit.edu

Edmund M. Yeh

Northeastern University, Boston, MA, USA

Email: [email protected]

I Introduction

Distributed caching is a powerful technique to minimize the total average delay by replacing the backhaul capacity with storage capacity at small cells [1], and to enable spectral reuse and throughput gain in networks [2]. The goal of an efficient cache placement is to maximize the hit probability, i.e. the probability of obtaining the desired item from a neighboring cache. This is affected by the demand distribution, network topology, range of communication, and cache storage size.

Fundamental limits of caching have been studied in [2], in which the content placement phase is carefully designed so that a single coded multicast transmission can satisfy different demands. Capacity scaling laws have been explored in [2], and rate-memory and storage-latency tradeoffs have been studied in [3]. Caching has been studied in the context of device-to-device (D2D) communications in [4], and interference management in [5], and in optimization of cloud and edge processing for radio access networks in [6].

Temporal caching models have been analyzed in [7] for popular cache replacement algorithms, e.g. least recently used (LRU), least-frequently used, and most recently used cache update. Decentralized spatial LRU caching strategies have been developed in [8]. These combine the temporal and spatial aspects of caching, and approach the performance of centralized policies as the coverage increases. However, they are restricted to the LRU principle. A time-to-live (TTL) policy with a stochastic capacity constraint and low variance has been proposed in [9]. The BitTorrent protocol employs the rarest first and choke algorithms to promote diversity of the pieces among peers, and foster reciprocation, respectively. These have been demonstrated in the context of peer-to-peer (P2P) file replication in the Internet [10]. A good piece replication algorithm should minimize the time spent in the transient state.

There exist studies focusing on decentralized (geographic) content placement policies such as [1], [11], [12], [13]. The main focus of the literature in this direction is to maximize the average cache hit probability subject to an average cache constraint. This optimization problem can be solved as a convex program. However, to the best of our knowledge, the related literature does not provide guarantees in terms of (a) how far off the average cache size is from reality, (b) how far off the average cache hit rate is from reality, and (c) how stable the cache hit probability is across the caches.

In the current paper, we provide a decentralized spatial soft-core cache placement (SSCC) policy. Since the cache storage size is finite, it is intuitive to have an exclusion range-based caching model such that the caches storing the same item are never closer to each other than some given distance (negative dependence), so as to promote diversity and reciprocation. SSCC is rooted in spatially balanced sampling, which is motivated by the request arrivals. For example, in P2P networking, the actual demand distribution is not known by nodes, and the cache updates in each peer are triggered by the requests. Furthermore, the traffic density in cellular networks is in general not uniform across the network, and the peak hour density can be approximated by a log-normal distribution [14]. Hence, instead of having a fixed exclusion range, it is desirable to have a variable exclusion range, depending on the popularity of the item. The SSCC policy achieves this by putting a mark distribution on the exclusion range of an item based on its popularity. The marks may correspond to the detection ranges or the transmit powers of the nodes in heterogeneous network scenarios. Our objective is to address issues (a)-(c) above in order to provide a better trade-off between the actual cache hit rate and the cache size violation probability. Our main contributions and use cases of SSCC are:

i. SSCC has desirable properties: spatially balanced sampling across caches, concentration of the cache size, better cache over-provisioning, and multi-hop connectivity.

ii. SSCC yields a better cache hit probability-cache violation probability tradeoff than the state of the art. In terms of the required cache storage size, SSCC can provide more than 180% and 100% gains with respect to independent placement [11], and hard-core placement [13], respectively.

iii. SSCC is suited for enabling proximity-based applications (D2D, P2P), and offloading mobile users in networks.

iv. SSCC has connections with rarest first caching as it promotes the item diversity and reciprocation among the nodes. Hence, it can be well-suited for P2P applications.

Notation. Let $\Phi$ denote the mother point process (p.p.), and $\Phi_{th}$ be the child p.p. obtained via the thinning of $\Phi$ . Let $\pi$ be a spatial caching policy that yields a set of child p.p.’s $\{\Phi_{th,i}\}_{i}$ , where $\Phi_{th,i}$ is the set of retained points that cache item $i$ . Let $A$ be a given bounded convex set in $\mathbb{R}^{2}$ containing the origin, and $rA$ be its dilation by the factor $r$ . $\mathbbm{1}\{A\}$ is the indicator of event $A$ . Let $B$ be a bounded Borel set. Let $\Phi(B)$ be the random number of points of the spatial p.p. $\Phi$ which lie in $B$ . Any receiver can obtain the desired item $i$ if it is within a critical communication range $\operatorname*{R_{\sf c}}$ . Assume that $B=B_{0}(\operatorname*{R_{\sf c}})$ , where $B_{0}(r)$ is a ball in $\mathbb{R}^{2}$ with radius $r$ , centered at origin.

II How to Optimize the Caching Gain

The locations of the nodes (caches) in the network are modeled by a homogeneous Poisson point process (PPP) $\Phi$ in $\mathbb{R}^{2}$ with intensity $\lambda$ . There are $M$ items in the network, each having the same size, and each node has the same cache storage size $N<M$ . Each user makes requests based on a Zipf popularity distribution over the set of the items. The probability mass function (pmf) of such requests (demand) is given by $p_{r}(i)=i^{-\gamma_{r}}\big{/}\sum\nolimits_{j=1}^{M}{j^{-\gamma_{r}}}$ , where $\gamma_{r}$ determines the tilt of the Zipf distribution. The demand profile is the Independent Reference Model (IRM), i.e., the standard synthetic traffic model in which the request distribution does not change over time [15]. The request distribution is uniform across the network, i.e., isotropic, and does not change over time. Hence, the intensity of the requests for item $i$ , i.e. $\lambda_{i}$ , is proportional to its demand probability $p_{r}(i)$ . Let $\mathcal{I}\sim p_{r}$ be the random variable that models the demand. Each node is associated with the variables $z_{xi}=\mathbbm{1}\{i\in{\rm Cache}(x)\}$ that denote whether item $i$ is available in its cache or not. There is also a cost $w_{k}$ associated with obtaining an item within the presence of $k$ nodes in the range. Given these parameters, consider the caching gain function of the following form:

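$$F(Z)=\mathbb{E}_{\mathcal{I}}\left[\sum_{k=1}^{\infty}w_{k}\Big(1-\prod_{k^{\prime}=1}^{k}\big(1-z_{p_{k^{\prime}}\mathcal{I}}\big)\Big)\right], \tag{1}$$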

where (1) can be used to model multi-hop coverage scenarios as in [12], and Boolean model coverage scenarios as in [11], [13]. Let $\lambda_{i}=p_{r}(i)$, and $w_{k}=\mathbb{P}(\Phi(B)=k)$, which is the probability that $k$ caches (nodes of the original p.p. $\Phi$) cover the typical receiver, so that $w_{0}$ is the probability of having no connection. Assume that $k^{*}$ is the first index such that a transmitter has the desired item $i$. Then, from (1), the caching gain for item $i$ is $\sum\nolimits_{k=k^{*}}^{\infty}{w_{k}}=\mathbb{P}(\Phi(B)\geq k^{*})$, which is the probability of having at least $k^{*}$ transmitters. Equivalently, the cost of caching is $\sum\nolimits_{k=1}^{k^{*}-1}{w_{k}}$.

Since both the multi-hop and Boolean coverage scenarios are equivalent up to scaling, we focus on the second scenario.
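
The Zipf demand pmf $p_{r}(i)$ and the coverage weights $w_{k}=\mathbb{P}(\Phi(B)=k)$ defined above can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and it only assumes that $\Phi(B)$ is Poisson with mean $\lambda\pi\operatorname*{R_{\sf c}}^{2}$, which holds for a homogeneous PPP and a disc $B$.

```python
import math

def zipf_pmf(M, gamma_r):
    """Zipf request pmf p_r(i) = i^(-gamma_r) / sum_j j^(-gamma_r), i = 1..M."""
    weights = [i ** (-gamma_r) for i in range(1, M + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def coverage_weights(lam, R_c, k_max):
    """w_k = P(Phi(B) = k), k = 0..k_max, for a PPP of intensity lam and a disc of radius R_c."""
    mean = lam * math.pi * R_c ** 2
    w = [math.exp(-mean)]              # w_0: no node in range
    for k in range(1, k_max + 1):
        w.append(w[-1] * mean / k)     # Poisson recursion w_k = w_{k-1} * mean / k
    return w

def caching_gain(w, k_star):
    """P(Phi(B) >= k_star) = 1 - sum_{k < k_star} w_k."""
    return 1.0 - sum(w[:k_star])

# Parameters taken from the simulation section: M = 100, gamma_r = 0.1, lambda = 0.1, R_c = 3.
p_r = zipf_pmf(100, 0.1)
w = coverage_weights(0.1, 3.0, 60)
gains = [caching_gain(w, k_star) for k_star in (1, 2, 3)]
```

The gain shrinks as $k^{*}$ grows, which is the sense in which a placement that puts the item on one of the nearest nodes is preferable.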

We have the following immediate observation.

Proposition 1.

$F(Z)$ is convex if the $z_{p_{k^{\prime}}i}$'s are negatively associated (NA) [16] across $k^{\prime}\in\{1,\ldots,k\}$, for all $i\in\{1,\ldots,M\}$.

Proof.

Exploiting (1), we have the following relation: $\mathbb{E}[F(Z)]=\mathbb{E}_{\mathcal{I}}\left[\sum\nolimits_{k=0}^{\infty}w_{k}{\Big{(}1-\mathbb{E}\Big{[}\prod\nolimits_{k^{\prime}=1}^{k}\big{(}1-z_{p_{k^{\prime}}\mathcal{I}}\big{)}\Big{]}\Big{)}}\right]\overset{(a)}{\geq}\mathbb{E}_{\mathcal{I}}\left[\sum\nolimits_{k=0}^{\infty}w_{k}{\Big{(}1-\prod\nolimits_{k^{\prime}=1}^{k}\big{(}1-\mathbb{E}[z_{p_{k^{\prime}}\mathcal{I}}]\big{)}\Big{)}}\right]=F(\mathbb{E}[Z])$ , where $(a)$ is due to that $\mathbb{E}\Big{[}\prod\nolimits_{k^{\prime}=1}^{k}\big{(}1-z_{p_{k^{\prime}}i}\big{)}\Big{]}\leq\prod\nolimits_{k^{\prime}=1}^{k}\big{(}1-\mathbb{E}[z_{p_{k^{\prime}}i}]\big{)}$ as $z_{p_{k^{\prime}}i}$ ’s are NA across $k^{\prime}\in\{1,\ldots,k\}$ , $\forall$ $i$ . ∎

From Prop. 1, $\mathbb{E}[F(Z)]\geq F(\mathbb{E}[Z])$. The expected cache hit probability obtained via NA placement upper bounds the independent placement solution with probabilities $\mathbb{E}[z_{p_{k^{\prime}}i}]$. NA has desirable properties in terms of sampling and concentration. Some important results that hold for independent variables, e.g., the Chernoff-Hoeffding bounds and Kolmogorov's inequality [16], also hold for NA variables.

From Prop. 1, it is clear that in terms of average cache hit performance, NA placement performs better than independent placement. Therefore, our main focus is on a class of placement policies that are NA. We also demonstrate that NA placement policies have lower variance across the nodes, hence are more stable than independent placement policies.
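
The inequality behind Prop. 1, $\mathbb{E}\big[\prod_{k^{\prime}}(1-z_{p_{k^{\prime}}i})\big]\leq\prod_{k^{\prime}}(1-\mathbb{E}[z_{p_{k^{\prime}}i}])$, can be illustrated on a toy placement. The sketch below is an illustration under an assumption of ours, not the SSCC policy: the NA placement stores the item at exactly $c$ of $k$ nodes chosen uniformly without replacement (a standard example of negatively associated indicators), and it is compared with independent placement at the same marginal probability $c/k$.

```python
from math import comb

def miss_prob_independent(k, c, j):
    """P(none of the first j nodes holds the item) when each node caches it
    independently with probability c/k."""
    return (1.0 - c / k) ** j

def miss_prob_na(k, c, j):
    """Same event when exactly c of the k nodes, chosen uniformly without
    replacement, hold the item; equals C(k-c, j) / C(k, j)."""
    return comb(k - c, j) / comb(k, j)

# Hypothetical toy sizes: 10 nodes in range, 3 copies of the item.
k, c = 10, 3
for j in range(1, k + 1):
    hit_na = 1.0 - miss_prob_na(k, c, j)
    hit_ind = 1.0 - miss_prob_independent(k, c, j)
    print(j, round(hit_na, 4), round(hit_ind, 4))
```

For every $j$ the NA placement has the smaller miss probability, so each term of the sum in (1) is at least as large as under independent placement with the same marginals.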

III A Soft-Core Caching Model

The spatial soft-core caching (SSCC) policy is constructed from the underlying PPP $\Phi$ by removing certain nodes depending on the positions of the neighboring nodes, and on the marks and weights attached to them. It generalizes the Matérn II hard-core p.p. (MatII) such that there is a distinct distribution modeling the exclusion radius of each item.

For each item $i$, let $\tilde{\Phi}_{i}=\{(x_{k},m_{k}^{(i)},v_{k}^{(i)})\}_{k}$ be a homogeneous independently marked PPP with intensity $\lambda$ and i.i.d. $\mathbb{R}^{2}$-valued marks, where $\Phi=\{x_{k}\}$, and $\{(m_{k}^{(i)},v_{k}^{(i)})\}$ is the random bivariate mark. The first component $m^{(i)}$ of the bivariate mark is referred to as the mark, and has distribution $\mu^{(i)}$. The mark of item $i$, i.e., $m^{(i)}$, denotes its exclusion radius, and depends on its popularity in the network. If item $i$ is more popular than item $j$, then $m^{(i)}$ is stochastically dominated by $m^{(j)}$. (Footnote 1: $X$ is stochastically dominated by $Y$, which is denoted by $X\leq^{st}Y$, if for all increasing functions $g$, we have $\mathbb{E}[g(X)]\leq\mathbb{E}[g(Y)]$.) The second component $v^{(i)}$ of the bivariate mark is the weight, which is used in the thinning procedure, and has distribution $\nu^{(i)}_{m^{(i)}}$, which might depend on $m^{(i)}$.

Let $\Phi_{th,i}$ be a soft-core p.p. that denotes the set of points that cache item $i$ . The cache placement model is such that item $i$ is stored in cache $x_{k}\in\Phi$ if and only if cache $x_{k}$ is kept as a point of $\Phi_{th,i}$ . Equivalently, we have

$$z_{x_{k}i}=\mathbbm{1}\{i\in{\rm Cache}(x_{k})\}=\mathbbm{1}\{x_{k}\in\Phi_{th,i}\}. \tag{2}$$

Node $x_{k}$ is retained as a point of $\Phi_{th,i}$ with probability $\mathbb{E}[z_{x_{k}i}]=p(x_{k},m_{k}^{(i)},v_{k}^{(i)},\Phi)$ . The weights are i.i.d. and uniformly distributed, i.e. $v_{k}^{(i)}\sim U[0,1]$ , for each node $x_{k}$ and item $i$ . The marks $m_{k}^{(i)}$ are distributed according to $\mu^{(i)}$ for each $x_{k}$ , and $i$ . For the special case of MatII, i.e., when the marks are fixed, we optimized the exclusion radii in [13].

The number of items in cache $x_{k}$ is the sum of the individual items’ indicator functions $C(x_{k})=\sum\nolimits_{i}\mathbbm{1}\{i\in{\rm Cache}(x_{k})\}$ . The cache size constraint has to be satisfied on average, i.e.

$$N=\mathbb{E}[C(x_{k})]=\sum_{i}p(x_{k},m_{k}^{(i)},v_{k}^{(i)},\Phi),\quad x_{k}\in\Phi. \tag{3}$$

We next detail the dependent thinning procedure, and investigate the relationship between $\Phi$ and $\Phi_{th,i}$, $i\in\{1,\ldots,M\}$.

III-A Dependent Sampling of Nodes for Placement

From this section onwards, for brevity of notation, we omit the index $i$, and consider the generic thinned process $\Phi_{th}$, which is derived from $\Phi$ by applying the following probabilistic dependent thinning rule. Assume that mark $m$ has a distribution $\mu$, and $\vec{\delta}=\{m\}$ is the set of marks for all points in $\tilde{\Phi}$, where $m\sim\mu$ and $\bar{m}=\mathbb{E}_{m}[m]$. Assume that the weight distribution $\nu_{m}$ does not depend on the mark $m$. The marked point $(x,m,v)\in\tilde{\Phi}$ is retained as a point of $\Phi_{th}$ with probability

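$$p(x,m,v,\Phi)=p_{0}\prod_{(y,n,w)\in\tilde{\Phi},\,y\neq x}\big[1-\mathbbm{1}\{v\geq w\}f(||x-y||,m,n)\big], \tag{4}$$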

independently of deleting or retaining other points of $\Phi$. In other words, a node $x\in\Phi$ is retained to cache item $i$ with probability $p_{0}$ if it has the lowest weight among all the points within its exclusion range. In (4), $p_{0}\in(0,1]$ and $f:[0,\infty)\times\mathbb{R}^{2}\to[0,1]$ is a deterministic function satisfying $f(\cdot,m,n)=f(\cdot,n,m)$ for all $m,\,n\in\mathbb{R}$. This means that if two points with marks $m$ and $n$, and weights $v\geq w$, are a distance $r>0$ apart, then the point with weight $v$ is deleted by the other point with probability $f(r,m,n)$. Additionally, each surviving point is then again independently $p_{0}$-thinned. The function $f(||x-y||,m,n)$ in (4) should be determined according to (3). Inspired by [17], assume that $f$ satisfies

$$f_{c}(r,m,n)=\exp\big(-c\lfloor r-m-n\rfloor_{+}\big),\quad r\geq 0. \tag{5}$$

Denote by SSCC $[\lambda,\mu,(\nu_{m})_{m\in\mathbb{R}},p_{0},f]$ the distribution of $\Phi_{th}$ . We next give its intensity, i.e., $\lambda_{th}=\lambda\mathbb{E}[p(x,m,v,\Phi)]$ .

Theorem 1.

[17, Theorem 12] The intensity of the process $\Phi_{th}\sim$ SSCC $[\lambda,\mu,(\nu_{m})_{m\in\mathbb{R}},p_{0},f]$ is given by

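$$\lambda_{th}=\lambda p_{0}\int_{\mathbb{R}}\int_{\mathbb{R}}\exp\Big(-\lambda\int_{\mathbb{R}}F_{\nu_{n}}(w)\int_{\mathbb{R}^{2}}f(||x||,m,n)\,{\rm d}x\,\mu({\rm d}n)\Big)\,\nu_{m}({\rm d}w)\,\mu({\rm d}m), \tag{6}$$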

where $F_{\nu_{m}}$ is the cumulative distribution function of $\nu_{m}$ .

Proof.

The probability generating functional (PGFL) [18] of the PPP states for function $f(x)$ that $\mathbb{E}\left[\prod\nolimits_{x\in\Phi}f(x)\right]=\exp\big{(}-\lambda\int\nolimits_{\mathbb{R}^{2}}(1-f(x)){\rm d}x\big{)}$ . We obtain $\lambda_{th}$ using the PGFL and $\mathbb{E}[\mathbbm{1}\{\nu_{n}\leq w\}]=\int\nolimits_{\mathbb{R}}F_{\nu_{n}}(w)\mu({\rm d}n)$ , along with (4). ∎
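
As a sanity check on the intensity formula of Theorem 1, consider a hard-core special case. The setting is an illustrative assumption, not a case worked out in the text: $p_{0}=1$, weights uniform on $[0,1]$ so that $F_{\nu}(w)=w$, deterministic marks with $m+n=\delta$, and the kernel $f(r,m,n)=\mathbbm{1}\{r\leq\delta\}$. The inner spatial integral then equals $\pi\delta^{2}$, and the formula reduces to the classical Matérn II intensity:

```latex
\lambda_{th}
  = \lambda \int_{0}^{1} \exp\!\big(-\lambda\, w\, \pi \delta^{2}\big)\,{\rm d}w
  = \frac{1-e^{-\lambda \pi \delta^{2}}}{\pi \delta^{2}} .
```

As $\lambda\to\infty$ this saturates at $1/(\pi\delta^{2})$, i.e. at most one retained point per exclusion disc on average.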

In Fig. 1, we plot different realizations of SSCC $\Phi_{th}$ formed by thinning $\Phi$ . As the mark variance increases, the packing is denser, which is desired for spatially balanced caching.
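
The dependent thinning rule of this section can be sketched in a few lines. This is a minimal simulation under stated assumptions, not the authors' simulator: it uses the kernel $f_{c}(r,m,n)=\exp(-c\lfloor r-m-n\rfloor_{+})$, uniform weights, a square window without edge correction, and gamma-distributed marks whose shape parameter `0.7 * r_i` uses a hypothetical value `r_i = 2`. All function names are ours.

```python
import math
import random

def sample_ppp(lam, side, rng):
    """Homogeneous PPP on [0, side]^2: Poisson count, then uniform locations."""
    mean = lam * side * side
    # Poisson(mean) count = number of unit-rate exponential arrivals before `mean`.
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(count)]

def f_c(r, m, n, c):
    """Soft-core kernel: certain deletion within m + n, exponential decay beyond."""
    return math.exp(-c * max(r - m - n, 0.0))

def sscc_thin(points, marks, c, p0, rng):
    """Return indices of retained points under the pairwise deletion rule."""
    weights = [rng.random() for _ in points]   # i.i.d. U[0, 1] weights
    kept = []
    for a, (xa, ya) in enumerate(points):
        survives = True
        for b, (xb, yb) in enumerate(points):
            if a == b or weights[a] < weights[b]:
                continue                       # only weights <= ours can delete us
            r = math.hypot(xa - xb, ya - yb)
            if rng.random() < f_c(r, marks[a], marks[b], c):
                survives = False
                break
        if survives and rng.random() < p0:     # final independent p0-thinning
            kept.append(a)
    return kept

rng = random.Random(0)
pts = sample_ppp(0.1, 30.0, rng)
r_i = 2.0                                      # hypothetical exclusion range
marks = [rng.gammavariate(0.7 * r_i, 1.0) for _ in pts]
kept = sscc_thin(pts, marks, c=10.0, p0=1.0, rng=rng)
```

Deletions are decided against the original process rather than sequentially, so a point can be removed by a neighbour that is itself removed, as in Matérn II thinning. Because $f_{c}=1$ whenever $r\leq m+n$, no two retained points are closer than the sum of their marks.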

III-B Spherical Contact Distribution Function

Our goal in this section is to relate the cache hit probability distribution to the (spherical) contact distribution function.

Definition 1.

The spherical contact distribution function (SCDF) of the p.p. $\Xi$ is the conditional distribution function of the distance from a point chosen randomly outside $\Xi$ (i.e. [math]), to the nearest point of $\Xi$ given $0\notin\Xi$ [18]. It is given by

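$$H(r)=\mathbb{P}\big(\operatorname*{R_{\sf Sph}}\leq r\,\big|\,\operatorname*{R_{\sf Sph}}>0\big),\quad r\geq 0, \tag{7}$$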

where $\operatorname*{R_{\sf Sph}}=\inf\{s:\Xi\cap sA\neq\emptyset\}$ , where $A=B_{0}(1)$ , and $rA$ is the dilation of the set $A$ by the factor $r$ .

As an example, in Fig. 2 we show the SCDF for the Boolean model with random spherical grains in [19, Ch. 3.1].

Theorem 2.

The average cache hit probability of policy $\pi$ is

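$$\mathbb{E}_{\pi}[F(Z)]=\mathbb{E}_{\mathcal{I}}\big[\operatorname*{H_{\pi,\mathcal{I}}}(\operatorname*{R_{\sf c}})\big],$$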

where $\operatorname*{H_{\pi,\mathcal{I}}}(\operatorname*{R_{\sf c}})$ is the SCDF of the thinned p.p. $\Phi_{th,\mathcal{I}}$ for $\mathcal{I}$ .

Proof.

Let $B=B_{0}(\operatorname*{R_{\sf c}})$ and $\Phi_{th,i}(B)=\sum\nolimits_{x\in\Phi_{th,i}}1(x\in B)$ be the number of transmitters containing item $i$ within a circular region of radius $\operatorname*{R_{\sf c}}$ around the origin. Then we have

$$F(Z)=\sum_{i}p_{r}(i)\,\mathbbm{1}\big(\Phi_{th,i}(B)>0\big).$$

The average cache hit probability is given by $\mathbb{E}[F(Z)]=\sum\nolimits_{i}{p_{r}(i)\mathbb{P}(\Phi_{th,i}(B)>0)}$ , where defining $\operatorname*{R_{\sf Sph}}=\inf\{s:\Phi_{th,i}(B_{0}(s))\neq 0\}$ , given $0\notin\Phi_{th,i}$ we have that

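$$\mathbb{P}\big(\Phi_{th,i}(B)>0\big)=\mathbb{P}\big(\operatorname*{R_{\sf Sph}}\leq\operatorname*{R_{\sf c}}\,\big|\,\operatorname*{R_{\sf Sph}}>0\big),$$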

which is the SCDF of $\Phi_{th,i}$ evaluated at $\operatorname*{R_{\sf c}}$ . ∎

The variance of $F(Z)$ across the nodes satisfies

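$$\mathrm{Var}_{\pi}\!\left[F(Z)\right]=\sum_{i}p_{r}^{2}(i)\,\operatorname*{H_{\pi,i}}(\operatorname*{R_{\sf c}})\big(1-\operatorname*{H_{\pi,i}}(\operatorname*{R_{\sf c}})\big)$$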

since the spatial thinning processes across different items are independent of each other. Under the IRM and a Zipf popularity model, $\mathrm{Var}_{\pi}\!\left[{F(Z)}\right]$ decreases with increasing variance of marks when $\mathbb{E}[C(x)]$ is held constant. A spatially balanced sampling yields a low $\mathrm{Var}_{\pi}\!\left[{F(Z)}\right]$ as expected.
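
For the independent placement baseline the quantities in Theorem 2 and in the variance expression are available in closed form, which makes a quick numerical reference point. The sketch below is our illustration and does not cover SSCC: if each node stores item $i$ independently with probability $q_{i}$, the nodes holding $i$ form a PPP of intensity $\lambda q_{i}$, so $\operatorname*{H_{\pi,i}}(\operatorname*{R_{\sf c}})=1-\exp(-\lambda q_{i}\pi\operatorname*{R_{\sf c}}^{2})$. The choice $q_{i}=\min(1,Np_{r}(i))$ is a hypothetical placement, not an optimized one.

```python
import math

def zipf_pmf(M, gamma_r):
    """Zipf request pmf over items 1..M."""
    weights = [i ** (-gamma_r) for i in range(1, M + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def hit_mean_and_variance(p_r, q, lam, R_c):
    """Mean sum_i p_r(i) H_i and variance sum_i p_r(i)^2 H_i (1 - H_i) of F(Z)
    when item i is cached independently with probability q[i]."""
    area = math.pi * R_c ** 2
    # Void probability of a PPP with intensity lam * q_i gives H_i(R_c).
    H = [1.0 - math.exp(-lam * qi * area) for qi in q]
    mean = sum(p * h for p, h in zip(p_r, H))
    var = sum(p * p * h * (1.0 - h) for p, h in zip(p_r, H))
    return mean, var

p_r = zipf_pmf(100, 0.1)
N = 10                                    # hypothetical average cache size
q = [min(1.0, N * p) for p in p_r]        # hypothetical placement probabilities
mean, var = hit_mean_and_variance(p_r, q, lam=0.1, R_c=3.0)
```

Replacing `H` by the SCDF of another thinned process gives the same two summaries for that policy.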

III-C Migration to the Child Process: Effective Thinning

Consider the pair $\Phi-\Phi_{th}$ of mother and child p.p.’s. The spherical contact distance denotes the distance between a typical point in $\Phi$ and its nearest neighbor from $\Phi_{th}$ .

Using (7), the SCDF for the p.p. $\Phi$ can be written as:

$$\operatorname*{H_{\pi}}(R)=1-\exp\Big(-\int_{0}^{R}2\pi r\lambda\,\eta_{\pi}(r,\delta)\,{\rm d}r\Big),$$

where $\eta_{\pi}(r,\delta)$ is the conditional thinning Palm-probability (CTPP), i.e. the probability of the point $x\in\Phi$ migrating to $\Phi_{th}$ under policy $\pi$ , with a fixed (exclusion) radius $\delta$ . It equals

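$$\eta_{\pi}(r,\delta)=\mathbb{P}\big(x\in\Phi_{th}\,\big|\,\Phi_{th}\cap B_{x_{0}}(r)=\emptyset,\ x_{0}\in\Phi\big).$$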

Remark 1.

An effective thinning policy yields a larger CTPP $\eta_{\pi}(r,\delta)$ . The more effective the thinning is, the larger (11) is. From Theorem 2, $\mathbb{E}_{\pi}[F(Z)]$ is improved if $\pi$ is more effective.

We next compute the CTPP for the SSCC policy.

Proposition 2.

The CTPP for PPP-SSCC is given as

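$$\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})=\int_{\mathbb{R}}\int_{0}^{1}\exp\Big(-u\lambda\int_{\mathbb{R}}\int_{\mathbb{R}^{2}}h(||x||,m,n)\,{\rm d}x\,\mu({\rm d}n)\Big)\,{\rm d}u\,\mu({\rm d}m),$$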

where given radius marks $m,\,n$ , $h(||x||,m,n)$ satisfies the relation $\int\nolimits_{\mathbb{R}^{2}}h(||x||,m,n)\,{\rm d}x=\pi(m+n)^{2}-l_{2}(r,n)$ , where $l_{2}(r,\delta)$ is the area of the intersection of $B_{x_{0}}(r)$ and $B_{x}(\delta)$ .

Proof.

The proof follows from generalizing [20, Eq. (15)]. ∎

Corollary 1.

The CTPP for PPP-MatII is given as

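$$\operatorname*{\eta_{\sf MatII}}(r,\delta)=\frac{1-e^{-\lambda(\pi\delta^{2}-l_{2}(r,\delta))}}{\lambda(\pi\delta^{2}-l_{2}(r,\delta))}.$$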
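
The MatII CTPP $\operatorname*{\eta_{\sf MatII}}(r,\delta)=\big(1-e^{-\lambda(\pi\delta^{2}-l_{2}(r,\delta))}\big)/\big(\lambda(\pi\delta^{2}-l_{2}(r,\delta))\big)$ and the resulting SCDF $\operatorname*{H_{\pi}}(R)=1-\exp\big(-\int_{0}^{R}2\pi r\lambda\,\eta_{\pi}(r,\delta)\,{\rm d}r\big)$ can be evaluated numerically. The sketch below rests on one assumption that the text does not state explicitly: the point $x$ lies at distance $r$ from $x_{0}$, so $l_{2}(r,\delta)$ is the lens area of two discs of radii $r$ and $\delta$ whose centres are $r$ apart. Function names and the midpoint quadrature are our choices.

```python
import math

def lens_area(r, delta):
    """l_2(r, delta): area of B_{x0}(r) ∩ B_x(delta), assuming |x - x0| = r."""
    if r <= 0.0:
        return 0.0
    if delta >= 2.0 * r:
        return math.pi * r * r            # B_{x0}(r) lies inside B_x(delta)
    a = r * r * math.acos(1.0 - delta * delta / (2.0 * r * r))
    b = delta * delta * math.acos(delta / (2.0 * r))
    c = 0.5 * delta * math.sqrt(4.0 * r * r - delta * delta)
    return a + b - c                      # standard two-circle lens formula

def eta_matii(r, delta, lam):
    """(1 - e^{-x}) / x with x = lam * (pi * delta^2 - l_2(r, delta)), delta > 0."""
    x = lam * (math.pi * delta * delta - lens_area(r, delta))
    return -math.expm1(-x) / x

def scdf_matii(R, delta, lam, steps=2000):
    """H(R) = 1 - exp(-int_0^R 2 pi r lam eta(r, delta) dr), midpoint rule."""
    h = R / steps
    total = 0.0
    for j in range(steps):
        r = (j + 0.5) * h
        total += 2.0 * math.pi * r * lam * eta_matii(r, delta, lam)
    return 1.0 - math.exp(-total * h)

# lambda = 0.1 and R_c in {3, 10} follow the simulation section; delta = 2 is hypothetical.
hits = [scdf_matii(R_c, delta=2.0, lam=0.1) for R_c in (3.0, 10.0)]
```

Since $\eta\leq 1$, the computed $H(R)$ never exceeds the PPP value $1-e^{-\lambda\pi R^{2}}$, which is a convenient check on any implementation.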

The next Theorem shows that having a distribution on the marks yields a more effective thinning than MatII does.

Theorem 3.

The CTPPs satisfy $\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})\geq\operatorname*{\eta_{\sf MatII}}(r,\bar{m})$, where $\vec{\delta}=\{m\}$ is the set of marks in $\tilde{\Phi}$, with $\bar{m}=\mathbb{E}_{m}[m]$.

Proof.

From Prop. 2, we have that

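$$\begin{aligned}
\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})&=\int_{\mathbb{R}}\int_{0}^{1}\exp\Big(-u\lambda\int_{\mathbb{R}}\big(\pi(m+n)^{2}-l_{2}(r,n)\big)\,\mu({\rm d}n)\Big)\,{\rm d}u\,\mu({\rm d}m)\\
&=\mathbb{E}_{m}\Big[\mathbb{E}_{U}\big[e^{-Uq(\lambda,r,m)}\big]\Big]=\mathbb{E}_{m}\left[\frac{1-e^{-q(\lambda,r,m)}}{q(\lambda,r,m)}\right],
\end{aligned}$$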

where $U\sim U[0,1]$, and $q(\lambda,r,m)=\lambda\pi\big(m^{2}+2m\bar{m}_{2}\big)+\lambda\mathbb{E}_{m_{2}}\big[\pi m_{2}^{2}-l_{2}(r,m_{2})\big]$. Let $f=e^{-x}$, $x=U\lambda\pi(m^{2}+2m\bar{m}_{2})$. We have $\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})=\mathbb{E}_{m}[g(m)]$, with $g=\frac{f-1}{\log(f)}=\frac{1-e^{-x}}{x}$. Then $g^{\prime}=\frac{e^{-x}(x+1)-1}{x^{2}}$ and $g^{\prime\prime}=\frac{2-e^{-x}[x^{2}+2x+2]}{x^{3}}>0$, so $g$ is convex. Hence, by Jensen's inequality, $\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})=\mathbb{E}_{m}[g(m)]\geq g(\bar{m})=\operatorname*{\eta_{\sf MatII}}(r,\bar{m})$. ∎
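
The convexity step in the proof can be checked numerically. The sketch below verifies only that $g(x)=(1-e^{-x})/x$ has a positive second derivative and that Jensen's inequality $\mathbb{E}[g(X)]\geq g(\mathbb{E}[X])$ holds on a sample; the gamma distribution for $X$ is a hypothetical stand-in, and the code does not reproduce the map from the mark $m$ to $x$.

```python
import math
import random

def g(x):
    """g(x) = (1 - e^{-x}) / x, extended by continuity with g(0) = 1."""
    return 1.0 if x == 0 else -math.expm1(-x) / x

def g_second(x):
    """g''(x) = (2 - e^{-x} (x^2 + 2x + 2)) / x^3 for x > 0."""
    return (2.0 - math.exp(-x) * (x * x + 2.0 * x + 2.0)) / x ** 3

def jensen_gap(xs):
    """Empirical E[g(X)] - g(E[X]); non-negative whenever g is convex."""
    mean_x = sum(xs) / len(xs)
    return sum(g(x) for x in xs) / len(xs) - g(mean_x)

rng = random.Random(0)
xs = [rng.gammavariate(2.0, 1.0) for _ in range(1000)]  # hypothetical X
gap = jensen_gap(xs)
```

A degenerate sample (all values equal) gives a zero gap, which mirrors the fixed-radius case; spreading the values increases the gap.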

By Theorem 3, $\operatorname*{\eta_{\sf SSCC}}(r,\vec{\delta})$ can be improved using a mixture of marks. The variable exclusion range model also suits cellular networks in which demand is not uniform across the network [14]; we leave this case as future work.

III-D Cache Over-Utilization

The cache placement requires $C(x)=\sum\nolimits_{i}\mathbbm{1}_{x\in\Phi_{th,i}}\leq N$ , for all $x\in\Phi$ , where $N$ is finite. The storage constraint is satisfied on average, i.e. $N=\mathbb{E}[C(x)]=\sum_{i}p(x,m_{i},v,\Phi)$ , $x\in\Phi$ . However, the set of child p.p.’s $\{\Phi_{th,i}\}_{i}$ , $i=1,\ldots,M$ might overlap. We need to make sure that the cache capacities are not over-utilized. Hence, the intersection of the sampled processes, i.e. $\cap_{i}\Phi_{th,i}$ , should not include any $x\in\Phi$ more than $N$ times with high probability. We next provide an upper bound for the violation probability of the cache size for SSCC.

Proposition 3.

Bernstein bound for cache size. The cache violation probability is upper bounded as

$$\mathbb{P}(C(x)>C)\leq\exp\left(-\frac{(C-N)^{2}}{{\rm Var}[C(x)]+\frac{1}{3}(C-N)}\right), \tag{13}$$

where ${\rm Var}[C(x)]=\sum\nolimits_{i=1}^{M}{\rm Var}[z_{xi}]$ since the placement is independent across items, where ${\rm Var}[z_{xi}]=\mathbb{E}[z_{xi}^{2}]-\mathbb{E}[z_{xi}]^{2}=p(x,m_{i},v,\Phi)(1-p(x,m_{i},v,\Phi))$ , for $i\in\{1,\ldots,M\}$ , $x\in\Phi$ .

Proof.

It follows from employing Bernstein inequality since $\mathbbm{1}_{x\in\Phi_{th,i}}$ are independent $1$ -[math] random variables across $i$ . ∎

As ${\rm Var}[C(x)]$ decreases, the bound in (13) decreases. Hence, the cache violation probability is negligible if the cache placement strategy has very low variance. In Sect. IV, we demonstrate that SSCC has a very small violation probability.

For the spatially independent placement policy in [11], where nodes are sampled i.i.d., authors have proposed a probabilistic placement technique to guarantee that the cache constraint is satisfied with equality. However, in SSCC, nodes are not sampled independently. Because the placement policy is NA across the nodes, it is nontrivial to design probabilistic placement techniques to satisfy the cache size constraint. In this section, we discuss how to bound the violation probability, and demonstrate in Sect. IV that for SSCC the cache violation probability can be made negligibly small.
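
The cache size $C(x)$ is a sum of independent indicators across items, so its exact tail can be compared with a Bernstein-type bound. The sketch below makes two assumptions that should be kept in mind: the per-item retention probabilities `p` are hypothetical (proportional to a Zipf pmf and summing to $N$), and the bound is the textbook form of Bernstein's inequality, $\exp\big(-t^{2}/(2({\rm Var}[C(x)]+t/3))\big)$ with $t=C-N$, which has a factor of 2 in the denominator that the display in Proposition 3 does not show.

```python
import math

def cache_size_pmf(p):
    """Exact pmf of C = sum of independent Bernoulli(p_i), by dynamic programming."""
    pmf = [1.0]
    for pi in p:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - pi)   # item not stored
            nxt[k + 1] += mass * pi       # item stored
        pmf = nxt
    return pmf

def tail_prob(pmf, C):
    """P(C(x) > C) from the exact pmf."""
    return sum(pmf[C + 1:])

def bernstein_bound(p, C):
    """Textbook Bernstein bound on P(C(x) - N >= C - N) for indicators bounded by 1."""
    N = sum(p)
    var = sum(pi * (1.0 - pi) for pi in p)
    t = C - N
    if t <= 0:
        return 1.0
    return math.exp(-t * t / (2.0 * (var + t / 3.0)))

M, N = 100, 10.0
raw = [i ** -0.1 for i in range(1, M + 1)]
p = [N * w / sum(raw) for w in raw]       # hypothetical retention probabilities
pmf = cache_size_pmf(p)
pairs = [(C, tail_prob(pmf, C), bernstein_bound(p, C)) for C in (12, 15, 20)]
```

Lowering the variance of the indicators tightens the bound for every threshold $C>N$, which is the mechanism the text relies on.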

IV Numerical Simulations

The nodes live in a square region of the Euclidean plane with area $L^{2}$ where $L=100$ . To avoid edge effects, we evaluate the performance only for the middle square region with area $L^{2}/9$ . The network parameters are $\lambda=0.1$ and $\operatorname*{R_{\sf c}}\in\{3,\,10\}$ . The request process is isotropic and Zipf distributed with parameter $\gamma_{r}=0.1$ over $M=100$ items.

For MatII, there is a fixed exclusion range for a given item, and we have derived the optimal exclusion radii in [13]. Let $r_{i}$ be the optimized exclusion range for item $i$ for MatII. For SSCC, we assume that the marks $m^{(i)}$ for item $i$ (exclusion radii) are distributed according to a gamma distribution $\mu^{(i)}=\Gamma(0.7r_{i},1)$ for each $x\in\Phi$ , and all items $i$ , where we choose its parameters such that the average value of the radius mark for item $i$ equals $\bar{m}^{(i)}=0.7r_{i}$ . Hence, $\Phi_{th}\sim$ SSCC $[0.1,\Gamma(0.7r_{i},1),U[0,1],1,f_{10}]$ . We can observe that the SSCC model can be used to optimize the cache hit probability-cache violation probability tradeoff. As variance of exclusion range increases, the violation probability might also increase for a desired cache hit probability. Note that we do not optimize the distributions of the marks $\mu^{(i)}$ across all $i$ over a class of distributions. We leave the study of the fundamental performance limits of SSCC as future work.

We numerically investigate how much cache over-provisioning is required for different spatial cache placement policies: spatially independent [11], MatII [13], and SSCC cache placement. In Fig. 3, we investigate the required cache size $N$ (normalized) of each policy given that the probability of cache violation is small such that $\mathbb{P}[|C(x)-N|\leq\epsilon]>0.95$ in order to characterize the required cache size for a given average cache hit probability. We also illustrate the $95\%$ confidence intervals represented by the shaded regions, and mark the cache sizes for different policies when the average cache hit probability is $\mathbb{E}_{\pi}[F(Z)]=0.7$ . For example, when $\operatorname*{R_{\sf c}}=3$ , for the $95\%$ confidence interval, the excess cache ratio for independent placement in [11], and MatII placement in [13] with respect to the SSCC policy is $142\%$ , and $93\%$ , respectively. When we have $\operatorname*{R_{\sf c}}=10$ , the respective excess ratios for the independent and MatII placement policies are $188\%$ , and $109\%$ , which are illustrated on the plots. SSCC yields a better concentration of the required cache size, which is desired. Hence, policies like SSCC can be exploited so that the cache does not overrun or underrun its capacity constraint.

SSCC gives insights into not only how to cache the content, but also how to effectively sample in spatial settings. SSCC is suited for enabling applications such as D2D and P2P as it promotes the item diversity and reciprocation. Extensions include the incorporation of the spatial variation of the demand. They also include employing the exclusion based models to optimize the performance of time-to-live (TTL) caches.

Acknowledgment

We thank Salman Salamatian for helpful discussions.

Bibliography

[1] N. G. K. Shanmugam, A. G. Dimakis, A. F. Molisch, and G. Caire, “Femto Caching: Wireless content delivery through distributed caching helpers,” IEEE Trans. Inf. Theory, vol. 59, no. 12, pp. 8402–13, Dec. 2013.
[2] M. A. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. Inf. Theory, vol. 60, no. 5, pp. 2856–67, May 2014.
[3] Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, “The exact rate-memory tradeoff for caching with uncoded prefetching,” IEEE Trans. Inf. Theory, vol. 64, no. 2, pp. 1281–1296, 2018.
[4] M. Ji, G. Caire, and A. F. Molisch, “Fundamental limits of caching in wireless D2D networks,” IEEE Trans. Inf. Theory, vol. 62, no. 2, pp. 849–869, Feb. 2016.
[5] M. A. Maddah-Ali and U. Niesen, “Cache-aided interference channels,” in Proc., IEEE Int. Sym. Inf. Theory. IEEE, 2015, pp. 809–813.
[6] S.-H. Park, O. Simeone, and S. Shamai, “Joint optimization of cloud and edge processing for fog radio access networks,” in Proc., IEEE Int. Sym. Inf. Theory, 2016, pp. 315–319.
[7] H. Che, Y. Tung, and Z. Wang, “Hierarchical web caching systems: Modeling, design and experimental results,” IEEE J. Sel. Areas Commun., vol. 20, no. 7, pp. 1305–14, Sep. 2002.
[8] A. Giovanidis and A. Avranas, “Spatial multi-LRU caching for wireless networks with coverage overlaps,” in Proc., ACM Sigmetrics/IFIP Performance, Antibes, France, Jun. 2016, pp. 403–405.
