Cooperative Caching in Fog Radio Access Networks: A Graph-based Approach

Yanxiang Jiang; Xiaoting Cui; Mehdi Bennis; Fu-Chun Zheng

arXiv:1903.01858 · cs.IT · March 6, 2019


TL;DR

This paper presents a graph-based cooperative caching method for fog radio access networks that optimizes content placement to significantly increase offloaded traffic while reducing computational complexity.

Contribution

It introduces a novel graph-based approach to solve clustering and content placement problems in cooperative caching, improving efficiency over traditional methods.

Findings

1. Significant increase in offloaded traffic in simulations.

2. Lower complexity compared to traditional solutions.

3. Effective clustering and content placement achieved.


TABLE I: Summary of major notations

Notation	Description
$M$	Number of the considered F-APs
$\mathcal{M}$	Set of the $M$ F-APs
$m$	Index of F-AP
$\mathcal{S}_{m}$	Set of all the cooperators of F-AP $m$
$\mathcal{S}_{m}^{1}$	Set of intra-cluster cooperators of F-AP $m$
$\mathcal{S}_{m}^{2}$	Set of inter-cluster cooperators of F-AP $m$
$\mathcal{S}_{m}^{3}$	Set of nonclustered cooperators of F-AP $m$
$n$	Index of cluster
$\mathcal{M}_{n}^{\text{c}}$	Set of F-APs in cluster $n$
$\mathcal{M}^{\text{n}}$	Set of nonclustered F-APs
$S_{n}$	Set size of $\mathcal{M}_{n}^{\text{c}}$
$f$	Index of file
$\mathcal{F}$	Content library
$F$	Library size
$K$	Storage size of each F-AP
$K_{n}$	Storage size of cluster $n$
$\lambda_{m}$	Aggregate request arrival rate at F-AP $m$
$w_{m}$	Ratio of the traffic load at F-AP $m$ to the sum load of the $M$ F-APs
$p_{mf}$	Request probability of file $f$ at F-AP $m$
$p_{nf}$	Request probability of file $f$ in cluster $n$
$p_{mf}^{\text{o}}$	Request probability of the $f$th most popular file at F-AP $m$
$p_{nf}^{\text{o}}$	Request probability of the $f$th most popular file in cluster $n$
$x_{mf}$	Caching decision of file $f$ at F-AP $m$
$x_{nf}$	Caching decision of file $f$ in cluster $n$
$x_{mf}^{\text{l}}$	Local state of file $f$ at F-AP $m$ and its cooperators
$T$	Whole offloaded traffic for all the $M$ F-APs
$T^{\text{c}}$	Offloaded traffic for all the $M$ F-APs through fetching files that are cached at the requested F-APs and their intra-cluster cooperators
$T^{\text{n}}$	Offloaded traffic for all the $M$ F-APs through fetching files that are cached at the nonclustered cooperators and the inter-cluster cooperators of the requested F-APs
$T^{\text{d}}$	Offloaded traffic for all the $M$ F-APs through fetching duplicate files that are cached between the requested F-APs (or their intra-cluster cooperators) and their inter-cluster cooperators
$T_{m}$	Offloaded traffic at F-AP $m$ through fetching files that are cached in its own storage space
$T^{\text{i}}$	Incremental offloaded traffic of the $N$ clusters
$d_{m}$	Geographical coordinate of F-AP $m$ in the Euclidean space
$D_{mm^{\prime}}$	Distance between F-AP $m$ and F-AP $m^{\prime}$
$L_{mm^{\prime}}$	Load difference between F-AP $m$ and F-AP $m^{\prime}$
$\gamma^{\text{d}}$	Distance threshold
$\gamma^{\text{l}}$	Load threshold

Equations

(1) $x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}}(1-x_{m^{\prime}f})\Big]$

(14) $x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f})\Big]+(1-x_{mf})\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]$

$T$

(3) $\max_{\{x_{mf}\}}\;T$

(3a) $D_{mm^{\prime}}\leq\gamma^{\text{d}},\;\forall m\in\mathcal{M},\;\forall m^{\prime}\in\mathcal{S}_{m}$

(3b) $L_{mm^{\prime}}\geq\gamma^{\text{l}},\;\forall m\in\mathcal{M},\;\forall m^{\prime}\in\mathcal{S}_{m}$

(3c) $x_{mf}\in\{1,0\},\;\forall m\in\mathcal{M},\;\forall f\in\mathcal{F}$

(3d) $\sum_{f\in\mathcal{F}}x_{mf}\leq K,\;\forall m\in\mathcal{M}$

(17) $T=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big\{x_{nf}+(1-x_{nf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]\Big\}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big\{x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]\Big\}L$

(20) $T^{\text{n}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]L$

(21) $T^{\text{d}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{nf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]L$

(4) $\mathcal{M}=\Big(\cup_{n\in\mathcal{N}}\mathcal{M}_{n}^{\text{c}}\Big)\cup\mathcal{M}^{\text{n}}$

(5) $\mathcal{M}_{n}^{\text{c}}\cap\mathcal{M}_{n^{\prime}}^{\text{c}}=\emptyset,\;\forall n,n^{\prime}\in\mathcal{N},\;n\neq n^{\prime}$

(6) $\mathcal{S}_{m}=\mathcal{S}_{m}^{1}\cup\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}$

$\mathcal{S}_{m}^{i}\cap\mathcal{S}_{m}^{j}$

$m\cup\mathcal{S}_{m}^{1}$

$\mathcal{S}_{m}^{1}$

(11) $p_{nf}=\sum_{m\in\mathcal{M}_{n}^{\text{c}}}p_{mf}\frac{w_{m}}{\sum_{m^{\prime}\in\mathcal{M}_{n}^{\text{c}}}w_{m^{\prime}}}$

(12) $x_{nf}=1-(1-x_{mf})\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f}),\;m\in\mathcal{M}_{n}^{\text{c}}$

(13) $\sum_{f\in\mathcal{F}}x_{nf}\leq K_{n},\;n\in\mathcal{N}$

(15) $x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big],\;m\in\mathcal{M}^{\text{n}}$

(16) $x_{mf}^{\text{l}}=x_{nf}+(1-x_{nf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big],\;m\in\mathcal{M}_{n}^{\text{c}},\;n\in\mathcal{N}$

(18) $T=T^{\text{c}}+T^{\text{n}}-T^{\text{d}}$

(19) $T^{\text{c}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{nf}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{mf}L$

(22) $T^{\text{c}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f=1}^{K_{n}}\lambda_{m}p_{nf}^{\text{o}}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f=1}^{K}\lambda_{m}p_{mf}^{\text{o}}L$

(23) $T_{m}=\sum_{f=1}^{K}\lambda_{m}p_{mf}^{\text{o}}L$

(24) $T_{n}^{\text{i}}=\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\lambda_{m}\Big(\sum_{f=1}^{K_{n}}p_{nf}^{\text{o}}-\sum_{f=1}^{K}p_{mf}^{\text{o}}\Big)L$

(25) $T^{\text{i}}=\sum_{n\in\mathcal{N}}T_{n}^{\text{i}}$

$T^{\text{c}}$

(27) $\max_{\{\mathcal{M}_{n}^{\text{c}}\}_{n\in\mathcal{N}},\,\mathcal{M}^{\text{n}}}\;T^{\text{i}}$ s.t. (3a), (3b)

(32) $T^{\text{d}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}_{n}^{\text{c}}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}_{m}^{\text{n}}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]L$

(28) $\mathcal{F}_{n}^{\text{c}}=\big\{f\mid p_{n1}^{\text{o}}\geq p_{n2}^{\text{o}}\geq\cdots\geq p_{nf}^{\text{o}}\geq\cdots\geq p_{nK_{n}}^{\text{o}}\big\},\;n\in\mathcal{N}$


Taxonomy

Topics: Caching and Content Delivery · Opportunistic and Delay-Tolerant Networks · Cooperative Communication and Network Coding

Full text


Cooperative Caching in Fog Radio Access Networks: A Graph-based Approach

Yanxiang Jiang, Xiaoting Cui, Mehdi Bennis, and Fu-Chun Zheng

This work was supported in part by the Natural Science Foundation of China under Grant 61521061, the Natural Science Foundation of Jiangsu Province under grant BK20181264, the Research Fund of the State Key Laboratory of Integrated Services Networks (Xidian University) under grant ISN19-10, the Research Fund of the Key Laboratory of Wireless Sensor Network & Communication (Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences) under grant 2017002, the National Basic Research Program of China (973 Program) under grant 2012CB316004, and the U.K. Engineering and Physical Sciences Research Council under Grant EP/K040685/2.

Y. Jiang is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China, the State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China, and the Key Laboratory of Wireless Sensor Network & Communication, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, 865 Changning Road, Shanghai 200050, China (e-mail: [email protected]).

X. Cui is with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China.

M. Bennis is with the Centre for Wireless Communications, University of Oulu, Oulu 90014, Finland (e-mail: [email protected]).

F. Zheng is with the School of Electronic and Information Engineering, Harbin Institute of Technology, Shenzhen 518055, China, and the National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China (e-mail: [email protected]).

Abstract

In this paper, cooperative caching is investigated in fog radio access networks (F-RAN). To maximize the offloaded traffic, a cooperative caching optimization problem is formulated. By analyzing the relationship between clustering and cooperation and utilizing the solutions of the knapsack problems, this challenging optimization problem is transformed into a clustering subproblem and a content placement subproblem. To further reduce complexity, we propose an effective graph-based approach to solve the two subproblems. In the graph-based clustering approach, a node graph and a weighted graph are constructed. By setting the weight of each vertex of the weighted graph to the incremental offloaded traffic of its corresponding complete subgraph, the objective cluster sets can be readily obtained by using an effective greedy algorithm to search for the max-weight independent subset. In the graph-based content placement approach, a redundancy graph is constructed by removing the edges in the complete subgraphs of the node graph corresponding to the obtained cluster sets. Furthermore, we enhance the caching decisions to ensure each duplicate file is cached only once. Compared with traditional approximate solutions, our proposed graph-based approach has lower complexity. Simulation results show remarkable improvements in terms of offloaded traffic by using our proposed approach.

Index Terms:

F-RAN, cooperative caching, clustering, content placement, fronthaul offloading.

I Introduction

With the continuous and rapid proliferation of intelligent devices and advanced mobile application services, wireless networks have come under unprecedented data traffic pressure in recent years. Ever-increasing mobile data traffic places a tremendous load on capacity-limited fronthaul links, especially at peak traffic moments. As a promising architecture, fog radio access networks (F-RAN) can effectively offload the traffic in fronthaul links by placing popular contents at fog access points (F-APs), which are equipped with limited caching resources [1]. Given the storage constraints and fluctuating spatio-temporal traffic demands, cooperative caching is an effective way to increase the offloaded traffic.

Recently, there has been considerable work on cooperative caching. In [2], a cooperative caching and delivery policy was proposed to minimize the latency, where each base station (BS) and each user equipment (UE) cached files independently according to the request probability. However, the caching decision of one BS is influenced by those of the neighboring cooperative BSs, and different BSs should cache diverse files in a cooperative manner [3, 4]. In [5, 6, 7], cooperative content placement strategies for a given cluster of cache nodes were studied. In [5], a cooperative content placement strategy was proposed to maximize the service probability, where the storage space of each BS in the given cluster was divided into one portion for caching the same files and a remaining portion for caching different files. In [6], a cooperative caching algorithm for multiple operators was proposed to maximize the delay savings, where all the cache nodes in the given cluster first cached the globally popular files together and then cached the locally popular files independently. In [7], a cooperative content placement method was proposed to minimize the latency for multi-cell cooperative networks, where a heuristic greedy algorithm with limited performance guarantee was developed. In [8, 9, 10], cooperative content placement strategies for unknown clusters of cache nodes were studied. In [8], uncoded and coded cooperative content assignment strategies were proposed to minimize the expected downloading time, where the connectivity graph between UEs and BSs was used to reflect the cooperation relationship among neighboring BSs. By optimizing relay clustering and content placement in a joint manner, a cooperative caching strategy was developed to minimize the outage probability in [9], where identical files were cached among the relays in each cluster for simplicity. Based on the similarities among users requesting similar contents, a user clustering and cooperative caching algorithm was proposed in [10] to improve the cache hit rate.

However, prior works on cooperative content placement tend to exploit the global content popularity rather than the local content popularity, which might not even replicate the global content popularity. The local content popularity reflects user interest within the coverage of each cache node and may differ from node to node [11, 12]. It was shown in [13] and [14] that cooperative content placement algorithms based on the local content popularity could achieve lower delay or a higher cache hit rate than those based on the global content popularity.

Motivated by the aforementioned discussions, the main contributions of this paper are summarized below.

• We propose a new idea for solving the challenging cooperative caching optimization problem based on the local content popularity. Analyzing the relationship between clustering and cooperation and utilizing the solutions of the knapsack problems, we transform the cooperative caching optimization problem into a clustering subproblem and a content placement subproblem.

• We propose a graph-based clustering approach. Constructing a node graph and a weighted graph, we transform the clustering subproblem into an equivalent 0-1 integer programming problem. Furthermore, we propose an effective greedy algorithm to search for the objective cluster sets.

• We propose a graph-based content placement approach. Constructing a redundancy graph based on the obtained cluster sets, we determine the duplicate files that will indeed cause cache redundancy at each edge and further enhance the caching decisions for each file. Correspondingly, all the possible cache redundancy can be eliminated by caching each duplicate popular file only once.

The rest of this paper is organized as follows. In Section II, the system model and problem formulation are briefly described. In Section III, the problem transformation is presented. The proposed graph-based cooperative caching scheme including clustering and content placement is presented in Section IV. Simulation results are shown in Section V. Final conclusions are drawn in Section VI.

II System Model and Problem Formulation

Consider a cooperative caching scenario in F-RAN as illustrated in Fig. 1, which consists of a cloud server, ${M}$ F-APs, and a certain number of users. The cloud server can be accessed by the F-APs via fronthaul links. Let ${\cal M}=\left\{{1,2,\cdots,m,\cdots,M}\right\}$ denote the F-AP set. Assume that neighboring F-APs can share files and cooperate with each other [15]. Whether two F-APs can cooperate depends on whether they satisfy certain rules. Let ${{\cal S}_{m}}$ denote the set of all the cooperators of F-AP $m$ . Without loss of generality, assume that all the files have the same size of $L$ bits, and that each F-AP has the same storage space and can store up to $K$ files from the content library ${\cal F}=\left\{{1,2,\cdots,f,\cdots,F}\right\}$ located in the cloud server. Let ${p_{m{f}}}$ denote the request probability of file ${f}$ at F-AP $m$ (referred to as the local content popularity). Assume that the request probability at each F-AP is stationary during the given time period. Let ${\lambda_{m}}$ denote the aggregate request arrival rate at F-AP $m$ , and ${w_{m}}=\lambda_{m}/{\sum\nolimits_{m^{\prime}\in\mathcal{M}}{{\lambda_{m^{\prime}}}}}$ denote the ratio of the traffic load at F-AP $m$ to the sum load of the $M$ F-APs.

Let ${x_{mf}}\in\left\{{1,0}\right\}$ denote the caching decision of file $f$ at F-AP $m$ , where $x_{mf}=1$ if file $f$ is cached at F-AP $m$ and $x_{mf}=0$ otherwise. Let $x_{mf}^{\text{l}}\in\left\{{1,0}\right\}$ denote the local state of file $f$ at F-AP $m$ and its cooperators, where $x_{mf}^{\text{l}}=1$ if file $f$ is successfully cached locally; $x_{mf}^{\text{l}}=0$ if file $f$ is not cached locally and must be fetched from the cloud server. Then, $x_{mf}^{\text{l}}$ can be expressed as follows:

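$$x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}}(1-x_{m^{\prime}f})\Big]. \tag{1}$$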
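As a minimal sketch of the local-state rule (a file counts as locally available if F-AP $m$ or any of its cooperators caches it), the Python snippet below evaluates $x_{mf}^{\text{l}}$ for a toy placement. The function name `local_state`, the list-of-lists layout of `x`, and the example data are illustrative assumptions, not part of the paper.

```python
from math import prod


def local_state(x, cooperators, m, f):
    """Return x^l_{mf}: 1 if file f is cached at F-AP m or at any cooperator in S_m.

    x[m][f] is the 0/1 caching decision; cooperators[m] lists the indices in S_m.
    (Both containers are hypothetical data layouts chosen for this sketch.)
    """
    # Product over cooperators of (1 - x_{m'f}); equals 1 when no cooperator has f.
    miss_at_cooperators = prod(1 - x[mp][f] for mp in cooperators[m])
    return x[m][f] + (1 - x[m][f]) * (1 - miss_at_cooperators)


# Toy example: 3 F-APs, 3 files; F-AP 1 cooperates with F-APs 0 and 2.
x = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
cooperators = {0: [1], 1: [0, 2], 2: [1]}

print(local_state(x, cooperators, 0, 1))  # file 1 is held by cooperator 1
print(local_state(x, cooperators, 0, 2))  # file 2 is cached nowhere nearby
```

Because every factor is 0 or 1, the expression behaves like a logical OR over F-AP $m$ and the members of ${\cal S}_{m}$.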

Once the requested file is cached locally, the traffic in the fronthaul links can be offloaded. Let $T$ denote the offloaded traffic for all the considered $M$ F-APs. Then, it can be expressed as follows:

$$T=\sum_{m\in\mathcal{M}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{mf}^{\text{l}}L. \tag{2}$$

Note that the offloaded traffic increases with the number of locally cached files, and decreases with duplicate cached files at the requested F-APs and their cooperators. The caching decisions should be determined cooperatively by the neighboring F-APs for a larger number of unduplicated cached files.
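The effect of duplicate caching on the offloaded traffic can be illustrated with a small numerical sketch. The code below sums $\lambda_{m}p_{mf}x_{mf}^{\text{l}}L$ over all F-APs and files, which is how the offloaded traffic is described above; the function name `offloaded_traffic` and all numbers are assumptions made for illustration only.

```python
from math import prod


def offloaded_traffic(x, cooperators, lam, p, L):
    """Sum lambda_m * p_mf * x^l_mf * L over all F-APs m and files f."""
    total = 0.0
    for m in range(len(x)):
        for f in range(len(x[m])):
            # x^l_mf is 1 if f is cached at m or at any cooperator of m.
            miss = prod(1 - x[mp][f] for mp in cooperators[m])
            x_local = x[m][f] + (1 - x[m][f]) * (1 - miss)
            total += lam[m] * p[m][f] * x_local * L
    return total


# Two cooperating F-APs, two files, one cache slot each (hypothetical values).
cooperators = {0: [1], 1: [0]}
lam = [1.0, 2.0]
p = [[0.5, 0.5], [0.5, 0.5]]

duplicate = [[1, 0], [1, 0]]  # both F-APs cache file 0
diverse = [[1, 0], [0, 1]]    # the two F-APs cache different files

print(offloaded_traffic(duplicate, cooperators, lam, p, L=1.0))  # 1.5
print(offloaded_traffic(diverse, cooperators, lam, p, L=1.0))    # 3.0
```

In this toy case, caching different files at the two cooperators doubles the offloaded traffic compared with caching the same file twice.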

To maximize the offloaded traffic, the cooperators should be neighboring F-APs with closer distance and greater load difference. The selected F-APs can efficiently offload traffic among each other and are more likely to cooperate with each other [16]. Let $d_{{m}}$ denote the geographical coordinate of F-AP $m$ in the Euclidean space, $D_{{m}{m^{\prime}}}=\left\|{{d_{{m}}}-{d_{{m^{\prime}}}}}\right\|_{2}$ denote the distance between F-AP $m$ and F-AP $m^{\prime}$ , and $L_{{m}{m^{\prime}}}=\left\|{{\lambda_{{m}}}-{\lambda_{{m^{\prime}}}}}\right\|_{2}$ denote the load difference between F-AP $m$ and F-AP $m^{\prime}$ . Then, the cooperative caching optimization problem can be formulated as follows:

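$$\max_{\{x_{mf}\}}\;T \tag{3}$$

$$\text{s.t.}\quad D_{mm^{\prime}}\leq\gamma^{\text{d}},\;\forall m\in\mathcal{M},\;\forall m^{\prime}\in\mathcal{S}_{m}, \tag{3a}$$

$$L_{mm^{\prime}}\geq\gamma^{\text{l}},\;\forall m\in\mathcal{M},\;\forall m^{\prime}\in\mathcal{S}_{m}, \tag{3b}$$

$$x_{mf}\in\{1,0\},\;\forall m\in\mathcal{M},\;\forall f\in\mathcal{F}, \tag{3c}$$

$$\sum_{f\in\mathcal{F}}x_{mf}\leq K,\;\forall m\in\mathcal{M}, \tag{3d}$$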

where ${{\gamma^{\text{d}}}}$ and ${{\gamma^{\text{l}}}}$ denote the distance threshold and the load threshold, respectively.
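One way to read the distance and load constraints is as a rule for building the cooperator sets ${\cal S}_{m}$: two F-APs are treated as cooperators when they are close enough and their loads differ enough. The sketch below applies that reading; the function name `cooperator_sets`, the 2-D coordinates, and the threshold values are assumptions for illustration, and the paper may construct ${\cal S}_{m}$ differently.

```python
import math


def cooperator_sets(coords, lam, gamma_d, gamma_l):
    """Build S_m for every F-AP from pairwise distance and load-difference tests."""
    M = len(coords)
    S = {m: set() for m in range(M)}
    for m in range(M):
        for mp in range(m + 1, M):
            distance = math.dist(coords[m], coords[mp])  # D_{mm'}
            load_diff = abs(lam[m] - lam[mp])            # L_{mm'}
            if distance <= gamma_d and load_diff >= gamma_l:
                # Both tests are symmetric, so cooperation is mutual.
                S[m].add(mp)
                S[mp].add(m)
    return S


# Hypothetical layout: F-APs 0 and 1 are close with different loads, F-AP 2 is far away.
coords = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
lam = [1.0, 4.0, 9.0]
S = cooperator_sets(coords, lam, gamma_d=2.0, gamma_l=2.0)
print(S)
```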

The objective of this paper is to find the optimal caching decisions $\left\{{{x_{mf}}\left|m\in{\cal M},f\in{\cal F}\right.}\right\}$ by maximizing the offloaded traffic using cooperative caching in F-RAN.

III Problem Transformation

The optimization problem in (3) is a 0-1 integer programming problem, which is NP-hard [2, 6]. A dynamic programming approach is generally required for obtaining a globally optimal solution [17]. However, such an approach has exponential complexity with respect to (w.r.t.) the number of F-APs and the size of the content library, and it is computationally impractical even for a small network. In the previous works in [8, 17], the original problem was reformulated as a matroid-constrained monotone submodular optimization problem, which yields approximate solutions with limited performance. However, this approach incurs a long running time when evaluating the marginal value of the objective function.

In fact, by utilizing the relationship between clustering and cooperation, the cooperators of an F-AP can be divided into intra-cluster cooperators, inter-cluster cooperators, and nonclustered cooperators. Correspondingly, the objective function of the cooperative caching optimization problem in (3) can be decomposed into three items. All three items indicate that the offloaded traffic is affected by the clustering strategy. In addition, the first item indicates that the offloaded traffic is affected by the cached files at the requested F-APs and their intra-cluster cooperators. The second item indicates that the offloaded traffic is affected by the cached files at the nonclustered cooperators and the inter-cluster cooperators of the requested F-APs. The third item indicates that the offloaded traffic is affected by the duplicate cached files between the requested F-APs (or their intra-cluster cooperators) and their inter-cluster cooperators. In summary, the three items together indicate that solving the original optimization problem requires determining both the clusters and the content placement. Therefore, in this paper, we propose to transform the challenging cooperative caching optimization problem into a clustering subproblem and a content placement subproblem.

III-A Clustering and Cooperation

Cooperative F-APs can form a cluster so that the storage space in the cluster can be treated as a whole [18]. Correspondingly, clustering can increase the content diversity. Any two F-APs in a cluster can cooperate with each other. However, two F-APs that can cooperate are not necessarily members of the same cluster.

Assume that the considered $M$ F-APs can constitute $N$ disjoint clustered sets denoted by ${{\cal M}^{\text{c}}_{n}}$ for $n\in\mathcal{N}=\{1,2,\cdots,N\}$ and one nonclustered set denoted by ${{\cal M}^{\text{n}}}$ , and the set size of ${{\cal M}^{\text{c}}_{n}}$ is denoted by $S_{n}$ . Disjoint clustering means that each F-AP belongs to at most one cluster, which ensures that its storage space is used exclusively and fully by the users in that cluster. Correspondingly, the following relationship can be readily established:

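$$\mathcal{M}=\Big(\cup_{n\in\mathcal{N}}\mathcal{M}_{n}^{\text{c}}\Big)\cup\mathcal{M}^{\text{n}}, \tag{4}$$

$$\mathcal{M}_{n}^{\text{c}}\cap\mathcal{M}_{n^{\prime}}^{\text{c}}=\emptyset,\;\forall n,n^{\prime}\in\mathcal{N},\;n\neq n^{\prime}. \tag{5}$$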

Without loss of generality, let ${\cal S}_{m}^{1}$ , ${\cal S}_{m}^{2}$ , and ${\cal S}_{m}^{3}$ denote the set of intra-cluster cooperators, inter-cluster cooperators, and nonclustered cooperators of F-AP $m$ , respectively. Define

[TABLE]

Then, the following relationship can be readily established:

[TABLE]

Let ${p_{n{f}}}$ denote the request probability of file ${f}$ in cluster ${n}$ . Then, according to [19], we have:

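$$p_{nf}=\sum_{m\in\mathcal{M}_{n}^{\text{c}}}p_{mf}\frac{w_{m}}{\sum_{m^{\prime}\in\mathcal{M}_{n}^{\text{c}}}w_{m^{\prime}}}. \tag{11}$$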
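The cluster-level request probability is a load-weighted average of the member F-APs' local popularities. A minimal sketch, assuming popularities are stored as per-F-AP lists and that `cluster_popularity` is a hypothetical helper name:

```python
def cluster_popularity(p, w, cluster):
    """Return [p_nf for every file f] as the w-weighted average of p_mf over the cluster."""
    total_weight = sum(w[m] for m in cluster)
    num_files = len(p[cluster[0]])
    return [
        sum(p[m][f] * w[m] for m in cluster) / total_weight
        for f in range(num_files)
    ]


# Hypothetical two-F-AP cluster in which F-AP 1 carries three times the load of F-AP 0.
p = [[0.8, 0.2],
     [0.2, 0.8]]
w = [0.25, 0.75]
p_cluster = cluster_popularity(p, w, cluster=[0, 1])
print(p_cluster)  # close to [0.35, 0.65]: the heavier-loaded F-AP dominates
```

Because the weights are renormalized within the cluster, the result is still a probability distribution over files whenever each $p_{mf}$ is.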

Assume that cluster ${n}$ can cache $K_{n}=S_{n}K$ different files. Generally, ${S_{n}}K\ll F$ . Let $x_{nf}\in\left\{{1,0}\right\}$ denote the caching decision of file $f$ in cluster $n$ , where $x_{nf}=1$ if file $f$ is cached at any F-AP in cluster $n$ and $x_{nf}=0$ otherwise. Then, we have:

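$$x_{nf}=1-(1-x_{mf})\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f}),\;m\in\mathcal{M}_{n}^{\text{c}}, \tag{12}$$

$$\sum_{f\in\mathcal{F}}x_{nf}\leq K_{n},\;n\in\mathcal{N}. \tag{13}$$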
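A short sketch of the cluster-level caching decision, assuming the cluster is given as a list of member indices (so it already contains F-AP $m$ together with its intra-cluster cooperators). The helper name `cluster_decision` and the example placement are illustrative assumptions.

```python
def cluster_decision(x, cluster, f):
    """Return x_nf: 1 if any F-AP in the cluster caches file f, else 0."""
    miss = 1
    for m in cluster:
        miss *= 1 - x[m][f]  # stays 1 only while no member caches f
    return 1 - miss


# Hypothetical cluster of two F-APs with K = 1 slot each, so K_n = S_n * K = 2.
x = [[1, 0, 0],
     [0, 0, 1]]
cluster = [0, 1]
K = 1
K_n = len(cluster) * K

decisions = [cluster_decision(x, cluster, f) for f in range(3)]
print(decisions)              # [1, 0, 1]
print(sum(decisions) <= K_n)  # the cluster storage constraint holds
```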

III-B Objective Function Decomposition

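Substituting (6) into (1), the local state $x_{mf}^{\text{l}}$ of the requested file $f$ at F-AP $m\in{\cal M}$ and its cooperators can be expressed in the following equivalent form:

$$x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f})\Big]+(1-x_{mf})\prod_{m^{\prime}\in\mathcal{S}_{m}^{1}}(1-x_{m^{\prime}f})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]. \tag{14}$$

When $m\in{{\cal M}^{\text{n}}}$ , according to (9) and (14), $x_{mf}^{\text{l}}$ can be further expressed as follows: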

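$$x_{mf}^{\text{l}}=x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big],\;m\in\mathcal{M}^{\text{n}}. \tag{15}$$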

When $m\in{{\cal M}_{n}^{\text{c}}}$ , according to (12) and (14), $x_{mf}^{\text{l}}$ can be further expressed as follows:

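$$x_{mf}^{\text{l}}=x_{nf}+(1-x_{nf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big],\;m\in\mathcal{M}_{n}^{\text{c}},\;n\in\mathcal{N}. \tag{16}$$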

Substitute (4), (15), and (16) into (2). Then, the objective function of the original optimization problem in (3) can be expressed in the following equivalent form:

$$T=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big\{x_{nf}+(1-x_{nf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]\Big\}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big\{x_{mf}+(1-x_{mf})\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]\Big\}L. \tag{17}$$

For all the considered $M$ F-APs, let ${T^{\text{c}}}$ denote the offloaded traffic through fetching files that are cached at the requested F-APs and their intra-cluster cooperators, ${T^{\text{n}}}$ denote the offloaded traffic through fetching files that are cached at the nonclustered cooperators and the inter-cluster cooperators of the requested F-APs, and ${T^{\text{d}}}$ denote the offloaded traffic through fetching duplicate files that are cached between the requested F-APs (or their intra-cluster cooperators) and their inter-cluster cooperators, respectively. Then, (17) can be decomposed into three items as follows:

$$T=T^{\text{c}}+T^{\text{n}}-T^{\text{d}}, \tag{18}$$

where

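$$T^{\text{c}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{nf}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{mf}L, \tag{19}$$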

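and ${T^{\text{n}}}$ and ${T^{\text{d}}}$ are expressed in (20) and (21), respectively:

$$T^{\text{n}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]L, \tag{20}$$

$$T^{\text{d}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{nf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}\cup\mathcal{S}_{m}^{3}}(1-x_{m^{\prime}f})\Big]L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f\in\mathcal{F}}\lambda_{m}p_{mf}x_{mf}\Big[1-\prod_{m^{\prime}\in\mathcal{S}_{m}^{2}}(1-x_{m^{\prime}f})\Big]L. \tag{21}$$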
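The split of the objective into $T^{\text{c}}+T^{\text{n}}-T^{\text{d}}$ rests on one elementary identity, spelled out here as a reading aid rather than as part of the original derivation. The shorthand $a$ is introduced only for this note and stands for the bracketed cooperator term of the F-AP in question; $x$ stands for $x_{nf}$ when the F-AP is clustered and for $x_{mf}$ when it is nonclustered.

```latex
% a := 1 - \prod_{m'} (1 - x_{m'f}) over the relevant cooperators (shorthand for this note only)
% x := x_{nf} for a clustered F-AP, x_{mf} for a nonclustered F-AP
x + (1 - x)\,a
  \;=\; \underbrace{x}_{\text{contributes to } T^{\text{c}}}
  \;+\; \underbrace{a}_{\text{contributes to } T^{\text{n}}}
  \;-\; \underbrace{x\,a}_{\text{contributes to } T^{\text{d}}}
```

Multiplying each piece by $\lambda_{m}p_{mf}L$ and summing over F-APs and files gives the three items term by term. The subtracted product $x\,a$ is nonzero only when the file is cached both in the F-AP's own cluster (or at the F-AP itself) and at one of the other cooperators, which is why $T^{\text{d}}$ measures duplication.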

It can be readily seen from (19) that, once the cluster sets are determined, $T^{\text{c}}$ is maximized by caching the most popular $K_{n}$ files in each cluster and the most popular $K$ files at each nonclustered F-AP. It can be readily seen from (20) that, once the cluster sets are determined, $T^{\text{n}}$ is maximized if the most popular $K$ files at the inter-cluster cooperators and the nonclustered cooperators of a clustered F-AP are cached at the clustered F-AP, and the most popular $K$ files at the inter-cluster cooperators of a nonclustered F-AP are cached at the nonclustered F-AP. It can be readily seen from (21) that, once the cluster sets are determined, $T^{\text{d}}$ is minimized if different files are cached at a clustered F-AP and at its inter-cluster cooperators (or its nonclustered cooperators), and different files are cached at a nonclustered F-AP and at its inter-cluster cooperators.

The above analysis leads to four observations. Firstly, if a clustered F-AP has neither inter-cluster cooperators nor nonclustered cooperators, or a nonclustered F-AP has no inter-cluster cooperators, the cached files at this F-AP cannot be determined through maximizing $T^{\text{n}}$ and must instead be determined through maximizing $T^{\text{c}}$ . Secondly, for the problem of maximizing $T^{\text{n}}$ , neither the number of most popular files at the inter-cluster cooperators (or the nonclustered cooperators) of a clustered F-AP that should be cached at the clustered F-AP, nor the number of popular files at the inter-cluster cooperators of a nonclustered F-AP that should be cached at the nonclustered F-AP, can be determined. Maximizing $T^{\text{n}}$ directly is therefore hardly possible. Thirdly, for the clustered F-APs which have inter-cluster cooperators or nonclustered cooperators, and the nonclustered F-APs which have inter-cluster cooperators, both maximizing $T^{\text{c}}$ and maximizing $T^{\text{n}}$ require them to cache the most popular files. The difference between maximizing $T^{\text{c}}$ and maximizing $T^{\text{n}}$ lies in the caching locations of these files between each pair of a clustered F-AP and its inter-cluster cooperator (or nonclustered cooperator), and between each pair of a nonclustered F-AP and its inter-cluster cooperator. There exists an exchange relationship between the caching locations of the above F-AP pairs. Finally, once the popular files at the clustered and nonclustered F-APs are determined, the duplicate cached files between a clustered F-AP and its inter-cluster cooperators (or nonclustered cooperators), and the duplicate cached files between a nonclustered F-AP and its inter-cluster cooperators, can be determined. $T^{\text{d}}$ can then be minimized by reducing the number of duplicate cached files, that is, by keeping a duplicate popular file at one F-AP and replacing it with a new popular file at the other F-AP. Based on the above analysis, we propose to solve the cooperative caching optimization problem by first maximizing $T^{\text{c}}$ and then minimizing $T^{\text{d}}$ .

III-C Optimization Problem Reformulation

From (19), we can see that $T^{\text{c}}$ is affected by the clustering strategy and the caching decisions $\left\{{{x_{nf}},{x_{mf}}\left|f\in{\cal F},n\in\mathcal{N},m\in{\cal M}^{\text{n}}\right.}\right\}$ . If the cluster sets are determined, $T^{\text{c}}$ can be maximized through solving the $N+\left|{\cal M}^{\text{n}}\right|$ independent knapsack problems for each $n\in\mathcal{N}$ and each $m\in{\cal M}^{\text{n}}$ [2]. Sort $p_{mf}$ and $p_{nf}$ in descending order. Let $p_{{mf}}^{\text{o}}$ and $p_{{nf}}^{\text{o}}$ denote the request probability of the $f$ th most popular file at F-AP $m$ and in cluster $n$ , respectively. According to the solutions of the knapsack problems [2], and the caching storage constraints in (3c), (3d), and (13), we have:

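$$T^{\text{c}}=\sum_{n\in\mathcal{N}}\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\sum_{f=1}^{K_{n}}\lambda_{m}p_{nf}^{\text{o}}L+\sum_{m\in\mathcal{M}^{\text{n}}}\sum_{f=1}^{K}\lambda_{m}p_{mf}^{\text{o}}L. \tag{22}$$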
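Since all files have the same size, each of these knapsack problems reduces to picking the files with the highest request probability until the storage is full. The sketch below shows that selection for a single F-AP or cluster; `top_k_files` and `cached_traffic` are hypothetical helper names, and the numbers are made up for illustration.

```python
def top_k_files(popularity, k):
    """Return the indices of the k most popular files, most popular first."""
    order = sorted(range(len(popularity)), key=lambda f: popularity[f], reverse=True)
    return order[:k]


def cached_traffic(arrival_rate, popularity, k, file_size):
    """Traffic offloaded by caching the k most popular files: lambda * sum(p^o) * L."""
    best = top_k_files(popularity, k)
    return arrival_rate * sum(popularity[f] for f in best) * file_size


# Hypothetical nonclustered F-AP with K = 2 cache slots.
popularity = [0.1, 0.5, 0.4]
print(top_k_files(popularity, 2))                # [1, 2]
print(cached_traffic(2.0, popularity, 2, 1.0))   # about 1.8
```

With unit-size items, this greedy choice is the exact knapsack optimum, which is why sorting the popularities in descending order is enough.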

Define

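$$T_{m}=\sum_{f=1}^{K}\lambda_{m}p_{mf}^{\text{o}}L, \tag{23}$$

$$T_{n}^{\text{i}}=\sum_{m\in\mathcal{M}_{n}^{\text{c}}}\lambda_{m}\Big(\sum_{f=1}^{K_{n}}p_{nf}^{\text{o}}-\sum_{f=1}^{K}p_{mf}^{\text{o}}\Big)L, \tag{24}$$

$$T^{\text{i}}=\sum_{n\in\mathcal{N}}T_{n}^{\text{i}}, \tag{25}$$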

where ${{T_{m}}}$ denotes the offloaded traffic at F-AP $m$ through fetching files that are cached in its own storage space, ${{T_{n}^{\text{i}}}}$ denotes the incremental offloaded traffic of cluster $n$ , and ${{T^{\text{i}}}}$ denotes the incremental offloaded traffic of all the $N$ clusters. Then, (22) can be further expressed in an equivalent form as follows [20]:

$$T^{\text{c}}=T^{\text{i}}+\sum_{m\in\mathcal{M}}T_{m}. \tag{26}$$
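To make the incremental offloaded traffic of a cluster concrete, the sketch below compares what a candidate cluster offloads with its pooled $K_{n}=S_{n}K$ slots against what its members would offload on their own with $K$ slots each. The function names and the toy popularities are assumptions for illustration; the cluster popularity is weighted by $\lambda_{m}$, which is equivalent to weighting by $w_{m}$ because the common normalization cancels.

```python
def top_k_mass(popularity, k):
    """Sum of the k largest request probabilities."""
    return sum(sorted(popularity, reverse=True)[:k])


def incremental_traffic(cluster, p, lam, K, L):
    """Gain in offloaded traffic from clustering, relative to isolated caching."""
    num_files = len(p[cluster[0]])
    lam_sum = sum(lam[m] for m in cluster)
    # Cluster popularity: load-weighted average of the members' local popularities.
    p_cluster = [sum(p[m][f] * lam[m] for m in cluster) / lam_sum
                 for f in range(num_files)]
    K_n = len(cluster) * K  # pooled storage of the cluster
    pooled = top_k_mass(p_cluster, K_n)
    return sum(lam[m] * (pooled - top_k_mass(p[m], K)) * L for m in cluster)


# Two F-APs with opposite tastes and one cache slot each (hypothetical values).
p = [[0.6, 0.3, 0.1],
     [0.1, 0.3, 0.6]]
lam = [1.0, 1.0]
gain = incremental_traffic([0, 1], p, lam, K=1, L=1.0)
print(gain)  # about 0.2: pooling two slots covers both favourite files
```

A single-member "cluster" yields zero gain, which matches the intuition that clustering only helps through shared storage.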

It can be readily seen that the second item in the right hand side of (26) is unaffected by the clustering strategy, and that maximizing ${T^{\text{c}}}$ under the constraints in (3a)-(3d) is equivalent to maximizing $T^{\text{i}}$ under the constraints in (3a)-(3b). Therefore, we can reformulate the clustering subproblem to maximize $T^{\text{c}}$ as follows:

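$$\max_{\{\mathcal{M}_{n}^{\text{c}}\}_{n\in\mathcal{N}},\,\mathcal{M}^{\text{n}}}\;T^{\text{i}} \tag{27}$$

$$\text{s.t.}\quad\text{(3a), (3b)}.$$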

Solving the above optimization problem, we can obtain the clustered and nonclustered F-AP sets. Let ${\cal F}_{n}^{\text{c}}$ and ${\cal F}_{m}^{\text{n}}$ denote the set of $K_{n}$ most popular files in cluster $n$ and the set of $K$ most popular files at F-AP $m\in{\cal M}^{\text{n}}$ , respectively. Then, they can be expressed as follows:

[TABLE]

Correspondingly, the caching decisions $\left\{x_{nf},x_{mf}\left|f\in{\cal F},\right.\right.$ $\left.n\in\mathcal{N},m\in{\cal M}^{\text{n}}\right\}$ through maximizing $T^{\text{c}}$ can be expressed as follows:

[TABLE]
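The selection of the most popular files and the resulting binary caching decisions can be illustrated with a minimal Python sketch. It assumes the request probabilities of one cluster or one nonclustered F-AP are given as a plain list indexed by file, and that the capacity (written `k` here, standing for $K_{n}$ or $K$) is supplied by the caller; the function names are hypothetical and ties are broken by the lower file index, which the text does not specify.

```python
def top_k_files(p, k):
    """Return the indices of the k most requested files, most popular first.

    p : list of request probabilities, p[f] for file f (assumed input)
    k : storage capacity in files (K_n for a cluster, K for a single F-AP)
    """
    # sorted() is stable, so equal probabilities keep the lower index first.
    order = sorted(range(len(p)), key=lambda f: -p[f])
    return order[:k]


def caching_indicator(p, k):
    """Binary caching decisions: 1 if the file is among the k most popular."""
    chosen = set(top_k_files(p, k))
    return [1 if f in chosen else 0 for f in range(len(p))]


if __name__ == "__main__":
    p_cluster = [0.1, 0.4, 0.2, 0.3]
    print(top_k_files(p_cluster, 2))
    print(caching_indicator(p_cluster, 2))
```

Sorting once per cluster or F-AP keeps this step at $O(F\log F)$, which is negligible next to the clustering search.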

Substitute (30) and (31) into (21). Then, $T^{\text{d}}$ can be expressed in the equivalent form given in (32). Therefore, we can reformulate the content placement subproblem to minimize $T^{\text{d}}$ as follows:

[TABLE]

For convenience, a summary of major notations is presented in Table I.

IV Proposed Graph-based Cooperative Caching Scheme

In the previous section, we have transformed the challenging cooperative caching optimization problem into a clustering subproblem and a content placement subproblem. The clustering subproblem in (27) and the content placement subproblem in (33) fall into the scope of combinatorial programming [16, 20]. A brute-force approach is generally required to obtain the globally optimal solution of each subproblem. However, such an approach has exponential complexity with respect to the number of F-APs and the number of disjoint cluster sets or the sizes of the popular file sets ${\cal F}_{n}^{\text{c}}$ and ${\cal F}_{m}^{\text{n}}$. Although its computational complexity is indeed reduced compared to the original dynamic programming approach, it is still computationally impractical even for a small network. By mapping each F-AP to one vertex in a graph, a candidate cluster can be represented by a subgraph [21]. By mapping each obtained subgraph to a vertex in a new graph, the disjoint cluster sets can be represented by an independent subset of the vertex set of this new graph. According to graph theory [22], all the subgraphs of a graph and the independent subset of the vertex set of a graph can be obtained in polynomial time. Correspondingly, the clustering subproblem can be solved in polynomial time. Furthermore, by mapping each pair of cooperative F-APs that are not in the same cluster to two vertices connected by one edge in a graph, all the edges can be traversed to control the cached files at the corresponding paired F-APs, and the duplicate cached files can then be eliminated. According to graph theory [22], all the edges in a graph can be found in polynomial time. Correspondingly, the content placement subproblem can also be solved in polynomial time. Therefore, we adopt an effective graph-based approach to solve the clustering subproblem and the content placement subproblem, respectively.

IV-A Proposed Graph-based Clustering Approach

IV-A1 Description of the Proposed Approach

In our proposed graph-based clustering approach, all the considered $M$ F-APs are first checked to determine which pairs satisfy the constraints in (27a). It is already known that F-APs with appropriate distance and load difference from each other are more likely to cooperate [16]. Then, according to the checking results, the node graph ${{\cal G}^{\text{n}}}=({\cal M},{\cal E})$ is constructed, whose vertex set $\cal M$ is the F-AP set and whose edge set $\cal E$ reflects the distance and load difference among the F-APs. In ${{\cal G}^{\text{n}}}$, two vertices are connected by an edge if the F-APs they represent can cooperate with each other. Note that a subgraph of ${{\cal G}^{\text{n}}}$ in which every vertex is connected by an edge to some other vertex of the same subgraph represents one cluster consisting of a certain number of cooperative F-APs, whereas a complete subgraph of ${{\cal G}^{\text{n}}}$, in which every two vertices are connected by an edge, represents one candidate cluster of the optimization problem in (27) whose members can all cooperate with each other. We point out that there may exist vertices not belonging to any such subgraph of ${{\cal G}^{\text{n}}}$, which means that the F-APs they represent are nonclustered. For illustration, a node graph with thirteen vertices is shown in Fig. 2. According to the above descriptions, seeking candidate clusters is equivalent to searching for complete subgraphs in ${{\cal G}^{\text{n}}}$. The algorithm for searching for complete subgraphs is presented in detail in Section IV-A-2.
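The construction of the node graph can be sketched in a few lines of Python. Since the exact form of the constraints in (27a) is not reproduced in this section, the pairwise check is passed in as a caller-supplied predicate `can_cooperate`; the helper `within_distance`, which uses only a distance threshold `gamma_d`, is a hypothetical stand-in and ignores the load-difference condition.

```python
from itertools import combinations
from math import dist


def build_node_graph(num_faps, can_cooperate):
    """Adjacency sets of the node graph: adj[m] holds the F-APs that may
    cooperate with F-AP m according to the supplied pairwise predicate."""
    adj = {m: set() for m in range(num_faps)}
    for m, m2 in combinations(range(num_faps), 2):
        if can_cooperate(m, m2):
            adj[m].add(m2)
            adj[m2].add(m)
    return adj


def within_distance(positions, gamma_d):
    """Hypothetical predicate: two F-APs may cooperate if they are at most
    gamma_d apart. The load-difference condition is NOT modelled here."""
    return lambda m, m2: dist(positions[m], positions[m2]) <= gamma_d


if __name__ == "__main__":
    positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    print(build_node_graph(3, within_distance(positions, 2.0)))
```

A vertex whose adjacency set is empty, such as the third F-AP in the usage example, corresponds to a nonclustered F-AP.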

Let ${\cal H}=\{{h_{1}},{h_{2}},\cdots,{h_{n}},\cdots,{h_{N^{\prime}}}\}$ denote the complete subgraph set that has been obtained through the above searching algorithm, where $N^{\prime}$ denotes the number of the complete subgraphs so obtained. It is clear that $\left\{{{\cal M}_{n}^{\text{c}}}\right\}_{n=1}^{N}\subseteq{\cal H}$ . Then, a weighted graph denoted by ${{\cal G}^{\text{w}}}=({\cal H},{\cal B},\bm{w})$ can be constructed, where $\cal H$ denotes the vertex set, $\cal B$ denotes the edge set, and $\bm{w}$ denotes the weight vector corresponding to the vertices of ${{\cal G}^{\text{w}}}$ whose elements are set to be the incremental offloaded traffic of their corresponding complete subgraphs, i.e., ${\left[\bm{w}\right]_{n}}=\mathop{T}\nolimits_{n}^{\text{i}}$ . In ${{\cal G}^{\text{w}}}$ , two vertices are connected through an edge if their representing complete subgraphs have a certain identical vertex. It is known from graph theory that an independent or stable set is a set of vertices in a graph, no two of which are adjacent [22]. Then, the independent subset of ${\cal H}$ certainly satisfies the constraint in (5). Correspondingly, the objective cluster sets of the optimization problem in (27) can be readily obtained by searching for the equivalent max-weight independent subset of ${\cal H}$ of the corresponding weighted graph ${{\cal G}^{\text{w}}}$ . The max-weight independent subset of ${\cal H}$ can be obtained by solving a 0-1 integer programming problem, which will be presented in detail in Section IV-A-3.
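As a rough illustration of how the edge set of the weighted graph follows from the candidate clusters, the sketch below links two candidates whenever they share an F-AP. The data layout (a list of sets of F-AP indices, with candidates identified by list position) and the function name are assumptions made for this example; the weights $T_{n}^{\text{i}}$ are kept separately and are not needed to build the edges.

```python
from itertools import combinations


def build_conflict_edges(candidates):
    """Edges of the weighted graph over the candidate clusters.

    candidates : list of sets of F-AP indices (the complete subgraphs in H)
    Returns conflicts[n] = indices of candidates sharing an F-AP with n.
    """
    conflicts = {n: set() for n in range(len(candidates))}
    for a, b in combinations(range(len(candidates)), 2):
        if candidates[a] & candidates[b]:  # at least one common F-AP
            conflicts[a].add(b)
            conflicts[b].add(a)
    return conflicts


if __name__ == "__main__":
    H = [{0, 1}, {1, 2}, {3, 4}]
    print(build_conflict_edges(H))
```

Any set of candidates with no edge between them is pairwise disjoint in F-APs, which is why an independent set of this graph yields disjoint clusters.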

We remark here that we map one cluster to one complete subgraph, which guarantees proper-sized clusters and avoids unnecessary intra-cluster signaling overhead, instead of to one connected subgraph as in [21], which tends not to constrain the cluster size.

IV-A2 Searching for Complete Subgraphs

We propose to search for maximal complete subgraphs to find all the possible complete subgraphs. It is known from [22] that any complete subgraph must belong to a maximal complete subgraph, and that it is more difficult to find complete subgraphs through direct searching than through indirect searching for maximal complete subgraphs. We propose to exploit the adjacency table of each vertex in the node graph ${{\cal G}^{\text{n}}}$ to search for maximal complete subgraphs. For $m\in{\cal M}$, let ${{\cal T}_{m}}=\left\{m\right\}\cup\left\{{{m^{\prime}}\left|{m^{\prime}\in{\cal M},m^{\prime}>m,\left({m^{\prime},m}\right)\in{\cal E}}\right.}\right\}$ denote the adjacency table of vertex $m$ of ${{\cal G}^{\text{n}}}$, and let ${L_{m}}$ denote the size of ${{\cal T}_{m}}$. If ${L_{m}}=1$ or ${{\cal T}_{m}}\subseteq{{\cal T}_{m^{\prime}}}$ for $m,m^{\prime}\in{\cal M}$ and $m^{\prime}<m$, it is unnecessary to search for a maximal complete subgraph in ${{\cal T}_{m}}$. Remove all the unnecessary or redundant adjacency tables and sort the remaining ones in descending order of table size, with ${\cal T}_{m}^{\text{o}}$ denoting a reordered adjacency table. Let ${{\cal T}}$ denote the set of the reordered adjacency tables of ${{\cal G}^{\text{n}}}$. Remove any vertex that does not connect with all the other vertices in ${\cal T}_{m}^{\text{o}}$. Then, the remaining vertices in ${{\cal T}_{m}^{\text{o}}}$ form a maximal complete subgraph. Let ${{\cal G}^{\text{m}}}$ denote the set of maximal complete subgraphs. The detailed description of our proposed algorithm for searching for maximal complete subgraphs is presented in Algorithm 1. After the maximal complete subgraphs are found, all the possible complete subgraphs can be readily obtained.
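The two-stage idea (find maximal complete subgraphs, then derive every complete subgraph from them) can be sketched as follows. Note that this sketch does not reproduce Algorithm 1: it swaps in the standard Bron-Kerbosch recursion (without pivoting) for the maximal-clique step, and it assumes a candidate cluster needs at least two F-APs, so single vertices are skipped. Function names are hypothetical.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting. adj maps vertex -> set of neighbours.
    Returns maximal complete subgraphs with at least two vertices."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= 2:  # assumption: isolated vertices are not clusters
                found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def all_complete_subgraphs(adj):
    """Every complete subgraph (size >= 2) is a subset of a maximal one."""
    subs = set()
    for clique in maximal_cliques(adj):
        for size in range(2, len(clique) + 1):
            subs.update(frozenset(c) for c in combinations(sorted(clique), size))
    return subs


if __name__ == "__main__":
    # Triangle 0-1-2, pendant edge 2-3, isolated vertex 4.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}
    print(sorted(sorted(c) for c in all_complete_subgraphs(adj)))
```

Enumerating all subsets of a maximal complete subgraph grows exponentially with its size, so this sketch is only practical when clusters stay small, as the complete-subgraph mapping is intended to ensure.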

IV-A3 Searching for Max-Weight Independent Subset

According to the construction of the weighted graph ${{\cal G}^{\text{w}}}$, two vertices in $\cal H$ are adjacent, i.e., there exists an edge between them, if the candidate cluster sets they represent have some identical elements. Let ${\bm{x}}$ denote the binary indicating vector for the vertices in $\cal H$, with ${[\bm{x}]_{n}}=1$ if the candidate cluster set represented by the vertex $h_{n}$ belongs to the objective disjoint cluster sets of the original optimization problem in (27) and ${[\bm{x}]_{n}}=0$ otherwise. If the vertices $h_{n}$ and $h_{n^{\prime}}$ are connected by an edge $\left({{h_{n}},{h_{n^{\prime}}}}\right)\in{\cal B}$, the relationship ${[\bm{x}]_{n}}[\bm{x}]_{n^{\prime}}=0$ should be satisfied.

According to the above description, the original optimization problem in (27) can be transformed into the following 0-1 integer programming problem,

[TABLE]
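For readers following without the typeset equation, one form of the 0-1 program that is consistent with the verbal description above (weights $[\bm{w}]_{n}=T_{n}^{\text{i}}$, indicator vector $\bm{x}$, and the product condition on adjacent vertices) is sketched below. This is a reading of the surrounding text rather than a reproduction of (34); in particular, the mapping of the two constraints to the labels (34a) and (34b) is an assumption.

```latex
\begin{align}
\max_{\bm{x}} \quad & \sum_{n=1}^{N'} [\bm{w}]_{n}\,[\bm{x}]_{n} \\
\text{s.t.} \quad & [\bm{x}]_{n}\,[\bm{x}]_{n'} = 0, \quad \forall \left(h_{n}, h_{n'}\right) \in \mathcal{B}, \\
& [\bm{x}]_{n} \in \{0, 1\}, \quad n = 1, \ldots, N'.
\end{align}
```

The product constraint can equivalently be written in linear form as $[\bm{x}]_{n}+[\bm{x}]_{n'}\leq 1$ for binary variables, which is the form a linear relaxation would use.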

The above optimization problem can be solved by linear programming only if its linear relaxation is tight and has a unique integral solution. However, these two conditions are hard to satisfy [23]. Actually, the optimization problem in (34) is a classical problem that maximizes a submodular set function and can often be solved by greedy algorithms [24]. Considering that traditional greedy algorithms cannot take full advantage of the specific constraints in (34a)-(34b), we propose a more effective greedy algorithm. Let ${{\cal G}_{n}}$ denote an independent subset of $\cal H$ and $w_{n}$ denote the sum weight of all the vertices in ${{\cal G}_{n}}$. Each time, move the vertex with the largest weight from $\cal H$ to ${{\cal G}_{n}}$ and remove its adjacent vertices from $\cal H$. Repeat this step until $\cal H$ is empty. Among the independent subsets ${{\cal G}_{n}}$ so obtained, the one with the maximum sum weight $w_{n}$ is the max-weight independent subset, denoted by ${\cal G}^{\text{o}}$, that we are searching for. The detailed description of our proposed greedy algorithm for searching for the max-weight independent subset is presented in Algorithm 2.
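A minimal Python sketch of a multi-start greedy search in the spirit of the description above is given below. It is an interpretation, not a transcription of Algorithm 2: each run is assumed to begin from a forced initial vertex and then repeatedly take the heaviest remaining vertex, and the best run is returned. Vertices are assumed to be integers, ties are broken by the lower index, and the function names are hypothetical.

```python
def greedy_from(conflicts, weights, start):
    """One greedy run: begin with `start`, then repeatedly take the heaviest
    remaining vertex, discarding the neighbours of every chosen vertex."""
    remaining = set(conflicts)
    chosen = []
    v = start
    while True:
        chosen.append(v)
        remaining -= conflicts[v] | {v}
        if not remaining:
            return chosen
        # Heaviest remaining vertex; lower index wins on equal weight.
        v = max(remaining, key=lambda u: (weights[u], -u))


def multi_start_greedy(conflicts, weights):
    """Run the greedy search once per possible initial vertex, keep the best."""
    best, best_w = [], 0
    for start in conflicts:
        cand = greedy_from(conflicts, weights, start)
        w = sum(weights[u] for u in cand)
        if not best or w > best_w:
            best, best_w = cand, w
    return sorted(best), best_w


if __name__ == "__main__":
    # Vertex 0 conflicts with 1 and 2, which do not conflict with each other.
    conflicts = {0: {1, 2}, 1: {0}, 2: {0}}
    weights = {0: 5, 1: 3, 2: 3}
    print(greedy_from(conflicts, weights, 0))      # single start at heaviest
    print(multi_start_greedy(conflicts, weights))  # best over all starts
```

In the usage example, starting from the heaviest vertex yields a total weight of 5, whereas starting from either lighter vertex yields 6, which is the kind of gain the outer loops are meant to capture.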

In traditional greedy algorithms [25, 26], the vertex with the largest weight is generally chosen as the initial vertex to search for the max-weight independent subset. In contrast, we set $N^{\prime}$ outer loops in Algorithm 2. Correspondingly, each vertex in $\cal H$ has a chance to be the initial vertex of an independent subset. Therefore, the $N^{\prime}$ outer loops in Algorithm 2 guarantee to find the max-weight independent subset of the vertex set of the weighted graph.

To further illustrate the above issue, consider the weighted graph with nine vertices shown in Fig. 3. In this weighted graph, the vertices are divided into three groups according to their weights: the weight of each vertex in the first group is larger than that of any vertex in the second and third groups, and the weight of each vertex in the second group is larger than that of any vertex in the third group. Assume that vertex 1 has the largest weight among the nine vertices. In traditional greedy algorithms, vertex 1 will be chosen as the initial vertex. Then, the output max-weight independent subset will be $\{1,2,3\}$. However, in Algorithm 2, vertex 4 is also allowed to be the initial vertex. Then, the output max-weight independent subset will be $\left\{{4,5,6,7,8}\right\}$ if its sum weight is larger than that of $\{1,2,3\}$. It can be readily seen that the sum weight of all the vertices in the obtained independent subset will not be the maximum if the initial vertex is not selected properly. Therefore, the $N^{\prime}$ outer loops in Algorithm 2 can indeed guarantee to find the max-weight independent subset.

After the max-weight independent subset ${\cal G}^{\text{o}}$ is found, the clustered sets and nonclustered set can be determined. Then, the set of popular files ${\cal F}_{n}^{\text{c}}$ for $n\in{\cal N}$ in cluster $n$ and the set of popular files ${\cal F}_{m}^{\text{n}}$ for $m\in{\cal M}^{\text{n}}$ at the nonclustered F-AP $m$ can be determined according to (28) and (29), respectively.

IV-B Proposed Graph-based Content Placement Approach

IV-B1 Description of the Proposed Approach

In our proposed graph-based content placement approach, firstly, we find the complete subgraphs corresponding to the elements in the obtained max-weight independent subset ${\cal G}^{\text{o}}$ , and remove the edges in these complete subgraphs. Utilizing the vertices and the remaining edges in the node graph ${\cal G}^{\text{n}}$ , we propose to construct a redundancy graph denoted by ${\cal G}^{\text{r}}=\left({\cal M},{\cal E}^{\text{r}}\right)$ , where ${\cal M}$ denotes its vertex set and ${\cal E}^{\text{r}}$ denotes its edge set reflecting the cache redundancy among cooperative F-APs. Let $e=\left\{{m,m^{\prime}}\right\}\in{\cal E}^{\text{r}}$ denote the edge that connects vertex $m$ and vertex $m^{\prime}$ , and ${\cal F}_{e}^{\text{d}}$ denote the set of duplicate popular files in the obtained popular file set of the cooperative F-APs corresponding to the vertices connected by edge $e$ . If edge $e$ connects $m\in{{{\cal M}_{n}^{\text{c}}}}$ and $m^{\prime}\in{\cal M}^{\text{n}}$ , then we have: ${\cal F}_{e}^{\text{d}}={\cal F}_{n}^{\text{c}}\cap{\cal F}_{m}^{\text{n}}$ . If edge $e$ connects $m\in{{\cal M}_{n}^{\text{c}}}$ and $m^{\prime}\in{{\cal M}_{n^{\prime}}^{\text{c}}}$ , then we have: ${\cal F}_{e}^{\text{d}}={\cal F}_{n}^{\text{c}}\cap{\cal F}_{n^{\prime}}^{\text{c}}$ , whose size may exceed $K$ , i.e., the storage size of each F-AP. Correspondingly, only a portion of the duplicate popular files in ${\cal F}_{e}^{\text{d}}$ will indeed cause cache redundancy. Furthermore, when edge $e$ connects $m\in{{\cal M}_{n}^{\text{c}}}$ and edge $e^{\prime}$ connects $m^{\prime}\in{{\cal M}_{n}^{\text{c}}}$ with $e^{\prime}\neq e$ and $m^{\prime}\neq m$ , ${\cal F}_{e}^{\text{d}}$ and ${\cal F}_{e^{\prime}}^{\text{d}}$ may contain duplicate files. Correspondingly, only a portion of the duplicate popular files in ${\cal F}_{e}^{\text{d}}$ and ${\cal F}_{e^{\prime}}^{\text{d}}$ will indeed cause cache redundancy. 
Therefore, we propose to separate ${\cal F}_{e}^{\text{d}}$ to determine the duplicate files that will indeed cause cache redundancy at edge $e$ . The process of separating ${\cal F}_{e}^{\text{d}}$ will be presented in detail in Section IV-B-2.
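The construction of the redundancy graph and of the duplicate file set on each of its edges can be sketched as follows. The sketch assumes the node-graph edges are given as pairs, the selected clusters as a list of sets, and the popular file sets as Python sets; all names are hypothetical. The case of an edge joining two nonclustered F-APs is not spelled out in the text, and the sketch treats it by analogy as the intersection of their two popular file sets.

```python
def build_redundancy_graph(node_edges, clusters):
    """Drop every node-graph edge whose endpoints lie in the same selected
    cluster; the surviving edges form the redundancy graph."""
    cluster_of = {m: n for n, members in enumerate(clusters) for m in members}
    redundancy_edges = set()
    for m, m2 in node_edges:
        same_cluster = m in cluster_of and cluster_of.get(m) == cluster_of.get(m2)
        if not same_cluster:
            redundancy_edges.add(frozenset((m, m2)))
    return redundancy_edges, cluster_of


def duplicate_files(edge, cluster_of, cluster_files, ap_files):
    """Files that appear in the popular sets on both sides of an edge.

    cluster_files[n] : popular file set of cluster n
    ap_files[m]      : popular file set of nonclustered F-AP m
    """
    def popular(m):
        return cluster_files[cluster_of[m]] if m in cluster_of else ap_files[m]

    m, m2 = tuple(edge)
    return popular(m) & popular(m2)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3)]
    red, cluster_of = build_redundancy_graph(edges, [{0, 1}])
    files_c = {0: {10, 11, 12, 13}}
    files_n = {2: {12, 13}, 3: {13, 14}}
    for e in sorted(red, key=sorted):
        print(sorted(e), sorted(duplicate_files(e, cluster_of, files_c, files_n)))
```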

Then, we propose to enhance the caching decisions to control the caching locations of the duplicate popular files and ensure that each duplicate popular file is cached only once between each pair of cooperative F-APs. After the caching locations of all the duplicate popular files are determined, the remaining storage space of each F-AP is filled with the remaining files according to their request probability. The process of caching-decision enhancement will be presented in detail in Section IV-B-3.

IV-B2 Separate the Set of Duplicate Popular Files

Let ${\cal T}_{m}$ denote the adjacency table of vertex $m$ of ${\cal G}^{\text{r}}$. Sort all the adjacency tables in descending order according to their table sizes. Let ${\cal T}$ denote the set of the reordered adjacency tables of ${\cal G}^{\text{r}}$, and ${\cal T}_{m}^{\text{o}}$ denote the $m$th adjacency table in ${\cal T}$. Let $K_{m}$ denote the size of the remaining storage space of the F-AP corresponding to vertex $m$. Initialize $K_{m}=K$. Let ${\cal F}_{m}^{\text{i}}$ denote the intersection of the sets of duplicate popular files at all the edges that connect vertex $m$ and its adjacent vertices in ${\cal T}_{m}^{\text{o}}$. Then, it can be expressed as follows:

[TABLE]

Let ${\cal F}_{e}^{\text{r}}$ denote the set of files that will indeed cause cache redundancy at edge $e$ after separating ${\cal F}_{e}^{\text{d}}$. If $\left|{\cal F}_{m}^{\text{i}}\right|\geq K_{m}$, ${\cal F}_{e}^{\text{r}}$ consists of $K_{m}$ files selected at random from ${\cal F}_{m}^{\text{i}}$. Otherwise, ${\cal F}_{e}^{\text{r}}$ consists of all the files in ${\cal F}_{m}^{\text{i}}$ together with $(K_{m}-\left|{\cal F}_{m}^{\text{i}}\right|)/(\left|{\cal T}_{m}^{\text{o}}\right|-1)$ files selected at random from ${\cal F}_{e}^{\text{d}}\backslash{\cal F}_{m}^{\text{i}}$. Once ${\cal F}_{e}^{\text{r}}$ is determined, update $K_{m^{\prime}}=K_{m^{\prime}}-\left|{\cal F}_{e}^{\text{r}}\right|$ for vertex $m^{\prime}\in{\cal T}_{m}^{\text{o}}$ and update ${\cal F}_{e^{\prime}}^{\text{d}}={\cal F}_{e^{\prime}}^{\text{d}}\backslash{\cal F}_{e}^{\text{r}}$ if edge $e^{\prime}\in{\cal E}^{\text{r}}$ connects vertex $m^{\prime}$.

IV-B3 Enhance the Caching Decisions

Let $\Delta{x_{mf}}\in\left\{-1,0,1\right\}$ denote the indicator of the caching-decision enhancement for file $f\in\cal F$ at vertex $m\in{\cal M}$ , where $\Delta{x_{mf}}=1$ indicates that the F-AP corresponding to vertex $m$ is chosen as the caching location for file $f$ , $\Delta{x_{mf}}=-1$ indicates that the F-AP corresponding to vertex $m$ is not allowed to cache file $f$ so as to eliminate redundancy, and $\Delta{x_{mf}}=0$ indicates that the caching location for file $f$ has not been determined yet. Initialize ${\Delta{x_{mf}}}=0$ and set $K_{m}=K$ .

Firstly, calculate the indicators of the caching-decision enhancements for file $f\in{\cal F}_{e}^{\text{r}}$ . For each ${\cal T}_{m}^{\text{o}}\in{\cal T}$ and each $m^{\prime}\in{\cal T}_{m}^{\text{o}}$ with $m^{\prime}\neq m$ , find the files whose caching locations are at the F-AP corresponding to vertex $m$ , and forbid these files to be cached at the F-AP corresponding to vertex $m^{\prime}$ . Then, the indicators of the corresponding caching-decision enhancements are set as follows:

[TABLE]

Update ${\cal F}_{e}^{\text{r}}$ by removing these files. Furthermore, find the files whose caching locations are not allowed to be at the F-AP corresponding to vertex $m$ , and choose the F-AP corresponding to vertex $m^{\prime}$ as the caching locations for these files. Then, the indicators of the corresponding caching-decision enhancements are set as follows:

[TABLE]

Update $K_{m^{\prime}}$ and ${\cal F}_{e}^{\text{r}}$ by removing these files. If F-AP $m^{\prime}\in{\cal M}_{n}^{\text{c}}$ , update ${\cal F}_{n}^{\text{c}}$ by removing these files. Let $T_{em}^{\text{p}}$ denote the possible offloaded traffic due to caching the remaining files in ${{\cal F}_{e}^{\text{r}}}$ at the F-AP corresponding to vertex $m$ . Then, it can be expressed as follows:

[TABLE]

Suppose $T_{em}^{\text{p}}\geq T_{em^{\prime}}^{\text{p}}$ . Then, set the indicators of the corresponding caching-decision enhancements as follows:

[TABLE]

Update $K_{m}=K_{m}-\left|{\cal F}_{e}^{\text{r}}\right|.$ If F-AP $m\in{\cal M}_{n}^{\text{c}}$ , update ${\cal F}_{n}^{\text{c}}$ by removing these files.

Secondly, calculate the indicators of the corresponding caching-decision enhancements for the remaining files in ${\cal F}_{n}^{\text{c}}$ . For each $m\in{\cal M}_{n}^{\text{c}}$ , find the files whose caching locations can be at the F-AP corresponding to vertex $m$ . Then, set the indicators of the corresponding caching-decision enhancements as follows:

[TABLE]

Update $K_{m}$ and ${\cal F}_{n}^{\text{c}}$ by removing these files. For each $m\in{\cal M}_{n}^{\text{c}}$ which satisfies $K_{m}>0$ , randomly select $K_{m}$ files from ${\cal F}_{n}^{\text{c}}$ , and set the indicators of the corresponding caching-decision enhancements as follows:

[TABLE]

Update $K_{m}$ and ${\cal F}_{n}^{\text{c}}$ by removing these files.

Thirdly, calculate the indicators of the corresponding caching-decision enhancements at the vertices corresponding to nonclustered F-APs. For each $m\in{{\cal M}^{\text{n}}}$ which satisfies $K_{m}>0$ , select $K_{m}$ most popular files from the uncached files at the F-AP corresponding to vertex $m$ and its cooperators according to their request probability, and set $\Delta{x_{mf}}$ as follows:

[TABLE]

Finally, enhance the caching-decision for each $m\in{{\cal M}}$ as follows:

[TABLE]

The detailed description of our proposed graph-based content placement algorithm is presented in Algorithm 3.

IV-C Complexity Analysis

Let ${\bar{L}}$ denote the average size of the adjacency tables of all the vertices in the node graph $\cal G^{\text{n}}$ . Then, the computational complexity of searching for maximal complete subgraphs in Algorithm 1 is ${\mathcal{O}}(M\bar{L})$ . Furthermore, the computational complexity of obtaining all the complete subgraphs is ${\mathcal{O}}(P{{\bar{V}}})$ , where $P$ denotes the number of the maximal complete subgraphs that have been found, and ${{\bar{V}}}$ denotes the average vertex number of all the complete subgraphs. Besides, the computational complexity of searching for the max-weight independent subset in Algorithm 2 is ${\mathcal{O}}(N^{\prime}{N})$ . Therefore, the computational complexity of the proposed graph-based clustering approach is ${\mathcal{O}}(M\bar{L}+P{{\bar{V}}}+N^{\prime}{N})$ .

Let $\delta$ denote the maximum degree of the redundancy graph. The computational complexity of the proposed graph-based content placement algorithm is ${\mathcal{O}}(M\delta+2M)$.

In summary, the computational complexity of the proposed graph-based cooperative caching scheme is ${\mathcal{O}}(M\bar{L}+P{{\bar{V}}}+N^{\prime}{N}+M\delta+2M)$. By considering ${\bar{L}<M}$, ${{\bar{V}}}<M$, $N<M$, and $\delta<M$, the computational complexity of the proposed graph-based cooperative caching scheme is ${\mathcal{O}}(M^{2}+PM+N^{\prime}M)$ in the worst case. This is lower than the ${\mathcal{O}}(M^{3}KF^{2})$ complexity in [8] and the ${\mathcal{O}}(M^{4}K+MKF)$ complexity in [17] when $M\ll F$, $P<F$, and $N^{\prime}<F$ are taken into account.
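To give a feel for the gap between these growth terms, the short Python sketch below evaluates the three expressions inside the $\mathcal{O}(\cdot)$ bounds for the default simulation parameters $M=10$, $K=250$, and $F=5000$. The values of $P$ and $N^{\prime}$ are not reported in this section, so $P=20$ and $N^{\prime}=100$ are purely hypothetical placeholders, and the figures ignore the constant factors hidden by the asymptotic notation.

```python
def proposed_term(M, P, N_prime):
    """Worst-case growth term of the graph-based scheme: M^2 + P*M + N'*M."""
    return M**2 + P * M + N_prime * M


def ref8_term(M, K, F):
    """Growth term quoted for [8]: M^3 * K * F^2."""
    return M**3 * K * F**2


def ref17_term(M, K, F):
    """Growth term quoted for [17]: M^4 * K + M * K * F."""
    return M**4 * K + M * K * F


if __name__ == "__main__":
    M, K, F = 10, 250, 5000   # default simulation parameters
    P, N_prime = 20, 100      # hypothetical values, for illustration only
    print(proposed_term(M, P, N_prime))
    print(ref8_term(M, K, F))
    print(ref17_term(M, K, F))
```

Under these placeholder values the proposed term is on the order of $10^{3}$, against roughly $10^{7}$ for [17] and $10^{12}$ for [8].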

V Simulation Results

In this section, the performance of the proposed graph-based cooperative caching scheme is evaluated via simulations. In the simulations, the request probability at each F-AP is generated from the global request probability, which follows a Zipf distribution with skewness parameter $z$ (see Footnote 1). Unless otherwise stated, the system parameters are set as follows: $z=0.6$, $M=10$, $F=5000$, $K=250$, $L=2$ Gb. We choose the locally popular caching (LPC) scheme and the globally popular caching (GPC) scheme as two baselines [15]. In the LPC scheme, the $K$ most popular files are cached at each F-AP based on the local request probability, and neighboring F-APs can cooperate with each other. In the GPC scheme, the $K$ most popular files are cached at each F-AP based on the global request probability, and neighboring F-APs cannot cooperate with each other.

Footnote 1: Let $p_{f}$ denote the global request probability of file $f$. Assume that the global request probability and the request probability at the considered $M$ F-APs have the following relationship: ${p_{f}}=\sum\nolimits_{m\in{\cal M}}{{w_{m}}{p_{mf}}}$ [19].
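The Zipf popularity model can be sketched in Python as follows, using the standard normalized form $p_{f}=f^{-z}/\sum_{j=1}^{F}j^{-z}$ for the $f$th most popular file. The helper `top_k_mass`, which sums the request probability covered by the $K$ most popular files, is a hypothetical proxy added for illustration; it is not the offloaded traffic $T$ of the paper, and the way local request probabilities are derived from the global ones is not reproduced here.

```python
def zipf_popularity(F, z):
    """Global request probabilities of F files under a Zipf law with
    skewness z; index 0 is the most popular file."""
    weights = [1.0 / (f ** z) for f in range(1, F + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_k_mass(p, K):
    """Fraction of requests that target the K most popular files
    (illustrative proxy only, not the paper's offloaded traffic)."""
    return sum(sorted(p, reverse=True)[:K])


if __name__ == "__main__":
    p = zipf_popularity(5000, 0.6)
    print(round(top_k_mass(p, 250), 3))
    # A more skewed distribution concentrates requests on fewer files.
    print(round(top_k_mass(zipf_popularity(5000, 1.0), 250), 3))
```

With the cache size fixed, the covered probability mass grows with $z$ and shrinks as the library size $F$ grows, which is the qualitative behavior one would expect any popularity-based caching scheme to inherit.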

In Fig. 4, we show the offloaded traffic $T$ of our proposed scheme and the two baselines versus the storage size $K$ of each F-AP with different distance thresholds ${\gamma^{\text{d}}}$. It can be observed that the offloaded traffic of all the three schemes increases with the storage size. It can also be observed that the performance of the proposed scheme is superior to that of the baselines (see Footnote 2). The reason is that the proposed scheme improves clustering and reduces the repetitive and redundant storage of files. Correspondingly, more user requests can be satisfied locally compared with the baselines. Furthermore, the offloaded traffic of the proposed and LPC schemes increases with the distance threshold ${\gamma^{\text{d}}}$, and ${\gamma^{\text{d}}}$ has a greater influence on the performance of the proposed scheme. The reason is that as ${{\gamma^{\text{d}}}}$ becomes larger, the constraints of the clustering subproblem in our proposed scheme are relaxed, the cluster size becomes larger, more F-APs can cooperate with each other, and more files can then be successfully cached locally.

Footnote 2: Clearly, a centralized approach has been assumed in this paper. This would certainly incur the necessary signaling overhead, and the impact of such overhead will be an interesting issue for future research.

In Fig. 5, we show the offloaded traffic $T$ of our proposed scheme and the two baselines versus the skewness parameter $z$ of the Zipf distribution with ${\gamma^{\text{d}}}=20$ and $K=1000$. It can be observed that the offloaded traffic of all the three schemes increases with $z$. The reason is that as $z$ becomes larger, user requests concentrate on fewer popular files and more traffic can then be offloaded. It can also be observed that the performance of the proposed scheme is superior to that of the baselines for all $z$.

In Fig. 6, we show the offloaded traffic $T$ of our proposed scheme and the two baselines versus the content library size $F$ with ${\gamma^{\text{d}}}=15$ and $z=0.4$ . It can be observed that the offloaded traffic of all the three schemes decreases with $F$ . The reason is that as $F$ becomes larger, the requested files will become more diverse and the number of requested files that are not cached locally will increase. It can also be observed that the performance of the proposed scheme is superior to that of the baselines for all $F$ .

VI Conclusions

In this paper, we have proposed a graph-based cooperative caching scheme including clustering and content placement in F-RAN. By constructing the relevant node graph and weighted graph, the objective cluster sets have been obtained by searching for the max-weight independent subset of the vertex set of the weighted graph. By constructing the redundancy graph, the final caching decisions have been obtained by calculating the indicators of the caching-decision enhancements. Both significant computational complexity reduction and remarkable offloaded traffic have been achieved by using our proposed graph-based cooperative caching scheme.

Bibliography

[1] M. Peng, S. Yan, K. Zhang, et al., “Fog-computing-based radio access networks: Issues and challenges,” IEEE Netw., vol. 30, no. 4, pp. 46–53, July 2016.
[2] W. Jiang, G. Feng, and S. Qin, “Optimal cooperative content caching and delivery policy for heterogeneous cellular networks,” IEEE Trans. Mob. Comput., vol. 16, no. 5, pp. 1382–1393, May 2017.
[3] X. Li, X. Wang, and V. C. M. Leung, “Weighted network traffic offloading in cache-enabled heterogeneous networks,” in Proc. IEEE ICC, July 2016.
[4] S. Zhang, P. He, K. Suto, et al., “Cooperative edge caching in user-centric clustered mobile networks,” IEEE Trans. Mob. Comput., vol. 17, no. 8, pp. 1791–1805, Aug. 2018.
[5] S. Chae, T. Quek, and W. Choi, “Content placement for wireless cooperative caching helpers: A tradeoff between cooperative gain and content diversity gain,” IEEE Trans. Wireless Commun., vol. 16, no. 10, pp. 6795–6807, Oct. 2017.
[6] K. Poularakis, G. Iosifidis, A. Argyriou, et al., “Caching and operator cooperation policies for layered video content delivery,” in Proc. IEEE INFOCOM, July 2016.
[7] Y. Sun, Z. Chen, and H. Liu, “Delay analysis and optimization in cache-enabled multi-cell cooperative networks,” in Proc. IEEE GLOBECOM, Dec. 2016.
[8] K. Shanmugam, N. Golrezaei, A. Dimakis, et al., “Femtocaching: Wireless content delivery through distributed caching helpers,” IEEE Trans. Info. Theory, vol. 59, no. 12, pp. 8402–8413, Dec. 2013.