Online Reinforcement Learning of X-Haul Content Delivery Mode in Fog Radio Access Networks

Jihwan Moon; Osvaldo Simeone; Seok-Hwan Park; Inkyu Lee

arXiv:1903.07364·eess.SP·December 23, 2019


TL;DR

This paper introduces an adaptive, reinforcement learning-based method for selecting content delivery modes in fog radio access networks, balancing current and future latency to optimize overall performance.

Contribution

It proposes a novel RL-based approach for mode selection in F-RANs that accounts for unknown, changing content popularity, improving latency management.

Findings

01 The RL scheme effectively reduces long-term delivery latency.

02 Adaptive mode selection outperforms static strategies.

03 Numerical results validate the approach's efficiency.

Abstract

We consider a Fog Radio Access Network (F-RAN) with a Base Band Unit (BBU) in the cloud and multiple cache-enabled enhanced Remote Radio Heads (eRRHs). The system aims at delivering contents on demand with minimal average latency from a time-varying library of popular contents. Information about uncached requested files can be transferred from the cloud to the eRRHs by following either backhaul or fronthaul modes. The backhaul mode transfers fractions of the requested files, while the fronthaul mode transmits quantized baseband samples as in Cloud-RAN (C-RAN). The backhaul mode allows the caches of the eRRHs to be updated, which may lower future delivery latencies. In contrast, the fronthaul mode enables cooperative C-RAN transmissions that may reduce the current delivery latency. Taking into account the trade-off between current and future delivery performance, this paper proposes an…

Tables

Algorithm 1: Proposed RL-based solution for problem (P)

Initialize the total number of episodes $N_{\text{epi}}$, weight vector $\textbf{w}=\textbf{0}$, eligibility trace $\textbf{E}=\textbf{0}$, and parameters $\gamma,\lambda\in(0,1]$
For $n_{\text{epi}}=1:N_{\text{epi}}$
    Randomly initialize cached contents $\mathcal{F}^{R}(0)$ and generate $\{\textbf{H}_{mk}\}$
    For $t=1:T_{\text{B}}$
        Collect observation $\text{o}(t)=\{\mathcal{F}_{\text{req}}(t),\mathcal{F}^{R}(t),\{\tau_{\text{req},f}(t)\}_{f\in\mathcal{F}^{R}(t)}\}$
        Choose the delivery mode greedily with probability $1-1/n_{\text{epi}}$ as $\text{a}(t)=\arg\max_{a'}\textbf{w}^{T}\textbf{x}(\text{o}(t),a')$, and uniformly with probability $1/n_{\text{epi}}$
        If $\text{a}(t)=1$, update $\mathcal{F}_{\text{cache},R}(t)$ according to LRU
        Set $\text{r}(t+1)=-\Delta(t,\text{a}(t))$
        Update $\textbf{E}\leftarrow\gamma\lambda\textbf{E}+\textbf{x}(o,a)$
        Update $\textbf{w}\leftarrow\textbf{w}+\beta\delta(t,\textbf{w})\textbf{E}$ with $\beta=1/n_{\text{epi}}$
    End
End

Equations

$$\text{(P)}:\ \min_{\pi}\ \mathbb{E}_{\pi}\Big[\sum\nolimits_{t=1}^{\infty}\gamma^{t}\Delta(t,\text{a}(t))\Big]\quad\text{s.t.}\quad\text{a}(t)\in\{0,1\},\ \forall t,$$

$$R_{\text{back},k}^{U}\left(\left\{\textbf{G}_{k}\right\}\right)=\log_{2}\big|\textbf{I}_{N_{k}^{U}}+\boldsymbol{\Phi}_{\text{back},k}^{U}\big|\ \text{[bits/symbol]},$$

$$\min_{\Delta^{U},\{\textbf{G}_{k}\}}\ \Delta^{R}+\Delta^{U}$$

$$\Delta^{U}\geq L/R_{\text{back},k}^{U}\left(\left\{\textbf{G}_{k}\right\}\right),\ \forall k\in\mathcal{K}_{\text{req}},$$

$$\text{tr}\big(\sum\nolimits_{k\in\mathcal{K}_{\text{req}}}\textbf{E}_{m}\textbf{G}_{k}\textbf{G}_{k}^{H}\textbf{E}_{m}^{H}\big)\leq P_{m}^{R},\ m=1,...,M,$$

$$R_{\text{front},k}^{U}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)=\log_{2}\big|\textbf{I}_{N_{k}^{U}}+\boldsymbol{\Phi}_{\text{front},k}^{U}\big|\ \text{[bits/symbol]},$$

$$g_{m}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)=\log_{2}\big|\textbf{I}_{N_{m}^{R}}+\boldsymbol{\Phi}_{m}^{R}\big|\ \text{[bits/symbol]},$$

$$\min_{\Delta^{R},\Delta^{U},\{\tilde{\textbf{G}}_{k}\},\boldsymbol{\Omega}_{R}}\ \Delta^{R}+\Delta^{U}$$

$$\Delta^{R}\geq\Delta^{U}g_{m}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)/C_{m}^{R},\ m=1,...,M,$$

$$\Delta^{U}\geq L/R_{\text{front},k}^{U}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big),\ \forall k\in\mathcal{K}_{\text{req}},$$

$$\text{tr}\big(\sum\nolimits_{k\in\mathcal{K}_{\text{req}}}\textbf{E}_{m}\tilde{\textbf{G}}_{k}\tilde{\textbf{G}}_{k}^{H}\textbf{E}_{m}^{H}+\textbf{E}_{m}\boldsymbol{\Omega}_{R}\textbf{E}_{m}^{H}\big)\leq P_{m}^{R},\ m=1,...,M,$$

$$\textbf{x}\left(o(t),a(t)\right)=\big[\boldsymbol{\phi}_{1}^{T}(t)\ \cdots\ \boldsymbol{\phi}_{F}^{T}(t)\ \boldsymbol{\theta}^{T}(t)\big]^{T}\otimes\textbf{a}(t),$$


Full text

Online Reinforcement Learning of X-Haul Content Delivery Mode in Fog Radio Access Networks

Jihwan Moon, Member, IEEE, Osvaldo Simeone, Fellow, IEEE, Seok-Hwan Park, Member, IEEE, and Inkyu Lee, Fellow, IEEE

This work was supported by the National Research Foundation through the Ministry of Science, ICT, and Future Planning (MSIP), Korean Government under Grant 2017R1A2B3012316. O. Simeone has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme (Grant Agreement No. 725731). J. Moon and I. Lee are with the School of Electrical Engineering, Korea University, Seoul 02841, South Korea (e-mail: {anschino, inkyu}@korea.ac.kr). O. Simeone is with the Department of Informatics, King’s College London, London WC2R 2LS, U.K. (e-mail: [email protected]). S.-H. Park is with the Division of Electronic Engineering, Chonbuk National University, Jeonju 54896, South Korea (e-mail: [email protected]).

Abstract

We consider a Fog Radio Access Network (F-RAN) with a Base Band Unit (BBU) in the cloud and multiple cache-enabled enhanced Remote Radio Heads (eRRHs). The system aims at delivering contents on demand with minimal average latency from a time-varying library of popular contents. Information about uncached requested files can be transferred from the cloud to the eRRHs by following either backhaul or fronthaul modes. The backhaul mode transfers fractions of the requested files, while the fronthaul mode transmits quantized baseband samples as in Cloud-RAN (C-RAN). The backhaul mode allows the caches of the eRRHs to be updated, which may lower future delivery latencies. In contrast, the fronthaul mode enables cooperative C-RAN transmissions that may reduce the current delivery latency. Taking into account the trade-off between current and future delivery performance, this paper proposes an adaptive selection method between the two delivery modes to minimize the long-term delivery latency. Assuming an unknown and time-varying popularity model, the method is based on model-free Reinforcement Learning (RL). Numerical results confirm the effectiveness of the proposed RL scheme.

I Introduction

The architecture of the recently launched fifth generation (5G) mobile system can leverage cloud processing at Base Band Units (BBUs), as well as edge processing, including edge caching, at enhanced Remote Radio Heads (eRRHs) [1]. In order to enable a flexible functional split in this architecture, referred to as Fog-Radio Access Network (F-RAN) [2], the concept of X-haul has been introduced to integrate the traditionally distinct backhaul and fronthaul connectivity modes for the interface between the BBU and the eRRH into a unified framework [3, 4, 5]. The backhaul mode enables the transfer of data packets from the BBU in the cloud to the eRRHs. In contrast, the fronthaul mode allows the BBU to carry out joint baseband processing and deliver quantized baseband samples to the eRRHs as in Cloud-RAN (C-RAN) [6, 7, 8].

In this work, we study the adaptive selection of backhaul and fronthaul transfer modes with the aim of optimizing the performance of content delivery. Content delivery in F-RANs has been widely studied in recent years [9, 10, 11, 12, 13, 14, 15]. Most studies assume offline caching with a static popularity model. Under these assumptions, references [9] and [10] investigated the problems of instantaneous delivery latency minimization and minimum data rate maximization, respectively, while keeping the contents of the caches fixed. In contrast, in [11] and [12], information-theoretic bounds were provided on the optimal high Signal-to-Noise-Ratio (SNR) performance by also considering the optimization of uncoded caching strategies. An extension of this work that accounts for time-varying and possibly unknown file popularity with online caching was described in [13]. Under an unknown dynamic popularity model, the works [14] and [15] presented a Reinforcement Learning (RL) based optimization of online caching by assuming a backhaul mode.

In this paper, we investigate for the first time the online minimization of the long-term delivery latency over X-haul links in an F-RAN with time-varying unknown file popularity. We focus on the joint optimization of linear precoding strategies and the choice between fronthaul and backhaul modes. The backhaul mode enables cache updates at the eRRHs, hence potentially reducing future latencies. In contrast, the fronthaul mode allows cooperative C-RAN transmissions which decrease the current delivery latency [9, 10, 11]. We propose a new model-free RL approach based on a linear value function approximation with properly selected features, and numerical results confirm the effectiveness of the proposed RL scheme.

Notation: $\mathbb{E}\left[\cdot\right]$ and $\text{Pr}\left(\cdot\right)$ stand for expectation and probability, respectively. $\left|\mathcal{A}\right|$ represents the cardinality of set $\mathcal{A}$, and $\mathbb{C}^{m\times n}$ denotes the set of $m\times n$ complex matrices. $\mathbb{I}\left\{c\right\}$ outputs one if condition $c$ is true and zero otherwise. For a matrix $\textbf{X}$, $\left|\textbf{X}\right|$, $\textbf{X}^{T}$, $\textbf{X}^{H}$, $\textbf{X}^{-1}$ and $\text{tr}\left(\textbf{X}\right)$ denote its determinant, transpose, Hermitian transpose, inverse and trace, respectively. $\textbf{I}_{\scriptscriptstyle{m}}$ is the $m\times m$ identity matrix, and $\otimes$ denotes the Kronecker product. Also, $\text{diag}\big{(}\textbf{X}_{\scriptscriptstyle{1}},...,\textbf{X}_{\scriptscriptstyle{N}}\big{)}$ represents the block-diagonal matrix with diagonal blocks $\textbf{X}_{\scriptscriptstyle{1}},...,\textbf{X}_{\scriptscriptstyle{N}}$. Lastly, $\mathcal{CN}\left(\boldsymbol{\mu},\boldsymbol{\Omega}\right)$ indicates a circularly symmetric complex Gaussian distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Omega}$.

II System Model

We study the F-RAN system illustrated in Fig. 1, which consists of a BBU in the cloud, connected to $M$ cache-enabled eRRHs, and $K$ users. Each X-haul link between the BBU and the $m$-th eRRH has capacity $C_{\scriptscriptstyle{m}}^{R}$ bits per symbol and can be operated in either backhaul or fronthaul mode [4][5]. The $k$-th user and the $m$-th eRRH are equipped with $N_{\scriptscriptstyle{k}}^{U}$ and $N_{\scriptscriptstyle{m}}^{R}$ antennas, respectively. We assume a time-slotted operation [15], and the wireless channel matrix $\textbf{H}_{\scriptscriptstyle{mk}}$ between the $m$-th eRRH and the $k$-th user is assumed to be fixed for the given time scale of interest of $T_{\scriptscriptstyle{\text{B}}}$ slots. We also define $\mathcal{F}\triangleq\left\{f_{\scriptscriptstyle{1}},...,f_{\scriptscriptstyle{F}}\right\}$ as the library of $F$ files of $L$ bits each, which may be requested by the users. Finally, we denote by $\mathcal{F}^{R}(t)\subseteq\mathcal{F}$ the subset of files cached at time slot $t$ at the eRRHs, whose cardinality is bounded by $F_{\scriptscriptstyle{\max}}^{R}$ files due to storage capacity constraints. Note that in this letter, we make the simplifying assumption that all the eRRHs store the same files in their respective caches. Generalization of the framework is possible but at the cost of a more cumbersome notation. Detailed request, online caching and delivery models are described in the following.

II-A Request Model and Online Caching

In each time slot $t$ , a subset $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)\subseteq\mathcal{F}$ of files is popular in the sense that all users request files from $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ . Specifically, the $k$ -th user requests a uniformly selected file $f_{\scriptscriptstyle{k}}^{U}(t)$ from subset $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ without replacement [13]. The assumption of no replacement ensures that all requested files are distinct, yielding a worst-case performance analysis [11]. We assume that the popularity $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ varies as a Markov process as in [14, 16, 17, 18]. This is a standard assumption which provides a first-order approximation of the evolution of the content popularity [19][20]. Let $\mathcal{K}_{\scriptscriptstyle{\text{req},\text{C}}}(t)$ and $\mathcal{K}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)$ denote the indices of the users whose requested files $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{C}}}(t)\triangleq\left\{f_{\scriptscriptstyle{k}}^{U}(t)\right\}_{\scriptscriptstyle{k\in\mathcal{K}_{\scriptscriptstyle{\text{req},\text{C}}}(t)}}$ are cached and the indices of users whose requested files $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)\triangleq\left\{f_{\scriptscriptstyle{k}}^{U}(t)\right\}_{\scriptscriptstyle{k\in\mathcal{K}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)}}$ are not cached at time $t$ , respectively. In case the backhaul mode is selected at time slot $t$ , the requested but uncached files in $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)$ are sent on all the X-haul links and cached. In order to make space for a new file, a previously cached file is evicted by following the standard Least Recently Used (LRU) rule [21].
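The cache dynamics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `lru_cache_update`, the file identifiers, and the convention that a request for an already cached file refreshes its recency in either mode are assumptions made here for the example.

```python
from collections import OrderedDict


def lru_cache_update(cache, requested, capacity, backhaul):
    """Return the shared eRRH cache after one time slot (illustrative sketch).

    cache:     OrderedDict of file ids, ordered from least to most recently used.
    requested: list of distinct file ids requested in this slot.
    capacity:  maximum number of cached files (F_max^R in the text), at least 1.
    backhaul:  True if the backhaul mode (a(t) = 1) is selected.
    """
    cache = OrderedDict(cache)  # work on a copy, leave the input untouched
    # First pass: requests for cached files refresh their recency (assumption).
    for f in requested:
        if f in cache:
            cache.move_to_end(f)
    # Second pass: only the backhaul mode brings uncached files into the cache.
    if backhaul:
        for f in requested:
            if f not in cache:
                if len(cache) >= capacity:
                    cache.popitem(last=False)  # evict the least recently used file
                cache[f] = None
    return cache


cache = OrderedDict.fromkeys(["f1", "f2", "f3"])
after_backhaul = lru_cache_update(cache, ["f1", "f4"], capacity=3, backhaul=True)
after_fronthaul = lru_cache_update(cache, ["f1", "f4"], capacity=3, backhaul=False)
```

In this example the backhaul slot evicts `f2` to make room for `f4`, while the fronthaul slot leaves the cached set unchanged and only reorders it.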

II-B Delivery Operation

At each slot $t$ , the X-haul link is used in either fronthaul or backhaul mode for $\Delta^{R}(t,\text{a}(t))$ symbols, where $\text{a}(t)=0$ and $1$ indicate the selection of fronthaul and backhaul modes, respectively. Subsequently, the eRRHs deliver the requested files in set $\mathcal{F}_{\scriptscriptstyle{\text{req}}}(t)\triangleq\mathcal{F}_{\scriptscriptstyle{\text{req},\text{C}}}(t)\cup\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)$ over the wireless channel for $\Delta^{U}(t,\text{a}(t))$ symbols, based on the signals received on the X-haul links and on the cached contents. This results in a total latency of $\Delta(t,\text{a}(t))=\Delta^{R}(t,\text{a}(t))+\Delta^{U}(t,\text{a}(t))$ symbols for time slot $t$ . Note that the eRRHs’ caches are updated according to the caching mechanism described in Section II-A only if the backhaul mode is selected as $\text{a}(t)=1$ .

II-C Problem Formulation

The delivery time $\Delta(t,\text{a}(t))$ at slot $t$ depends on the state of the system $\text{s}(t)=\{\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ , $\mathcal{F}^{R}(t)$ , $\mathcal{F}_{\scriptscriptstyle{\text{req}}}(t)\}$ , which includes the set of popular files, cached files and requested files, respectively. Given the Markovity of the process $\mathcal{F}_{\text{pop}}(t)$ , the state $\text{s}(t)$ evolves as a controlled Markov process. $\text{s}(t)$ is partially observable since the set $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ is unknown, and it is only observed indirectly via the file set $\mathcal{F}_{\scriptscriptstyle{\text{req}}}(t)$ . In particular, at time $t$ , only the history of observations $\text{o}(1\text{:}t)\triangleq\left\{\text{o}(1),...,\text{o}(t)\right\}$ with $\text{o}(t)=\{\mathcal{F}_{\scriptscriptstyle{\text{req}}}(t)$ , $\mathcal{F}^{R}(t)\}$ is available to the system. Thus, a general policy can map the observations $\text{o}(1\text{:}t)$ to the selected action $\text{a}(t)$ through a conditional distribution $\pi(\text{a}(t)|\text{o}(1\text{:}t))$ .

In this work, we aim at minimizing the average long-term delivery latency of the proposed F-RAN system over the selection of policy $\pi(\text{a}(t)|\text{o}(1\text{:}t))$ . Given a forgetting factor $\gamma\leq 1$ , the problem can be formulated as

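$$\text{(P)}:\ \min_{\pi}\ \mathbb{E}_{\pi}\Big[\sum\nolimits_{t=1}^{\infty}\gamma^{t}\Delta(t,\text{a}(t))\Big]\quad\text{s.t.}\quad\text{a}(t)\in\{0,1\},\ \forall t,$$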

where calculation of the total latency $\Delta(t,\text{a}(t))$ will be reviewed in Section III. The expectation in (P) is over the state distribution, which depends on the policy.

III Minimum Instantaneous Latency

In this section, we discuss how to evaluate the delivery latency $\Delta(t,\text{a}(t))$ in problem (P). We emphasize that $\Delta(t,\text{a}(t))$ for $\text{a}(t)=0$ and $1$ is assumed known when solving problem (P) at each time slot $t$ , and is derived as defined in this section. Following [9], we omit the time index $t$ for simplicity.

III-A Backhaul Mode

In the backhaul mode ( $\text{a}=1$ ), the BBU first fetches the requested but uncached files $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}$ and transmits them to the eRRHs. The backhaul transmission to the $m$ -th eRRH takes $\Delta_{\scriptscriptstyle{m}}^{R}=\big{|}\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}\big{|}L/C_{\scriptscriptstyle{m}}^{R}$ symbols, and the total backhaul latency is $\Delta^{R}=\max\nolimits_{m}\Delta_{\scriptscriptstyle{m}}^{R}$ , since all the eRRHs need to receive the files in $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}$ . As a result, all the requested files in $\mathcal{F}_{\scriptscriptstyle{\text{req}}}$ are available at the eRRHs and cooperative transmission across all eRRHs is feasible. Each file $f_{\scriptscriptstyle{k}}^{U}\in\mathcal{F}_{\scriptscriptstyle{\text{req}}}$ for the $k$ -th user is encoded by each eRRH as the signal $\textbf{s}_{\scriptscriptstyle{k}}\in\mathbb{C}^{n_{\scriptscriptstyle{k}}\times 1}\sim\mathcal{CN}\left(\textbf{0},\textbf{I}_{\scriptscriptstyle{n_{\scriptscriptstyle{k}}}}\right)$ , where $n_{\scriptscriptstyle{k}}\leq N_{\scriptscriptstyle{k}}^{U}$ denotes the number of data streams allocated to the $k$ -th user, which is assumed to be a fixed parameter. The transmit signal from the $m$ -th eRRH is then given as $\textbf{x}_{\scriptscriptstyle{m}}=\sum\nolimits_{k\in\mathcal{K}_{\scriptscriptstyle{\text{req}}}}\textbf{G}_{\scriptscriptstyle{mk}}\textbf{s}_{\scriptscriptstyle{k}}$ where $\mathcal{K}_{\scriptscriptstyle{\text{req}}}\triangleq\mathcal{K}_{\scriptscriptstyle{\text{req},\text{C}}}\cup\mathcal{K}_{\scriptscriptstyle{\text{req},\text{NC}}}$ , and $\textbf{G}_{\scriptscriptstyle{mk}}\in\mathbb{C}^{N_{\scriptscriptstyle{m}}^{R}\times n_{\scriptscriptstyle{k}}}$ is the precoding matrix for $\textbf{s}_{\scriptscriptstyle{k}}$ at the $m$ -th eRRH. Accordingly, the achievable rate for the $k$ -th user on the wireless channel can be written as [9]

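$$R_{\text{back},k}^{U}\left(\left\{\textbf{G}_{k}\right\}\right)=\log_{2}\big|\textbf{I}_{N_{k}^{U}}+\boldsymbol{\Phi}_{\text{back},k}^{U}\big|\ \text{[bits/symbol]},$$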

where we have $\boldsymbol{\Phi}_{\scriptscriptstyle{\text{back},k}}^{U}\triangleq\big{(}\sum\nolimits_{\ell\in\mathcal{K}_{\scriptscriptstyle{\text{req}}}\backslash k}\textbf{H}_{\scriptscriptstyle{k}}\textbf{G}_{\scriptscriptstyle{\ell}}\textbf{G}_{\scriptscriptstyle{\ell}}^{H}\textbf{H}_{\scriptscriptstyle{k}}^{H}+\sigma_{\scriptscriptstyle{k}}^{2}\textbf{I}_{\scriptscriptstyle{N_{k}^{U}}}\big{)}^{-1}\textbf{H}_{\scriptscriptstyle{k}}\textbf{G}_{\scriptscriptstyle{k}}\textbf{G}_{\scriptscriptstyle{k}}^{H}\textbf{H}_{\scriptscriptstyle{k}}^{H}$ with $\textbf{H}_{\scriptscriptstyle{k}}\triangleq\big{[}\textbf{H}_{\scriptscriptstyle{1k}}\cdots\textbf{H}_{\scriptscriptstyle{Mk}}\big{]}$ and $\textbf{G}_{\scriptscriptstyle{k}}\triangleq\big{[}\textbf{G}_{\scriptscriptstyle{1k}}^{T}\cdots\textbf{G}_{\scriptscriptstyle{Mk}}^{T}\big{]}^{T}$ , and $\sigma_{\scriptscriptstyle{k}}^{2}$ represents the additive white Gaussian noise variance at the $k$ -th user.

The latency $\Delta_{\scriptscriptstyle{k}}^{U}$ for delivering file $f_{\scriptscriptstyle{k}}^{U}$ for the $k$ -th user is obtained as $\Delta_{\scriptscriptstyle{k}}^{U}=L/R_{\scriptscriptstyle{\text{back},k}}^{U}\left(\left\{\textbf{G}_{\scriptscriptstyle{k}}\right\}\right)$ , and the overall wireless channel latency equals $\Delta^{U}=\max\nolimits_{k}\Delta_{\scriptscriptstyle{k}}^{U}$ , since every requesting user needs to receive the requested file. The minimum instantaneous latency $\Delta$ for $\text{a}=1$ can hence be found as a solution of the problem

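$$\begin{aligned}\text{(P1)}:\ \min_{\Delta^{U},\{\textbf{G}_{k}\}}\ &\Delta^{R}+\Delta^{U}\\ \text{s.t.}\ &\Delta^{U}\geq L/R_{\text{back},k}^{U}\left(\left\{\textbf{G}_{k}\right\}\right),\ \forall k\in\mathcal{K}_{\text{req}},\\ &\text{tr}\big(\sum\nolimits_{k\in\mathcal{K}_{\text{req}}}\textbf{E}_{m}\textbf{G}_{k}\textbf{G}_{k}^{H}\textbf{E}_{m}^{H}\big)\leq P_{m}^{R},\ m=1,...,M,\end{aligned}$$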

where $P_{\scriptscriptstyle{m}}^{R}$ denotes the maximum transmit power of the $m$ -th eRRH, and we define $\textbf{E}_{\scriptscriptstyle{m}}\triangleq\big{[}\textbf{0}\cdots\textbf{I}_{\scriptscriptstyle{N_{m}^{R}}}\cdots\textbf{0}\big{]}$ in which an identity matrix $\textbf{I}_{\scriptscriptstyle{N_{m}^{R}}}$ spans columns from $\sum\nolimits_{\ell=1}^{m-1}N_{\scriptscriptstyle{\ell}}^{R}+1$ to $\sum\nolimits_{\ell=1}^{m}N_{\scriptscriptstyle{\ell}}^{R}$ . Although problem (P1) is jointly non-convex, a stationary point can be attained by leveraging Successive Convex Approximation (SCA) as detailed in [9].
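To make the backhaul latency expressions concrete, the sketch below evaluates $\Delta^{R}$ and $\Delta^{U}$ for given precoders in the single-antenna special case $N_{m}^{R}=N_{k}^{U}=1$, where the log-determinant rate reduces to $\log_{2}(1+\text{SINR}_{k})$. It does not solve (P1): the precoders are taken as inputs, and the function name and argument layout are assumptions made for illustration.

```python
import math


def backhaul_latency(num_uncached, L, capacities, H, G, noise_var):
    """Evaluate (Delta^R, Delta^U) in backhaul mode for fixed precoders.

    Single-antenna sketch: H[k][m] is the complex channel from eRRH m to
    user k, and G[m][k] is the precoding coefficient of eRRH m for user k.
    """
    M, K = len(capacities), len(H)
    # Every eRRH must receive all uncached requested files over its X-haul link.
    delta_R = max(num_uncached * L / C for C in capacities)

    def gain(k, l):
        # Received power at user k from the stream intended for user l.
        effective = sum(H[k][m] * G[m][l] for m in range(M))
        return effective.real ** 2 + effective.imag ** 2

    rates = []
    for k in range(K):
        interference = sum(gain(k, l) for l in range(K) if l != k)
        sinr = gain(k, k) / (interference + noise_var[k])
        rates.append(math.log2(1.0 + sinr))  # bits per symbol
    # The slot ends when the slowest user has received its L-bit file.
    delta_U = max(L / r for r in rates)
    return delta_R, delta_U


# Toy example: one eRRH, one user, unit channel, unit precoder, unit noise.
delta_R, delta_U = backhaul_latency(
    num_uncached=1, L=10, capacities=[0.5],
    H=[[1 + 0j]], G=[[1 + 0j]], noise_var=[1.0],
)
```

With these toy values the SINR equals one, so the rate is one bit per symbol, giving $\Delta^{U}=10$ symbols and $\Delta^{R}=10/0.5=20$ symbols.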

III-B Fronthaul Mode

Under the fronthaul mode, any requested but uncached file $f_{\scriptscriptstyle{k}}^{U}\in\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}$ for the $k$-th user is jointly encoded and precoded at the BBU. The resulting signal dedicated to the $m$-th eRRH is written as $\hat{\textbf{x}}_{\scriptscriptstyle{m}}=\sum\nolimits_{k\in\mathcal{K}_{\scriptscriptstyle{\text{req},\text{NC}}}}\textbf{W}_{\scriptscriptstyle{mk}}\textbf{s}_{\scriptscriptstyle{k}}$, where $\textbf{s}_{\scriptscriptstyle{k}}\in\mathbb{C}^{n_{\scriptscriptstyle{k}}\times 1}\sim\mathcal{CN}\left(\textbf{0},\textbf{I}_{\scriptscriptstyle{n_{\scriptscriptstyle{k}}}}\right)$ encodes file $f_{\scriptscriptstyle{k}}^{U}$, and $\textbf{W}_{\scriptscriptstyle{mk}}\in\mathbb{C}^{N_{\scriptscriptstyle{m}}^{R}\times n_{\scriptscriptstyle{k}}}$ represents the corresponding precoding matrix for the $m$-th eRRH. The BBU then compresses $\hat{\textbf{x}}_{\scriptscriptstyle{m}}$ prior to transferring it to the eRRHs. As a result, the decompressed signal at the $m$-th eRRH can be written as $\tilde{\textbf{x}}_{\scriptscriptstyle{m}}=\hat{\textbf{x}}_{\scriptscriptstyle{m}}+\textbf{q}_{\scriptscriptstyle{m}}$ with quantization noise $\textbf{q}_{\scriptscriptstyle{m}}\in\mathbb{C}^{N_{\scriptscriptstyle{m}}^{R}\times 1}\sim\mathcal{CN}\left(\textbf{0},\boldsymbol{\Omega}_{\scriptscriptstyle{m}}\right)$ for a given covariance matrix $\boldsymbol{\Omega}_{\scriptscriptstyle{m}}$ [9][10].

The rest of the requested cached files $\mathcal{F}_{\scriptscriptstyle{\text{req},\text{C}}}$ are locally precoded with $\left\{\textbf{G}_{\scriptscriptstyle{mk}}\right\}$ at the eRRHs in the same manner as in the backhaul mode. The final transmit signal at the $m$ -th eRRH is then given as $\textbf{x}_{\scriptscriptstyle{m}}=\sum\nolimits_{k\in\mathcal{K}_{\scriptscriptstyle{\text{req},\text{C}}}}\textbf{G}_{\scriptscriptstyle{mk}}\textbf{s}_{\scriptscriptstyle{k}}+\tilde{\textbf{x}}_{\scriptscriptstyle{m}}$ , and the achievable rate for the $k$ -th user can be obtained as [9]

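$$R_{\text{front},k}^{U}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)=\log_{2}\big|\textbf{I}_{N_{k}^{U}}+\boldsymbol{\Phi}_{\text{front},k}^{U}\big|\ \text{[bits/symbol]},$$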

where we have $\boldsymbol{\Phi}_{\scriptscriptstyle{\text{front},k}}^{U}\triangleq\big{(}\sum\nolimits_{\ell\in\mathcal{K}_{\scriptscriptstyle{\text{req}}}\backslash k}\textbf{H}_{\scriptscriptstyle{k}}\tilde{\textbf{G}}_{\scriptscriptstyle{\ell}}\tilde{\textbf{G}}_{\scriptscriptstyle{\ell}}^{H}\textbf{H}_{\scriptscriptstyle{k}}^{H}+\textbf{H}_{\scriptscriptstyle{k}}\boldsymbol{\Omega}_{\scriptscriptstyle{R}}\textbf{H}_{\scriptscriptstyle{k}}^{H}+\sigma_{\scriptscriptstyle{k}}^{2}\textbf{I}_{\scriptscriptstyle{N_{k}^{U}}}\big{)}^{-1}\textbf{H}_{\scriptscriptstyle{k}}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}^{H}\textbf{H}_{\scriptscriptstyle{k}}^{H}$, $\boldsymbol{\Omega}_{\scriptscriptstyle{R}}\triangleq\text{diag}\big{(}\boldsymbol{\Omega}_{\scriptscriptstyle{1}},...,\boldsymbol{\Omega}_{\scriptscriptstyle{M}}\big{)}$, and $\tilde{\textbf{G}}_{\scriptscriptstyle{k}}\triangleq\big{[}\tilde{\textbf{G}}_{\scriptscriptstyle{1k}}^{T}\cdots\tilde{\textbf{G}}_{\scriptscriptstyle{Mk}}^{T}\big{]}^{T}$ with $\tilde{\textbf{G}}_{\scriptscriptstyle{mk}}\triangleq b_{\scriptscriptstyle{k}}^{U}\textbf{G}_{\scriptscriptstyle{mk}}+\left(1-b_{\scriptscriptstyle{k}}^{U}\right)\textbf{W}_{\scriptscriptstyle{mk}}$. Here, for the $k$-th user, $b_{\scriptscriptstyle{k}}^{U}=1$ if the requested file is cached, i.e., $f_{\scriptscriptstyle{k}}^{U}\in\mathcal{F}_{\scriptscriptstyle{\text{req},\text{C}}}$, and $b_{\scriptscriptstyle{k}}^{U}=0$ otherwise.

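The wireless channel latency $\Delta^{U}$ is defined in the same way as in the backhaul mode. For the fronthaul latency, by rate-distortion theory, sending quantized signals to the $m$-th eRRH consumes

$$g_{m}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)=\log_{2}\big|\textbf{I}_{N_{m}^{R}}+\boldsymbol{\Phi}_{m}^{R}\big|\ \text{[bits/symbol]},$$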

with $\boldsymbol{\Phi}_{\scriptscriptstyle{m}}^{R}\triangleq\big{(}\textbf{E}_{\scriptscriptstyle{m}}\boldsymbol{\Omega}_{\scriptscriptstyle{R}}\textbf{E}_{\scriptscriptstyle{m}}^{H}\big{)}^{-1}\sum\nolimits_{k\in\mathcal{K}_{\scriptscriptstyle{\text{req},\text{NC}}}}\textbf{E}_{\scriptscriptstyle{m}}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}^{H}\textbf{E}_{\scriptscriptstyle{m}}^{H}$ [9]. Compressing $\Delta^{U}$ symbols produces $\Delta^{U}g_{\scriptscriptstyle{m}}\big{(}\big{\{}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}\big{\}},\boldsymbol{\Omega}_{\scriptscriptstyle{R}}\big{)}$ bits, which need to be transferred from the BBU to the $m$ -th eRRH. Therefore, the fronthaul latency is given by $\Delta^{R}=\max\nolimits_{m}\Delta_{\scriptscriptstyle{m}}^{R}$ where $\Delta_{\scriptscriptstyle{m}}^{R}=\Delta^{U}g_{\scriptscriptstyle{m}}\big{(}\big{\{}\tilde{\textbf{G}}_{\scriptscriptstyle{k}}\big{\}},\boldsymbol{\Omega}_{\scriptscriptstyle{R}}\big{)}/C_{\scriptscriptstyle{m}}^{R}$ , and the minimum instantaneous latency $\Delta$ for $\text{a}=0$ is calculated as a solution of the problem

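$$\begin{aligned}\text{(P2)}:\ \min_{\Delta^{R},\Delta^{U},\{\tilde{\textbf{G}}_{k}\},\boldsymbol{\Omega}_{R}}\ &\Delta^{R}+\Delta^{U}\\ \text{s.t.}\ &\Delta^{R}\geq\Delta^{U}g_{m}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big)/C_{m}^{R},\ m=1,...,M,\\ &\Delta^{U}\geq L/R_{\text{front},k}^{U}\big(\big\{\tilde{\textbf{G}}_{k}\big\},\boldsymbol{\Omega}_{R}\big),\ \forall k\in\mathcal{K}_{\text{req}},\\ &\text{tr}\big(\sum\nolimits_{k\in\mathcal{K}_{\text{req}}}\textbf{E}_{m}\tilde{\textbf{G}}_{k}\tilde{\textbf{G}}_{k}^{H}\textbf{E}_{m}^{H}+\textbf{E}_{m}\boldsymbol{\Omega}_{R}\textbf{E}_{m}^{H}\big)\leq P_{m}^{R},\ m=1,...,M,\end{aligned}$$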

which can be tackled via the SCA approach detailed in [9]. The total worst-case order of complexity for the SCA method can be expressed as $\mathcal{O}(N_{\scriptscriptstyle{\text{SCA}}}\sqrt{N_{\scriptscriptstyle{\text{const}}}}\log(N_{\scriptscriptstyle{\text{const}}}/\epsilon))$ where $\epsilon$ , $N_{\scriptscriptstyle{\text{SCA}}}$ and $N_{\scriptscriptstyle{\text{const}}}$ indicate the desired error tolerance, the maximum number of the SCA iterations and the number of constraints, respectively [22]. Here, $N_{\scriptscriptstyle{\text{const}}}$ equals $\left|\mathcal{K}_{\scriptscriptstyle{\text{req}}}\right|+M$ in (P1) and $\left|\mathcal{K}_{\scriptscriptstyle{\text{req}}}\right|+2M$ in (P2).
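The fronthaul latency can be illustrated in the same single-antenna special case ($N_{m}^{R}=1$), where the compression rate reduces to $g_{m}=\log_{2}\big(1+\sum_{k\in\mathcal{K}_{\text{req},\text{NC}}}|W_{mk}|^{2}/\Omega_{m}\big)$. The sketch below only evaluates $\Delta^{R}$ for given precoders and quantization noise variances; it does not perform the SCA optimization of (P2), and the function name and argument layout are assumptions made for illustration.

```python
import math


def fronthaul_latency(delta_U, W, quant_var, capacities):
    """Evaluate Delta^R in fronthaul mode for fixed BBU precoders.

    Single-antenna sketch: W[m][k] is the BBU precoding coefficient of
    eRRH m for the k-th user with an uncached request, quant_var[m] is the
    quantization noise variance Omega_m, and capacities[m] is C_m^R.
    """
    latencies = []
    for m, C in enumerate(capacities):
        # Power of the precoded signal destined for eRRH m before compression.
        signal_power = sum(w.real ** 2 + w.imag ** 2 for w in W[m])
        # Compression rate in bits per baseband symbol.
        g_m = math.log2(1.0 + signal_power / quant_var[m])
        # delta_U symbols are compressed into delta_U * g_m bits.
        latencies.append(delta_U * g_m / C)
    return max(latencies)


# Toy example: two eRRHs, two users with uncached requests.
delta_R_front = fronthaul_latency(
    delta_U=10.0,
    W=[[1 + 0j, 0j], [1 + 0j, 1 + 1j]],
    quant_var=[1.0, 1.0],
    capacities=[0.5, 2.0],
)
```

Here the first eRRH needs one bit per symbol over a slow link (20 symbols) and the second needs two bits per symbol over a faster link (10 symbols), so the slower link determines the fronthaul latency.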

IV RL-Based X-Haul Online Optimization

In this section, we solve problem (P) by proposing an online on-policy RL-based optimization strategy [23].

IV-A Problem (P) as a Partially Observable Decision Process

As discussed in Section II, problem (P) is a Partially Observable Markov Decision Process (POMDP) with the action space $\left\{0,1\right\}$ and the instantaneous reward given by the negative latency $\text{r}(t+1)=-\Delta(t,\text{a}(t))$. In order to reduce the complexity of the policy, we optimize here over memoryless policies that select an action $\text{a}(t)$ based only on the latest observation $\text{o}(t)$ at time slot $t$ [24][25], as well as on a summary of the previous observations $\text{o}(1\text{:}t)$ given by the set $\{\tau_{\scriptscriptstyle{\text{req},f}}(t)\}_{\scriptscriptstyle{f\in\mathcal{F}^{R}(t)}}$, where $\tau_{\scriptscriptstyle{\text{req},f}}(t)$ is, as of time slot $t$, the most recent time slot at which cached file $f$ was requested.

IV-B SARSA with Linear Value Function Approximation

To optimize over memoryless policies, we adopt the online on-policy value-based strategy State-Action-Reward-State-Action (SARSA) with a carefully designed linear approximation [23]. SARSA updates an action-value function, or Q-function, $q\left(o,a\right)$ that estimates the expected return $\mathbb{E}[G(t)|\text{o}=o,\text{a}=a]$ with $G(t)\triangleq\sum\nolimits_{\tau=0}^{\infty}\gamma^{\tau}\text{r}(t+\tau+1)$. Since the total size of the observation space in (P) grows exponentially with $F$, we propose a linear value function approximation $\hat{q}\left(o,a,\textbf{w}\right)\triangleq\textbf{w}^{T}\textbf{x}\left(o,a\right)$, where $\textbf{w}$ is a parameter vector to be learned, and $\textbf{x}\left(o,a\right)$ denotes a feature vector representing the observation-action pair $\left(o,a\right)$ [23].

In order to determine a suitable feature vector, we first note that vector $\textbf{x}\left(o,a\right)$ should contain sufficient information to quantify the value of caching for currently cached and requested files. Frequently requested files typically yield lower future latencies when cached, but an optimal choice should account not only for their popularity but also for their remaining lifetime, that is, the duration for which a file remains popular (see Sec. II of [26] for further discussion).

Based on these considerations, we introduce a variable $\phi_{\scriptscriptstyle{\ell}}(t)$ for every file $f_{\scriptscriptstyle{\ell}}\in\mathcal{F}$ as a function of the current observation $\text{o}(t)$ at time slot $t$. We set it as $\phi_{\scriptscriptstyle{\ell}}(t)=1$ if $f_{\scriptscriptstyle{\ell}}\in\mathcal{F}_{\scriptscriptstyle{\text{req},\text{NC}}}(t)$, $\phi_{\scriptscriptstyle{\ell}}(t)=2$ if $f_{\scriptscriptstyle{\ell}}\in\mathcal{F}^{R}(t)$ and $\phi_{\scriptscriptstyle{\ell}}(t)=0$ otherwise. Furthermore, we also include a variable $\theta(t)\triangleq\max\nolimits_{f\in\mathcal{F}^{R}(t)}\big{(}t-\tau_{\scriptscriptstyle{\text{req},f}}\big{)}$ that measures the “age” of the currently cached files, that is, the maximum time elapsed since the last request of the cached files. We can quantize this variable by $N_{\scriptscriptstyle{\Theta}}$ ranges $\Theta_{\scriptscriptstyle{1}},...,\Theta_{\scriptscriptstyle{N_{\scriptscriptstyle{\Theta}}}}\subseteq\mathbb{R}^{+}$ with $\Theta_{\scriptscriptstyle{i}}\cap\Theta_{\scriptscriptstyle{j}}=\emptyset$ for all $i\neq j$ and $\bigcup\Theta_{\scriptscriptstyle{i}}=\mathbb{R}^{+}$. If the caches are up to date, the quantity $t-\tau_{\scriptscriptstyle{\text{req},f}}$ is small for all $f\in\mathcal{F}^{R}(t)$, and hence $\theta(t)$ is also small. Otherwise, if there exists any file $f\in\mathcal{F}^{R}(t)$ with large $t-\tau_{\scriptscriptstyle{\text{req},f}}$, a refresh of the caches may be required.

Using the variables introduced above, we define the feature vector $\textbf{x}\left(o(t),a(t)\right)$ as

[TABLE]

where we have used the one-hot encoded vectors $\boldsymbol{\phi}_{\scriptscriptstyle{\ell}}(t)\triangleq[\mathbb{I}\{\phi_{\scriptscriptstyle{\ell}}(t)=1\}\ \mathbb{I}\{\phi_{\scriptscriptstyle{\ell}}(t)=2\}\ \mathbb{I}\{\phi_{\scriptscriptstyle{\ell}}(t)=0\}]^{T}$, $\boldsymbol{\theta}(t)\triangleq[\mathbb{I}\{\theta(t)\in\Theta_{\scriptscriptstyle{1}}\}\ \cdots\ \mathbb{I}\{\theta(t)\in\Theta_{\scriptscriptstyle{N_{\Theta}}}\}]^{T}$ and $\textbf{a}(t)\triangleq[\mathbb{I}\{\text{a}(t)=0\}\ \mathbb{I}\{\text{a}(t)=1\}]^{T}$. The feature vector $\textbf{x}\left(\text{o}(t),\text{a}(t)\right)$ in (7) has dimension $2(N_{\scriptscriptstyle{\Theta}}+3F)$, which increases only linearly in $F$ and is hence significantly smaller than the look-up table required by conventional tabular SARSA, whose size grows exponentially with $F$. The effectiveness of the proposed feature vector $\textbf{x}\left(\text{o}(t),\text{a}(t)\right)$ is verified in Section V.
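
The following Python sketch shows one way to assemble a feature vector from these one-hot encodings. It assumes that the observation part stacks $\boldsymbol{\phi}_{\scriptscriptstyle{1}}(t),\ldots,\boldsymbol{\phi}_{\scriptscriptstyle{F}}(t)$ and $\boldsymbol{\theta}(t)$, and that this stack is replicated once per action, as in a Kronecker product with $\textbf{a}(t)$. This structure and the ordering of the entries are assumptions chosen to be consistent with the stated dimension $2(N_{\scriptscriptstyle{\Theta}}+3F)$; the helper names are hypothetical.

```python
def one_hot(index, size):
    """Length-`size` list with a single 1 at position `index`."""
    v = [0] * size
    v[index] = 1
    return v


def feature_vector(phi, theta_bin, num_bins, action):
    """Build x(o, a) under the assumed Kronecker structure.

    phi:       list of file statuses in {0, 1, 2}, one per file
    theta_bin: 0-based index of the range Theta_i containing theta(t)
    num_bins:  N_Theta
    action:    0 or 1
    """
    # Entry order inside phi_l follows the text: [I{phi=1}, I{phi=2}, I{phi=0}].
    position = {1: 0, 2: 1, 0: 2}
    obs = []
    for status in phi:
        obs.extend(one_hot(position[status], 3))
    obs.extend(one_hot(theta_bin, num_bins))

    # Assumed structure: the observation block is copied into the slot
    # of the selected action, and the other slot is left at zero.
    x = [0] * (2 * len(obs))
    start = action * len(obs)
    x[start:start + len(obs)] = obs
    return x


# Dimension check for F = 20 files and N_Theta = 11 ranges.
x_example = feature_vector([0] * 20, theta_bin=3, num_bins=11, action=1)
```

With $F=20$ and $N_{\scriptscriptstyle{\Theta}}=11$, the vector has $2(11+60)=142$ entries, of which exactly $F+1=21$ are non-zero.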

The overall proposed procedure for solving (P) is summarized in Algorithm 1, where $\delta(t,\textbf{w})\triangleq\text{r}(t+1)+\gamma\hat{q}\left(\text{o}(t+1),\text{a}(t+1),\textbf{w}\right)-\hat{q}\left(\text{o}(t),\text{a}(t),\textbf{w}\right)$ denotes the temporal difference error, and $\textbf{E}$ indicates the eligibility trace. Here, an $\epsilon$-greedy exploration strategy with decreasing $\epsilon$ is adopted. Note that $\textbf{E}$ is used to assign credit for the current reward to the recently and frequently visited states and selected actions, so as to enable online learning (see [23] for details).
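
A single iteration of the update in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, feature vectors are plain lists, and the exploration probability and step size are passed in by the caller (Algorithm 1 uses $\epsilon=\beta=1/n_{\text{epi}}$).

```python
import random


def q_hat(w, x):
    """Linear action-value estimate w^T x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def epsilon_greedy(w, features, epsilon, rng=random):
    """Pick an action; `features` maps each action to its vector x(o, a)."""
    actions = sorted(features)
    if rng.random() < epsilon:
        return rng.choice(actions)          # explore uniformly
    return max(actions, key=lambda a: q_hat(w, features[a]))  # exploit


def sarsa_lambda_step(w, e, x, reward, x_next, gamma, lam, beta):
    """One SARSA(lambda) update with an accumulating eligibility trace.

    E <- gamma * lambda * E + x(o, a)
    delta = r + gamma * q(o', a') - q(o, a)
    w <- w + beta * delta * E
    """
    e = [gamma * lam * ei + xi for ei, xi in zip(e, x)]
    delta = reward + gamma * q_hat(w, x_next) - q_hat(w, x)
    w = [wi + beta * delta * ei for wi, ei in zip(w, e)]
    return w, e, delta


# Two toy updates with 2-dimensional features and rewards -2 and -1.
w1, e1, d1 = sarsa_lambda_step([0.0, 0.0], [0.0, 0.0], [1, 0], -2.0, [0, 1],
                               gamma=1.0, lam=0.5, beta=1.0)
w2, e2, d2 = sarsa_lambda_step(w1, e1, [0, 1], -1.0, [1, 0],
                               gamma=1.0, lam=0.5, beta=1.0)
```

In the second update the trace still holds a decayed weight of 0.5 on the first feature, so part of the new temporal difference error is credited to the earlier observation-action pair.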

V Numerical Results

In this section, the performance of the proposed RL-based algorithm is evaluated via numerical examples. We adopt the channel model $\textbf{H}_{\scriptscriptstyle{mk}}=\sqrt{\rho_{\scriptscriptstyle{mk}}}\hat{\textbf{H}}_{\scriptscriptstyle{mk}}$, where $\rho_{\scriptscriptstyle{mk}}\triangleq\rho_{\scriptscriptstyle{0}}\big(\frac{d_{\scriptscriptstyle{mk}}}{d_{\scriptscriptstyle{0}}}\big)^{-\eta}$ equals the distance-dependent path loss between eRRH $R_{\scriptscriptstyle{m}}$ and user $U_{\scriptscriptstyle{k}}$, $\rho_{\scriptscriptstyle{0}}$ indicates the path loss at the reference distance $d_{\scriptscriptstyle{0}}$, $\eta$ is the path loss exponent, and $d_{\scriptscriptstyle{mk}}$ represents the distance between the $m$-th eRRH and the $k$-th user. Each element of $\hat{\textbf{H}}_{\scriptscriptstyle{mk}}$ follows an independent complex Gaussian distribution with zero mean and unit variance. The eRRHs and the users are placed on circles centered at the BBU, with uniformly distributed angles and radii $d_{\scriptscriptstyle{BR}}=200$ m and $d_{\scriptscriptstyle{BU}}=400$ m, respectively. The bandwidth is $20$ MHz and the thermal noise power spectral density is $-170$ dBm/Hz. We set $K=10$, $M=3$, $\rho_{\scriptscriptstyle{0}}=10^{-3}$, $d_{\scriptscriptstyle{0}}=1$ m, $\eta=3.5$, $T_{\scriptscriptstyle{\text{B}}}=100$ time slots, $F_{\scriptscriptstyle{\max}}^{R}=4$ files, $P_{\scriptscriptstyle{m}}^{R}=30$ dBm, $N_{\scriptscriptstyle{m}}^{R}=N_{\scriptscriptstyle{k}}^{U}=1$ and $C_{\scriptscriptstyle{m}}^{R}=0.1$ bits per symbol. For RL, we use the hyperparameters $\gamma=1$, $\lambda=0.5$, and $\Theta_{\scriptscriptstyle{\ell}}=[2(\ell-1),\min(2(\ell-1)+1,\theta_{\scriptscriptstyle{\max}})]$ with $N_{\scriptscriptstyle{\Theta}}=11$, where $\theta_{\scriptscriptstyle{\max}}=20$ limits the maximum value of $\theta(t)$.
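
The channel model and the noise level implied by these parameters can be reproduced with the short Python sketch below. It covers only the single-antenna case $N_{\scriptscriptstyle{m}}^{R}=N_{\scriptscriptstyle{k}}^{U}=1$, in which $\hat{\textbf{H}}_{\scriptscriptstyle{mk}}$ reduces to a scalar; the function names and the polar-coordinate placement helper are hypothetical choices for this example.

```python
import math
import random


def path_loss(d, rho0=1e-3, d0=1.0, eta=3.5):
    """rho = rho0 * (d / d0)^(-eta), in linear scale."""
    return rho0 * (d / d0) ** (-eta)


def distance(r1, angle1, r2, angle2):
    """Distance between two points given in polar coordinates around the BBU."""
    return math.sqrt(r1 ** 2 + r2 ** 2
                     - 2.0 * r1 * r2 * math.cos(angle1 - angle2))


def draw_channel(d, rng=random):
    """Scalar channel sqrt(rho) * h_hat with h_hat ~ CN(0, 1)."""
    s = math.sqrt(0.5)  # per-component standard deviation for unit variance
    h_hat = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    return math.sqrt(path_loss(d)) * h_hat


def noise_power_dbm(psd_dbm_per_hz=-170.0, bandwidth_hz=20e6):
    """Total thermal noise power over the bandwidth, in dBm."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)


# eRRH at radius 200 m and user at radius 400 m, both at angle 0.
d_example = distance(200.0, 0.0, 400.0, 0.0)
loss_db_example = 10.0 * math.log10(path_loss(d_example))
noise_example = noise_power_dbm()
```

For an eRRH and a user on the same ray from the BBU, the separation is 200 m and the path loss is about $-110.5$ dB, while the noise power over 20 MHz is about $-97$ dBm.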

Reference [26] demonstrated that the popularity of files often exhibits temporal locality, in the sense that content is frequently requested in a bursty fashion for a certain lifetime. Motivated by these findings, we model the evolution of the subset $\mathcal{F}_{\scriptscriptstyle{\text{pop}}}(t)$ of popular files such that a currently unpopular file $f$ becomes popular with probability $\text{P}_{\scriptscriptstyle{\text{pop},f}}$, and then remains popular for $T_{\scriptscriptstyle{\text{life},f}}$ time slots. We assume the Zipf distribution [27], i.e., $\text{P}_{\scriptscriptstyle{\text{pop},f_{\ell}}}=\ell^{-\xi}/\sum\nolimits_{\nu=1}^{F}\nu^{-\xi}$ with $\xi=1$. The proposed RL scheme is compared with a greedy fronthaul/backhaul mode selection that minimizes the current delivery latency at each time slot, as well as with an offline scheme that keeps the $F_{\scriptscriptstyle{\max}}^{R}$ most popular files, i.e., those with the largest $\text{P}_{\scriptscriptstyle{\text{pop},f}}$, under the idealized assumption that these probabilities are known a priori.
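
The popularity model can be simulated along the lines of the following Python sketch. The Zipf probabilities follow the expression above; the slot-level bookkeeping (a per-file counter of remaining popular slots that is decremented once per slot) is an assumption made for illustration, and the function names are hypothetical.

```python
import random


def zipf_probabilities(num_files, xi=1.0):
    """P_pop for files ranked 1..F: l^(-xi) / sum_nu nu^(-xi)."""
    weights = [rank ** (-xi) for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def step_popularity(remaining_life, p_pop, lifetime, rng=random):
    """Advance the popular set by one time slot.

    remaining_life[f] > 0 means file f is currently popular.
    An unpopular file becomes popular with probability p_pop[f] and then
    stays popular for `lifetime` slots (assumed counting convention).
    """
    updated = []
    for life, p in zip(remaining_life, p_pop):
        if life > 0:
            updated.append(life - 1)      # still popular, one slot consumed
        elif rng.random() < p:
            updated.append(lifetime)      # newly popular
        else:
            updated.append(0)             # remains unpopular
    return updated


# F = 20 files, xi = 1, lifetime of 10 slots, simulated for 100 slots.
p_example = zipf_probabilities(20)
state = [0] * 20
rng = random.Random(0)
for _ in range(100):
    state = step_popularity(state, p_example, lifetime=10, rng=rng)
popular_set = [f for f, life in enumerate(state) if life > 0]
```

With $\xi=1$, the top-ranked file is twice as likely to become popular in a given slot as the second-ranked one.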

Fig. 2 compares the average long-term latency performance as a function of the eRRHs’ cache size $F_{\scriptscriptstyle{\max}}^{R}$ for $P_{\scriptscriptstyle{m}}^{R}=30$ dBm, $T_{\scriptscriptstyle{\text{life},f}}=10$ and $F=20$. We also limit the number of SCA iterations for solving (P1) and (P2) to $N_{\scriptscriptstyle{\text{SCA}}}=10$. Note that the convergence of SCA to a stationary point does not affect the convergence of SARSA, since we treat the reward function $-\Delta(t,\text{a}(t))$ as fixed. With $F_{\scriptscriptstyle{\max}}^{R}\leq 4$, the fronthaul mode is seen to yield a lower latency than the backhaul mode, given the limited advantage of caching in this regime. The opposite is true when the eRRHs have larger caches, i.e., $F_{\scriptscriptstyle{\max}}^{R}>4$, where the backhaul mode outperforms the fronthaul mode. In agreement with the results in [9, 10, 11] and [13], the greedy scheme almost always selects the fronthaul mode and is hence strongly suboptimal for large enough $F_{\scriptscriptstyle{\max}}^{R}$. The proposed RL method exhibits the lowest latency among all schemes that do not assume knowledge of the popularity probabilities. It can be checked that the gain is not obtained by statically selecting the best mode at each time instant, but rather by carrying out an optimized dynamic selection. It is also observed that, in the large-$F_{\scriptscriptstyle{\max}}^{R}$ regime, the proposed strategy can outperform the static offline scheme, which assumes the popularity dynamics to be known in advance.

VI Conclusions

In this paper, we have demonstrated the advantage of adaptively selecting between the backhaul and fronthaul transfer modes as a function of the current cache contents and the history of past requests in an F-RAN system. The proposed RL-based strategy has been shown via numerical results to outperform baseline schemes, confirming the potential advantages of an X-haul implementation over static fronthaul or backhaul deployments.

Bibliography


[1] Y.-J. Ku, D.-Y. Lin, C.-F. Lee, P.-J. Hsieh, H.-Y. Wei, C.-T. Chou, and A.-C. Pang, “5G radio access network design with the fog paradigm: confluence of communications and computing,” IEEE Commun. Mag., vol. 55, pp. 46–52, Apr. 2017.
[2] Y.-Y. Shih, W.-H. Chung, A.-C. Pang, T.-C. Chiu, and H.-Y. Wei, “Enabling low-latency applications in fog-radio access networks,” IEEE Netw., vol. 31, pp. 52–58, Jan. 2017.
[3] A. D. L. Oliva, X. C. Pérez, A. Azcorra, A. D. Giglio, F. Cavaliere, D. Tiegelbekkers, J. Lessmann, T. Haustein, A. Mourad, and P. Iovanna, “Xhaul: toward an integrated fronthaul/backhaul architecture in 5G networks,” IEEE Wireless Commun., vol. 22, pp. 32–40, Oct. 2015.
[4] T. Pfeiffer, “Next generation mobile fronthaul and midhaul architectures,” J. Opt. Commun. Netw., vol. 7, pp. 38–45, Nov. 2015.
[5] N. J. Gomes, P. Chanclou, P. Turnbull, A. Magee, and V. Jungnickel, “Fronthaul evolution: From CPRI to Ethernet,” Opt. Fiber Technol., vol. 26, pp. 50–58, Dec. 2015.
[6] H. Ren, N. Liu, C. Pan, M. Elkashlan, A. Nallanathan, X. You, and L. Hanzo, “Low-latency C-RAN: an next-generation wireless approach,” IEEE Veh. Technol. Mag., vol. 13, pp. 48–56, Jun. 2018.
[7] J. Kim, H. Lee, S.-H. Park, and I. Lee, “Minimum rate maximization for wireless powered cloud radio access networks,” IEEE Trans. Veh. Technol., vol. 68, pp. 1045–1049, Jan. 2019.
[8] J. Kim, S.-H. Park, O. Simeone, I. Lee, and S. Shamai (Shitz), “Joint design of fronthauling and hybrid beamforming for downlink C-RAN systems,” accepted for IEEE Trans. Commun.