Asynchronous Coded Caching with Uncoded Prefetching
Hooshang Ghasemi, Aditya Ramamoorthy

TL;DR
This paper extends coded caching techniques to asynchronous user requests with deadlines, proposing optimization methods for offline and online scenarios, and demonstrating that caching benefits persist despite request timing variations.
Contribution
It introduces a new formulation for asynchronous coded caching with deadlines, including a linear programming approach for the offline case and a heuristic for the online case.
Findings
Offline LP minimizes rate while meeting deadlines.
Heuristic achieves high probability of meeting user deadlines.
Coded caching benefits remain under mild asynchronism.
| Variable | Description |
|---|---|
| | arrival time of user |
| | deadline of user |
| | number of time intervals |
| | time interval |
| | set of the active users in time interval |
| | set of the indices of missing subfiles of user |
| | set of all subsets of that are user groups |
| | set of the indices so that |
| | set of all user groups that can be transmitted within |
| | portion of allocated to user group |
| | portion of transmitted within user group |
| | set of the indices of that can be transmitted within |
This work was supported in part by the National Science Foundation (NSF) under grants CCF-1718470 and CCF-1910840. The material in this work has appeared in part at the 2017 IEEE International Symposium on Information Theory and the 2017 Asilomar Conference on Signals, Systems and Computers. Hooshang Ghasemi was with Iowa State University; he is now with Qualcomm Inc. Aditya Ramamoorthy is with Iowa State University, Ames, IA, 50011 USA.
Abstract
Coded caching is a technique that promises huge reductions in network traffic in content-delivery networks. However, the original formulation and several subsequent contributions in the area assume that the file requests from the users are synchronized, i.e., they arrive at the server at the same time. In this work, we formulate and study the coded caching problem when the file requests from the users arrive at different times. We assume that each user also has a prescribed deadline by which they want their request to be completed. In the offline case, we assume that the server knows the arrival times before starting transmission, and in the online case, the user requests are revealed to the server over time. We present a linear programming formulation for the offline case that minimizes the overall transmission rate from the server subject to the constraint that each user meets his/her deadline. While the online case is much harder, we introduce a novel heuristic for it and show that under certain conditions, with high probability the request of each user can be satisfied within his/her deadline. Our simulation results indicate that in the presence of mild asynchronism, much of the benefit of coded caching can still be leveraged.
Index Terms:
coded caching, asynchronous, deadlines, linear programming
I Introduction
Caching is a core component of large-scale content delivery over the Internet. Conventional caching typically relies on placing popular content closer to end users. Since popular content is requested more frequently, many user requests can be served from the cache without contacting the central server that holds all the content, which reduces the induced network traffic.
In their pioneering work [1], Maddah-Ali and Niesen considered the usage of coding in the caching problem. In this “coded caching” setting, there is a server containing a library of files. There are users each with a cache that can store up to files. The users are connected to the server via an error-free shared broadcast link (see Fig. 1). The system operates in two distinct phases. In the placement phase the content of the caches is populated by the server. This phase does not depend on the future requests of the users which are assumed to be arbitrary. In the delivery phase each user makes a request and the server transmits potentially coded signals to satisfy the requests of the users. The work of [1] demonstrated that significant reductions in the network traffic were possible as compared to conventional caching. Crucially, these gains continue to hold even if the popularity of the files is not taken into account.
While this is a significant result, the original formulation of the coded caching problem assumes that the user requests are synchronized, i.e., all file requests from the users arrive at the server at the same time. Henceforth, we refer to this as the synchronous setting. From a practical perspective, it is important to consider the asynchronous setting where user requests arrive at different times. In this case, a simple strategy would be to wait for the last request to arrive and then apply the scheme of [1]. Such a strategy will be quite good in terms of the overall rate of transmission from the server. However, it may be quite bad for the end users' experience, since the delay experienced by the users will essentially be dominated by the arrival time of the last request.
In this work, we formulate and study the coded caching problem when the user requests arrive at different times. Each user has a specific deadline by which his/her demand needs to be satisfied. The goal is to schedule the transmission of packets so that each user is able to recover the requested file from the transmitted packets and his/her cache content within the prescribed deadline. We present algorithms for both the offline and online versions of this problem.
This paper is organized as follows. In Section II we discuss the background and related work and overview our main contributions. The problem formulation appears in Section III. Sections IV and V discuss our work on the offline and the online versions of the problem, respectively. We conclude the paper with a discussion of opportunities for future work in Section VII.
II Background, Related Work and Summary of Contributions
A coded caching system contains a server with files, denoted , , each of size subfiles, where a subfile is a basic unit of storage. These subfiles are indexed as . The system also contains users each connected to the server through an error free, broadcast shared link. Each of the users is equipped with a local cache. The -th cache can store the equivalent of subfiles. We denote the cache content of user by , where is a function of . Our formulation supports users with different cache sizes. A block diagram of a coded caching system for is depicted in Fig. 1.
In general, a user might choose to store a coded combination of the subfiles, i.e., can be a non-trivial function of . However, in this work, we assume that an uncoded placement scheme is being used by the coded caching system, i.e., user caches at most a -sized subset of all the subfiles at the server, so that is a subset of . The uncoded placement scheme was shown in [1] to have excellent performance, and it has been considered in several follow-up works. It is well recognized that the delivery phase in the uncoded placement case corresponds to an index coding problem [2]. While finding the optimal solution for an arbitrary index coding problem is known to be hard, techniques such as clique cover on the side information graph are well recognized to have good performance [2]. In this case, each transmitted equation from the server is such that a certain number of users “benefit” from it simultaneously. Under this assumption, we formulate and study the asynchronous coded caching problem when the file requests arrive at the server at different times. Each user specifies a deadline by which he/she expects the request to be satisfied. (It is not too hard to see that in the absence of deadlines the server can simply wait for enough user requests to arrive before starting transmission. Thus, the deadline-free case essentially reduces to the synchronous setting. Section IV.B of [3] has more details.) We assume that
- the delivery phase proceeds via a clique cover and
- transmitting a single packet over the shared link takes a certain number of time slots.
We study the rate gains of coded caching under this setup, i.e., among the class of strategies that allow the users to meet their deadlines, we attempt to determine those where the server transmits the fewest packets. Both the offline and online versions of the problem are studied. In the offline scenario, we assume that all request arrival times and deadlines are known to the server before transmission, whereas in the online scenario, the arrival times and deadlines are revealed to the server as time progresses.
II-A Main contributions
- Linear programming (LP) formulation in the offline case. We propose an LP in the offline scenario that determines a schedule for the equations that need to be transmitted from the server. A feasible point of the LP can be interpreted as a coding solution that can be used by the server, such that each user meets its deadline.
  The computational complexity of solving this LP can be quite high for a large number of users. Accordingly, we develop a dual decomposition technique where the dual problem decouples into a set of independent minimum cost network flow problems that can be solved efficiently [4].
- A novel online algorithm. For the online problem, we demonstrate that, in general, coding within subfiles of the same file is essential. Interestingly, this is not needed in the synchronous case where the signals transmitted by the server are coded combinations of subfiles belonging to different files. Furthermore, we propose a novel online algorithm that is inspired by recursively solving the offline LP and interpreting the corresponding output appropriately. Under certain conditions, we also show that the algorithm will result in a solution that satisfies the deadline constraints with high probability.
For both scenarios, we present extensive simulation results that corroborate our findings and demonstrate the superiority of our algorithm over prior work. Overall, our work indicates that under mild asynchronism, much of the benefit of coded caching can still be leveraged.
II-B Related Work
The area of coded caching has seen a flurry of research activity along several dimensions in recent years. From a theoretical perspective, significant work has attempted to understand the fundamental rate limits of a coded caching system [5, 6, 7]. Extensions of the basic model to general networks have been examined in [8, 9, 10]. Issues related to subpacketization (i.e., the number of subfiles ) have been considered in [11, 12, 13]. A high subpacketization level can cause several issues in practical implementations. Coded caching ideas have also been used within the domain of distributed computing [14, 15, kostasR20].
There are relatively few prior works that have considered asynchronism within the context of coded caching. To the best of our knowledge, it was first studied in [17], which considered the decentralized coded caching model [18] and a situation where each subfile has a specific deadline. Only the online case was considered and heuristics for transmission from the server were proposed. The heuristics are found to have good performance. However, the transmission time for a packet was not considered in their formulation. Reference [19] also considers the asynchronous setting; again, it does not consider the transmission time of a transmitted packet. In that sense, its setting is closer to the work of [17] and can be viewed as a set of rules that the server should follow in the online case. Reference [19] (Section III.C) also considers an offline setting for the centralized placement scheme of [1]. Our LP formulation can be viewed as a bound on the possible performance of any online scheme. Our proposed online algorithm has significantly better performance than the ones presented in [17].
Reference [18] (Section V.C) also discusses the issue of asynchronism within the context of decentralized coded caching, without considering deadlines or packet transmission times. They advocate a further subpacketization of each subfile (referred to as a segment in [18]). It is important to note that any system will need to commit to a certain subpacketization scheme before deployment. Given this subpacketization and with user specified deadlines, the formalism of our work and our algorithms can be used to arrive at schemes that address asynchronous requests.
The work of [20] proposes an algorithm for the online scenario under the assumption of decentralized coded caching for reducing the worst-case load of fronthaul links in fog radio access networks (F-RANs); this is a different model than ours. Their work does not take transmission time into account and considers the scenario where each user has the same deadline.
The asynchronous setting has also been considered in [21] for video delivery by taking into account an appropriately defined audience retention rate. Their work considers a probabilistic arrival model and presents a decentralized coded caching scheme for it.
III Problem Formulation and Preliminaries
We assume that time is slotted. Let denote the set and the symbol represent the XOR operation. We assume that the server contains files denoted by . (We assume that there are at least as many files as users, as this corresponds to the worst-case rate, under most reasonable placement schemes, where each of the users can request a different file. Furthermore, it is also the more practical scenario.) The subfiles are denoted by so that and the cache of user by . contains at most subfiles. In the delivery phase, user requests file , where , from the server. We let denote the indices of the subfiles that are not present in the -th user’s cache, i.e.,
[TABLE]
The equations in the delivery phase are assumed to be of the all-but-one type.
Definition 1
All-but-one equation. Consider an equation such that
[TABLE]
We say that is of the all-but-one type if for each , we have and for all .
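As a small illustration of the definition above, the following Python sketch builds one XOR equation for a toy group of three users and lets each user cancel the terms it already holds. The subfile contents, the user numbering, and the helper names `xor_bytes` and `recover` are hypothetical and chosen only for illustration; the sketch assumes each user's wanted subfile is cached by every other user in the group.

```python
from functools import reduce

def xor_bytes(*blocks: bytes) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical toy data: each user misses one 4-byte subfile that the
# other two users are assumed to hold in their caches.
wanted = {1: b"AAAA", 2: b"BBBB", 3: b"CCCC"}

# Server side: a single coded transmission that serves all three users.
equation = xor_bytes(*wanted.values())

def recover(user: int) -> bytes:
    """The user XORs out every term it has cached (all but its own)."""
    cached = [blk for u, blk in wanted.items() if u != user]
    return xor_bytes(equation, *cached)

for u in wanted:
    print(u, recover(u) == wanted[u])
```

One transmission replaces three uncoded ones here, which is the source of the rate gain.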
It is evident that an all-but-one equation transmitted from the server allows each of the users participating in the equation to recover a missing subfile that they need. The asynchronous coded caching problem can be formulated as follows.
Inputs.
- User requests. User requests file , with at time .
- Deadlines. The -th user needs to be satisfied by time , where is a positive integer.
- Transmission delay. Each subfile needs time-slots to be transmitted over the shared link, i.e., each subfile can be treated as equivalent to packets, where each packet can be transmitted in one time slot.
As the problem is symmetric with respect to users, w.l.o.g. we assume that . Let . Note that upon sorting the set of arrival times and deadlines, i.e., , we can divide the interval into at most non-overlapping intervals. Let the integer , where denote the number of intervals. Let represent the intervals where appears before if ; denotes the length of interval . The intervals are left-closed and right-open. An easy to see but very useful property of the intervals that we have defined is that for a given , either or . Fig. 2 shows an example when . We define , and . Thus, is the set of active users in time interval and is the corresponding set of active file requests.
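The interval construction described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: deadlines are taken to be relative offsets from the arrival times, users are numbered from 1, and the function name `build_intervals` and the sample instance are hypothetical.

```python
def build_intervals(arrivals, deadlines):
    """Return (intervals, active): left-closed, right-open intervals and
    the set of active users in each of them."""
    # Breakpoints: every arrival time and every absolute deadline.
    points = sorted(set(arrivals) | {t + d for t, d in zip(arrivals, deadlines)})
    intervals = list(zip(points[:-1], points[1:]))
    active = [
        {i for i, (t, d) in enumerate(zip(arrivals, deadlines), start=1)
         if t <= lo and hi <= t + d}   # interval lies inside the user's window
        for lo, hi in intervals
    ]
    return intervals, active

# Hypothetical instance with three users.
arrivals = [0, 1, 2]
deadlines = [3, 3, 2]   # user i must be served by arrivals[i] + deadlines[i]
intervals, active = build_intervals(arrivals, deadlines)
for span, users in zip(intervals, active):
    print(span, sorted(users))
```

Because every breakpoint is an arrival or a deadline, each user's window is a union of whole intervals, which matches the property noted above that an interval is either entirely inside or entirely outside a user's window.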
Outputs.
- Transmissions at each time slot. If the problem is feasible, the schedule specifies which equations (of the all-but-one type) need to be transmitted at each time. The schedule is such that each user can recover all its missing subfiles within its deadline. The equations transmitted at time only depend on .
We consider two versions of the above problem.
- Offline version. In the offline version, we assume that the server is aware of at . However, at time the transmitted equation(s) will only depend on , i.e., the server cannot start sending missing subfiles for a given user until its request arrives.
- Online version. In the online version, information about the file requests is revealed to the server as time progresses. At each time , the server only has information about if , i.e., the requests that have arrived by time .
We begin by defining some relevant sets; for convenience, a tabulated list of most of the items needed in the subsequent sections can be found in Table I. Consider a subset of users . For each user we let denote the indices of all missing subfiles of the -th user that have been stored in the cache of the other users in , i.e.,
[TABLE]
Definition 2
User Group. A subset is said to be a user group if for all users so that there is at least one all-but-one type equation associated with .
For a user group there are different all-but-one equations. This is because for any choice of for , we can construct the all-but-one equation . Thus, for each there are choices for . Recall that is the set of active users in time interval and represents their file requests. Let be a subset of the power set of (i.e., the set of all subsets of ) such that each element in is a user group (cf. Definition 2). For any , let be the set of indices of all time intervals where the users in are simultaneously active, i.e.,
[TABLE]
For each missing subfile (where ) we let denote the set of user groups where it can be transmitted, i.e.,
[TABLE]
We note here that for a fixed , there are potentially multiple indices such that for .
Example 1
Consider a system (shown in Fig. 2) with files, , , and where each file is divided into three subfiles, so that . There are users with the following cache content, , , and . Thus, for . The arrival times are , , , and deadlines are , , and . The -th user requests file , for . Therefore, , , and .
In this system we have as , and as . Therefore, is a user group and the corresponding all-but-one equations are and . However, thus is not a user group.
As is a user group, we have . The set of time intervals where user group is active is . Finally, note that user group is a member of since and . Similarly, as well since and .
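The user-group test of Definition 2 can be sketched as follows. The cache layout below is hypothetical and is not the data of Example 1: it assumes three files of three subfiles each, where user i caches subfile i of every file and requests file i. Caches are modeled as sets of (file, subfile) pairs, and the function names are illustrative.

```python
def shared_missing(user, group, cache, request, n_subfiles):
    """Indices of the user's missing subfiles that every other member of
    the group holds in its cache."""
    f = request[user]
    missing = {j for j in range(1, n_subfiles + 1) if (f, j) not in cache[user]}
    return {j for j in missing
            if all((f, j) in cache[o] for o in group if o != user)}

def is_user_group(group, cache, request, n_subfiles):
    """A group qualifies if every member has at least one such subfile."""
    return all(shared_missing(u, group, cache, request, n_subfiles) for u in group)

# Hypothetical placement: user i stores subfile i of each of the 3 files.
cache = {i: {(n, i) for n in (1, 2, 3)} for i in (1, 2, 3)}
request = {1: 1, 2: 2, 3: 3}

print(is_user_group({1, 2}, cache, request, 3))
print(is_user_group({1, 2, 3}, cache, request, 3))
```

In this toy placement every pair of users forms a user group, but the full set of three users does not, because no single subfile is cached by two users at once.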
IV Offline Asynchronous Coded Caching
In this section, we discuss the offline version of the problem where the server knows the arrival times/deadlines of all the requests at . The offline solution of the system in Example 1 is depicted in Fig. 2 where the transmitted equation in each time slot appears above the timeline. It can be verified that each user can recover the missing subfiles that they need. In what follows we argue that the offline setting can be cast as a linear programming problem.
IV-A Linear programming formulation
For each time interval with and for each , we define variable that represents the portion of time interval that is allocated to an equation that benefits user group . The actual equation will be determined shortly. For each missing subfile and each , we define variable that represents the portion of the missing subfile transmitted within some or all of the equations associated with for . As pointed out before, for a fixed , can be used to transmit different missing subfiles needed by user . However, a single equation can only help recover one missing subfile needed by . Thus, must be shared between the appropriate ’s. Accordingly, we need the following constraint for user and a user group which contains .
[TABLE]
In addition, at time interval at most packets can be transmitted, so that . To ensure that each missing subfile is transmitted in exactly time slots, we have .
The following LP minimizes the overall rate of transmission from the server while respecting all the deadline constraints of the users under the assumption that the server only transmits all-but-one equations. However, we point out that in general this may not be the information-theoretically optimal strategy for the server.
[TABLE]
Note that [17] considers the case when each missing subfile has a prescribed deadline. Our LP above can be modified in a straightforward manner to incorporate this aspect.
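For orientation, one way to write a linear program that matches the three constraints described in the prose of this subsection is sketched below. The symbols are assumed notation introduced only for this sketch and may differ from the authors' own: $x_U(\ell)$ is the portion of interval $\Pi_\ell$ given to user group $U$, $y^{(i)}_{f,U}$ is the portion of missing subfile $f$ of user $i$ sent through $U$, $I_U$ is the set of intervals where $U$ is active, $\Omega^{(i)}_U$ is the set of missing subfiles of user $i$ cached by the rest of $U$, $\mathcal{U}_\ell$ is the set of user groups available in interval $\ell$, $\mathcal{U}^{(i)}_f$ is the set of user groups through which $f$ can be sent, and $r$ is the number of slots per subfile.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{\ell=1}^{\beta}\ \sum_{U \in \mathcal{U}_\ell} x_U(\ell) \\
\text{s.t.}\quad
 & \sum_{f \in \Omega^{(i)}_U} y^{(i)}_{f,U} \;\le\; \sum_{\ell \in I_U} x_U(\ell)
   && \text{for every user group } U \text{ and every } i \in U, \\
 & \sum_{U \in \mathcal{U}_\ell} x_U(\ell) \;\le\; |\Pi_\ell|
   && \text{for every interval } \ell, \\
 & \sum_{U \in \mathcal{U}^{(i)}_f} y^{(i)}_{f,U} \;=\; r
   && \text{for every user } i \text{ and every missing subfile } f, \\
 & x_U(\ell) \ge 0,\quad y^{(i)}_{f,U} \ge 0 .
\end{aligned}
```

The objective counts the total transmission time used by the server, so a smaller value corresponds to a lower rate.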
IV-B Interpretation of feasible point of (1) as a coding solution
We start by assigning time intervals to user groups. The time interval , , will be arbitrarily assigned to user groups so that the time assigned to one user group does not overlap with another. The constraint implies that such an assignment exists. For each user group and each , suppose that are such that for . We assign part of the total time allocated to user group , i.e., , to the missing subfile for . The constraint ensures that such an assignment always exists, i.e., it is possible to assign ’s (for fixed ) to the available (strictly) positive ’s, such that there is no overlap between them. This assignment is not unique in general. However, this is not a problem as any assignment can be used to determine the equations. This process is repeated for all users .
The equation transmitted on a particular interval is simply the XOR of the subfile indices that map to that interval. This equation is valid since the missing subfile with is in the cache of all the users in .
Finally, according to the constraint , each missing subfile is transmitted in its entirety in some equations. The following example serves to illustrate the arguments above.
Example 2
Consider again the system in Example 1. Part of a feasible solution to the LP in (1), corresponding to user groups and , is presented below.
[TABLE]
According to the solution, . Therefore, only one unit of is assigned to (though ). This is denoted by the light blue color line in Fig. 3. For user , there is only one missing subfile in , namely . As it is assigned to in its entirety. This is depicted by the gray line in Fig. 3. For user 2 in we have . The solution specifies . Thus, we assign the first half of to missing subfile and the second half to (see the dark blue and dotted dark blue lines in Fig. 3). Accordingly, the server transmits equations such that the first half of the time interval assigned to user group corresponds to the whereas the second half corresponds to . The interpretation of the user group is similar (see Fig. 3).
Remark 1
The output of the above LP will typically result in a fractional solution for the variables. A fractional solution can be interpreted by assuming that each packet that is transmitted over the shared edge can be subdivided as finely as needed. Thus, in each time slot, we could transmit multiple equations that may serve potentially different subsets of users. This assumption is reasonable if the underlying subfiles and hence the packets are quite large. In any case, the above LP provides a lower bound on the performance of a solution where integrality constraints are enforced.
Remark 2
We note that for the offline solution, within a given time interval, the user groups can be assigned in any order according to the as long as they do not overlap. Moreover, the assignment of is also arbitrary as long as the constraints of the LP are respected. However, for the online case (cf. Section V), the order does matter since we make a best-effort decision on each individual slot as we do not know the future arrivals.
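The non-overlapping assignment of time to user groups within one interval can be sketched as a simple back-to-back layout. The function name `pack_interval`, the interval [2, 5), and the allocation values are hypothetical; the sketch assumes the allocations already satisfy the interval-length constraint of the LP.

```python
def pack_interval(start, allocations):
    """Lay out user-group allocations back to back, starting at `start`.
    Returns a list of (group, begin, end) with no overlaps."""
    schedule, cursor = [], start
    for group, length in allocations.items():
        if length > 0:                      # skip groups with no time allotted
            schedule.append((group, cursor, cursor + length))
            cursor += length
    return schedule

# Hypothetical LP output for the interval [2, 5): time given to two user groups.
x = {frozenset({1, 2}): 2.0, frozenset({3}): 0.5}
slots = pack_interval(2, x)
for group, begin, end in slots:
    print(sorted(group), begin, end)
```

The same routine could be reused inside a user group's allotted time to lay out the portions of a user's missing subfiles, since that assignment is also arbitrary as long as nothing overlaps.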
IV-C Modified LP with fixed user group assignments
Note that the LP in (1) includes the variables ’s that determine the user groups in the different time-intervals. Suppose instead that at a certain time we are given the total time allocated to user group thus far (denoted by ) and only need to determine the values for each . Let be the set of user groups used until time . For user , this can be written as a related LP as shown below which returns the total number of missing packets that user has obtained until time .
[TABLE]
This follows since given the ’s we only need to find an assignment of the ’s to the corresponding ’s that respect the first constraint above. Moreover, since we are not considering the entire transmission time, each missing subfile may not be transmitted in its entirety.
For instance, in Example 1 suppose that at time , we have , then the LP above in (2) for user 1 has the optimum point , . Likewise for user 2, the LP has the optimum point and .
Remark 3
The complexity of our solution in (1) does not have any dependence on arrival times ’s and deadlines ’s. Our formulation of the LP in terms of the intervals allows us to circumvent this potential dependence.
Nevertheless, the complexity of solving the LP does grow quite quickly (cubic) in the problem parameters (number of constraints + number of variables) [22]. Next, we discuss a solution based on dual decomposition that is much faster.
IV-D Dual Decomposition based LP solution
As it stands, the LP in (1) cannot be interpreted as a network flow. Yet, intuitively one can view the missing subfiles from each user as flowing through the user groups and getting absorbed in sinks that correspond to their valid time intervals. However, the flows corresponding to different users can be shared as the all-but-one equations allow different users to benefit from the same equation. We note here that a similar sharing of flows also occurs in the problem of minimum cost multicast with network coding [23]. The LP in (1) can, however, be modified slightly so that the corresponding dual function is such that it can be evaluated by solving a set of decoupled minimum cost network flow optimizations.
IV-D1 Decoupling procedure
For each user the variable represents the amount of flow corresponding to user outgoing from user group to time interval . Evidently, this amount can’t be more than . Therefore, we have
[TABLE]
which holds for all and all , . We define to be the subset of possible user groups at time interval that include user , i.e., for all . By the flow interpretation of , we have
[TABLE]
for all . For , let denote the following set of constraints.
[TABLE]
Then, the original LP can be compactly rewritten as
[TABLE]
It is not too hard to see that the LPs in (1) and (3) are equivalent. The only difference with (1) is the introduction of variables (for appropriate ranges of , and ) such that the second set of inequality constraints in (1) are replaced by equality constraints. Moreover, the original constraints are maintained by setting .
By Slater’s constraint qualification condition [24], we know that if the primal LP is feasible, then strong duality holds and the primal and dual optimal values are the same. Thus, we proceed by considering the dual of the LP in (3) with respect to the constraints that involve the variables . The Lagrangian can be expressed as
[TABLE]
where ’s and ’s are nonnegative dual variables. It turns out that minimizing the Lagrangian for fixed dual variables can be simplified by defining for , , and . We define , , and . The dual function is obtained by solving for
[TABLE]
It is evident that the dual function takes a nontrivial value only if
[TABLE]
The evaluation of at a fixed set of dual variables ’s and ’s can therefore be written as
[TABLE]
We emphasize that (IV-D1) is still a convex problem and that . Let , be
[TABLE]
Then, the dual function becomes
[TABLE]
if for all . We present an approach to maximize the dual function in (6) shortly.
The sub-problem in (IV-D1) for fixed and , is a standard minimum-cost flow problem. The associated flow network corresponding to user , , depends on and and we denote it by . It contains a source node and three intermediate layers followed by a terminal node (see Fig. 4 for an example). The nodes in the first, second, and third layer correspond to missing subfiles in , user groups in , and time intervals respectively. The edges in can be expressed as follows. There are edges going from source node to each of missing subfiles in . Also, for each there is an edge going from missing subfile node to user group node . Furthermore, there is an edge going from user group to time interval for each . Finally, corresponding to each time interval in there is an edge going from this time interval to the terminal node .
In flow network , , a zero cost is assigned to all edges except those from the user group nodes to the time intervals. The cost of the edge between user group and time interval is . The edge between time interval and the terminal node has a capacity constraint of and the edge between the source node and a missing subfile has a capacity constraint of ; the other edges have no capacity constraint. The variable is the amount of flow carried by the edge from user group to time interval . The source injects a flow of value which needs to be absorbed in the terminal.
We emphasize that minimum cost network flow algorithms have been the subject of much investigation [4] within the optimization literature and large-scale instances can be solved very quickly. For our work, we leverage Capacity Scaling algorithms within the open-source LEMON package [25].
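The layered structure of the per-user flow network described above can be sketched as an edge list. This is an illustrative construction only, not the authors' LEMON code: the function name, the node labels, the sample data, and the numeric capacities and costs are hypothetical, and the capacities are passed in as parameters because their values are not reproduced here. User groups are written as tuples so that they can be sorted.

```python
def build_flow_network(missing, groups_for, intervals_for, cost,
                       cap_interval, cap_subfile):
    """Return edges as (tail, head, capacity, cost); capacity None = unbounded."""
    edges = []
    for f in missing:
        edges.append(("s", ("sub", f), cap_subfile, 0))        # source -> subfile
        for g in groups_for[f]:
            edges.append((("sub", f), ("grp", g), None, 0))    # subfile -> group
    groups = sorted({g for f in missing for g in groups_for[f]})
    for g in groups:
        for l in intervals_for[g]:
            # Only these edges carry a cost (set from the dual variables).
            edges.append((("grp", g), ("int", l), None, cost[g, l]))
    for l in sorted({l for g in groups for l in intervals_for[g]}):
        edges.append((("int", l), "t", cap_interval[l], 0))    # interval -> terminal
    return edges

# Hypothetical data for one user with two missing subfiles.
missing = [2, 3]
groups_for = {2: [(1, 2), (1,)], 3: [(1, 3), (1,)]}
intervals_for = {(1, 2): [2, 3], (1, 3): [3], (1,): [1, 2, 3]}
cost = {(g, l): 1.0 for g in intervals_for for l in intervals_for[g]}
edges = build_flow_network(missing, groups_for, intervals_for, cost,
                           cap_interval={1: 1, 2: 1, 3: 1}, cap_subfile=1)
print(len(edges))
```

An edge list of this form can be handed to any minimum cost flow solver; the choice of solver is independent of how the network is built.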
IV-D2 Maximizing the dual function
The dual function in (6) is concave (as it can be expressed as the pointwise infimum of a family of affine functions of the dual variables [24]). We exploit the projected subgradient method to maximize the dual function iteratively. Let for all denote the optimal point of (IV-D1) when solved for at the iteration. Let denote a dual feasible point of (6) at the -th iteration.
According to the subgradient method, at the -th iteration, for , we first compute
[TABLE]
where is the step size. These intermediate variables are projected onto the feasible set and primal recovery is performed by the method of [26]. The details can be found in Appendix A. Numerical results appear in Section VI.
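The general shape of a projected subgradient ascent on a concave dual function can be illustrated on a deliberately tiny problem. The sketch below is not the authors' dual: it uses a hypothetical toy primal (minimize x subject to x >= 1 with x in [0, 2]), a single multiplier, a diminishing step size 1/k, and a projection onto the nonnegative reals only. The projection and primal recovery steps of the actual method are not reproduced.

```python
def inner_argmin(lam):
    """Minimize the toy Lagrangian x + lam * (1 - x) over x in [0, 2]."""
    return 0.0 if 1.0 - lam >= 0 else 2.0

def projected_subgradient(iterations=200):
    lam = 0.0
    for k in range(1, iterations + 1):
        x = inner_argmin(lam)               # evaluate the dual function at lam
        subgrad = 1.0 - x                   # constraint violation at the minimizer
        lam = max(0.0, lam + subgrad / k)   # ascent step, then project onto lam >= 0
    return lam

lam_star = projected_subgradient()
print(round(lam_star, 3))
```

For this toy problem the dual function is maximized at a multiplier of 1, and the iterates oscillate around that value with an amplitude that shrinks with the step size.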
V Online Asynchronous Coded Caching
In the online scenario, at time only information about the already arrived requests is known to the server, i.e., it only knows , and for such that . Ideally, one would want to design an online algorithm that is guaranteed to be feasible whenever the corresponding offline version is feasible. However, this appears to be a hard problem. Specifically, routinely used algorithms such as earliest-deadline-first (EDF) do not have this property [27]. In the upcoming subsection we demonstrate that the online solution requires additional ideas from a coding standpoint.
V-A Necessity of coding across missing subfiles of a user
Example 3
Consider a system with and with for (also depicted in Fig. 1). The arrival times and deadlines of the users are , and for (as shown in Fig. 5). We assume that user is interested in files for and that transmitting a subfile takes a single time slot, i.e., .
Suppose that the server does not code across any user’s missing subfiles. At , it has the choice to transmit either or . We emphasize that it has to transmit either of these as the deadline for user 1 is . If the server transmits , then consider the scenario where and , i.e., the third user’s request comes at and the second user’s request comes at . In this case, the server is forced to transmit at , which implies that user 3 misses its deadline. On the other hand, if the server transmits at , then and will cause user 2 to miss its deadline.
This issue can be circumvented if we transmit a linear combination of both and in the first time slot as shown in Fig. 5. Intuitively, this is the correct strategy since transmitting allows the server to hedge its bets against the identity of the next request arrival. This example demonstrates that coding across missing subfiles of a user is strictly better than the alternative. We emphasize that the synchronized model of [1] and the offline scenario do not require this.
Accordingly, for the online scenario we treat each missing subfile as an element of a large enough finite field . This allows us to consider linear combinations of the missing subfiles over . Note that any equation of the form
[TABLE]
where the coefficients belong to the field and represents -addition is also an all-but-one equation from which user can recover .
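A minimal sketch of how a user strips its cached terms from such an equation is given below. It assumes a small prime field (arithmetic modulo 257), treats each subfile as a single field symbol, and uses made-up subfile values and coefficients; the helper names `encode` and `strip_cached` are hypothetical.

```python
P = 257  # toy prime; arithmetic modulo P stands in for the finite field

def encode(terms):
    """Sum of coefficient * subfile over the field, for (coefficient, value) pairs."""
    return sum(c * v for c, v in terms) % P

def strip_cached(equation, cached_terms):
    """Subtract from a received equation the terms the user holds in its cache."""
    return (equation - encode(cached_terms)) % P

# User 1 misses a1 and a2 (both cached by user 2); user 2 misses b1 (cached by user 1).
a1, a2, b1 = 17, 200, 99
alpha1, alpha2, beta1 = 3, 5, 7  # coefficients chosen by the server

tx = encode([(alpha1, a1), (alpha2, a2), (beta1, b1)])

# Each user removes the other user's terms and is left with a linear
# combination of its own missing subfiles only.
user1_view = strip_cached(tx, [(beta1, b1)])
user2_view = strip_cached(tx, [(alpha1, a1), (alpha2, a2)])

# User 2 misses a single subfile, so it can invert its coefficient directly.
b1_recovered = user2_view * pow(beta1, -1, P) % P
```

User 1 obtains one equation in two unknowns and therefore needs a second, linearly independent equation before it can solve for both of its missing subfiles.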
V-B Recursive LP based algorithm
In the online scenario at time our only decision is to transmit an equation in the time slot . In particular, it is possible that a request arrives at and that can change the situation drastically. It makes intuitive sense to transmit equations that benefit a large number of users. However, we also need to take into account the deadline constraints of each user. These requirements need to be balanced. At the top level, our approach can be summarized as follows.
- Solving a linear program when a user request arrives. We solve an LP which is similar to (1) each time a new user request comes into the system. This specifies a set of and variables. However, in the offline case, the ordering of the ’s within an interval does not matter (cf. Remark 2). In the online case, this is no longer true. As we have no knowledge of future arrivals, it becomes important to choose the “best” user group for the time slot in which transmission needs to take place.
- Deciding which user group to pick. Based on the variables we first decide a candidate list of feasible user groups that can be chosen for transmission at each time slot. Suppose user group is a candidate. We calculate a metric for depending upon (i) the benefit of this equation to the participating users within , and (ii) the stringency of the deadlines of the users in . Our measure of stringency for user is the ratio of the remaining number of missing packets of user to the number of remaining time-slots for user . If the calculated metric for is above a pre-defined threshold then an equation corresponding to (of the form in (7)) is transmitted.
- Updating variables and continuing recursively. Following this, we update certain variables and the process continues for each time slot thereafter. When the next user request arrives into the system, the history of the variable assignments is used to solve a new LP (similar to (1)), and the process continues recursively.
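The selection step in the second item can be sketched as follows. Only the stringency ratio is taken from the description above; the scoring rule used in the usage lines (the maximum stringency over the members of a user group) is a hypothetical stand-in for the actual metric, the threshold comparison is assumed to be non-strict, and the function names are illustrative.

```python
def stringency(missing_packets_left, slots_left):
    """Ratio of a user's remaining missing packets to its remaining time slots."""
    if slots_left <= 0:
        return float("inf")  # deadline reached with packets still outstanding
    return missing_packets_left / slots_left

def pick_user_group(candidates, score, threshold):
    """Return the first candidate, in the given order, whose score clears the
    threshold, or None if no candidate qualifies and the slot stays idle."""
    for group in candidates:
        if score(group) >= threshold:  # assumed non-strict comparison
            return group
    return None

# Toy state: user -> (missing packets left, time slots left).
state = {1: (2, 2), 2: (1, 4)}

def toy_score(group):
    # Hypothetical stand-in for the metric: most stringent member of the group.
    return max(stringency(*state[u]) for u in group)

candidates = [frozenset({2}), frozenset({1, 2})]
chosen = pick_user_group(candidates, toy_score, threshold=0.5)
```

Passing the scoring rule in as a function keeps the sketch independent of the exact form of the metric, which also accounts for the benefit of the equation to each participating user.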
V-B1 Measuring the benefit of user group to user
As we need to commit to a user group at each time instant in the online case, we first discuss how we can measure the benefit to a user if user group is chosen for transmission at a given time slot. In particular, if has been used for transmission in the past, then the current transmission may be less beneficial to some of the users or of no benefit. We demonstrate this by means of the following example.
Example 4
Consider a system , for all users , and . The placement scheme is the same as [1] so that each file is divided into subfiles and each user misses subfiles. The cache content and missing subfiles are specified in Table II. We assume that the current time is and that the requests of users have arrived at the server. More specifically, we have with deadlines for all users . The server has already transmitted eight equations corresponding to the user groups depicted in Fig. 6.
Suppose that the server considers scheduling the user group at . We note here that users 2 and 3 have already participated in previous transmissions. Thus, it needs to be determined how beneficial this equation is to each user. Considering user 2, as shown in Table II, it can recover a linear combination of (from user group ), (from user group ) and (from user group ) from the prior equations; this in turn implies that it can also recover by solving linear equations. However, note that the user group also results in user 2 recovering . Thus, from user 2’s perspective, this equation is of no benefit. A similar argument shows that this user group does not benefit user 3 either.
We observe at this point that the modified LP in Section IV-C, can be used here to determine a given user’s benefit. In particular, the past history of the transmission contains information on the total time allocated to a given user group . At time , if we consider transmitting user group , we can add it to the set of user groups and update its time allocation. Following this, for each user the LP in (2) can be used to measure the number of subfiles that can be transmitted for it, were to be chosen.
Let denote the set of user groups chosen for time , be a user group under consideration at time , and let . Let denote the time allocated to user group , where is incremented by one if , otherwise it is set to 1. Consider the following LP.
[TABLE]
Now suppose that we have already tracked the number of useful packets for user until time . Then the above LP can be used to determine the benefit of transmitting user group at time .
Remark 4
The LP in (8) can also be expressed as a maximum flow problem. The associated flow network consists of a source node , a node for each , a node for each user group , and a terminal node . There are edges with capacity going from to each and edges from to node if . The flow on such an edge is . Moreover, from each node to there exists an edge of capacity . These capacity constraints model the first two inequality constraints in (8). Fig. 7 illustrates an example of this network. It is well-known that if all capacities in a flow network are integers, there exists an integral maximum flow ([28], Chapter 7). Therefore, there exists an integral solution for ’s in (8) if ’s are integers.
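The maximum flow view of Remark 4 can be sketched with a textbook Edmonds-Karp routine on a toy network. The capacities below are assumptions (one unit per missing subfile and one unit of allocated time per user group), the node labels are illustrative, and the function `max_flow` is a generic implementation rather than the code used in this work.

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns (value, flow)."""
    # Residual capacities, with reverse edges initialised to zero.
    res = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, cap in res[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        push, v = float("inf"), t
        while parent[v] is not None:      # bottleneck along the path
            push = min(push, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:      # augment along the path
            u = parent[v]
            res[u][v] -= push
            res[v][u] += push
            v = u
        value += push
    flow = {(u, v): capacity[u][v] - res[u][v]
            for u in capacity for v in capacity[u]}
    return value, flow

BIG = 10  # stands in for an uncapacitated edge in this toy instance
capacity = {
    "s": {"j1": 1, "j2": 1, "j3": 1},  # source -> missing subfiles of the user
    "j1": {"U1": BIG},                 # subfile -> user groups that can carry it
    "j2": {"U1": BIG, "U2": BIG},
    "j3": {"U2": BIG},
    "U1": {"t": 1},                    # user group -> terminal (time allocated)
    "U2": {"t": 1},
}
value, flow = max_flow(capacity, "s", "t")
```

Because every capacity is an integer, every augmentation pushes an integer amount, so the returned flow is integral, in line with the integrality property invoked in the remark.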
V-B2 Solving LP upon user arrival
Consider a time when the request of the -th user arrives at the server. We let be the set of user groups associated with the previously transmitted equations. We also let be the total time allocated to equations corresponding to user group prior to time . Thus, if in time interval the server transmits an equation that exclusively benefits users in then otherwise . Time intervals are formed by the set of times in
[TABLE]
As in the offline case in (1), the sets of active users , user groups and are defined corresponding to these time intervals, e.g., is the set of active users in . Moreover, is a set of user groups that either already have been transmitted or might be transmitted after . That is . The variables ’s and ’s have the same interpretation as the offline case. With these variables, the server solves the following LP.
[TABLE]
An important feature of time intervals , …, is that these time intervals end at a deadline and except the first time interval that starts with arrival time , the other time intervals start with a deadline. Thus, we have , i.e., the set of active users in interval is a subset of the active users in interval for the range of .
Next, the server creates a list of candidate user groups. Let be the solution of (9) and let . The elements of are first ordered based on time intervals. Then, among the elements with the same time interval, they are ordered based on length of user group. Therefore, for two elements we say is before if , or if and . We let denote the sorted version of using this procedure. Let be the number of missing packets (subfiles when ) that have been transmitted for user until time ; this value is tracked in Algorithm 1.
Note that user needs to recover missing packets and it has time slots to obtain them. We use the ratio of these quantities as a measure of the stringency of the deadline of user . Let denote the total number of missing packets of user that can be communicated by the user groups chosen thus far and by picking at time . The LP in (8) allows us to compute and in turn . Therefore the metric is obtained by the following weighted sum.
[TABLE]
At time , the server picks the first element such that for some threshold and transmits an equation corresponding to it. Unlike the synchronous case, we choose a random linear combination of all missing packets of user that can be transmitted by user group .
When , we subdivide a missing subfile into packets that are denoted for . Thus, the server transmits
[TABLE]
at time interval where denotes the -th equation transmitted by the server and are chosen independently and uniformly at random from the finite field . If none of the elements in satisfy then nothing will be transmitted at this time interval.
If a new user request does not come at time , then the server updates the user group values and then solves (8) again to decide the user group for the time slot . The process continues this way until the next user request comes when the LP in (9) is solved. The complete details are provided in Algorithm 1.
In general, there is no guarantee that Algorithm 1 will return a feasible schedule if the corresponding offline schedule is feasible. In that sense, Algorithm 1 can be viewed as a heuristic with good experimental performance. However, if Algorithm 1 does not return “INFEASIBLE”, we can show that a feasible solution for the corresponding offline LP can be identified. This fact coupled with usage of the Schwartz-Zippel Lemma allows us to conclude that our algorithm works with high probability if it does not return “INFEASIBLE”. The proofs of the following claim and lemma appear in Appendix -B and -C respectively.
Claim 1
For user requests, , where , if Algorithm 1 does not return “INFEASIBLE” then there exists a feasible integral solution for the offline LP in (1).
The following lemma shows that if Algorithm 1 does not return “INFEASIBLE” then with high probability each user recovers all its missing subfiles from the transmitted equations.
Lemma 1
If Algorithm 1 does not return “INFEASIBLE” then with probability at least all requests will be satisfied within their deadline.
Thus, by choosing the field size large enough, we can make the probability of success as large as we want. We point out that increasing the field size results in a corresponding increase in the computational requirements at the server and the user nodes.
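The effect of the field size can be illustrated with a small experiment. The sketch below draws dense random coefficient matrices over a prime field and decodes by Gauss-Jordan elimination; the coefficient matrices arising in Lemma 1 are structured (entries are zero when a subfile is not covered by the corresponding user group), so this only illustrates the trend, not the bound. The prime moduli, the matrix size, and all function names are assumptions.

```python
import random

def solve_mod_p(A, b, p):
    """Solve A x = b over GF(p) by Gauss-Jordan elimination.

    Returns the solution as a list, or None if A is singular.
    """
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % p), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, p)  # modular inverse (Python 3.8+)
        M[col] = [(v * inv) % p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                factor = M[r][col]
                M[r] = [(vr - factor * vc) % p for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def simulate(p, num_missing=3, seed=0):
    """One user, num_missing missing subfiles, num_missing random equations."""
    rng = random.Random(seed)
    subfiles = [rng.randrange(p) for _ in range(num_missing)]
    coeffs = [[rng.randrange(p) for _ in range(num_missing)]
              for _ in range(num_missing)]
    received = [sum(c * s for c, s in zip(row, subfiles)) % p for row in coeffs]
    return subfiles, solve_mod_p(coeffs, received, p)

def failure_rate(p, trials=500):
    """Fraction of trials in which the random coefficient matrix is singular."""
    failures = sum(simulate(p, seed=s)[1] is None for s in range(trials))
    return failures / trials

small_field_rate = failure_rate(2)
large_field_rate = failure_rate(2**31 - 1)
```

In the binary field a large fraction of the random systems is singular, whereas for the large prime decoding failures are extremely rare, at the price of more expensive field arithmetic.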
VI Simulation Results and Comparisons with prior work
In this section we present simulation results for both the proposed offline and the online algorithms (software is available in [29]). Prior work in this area is primarily the work of [17] that presents heuristics for the online scenario. However, we note that [17] works with deadlines for subfiles and does not take into account the time required to transmit a packet. It uses intuitively plausible rules to decide the equations transmitted by the server depending on the deadlines of the users.
For both scenarios, the request arrival times are generated according to a Poisson process with parameter . The arrival time is quantized to the nearest time slot. The deadlines , are generated uniformly at random from the range (these values will be specified for each setting below).
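A request generator along these lines could be sketched as follows. The sketch assumes that each deadline is obtained by adding a uniformly drawn integer slack to the quantized arrival time; the parameter names `min_slack` and `max_slack`, the numeric values in the usage line, and the function name are hypothetical and do not reproduce the settings of the experiments.

```python
import random

def generate_requests(num_users, rate, min_slack, max_slack, seed=None):
    """Return a list of (user, arrival slot, deadline slot) tuples.

    Arrivals follow a Poisson process with the given rate, i.e., the
    inter-arrival times are independent exponential random variables.
    """
    rng = random.Random(seed)
    t = 0.0
    requests = []
    for user in range(num_users):
        t += rng.expovariate(rate)  # exponential inter-arrival time
        arrival = round(t)          # quantize to the nearest time slot
        # Assumed deadline model: arrival plus a uniform integer slack.
        deadline = arrival + rng.randint(min_slack, max_slack)
        requests.append((user, arrival, deadline))
    return requests

requests = generate_requests(num_users=6, rate=0.5, min_slack=4, max_slack=8, seed=1)
```

Fixing the seed makes a trial reproducible, which is convenient when the same arrival pattern has to be fed to both the offline LP and the online algorithm.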
VI-A Offline scenario simulation
In the first set of simulations we examine the execution time of our approach for various values of where is an integer; the placement scheme in [1] was used. In these simulations we set , , , , and . Table III shows the details of the overall execution time and the size of the corresponding flow networks for the various instances. The last column of the table corresponds to the execution time (in MATLAB) of the LP in (1), while the second-last column corresponds to the execution time of the proposed approach above. It is evident that the proposed approach is significantly faster. In fact, memory requirements make it infeasible to even formulate the problems corresponding to the first three rows in MATLAB. Fig. 8 shows the convergence of the primal recovery procedure to the actual rate for a system with , , and . It can be observed that there is a clear convergence of the solution to the optimal value.
VI-B Online scenario simulation
For the online scenario we consider both centralized [1] and decentralized [18] placement schemes for a system with and with and . For each experiment we run trials for generating the arrivals. For the centralized case, we use the placement scheme of [1] and the placement is fixed during each experiment. In the decentralized scheme, at each trial the cache content of each user is independently and uniformly chosen as well.
For each set of generated arrivals, we first run the offline LP to check whether it is feasible. The online algorithm is run only if the offline LP is feasible. The online algorithm requires a threshold (see Section V-B). We run simulations with a low threshold (case I) and a high threshold (case II). The coding gain is defined as the ratio of the uncoded rate to the rate achieved by the system, where the uncoded rate is simply the total number of missing subfiles of all users normalized by . Fig. 9 (a) and Fig. 10 (a) depict plots of the coding gain vs. in centralized and decentralized cases, respectively. As decreases, the arrivals are spaced further apart on average, and the coding gain of any scheme is expected to reduce.
The coding gain is computed by taking an average over all instances where a given scheme is feasible. For the offline scheme, this means that we take the average of all instances where it is feasible. For the online algorithm, some of the arrival patterns may result in infeasibility; these instances were not taken into account when computing the average coding gain. This explains why the coding gain of the case II sometimes appears to be higher than the offline algorithm. However, the coding gain of the case I is significantly lower, because of its low threshold.
The feasibility probability of a scheme vs. the arrival rate is plotted in Fig. 9 (b) and Fig. 10 (b) for the centralized and decentralized placement schemes respectively. As expected the low threshold online algorithm has a very high feasibility probability for a range of arrival parameters, while the high threshold algorithm has a lower feasibility probability. Note that the high threshold algorithm (when compared to the low threshold case) only transmits an equation when a large enough number of users benefit from the transmission. Thus, its feasibility probability is lower, but when it is feasible, its coding gain is much higher than the low threshold case.
For both plots, we also include the results of [17]. In this scheme feasibility and coding gain can be traded off by setting a threshold for the defined misfit function (Section III in [17]). We use this scheme by setting the threshold to zero; this is the so-called First-Fit Rule in [17]. The First-Fit Rule prefers feasibility over coding gain. The setting in [17] considers a scenario where each subfile has a deadline. We have adapted their algorithm for our case. It can be observed that the feasibility probability of [17] is quite poor. Accordingly, we also plot the fraction of subfiles that meet the deadline; this is somewhat better. The coding gain numbers for [17] are also quite unreliable as the algorithm is infeasible in most cases. Thus, we do not plot them.
VI-C Scenario where individual subfiles have deadlines
The work of [17] considers a situation where each subfile has its own deadline. This is inspired by applications such as video delivery over the Internet. We emphasize that this setting can be captured by our techniques. In particular, suppose that each user requests a set of subfiles from the server where the subfile requests arrive at different times and each subfile has a different deadline. In this case, we can treat each subfile request of user as corresponding to a distinct virtual user whose cache content is the same as user . However, the requests of the users are different. In this situation, each virtual user has precisely one missing subfile. Thus, the issue of coding over the corresponding subfiles does not arise.
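The reduction to virtual users can be sketched with two small record types. The field names and the sample requests are illustrative assumptions; the only property taken from the text is that each subfile request becomes its own virtual user that inherits the cache of the requesting user and has exactly one missing subfile.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubfileRequest:
    user: int      # physical user issuing the request
    subfile: str   # label of the requested subfile
    arrival: int   # arrival time slot of this subfile request
    deadline: int  # deadline time slot of this subfile request

@dataclass(frozen=True)
class VirtualUser:
    virtual_id: int
    physical_user: int    # the cache content is inherited from this user
    missing_subfile: str  # exactly one missing subfile per virtual user
    arrival: int
    deadline: int

def to_virtual_users(requests):
    """Map every subfile request to a distinct virtual user."""
    return [VirtualUser(i, r.user, r.subfile, r.arrival, r.deadline)
            for i, r in enumerate(requests)]

subfile_requests = [
    SubfileRequest(user=1, subfile="W1_a", arrival=0, deadline=3),
    SubfileRequest(user=1, subfile="W1_b", arrival=2, deadline=6),
    SubfileRequest(user=2, subfile="W2_a", arrival=1, deadline=4),
]
virtual_users = to_virtual_users(subfile_requests)
```

After this mapping, the virtual users can be handed to the same scheduling machinery as ordinary users, with the cache lookup going through `physical_user`.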
Our setting is again one where , . Each file is subdivided into subfiles. Arrival times and deadlines are generated similar to the previous simulations with Poisson parameters and the deadlines are randomly chosen uniformly from with and . Similar to the previous experiments we run trials and at each trial, the cache content of each user is populated randomly and uniformly among all placement schemes with cache of size subfiles. Thus, different users might request different numbers of subfiles from the server. The only difference is that here each requested subfile has its own arrival time and deadline. The results are illustrated in Fig. 11. It can be observed that our proposed approach provides significantly superior coding gain and feasibility probability as compared to the work of [17].
VII Conclusions and Future Work
In this work, we considered the asynchronous coded caching problem where user requests (with deadlines) arrive at the main server at different times. We considered both offline and online versions of this problem. We demonstrated that under the assumption of all-but-one equations, the offline scenario can be optimally solved by a linear program (LP). Moreover, we presented a low-complexity solution to this LP based on dual decomposition. In contrast to the synchronous case and the offline scenario, we showed that the online scenario requires coding across missing subfiles of a given user. Furthermore, we presented an online algorithm that leverages the offline LP in a recursive fashion. Extensive simulation results indicate that our proposed algorithm significantly outperforms prior algorithms.
Our online algorithm considers the situation where there is no knowledge about future request arrival times and file identities; this corresponds to a worst-case scenario. It would be interesting to consider cases where there is statistical information available on the arrival times and file popularity and/or algorithms where these can be learned and to investigate how this knowledge can be used to further improve the performance of the algorithm. For instance, it may be possible to design better placement schemes under this knowledge. Throughout the paper, we implicitly considered the case when different users request different files. It may be interesting to adapt our techniques for the case of repeated requests.
-A Quadratic Projection and Primal Recovery in Dual Decomposition
For the projection of and to the constraint space we simply set and is obtained via the following quadratic optimization.
[TABLE]
In [23, Appendix I] an algorithm has been proposed to solve (10). This solution can be explained as follows. For fixed and for each , we sort so that . We take to be the minimum such that
[TABLE]
or let if such a doesn’t exist. Then if and zero otherwise.
The initial setting for the dual variables is chosen as , for , and for .
Primal Recovery: After solving the dual problem, the primal variables, i.e., ’s, are recovered by the method of [26] whereby
[TABLE]
where ’s are a sequence of convex combination weights for each non-negative integer , i.e., and for all . In [23], it has been shown that if the step size and convex combination weights are chosen so that
- for all and ,
- as , and
- as and for all for some ,
then is an optimal primal solution. Here and . Some sequences for and that satisfy the above conditions have been proposed in [23]. Among them we choose and where . Then, the primal solution will be updated as,
[TABLE]
-B Proof of Claim 1
Proof:
For simplicity, we prove the claim for and the proof for the general case follows directly. We will construct and variables for the offline LP from the decisions made in Algorithm 1. Note that we update the set with the user groups chosen in Algorithm 1. It is not difficult to verify that for any user group is a member of . Moreover, the algorithm assigns integer values to . Now, for any in (1), we set if and otherwise. Therefore, ’s take integer values. Since at each time only one equation is transmitted in Algorithm 1, the first condition holds for all .
For each we define to be the last time slot that user benefits from the equation transmitted by the server. Clearly we have that otherwise Algorithm 1 will be infeasible at . We let to be the user group associated with this equation where .
Note that Algorithm 1 tracks a set that contains all the user groups that have been used by the algorithm before time . We let , and with be the solution of (8) when solving it for . Then, for each with and for each we assign if and otherwise. We apply this assignment for all . Algorithm 1 assigns integer values to ’s. From Remark 4 it follows that there exists an integral solution for ’s and consequently the ’s as well. With these assignments, we now demonstrate that the second and third conditions in (1) hold.
For the second condition we note that if then and we have nothing to show. For we have that . Recall that is the solution of (8) at time . By the way that has been updated in Algorithm 1, we have . Therefore, we have and from (8) for ,
[TABLE]
For the third condition, consider any user and any . Recalling the definition of and , we know that which implies that in (8), we have
[TABLE]
where the last inequality comes from the second constraint in (8). The middle equality holds by counting arguments for missing subfiles and user groups in . To verify this, consider a bipartite graph in which the left and right nodes correspond to and with respectively. There is an edge between nodes corresponding to and if and only if . We let to be the label of this edge. By the definition of we know that implies . Therefore, outgoing edges from the node corresponding to are the edges between and the nodes . Similarly, the outgoing edges between node with are the edges between and . By counting in two ways, from the left and right nodes, we have the required equality. Therefore, we have that for any . This further implies that for all and completes the proof. ∎
-C Proof of Lemma 1
Proof:
For simplicity, in the discussion below we assume that . The proof for follows in a straightforward manner. By the way that and are updated in Algorithm 1, we have at each time . Furthermore, for all . Therefore, each user benefits from equations. For a , let represent the -th equation (the dependence on index is suppressed since we assume that ). User can recover from this equation since the missing subfiles , for and , exist in the cache of user .
For each user we define matrix whose rows and columns correspond to equation numbers in and missing subfiles in respectively. For , assume that -th equation is associated with user group , where . Then, the entry of for the row and column corresponding to and is if and zero otherwise. Therefore, if matrix is invertible then user can recover all the missing subfiles , for , from equations for . Thus, we need to show that the determinant of is nonzero for all with high probability.
Towards this end, let denote the determinant of ; we treat the as indeterminates at this point. Note that since Algorithm 1 did not return “INFEASIBLE”, we have a feasible integral solution for the corresponding offline LP (cf. Claim 1). Thus, there exists an interpretation of this solution (cf. Section IV-B) such that in each time slot, only one equation is transmitted, i.e., unlike a fractional solution, we do not need to potentially transmit multiple equations in the same time slot. This in turn implies that there is a setting for coefficients with such that the multivariate polynomial evaluates to a non-zero value over , i.e., is not identically zero. This further implies that is not identically zero. Now, since each appears only once in , its degree in the polynomial is one. Also, is a polynomial of degree , and thus is a polynomial of degree at most . Therefore, we can use Lemma 4 in [30] to show that by choosing ’s independently and uniformly at random from , the determinants of ’s, , are nonzero with probability at least .
When we will need to split a missing subfile into packets and code over these as well. Thus, the corresponding system of equations will be of size leading to the bound . ∎
- [1] M. Maddah-Ali and U. Niesen, “Fundamental limits of caching,” IEEE Trans. on Info. Th., vol. 60, no. 5, pp. 2856–2867, May 2014.
- [2] F. Arbabjolfaei, Y.-H. Kim et al., “Fundamentals of index coding,” Foundations and Trends® in Communications and Information Theory, vol. 14, no. 3-4, pp. 163–346, 2018.
- [3] H. Ghasemi and A. Ramamoorthy, “Asynchronous coded caching,” IEEE Intl. Symp. on Info. Th., pp. 2438–2442, 2017.
- [4] R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms and Applications. Prentice-Hall, 1993.
- [5] H. Ghasemi and A. Ramamoorthy, “Improved lower bounds for coded caching,” IEEE Trans. on Info. Th., vol. 63, no. 7, pp. 4388–4413, 2017.
- [6] Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, “The exact rate-memory tradeoff for caching with uncoded prefetching,” IEEE Trans. on Info. Th., vol. 64, no. 2, pp. 1281–1296, 2017.
- [7] Q. Yu, M. A. Maddah-Ali, and A. S. Avestimehr, “Characterizing the rate-memory tradeoff in cache networks within a factor of 2,” IEEE Trans. on Info. Th., vol. 65, no. 1, pp. 647–663, 2019.
- [8] L. Tang and A. Ramamoorthy, “Coded caching for networks with the resolvability property,” in IEEE Intl. Symp. on Info. Th., 2016.
