Minimizing Computation and Communication Costs of Two-Sided Secure Distributed Matrix Multiplication under Arbitrary Collusion Pattern
Jin Li, Nan Liu, Wei Kang

TL;DR
This paper presents a method to reduce computation and communication costs in secure distributed matrix multiplication while maintaining security and efficiency.
Contribution
The novel approach includes appending zeros to input matrices and using alternating optimization to minimize costs under arbitrary collusion.
Findings
Appending zeros helps overcome matrix splitting divisibility issues.
Alternating optimization improves performance compared to existing methods.
Simulation results show reduced costs and better feasibility.
Abstract
This paper studies the problem of minimizing the total cost, including computation cost and communication cost, in the system of two-sided secure distributed matrix multiplication (SDMM) under an arbitrary collusion pattern. In order to perform SDMM, the two input matrices are split into some blocks, blocks of random matrices are appended to protect the security of the two input matrices, and encoded copies of the blocks are distributed to all computing nodes for matrix multiplication calculation. Our aim is to minimize the total cost over all matrix splitting factors, the number of appended random matrices, and the distribution vector, while satisfying the security constraint of the two input matrices, the decodability constraint of the desired result of the multiplication, the storage capacity of the computing nodes, and the delay constraint. First, a strategy of appending zeros to the input…
Funding
- National Natural Science Foundation of China
- Research Fund of National Mobile Communications Research Laboratory, Southeast University
1. Introduction
With the development of the Internet of Things (IoT), the ubiquitous wireless devices can generate massive data via environment monitoring or target tracking [1]. However, due to the limited power or hardware architecture, these wireless devices cannot satisfy the data processing and computation requirements by themselves. This inspires wireless devices to seek help from online computing nodes who can assist in computation and data processing. Furthermore, distributed computing nodes can be employed to further accelerate the computation and data processing tasks, which means wireless devices can assign computation tasks to many different computing nodes, e.g., Apache Spark [2] and MapReduce [3]. On the other hand, if the online computing nodes are untrustworthy, we should also guarantee data security. Hence, how to perform computation of data with the aid of distributed computing nodes in a secure fashion is an important problem.
In this paper, we focus on the secure distributed matrix multiplication (SDMM) problem [4,5,6,7,8]. In [7,8], the trace-mapping framework was employed to achieve communication-efficient SDMM schemes. The authors of [9] proposed a model of SDMM from an information-theoretic perspective. The user wishes to compute the product of two input matrices A and B with the aid of distributed computing nodes while guaranteeing the security of the information about the two input matrices. Two cases are considered: one-sided security and two-sided security. In the first case, the user only wants to protect the information security of matrix A, and B is a public matrix known to all computing nodes [10]. In the second case, we need to consider the information security of both matrices A and B [9,11]. The information theft by the distributed computing nodes can be modeled by the collusion pattern, which has also been studied in problems of secret sharing [12] and private information retrieval [13,14]. Some of the existing literature has studied the SDMM problem under homogeneous collusion patterns, where up to l computing nodes may collude to obtain the information of the two input matrices [9,15,16,17,18]. To balance the tradeoff between the uplink and downlink cost, the work in [15] proposed two schemes based on secure cross subspace alignment. In [9], the authors characterized the fundamental limits of minimum communication overhead for the SDMM problem under a homogeneous collusion pattern. The work in [16] proposed a scheme based on polynomial codes on sub-tasks assigned to computing nodes, which can efficiently mitigate straggling effects. In [18], the authors adopted random matrices to encode the two input matrices in order to meet the security requirement. Then, many encoded copies are sent to different computing nodes for computation. Finally, the user receives the computation results from the computing nodes and recovers the product of the two input matrices. The work in [18] considered two cases: (1) encoding the input matrices without extra random matrices, i.e., the generalized polydot code, and (2) encoding the input matrices with some random matrices to satisfy the security constraint, i.e., the secure generalized polydot code. They also showed the superiority of the proposed scheme in terms of the recovery threshold, i.e., the number of computation results needed for the user to decode the desired result without error, and the communication load between the user and the computing nodes, i.e., the amount of information downloaded from the computing nodes.
Recently, rather than focusing on the homogeneous collusion pattern, ref. [19] studied the SDMM problem under the arbitrary collusion pattern. Considering two performance metrics, i.e., the normalized download cost and the normalized upload cost, they provided the optimal scheme for the one-sided SDMM problem and an achievable scheme for the two-sided SDMM problem.
Both the private information retrieval and SDMM problems considered in [14,19] deal with the non-homogeneous collusion pattern scenario. The common approach to these two problems is to assign different numbers of copies to different servers. Intuitively speaking, the servers that collude more will be assigned a lower number of copies. More specifically, in [14], the authors considered the ratio between the message size and the amount of downloaded information from the servers. Then, the work of [19] studied the SDMM problem under the arbitrary collusion pattern for a fixed matrix splitting factor, and different numbers of copies were distributed to different computing nodes based on the collusion pattern to minimize the normalized download and upload costs. However, the heterogeneity of the computing nodes in terms of storage capacity, communication capability, and computing capability was not taken into consideration. When full heterogeneity is taken into consideration, the number of copies assigned to each server depends not only on its colluding behavior but also on its storage capacity, communication capability, and computing capability. Furthermore, the fixed matrix splitting factor may affect the performance of SDMM. Hence, in this work, we study the problem of two-sided SDMM under an arbitrary collusion pattern with a flexible matrix splitting factor. Furthermore, in order to measure the communication and computation performance of the system, we propose a new performance metric called the total cost, which is composed of the computation cost and the communication cost. Additionally, the storage capability of the computing nodes and the delay requirement of the user are also considered. Then, an optimization problem is formulated by minimizing the total cost, subject to the security constraint of the two input matrices, the decodability constraint of the desired result of the multiplication, the storage capacity of the computing nodes, and the delay constraint. In order to overcome the divisibility problem of matrix splitting, we also propose a strategy of appending zeros to the input matrices and discuss the feasible set of some matrix splitting factors for the optimality of the problem. Finally, an alternating optimization (AO) algorithm based on some solvers is adopted to obtain a feasible solution, and some necessary conditions for the feasibility of the problem are provided.
The contributions of our paper are summarized as follows:
- We propose a new performance metric, the total cost, which includes communication cost and computation cost, to measure the performance of the SDMM problem under an arbitrary collusion pattern. Our aim is to minimize the total cost over all matrix splitting factors, the number of appended random matrices, and the distribution vector, while satisfying the security constraint of the two input matrices, the decodability constraint of the desired result of the multiplication, the storage capacity of the computing nodes, and the delay constraint.
- To overcome the divisibility problem of matrix splitting, we propose a strategy of padding zeros to the input matrices, which allows the input matrices to be split into an arbitrary number of blocks, unlike the scheme without appending zeros. Moreover, the value ranges of some matrix splitting factors are discussed for the optimality of the problem.
- The formulated optimization problem is solved by an AO algorithm based on some solvers. More specifically, for the optimization subproblem corresponding to the number of appended random matrices and the distribution vector, a relationship between the number of appended random matrices and the distribution vector can be found, so that the subproblem is transformed into an integer linear program over the distribution vector, which can be solved by the MATLAB function “intlinprog”. Furthermore, we also provide some necessary conditions to verify the feasibility of this subproblem. Then, for the optimization subproblem corresponding to all matrix splitting factors, by relaxing the ceiling functions, the subproblem can be transformed into an integer geometric programming problem, which is solved using “YALMIP”. Simulation results show that our proposed scheme with padding zeros is superior to the scheme without appending zeros and the scheme with no alternating optimization.
The rest of this paper is organized as follows: Section 2 introduces the system model of the two-sided SDMM under arbitrary collusion pattern. Section 3 proposes a zero-padding strategy, discusses the feasible set of some matrix splitting factors, and formulates an optimization problem. Section 4 provides the algorithm to solve the problem. Simulation results and conclusions are shown in Section 5 and Section 6, respectively.
Notation 1. In this paper, the following notations are used. denotes the set . represents the n-th column vector of the matrix . denotes the column vector. Positive integer is represented by , natural number is denoted by , and the ceiling function is denoted by .
2. System Model
As shown in Figure 1, we consider a user who wants to calculate the product of two input matrices A, of size T × S, and B, of size S × D, over a finite field. We suppose that T, S, and D are all integers and that the finite field is sufficiently large. Due to its own limited computational ability, the user wishes to split the two matrices A and B into many blocks and upload them to N computing nodes for computation. At the same time, both matrices A and B contain sensitive information, and the user does not want to leak any information to the N computing nodes.
We study the case where the computing nodes may collude with others to obtain information about the two matrices A and B. We represent the colluding behaviors by a collusion pattern, which contains M colluding sets. The computing nodes in the m-th colluding set may collude to obtain the information of the two matrices. We make the following two assumptions about the collusion pattern:
- (1) We only include the maximal colluding sets in the collusion pattern. For instance, a colluding set {3, 4, 5, 6} means that computing nodes 3, 4, 5, and 6 collude. This implies that computing nodes belonging to any subset of {3, 4, 5, 6} also collude. However, for ease of presentation, we do not include these subsets in the collusion pattern.
- (2) Every computing node must appear in at least one colluding set. This is because we assume that all computing nodes are curious, and no computing node can be trusted with the sensitive information of A and B.
A collusion pattern can be represented by its incidence matrix, of size N × M: if computing node i is in the j-th colluding set, the (i, j)-th element of the incidence matrix is 1, and it is 0 otherwise. For example, when , its incidence matrix is
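As a small illustration of the incidence-matrix representation described above, the following Python sketch builds the N × M matrix from a list of colluding sets and checks the two assumptions made about the collusion pattern. The function names and the five-node pattern are hypothetical and are not taken from the paper.

```python
def incidence_matrix(num_nodes, colluding_sets):
    """Build the N x M 0/1 incidence matrix; computing nodes are numbered 1..N."""
    matrix = [[0] * len(colluding_sets) for _ in range(num_nodes)]
    for j, group in enumerate(colluding_sets):
        for node in group:
            matrix[node - 1][j] = 1  # node belongs to the j-th colluding set
    return matrix


def only_maximal_sets(colluding_sets):
    """Assumption (1): no listed colluding set is a proper subset of another."""
    sets = [set(g) for g in colluding_sets]
    return not any(a < b for a in sets for b in sets)


def every_node_covered(matrix):
    """Assumption (2): every node appears in at least one colluding set."""
    return all(any(row) for row in matrix)


# Hypothetical pattern on N = 5 nodes, chosen only for illustration.
pattern = [{1, 2}, {2, 3}, {3, 4, 5}]
B_P = incidence_matrix(5, pattern)
for row in B_P:
    print(row)
print(only_maximal_sets(pattern), every_node_covered(B_P))
```

Each row corresponds to a computing node and each column to a colluding set, so a row of all zeros would flag a node that violates assumption (2).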
Due to the need to keep the two matrices secure, the user must encode before uploading them to the computing nodes for computation. Assume that there are encoded copies with , then these encoding functions are denoted as: We use and to represent the i-th encoded copy of matrices and , respectively, , i.e., The user distributes a subset of the encoded matrices to computing node n, where the indices of this subset are written as , . This is termed the upload phase.
The computing node n computes the product, i.e., , . Then, computing node n would send the computed results , back to the user. This is termed the download phase.
In order to ensure the security of matrices A and B, the following security constraint must be satisfied,
which indicates that the computing nodes in each colluding set, when putting their received copies together, cannot obtain any information about the two matrices.
In addition, the user must be able to decode the desired product from the answers received from all the computing nodes, i.e., the decodability constraint
must be satisfied.
2.1. Matrix Encoding Scheme
We use the secure generalized polydot code (SGPD) in [18] to encode the two input matrices. First, we split into blocks, while can be split into blocks, i.e.,
where T is divisible by t, S is divisible by s, and D is divisible by d. Then, is of size , and is of size , where we have defined
In view of the security constraint (2), we append some random matrices and as
where rows of random matrices are appended to matrix A, and columns of random matrices are appended to matrix B, where is a positive integer. Each element of the random matrices and is generated in an i.i.d. fashion according to the uniform distribution on . Note that (6) and (7) are just one way of appending random matrices. The other case is given by Method 2 in [19]. For simplicity, we only study the case of (6) and (7), and the other case of appending random matrices can be treated in a similar fashion.
In this case, the encoded matrices are generated according to
where , are distinct non-zero elements in , and we have defined .
The generated encoded copies of (8) and (9), i.e., , , will be distributed to the computing nodes, where computing node n will receive and , where is the index set of the encoded matrices distributed to computing node n. We assume that form a partition of the set , which means that each encoded copy will be distributed to one and only one computing node. Upon receiving and , computing node n will calculate , , and return to the user. We distribute the encoded matrices to the computing nodes in the following way. Let be the distribution vector where is the number of distributed encoded matrices given to the n-th computing node. Then, we have . It has been proved in [19] that when
the security constraint (2) is satisfied. The physical meaning of (10) is that the number of encoded matrices for computing nodes in every colluding set must be smaller than the minimal number of random matrices appended in or , which is . Furthermore, the decodability constraint (3) is guaranteed by the following inequality [19]:
It means that the encoded copies must be no smaller than for decoding the desired results without error.
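The two conditions above can be checked mechanically for a candidate distribution vector. The sketch below is a minimal illustration under stated assumptions: the security check uses the "no more than" form in which constraint (10) is restated in the proof of Lemma 2, and the recovery threshold on the right-hand side of (11) is passed in as a plain number. All function names and numerical values are hypothetical.

```python
def is_secure(distribution, colluding_sets, num_random):
    """Assumed form of (10): each colluding set jointly holds at most
    num_random encoded copies."""
    return all(
        sum(distribution[node - 1] for node in group) <= num_random
        for group in colluding_sets
    )


def is_decodable(distribution, recovery_threshold):
    """Assumed form of (11): the total number of encoded copies reaches
    the recovery threshold."""
    return sum(distribution) >= recovery_threshold


pattern = [{1, 2}, {2, 3}, {3, 4, 5}]  # hypothetical collusion pattern
J = [2, 1, 2, 1, 0]                    # hypothetical copies per node
print(is_secure(J, pattern, num_random=3))    # every set holds 3 copies
print(is_decodable(J, recovery_threshold=6))  # 6 copies in total
```

Giving node 5 zero copies is allowed by these two checks; whether a node is used at all is decided by the cost, storage, and delay terms introduced later.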
2.2. Storage, Communication and Computing Requirements of Each Computing Node
The amount of storage each encoded copy occupies is . Suppose computing node n’s storage capacity is , then, if , computing node n can not even store one encoded copy of and its corresponding answer, i.e., . If
then the computing node could store one encoded copy , , compute the multiplication , return the corresponding result, and then retrieve another encoded copy from the user for further computation. Hence, (12) must be satisfied for all . Written in vector form, we have
Suppose computing node n’s computation speed is multiplications per second, then the time it takes for computing node n to complete the computation assigned to it is
Further suppose that the uplink and downlink capacity between the user and computing node n are and symbols per second, respectively. Then, the amount of upload delay incurred at computing node n is
and the amount of download delay incurred at computing node n is
Then the total amount of delay incurred at computing node n when assigned with number of encoded copies is
where we have assumed that each computing node can only perform one of three actions at any time instant: compute, receive an upload, or send a download. This is also in line with the assumption that the computing nodes may not have enough memory to store all copies at once. Rather, each node receives one copy, computes the product, sends the result back to the user, and then retrieves the next copy and repeats.
Thus, the total delay incurred for this computation is
and we require that the total delay is no larger than a given threshold , i.e.,
Besides the delay constraint, cost should also be considered for efficient SDMM. More specifically, the cost we consider comprises the computation cost of the computing nodes and the data transmission cost, where the data transmission cost can be divided into the upload and download transmission costs. We assume that the upload and download transmission cost for computing node n is , and per symbol, and the computation cost of each multiplication at computing node n is , then the total required cost for the user doing the secure matrix multiplication of matrices A and B is
where is the upload cost, which is given by
is the download cost, which is given by
and is the computation cost, which is given by
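To make the delay and cost model of this subsection concrete, here is a minimal Python sketch. It rests on assumptions that should be checked against the original equations: each encoded copy pair consists of one block of A of size ⌈T/t⌉ × ⌈S/s⌉ and one block of B of size ⌈S/s⌉ × ⌈D/d⌉, a block product takes ⌈T/t⌉⌈S/s⌉⌈D/d⌉ multiplications, a node handles its copies one after another, and the total delay is the largest per-node delay because the nodes work in parallel. All names and numbers are hypothetical.

```python
def per_copy_load(T, S, D, t, s, d):
    """Symbols uploaded, symbols downloaded, and multiplications per copy pair."""
    rows = -(-T // t)    # ceil(T / t)
    inner = -(-S // s)   # ceil(S / s)
    cols = -(-D // d)    # ceil(D / d)
    upload = rows * inner + inner * cols  # one encoded block of A and one of B
    download = rows * cols                # one block product
    mults = rows * inner * cols           # schoolbook block multiplication
    return upload, download, mults


def node_delays(J, load, up_rate, down_rate, speed):
    """Per-node delay: copies are uploaded, multiplied, and returned in sequence."""
    upload, download, mults = load
    return [
        j * (upload / u + mults / v + download / w)
        for j, u, w, v in zip(J, up_rate, down_rate, speed)
    ]


def total_cost(J, load, c_up, c_down, c_comp):
    """Upload cost + download cost + computation cost, summed over nodes."""
    upload, download, mults = load
    return sum(
        j * (cu * upload + cd * download + cc * mults)
        for j, cu, cd, cc in zip(J, c_up, c_down, c_comp)
    )


load = per_copy_load(T=4, S=4, D=4, t=2, s=2, d=2)
J = [1, 2]  # hypothetical distribution over two nodes
delays = node_delays(J, load, up_rate=[4, 8], down_rate=[4, 4], speed=[8, 8])
print(load, delays, max(delays))
print(total_cost(J, load, c_up=[1, 1], c_down=[1, 1], c_comp=[1, 2]))
```

Because both the delay and the cost are linear in the number of copies per node once the block sizes are fixed, the distribution vector can later be optimized with a linear solver.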
2.3. Problem Formulation
In this work, we would like to jointly optimize the distribution vector , and the matrix split parameter such that the cost of the user, defined in (17), is minimized. At the same time, the security constraint (10), the decodability constraint (11), the storage constraint (13), and the delay constraint (16) must be satisfied.
3. The Feasible Set of (t,s,d)
Since we are splitting the two matrices A and B as shown in (4), it is natural to assume that t, s, and d have to take values such that T, S, and D are divisible by t, s, and d, respectively. For example, if , t can only take values in the set , because is not divisible by . However, this significantly limits the values that (t, s, d) can take and may lead to a high cost for the user.
In this section, we propose a better and more general way as follows: we allow any values, and to make the matrix splittable, we append zeros to the original matrix, i.e., append columns and rows to the matrix and append rows and columns to the matrix , such that , , and are integers. This increases the dimension of the two matrices but enables us to split them into blocks in a more flexible way. For example, , i.e., , and we would like to take . However, is not divisible by . Then, we can append one row of zeros to so that the appended matrix has dimension and thus can be divisible by .
More generally, we propose that for any with , we may append many rows to the bottom of matrix and many columns to the right side of matrix . Similarly, we append many rows to the bottom of matrix and many columns to the right side of matrix . As a result, instead of (5), we have
As can be seen, not padding zeros and only using that is a divisor of is a special case.
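The zero-padding idea can be illustrated with a short, self-contained Python sketch. It pads both matrices to the next multiples of the chosen splitting factors, multiplies the padded matrices, and crops the result back to T × D, which recovers the original product because the appended entries are all zero. The helper names are hypothetical, and ordinary integer arithmetic is used in place of finite-field arithmetic for readability.

```python
def padded_size(n, parts):
    """Smallest multiple of parts that is at least n."""
    return -(-n // parts) * parts


def pad_zeros(matrix, rows, cols):
    """Append zero columns on the right and zero rows at the bottom."""
    padded = [list(row) + [0] * (cols - len(row)) for row in matrix]
    padded += [[0] * cols for _ in range(rows - len(matrix))]
    return padded


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def product_via_padding(A, B, t, s, d):
    """Pad A and B so that t, s, d divide their dimensions, multiply, and crop."""
    T, S, D = len(A), len(B), len(B[0])
    Tp, Sp, Dp = padded_size(T, t), padded_size(S, s), padded_size(D, d)
    Ap, Bp = pad_zeros(A, Tp, Sp), pad_zeros(B, Sp, Dp)
    full = matmul(Ap, Bp)
    cropped = [row[:D] for row in full[:T]]
    return cropped, (Tp // t, Sp // s, Dp // d)


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # T = 3, S = 3
B = [[1, 0], [0, 1], [2, 2]]           # S = 3, D = 2
C, block_dims = product_via_padding(A, B, t=2, s=2, d=1)
print(C == matmul(A, B), block_dims)
```

Here T = 3 is not divisible by t = 2, so one zero row is appended to A; the block dimensions become the rounded-up quotients, matching the role of the ceiling functions in the formulation.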
Since we are considering padding zeros, it is also possible to have or or . We show in the next lemma that this will only increase the cost at the user, defined in (17), for and .
Lemma 1. To minimize the cost at the user, i.e., (17), it is sufficient to consider and .
Proof. For and , the only decodability constraint (11) becomes relaxed, and other constraints are unchanged. In this case, we can prove that the optimal cost will increase compared to and . Please refer to Appendix A for detailed proof. □
Remark 1. The case of s is different from the cases of t and d. When is fixed, from security constraint (10) and decodability constraint (11), we see that on one hand, increasing s increases the number of blocks, but on the other hand, it also relaxes the security constraint. When computing nodes are heterogeneous, i.e., computing nodes have different computation cost, upload transmission cost and download transmission cost, the increase in s does not necessarily increase the total cost, because due to the more relaxed security constraint, we can distribute more blocks to computing nodes with lower costs. As a result, when we apply the strategy of appending zeros, the optimal s may not take values in .
After the above discussions, the problem described in Section 2.3 can be formally formulated as
where (19b) provides the security constraint and decodability constraint, (19c) is the storage constraint with N computing nodes’ storage capacity vector defined as , and (19d) is the delay constraint. In the cost function (19a), we have defined the upload transmission cost vector, the download transmission cost vector and the computation cost vector of N computing nodes as , and , respectively. Note that the scheme of appending zeros makes the dimension of every block in and to be and , respectively, as indicated by (19e). Furthermore, note that in (19f), while the values of t and d are limited to and , respectively, the value of s does not have an upper bound due to Remark 1.
4. Algorithm Design
Due to the coupled variables, the integer constraints, and the nonlinear constraints and objective function of the problem in (19), it is hard to find a globally optimal or suboptimal solution. In the following, we propose an algorithm to obtain a feasible solution.
The coupled variables in Problem (19) motivate us to utilize the alternating optimization (AO) technique. Then, a feasible solution to Problem (19) can be obtained by solving the following two subproblems: one fixes (t, s, d) to optimize (J, lΔ), and the other optimizes (t, s, d) given (J, lΔ).
4.1. Optimization Subproblem of (J,lΔ) for a Fixed (t,s,d)
In this subsection, for a fixed , the optimization subproblem of Problem (19) corresponding to is given as
Note that when is fixed, the corresponding is also fixed according to (19e). Further note that when is fixed, the objective function (20a) is only a function of , and not . Due to the fact that , the inequality of (11) must be satisfied with the equality when is optimal. Hence, (11) can be rewritten as follows
With equality (21), lΔ can be expressed as a function of J. Then, substituting this expression for lΔ into Problem (20), the problem can be reformulated as
Problem (22) is an integer linear programming problem with only one optimizing variable, J. This problem can be solved using the MATLAB function “intlinprog”. The “intlinprog” function, which is part of MATLAB’s Optimization Toolbox, is based on the branch and bound (BnB) algorithm and the interior point method [20,21] and is typically used to solve integer linear programming problems, such as the one in (22).
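To show the structure of such an integer program over the distribution vector, the following Python sketch replaces the solver with plain exhaustive enumeration, which is only practical for tiny instances. It is a simplified stand-in and not the formulation in (22): the per-node cap `max_copies` is a hypothetical proxy for the storage and delay constraints, and `num_random_for` is a hypothetical function that plays the role of equality (21) by returning the number of appended random matrices implied by the total number of copies.

```python
from itertools import product


def solve_distribution(costs, colluding_sets, max_copies, num_random_for):
    """Enumerate all distribution vectors and keep the cheapest feasible one."""
    best = None
    for J in product(*(range(m + 1) for m in max_copies)):
        l_delta = num_random_for(sum(J))
        if l_delta is None:
            continue  # this total number of copies is not admissible
        if any(sum(J[n - 1] for n in group) > l_delta for group in colluding_sets):
            continue  # some colluding set would hold too many copies
        cost = sum(c * j for c, j in zip(costs, J))
        if best is None or cost < best[0]:
            best = (cost, J, l_delta)
    return best


def toy_relation(total_copies):
    """Hypothetical link between copies and random matrices: K = l + 2, l >= 1."""
    return total_copies - 2 if total_copies >= 3 else None


best = solve_distribution(
    costs=[1, 2, 3],                 # hypothetical per-copy cost of each node
    colluding_sets=[{1, 2}, {2, 3}],
    max_copies=[3, 3, 3],
    num_random_for=toy_relation,
)
print(best)
```

In this toy instance the cheapest node cannot take all copies, because every colluding set it belongs to is capped, so the solution spreads copies onto a more expensive node.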
For certain system parameters and values, Problem (22) is not feasible. To identify a necessary condition for the feasibility of Problem (22), we have the following lemma.
Before presenting the lemma, we define a variable p as the smallest number of colluding sets that contain all computing nodes. For example, for the collusion pattern represented by incidence matrix (1), p is equal to 3, because three colluding sets, i.e., , include all computing nodes, and any 2 colluding sets in the collusion pattern can not include all computing nodes.
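The quantity p defined above is the size of a minimum cover of the computing nodes by colluding sets. For small collusion patterns it can be found by trying subsets of increasing size, as in the following Python sketch. The function name and the example patterns are hypothetical, and the brute-force search is only meant for illustration.

```python
from itertools import combinations


def min_cover_size(num_nodes, colluding_sets):
    """Smallest number of colluding sets whose union contains nodes 1..N."""
    all_nodes = set(range(1, num_nodes + 1))
    for size in range(1, len(colluding_sets) + 1):
        for subset in combinations(colluding_sets, size):
            if set().union(*subset) == all_nodes:
                return size
    return None  # some node is in no colluding set


# Hypothetical pattern on 6 nodes: three disjoint pairs are needed.
print(min_cover_size(6, [{1, 2}, {3, 4}, {5, 6}, {2, 3}]))
# Hypothetical pattern on 5 nodes: {1, 2} and {3, 4, 5} already cover all nodes.
print(min_cover_size(5, [{1, 2}, {2, 3}, {3, 4, 5}]))
```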
Lemma 2. For fixed parameters , if Problem (20) is feasible, the following inequalities must be satisfied:
where Y is defined as
where satisfies (19e). Variable p is defined as the smallest number of colluding sets that contain all computing nodes.
Proof. Let us first derive a lower bound of . According to the second assumption made about the collusion pattern in Section 2, every computing node must appear in at least one colluding set. So, we have
for any which is a subset of colluding sets in that include all computing nodes, i.e., for some for any . For example, in the collusion pattern represented by the incidence matrix in (1), may be , , , or .
The constraint (10) can be rewritten as
where is the m-th colluding set. Inequality (25) shows that the total number of encoded matrices received by the computing nodes in every colluding set cannot exceed the number of random matrices. Hence, from (25), we have
Thus, from (21), (24), and (26), we have
When is satisfied, must satisfy
On the other hand, when is not satisfied, there exists no feasible .
Next, we derive an upper bound on . We have
where (28) follows from (21), and (29) follows from (20c). Hence, an upper bound on is given by
where Y is as defined in (23).
If Problem (20) is feasible, we must have that , and the upper bound of in (30) must be greater than or equal to the lower bound of in (27).
Hence, the proof is complete. □
Based on Lemma 2, Algorithm 1 is proposed to solve Problem (20), where we check the necessary conditions of the feasibility of Problem (20) before solving it using the MATLAB “intlinprog” function.
Algorithm 1 Iterative Algorithm for Problem (20)
Input: , and calculated according to (19e)
Output: .
- 1: if and then
- 2: Solve Problem (22) with MATLAB function “intlinprog”.
- 3: else
- 4: Problem (20) is infeasible.
- 5: end if
4.2. Optimization Subproblem of (t,s,d) for a Fixed (J,lΔ)
In this subsection, given , the optimization subproblem of Problem (19) corresponding to is formulated as
where constraint (31d) is derived from Lemma 2.
The ceiling functions in (19e) and the integer constraint (31e) make Problem (31) hard to address. We can solve this subproblem by relaxing the ceiling functions, i.e., Problem (31) can be recast as
Problem (32) is an integer geometric programming problem, which can be solved directly using the MATLAB toolbox YALMIP. YALMIP is a modeling toolbox for MATLAB rather than a built-in function; the solution approach it applies here is based on the interior point method and the BnB algorithm [22,23] and is typically used to solve integer geometric programming problems, such as the one in (32).
4.3. The Proposed Alternating Optimization (AO) Algorithm
Based on the above discussions of the two subproblems, we propose an AO algorithm as follows: In every AO iteration, for a fixed , and , which is calculated as , we use Algorithm 1 to solve (22). Then, for the output of Algorithm 1, we solve Problem (32) with YALMIP directly and obtain .
Since Problem (32) is obtained by the relaxation of the ceiling functions, we may face the problem where even though the found by YALMIP are integers, which we call , the corresponding , and in Problem (32) may not be integers. In order to overcome this problem, for the converged solution of the AO, we check whether constraints (19c) and (19d) in the original problem (19) are satisfied according to the definition of (19e). If they are, is taken as the solution to Problem (31), and the corresponding block dimensions are taken to be by padding zeros. If they are not, then we can employ an exhaustive search within a neighborhood near the converged solution for a feasible solution or restart the algorithm with a new random initial point . In a time-constrained system, we can also abandon optimizing and simply use the initial values to obtain a timely solution.
Finally, the proposed AO algorithm to solve Problem (19) is summarized in Algorithm 2. (The source code can be found in the following link: https://github.com/SendBullet/SDMM-opt (accessed on 14 March 2024))
Algorithm 2 Alternating Optimization Algorithm for Problem (19)
- 1:Initialize and the tolerance , where are chosen from divisors of randomly.
- 2:repeat
- 3: Given , calculate by Algorithm 1.
- 4: Given , calculate by solving Problem (32) with YALMIP directly.
- 5: Set .
- 6:until The fractional change of the objective function of Problem (19) is less than .
- 7:if Constraints (19c), (19d) and (19e) are satisfied simultaneously then
- 8: Output and block dimension .
- 9:else
- 10: Return to step 1 and restart with a new random initial point.
- 11:end if
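The control flow of the alternating procedure above can be sketched generically in Python. The skeleton below takes the two subproblem solvers as callbacks and stops when the relative change of the objective falls below a tolerance. The toy objective and the grid-search solvers are hypothetical stand-ins for the solver-based steps, and the final feasibility check and random restart are left out for brevity.

```python
def alternating_optimization(split, best_distribution, best_split, objective,
                             tol=1e-3, max_iter=100):
    """Alternate between the two subproblems until the objective settles."""
    previous = None
    for _ in range(max_iter):
        distribution = best_distribution(split)  # subproblem for a fixed split
        if distribution is None:
            return None                          # subproblem infeasible
        split = best_split(distribution)         # subproblem for a fixed distribution
        value = objective(split, distribution)
        if previous is not None and abs(previous - value) <= tol * abs(previous):
            break
        previous = value
    return split, distribution, value


# Toy stand-in problem on an integer grid (not the SDMM problem itself).
def objective(x, y):
    return (x - y) ** 2 + (y - 5) ** 2


grid = range(11)
result = alternating_optimization(
    0,
    best_distribution=lambda x: min(grid, key=lambda y: objective(x, y)),
    best_split=lambda y: min(grid, key=lambda x: objective(x, y)),
    objective=objective,
)
print(result)
```

On this toy problem the iteration stops at a point where neither coordinate can be improved on its own, even though a better joint solution exists at x = y = 5. This mirrors the statement that the AO algorithm yields a feasible solution rather than a guaranteed global optimum.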
4.4. Complexity Analysis
The complexity of Algorithm 2 per iteration mainly lies in Steps 3 and 4. In Step 3, the complexity of Algorithm 1 is derived from solving Problem (20) by MATLAB function “intlinprog” which uses BnB method. By omitting the lower-order terms, the main complexity of Algorithm 1 per iteration is , where N is the number of computing nodes. In Step 4, similarly, the main complexity of solving Problem (32) by YALMIP with BnB method is , where 3 is the dimension of optimizing variables [24]. Hence, by neglecting the lower-order terms, the approximate computational complexity of Algorithm 2 per iteration is when , and when . As can be seen, the complexity scales exponentially with the number of computing nodes.
5. Simulation Results
In this section, we provide simulation results to evaluate the performance of the two-sided SDMM under arbitrary collusion pattern. We consider two collusion patterns. The first one has N = 11 computing nodes with the collusion pattern being , while the second one consists of N = 20 computing nodes with the collusion pattern being . So, for these two collusion patterns, the smallest number of colluding sets containing all of the computing nodes is and , respectively. The stopping criterion in Algorithm 2 is set to [25]. Other system parameters are listed in Table 1.
For simplicity, our proposed scheme in this paper is denoted by “ ”. Then, the following two benchmarks are considered to compare with our proposed scheme:
- (1)“ ”: In this scenario, we do not append zeros to the input matrices. The optimization subproblem corresponding to for a fixed is solved by exhaustive search in feasible pairs , which are divisors of . Other details are similar to Algorithm 2. This corresponds to the optimal performance of AO when no zeros are appended.
- (2)“ ”: First, is initialized by divisors of randomly. Then, we solve Problem (22) to obtain . This is a low complexity algorithm where no zeros are appended, and also, are randomly chosen without being optimized. Only are optimized for the fixed randomly chosen .
First, we consider the collusion pattern and the number of computing nodes being 11. Figure 2 shows the total cost versus the number of rows of matrix A, i.e., T. Firstly, with the increase in T, the total cost of all schemes increases, which is caused by the growth of the input matrix dimension. Secondly, our proposed algorithm outperforms the “ ” scheme when . This means that when , it is better to append zeros to the matrices to obtain a lower cost. Thirdly, our proposed algorithm always performs better than the “ ” scheme, which demonstrates the necessity of both appending zeros and performing AO. Lastly, from the comparison between and , we can observe that the cost of the proposed scheme increases with the size of the matrices.
Figure 3 plots the total cost versus the number of columns of matrix B, i.e., D. Similar to Figure 2, the difference between our proposed scheme and the “ ” scheme becomes larger with the increase in the dimensions of the input matrices. However, the “ ” scheme achieves the same total cost as our proposed algorithm. This shows that, in this case, there is no need to pad zeros. Though the proposed scheme and the “ ” scheme have the same performance, the proposed scheme has lower complexity because it avoids the exhaustive search of the “ ” scheme.
Figure 4 illustrates the total cost versus the number of columns of matrix A, i.e., S, which is also the number of rows of matrix B. Although the total cost of our proposed scheme is the same as that of the “ ” scheme for some S values, the gain of our proposed algorithm over the “ ” scheme increases with S. In fact, the gain is very significant for large S values. For example, when , the total cost incurred by the proposed scheme is only 55.77% of that of the “ ” scheme when and 55.07% when .
Figure 5, Figure 6 and Figure 7, respectively, depict the total cost with respect to T, S, and D when the number of computing nodes is 20. Similarly, our proposed scheme strictly outperforms the other two benchmarks in some cases, which further shows the superiority of the proposed scheme. Comparing Figure 2, Figure 3 and Figure 4 with Figure 5, Figure 6 and Figure 7, respectively, we see that the total cost decreases significantly with the increase in the number of computing nodes. Thus, when possible, the user should utilize more computing nodes to reduce the total cost.
6. Conclusions
In this paper, we investigated the problem of minimizing the total cost, comprising the computation cost and the communication cost, in a two-sided SDMM system under an arbitrary collusion pattern. To realize SDMM, we split the two input matrices into blocks and appended extra blocks of random matrices to guarantee the security of the two input matrices. The matrix multiplication was then calculated at the computing nodes from the encoded copies. Our aim was to minimize the total cost while satisfying the security constraint of the two input matrices, the decodability constraint of the desired result of the multiplication, the storage capacity of the computing nodes, and the delay constraint. The distribution vector, the number of appended random matrices, and all matrix splitting factors were optimized. To overcome the divisibility problem of matrix splitting, we first proposed a strategy of appending zeros to the two input matrices and then discussed the value ranges of some matrix splitting factors for the optimality of the problem. Next, an AO algorithm was provided to obtain a feasible solution. Furthermore, some necessary conditions were provided to verify the feasibility of the proposed optimization problem. Numerical results demonstrated that our proposed scheme achieves a lower total cost than the scheme without appending zeros and the scheme without AO.
