Random Sampling for Distributed Coded Matrix Multiplication

Wei-Ting Chang; Ravi Tandon

arXiv:1905.06942·cs.IT·May 17, 2019


TL;DR

This paper explores the use of random sampling combined with coding techniques to perform approximate distributed matrix multiplication efficiently, balancing recovery threshold and approximation error.

Contribution

It introduces two novel coded randomized sampling schemes that leverage coding and randomization for approximate matrix multiplication in distributed systems.

Findings

01

Tradeoffs between recovery threshold and approximation error are characterized.

02

Proposed schemes achieve robustness to stragglers with controlled approximation.

03

The methods improve efficiency in large-scale matrix computations.

Abstract

Matrix multiplication is a fundamental building block for large scale computations arising in various applications, including machine learning. There has been significant recent interest in using coding to speed up distributed matrix multiplication in a way that is robust to stragglers (i.e., machines that may perform slower computations). In many scenarios, instead of exact computation, approximate matrix multiplication, i.e., allowing for a tolerable error, is sufficient. Such approximate schemes make use of randomization techniques to speed up the computation process. In this paper, we initiate the study of approximate coded matrix multiplication, and investigate the joint synergies offered by randomization and coding. Specifically, we propose two coded randomized sampling schemes that use (a) codes to achieve a desired recovery threshold and (b) random sampling to obtain approximation…

Tables

TABLE I: The normalized empirical errors, where the bolded values indicate the best scheme for each $K$.

| Recovery threshold | Independent sampling (uniform) | Independent sampling (optimal) | Set-wise sampling (uniform) | Set-wise sampling (optimal) |
|---|---|---|---|---|
| $K = 1$ | $3.1314$ | $3.0917$ | $3.1155$ | $3.0972$ |
| $K = 3$ | $1.5679$ | $1.5349$ | $1.0409$ | $1.0337$ |
| $K = 5$ | $1.0545$ | $1.0489$ | $0.3468$ | $0.3463$ |
| $K = 7$ | $0.8105$ | $0.7633$ | $\mathbf{0}$ | $\mathbf{0}$ |

Equations

$$A=\begin{bmatrix}A_{0}&A_{1}\end{bmatrix},\quad B=\begin{bmatrix}B_{0}\\B_{1}\end{bmatrix},$$

$$AB=A_{0}B_{0}+A_{1}B_{1}.$$

$$\widetilde{A}_{n}=A_{0}+x_{n}A_{1},\quad\widetilde{B}_{n}=x_{n}B_{0}+B_{1},$$

$$Z_{n}=\widetilde{A}_{n}\widetilde{B}_{n}=A_{0}B_{1}+(A_{0}B_{0}+A_{1}B_{1})x_{n}+A_{1}B_{0}x_{n}^{2},\quad n=0,1,2,$$

$$\widetilde{A}_{n}=\sum_{\ell=0}^{s-1}\frac{A_{q_{\ell}}x_{n}^{\ell}}{\sqrt{cP_{\mathcal{S}}}},\quad\widetilde{B}_{n}=\sum_{\ell^{\prime}=0}^{s-1}\frac{B_{q_{\ell^{\prime}}}x_{n}^{s-1-\ell^{\prime}}}{\sqrt{cP_{\mathcal{S}}}},$$

$$h(x_{n_{k}})=\frac{1}{cP_{\mathcal{S}}}\sum_{\ell=0}^{s-1}\sum_{\ell^{\prime}=0}^{s-1}A_{q_{\ell}}B_{q_{\ell^{\prime}}}x_{n_{k}}^{\ell+s-1-\ell^{\prime}},$$

$$\mathbb{E}[\|AB-\hat{A}\hat{B}_{\mathcal{S}}\|_{F}^{2}]=\frac{\left(\sum_{\mathcal{S}}\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F}\right)^{2}}{c^{2}}-\|AB\|_{F}^{2},$$

$$\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]=\frac{1}{c}\sum_{\mathcal{S}}P_{\mathcal{S}}\sum_{q\in\mathcal{S}}\frac{(A_{q}B_{q})_{ij}}{P_{\mathcal{S}}}=(AB)_{ij},$$

$$\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}^{2}]=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}}{P_{\mathcal{S}}}.$$

$$\text{Var}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}}{P_{\mathcal{S}}}-(AB)_{ij}^{2}.$$

$$\mathbb{E}[\|AB-\hat{A}\hat{B}_{\mathcal{S}}\|_{F}^{2}]=\sum_{i}\sum_{j}\text{Var}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F}^{2}}{P_{\mathcal{S}}}-\|AB\|_{F}^{2},$$

$$h(x)=\sum_{t=0}^{s-1}\sum_{t^{\prime}=0}^{s-1}\frac{A_{q_{t}}B_{q_{t^{\prime}}}x^{t+s-1-t^{\prime}}}{s\sqrt{P_{q_{t}}P_{q_{t^{\prime}}}}},$$

$$\mathbb{E}[\|AB-\hat{A}\hat{B}\|_{F}^{2}]=\frac{1}{s}\left(\sum_{q=0}^{m-1}\|A_{q}B_{q}\|_{F}\right)^{2}-\frac{1}{s}\|AB\|_{F}^{2}.$$


Full text

Random Sampling for Distributed Coded Matrix Multiplication

Wei-Ting Chang Ravi Tandon

Department of Electrical and Computer Engineering

University of Arizona, Tucson, AZ, USA

E-mail: {wchang, tandonr}@email.arizona.edu

Abstract

Matrix multiplication is a fundamental building block for large scale computations arising in various applications, including machine learning. There has been significant recent interest in using coding to speed up distributed matrix multiplication in a way that is robust to stragglers (i.e., machines that may perform slower computations). In many scenarios, instead of exact computation, approximate matrix multiplication, i.e., allowing for a tolerable error, is sufficient. Such approximate schemes make use of randomization techniques to speed up the computation process. In this paper, we initiate the study of approximate coded matrix multiplication, and investigate the joint synergies offered by randomization and coding. Specifically, we propose two coded randomized sampling schemes that use (a) codes to achieve a desired recovery threshold and (b) random sampling to obtain an approximation of the matrix multiplication. Tradeoffs between the recovery threshold and the approximation error obtained through random sampling are investigated for a class of coded matrix multiplication schemes.

Keywords – Matrix multiplication, Random sampling, Coded Distributed Computing

I Introduction

This work was supported by NSF Grant CAREER 1651492.

Matrix multiplication is one of the most fundamental building blocks for applications in fields such as signal and image processing, machine learning, optimization and wireless communications. Outsourcing the computation to distributed machines has become a preferred way to speed up the process when dealing with large scale data. However, distributed systems suffer from the straggler effect, where the slowest worker(s) can limit the speed-ups offered by distributed computation.

In order to mitigate the impact of stragglers, the idea of using coded distributed computation has gained significant recent interest. In general, these codes are used to introduce redundancy into the computations. For example, by applying one of the simplest codes, repetition codes, one can let multiple machines work on the same computation. One can then obtain the desired result as soon as the fastest machine finishes the assigned task. Much more efficient codes have also been applied to distributed computing problems, and significant recent progress has been made on understanding the additional speed-ups gained by mitigating stragglers using codes. Several codes that are particularly efficient for distributed matrix multiplication include Polynomial codes, MatDot codes and Lagrange codes [1, 2, 3, 4]. These codes add redundancy in such a way that one can obtain the desired result from the responses of an arbitrary subset of machines. The smallest number of machines that allows perfect recovery of the computation is referred to as the recovery threshold.

In contrast to adding redundancy, another methodology to speed up matrix multiplication comes from the idea of randomization. By allowing some tolerable error in the computation, randomized algorithms can provide speed-ups by working on matrices of smaller dimensionality. However, the randomization techniques must be carefully designed, in order to provide guarantees on the error. Random sampling and random projection are two commonly used techniques for this purpose. Random sampling algorithms sample either the columns or rows from the original matrix to construct sketches of original matrices, and the subsequent task is performed on sketched matrices. The key to a good sampling scheme is to carefully design what to sample, since not all columns/rows carry the same amount of information. Several works on random sampling include [5, 6, 7, 8, 9, 10]. Random projection algorithms construct the sketch matrix by projecting the original matrix to a vector space with a lower dimension. Projection algorithms are typically designed to have good distance preserving properties (Johnson-Lindenstrauss lemma [11, 12]), and have been investigated in various works [11, 12, 13, 14, 15, 16].

**Main Contributions:** In this paper, we explore the synergies between coding and randomization, and study the tradeoffs between reconstruction error and recovery threshold for distributed matrix multiplication. To this end, we devise two novel coded sampling schemes that can achieve various levels of speed-up depending on how well one wishes to approximate the desired result. For the scope of this paper, we focus on MatDot codes [3], and design sampling strategies tailored to these codes. We present a family of coded sampling schemes, which sample a subset of columns/rows from the matrices, followed by application of MatDot codes to the sampled matrices. We analyze two sampling strategies: one where the sampling of rows/columns is done independently (with replacement), and one where we sample a subset of rows/columns (without replacement).

We show that if the matrices $A,B$ to be multiplied are divided into $m$ parts (for details, see Section IV), then for any integer $1\leq s\leq m$, a recovery threshold of $K=2s-1$ is achievable. Moreover, the expected approximation errors of the proposed coded sampling schemes for a recovery threshold of $K=2s-1$ are as follows: (a) $\mathbb{E}[\|AB-\hat{A}\hat{B}_{\mathcal{S}}\|_{F}^{2}]=(\sum_{\mathcal{S}}\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F})^{2}/c^{2}-\|AB\|_{F}^{2}$, where $\mathcal{S}$ with $|\mathcal{S}|=s$ denotes the set of $s$ sampled indices and $c=\binom{m}{s}\cdot s/m$, when the coded set-wise sampling scheme is used; and (b) $\mathbb{E}[\|AB-\hat{A}\hat{B}\|_{F}^{2}]=(\sum_{q=0}^{m-1}\|A_{q}B_{q}\|_{F})^{2}/s-\|AB\|_{F}^{2}/s$ when the coded independent sampling scheme is used. These results reveal a tradeoff between recovery threshold and approximation error, i.e., a lower recovery threshold can be obtained by allowing reconstruction error.

II System Model

We consider a distributed system which consists of a master and $N$ workers. Each worker is connected to the master through a separate link. The goal of the master is to approximate the matrix product $AB$, where $A\in\mathbb{F}^{d_{1}\times d_{2}}$ and $B\in\mathbb{F}^{d_{2}\times d_{3}}$ for some sufficiently large field $\mathbb{F}$, using $N$ workers in the presence of stragglers. We note that, depending on the computation strategy used, the master may not need to wait for all $N$ workers to recover the approximation of $AB$. The smallest number of workers needed to recover the approximation is referred to as the recovery threshold $K$.

To tolerate stragglers, the master encodes $A$ and $B$ separately, and workers multiply the encoded versions of $A$ and $B$ . The encoding functions used are $\bm{f}=(\mathit{f}_{0},\cdots,\mathit{f}_{N-1})$ and $\bm{g}=(\mathit{g}_{0},\cdots,\mathit{g}_{N-1})$ , where $\mathit{f}_{n}$ and $\mathit{g}_{n}$ are the encoding functions for worker $n$ . Specifically, the encoded matrices for worker $n$ are $\widetilde{A}_{n}$ and $\widetilde{B}_{n}$ , where $\widetilde{A}_{n}=\mathit{f}_{n}(A)$ and $\widetilde{B}_{n}=\mathit{g}_{n}(B)$ . We denote the answer from worker $n$ as $Z_{n}=\widetilde{A}_{n}\widetilde{B}_{n}$ . The master must be able to decode the desired result from any $K$ workers. We denote the approximated result as $\hat{A}\hat{B}=d(Z_{n_{0}},\cdots,Z_{n_{K-1}})$ , where $d(\cdot)$ is the decoding function. The performance of coded sampling schemes is measured through the expected approximation error $\mathbb{E}[\|AB-\hat{A}\hat{B}\|_{F}^{2}]$ , where $\|M\|_{F}$ denotes the Frobenius norm of a matrix $M$ . Note that we choose Frobenius norm for its properties, which will be useful for our analysis. Other norms could potentially be used for evaluating the schemes.

III Coded Matrix Multiplication

For the scope of this paper, we focus on one of the codes, namely MatDot codes [3]. We show the intuition behind MatDot codes and their application to approximate matrix multiplication through an illustrative example.

Example 1.

Consider a matrix multiplication problem with $N$ workers using $m=2$ -MatDot code, where $N\geq 3$ . The input matrices are partitioned into $m=2$ submatrices as follows,

$$A=\begin{bmatrix}A_{0}&A_{1}\end{bmatrix},\quad B=\begin{bmatrix}B_{0}\\B_{1}\end{bmatrix},$$

where $A_{q}\in\mathbb{F}^{d_{1}\times\frac{d_{2}}{2}}$ and $B_{q}\in\mathbb{F}^{\frac{d_{2}}{2}\times d_{3}},\text{ for }q=0,1$ . The product of $AB$ can then be written as,

$$AB=A_{0}B_{0}+A_{1}B_{1}.$$

The submatrices $A_{q}$ and $B_{q}$ are encoded as follows,

$$\widetilde{A}_{n}=A_{0}+x_{n}A_{1},\quad\widetilde{B}_{n}=x_{n}B_{0}+B_{1},$$

for $n=0,\cdots,N-1$ , where $\widetilde{A}_{n}$ and $\widetilde{B}_{n}$ have the same dimensions as $A_{q}$ and $B_{q}$ , and $x_{n}\in\mathbb{F}$ is a distinct non-zero element assigned to worker $n$ . After encoding, worker $n$ computes $\widetilde{A}_{n}\widetilde{B}_{n}$ and sends the result to the master. Without loss of generality, we assume that the first $3$ workers respond and the master receives,

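$$\begin{aligned}Z_{0}&=A_{0}B_{1}+(A_{0}B_{0}+A_{1}B_{1})x_{0}+A_{1}B_{0}x_{0}^{2},\\Z_{1}&=A_{0}B_{1}+(A_{0}B_{0}+A_{1}B_{1})x_{1}+A_{1}B_{0}x_{1}^{2},\\Z_{2}&=A_{0}B_{1}+(A_{0}B_{0}+A_{1}B_{1})x_{2}+A_{1}B_{0}x_{2}^{2}.\end{aligned}$$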

It can be seen that the results can be viewed as $3$ distinct evaluations of a degree $2$ polynomial. Thus, the master can apply any polynomial interpolation technique and obtain the coefficients $A_{0}B_{1},A_{0}B_{0}+A_{1}B_{1}$ and $A_{1}B_{0}$ using any $3$ evaluations received. Since the desired result $A_{0}B_{0}+A_{1}B_{1}$ can be obtained from any $K=3$ evaluations, we say $2$ -MatDot code achieves a recovery threshold of $K=3$ .

Footnote: We note that there are many other codes that could potentially be applied to our problem, such as Polynomial and Lagrange codes [1, 2, 4]. Investigating randomization schemes for other codes is part of our ongoing work.

We now introduce the idea of randomization in this context. In particular, for scenarios where approximate matrix multiplication is sufficient, we show that the recovery threshold can even be reduced to $1$. Using the same partition as in the previous example, if we want the recovery threshold to be $K=1$, the master can use the following strategy: it samples one pair of submatrices of $A$ and $B$ (i.e., either $(A_{0},B_{0})$ or $(A_{1},B_{1})$) with a certain probability, so that the chosen index is a Bernoulli random variable $Y$. It then assigns each worker to compute $A_{Y}B_{Y}$, waits for only $K=1$ worker, and declares $A_{Y}B_{Y}$ as the approximate answer for $AB$. It can be readily shown that, with proper scaling, the expected value of $A_{Y}B_{Y}$ is $AB$. Although the scaled $A_{Y}B_{Y}$ is an unbiased estimator of $AB$, any single realization will contain some error, so the sampling scheme must be designed to (a) give an unbiased estimate of $AB$, and (b) minimize the resulting error as much as possible. We first briefly summarize the general construction of MatDot, followed by the details of our randomized sampling scheme.
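The "proper scaling" in the $K=1$ strategy can be made explicit with a short calculation. The notation $p_{q}=\Pr[Y=q]$ is an assumption introduced here for illustration; the scaling by $1/p_{Y}$ is the standard importance-sampling choice and is consistent with the $1/(cP_{\mathcal{S}})$ scaling used later for $s=1$:

$$\mathbb{E}\left[\frac{A_{Y}B_{Y}}{p_{Y}}\right]=\sum_{q=0}^{1}p_{q}\cdot\frac{A_{q}B_{q}}{p_{q}}=A_{0}B_{0}+A_{1}B_{1}=AB.$$

The estimate is therefore unbiased for any choice of $p_{0},p_{1}>0$ with $p_{0}+p_{1}=1$; the choice of these probabilities only affects the error of a single realization.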

To apply MatDot codes for any $m$ that divides $d_{2}$ , the input matrices $A$ and $B$ are partitioned into $m$ disjoint submatrices horizontally and vertically, respectively, i.e., $A=[A_{0}\cdots A_{m-1}],~{}B=[B_{0}^{T}\cdots B_{m-1}^{T}]^{T},$ where $A_{q}\in\mathbb{F}^{d_{1}\times\frac{d_{2}}{m}}$ and $B_{q}\in\mathbb{F}^{\frac{d_{2}}{m}\times d_{3}},~{}q=0,\cdots,m-1$ . The submatrices of $A$ and $B$ are encoded into $\widetilde{A}_{n}=\sum_{q=0}^{m-1}A_{q}x_{n}^{q},~{}\widetilde{B}_{n}=\sum_{r=0}^{m-1}B_{r}x_{n}^{m-1-r}$ for worker $n$ , where $x_{n}$ is a distinct non-zero element in $\mathbb{F}$ assigned to worker $n$ . Workers compute the product of their respective $\widetilde{A}_{n}$ and $\widetilde{B}_{n}$ , and return the results to the master. The results can be seen as a polynomial evaluated at $N$ distinct points, i.e., $h(x)=\sum_{q=0}^{m-1}\sum_{r=0}^{m-1}A_{q}B_{r}x^{q+m-1-r}$ , where $x=x_{n},~{}n=0,\cdots,N-1$ . The degree of this polynomial is $2m-2$ , hence, the coefficients of the polynomial can be interpolated using any $2m-1$ evaluations. Note that the desired result is the sum of $A_{q}B_{r},~{}q=r$ , and it is the coefficient of $x^{m-1}$ . With the ability of computing the desired result from any $2m-1$ workers, we say $m$ -MatDot achieves a recovery threshold of $K=2m-1$ (see [3] for details).
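The general construction can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions: it works over the reals rather than a finite field $\mathbb{F}$, the function names `matdot_encode` and `matdot_decode` are hypothetical, and interpolation is done by solving a Vandermonde system, which is one possible interpolation technique and is numerically reasonable only for small $m$.

```python
import numpy as np

def matdot_encode(A, B, m, xs):
    """Return one encoded pair (A_tilde_n, B_tilde_n) per evaluation point x_n."""
    A_parts = np.split(A, m, axis=1)  # A = [A_0 ... A_{m-1}], column blocks
    B_parts = np.split(B, m, axis=0)  # B = [B_0; ...; B_{m-1}], row blocks
    encoded = []
    for x in xs:
        A_t = sum(A_parts[q] * x**q for q in range(m))
        B_t = sum(B_parts[r] * x**(m - 1 - r) for r in range(m))
        encoded.append((A_t, B_t))
    return encoded

def matdot_decode(results, xs, m):
    """Interpolate h(x) from K = 2m-1 evaluations and return the x^(m-1) coefficient."""
    K = 2 * m - 1
    assert len(results) == K and len(xs) == K
    V = np.vander(np.asarray(xs, dtype=float), K, increasing=True)  # rows [1, x, ..., x^(K-1)]
    Y = np.stack([Z.ravel() for Z in results])                      # one flattened result per row
    coeffs = np.linalg.solve(V, Y)                                  # polynomial coefficients of h
    return coeffs[m - 1].reshape(results[0].shape)

# Hypothetical sizes: m = 3 parts, N = 7 workers, so K = 2m - 1 = 5.
rng = np.random.default_rng(0)
m = 3
A = rng.standard_normal((4, 6))
B = rng.standard_normal((6, 5))
xs = np.array([-1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])  # distinct, non-zero

encoded = matdot_encode(A, B, m, xs)
responders = [6, 1, 3, 0, 4]  # any 5 workers; the other 2 are treated as stragglers
results = [encoded[n][0] @ encoded[n][1] for n in responders]
AB_hat = matdot_decode(results, xs[responders], m)
```

With any $2m-1$ responders, `AB_hat` should agree with `A @ B` up to floating-point error.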

IV Coded Sampling for Approximate Matrix Multiplication

In this section, we present two coded sampling schemes and study the tradeoff between recovery threshold and approximation error. To apply MatDot, matrices $A$ and $B$ are partitioned into $m$ submatrices horizontally and vertically, respectively. Both schemes sample $s$ submatrices from $A$ and the corresponding submatrices from $B$ , and encode them using MatDot, where the choice of $s$ controls both the approximation error and the recovery threshold.

IV-A Coded Set-wise Sampling

For the coded set-wise sampling scheme, the master samples a subset $\mathcal{S}\subset\{0,\cdots,m-1\}$ of the indices of submatrices, where $|\mathcal{S}|=s\leq m$ is picked according to probability $P_{\mathcal{S}}$ . We denote the sampled submatrices as $A_{\mathcal{S}}\triangleq(A_{q_{0}},\cdots,A_{q_{s-1}})$ and $B_{\mathcal{S}}\triangleq(B_{q_{0}},\cdots,B_{q_{s-1}})$ . The sampled submatrices are then encoded as,

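$$\widetilde{A}_{n}=\sum_{\ell=0}^{s-1}\frac{A_{q_{\ell}}x_{n}^{\ell}}{\sqrt{cP_{\mathcal{S}}}},\quad\widetilde{B}_{n}=\sum_{\ell^{\prime}=0}^{s-1}\frac{B_{q_{\ell^{\prime}}}x_{n}^{s-1-\ell^{\prime}}}{\sqrt{cP_{\mathcal{S}}}},$$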

where the scaling is done to ensure that the approximation is an unbiased estimator of $AB$ and the choice of the constant $c=\binom{m}{s}\cdot s/m$ will become clear in the analysis. The goal is to approximate $AB$ using the sum of $A_{q_{\ell}}B_{q_{\ell^{\prime}}},~{}\ell={\ell^{\prime}}=0,\cdots,s-1$ . Note that this sum is originally a part of $AB$ . Workers are assigned to compute their respective $\widetilde{A}_{n}\widetilde{B}_{n}$ and return the results. The master receives the results,

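$$h(x_{n_{k}})=\frac{1}{cP_{\mathcal{S}}}\sum_{\ell=0}^{s-1}\sum_{\ell^{\prime}=0}^{s-1}A_{q_{\ell}}B_{q_{\ell^{\prime}}}x_{n_{k}}^{\ell+s-1-\ell^{\prime}},$$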

for $k=0,\ldots,K-1$, corresponding to any $K$ workers. As shown in Section III, since the degree of this polynomial is $2s-2$, the coefficients of the polynomial can be interpolated using the results from any $K=2s-1$ workers. The master can then obtain the approximation $\hat{A}\hat{B}_{\mathcal{S}}=\sum_{\ell=0}^{s-1}A_{q_{\ell}}B_{q_{\ell}}/(cP_{\mathcal{S}})$, which is the coefficient of $x^{s-1}$ (the terms with $\ell=\ell^{\prime}$).
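The set-wise scheme can be sketched end to end as follows. This is an illustrative sketch, not the authors' implementation: it assumes real-valued matrices, a uniform $P_{\mathcal{S}}$, hypothetical sizes, Vandermonde-based interpolation, and each encoded matrix scaled by $1/\sqrt{cP_{\mathcal{S}}}$ so that the product carries the factor $1/(cP_{\mathcal{S}})$ appearing in the approximation.

```python
import numpy as np
from itertools import combinations
from math import comb

rng = np.random.default_rng(1)
m, s = 4, 2                 # m parts, s sampled parts
K = 2 * s - 1               # recovery threshold
A = rng.standard_normal((6, 8))
B = rng.standard_normal((8, 5))
A_parts = np.split(A, m, axis=1)
B_parts = np.split(B, m, axis=0)

subsets = list(combinations(range(m), s))
P = np.full(len(subsets), 1.0 / len(subsets))  # uniform P_S (assumption)
c = comb(m, s) * s // m                        # number of subsets containing a given index

idx = rng.choice(len(subsets), p=P)            # the master samples one subset S
S = subsets[idx]
scale = np.sqrt(c * P[idx])

xs = np.array([-2.0, -1.0, 1.0, 2.0, 3.0])     # N = 5 workers, distinct non-zero points

def worker(x):
    """Product of the encoded sampled submatrices at evaluation point x."""
    A_t = sum(A_parts[q] * x**l for l, q in enumerate(S)) / scale
    B_t = sum(B_parts[q] * x**(s - 1 - l) for l, q in enumerate(S)) / scale
    return A_t @ B_t

responders = [4, 0, 2]                          # any K = 3 workers
V = np.vander(xs[responders], K, increasing=True)
Y = np.stack([worker(xs[n]).ravel() for n in responders])
coeffs = np.linalg.solve(V, Y)
AB_hat_S = coeffs[s - 1].reshape(A.shape[0], B.shape[1])  # coefficient of x^(s-1)

# What the decoded coefficient should equal: sum over q in S of A_q B_q / (c P_S).
expected = sum(A_parts[q] @ B_parts[q] for q in S) / (c * P[idx])
```

Here `AB_hat_S` is a single random realization; it matches `expected` exactly (up to rounding) but equals $AB$ only on average.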

Our main result is stated in the following Theorem:

Theorem 1.

For an approximate coded matrix multiplication problem, to achieve a recovery threshold of $K=2s-1$ using $s$ -MatDot codes, the expected approximation error of the coded set-wise sampling scheme is as follows,

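$$\mathbb{E}[\|AB-\hat{A}\hat{B}_{\mathcal{S}}\|_{F}^{2}]=\frac{\left(\sum_{\mathcal{S}}\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F}\right)^{2}}{c^{2}}-\|AB\|_{F}^{2},$$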

by sampling using the optimal distribution $P_{\mathcal{S}}^{\star}$ shown in the analysis, where $\mathcal{S},~{}|\mathcal{S}|=s$ denotes the set of sampled indices and $c=\binom{m}{s}\cdot s/m$ .

To prove Theorem 1, we first show that the approximation $\hat{A}\hat{B}_{\mathcal{S}}$ is an unbiased estimator of $AB$ . We start by looking at the expected value of the $ij$ th element of the approximation:

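$$\begin{aligned}\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]&=\frac{1}{c}\sum_{\mathcal{S}}P_{\mathcal{S}}\sum_{q\in\mathcal{S}}\frac{(A_{q}B_{q})_{ij}}{P_{\mathcal{S}}}\qquad\text{(7)}\\&=(AB)_{ij},\end{aligned}$$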

where (7) follows from the definition of expected value and the design of the scheme, and $c$ is the number of times each $A_{q}B_{q}$ appears in the summation. Thus,

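$$\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}^{2}]=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}}{P_{\mathcal{S}}}.$$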

Since $\text{Var}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]=\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}^{2}]-\mathbb{E}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]^{2}$ , we have

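$$\text{Var}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}}{P_{\mathcal{S}}}-(AB)_{ij}^{2}.$$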

We next find the expected approximation error by calculating:

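$$\begin{aligned}\mathbb{E}[\|AB-\hat{A}\hat{B}_{\mathcal{S}}\|_{F}^{2}]&=\sum_{i}\sum_{j}\mathbb{E}[((AB)_{ij}-(\hat{A}\hat{B}_{\mathcal{S}})_{ij})^{2}]\\&=\sum_{i}\sum_{j}\text{Var}[(\hat{A}\hat{B}_{\mathcal{S}})_{ij}]\\&=\sum_{i}\sum_{j}\left[\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}}{P_{\mathcal{S}}}-(AB)_{ij}^{2}\right]\\&=\frac{1}{c^{2}}\sum_{\mathcal{S}}\frac{\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F}^{2}}{P_{\mathcal{S}}}-\|AB\|_{F}^{2},\qquad\text{(12)}\end{aligned}$$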

where (12) follows from placing the double summations before $(\sum_{q\in\mathcal{S}}A_{q}B_{q})_{ij}^{2}$ .

Note that $\|AB\|_{F}^{2}$ is a constant for fixed $A$ and $B$ , hence, we can use the method of Lagrange multipliers to find the optimal $P_{\mathcal{S}}$ by putting $\sum_{\mathcal{S}}P_{\mathcal{S}}=1$ as a constraint on the first term in (12) and solve for the $P_{\mathcal{S}}$ that minimizes the error. The optimal $P_{\mathcal{S}}^{\star}$ can be found to be $P_{\mathcal{S}}^{\star}=\|\sum_{q\in\mathcal{S}}A_{q}B_{q}\|_{F}/\sum_{\mathcal{S^{\prime}}}\|\sum_{q\in\mathcal{S^{\prime}}}A_{q}B_{q}\|_{F}$ . Plugging $P_{\mathcal{S}}^{\star}$ in (12) completes the proof of Theorem 1.
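Theorem 1 can be checked numerically on a small instance by enumerating every subset $\mathcal{S}$ and computing the expectation exactly, with no Monte Carlo sampling. The sketch below assumes real-valued Gaussian test matrices and hypothetical sizes; `expected_error` is an illustrative helper, not part of the paper.

```python
import numpy as np
from itertools import combinations
from math import comb

rng = np.random.default_rng(2)
m, s = 4, 2
A = rng.standard_normal((5, 8))
B = rng.standard_normal((8, 6))
A_parts = np.split(A, m, axis=1)
B_parts = np.split(B, m, axis=0)
AB = A @ B
c = comb(m, s) * s // m                      # each index q lies in c subsets of size s

subsets = list(combinations(range(m), s))
# Partial products sum_{q in S} A_q B_q and their Frobenius norms, one per subset.
X = [sum(A_parts[q] @ B_parts[q] for q in S) for S in subsets]
norms = np.array([np.linalg.norm(x) for x in X])

def expected_error(P):
    """Exact E||AB - X_S / (c P_S)||_F^2 under the subset distribution P."""
    return sum(p * np.linalg.norm(AB - x / (c * p)) ** 2 for p, x in zip(P, X))

P_opt = norms / norms.sum()                  # optimal distribution P_S* from the analysis
P_uni = np.full(len(subsets), 1.0 / len(subsets))

err_opt = expected_error(P_opt)
err_uni = expected_error(P_uni)
theorem1 = norms.sum() ** 2 / c**2 - np.linalg.norm(AB) ** 2

# Unbiasedness: the P_S factors cancel, leaving (1/c) * sum_S X_S = AB.
mean_estimate = sum(p * x / (c * p) for p, x in zip(P_opt, X))
```

On such an instance, `err_opt` should match `theorem1`, `mean_estimate` should match `AB`, and `err_uni` should be no smaller than `err_opt`. The enumeration over all $\binom{m}{s}$ subsets also makes the cost of computing $P_{\mathcal{S}}^{\star}$ visible.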

We note that the computational complexity of finding the optimal probabilities is $\binom{m}{s}\times O(d_{1}d_{2}d_{3}s/m)$ , which can be high. A way to overcome this issue is to sample $A$ and $B$ using uniform distribution $P_{\mathcal{S}}=1/\binom{m}{s}$ at the cost of higher approximation error. We next propose another alternative (and simpler) sampling strategy and obtain the corresponding approximation error.

IV-B Coded Independent Sampling

For coded independent sampling, at each iteration $t=0,\cdots,s-1$, the master samples an index $q_{t}\in[0:m-1]$ according to probability $P_{q_{t}}$, i.e., the probability that $A_{q_{t}}$ and $B_{q_{t}}$ are sampled at iteration $t$. After sampling $s$ indices, the corresponding submatrices are encoded into $\widetilde{A}_{n}=\sum_{t=0}^{s-1}A_{q_{t}}x_{n}^{t}/\sqrt{sP_{q_{t}}},~{}\widetilde{B}_{n}=\sum_{t^{\prime}=0}^{s-1}B_{q_{t^{\prime}}}x_{n}^{s-1-t^{\prime}}/\sqrt{sP_{q_{t^{\prime}}}}$ . Workers are assigned to compute their respective $\widetilde{A}_{n}\widetilde{B}_{n}$ . The results received by the master are

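$$h(x)=\sum_{t=0}^{s-1}\sum_{t^{\prime}=0}^{s-1}\frac{A_{q_{t}}B_{q_{t^{\prime}}}x^{t+s-1-t^{\prime}}}{s\sqrt{P_{q_{t}}P_{q_{t^{\prime}}}}},$$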

where $x=x_{n},~{}n=0,\cdots,N-1$ . The degree of this polynomial is $2s-2$ , hence, the coefficients of the polynomial can be interpolated by using the results from any $2s-1$ workers. The master can thus obtain the approximation $\hat{A}\hat{B}=\sum_{t=0}^{s-1}A_{q_{t}}B_{q_{t}}/(sP_{q_{t}})$ , which is the coefficient of $x^{s-1}$ (the terms with $t=t^{\prime}$). The expected error, following similar steps as in the previous section, is:

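$$\mathbb{E}[\|AB-\hat{A}\hat{B}\|_{F}^{2}]=\frac{1}{s}\left(\sum_{q=0}^{m-1}\|A_{q}B_{q}\|_{F}\right)^{2}-\frac{1}{s}\|AB\|_{F}^{2}.$$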
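The expected error of coded independent sampling can be verified in the same exact way, by enumerating all $m^{s}$ index sequences. The sampling distribution is not stated in this section; the sketch assumes $P_{q}=\|A_{q}B_{q}\|_{F}/\sum_{q^{\prime}}\|A_{q^{\prime}}B_{q^{\prime}}\|_{F}$, which is the choice under which the expression above holds with equality. Sizes and helper names are hypothetical.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
m, s = 4, 2
A = rng.standard_normal((5, 8))
B = rng.standard_normal((8, 6))
A_parts = np.split(A, m, axis=1)
B_parts = np.split(B, m, axis=0)
AB = A @ B

X = [A_parts[q] @ B_parts[q] for q in range(m)]        # block products A_q B_q
norms = np.array([np.linalg.norm(x) for x in X])       # Frobenius norms
P_opt = norms / norms.sum()                            # assumed sampling distribution
P_uni = np.full(m, 1.0 / m)

def expected_error(P):
    """Exact E||AB - (1/s) sum_t A_{q_t} B_{q_t} / P_{q_t}||_F^2 over all index sequences."""
    total = 0.0
    for seq in product(range(m), repeat=s):            # sampling with replacement
        prob = np.prod([P[q] for q in seq])
        estimate = sum(X[q] / P[q] for q in seq) / s
        total += prob * np.linalg.norm(AB - estimate) ** 2
    return total

err_opt = expected_error(P_opt)
err_uni = expected_error(P_uni)
formula = (norms.sum() ** 2 - np.linalg.norm(AB) ** 2) / s
```

Because indices are drawn with replacement, a sequence such as `(1, 1)` is possible, which is why the error does not vanish even when $s=m$.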

IV-C Simulation Results

In this section, we present simulation results to show the performance of the two coded randomized sampling schemes. We consider the case where $A\in\mathbb{F}^{60\times 4}$ and $B\in\mathbb{F}^{4\times 60}$ are each partitioned into $m=4$ submatrices. With $m=4$ , the master can sample $s=1,2,3$ or $4$ submatrices and achieve recovery thresholds of $K=1,3,5$ or $7$ , respectively. The normalized errors shown in Fig. 2, 2 and Table I are calculated as $\|AB-\hat{A}\hat{B}\|_{F}^{2}/\|AB\|_{F}^{2}$ . It can be seen in Fig. 2 and 2 that the optimal sampling distributions yield smaller empirical errors than the uniform distributions. Table I shows that, in most cases, coded set-wise sampling gives a better approximation than coded independent sampling for the same recovery threshold. This is because the master may sample the same submatrix multiple times under the coded independent sampling scheme, whereas in coded set-wise sampling the master always samples fresh submatrices. Furthermore, the error of coded set-wise sampling always goes to zero when $s=m$ , as this is equivalent to performing the exact computation of $AB$ .
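The qualitative behaviour described above can be illustrated with the closed-form expected errors from Section IV, evaluated at the simulation dimensions. This sketch does not reproduce Table I: the paper's input matrices are not given, so Gaussian matrices are assumed, and the values are expected errors under the optimal distributions rather than empirical errors. The helper names are hypothetical.

```python
import numpy as np
from itertools import combinations
from math import comb

rng = np.random.default_rng(4)
m = 4
A = rng.standard_normal((60, 4))   # each A_q is a single column
B = rng.standard_normal((4, 60))   # each B_q is a single row
prods = [np.outer(A[:, q], B[q, :]) for q in range(m)]   # A_q B_q
AB_sq = np.linalg.norm(A @ B) ** 2                        # ||AB||_F^2

def setwise_error(s):
    """Normalized expected error of coded set-wise sampling (Theorem 1)."""
    c = comb(m, s) * s // m
    total = sum(np.linalg.norm(sum(prods[q] for q in S))
                for S in combinations(range(m), s))
    return (total**2 / c**2 - AB_sq) / AB_sq

def independent_error(s):
    """Normalized expected error of coded independent sampling."""
    total = sum(np.linalg.norm(p) for p in prods)
    return (total**2 - AB_sq) / (s * AB_sq)

# Map recovery threshold K = 2s - 1 to (independent, set-wise) normalized errors.
table = {2 * s - 1: (independent_error(s), setwise_error(s)) for s in range(1, m + 1)}
for K, (ind, sw) in table.items():
    print(f"K={K}: independent={ind:.4f}, set-wise={sw:.4f}")
```

At $K=1$ the two schemes coincide, since both sample a single index. At $K=7$ the set-wise error vanishes while the independent one stays positive, which matches the explanation given for Table I.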

V Conclusion

In this paper, we studied the problem of approximate coded matrix multiplication. We presented two novel coded sampling schemes where a subset of columns/rows is sampled from the matrices. The sampled submatrices are then encoded using MatDot codes. The results reveal an interesting tradeoff between recovery threshold and approximation error. Generalizing these ideas for other coded computation schemes is an interesting future research direction.

Bibliography

[1] Qian Yu, Mohammad Ali Maddah-Ali, and Amir Salman Avestimehr, “Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication,” CoRR, vol. abs/1705.10464, 2017. [Online]. Available: http://arxiv.org/abs/1705.10464.
[2] Qian Yu, Mohammad Ali Maddah-Ali, and Amir Salman Avestimehr, “Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding,” CoRR, vol. abs/1801.07487, 2018. [Online]. Available: http://arxiv.org/abs/1801.07487.
[3] Sanghamitra Dutta, Mohammad Fahim, Farzin Haddadpour, Haewon Jeong, Viveck R. Cadambe, and Pulkit Grover, “On the Optimal Recovery Threshold of Coded Matrix Multiplication,” CoRR, vol. abs/1801.10292, 2018. [Online]. Available: http://arxiv.org/abs/1801.10292.
[4] Qian Yu, Netanel Raviv, Jinhyun So, and Amir Salman Avestimehr, “Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy,” CoRR, vol. abs/1806.00939, 2018. [Online]. Available: http://arxiv.org/abs/1806.00939.
[5] Petros Drineas, Ravi Kannan, and Michael W. Mahoney, “Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication,” SIAM Journal on Computing, vol. 36, no. 1, pp. 132–157, 2006.
[6] Amit Deshpande, Luis Rademacher, Santosh Vempala, and Grant Wang, “Matrix approximation and projective clustering via volume sampling,” in Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm. Society for Industrial and Applied Mathematics, 2006, pp. 1117–1126.
[7] Christos Boutsidis, Michael W. Mahoney, and Petros Drineas, “An improved approximation algorithm for the column subset selection problem,” CoRR, vol. abs/0812.4293, 2008. [Online]. Available: http://arxiv.org/abs/0812.4293.
[8] Venkatesan Guruswami and Ali Kemal Sinop, “Optimal column-based low-rank matrix reconstruction,” CoRR, vol. abs/1104.1732, 2011. [Online]. Available: http://arxiv.org/abs/1104.1732.
