Codes for Updating Linear Functions over Small Fields
Suman Ghosh, Lakshmi Natarajan

TL;DR
This paper studies efficient linear coding schemes for updating linear functions of sparse message changes over small finite fields, with applications to distributed data storage, providing tighter bounds and constructions that reduce field size requirements.
Contribution
It offers a field-size aware analysis of the function update problem, tighter codelength bounds, and new codes that balance codelength reduction with smaller field sizes.
Findings
Derived a tighter lower bound on codelength that depends on the field size
Constructed codes for striped message vectors with reduced field size requirements
Established equivalence between function update and generalized index coding problems
Codes for Updating Linear Functions over Small Fields
Suman Ghosh and Lakshmi Natarajan
The authors are with the Department of Electrical Engineering, Indian Institute of Technology Hyderabad, Sangareddy 502 285, India (email: {ee16resch11006, lakshminatarajan}@iith.ac.in).
Abstract
We consider a point-to-point communication scenario where the receiver intends to maintain a specific linear function of a message vector over a finite field. When the value of the message vector changes, which is modelled as a sparse update, the transmitter broadcasts a coded version of the modified message while the receiver uses this codeword and the current value of the linear function to update its contents. It is assumed that the transmitter has access to only the modified message and is unaware of the exact difference vector between the original and modified messages. Under the assumption that the difference vector is sparse and that its Hamming weight is at most a known constant, the objective is to design a linear code with as small a codelength as possible that allows successful update of the linear function at the receiver. This problem is motivated by applications to distributed data storage systems. Recently, Prakash and Médard derived a lower bound on the codelength, which is independent of the size of the underlying finite field, and provided constructions that achieve this bound if the size of the finite field is sufficiently large. However, this requirement on the field size can be prohibitive for even moderate values of the system parameters. In this paper, we provide a field-size aware analysis of the function update problem, including a tighter lower bound on the codelength, and design codes that trade off codelength for a smaller field size requirement. We also show that the problem of designing codes for updating linear functions is related to functional index coding or generalized index coding. We first characterize the family of function update problems where linear coding can provide a reduction in codelength compared to a naive transmission scheme. We then provide field-size dependent bounds on the optimal codelength, and construct coding schemes based on error correcting codes and subspace codes when the receiver maintains linear functions of a striped message vector. These codes provide a trade-off between the codelength and the size of the operating finite field, and whenever the achieved codelengths equal those reported by Prakash and Médard, the requirements on the size of the finite field are matched as well. Finally, for any given function update problem, we construct an equivalent functional index coding or generalized index coding problem such that any linear coding scheme is valid for the function update problem if and only if it is valid for the constructed functional index coding problem.
I Introduction
We consider a point-to-point communication scenario as shown in Fig. 1 where the receiver maintains a linear function $\mathbf{A}\mathbf{x}$ of a message vector $\mathbf{x}$. The message $\mathbf{x}$ is an $n$-length column vector over a finite field $\mathbb{F}_q$, where $q$ is any prime power, and $\mathbf{A}$ is a $k \times n$ matrix over $\mathbb{F}_q$ with $k \le n$ and $\mathrm{rank}(\mathbf{A}) = k$. Suppose the value of the message vector is updated to $\mathbf{x} + \mathbf{e}$, where $\mathbf{e}$ represents a sparse update to the message, i.e., we assume that $\mathrm{wt}(\mathbf{e}) \le \epsilon$, where $\mathrm{wt}(\cdot)$ denotes the Hamming weight of a vector and $\epsilon$ is a known constant. In other words, at most $\epsilon$ entries of the original message are updated to new values. We assume that the transmitter has access to the updated message $\mathbf{x} + \mathbf{e}$, but is unaware of the original message $\mathbf{x}$ or the sparse update $\mathbf{e}$. Note that the message update is modelled here as substitutions only and not as insertions or deletions. The objective is to design a linear encoder that uses an $\ell \times n$ matrix $\mathbf{H}$ to generate the codeword $\mathbf{H}(\mathbf{x} + \mathbf{e})$, with as small a codelength $\ell$ as possible, such that the receiver can decode $\mathbf{A}(\mathbf{x} + \mathbf{e})$ using the transmitted codeword and the older version of its content $\mathbf{A}\mathbf{x}$.
The problem is motivated by distributed storage systems (DSS) where information is stored in linearly coded form across a number of nodes to provide resilience against storage node failures [1]. In the scenario where multiple users can simultaneously edit a single file stored in a DSS, it is possible that a user who wishes to apply his update is unaware of the current version of the message stored in the DSS, for instance when another user has recently edited this file. Letting the user first learn the version stored in the DSS, and then apply his update will incur additional communication cost. As an alternative, if it is known that the update vector is sparse, it is possible to design schemes that do not require the knowledge of the value of at the transmitter [1, 2, 3].
The function update problem was considered in [2, 3] for DSS’s for updating one of the storage nodes with the help of the other nodes in the system. Note that each node in a DSS stores a linear function of the message. A node can become stale in such systems, for instance if the node goes offline while the message and the corresponding linear functions stored in the other nodes undergo an update. Once it is back online, the stale node connects to the other nodes in the distributed storage system to update its own linear function, and the stale data already stored in this node acts as side information. The authors of [2, 3] design both the code for distributed storage and the code for function update to minimize the amount of data downloaded by the stale node to update its contents. This is unlike the problem statement considered in [1] as well as this paper, where it is assumed that an arbitrary matrix A is given and a code for updating the function is to be designed.
The authors in [1] also consider a broadcast scenario where a codeword is broadcast to multiple nodes in order to update the (different) linear functions stored in each of the nodes. Problems related to updating linear functions have been considered in [4, 5, 6]. In [4], codes for updating linear functions are used in cache-aided networks to reduce the cost of multicasting a sequence of correlated data frames. The problem of efficiently storing multiple versions of a file in a DSS while ensuring a property called consistency is considered in [5, 6].
In the study of the point-to-point function update problem given in [2, 3, 1], the authors derive the following field-size independent lower bound on the codelength $\ell$, where $k$ is the number of rows of $\mathbf{A}$ and $\epsilon$ is the bound on the Hamming weight of the update $\mathbf{e}$ to the message $\mathbf{x}$:
$$\ell \ge \min(k, 2\epsilon).$$
Note that if $2\epsilon \ge k$, the lower bound on the codelength can be trivially achieved by transmitting $\mathbf{A}(\mathbf{x} + \mathbf{e})$. Hence, we will always assume that $2\epsilon < k$. The results in [1] show that codelength $2\epsilon$ is achievable using maximally recoverable subcodes of the subspace spanned by the rows of A, which are guaranteed to exist if the field size is sufficiently large. Note that this requirement on the field size can be prohibitive even for moderate values of the system parameters. The authors of [1] also consider the special case where the matrix A is striped, i.e.,
[TABLE]
where is the identity matrix, and denotes the Kronecker product. Note that and . This structure frequently arises in distributed storage systems where the -length data is partitioned into subvectors , each of length , each subvector is encoded independently by multiplying with , and all the encoded vectors are stored in a single storage node, see Examples 1–3 of [1]. In [1, Section IV], a code is constructed for the case that achieves the codelength using an MDS code, which is guaranteed to exist if the field size . In Remark 4 of [1] the authors consider a modified system model for the function update problem which we show in Section V-A3 of this paper to be equivalent to the case where A is striped with the number of stripes . Construction 1 and Remark 4 of [1] provide a code construction for this modified system model, and hence for the case , that achieves codelength of over any field.
In this paper we provide a field-size aware characterization of the point-to-point function update problem. In particular, we provide bounds on the achievable codelength that take into account the effect of the field size and we provide constructions that trade-off the codelength for a smaller field size requirement. This is unlike the point-to-point results in [1] which provide constructions only for the case but assume that the field size is sufficiently large. To the best of our knowledge, no prior analysis of this problem as a function of the field size is available in the literature except [1] which assumes that the field size is large enough for a maximally recoverable code to exist.
We characterize the family of point-to-point function update problems where a linear coding scheme is useful to save at least one transmission, i.e., is achievable (Theorem 3, Section III). This characterization is analyzed in terms of the covering radius of , the dual of the code , in Section III-B. We provide a lower bound (Theorem 4, Section IV) and an upper bound (Theorem 5, Section V) on the optimal codelength based on linear error correcting codes. Similar to [1] we also provide code constructions when A is striped (Sections V-A1 and V-A2) but our focus is on the general case where and . For the case when we provide a construction (Section V-A1) which achieves the optimal codelength for the respective operating field size , for any prime power . For the special case this code construction achieves codelength and this matches the achieved codelength in Construction 2 of [1] for which also requires . Section V-A2 provides code constructions for using subspace codes and error correcting codes over field extensions. All these code constructions yield a trade-off between the chosen field size and achieved codelength where operating over a smaller field size results in a larger codelength than operating over a larger field size (for instance, see Example 3). When restricted to the special case our construction provides a valid coding scheme for the modified function update problem mentioned in [1, Remark 4] that matches codelength over any field reported in [1] (Section V-A3). The performance comparison of the constructed codes is discussed in Section V-A4. Finally, we show that the point-to-point function update problem is equivalent to a functional index coding or a generalized index coding problem [7, 8, 9]. Given a point-to-point function update problem we construct a functional index coding problem (Algorithm 1, Section VI-B) such that a coding scheme is valid for the function update problem if and only if it is valid for the constructed functional index coding problem (Theorem 9, Section VI-B). This paper starts by describing the system model and providing relevant preliminary results in Section II.
Notation: Matrices and column vectors are denoted by bold uppercase and lowercase letters, respectively. For any positive integer $N$, the symbol $[N]$ denotes the set $\{1, \dots, N\}$. The Hamming weight of a vector $\mathbf{v}$ is denoted as $\mathrm{wt}(\mathbf{v})$. The symbol $\mathbb{F}_q$ denotes the finite field of size $q$ and $\mathbb{F}_q^N$ denotes the set of column vectors of $N$ elements over $\mathbb{F}_q$, where $q$ is a prime power. The $N \times N$ identity matrix is denoted as $\mathbf{I}_N$.
II System Model and Preliminaries
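We consider a noiseless communication scenario with a single transmitter and a single receiver. The transmitter knows a column vector $\mathbf{x} \in \mathbb{F}_q^n$ of $n$ information symbols, where each information symbol is an element of the finite field $\mathbb{F}_q$. The receiver stores the coded message $\mathbf{A}\mathbf{x}$, where $\mathbf{A} \in \mathbb{F}_q^{k \times n}$ ($k \le n$) and rank(A) = $k$. Now suppose the information symbol vector is updated to $\mathbf{x} + \mathbf{e}$, where $\mathbf{e}$ is the update vector, which is also a column vector of length $n$ over $\mathbb{F}_q$ with $\mathrm{wt}(\mathbf{e}) \le \epsilon$, where $\mathrm{wt}(\cdot)$ denotes the Hamming weight of a vector. The objective is to generate a codeword with codelength as small as possible such that the receiver can update its content to $\mathbf{A}(\mathbf{x} + \mathbf{e})$ using the transmitted codeword and the older version of its content $\mathbf{A}\mathbf{x}$. We assume the transmitter does not know the original information symbol vector $\mathbf{x}$ or the update vector $\mathbf{e}$, but only knows the updated information symbol vector $\mathbf{x} + \mathbf{e}$. The problem of designing a coding scheme to update the coded data available at the receiver from $\mathbf{A}\mathbf{x}$ to $\mathbf{A}(\mathbf{x} + \mathbf{e})$ with $\mathrm{wt}(\mathbf{e}) \le \epsilon$ will be called the $(\mathbf{A}, \epsilon)$ function update problem.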
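Definition 1.
A valid encoding function of codelength $\ell$ for the $(\mathbf{A}, \epsilon)$ function update problem over the field $\mathbb{F}_q$ is a function
$$\mathfrak{E} : \mathbb{F}_q^n \to \mathbb{F}_q^{\ell}$$
such that there exists a decoding function $\mathfrak{D} : \mathbb{F}_q^{\ell} \times \mathbb{F}_q^k \to \mathbb{F}_q^k$ satisfying the following property: $\mathfrak{D}(\mathfrak{E}(\mathbf{x} + \mathbf{e}), \mathbf{A}\mathbf{x}) = \mathbf{A}(\mathbf{x} + \mathbf{e})$ for every $\mathbf{x} \in \mathbb{F}_q^n$ and $\mathbf{e} \in \mathbb{F}_q^n$ with $\mathrm{wt}(\mathbf{e}) \le \epsilon$.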
The objective of the code construction is to design a pair of encoding and decoding functions that minimizes the codelength and to calculate the optimal codelength over which is the minimum codelength among all valid coding schemes.
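A coding scheme is said to be linear if the encoding function is an $\mathbb{F}_q$-linear transformation. For a linear coding scheme, the codeword is $\mathbf{H}(\mathbf{x} + \mathbf{e})$, where $\mathbf{H} \in \mathbb{F}_q^{\ell \times n}$. The matrix $\mathbf{H}$ is the encoder matrix of the linear coding scheme. The minimum codelength among all valid linear coding schemes for the $(\mathbf{A}, \epsilon)$ function update problem over the field $\mathbb{F}_q$ will be denoted as $\ell_{q,\mathrm{opt}}$.
The trivial coding scheme that transmits the updated coded information symbols, i.e., $\mathbf{A}(\mathbf{x} + \mathbf{e})$, is a valid coding scheme with codelength $k$, the number of rows of A, since the receiver can directly update its content using $\mathbf{A}(\mathbf{x} + \mathbf{e})$. We refer to this trivial coding scheme as the naive scheme, where $\mathbf{H} = \mathbf{A}$. Thus, we have the following trivial upper bound on the optimum linear codelength
$$\ell_{q,\mathrm{opt}} \le k. \tag{1}$$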
In [1] the authors provided a necessary and sufficient condition for a matrix to be a valid encoder matrix for function update problem. In Theorem 2 of [1] the proof is given only for necessary condition for a matrix to be a valid encoder matrix for function update problem. For the sake of completeness here we first prove that the criterion 1 in [1, Theorem 2] is a necessary and sufficient condition for a matrix to be a valid encoder matrix for function update problem and then state the relevant results which will be helpful to derive other results of this paper. Let and denote the linear codes generated by the rows of A and respectively. Also let and let be a generator matrix of .
Theorem 1 (Theorem 2, [1]).
A matrix is a valid encoder matrix for the function update problem if and only if for any with and .
Proof.
A matrix is a valid encoder matrix for the function update problem if and only if the receiver can uniquely determine from the received codeword and the side information . Hence for two pairs of information symbol vectors and update vectors and such that the coded information symbol vectors available at the receiver are identical i.e., but updated coded information symbol vectors are distinct i.e., then the transmitted codeword must be distinct from to distinguish the two different updated coded information symbol vectors. Equivalently, the condition should hold for every choice of with satisfying and . Therefore is a valid encoder matrix if and only if
[TABLE]
for all such that and . Now denoting and we have
[TABLE]
for all and such that and . Now reformulating the condition given in (2) we obtain for all that satisfy , and . Therefore is a valid encoder matrix if and only if for all and if and then . Hence is a valid encoder matrix if and only if for all with if then . Now using the fact that we deduce that is a valid encoder matrix if and only if for all with such that also satisfies . Hence the statement of the theorem follows. ∎
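The validity requirement used at the start of this proof (the receiver must be able to determine the updated function value from the codeword and its old content) can be checked by exhaustive search on small instances. The following is a minimal sketch over the binary field only; the matrices `A`, `H_short` and `H_bad` are hypothetical toy choices and the function names are illustrative, not taken from [1].

```python
from itertools import combinations, product

def matvec(M, v):
    """Matrix-vector product over GF(2); M is a list of rows."""
    return tuple(sum(a * b for a, b in zip(row, v)) % 2 for row in M)

def sparse_vectors(n, max_weight):
    """All binary vectors of length n with Hamming weight <= max_weight."""
    for w in range(max_weight + 1):
        for support in combinations(range(n), w):
            yield tuple(1 if i in support else 0 for i in range(n))

def is_valid_encoder(H, A, eps):
    """Brute-force check that (codeword, old content) determines A(x + e)."""
    n = len(A[0])
    seen = {}  # (H(x+e), Ax) -> A(x+e)
    for x in product((0, 1), repeat=n):
        for e in sparse_vectors(n, eps):
            y = tuple((a + b) % 2 for a, b in zip(x, e))
            key = (matvec(H, y), matvec(A, x))
            target = matvec(A, y)
            if seen.setdefault(key, target) != target:
                return False  # two different updates look identical to the receiver
    return True

A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # hypothetical toy instance with eps = 1
H_naive = A                             # naive scheme: transmit A(x + e)
H_short = [[1, 1, 0], [0, 1, 1]]        # two coded symbols instead of three
H_bad = [[1, 1, 1]]                     # a single parity symbol is not enough
print(is_valid_encoder(H_naive, A, 1),
      is_valid_encoder(H_short, A, 1),
      is_valid_encoder(H_bad, A, 1))
```

The search is exponential in the message length, so it is only meant for sanity checks on toy parameters.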
Lemma 1 (Remark 2, [1]).
Let be a valid encoder matrix for the function update problem. Let be a generator matrix of the code . Then is also a valid encoder matrix for the function update problem.
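If we consider a valid encoder matrix $\mathbf{H}$ whose row space $\mathcal{C}_H$ is not contained in the row space $\mathcal{C}_A$ of A, then by Lemma 1 we can find another valid encoder matrix as the generator matrix of the code $\mathcal{C}_A \cap \mathcal{C}_H$. Since $\mathcal{C}_A \cap \mathcal{C}_H$ is a proper subcode of $\mathcal{C}_H$, we have $\dim(\mathcal{C}_A \cap \mathcal{C}_H) < \dim(\mathcal{C}_H)$. Therefore the encoder matrix $\mathbf{H}$ has sub-optimal codelength. So from now on we only consider encoder matrices $\mathbf{H}$ such that $\mathcal{C}_H \subseteq \mathcal{C}_A$. Since we assume $\mathcal{C}_H \subseteq \mathcal{C}_A$, we can write $\mathbf{H} = \mathbf{S}\mathbf{A}$ for some matrix $\mathbf{S}$ with $k$ columns, where $k$ is the number of rows of A.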
Now using we restate Theorem 1 as follows. A matrix such that is a valid encoder matrix for the function update problem if and only if for any with and satisfies . We define the collection as the set of all vectors with such that i.e.,
[TABLE]
Theorem 2.
A matrix for some matrix is a valid encoder matrix for the function update problem if and only if
[TABLE]
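Now we define the collection $\mathcal{I}(\mathbf{A}, 2\epsilon)$ as the set of all non-zero linear combinations of $2\epsilon$ or fewer columns of A over $\mathbb{F}_q$, i.e.,
$$\mathcal{I}(\mathbf{A}, 2\epsilon) = \{\mathbf{A}\mathbf{e} : \mathbf{e} \in \mathbb{F}_q^n,\ \mathrm{wt}(\mathbf{e}) \le 2\epsilon,\ \mathbf{A}\mathbf{e} \ne \mathbf{0}\}.$$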
Note that since .
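Corollary 1.
$\mathbf{H} = \mathbf{S}\mathbf{A}$ is a valid encoder matrix for the $(\mathbf{A}, \epsilon)$ function update problem if and only if
$$\mathbf{S}\mathbf{y} \ne \mathbf{0} \quad \text{for every non-zero linear combination } \mathbf{y} \text{ of } 2\epsilon \text{ or fewer columns of } \mathbf{A}.$$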
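The condition of Corollary 1 lends itself to a direct computational check. The sketch below works over the binary field and assumes that the collection consists of the non-zero sums of at most 2ε columns of A (2ε being the largest possible weight of the difference of two ε-sparse updates). The matrices are hypothetical toy values chosen for illustration.

```python
from itertools import combinations

def column_combinations(A, t):
    """Non-zero sums over GF(2) of t or fewer columns of A."""
    k, n = len(A), len(A[0])
    cols = [tuple(A[i][j] for i in range(k)) for j in range(n)]
    out = set()
    for w in range(1, t + 1):
        for chosen in combinations(cols, w):
            y = tuple(sum(bits) % 2 for bits in zip(*chosen))
            if any(y):
                out.add(y)
    return out

def satisfies_corollary(S, A, eps):
    """True if S y != 0 for every non-zero sum y of at most 2*eps columns of A."""
    for y in column_combinations(A, 2 * eps):
        if not any(sum(s * v for s, v in zip(row, y)) % 2 for row in S):
            return False
    return True

A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]  # hypothetical 3 x 4 matrix of rank 3
S_good = [[1, 1, 0], [0, 1, 1]]                  # gives a length-2 encoder S A
S_bad = [[1, 1, 0]]                              # annihilates y = (1, 1, 0)
print(len(column_combinations(A, 2)), satisfies_corollary(S_good, A, 1), satisfies_corollary(S_bad, A, 1))
```

For this toy matrix the collection has six elements, so one non-zero vector of length three, namely (1, 1, 1), is never produced by two or fewer columns.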
III Necessary and Sufficient Condition for
In this section we will characterize the family of point-to-point function update problems where linear coding is useful to save at least one transmission compared to the naive scheme i.e., . First we will derive some preliminary results which will be helpful to derive the main result of this section.
Lemma 2.
The collection is closed under non-zero scalar multiplication.
Proof.
Suppose . There exists a with such that . For any , , where . Now as , it follows that . Again as and . Therefore for any , . Hence the lemma holds. ∎
III-A A coding scheme for a family of function update problems
Consider any function update problem where there exists a non-zero such that . Let be the subspace of generated by . Therefore dim. Note that dim. Let be a generator matrix of the code . The matrix S is a parity check matrix of the code .
Lemma 3.
The matrix S satisfies for all .
Proof.
Proof by contradiction. Let there exist a such that . This implies that . Therefore there exists an such that . Now as , exists and hence . Now as and is closed under non-zero scalar multiplication (using Lemma 2), which is a contradiction. Hence the lemma holds. ∎
Now using Corollary 1, we obtain a valid encoder matrix for the function update problem over as with codelength , whenever there exists a non-zero vector in . We do not claim that this coding scheme yields the optimal codelength .
Example 1.
Consider the function update problem over binary field where , , and the matrix is given by
[TABLE]
Note that rank over . The non-zero vector satisfies . The parity check matrix of the code , generated by is given by
[TABLE]
Therefore we obtain a valid encoder matrix with codelength for the function update problem as
[TABLE]
∎
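The scheme of Section III-A can be sketched for the binary field as follows. This is a minimal illustration under stated assumptions: the collection is taken to be the set of non-zero sums of at most 2ε columns of A, the matrix `A` is a hypothetical instance (not the matrix of Example 1), and `save_one_transmission` is an illustrative name. Given a non-zero vector y outside the collection, the code builds a matrix S with k - 1 rows whose null space is {0, y} and returns the encoder SA.

```python
from itertools import combinations, product

def column_combinations(A, t):
    """Non-zero sums over GF(2) of t or fewer columns of A."""
    k, n = len(A), len(A[0])
    cols = [tuple(A[i][j] for i in range(k)) for j in range(n)]
    sums = {tuple(sum(b) % 2 for b in zip(*c))
            for w in range(1, t + 1) for c in combinations(cols, w)}
    return {y for y in sums if any(y)}

def matmul(S, A):
    """Matrix product over GF(2)."""
    return [[sum(S[r][i] * A[i][c] for i in range(len(A))) % 2
             for c in range(len(A[0]))] for r in range(len(S))]

def save_one_transmission(A, eps):
    """Return (S, y, H) with H = S A of k - 1 rows, or None if no suitable y exists."""
    k = len(A)
    reachable = column_combinations(A, 2 * eps)
    for y in product((0, 1), repeat=k):
        if any(y) and y not in reachable:
            j = y.index(1)                 # a coordinate where y is non-zero
            S = []
            for i in range(k):
                if i != j:
                    row = [0] * k
                    row[i] = 1
                    row[j] = y[i]          # row . y = y[i] + y[i] = 0 over GF(2)
                    S.append(row)
            return S, y, matmul(S, A)
    return None

A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]   # hypothetical instance, eps = 1
S, y, H = save_one_transmission(A, 1)
print(y, S, H)
```

When every non-zero vector is reachable, the function returns `None`, which corresponds to the case where this technique yields no saving.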
Now we derive a necessary and sufficient condition for any function update problem to save at least one transmission using linear coding scheme compared to the naive scheme.
Theorem 3.
For an function update problem, if and only if .
Proof.
To prove the theorem we first show that for any function update problem if then . Next we show that if then .
Proof of first part i.e., if then :
Let be an optimal encoder matrix with . Then there exists a matrix such that . From Corollary 1 we obtain for all . Since contains all non-zero vectors from , the columns of S are linearly independent. Hence . Again from (1), we have . Hence .
Proof of second part i.e., if then :
If then there exists a non-zero vector such that . Therefore using the technique described in Section III-A we can construct a valid encoder matrix such that we save one transmission compared to the naive scheme, i.e., . Hence the theorem holds. ∎
Now we provide a sufficient condition on the field size to save at least one transmission compared to the naive scheme for any function update problem.
Corollary 2.
For any with rank where ,
* if *
.
Proof.
If then
[TABLE]
Using the fact that if then , we have
[TABLE]
Now for any with rank, the number of distinct non-zero linear combinations of or fewer columns of A is at the most . Therefore . Hence . Now using Theorem 3 we have . ∎
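The counting step in this proof can be illustrated numerically. The sketch below assumes one natural count, namely that at most C(n, i)(q - 1)^i non-zero combinations use exactly i columns, summed over i from 1 to 2ε; the closed-form condition of Corollary 2 itself is not reproduced here. If this count is smaller than the number q^k - 1 of non-zero vectors of length k, some non-zero vector lies outside the collection and Theorem 3 applies. Function names are illustrative.

```python
from math import comb

def max_collection_size(n, q, eps):
    """Upper bound on the number of non-zero combinations of 2*eps or fewer columns."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(1, 2 * eps + 1))

def counting_guarantees_saving(n, k, q, eps):
    """True if the count cannot exhaust all q**k - 1 non-zero vectors of length k."""
    return max_collection_size(n, q, eps) < q ** k - 1

print(max_collection_size(3, 2, 1), counting_guarantees_saving(3, 3, 2, 1))
print(max_collection_size(4, 2, 1), counting_guarantees_saving(4, 3, 2, 1))
```

The test is only sufficient: when it fails, a saving may still be possible for a particular matrix A, since distinct column combinations can coincide.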
III-B Relation with covering radius
The covering radius of an $[N, K]$ linear code $\mathcal{C}$ over $\mathbb{F}_q$, denoted by $\rho(\mathcal{C})$, is defined as the smallest integer $r$ such that the spheres of radius $r$ centered at the codewords of $\mathcal{C}$ cover the whole space $\mathbb{F}_q^N$. We can determine the covering radius of a linear code in terms of the cosets of the code. For any vector $\mathbf{v} \in \mathbb{F}_q^N$, the set $\mathbf{v} + \mathcal{C}$ is called a coset of the code $\mathcal{C}$, and in any coset a vector with minimum Hamming weight is called a coset leader. The covering radius of the code is the largest Hamming weight among all the coset leaders. Upon denoting by $\mathbf{P}$ a parity check matrix of $\mathcal{C}$, $\mathbf{P}\mathbf{v}$ is the syndrome of the vector $\mathbf{v}$. Two vectors have the same syndrome if and only if they belong to the same coset of $\mathcal{C}$. Hence there is a one-to-one correspondence between syndromes and cosets [10].
For the function update problem let be the linear code generated by A. Hence A is a parity check matrix of the code which is the dual code of . Now considering a vector , can be expressed as where with . Therefore the vector denotes the syndrome of a vector with that belongs to some coset of . Note that any vector that belongs to is non-zero, hence can not be the syndrome of the codewords of . Note that is the syndrome of the coset leader of the coset . Since is a vector that belongs to the coset and , the Hamming weight of the coset leader of the coset is at the most .
Corollary 3.
For an function update problem, if and only if .
Proof.
From Theorem 3 we have if and only if . Hence to prove the corollary we prove that for any function update problem if and only if .
Proof of if : Since the collection contains all non-zero vectors over , each non-zero vector is the syndrome of some vector with that belongs to some coset of . Since there exists a one-to-one correspondence between the syndromes and cosets, for each vector there exists a coset of that contains a vector with . Hence the coset leader of each coset has Hamming weight at the most . Therefore the largest Hamming weight of the coset leaders among all cosets of is at the most . Hence .
Proof of if : Since , the largest Hamming weight of the coset leaders among all cosets of is at the most . Since there exists a one-to-one correspondence between the syndromes and cosets, each syndrome can be expressed as for some coset leader satisfies . We know that the syndromes of a particular linear code covers the whole space. Hence any vector can be expressed as for some with . Since consists only non-zero vectors that satisfies the above property, .
Hence if and only if . ∎
Example 2.
In this example we calculate the minimum number of rows of such that is guaranteed for , and . Now if and only if . From Table I of [11] we observe that for any binary code of length and dimension up to , covering radius is at least . Thus dim implies . Hence and . Therefore for any matrix with rank we can save one transmission compared to the naive scheme. One such example of A is given in Example 1. ∎
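For small binary instances the covering radius used in this section can be computed by brute force through the syndrome and coset-leader correspondence described above. The following sketch treats A as a parity check matrix, records the lightest vector for each syndrome, and returns the largest such weight. The two matrices are hypothetical examples, and comparing the result with 2ε assumes that the collection is built from at most 2ε columns.

```python
from itertools import product

def covering_radius_from_parity_check(A):
    """Covering radius of the binary code {v : A v = 0}, via coset leader weights."""
    n = len(A[0])
    lightest = {}  # syndrome -> smallest weight of a vector with that syndrome
    for v in product((0, 1), repeat=n):
        s = tuple(sum(a * b for a, b in zip(row, v)) % 2 for row in A)
        w = sum(v)
        if w < lightest.get(s, n + 1):
            lightest[s] = w
    return max(lightest.values())

A1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]  # hypothetical: radius exceeds 2*eps for eps = 1
A2 = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 0]]  # hypothetical: radius equals 2*eps for eps = 1
print(covering_radius_from_parity_check(A1), covering_radius_from_parity_check(A2))
```

With ε = 1, the first matrix has covering radius larger than 2ε, so some syndrome is not a sum of two or fewer columns and a transmission can be saved; for the second matrix every syndrome is such a sum.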
IV Lower Bound on Optimal Codelength
In this section we derive a lower bound on the optimal codelength over . First we derive two preliminary lemmas which will help to derive the lower bound.
Lemma 4.
For any function update problem and for any invertible matrix , .
Proof.
To prove the lemma we first show that and then .
Proof for : Suppose . Then from (3) we have . Now left multiplying both side by K we obtain since K is invertible. Hence .
Proof for : Suppose . Then from (3) we have . Since K is invertible, exists. Now left multiplying both side by we obtain . Hence .
Hence the lemma holds. ∎
For any function update problem with rank(A)=. Hence A contains linearly independent columns. Now consider a matrix which contains linearly independent columns of A. Note that is an full rank matrix and hence invertible. Denote and . From Lemma 4 we observe that and are equivalent function update problems and any matrix is a valid encoder matrix of function update problem if and only if is a valid encoder matrix of function update problem. Hence we conclude that the linear code generated by the rows of is a subcode of the linear code generated by the rows of i.e., . Hence there exists a matrix such that . Now using the equivalence between and function update problems and using Corollary 1 we say that is a valid encoder matrix of the function update problem if and only if for all .
Let be the set of all non-zero vectors in of Hamming weight at the most .
Lemma 5.
For any function update problem
.
Proof.
For an function update problem, where and consists of linearly independent columns of A. Note that the sub-matrix of that contains the corresponding columns forms an identity matrix. Now if we consider any non-zero linear combination of $2\epsilon$ or fewer columns of this sub-matrix we obtain all non-zero vectors over with Hamming weight at most $2\epsilon$. Hence . The last equality holds due to Lemma 4. ∎
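Let $\kappa_q(N, d)$ be the maximum dimension among all linear codes over $\mathbb{F}_q$ with blocklength $N$ and minimum distance $d$.
Theorem 4.
The optimal codelength $\ell_{q,\mathrm{opt}}$ of the $(\mathbf{A}, \epsilon)$ function update problem over $\mathbb{F}_q$, where A has $k$ rows, satisfies
$$\ell_{q,\mathrm{opt}} \ge k - \kappa_q(k, 2\epsilon + 1).$$
Proof.
Let $\mathbf{H}$ be an optimal encoder matrix of the $(\mathbf{A}, \epsilon)$ function update problem with codelength $\ell_{q,\mathrm{opt}}$. Then, by the discussion preceding Lemma 5, there exists a matrix $\mathbf{S}$ with $\ell_{q,\mathrm{opt}}$ rows and $k$ columns such that $\mathbf{H} = \mathbf{S}\mathbf{K}^{-1}\mathbf{A}$, where K is the invertible matrix formed by $k$ linearly independent columns of A. Now using Corollary 1 we have $\mathbf{S}\mathbf{y} \ne \mathbf{0}$ for every non-zero linear combination $\mathbf{y}$ of $2\epsilon$ or fewer columns of $\mathbf{K}^{-1}\mathbf{A}$. Since, by Lemma 5, these combinations include every non-zero vector of Hamming weight at most $2\epsilon$, it follows that $\mathbf{S}\mathbf{y} \ne \mathbf{0}$ for all non-zero $\mathbf{y} \in \mathbb{F}_q^k$ with $\mathrm{wt}(\mathbf{y}) \le 2\epsilon$. Therefore any set of $2\epsilon$ columns of S is linearly independent. Hence S is a parity check matrix of a linear code of block length $k$ and minimum distance at least $2\epsilon + 1$. The dimension of this code is at least $k - \ell_{q,\mathrm{opt}}$, and thus $k - \ell_{q,\mathrm{opt}} \le \kappa_q(k, 2\epsilon + 1)$. Then $\ell_{q,\mathrm{opt}} \ge k - \kappa_q(k, 2\epsilon + 1)$. ∎
Theorem 4 provides a lower bound that is aware of the field size $q$. This is tighter than the bound given in [2, 3, 1] since from the Singleton bound we know that $\kappa_q(k, 2\epsilon + 1) \le k - 2\epsilon$, and this combined with Theorem 4 yields $\ell_{q,\mathrm{opt}} \ge 2\epsilon$. Hence, irrespective of the matrix A, a necessary condition for $\ell_{q,\mathrm{opt}} = 2\epsilon$ is that a $[k, k - 2\epsilon]$ MDS code over $\mathbb{F}_q$ must exist.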
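The gap between the field-size aware bound and the field-size independent bound can be seen on tiny binary instances. The sketch below searches, over the binary field only, for the smallest number of rows r of a matrix with k columns in which any t columns are linearly independent; with t = 2ε this number equals k minus the largest dimension of a binary code of length k and minimum distance at least 2ε + 1. The function names are illustrative, and the exhaustive search is practical only for very small k.

```python
from itertools import combinations, combinations_with_replacement

def any_t_columns_independent(cols, t):
    """Over GF(2): every non-empty set of at most t columns has a non-zero XOR."""
    for w in range(1, t + 1):
        for idx in combinations(range(len(cols)), w):
            acc = 0
            for i in idx:
                acc ^= cols[i]
            if acc == 0:
                return False
    return True

def min_rows(k, t):
    """Smallest r such that some r x k binary matrix has any t columns independent."""
    for r in range(1, k + 1):
        candidates = range(1, 2 ** r)  # non-zero columns encoded as integers
        for cols in combinations_with_replacement(candidates, k):
            if any_t_columns_independent(cols, t):
                return r
    raise AssertionError("unreachable: the identity matrix works for r = k")

k, eps = 5, 1
print(min_rows(k, 2 * eps), 2 * eps)
```

For k = 5 and ε = 1 the binary bound evaluates to 3 coded symbols, whereas the field-size independent bound gives 2.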
V Code constructions
In this section we first derive an upper bound on the optimal codelength over and then provide code constructions for function update problem when A is in form given by (4). Define .
Theorem 5.
The optimal codelength of the function update problem over satisfies
.
Proof.
From Corollary 1 we have that a matrix for some matrix , is a valid encoder matrix if and only if . To satisfy this condition it is sufficient that any set of columns of S are linearly independent. Now consider S as a parity check matrix of the largest linear code with blocklength and minimum distance . The resulting codelength . Hence the upper bound on the optimal codelength holds. ∎
V-A Code constructions for striped data
In this section we provide linear code construction of an function update problem where follows the structure given by
[TABLE]
where and is a matrix over whose all elements are [math]. Let be the number of repetitions of in the matrix . Hence we write and . First we consider the family of function update problems where and show that for this case the lower bound on optimal codelength given in Theorem 4 and the upper bound on optimal codelength given in Theorem 5 exactly matches with each other. Hence we characterize the optimal codelength for this family of function update problems. Our code construction is based on an appropriately chosen linear error correcting code. Note that in Section IV of [1] the authors provided a linear code construction based on maximally recoverable subcodes (MRSC) which requires field size and uses an MDS code. In comparison our code construction is suitable for any field size.
V-A1 Code Constructions for the family of function update problems with
In this sub-section we first calculate the optimal codelength for such family of function update problems and then provide a code construction based on an appropriately chosen linear error correcting code.
Theorem 6.
For the family of function update problems with the optimal codelength over is given by
[TABLE]
Proof.
Consider any . Then can be written as for some with . Hence we write where denotes the column of and for all . Now . Since , at the most terms among are non-zero and each has Hamming weight at the most . Hence for any , we have . It is easy to observe that . So using Theorem 5 we have . Again from Theorem 4 we have . Since the lower bound and the upper bound matches with each other we have . ∎
Now we provide a code construction for the family of function update problems with . Since for any function update problem with the value of is , it is sufficient that for any with . Hence it is sufficient that any columns of S are linearly independent. Now consider S as a parity check matrix of a linear code of maximum dimension with blocklength and minimum distance and set the encoder matrix . This code achieves the optimal codelength . Now if then there exists an MDS code over with blocklength and minimum distance which has maximum dimension among all linear codes over . Hence choosing S as a parity check matrix of an MDS code and encoder matrix we achieve codelength which matches the codelength achieved by the construction given in Section IV of [1] which also requires .
Example 3.
Consider an function update problem over where and is given by
[TABLE]
Now from [12] we have . Hence choosing S as parity check matrix of a repetition code over we achieve codelength .
If we view the above matrix as over and the function update problem over where , from [12] we have . Hence choosing S as parity check matrix of a MDS code over we achieve codelength . ∎
V-A2 Code constructions for the family of function update problems where
In this sub-section we provide a linear code construction for the family of function update problems where is given in (4) with . A matrix is a valid encoder matrix if and only if there exists a matrix such that satisfies for all . For any vector we write for some with . Hence
[TABLE]
where with and each . Since , at the most vectors among are non-zero. Hence at the most vectors among are non-zero. Denote where each . Therefore we have that at the most vectors among are non-zero. Now for any we write
[TABLE]
where is the sub-matrix of S containing to columns of S.
I. Case-1, , : To satisfy the condition given in (7) for it is sufficient that the columns of any two or fewer sub-matrices among form linearly independent set. Hence the columns of each sub-matrix are linearly independent. Let be the -dimensional subspace of generated by the columns of over . Now to satisfy the linear independence property of the columns of two or fewer sub-matrices among , it is sufficient to have for any and . Our code construction for an function update problem where is based on subspace codes.
Code Construction 1.
Our aim is to construct a matrix where is the sub-matrix of S containing to columns of S such that the subspaces generated by the columns any two sub-matrices and for are trivially intersecting. Note that for any the subspaces and generated by the columns of and respectively are dimensional subspace of and satisfies . Hence to construct such S matrix we utilize pairwise trivially intersecting -dimensional subspaces of . From the literature on subspace codes [13, 14, 15], we know that if then there exist at least pairwise trivially intersecting -dimensional subspaces in . Hence if and provided it is possible to find pairwise trivially intersecting -dimensional subspaces of . Now to construct we choose a basis of subspace which contains vectors over and these linearly independent vectors form the columns of the sub-matrix . After constructing such S matrix, we set which is a valid encoder matrix for the function update problem with . Using this code construction we achieve codelength for function update problem if .
Example 4.
Consider an function update problem over with where is given by
[TABLE]
where is given by
[TABLE]
Now our aim is to construct a matrix such that the subspaces generated by the columns of and respectively are pairwise trivially intersecting. From our construction we have that it is possible to find pairwise trivially intersecting -dimensional subspaces of if and provided . If we let then i.e., . Hence over it is possible to construct a matrix S such that is a valid encoder matrix for the function update problem. One possible choice of pairwise trivially intersecting -dimensional subspaces of is , and . Hence the matrix is given by
[TABLE]
∎
II. Case-2, , : Here we provide a linear code construction for the family of function update problems where and is given in (4) with . To satisfy the condition given in (7) for it is sufficient that the columns of any or fewer sub-matrices among form a linearly independent set.
Code Construction 2.
Our code construction uses a linear code over of maximum possible dimension with block length and minimum distance . Let be a parity check matrix of such a linear code with where denotes the maximum dimension of a linear code over with block length and minimum distance . Note that any columns of are linearly independent over . Let be a primitive element of and be the primitive polynomial corresponding to where each for all . The corresponding companion matrix is given by
[TABLE]
Now we define a matrix where for each is given by . Now for each and , is given by
[TABLE]
where is the entry of . Since any or fewer columns of are linearly independent, Theorem 3 in [13] implies that the columns of any or fewer block matrices among are linearly independent. Hence the matrix is a valid encoder matrix over with codelength . Since any or fewer columns of are linearly independent we have and hence with equality if and only if is a parity check matrix of an MDS code over . Such an MDS code is guaranteed to exist if . Hence using this code construction we achieve codelength if .
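The expansion step of this construction can be illustrated on a small instance. The sketch below is not the paper's exact procedure: equation (8) is not reproduced here, so the code assumes that it replaces each zero entry of the parity check matrix by a zero block and each entry equal to a power of the primitive element by the same power of the companion matrix. The concrete choices (extension field of size four over the binary field, a 3 x 4 Vandermonde parity check matrix, and the names `embed`, `expand`, `any_t_blocks_independent`) are illustrative assumptions.

```python
from itertools import combinations

def rank_gf2(rows):
    """Rank over GF(2) of a matrix given as a list of 0/1 rows."""
    basis = {}  # leading-bit position -> reduced row stored as an int bitmask
    for row in rows:
        v = int("".join(str(b) for b in row), 2)
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)

def matmul_gf2(A, B):
    """Matrix product over GF(2)."""
    return [[sum(a * b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

M = 2                                   # assumed extension degree: GF(4) over GF(2)
IDENT = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
C = [[0, 1], [1, 1]]                    # companion matrix of x^2 + x + 1
POWERS = [IDENT, C, matmul_gf2(C, C)]   # C^0, C^1, C^2 (and C^3 = I)

def embed(entry):
    """Map a GF(4) element to a 2 x 2 binary block.
    None encodes 0; an integer e encodes alpha^e."""
    return ZERO if entry is None else POWERS[entry % 3]

# Vandermonde matrix with columns (1, a, a^2) for a = 0, 1, alpha, alpha^2,
# written with exponents of alpha; any three columns are independent over GF(4).
H = [
    [0,    0, 0, 0],
    [None, 0, 1, 2],
    [None, 0, 2, 1],
]

def expand(H):
    """Replace every entry of H by its binary block."""
    rows = []
    for h_row in H:
        blks = [embed(e) for e in h_row]
        for r in range(M):
            rows.append([bit for blk in blks for bit in blk[r]])
    return rows

def any_t_blocks_independent(S, n_blocks, t):
    """True if the columns of every choice of t block columns are independent."""
    for subset in combinations(range(n_blocks), t):
        sub = [sum((row[M * j:M * (j + 1)] for j in subset), []) for row in S]
        if rank_gf2(sub) != M * t:
            return False
    return True

S = expand(H)                           # 6 x 8 binary matrix
print(any_t_blocks_independent(S, n_blocks=4, t=3))
```

The check succeeds because the powers of the companion matrix, together with the zero matrix, form a copy of the extension field inside the binary matrices, so independence of columns over the extension field carries over to independence of the expanded block columns over the base field.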
Example 5.
Consider an function update problem over with where is given by
[TABLE]
where is given by
[TABLE]
Note that is a primitive polynomial corresponding to and companion matrix corresponding to the primitive polynomial is given by
[TABLE]
Now we set as a parity check matrix of an MDS code over which is a repetition code over . Hence is given by
[TABLE]
Now we obtain the matrix from using (8) as
[TABLE]
Now we obtain a valid encoder matrix with codelength over . ∎
V-A3 Comparison with the code in Remark 4 of [1]
We first briefly describe the system model of Remark 4 in [1] using our notation. In Remark 4 of [1] the authors consider the transmission of updated information symbol vectors , . The receiver knows a coded version of each information symbol vector denoted by where and and demands the updated version of the coded demands i.e., . We can view this problem as an function update problem where takes the form given in (4) with the number of repetitions of the matrix along the block diagonal entries of being equal to . We denote the information symbol vector as and the update vector as with . The authors of [1] provide a valid code construction with codelength based on an MRSC using Construction 1 in [1]. This construction from [1] is valid over any field .
To construct a valid code for the above function update problem we choose as a parity check matrix of an MDS code over and such a code exists if . Then we construct the matrix from using (8). Hence if we construct a valid code with codelength for the function update problem. Note that for any positive integer , . Hence over any finite field our construction yields a valid encoder matrix with codelength for the function update problem described above.
V-A4 Comparison of Code Construction 1 and Code Construction 2 for function update problem with
In this sub-section we consider Code Construction 2 for the special case of and compare its performance with that of Code Construction 1. Consider an function update problem where is of the form given in (4). To obtain a valid code for the function update problem using Code Construction 2, we use a linear code over of maximum possible dimension with blocklength and minimum distance . Let be a parity check matrix of such a linear code with where denotes the maximum possible dimension of a linear code over with blocklength and minimum distance at least . We construct a matrix from using (8) and obtain a valid encoder matrix with codelength by multiplying S with . Note that any two or fewer columns of are linearly independent. Hence the subspaces generated by the individual columns of are pairwise trivially intersecting. Therefore to construct such a matrix it is necessary and sufficient that the number of trivially intersecting -dimensional subspaces of the space is at least . From [16] we know that the space contains exactly trivially intersecting -dimensional subspaces. Hence to construct a matrix S it is necessary and sufficient that
[TABLE]
Now using the fact (since ) we observe that i.e., is a sufficient condition for such an encoder matrix to exist. Hence applying Code Construction 2 to an function update problem over we achieve codelength if the field size . Hence if we achieve codelength using Code Construction 2 by choosing a parity check matrix of an MDS code over and such an MDS code exists over since . Note that we also achieve codelength for the function update problem using Code Construction 1 if . In Code Construction 2 the achieved codelength is always an integer multiple of , whereas Code Construction 1 applied to the function update problem can achieve any codelength provided the field size . Hence for the function update problem Code Construction 2 becomes a special case of Code Construction 1. This also motivates us to study Code Construction 1 separately for the function update problem.
VI Equivalence with a Functional Index Coding problem
In this section we discuss a variation of the classical index coding problem in which each user demands a coded version of the information symbols present at the transmitter and already knows a subset of the (uncoded) information symbols as side information. This is a special case of the Generalized Index Coding problem [7, 8] and the Functional Index Coding problem [9]. The authors of [7, 8] generalized the classical index coding problem to the setting where each receiver knows some linearly coded information symbols as side information and demands some linearly coded information symbols. Additionally, the authors of [7] assume that the symbols available at the transmitter are themselves linearly coded information symbols. In [9], the authors generalized the index coding problem further by allowing the side information as well as the demanded messages to be arbitrary functions of the information symbols; this is called the functional index coding problem. Here we consider a special case of both the generalized and the functional index coding problems, and then relate the function update problem to this family of functional index coding problems.
VI-A Functional Index Coding with Coded Demand and Uncoded Side Information
Consider a broadcast network scenario with a single transmitter and receivers . The transmitter has a vector of information symbols . Each receiver knows a subset of the information symbols as side information. Let be the side-information vector of receiver where . Each receiver demands a coded version of the information symbol vector . Let be the coded demand of receiver where with rank. Upon denoting and we describe the problem instance as the functional index coding problem. A valid encoding function over for an functional index coding problem is
[TABLE]
such that for each receiver there exists a decoding function satisfying the following property: for every .
The objective is to design a tuple of encoding and decoding functions that minimizes the codelength and to determine the optimal codelength for the given functional index coding problem which is the minimum codelength among all valid coding schemes.
A linear code for an functional index coding problem is defined as a coding scheme where the encoding function is a linear transformation over described as , where is the encoder matrix for linear functional index code. The minimum codelength among all valid linear coding schemes for the functional index coding problem over the field will be denoted as .
Now we derive a design criterion for a matrix to be a valid encoder matrix for functional index coding problem. We define the set , or equivalently , of vectors of length such that and for some choice of i.e.,
[TABLE]
Theorem 7.
The matrix is a valid encoder matrix for the functional index coding problem if and only if
[TABLE]
Proof.
A matrix is a valid encoder matrix for the functional index coding problem if and only if at each receiver , can be uniquely determined from the received codeword and the side information . Hence for any pair of distinct information symbol vectors such that the side-information vectors available at the receiver are identical i.e., but the demanded coded information symbol vectors are distinct i.e., the transmitted codeword must be distinct from so that the receiver can distinguish the two demanded coded vectors. Equivalently, the condition should hold for every pair such that and for some . Therefore is a valid encoder matrix if and only if
[TABLE]
for all such that and for some . Now denoting we have
[TABLE]
for all such that and for some . Hence the statement of the theorem follows. ∎
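The criterion established in this proof can be tested by exhaustive search on small binary instances. The sketch below is illustrative only: it assumes the binary field, encodes each receiver's side information as a list of known symbol indices and each demand as a binary matrix, and treats a difference vector as critical when, for some receiver, it vanishes on that receiver's side-information indices while the receiver's demand matrix maps it to a non-zero vector. The function names and the toy instance are hypothetical.

```python
from itertools import product

def matvec_gf2(M, z):
    """Matrix-vector product over GF(2)."""
    return [sum(a * b for a, b in zip(row, z)) % 2 for row in M]

def in_critical_set(z, side_info, demands):
    """True if, for some receiver, z is zero on the side-information
    indices and the demand matrix maps z to a non-zero vector."""
    for known, A_i in zip(side_info, demands):
        if all(z[j] == 0 for j in known) and any(matvec_gf2(A_i, z)):
            return True
    return False

def is_valid_encoder(H, side_info, demands, n):
    """Brute-force check: H must not annihilate any critical vector."""
    for z in product((0, 1), repeat=n):
        if in_critical_set(z, side_info, demands) and not any(matvec_gf2(H, z)):
            return False
    return True

# Toy instance with three information symbols x0, x1, x2:
# receiver 1 knows x0 and demands x1 + x2; receiver 2 knows x1, x2 and demands x0.
side_info = [[0], [1, 2]]
demands = [[[0, 1, 1]], [[1, 0, 0]]]

print(is_valid_encoder([[1, 1, 1]], side_info, demands, n=3))  # single coded symbol x0 + x1 + x2
print(is_valid_encoder([[1, 0, 0]], side_info, demands, n=3))  # sending x0 alone
```

In this toy instance one transmitted symbol suffices: receiver 1 subtracts its known symbol from the sum to obtain its demand, and receiver 2 subtracts its two known symbols. Sending the first symbol alone fails because it carries no information about the demand of receiver 1. The search is exponential in the number of information symbols, so it is only meant for sanity checks.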
VI-B Construction of an Equivalent Functional Index Coding Problem from a given Function Update problem
Now we construct an functional index coding problem starting from an function update problem. The number of receivers , the tuple of the side information indices and the tuple of coded demands are obtained from Algorithm 1.
Algorithm 1 considers every possible choice of such that and defines a new user in the functional index coding problem with demand matrix and side information .
Now we relate the set defined for the Function Update problem and the set defined in (9) for the constructed functional index coding problem.
Theorem 8.
For any given function update problem and its corresponding functional index coding problem, .
Proof.
To show that , we will show that and .
Proof for : Suppose a vector . Then from (3), we have and . Hence there exists a such that and . Now using the construction procedure described in Algorithm 1 we see that there exists a user in the constructed functional index coding problem such that and . The vector satisfies and . Hence .
Proof for : Suppose a vector . Then there exists at least one user such that and . Since we have . From Algorithm 1 we see that for any , . Note that . Again from the construction we have . Therefore . Hence .
Hence the theorem holds. ∎
Now we relate the problem of constructing linear codes for the function update problem to the problem of designing a linear coding scheme for the corresponding functional index coding problem.
Theorem 9.
A matrix such that for some matrix is a valid encoder matrix for the function update problem if and only if is a valid encoder matrix for the functional index coding problem.
Proof.
From Theorem 2 we know that is a valid encoder matrix for the function update problem if and only if it satisfies
[TABLE]
Now from Theorem 8 we have . Therefore using Theorem 7 we conclude that is a valid encoder matrix for the functional index coding problem if and only if is a valid encoder matrix for the function update problem. ∎
Acknowledgment
The authors thank Dr V. Lalitha for discussions regarding the topic of this paper.
References
- [1] N. Prakash and M. Médard, “Communication Cost for Updating Linear Functions When Message Updates are Sparse: Connections to Maximally Recoverable Codes,” IEEE Transactions on Information Theory, vol. 64, no. 12, pp. 7557–7576, Dec 2018.
- [2] P. Nakkiran, N. B. Shah, and K. V. Rashmi, “Fundamental limits on communication for oblivious updates in storage networks,” in 2014 IEEE Global Communications Conference, Dec 2014, pp. 2363–2368.
- [3] P. Nakkiran, N. B. Shah, K. V. Rashmi, A. Sahai, and K. Ramchandran, “Optimal Oblivious Updates in Distributed Storage Networks.” [Online]. Available: www.cs.cmu.edu/%7Ervinayak/papers/Ob Up.pdf
- [4] M. Mahdian, N. Prakash, M. Médard, and E. Yeh, “Updating Content in Cache-Aided Coded Multicast,” CoRR, vol. abs/1805.00396, 2018. [Online]. Available: https://arxiv.org/abs/1805.00396
- [5] R. E. Ali and V. R. Cadambe, “Multi-version Coding for Consistent Distributed Storage of Correlated Data Updates,” CoRR, vol. abs/1708.06042, 2017. [Online]. Available: https://arxiv.org/abs/1708.06042
- [6] Z. Wang and V. R. Cadambe, “Multi-Version Coding–An Information-Theoretic Perspective of Consistent Distributed Storage,” IEEE Transactions on Information Theory, vol. 64, no. 6, pp. 4540–4561, June 2018.
- [7] M. Dai, K. W. Shum, and C. W. Sung, “Data Dissemination With Side Information and Feedback,” IEEE Transactions on Wireless Communications, vol. 13, no. 9, pp. 4708–4720, Sep. 2014.
- [8] N. Lee, A. G. Dimakis, and R. W. Heath, “Index Coding With Coded Side-Information,” IEEE Communications Letters, vol. 19, no. 3, pp. 319–322, March 2015.
