Multidimensional Rational Covariance Extension with Approximate Covariance Matching
Axel Ringh, Johan Karlsson, Anders Lindquist

TL;DR
This paper extends the multidimensional rational covariance extension problem to handle approximate covariance matching, ensuring well-posedness and demonstrating applications in spectral estimation and texture generation.
Contribution
It introduces a framework for approximate covariance matching in RCEP, connecting soft and hard constraints, and proves their well-posedness with practical examples.
Findings
The problems are well-posed under approximate covariance matching.
Soft and hard constraint problems are connected via a homeomorphism.
Applications include spectral estimation and texture synthesis.
Abstract
In our companion paper "Multidimensional rational covariance extension with applications to spectral estimation and image compression" we discussed the multidimensional rational covariance extension problem (RCEP), which has important applications in image processing, and spectral estimation in radar, sonar, and medical imaging. This is an inverse problem where a power spectrum with a rational absolutely continuous part is reconstructed from a finite set of moments. However, in most applications these moments are determined from observed data and are therefore only approximate, and RCEP may not have a solution. In this paper we extend the results to handle approximate covariance matching. We consider two problems, one with a soft constraint and the other one with a hard constraint, and show that they are connected via a homeomorphism. We also demonstrate that the problems are well-posed…
| Mean | Std. | |
|---|---|---|
| Biased, exact matching | ||
| Biased, approximate matching, ME-solution | ||
| Biased, approximate matching, using true | ||
| Unbiased, approximate matching, ME-solution | ||
| Unbiased, approximate matching, using true |
| Mean | Std. | |
|---|---|---|
| Biased, exact matching | ||
| Biased, approximate matching, ME-solution | ||
| Biased, approximate matching, using true | ||
| Unbiased, approximate matching, ME-solution | ||
| Unbiased, approximate matching, using true |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
\headers
Approximate Multidimensional Rational Covariance ExtensionA. Ringh, J. Karlsson, and A. Lindquist
Multidimensional Rational Covariance Extension
with Approximate Covariance Matching††thanks: This work was supported by the Swedish Research Council (VR), the Swedish Foundation of Strategic Research (SSF), and the ACCESS Linnaeus Center, KTH Royal Institute of Technology.
Axel Ringh222Division of Optimization and Systems Theory, Department of Mathematics, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden. (, )
Johan Karlsson222Division of Optimization and Systems Theory, Department of Mathematics, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden. (, )
Anders Lindquist333Department of Automation and School of Mathematics, Shanghai Jiao Tong University, 200240 Shanghai, China. () 222Division of Optimization and Systems Theory, Department of Mathematics, KTH Royal Institute of Technology, 100 44 Stockholm, Sweden. (, )
Abstract
In our companion paper [54] we discussed the multidimensional rational covariance extension problem (RCEP), which has important applications in image processing, and spectral estimation in radar, sonar, and medical imaging. This is an inverse problem where a power spectrum with a rational absolutely continuous part is reconstructed from a finite set of moments. However, in most applications these moments are determined from observed data and are therefore only approximate, and RCEP may not have a solution. In this paper we extend the results [54] to handle approximate covariance matching. We consider two problems, one with a soft constraint and the other one with a hard constraint, and show that they are connected via a homeomorphism. We also demonstrate that the problems are well-posed and illustrate the theory by examples in spectral estimation and texture generation.
keywords:
Approximate covariance extension, trigonometric moment problem, convex optimization, multidimensional spectral estimation, texture generation.
\slugger
siconxxxxxxxx–x
1 Introduction
Trigonometric moment problems are ubiquitous in systems and control, such as spectral estimation, signal processing, system identification, image processing and remote sensing [5, 20, 59]. In the (truncated) multidimensional trigonometric moment problem we seek a nonnegative measure on satisfying the moment equation
[TABLE]
where , , and is the scalar product in . Here is a finite index set satisfying and . A necessary condition for (1.1) to have a solution is that the sequence
[TABLE]
satisfy the symmetry condition . The space of sequences (1.2) with this symmerty will be denoted and will be represented by vectors , formed by ordering the coefficient in some prescribed manner, e.g., lexiographical. Note that is isomorphic to , where is the cardinality of . However, as we shall see below, not all are bona fide moments for nonnegative measures .
In many of the applications mentioned above there is a natural complexity constraint prescribed by design specifications. In the context of finite-dimensional systems these constraints often arise in the requirement that transfer functions be rational. This leads to the rational covariance extension problem, which has been studied in various degrees of generality in [25, 26, 36, 53, 54] and can be posed as follows.
Define and let
[TABLE]
where is the closure of the convex cone of the coefficients corresponding to trigonometric polynomials
[TABLE]
that are positive for all .
The reason for referring to this problem as a rational covariance extension problem is that the numbers (1.2) correspond to covariances of a discrete-time, zero-mean, and homogeneous111Homogeneity generalizes stationarity in the case . stochastic process . The corresponding power spectrum, representing the energy distribution across frequencies, is defined as the nonnegative measure on whose Fourier coefficients are the covariances (1.2). A scalar version of this problem () was first posed by Kalman [34] and has been extensively studied and solved in the literature [24, 12, 6, 21, 48, 13, 41, 7, 61, 47]. It has been generalized to more general scalar moment problems [8, 27, 9] and to the multidimensional setting [26, 25, 54, 53, 36]. Also worth mentioning here is work by Lang and McClellan [39, 40, 45, 46, 38, 37] considering the multidimensional maximum entropy problem, which hence has certain overlap with the above literature.
The multidimensional rational covariance extension problem posed above has a solution if and only if , where is the open convex cone
[TABLE]
where is the inner product in (Theorem 2.5). However, the covariances are generally determined from statistical data. Therefore the condition may not be satisfied, and testing this condition is difficult in the multidimensional case. Therefore we may want to find a positive measure and a corresponding , namely
[TABLE]
so that is close to in some norm, e.g., the Euclidian norm . This is an ill-posed inverse problem which in general has an infinite number of solutions . As we already mentioned, we are interested in rational solutions (1.3), and to obtain such solutions we use regularization as in [54]. Hence, we seek a that minimizes
[TABLE]
subject to (1.5), where is a regularization parameter and
[TABLE]
is the nomalized Kullback-Leibler divergence [33, ch. 4] [15, 61]. As will be explained in Section 2, is always nonnegative and has the property .
In this paper we shall consider a more general problem in the spirit of [22]. To this end, for any Hermitian, positive definite matrix , we define the weighted vector norm and consider the problem
[TABLE]
which is the same as the problem above with . We shall refer to as the weight matrix.
Using the same principle as in [57], we shall also consider the problem to minimize subject to (1.5) and the hard constraint
[TABLE]
Since (1.5) are bona fide moments and hence , while in general, this problem will not have a solution if the distance from to is greater than . Hence the choice of must be made with some care. Analogously with the rational covariance extension with soft constraints in (1.7), we shall consider the more general problem
[TABLE]
to which we shall refer as the rational covariance extension problem with hard constraints. Again this problem reduces to the simpler problem by setting .
As we shall see, the soft-constrained problem (1.7) always has a solution, while the hard-constrained problem (1.9) may fail to have a solution for some weight matrices . However, in Section 7 we show that the two problems are in fact equivalent in the sense that whenever (1.9) has a solution there is a corresponding in (1.7) that gives the same solution, and any solution of (1.7) can also be obtained from (1.9) by a suitable choice of . The reason for considering both formulations is that one formulation might be more suitable than the other for the particular application at hand. For example, an absolute error estimate for the covariances is more naturally incorporated in the formulation with hard constraints. A possible choice of the weight matrix in either formulation would be the covariance matrix of the estimated moments, as suggested in [22]. This corresponds to the Mahalanobis distance and could be a natural way to incorporate uncertainty of the covariance estimates in the spectral estimation procedure.
Previous work in this direction can be found in [58, 22, 10, 57, 35], where [58, 35, 10] consider the problem of selecting an appropriate covariances sequence to match in a given confidence region. The two approximation problems considered here are similar to the ones considered in [57] and [22]. (For more details, also see [3, Ch. B].)
We begin in Section 2 by reviewing the regular multidimensional rational covariance extension problem for exact covariance matching in a broader perspective. In Section 3 we present our main results on approximate rational covariance extension with soft constraints, and in Section 4 we show that the dual solution is well-posed. In Section 5 we investigate conditions under which there are solutions without a singular part. The approximate rational covariance extension with hard constraints is considered in Section 6, and in Section 7 we establish a homeomorhism between the weight matrices in the two problems, showing that the problems are actually equivalent when solutions exist. We also show that under certain conditions the homeomorphism can be extended to hold between all sets of parameters, allowing us to carry over results from the soft-constrained setting to the hard-constrained one. In Section 8 we discuss the properties of various covariance estimators, in Section 9 we give a 2D example from spectral estimation, and in Section 10 we apply our theory to system identification and texture reconstruction. Some of the results of this paper were announced in [55] without proofs.
2 Rational covariance extension with exact matching
The trigonometric moment problem to determine a positive measure satisfying (1.1) is an inverse problem that has a solution if and only if [36, Theorem 2.3], where is the closure of , and then in general it has infinitely many solutions. However, the nature of possible rational solutions (1.3) will depend on the location of in . To clarify this point we need the following lemma.
Lemma 2.1**.**
*. *
Proof 2.2**.**
Obviously the inner product can be expressed in the integral form
[TABLE]
*and therefore for all , as and can have zeros only on sets of measure zero. Hence the statement of the lemma follows. *
Therefore, under certain particular conditions, the multidimensional rational covariance extension problem has a very simple solution with a polynomial spectral density, namely
[TABLE]
Proposition 2.3**.**
The multidimensional rational covariance extension problem has a unique polynomial solution (2.2) if and only if , namely , where
[TABLE]
The proof of Proposition 2.3 is immediate by noting that any such is a bona fide spectral density and noting that .
As seen from the following result presented in [36, Section 6], the other extreme occurs for , when only singular solutions exist.
Proposition 2.4**.**
*For any there is a solution of (1.1) with support in at most points. There is no solution with a absolutely continuous part . *
However, for any , there is a rational solution (1.3) parametrized by , as demonstrated in [54] by considering a primal-dual pair of convex optimization problems. In that paper the primal problem is a weighted maximum entropy problem, but as also noted in [54, Sec. 3.2], it is equivalent to
[TABLE]
where is the absolutely continuous part of . This amounts to minimizing the (regular) Kullback-Leibler divergence between and , subject to matching the given data [27, 54]. In the present case of exact covariance matching, this problem is equivalent to minimizing (1.6) subject to (1.1), since is fixed and the total mass of is determined by the [math]:th moment . Hence both and are constants in this case. Hence problem (1.7) is the natural extension of (2.3) for the case where the covariance sequence is not known exactly.
The primal problem (2.3) is a problem in infinite dimensions, but with a finite number of constraints. The dual to this problem will then have a finite number of variables but an infinite number of constraints and is given by
[TABLE]
In particular, Theorem 2.1 in [54], based on corresponding analysis in [36], reads as follows.
Theorem 2.5**.**
Problem (2.3) has a solution if and only if . For every and the functional in (2.4) is strictly convex and has a unique minimizer . Moreover, there exists a unique and a (not necessarily unique) nonnegative singular measure with support
[TABLE]
such that
[TABLE]
For any such , the measure
[TABLE]
*is an optimal solution to the problem (2.3). Moreover, can be chosen with support in at most points, where is the cardinality of the index set . *
If , only a singular measure with finite support would match the moment condition (Proposition 2.4). In this case, the problem (2.3) makes no sense, since any feasible solution has infinite objective value.
In [36] we also derived the KKT conditions
[TABLE]
which are necessary and sufficient for optimality of the primal and dual problems.
Since (2.3) is an inverse problem, we are interested in how the solution depends on the parameters of the problem. From Propositions 7.3 and 7.4 in [36] we have the following result.
Proposition 2.6**.**
*Let , and be as in Theorem 2.5. Then the map is continuous. *
To get a full description of well-posedness of the solution we would like to extend this continuity result to the map . However, such a generalization is only possible under certain conditions. The following result is a consequence of Proposition 2.6 and [54, Corollary 2.3].
Proposition 2.7**.**
*Let , , and be as in Theorem 2.5. Then, for and all , the mapping is continuous. *
Corollary 2.3 in [54] actually ensures that for and . However, in Section 4.2 we present a generalization of Proposition 2.7 to cases with , where then may be nonzero. (The proof of this generalization can be found in [52].) Here we shall also consider an example where continuity fails when belongs to the boundary , i.e., the corresponding nonnegative trigonometric polynomial is zero in at least one point.
3 Approximate covariance extension with soft constraints
To handle the case with noisy covariance data, when may not even belong to , we relax the exact covariance matching constraint (1.1) in the primal problem (2.3) to obtain the problem (1.7). In this case it is natural to reformulate the objective function in (2.3) to include a term that also accounts for changes in the total mass of . Consequently, we have exchanged the objective function in (2.3) by the normalized Kullback-Leibler divergence (1.6) plus a term that ensures approximate data matching.
Using the normalized Kullback-Leibler divergence, as proposed in [33, ch. 4] [15, 61], is an advantage in the approximate covariance matching problem since this divergence is always nonnegative, precisely as is the case for probability densities. To see this, observe that, in view of the basic inequality ,
[TABLE]
since is a nonnegative measure. Moreover, , as can be seen by taking in (1.6).
The problem under consideration is to find a nonnegative measure minimizing
[TABLE]
subject to (1.5). To derive the dual of this problem we consider the corresponding maximization problem and form the Lagrangian
[TABLE]
where are Lagrange multitipliers and is the corresponding trigonometric polynomial (1.4). However,
[TABLE]
and therefore
[TABLE]
where , and for , and hence .
In deriving the dual functional
[TABLE]
to be minimized, we only need to consider , as will take infinite values for . In fact, following along the lines of [54, p. 1957], we note that, if , (3) will tend to infinity when . Moreover, since , there is a neighborhood where , letting tend to infinity in this neighborhood, (3) will tend to infinity if . We also note that the nonnegative function can only be zero on a set of measure zero; otherwise the first term in (3) will be .
The directional derivative222Formally, the Gateaux differential [44]. of the Lagrangian (3) in any feasible direction , i.e., any direction such that for sufficiencly small , is easily seen to be
[TABLE]
In particular, the direction is feasible since for . Therefore, any maximizing must satisfy and hence (1.3b). Moreover, a maximizing choice of will require that
[TABLE]
as this nonnegative term can be made zero by the simple choice , and consequently (2.5) must hold. Finally, the directional derivative
[TABLE]
is zero for all if
[TABLE]
Inserting this together with (1.3b) and (3.3) into (3) then yields the dual functional
[TABLE]
Consequently the dual of the (primal) optimization problem (1.7) is equivalent to
[TABLE]
Theorem 3.1**.**
For every the functional in (3.5) is strictly convex and has a unique minimizer . Moreover, there exists a unique , a unique and a (not necessarily unique) nonnegative singular measure with support
[TABLE]
such that
[TABLE]
and the measure
[TABLE]
*is an optimal solution to the primal problem (1.7). Moreover, can be chosen with support in at most points. *
Proof 3.2**.**
The objective functional of the dual problem (3.5) can be written as the sum of two terms, namely
[TABLE]
where . The functional is strictly convex (Theorem 2.5), and trivially the same holds for since it is a positive definite quadratic form. Consequently, is strictly convex, as claimed. Moreover, is lower semicontinuous [54, Lemma 3.1] with compact sublevel sets [54, Lemma 3.2]. Likewise, is continuous with compact sublevel sets. Therefore is lower semicontinuous with compact sublevel sets and therefore has a minimum , which must be unique by strict convexity.
In view of (3.4), the optimal value of is given by
[TABLE]
and is hence unique. Since therefore the linear term in the gradient of takes the value at the optimal point, the analysis in [54, sect. 3.1.5] applies with obvious modifications, showing that there is a , which then must be unique, such that
[TABLE]
Moreover, there is a discrete measure with support in at most points such that (3.7b) holds; see, e.g., [36, Proposition 2.4]. Then (3.7a) holds as well. In view of (3.3),
[TABLE]
and consequently , and the support of must satisfy (3.6).
Finally, let be given in terms of by (1.5), and let be the corresponding primal functional in (1.7). Then, for any such ,
[TABLE]
*and hence is an optimal solution to the primal problem (1.7), as claimed. *
We collect the KKT conditions in the following corollary.
Corollary 3.3**.**
The conditions
[TABLE]
*are necessary and sufficient conditions for optimality of the dual pair (1.7) and (3.5) of optimization problems. *
4 On the well-posedness of the soft-constrained problem
In the previous sections we have shown that the primal and dual optimization problems are well-defined. Next we investigate the well-posedness of the primal problem as an inverse problem. Thus, we first establish continuity of the solutions in terms of the parameters , and .
4.1 Continuity of with respect to , and
We start considering the continuity of the optimal solution with respect to the parameters. The parameter set of interest is
[TABLE]
Theorem 4.1**.**
Let
[TABLE]
*Then the map is continuous on . *
Proof 4.2**.**
Following the procedure in [36, Proposition 7.3] we use the continuity of the optimal value (Lemma A.1) to show continuity of the optimal solution. To this end, let be a sequence of parameters in converging to as . Moreover, defining and for simplicity of notation, let and . By Lemma A.1, is bounded, and hence there is a subsequence, which for simplicity we also call , converging to a limit . If we can show that , then the theorem follows. To this end, choosing a , we have
[TABLE]
Consequently, by Lemma A.1,
[TABLE]
*However , and, since is continuous in , we obtain *
[TABLE]
*Letting in (4.3), we obtain the inequality . By strict convexity of the optimal solution is unique, and hence . *
4.2 Continuity of with respect to
We have now established continuity from to . In the same way as in Proposition 2.7 we are also interested in continuity of the map . This would follow if we could show that the map from to is continuous. From the KKT condition (3.11c), it is seen that is continuous in , and . In view of (3.11b), i.e.,
[TABLE]
continuity of would follow if is continuous in whenever it is finite. If , this follows from the continuity the map in . For the case , this is trivial since if is finite, then and is bounded away from zero (cf., Proposition 2.7). However, for the case the optimal may belong to the boundary , i.e., is zero in some point. The following proposition shows continuity of for certain cases.
Proposition 4.3**.**
*For , let and suppose that the Hessian is positive definite in each point where is zero. Then and the mapping from the coefficient vector to is continuous in the point . *
The proof of this proposition is given in [52]. From Propositions 4.3 and 2.7 the following continuity result follows directly.
Corollary 4.4**.**
*For all , the mapping is continuous in any point for which the Hessian is positive definite in each point where is zero. *
The condition is needed, since we may have pole-zero cancelations in when , and then may be finite even if . The following example shows that this may lead to discontinuities in the map (cf. Example 3.8 in [36]).
Example 4.5**.**
Let
[TABLE]
where and is the singular measure with support in . Since is positive, . Moreover, since
[TABLE]
we have that (see, e.g., [41, p. 2853]). Thus we know [54, Corollary 2.3] that for each we have a unique such that matches , and hence . However, for we have that and (Theorem 2.5). Then, for the sequence , where , we have , so
[TABLE]
*which shows that the mapping is not continuous. *
5 Tuning to avoid a singular part
In many situations we prefer solutions where there is no singular measure in the optimal solution. An interesting question is therefore for what prior and weight we obtain . The following result provides a sufficient condition.
Proposition 5.1**.**
Let and let be the Fourier coefficients of the prior . If the weight satisfies333Here denotes the subordinate (induced) matrix norm.
[TABLE]
then the optimal solution of (1.7) is on the form
[TABLE]
*i.e., the singular part vanishes. *
Remark 5.2**.**
Note that for a scalar weight, the bound (5.1) simplifies to
[TABLE]
*where is the cardinality of index set . *
For the proof of Proposition 5.1 we need the following lemma.
Lemma 5.3**.**
Condition (5.1) implies
[TABLE]
*where is the optimal value of in problem (1.7). *
Proof 5.4**.**
Let
[TABLE]
be the cost function of problem (1.7), and let be the optimal solution. Clearly, , and consequently
[TABLE]
since and . Therefore,
[TABLE]
*which is less than one by (5.1). Hence (5.1) implies (5.3). *
Proof 5.5** (Proof of Proposition 5.1).**
Suppose the optimal solution has a nonzero singular part , and form the directional derivative of (5.4) at in the direction . Then in (1.3a) does not vary, and
[TABLE]
where
[TABLE]
Then for all , and hence
[TABLE]
by (5.3) (Lemma 5.3). Consequently,
[TABLE]
*whenever , which contradicts optimality. Hence must be zero. *
The condition of Proposition 5.1 is just sufficient and is in general conservative. To illustrate this, we consider a simple one-dimensional example ().
Example 5.6**.**
Consider a covariance sequence , where , and a prior , and set . Then, since
[TABLE]
the sufficient condition (5.2) for an absolutely continuous solution is
[TABLE]
We want to investigate how restrictive this condition is.
Clearly we will have a singular part if and only if , in which case we have
[TABLE]
for some . In fact, it follows from in (3.11a) that . Moreover, (3.11b) and (3.11c) yield
[TABLE]
By eliminating , we get
[TABLE]
and solving for yields
[TABLE]
(note that and ). Again, using (3.11c) we have
[TABLE]
We are interested in for which , i.e.,
[TABLE]
which is equivalent to the two conditions
[TABLE]
which could be seen by noting that the left member of (5.6) must be positive and then squaring both sides. To find out whether this has a solution we consider three cases, namely , , and . For , condition (5.7) becomes , which is impossible since . Condition (5.7) cannot be satisfied when , because then would be negative which contradicts . When , Condition (5.7) is satisfied if and only if .
Consequently, there is no singular part if either is negative or
[TABLE]
*This shows that the condition (5.5) is not tight. *
6 Covariance extension with hard constraints
The alternative optimization problem (1.9) amounts to minimizing subject to the hard constraint , where . Hard constraints of this type were used in [57] in the context of entropy maximization. In general the data , whereas, by definition, . Consequently, a necessary condition for the existence of a solution is that and the strictly convex set
[TABLE]
have a nonempty intersection. In the case that , this intersection only contains one point [44, Section 3.12]. In this case, any solution to the moment problem contains only a singular part (Proposition 2.4), and then the primal problem (1.9) has a unique feasible point , but the objective function is infinite. Moreover, is strictly convex with , so if then (1.9) has the trivial unique optimal solution , and . The remaining case, needs further analysis.
To this end, setting , we consider the Lagrangian
[TABLE]
where . Therefore, in view of (3.1),
[TABLE]
where, as before, , and for , and hence . This Lagrangian differs from that in (3) only in the last term that does not depend on . Therefore, in deriving the dual functional
[TABLE]
we only need to consider , and a first variation in yields (1.3b) and (3.3). The directional derivative
[TABLE]
is zero for
[TABLE]
Thus inserting (1.3b) and (3.3) and (6.3) into (6) yields the dual functional
[TABLE]
to be minimized over all and . Since , there is a stationary point
[TABLE]
that is nonnegative as required.
For we must have , and consequently tends to zero as . By weak duality zero is therefore a lower bound for the minimization problem (1.9), and , which corresponds to the trivial unique solution and mentioned above. This solution is only feasible if . Therefore we can restrict our attention to the case . Inserting (6.5) into (6.4) and removing the constant term , we obtain the modified dual functional
[TABLE]
Moreover, combining (6.3) and (6.5), we obtain
[TABLE]
which also follows from complementary slackness since and restricts to the boundary of .
Theorem 6.1**.**
Suppose that , and . Then the modified dual problem
[TABLE]
has a unique solution . Moreover, there exists a unique , a unique and a (not necessarily unique) nonnegative singular measure with support
[TABLE]
such that
[TABLE]
and the measure
[TABLE]
is an optimal solution to the primal problem (1.9). Moreover,
[TABLE]
and can be chosen with support in at most points.
*If , the unique optimal solution is , and then . If , any solution to the moment problem will have only a singular part. Finally, if , then the problem (1.9) will have no solution. *
Proof 6.2**.**
We begin by showing that the functional has a minimum under the stated conditions. To this end, we first establish that the functional has compact sublevel sets , i.e., is bounded for all such that , where is sufficiently large for the sublevel set to be nonempty. The functional (6.6) can be decomposed in a linear and a logarithmic term as
[TABLE]
where . The integral term will tend to as . Therefore we need to have the linear term to tend to as , in which case we can appeal to the fact that linear growth is faster than logarithmic growth. However, if as is generally assumed, there is a such that , so we need to ensure that the positive term dominates.
Let . Then, by Theorem 2.5, there is a positive measure with a nonzero such that
[TABLE]
and satisfies the constraints in the primal problem (1.9). Consequently,
[TABLE]
for all and , which in particular implies that
[TABLE]
Now, if there is a such that , then as , which contradicts (6.13). Therefore, for all . Then, since is continuous, it has a minimum on the compact set . As , . Therefore,
[TABLE]
since [54, Lem. A.1]. Likewise,
[TABLE]
since . Hence
[TABLE]
Comparing linear and logarithmic growth we see that the sublevel set is bounded from above and below. Moreover, a trivial modification of [54, Lemma 3.1] shows that is lower semi-continuous, and hence is compact. Consequently, the problem (6.8) has an optimal solution .
Next we show that is unique. For this we return to the original dual problem to find a minimum of (6.4). The solution is a minimizer of , where
[TABLE]
and . To show that is strictly convex, we form the Hessian
[TABLE]
and the quadratic form
[TABLE]
which is positive for all nonzero , since and . Consequently, has a unique minimizer , where is the unique minimizer of .
It follows from (6.3) and (6.5) that
[TABLE]
which consequently is unique. Moreover, , and hence we can follow the same line of proof as in Theorem 3.1 to show that there is a unique such that and a positive discrete measure with support in points so that (6.9) and (6.10) hold. Next, let be the primal functional in (1.9), where is restricted to the set of positive measures such that , given by (1.5), satisfies the constraint . In view of (6.12),
[TABLE]
*for any such , and hence is an optimal solution to the primal problem (1.9). Finally, the cases , , and have already been discussed above. *
Corollary 6.3**.**
Suppose that and . The KKT conditions
[TABLE]
*are necessary and sufficient conditions for optimality of the dual pair (1.9) and (6.8) of optimization problems. *
The corollary follows by noting that, if , then we obtain the trivial solution , which corresponds to the primal optimal solution .
Proposition 6.4**.**
The condition
[TABLE]
*is sufficient for the pair (1.9) and (6.8) of dual problems to have optimal solutions. *
Proof 6.5**.**
*If , then with equality only for . Hence, if , , i.e., for all except . Then we proceed as in the proof of Theorem 6.1. *
Remark 6.6**.**
Condition (6.17) guarantees that and hence in particular that as required in Theorem 6.1. To see this, note that and that satisfies the hard constraint in (1.9) if . However, since , there is a such that . Then the well-known Matrix Inversion Lemma (see, e.g., [42, p. 746]) yields
[TABLE]
and therefore
[TABLE]
*which establishes that . However, for to be nonempty, need not be contained in this set. Hence, condition (6.17) is not necessary, although it is easily testable. In fact, this provides an alternative proof of Proposition 6.4. *
7 On the equivalence between the two problems
Clearly is always nonempty if . Then both the problem (1.7) with soft constraints and the problem (1.9) with hard constraints have a solution for any choice of W. On the other hand, if , the problem with soft constraints will always have a solution, while the problem with hard constraints may fail to have one for certain choices of . However, if the weight matrix in the hard-constrained problem – let us denote it – is chosen in the set , then it can be seen from Corollaries 3.3 and 6.3 that we obtain exactly the same solution in the soft-constrained problem by choosing
[TABLE]
We note that (7.1) can be written , where . Therefore, substituting in (7.1), we obtain
[TABLE]
which yields . Hence the inverse of (7.1) is given by
[TABLE]
By Theorem 4.1 is continuous in , and hence, by (7.2), the corresponding varies continuously with . In fact, this can be strengthened to a homeomorphism between the two weight matrices.
Theorem 7.1**.**
*The map (7.1) is a homeomorphism between and the space of all (Hermitian positive definite) weight matrices, and the inverse is given by (7.2). *
Proof 7.2**.**
By [11, Lemma 2.3], a continuous map between two spaces of the same dimension is a homeomorphism if and only if it is injective and proper, i.e., the preimage of any compact set is compact. To see that is open, we observe that is continuous in and that is an open set. As noted above, the map (7.2) – let us call it – is continuous and also injective, as it can be inverted. Hence it only remains to show that is proper. To this end, we take a compact set and show that is also compact. There are two ways this could fail. First, the preimage could contain a singular semidefinite matrix. However this is impossible by (7.2), since is bounded for (Lemma A.3) and a nonzero scaling of a singular matrix cannot be nonsingular. Secondly, could tend to infinity. However, this is also impossible. To see this, we first show that there is a such that for all and all . To this end, we observe that the minimum of over all and satisfying the constraint is bounded by
[TABLE]
*by the triangle inequality . The minimum is attained, since is compact, and positive, since . Now, from Corollary 6.3 we see that if and only if . The map from is continuous in . In fact, is uniformly positive in a neighborhood of and hence the corresponding . Due to this continuity, if , then , which cannot happen since for all . Thus, since is bounded away from zero, the preimage of is bounded. Finally, consider a convergent sequence in converging to a limit . Since the sequence is bounded and cannot converge to a singular matrix, we must have , i.e., . By continuity, tends to the limit , which must belong to since it is compact. Hence the preimage must belong to . Therefore, is compact as claimed. *
It is illustrative to consider the simple case when . Then the two maps (7.1) and (7.2) become
[TABLE]
Whereas the range of is the semi-infinite interval , for the homeomorphism to hold is confined to
[TABLE]
where is the distance from to the cone and . When , and . If , then the coresponding problem has the trivial unique solution , corresponding to the primal solution .
Note that Theorem 7.1 implies that some continuity results in one of the problems can be automatically transferred to the other problem. In particular, we have the following result.
Theorem 7.3**.**
Let
[TABLE]
*Then the map is continuous. *
Proof 7.4**.**
*The theorem follows by noting that can be seen as a composition of two continuous maps, namely the one in Theorem 4.1 and the one in Theorem 7.1. *
Next we shall vary also and , and to this end we introduce a more explicit notation for and , namely in (6.1) and
[TABLE]
Then the corresponding set of parameters (4.1) for the problem with hard constraints is given by
[TABLE]
the interior of which is
[TABLE]
Theorem 7.1 can now be modified accordingly to yield the following theorem, the proof of which is deferred to the appendix.
Theorem 7.5**.**
Let the map be given by (7.1) and the map by (7.2). Then the map that sends to is a homeomorphism.
Note that this theorem is not a strict amplification of Theorem 7.1 as we have given up the possibility for to be on the boundary . The same is true for the following modification of Theorem 7.3.
Theorem 7.6**.**
*Let be as in (7.4). Then the map is continuous on . *
Proof 7.7**.**
*The theorem follows immediately by noting that can be seen as a composition of two continuous maps, namely of Theorem 7.5 and of Theorem 4.1. *
Theorem 7.6 is a counterpart of Theorem 4.1 for the problem with hard constraints, except that is restricted to the interior . It should be possible to extend the result to hold for all via a direct proof along the lines of the proof of Theorem 4.1.
8 Estimating covariances from data
For a scalar stationary stochastic process , it is well-known that the biased covariance estimate
[TABLE]
based on an observation record , yields a positive definite Toeplitz matrix, which is equivalent to [2, pp. 13-14] In fact, these estimates correspond to the ones obtained from the periodogram estimate of the spectrum (see, e.g., [59, Sec. 2.2]). On the other hand, the Toeplitz matrix of the unbiased estimate
[TABLE]
is in general not positive definite.
The same holds in higher dimensions () where the observation record is with
[TABLE]
The unbiased estimate is then given by
[TABLE]
and the biased estimate by
[TABLE]
where we define for . The sequence of unbiased covariance estimates does not in general belong to , but the biased covariance estimates yields also in the multidimensional setting. In fact, this can be seen by noting that the biased estimate corresponds to the Fourier coefficients of the periodogram [18, Sec. 6.5.1], i.e., if the estimates are given by (8.2), then
[TABLE]
where denotes the Minkowski set difference. This leads to the following lemma.
Lemma 8.1**.**
*Given the observed data , let be given by (8.2). Then . *
Proof 8.2**.**
Given , let , where be given by (8.2). In view of (2.1) and (8.3) we have
[TABLE]
*which is positive for all . Consequently . *
An advantage of the approximate procedures to the rational covariance extension problem is that they can also be used for cases where the biased estimate is not available, e.g., where the covariance is estimated from snapshots.
9 Application to spectral estimation
As long as we use the biased estimate (8.2), we may apply exact covariance matching as outlined in Section 2, whereas in general approximate covariance matching will be required for biased covariance estimates. However, as will be seen in the following example, approximate covariance matching may sometimes be better even if .
In this application it is easy to determine a bound on the acceptable error in the covariance matching, so we use the procedure with hard constraints. Given data generated from a two-dimensional stochastic system, we test three different procedures, namely (i) using the biased estimate and exact matching, (ii) using the biased estimate and the approximate matching (1.9), and (iii) using the unbiased estimate and the approximate matching (1.9). The procedures are then evaluated by checking the size of the error between the matched covariances and the true ones from the dynamical system.
9.1 An example
Let be the steady-state output of a two-dimensional recursive filter driven by a white noise input . Let the transfer function of the recursive filter be
[TABLE]
where and the coefficients are given by and , where
[TABLE]
The spectral density of , which is shown in Fig. 1 and is similar to the one considered in [54], is given by
[TABLE]
and hence the index set of the coefficients of the trigonometric polynomials and is given by . Using this example, we perform two different simulation studies.
9.2 First simulation study
The system was simulated for time steps along each dimension, starting from whenever either or . Then covariances were estimated from the last samples, using both the biased and the unbiased estimator. With this covariance data we investigate the three procedures (i), (ii) and (iii) described above. In each case, both the maximum entropy (ME) solutions and solutions with the true numerator are computed.444Maximum entropy: . True numerator: . The weighting matrix is taken to be , where is in procedure (ii) and in procedure (iii).555Note that this is the smallest for which the true covariance sequence belongs to the uncertainty set . The norm of the error666Here we use the norm of the covariance estimation error as measure of fit. However, note that this is not the only way to compare accuracy of the different methods. The reason for this choice is that comparing the accuracy of the spectral estimates is not straightforward since it depends on the selected metric or distortion measure. between the matched covariances and the true ones, , is shown in Table. 1. The means and standard deviations are computed over the runs.
The biased covariance estimates belong to the cone (Lemma 8.1), and therefore procedure (i) can be used. The corresponding error in Table 1 is the statistical error in estimating the covariance. This error is quite large because of a short data record. Using approximate covariance matching in this case seems to give a worse match. However, approximate matching of the unbiased covariances gives as good a fit as exact matching of the biased ones.
9.3 Second simulation study
In this simulation the setup is the same as the previous one, except that the simulation data has been discarded if the unbiased estimate belongs to . To obtain such data sets, simulations of the system were needed. (As a comparison, in the previous experiment out of the runs resulted in an unbiased estimate outside .) Again, the norm of the error between matched covariances and the true ones is shown in Table 2, and the means and standard deviations are computed over the runs.
As before, the biased covariance estimates belong to the cone , and therefore procedure (i) can be used. Comparing this with the results from procedure (ii) suggests that there may be an advantage not to enforce exact matching, although we know that the data belongs to the cone. Regarding procedure (iii), we know that the unbiased covariance estimates do not belong to the cone , hence we need to use approximate covariance matching. In this example, this procedure turns out to give the smallest estimation error.
10 Application to system identification and texture reconstruction
Next we apply the theory of this paper to texture generation via Wiener system identification. Wiener systems form a class of nonlinear dynamical systems consisting of a linear dynamic part composed with a static nonlinearity as illustrated in Figure 2. This is a subclass of so called block-oriented systems [4], and Wiener system identification is a well-researched area (see, e.g., [32] and references therein) that is still very active [43, 60, 1]. Here, we use Wiener systems to model and generate textures.
Using dynamical systems for modeling of images and textures is not new and has been considered in, e.g., [14, 50]. The setup presented here is motivated by [23], where thresholded Gaussian random fields are used to model porous materials for design of surface structures in pharmaceutical film coatings. Hence we let the static nonlinearity, call it , be a thresholding with unknown thresholding parameter . In our previous work [52] we applied exact covariance matching to such a problem. However, in general there is no guarantee that the estimated covariance sequence belongs to the cone . Consequently, here we shall use approximate covariance matching instead.
The Wiener system identification can be separated into two parts. We start by identifying the nonlinear part. Using the notations of Figure 2, let be a zero-mean Gaussian white noise input, and let be the stationary output of the linear system, which we assume to be normalized so that . Moreover, let where is the static nonlinearity
[TABLE]
with unknown thresholding parameter . Since , where is the Gaussian cumulative distribution function, an estimate of is given by .
Now, let be the covariances of , and let be the covariances of . As was explained in [52], by using results from [51] one can obtain a relation between and , given by
[TABLE]
This is an invertible map, which we compute numerically, and given we can thus get estimates of the covariances from estimates of the covariances . However, even if is is a biased estimate so that , may not be a bona fide covariance sequence.
10.1 Identifying the linear system
Solving (1.7) or (1.9) for a given sequence of covariance estimates , we obtain an estimate of the absolutely continuous part of the power spectrum of that process. In the case , can be factorized as
[TABLE]
which provides a transfer function of a corresponding linear system, which fed by a white noise input will produce an autoregressive-moving-average (ARMA) process with an output signal with precisely the power distribution in steady state. For , a spectral factorization of this kind is not possible in general [19], but instead there is always a factorization as a sum-of-several-squares [17, 28],
[TABLE]
the interpretation of which in terms of a dynamical system is unclear when . Therefore we resort to a heuristic and apply the factorization procedure in [29, Theorem 1.1.1] although some of the conditions required to ensure the existence of a spectral factor may not be met. (See [54, Section 7] for a more detailed discussion.)
10.2 Simulation results
The method, which is summarized in Algorithm 1, is tested on some textures from the Outex database [49] (available online at http://www.outex.oulu.fi/). These textures are color images and have thus been converted to binary textures by first converting them to black-and-white and then thresholding them.777The algorithm has been implemented and tested in Matlab, version R2015b. The textures have been normalized to account for light inhomogenities using a reference image available in the database. The conversion from color images to black-and-white images was done with the built-in function rgb2gray, and the threshold level was set to the mean value of the maximum and minimum pixel value in the black-and-white image. Three such textures are shown in Figure 3a through 3c.
In this example there is no natural bound on the error, so we use the problem with soft constraints, for which we choose the weight with for all data sets. Moreover, we do maximum-entropy reconstructions, i.e., we set the prior to . The optimization problems are then solved by first discretizing the grid , in this case in points (cf. [54, Theorem 2.6]), and solving the corresponding problems using the CVX toolbox [31, 30]. The reconstructions are shown in Figures 3d - 3f. Each reconstruction seems to provide a reasonable visual representation of the structure of the corresponding original. This is especially the case for the second texture.
11 Conclusions
In this work we extend the results of our previous paper [54] on the multidimensional rational covariance extension problem to allow for approximate covariance matching. We have provided two formulations to this problem, and we have shown that they are connected via a homeomorphism. In both formulations we have used weighted 2-norms to quantify the missmatch of the estimated covariances. However, we expect that by suitable modifications of the proofs similar results can be derived for other norms, since all norms have directional derivatives in each point [16, p. 49].
These results provide a procedure for multidimensional spectral estimation, but in order to obtain a complete theory for multidimensional system identification and realization theory there are still some open problems, such as spectral factorization and interpretations in terms of multidimensional stochastic systems, as briefly discussed in Section 10.1.
Appendix A Deferred proofs
Let denote the closed ball , where is either a set of vectors or a set of matrices depending on then context. The norm is the Euclidean norm for vectors and Frobenius norm for matrices.
Lemma A.1**.**
*Let be given by (4.1) and by (4.2). Furthermore, let . Then the map is continuous for . Moreover, for any compact , the corresponding set of optimal solutions is bounded. *
Proof A.2**.**
The proof follows along the lines of Lemma 7.2 and Proposition 7.4 in [36]. Let be arbitrary and let
[TABLE]
where is chosen so that , i.e., and for all . First we will show that the minimizer of is bounded for all . To this end, note that by optimality
[TABLE]
and hence is bounded from above on the compact set . Consequently, by using the same inequality as in the proof of [36, Lemma 7.1], we see that
[TABLE]
Due to norm equivalence between and , and since the quadratic term is dominating, the norm of is bounded in the set .
Now, let be compact. We want to show that is bounded on . Assume it is not. Then let be a sequence with . Since is compact there is a converging subsequence with . Since there is a such that . However, all but finitely many points belong to , and since is bounded for all , we cannot have .
Next, let and let be the unique minimizers of and , respectively. Choose a and note that is strictly positive and bounded. By optimality,
[TABLE]
for all . Hence, if we can show that, for any , there is an and a such that
[TABLE]
hold whenever , and , then this would imply that
[TABLE]
showing that the optimal value is continuous in . The lower bound is obtained by using (A.2a) and (A.1b), and the upper bound is obtained from (A.1a) and (A.2b). To prove (A.2a), we note that
[TABLE]
Next we observe that
[TABLE]
*by the KKT conditions (3.11) and the fact that , . Hence can be selected small enough for the second and fourth term in (A.3) each to be bounded by for any . Each of the remaining terms can now be bounded by by selecting sufficiently small. Hence (A.2a) follows. This also proves (A.2b). *
Lemma A.3**.**
*Let be given by (7.5) and by (6.6). Furthermore, let . Then for any compact , the corresponding set of optimal solutions is bounded. *
Proof A.4**.**
The proof follows closely the proof of the corresponding part of Lemma A.1. Let be arbitrary and let
[TABLE]
where is chosen so that . To see that the minimizer of is bounded for all , first note that by optimality
[TABLE]
and hence is bounded from above on the compact set .
Now let , as in the proof of Theorem 6.1. Following the same line of argument as in that proof, we see that for all and . Since is continuous in the arguments , it has a minimum on the compact set of tuples such that and hold. Thus the second half of inequality (6.14) still holds, i.e.,
[TABLE]
for all . This is true in particular for , thus
[TABLE]
*Since the linear growth dominates the logarithmic growth, the norm of is bounded on the set . The proof now follows verbatim from the argument in the second paragraph in the proof of Lemma A.1. *
Proof A.5** (Proof of Theorem 7.5).**
This is a modification of the proof of Theorem 7.1, again utilizing [11, Lemma 2.3], where we replace the map defined by and redefine it with the map . To show that is a homeomorphism we need to show that the map is proper. To this end, we take a compact set and show that is also compact. Again, there are two ways this could fail. First, the preimage could contain a singular semidefinite matrix. However this is impossible by (7.2), since is bounded for (Lemma A.3) and a nonzero scaling of a singular matrix cannot be nonsingular. Secondly, could tend to infinity. However, this is also impossible. To see this, we first show that there is a such that for all and all . Again, using the triangle inequality , we observe that the minimum of over all and satisfying the constraint is bounded by
[TABLE]
*The minimum is attained, as is compact, and positive, since . The remaining part of the proof now follows with minor modifications from the proof of Theorem 7.1 by noting that is bounded away from , and hence the preimage is bounded. Therefore the limit of a sequence in the preimage must belong to , and hence is compact as claimed. *
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] M. Abdalmoaty and H. Hjalmarsson , A simulated maximum likelihood method for estimation of stochastic Wiener systems , in IEEE 55th Conference on Decision and Control (CDC), IEEE, 2016, pp. 3060–3065.
- 2[2] N. Ahiezer and M. Krein , Some questions in the theory of moments , vol. 2 of Translations of mathematical monographs, American Mathematical Society, Providence, R.I., 1962.
- 3[3] E. Avventi , Spectral Moment Problems : Generalizations, Implementation and Tuning , Ph D thesis, 2011. Optimization and Systems Theory, Department of Mathematics, KTH Royal Institue of Technology.
- 4[4] S. Billings , Identification of nonlinear systems-a survey , in IEE Proceedings D-Control Theory and Applications, vol. 127, IET, 1980, pp. 272–285.
- 5[5] N. Bose , Multidimensional Systems Theory and Applications , Kluwer Academic Publishers, second ed., 2003.
- 6[6] C. Byrnes, P. Enqvist, and A. Lindquist , Identifiability and well-posedness of shaping-filter parameterizations: A global analysis approach , SIAM Journal on Control and Optimization, 41 (2002), pp. 23–59.
- 7[7] C. Byrnes, T. Georgiou, and A. Lindquist , A new approach to spectral estimation: a tunable high-resolution spectral estimator , IEEE Transactions on Signal Processing, 48 (2000), pp. 3189–3205.
- 8[8] , A generalized entropy criterion for Nevanlinna-Pick interpolation with degree constraint , IEEE Transactions on Automatic Control, 46 (2001), pp. 822–839.
