Maximum vanishing subspace problem, CAT(0)-space relaxation, and block-triangularization of partitioned matrix

Masaki Hamada; Hiroshi Hirai

arXiv:1705.02060·math.OC·September 12, 2017


TL;DR

This paper combines submodular optimization on modular lattices with convex optimization on CAT(0)-spaces to solve the weighted maximum vanishing subspace problem in pseudo-polynomial time, with applications to block-triangularization of partitioned matrices.

Contribution

It develops a pseudo-polynomial time algorithm for the weighted maximum vanishing subspace problem using CAT(0)-space relaxation, a novel combination of optimization methods.

Findings

01

The weighted MVSP can be solved in time polynomial in the matrix size and the maximum weight, provided arithmetic operations on the field take constant time.

02

The approach leverages CAT(0)-space convex optimization techniques.

03

Implications for canonical block-triangular form of matrices are demonstrated.




An extended abstract of this paper appears in the proceedings of the 10th Japanese-Hungarian Symposium on Discrete Mathematics and Its Applications.

Masaki HAMADA and Hiroshi HIRAI

Department of Mathematical Informatics,

Graduate School of Information Science and Technology,

The University of Tokyo, Tokyo, 113-8656, Japan.


Abstract

In this paper, we address the following algebraic generalization of the bipartite stable set problem. We are given a block-structured matrix (partitioned matrix) $A=(A_{\alpha\beta})$ , where $A_{\alpha\beta}$ is an $m_{\alpha}$ by $n_{\beta}$ matrix over field ${\bf F}$ for $\alpha=1,2,\ldots,\mu$ and $\beta=1,2,\ldots,\nu$ . The maximum vanishing subspace problem (MVSP) is to maximize $\sum_{\alpha}\dim X_{\alpha}+\sum_{\beta}\dim Y_{\beta}$ over vector subspaces $X_{\alpha}\subseteq{\bf F}^{m_{\alpha}}$ for $\alpha=1,2,\ldots,\mu$ and $Y_{\beta}\subseteq{\bf F}^{n_{\beta}}$ for $\beta=1,2,\ldots,\nu$ such that each $A_{\alpha\beta}$ vanishes on $X_{\alpha}\times Y_{\beta}$ when $A_{\alpha\beta}$ is viewed as a bilinear form ${\bf F}^{m_{\alpha}}\times{\bf F}^{n_{\beta}}\to{\bf F}$ . This problem arises from a study of a canonical block-triangular form of $A$ by Ito, Iwata, and Murota (1994), and is closely related to the noncommutative rank of a matrix with indeterminates.

We prove that a weighted version (WMVSP) of MVSP can be solved in pseudo-polynomial time, provided arithmetic operations on ${\bf F}$ can be done in constant time. Our proof is a novel combination of submodular optimization on modular lattices and convex optimization on CAT(0)-spaces. We present implications of this result for block-triangularization of partitioned matrices.

Keywords: CAT(0)-space, proximal point algorithm, Dulmage-Mendelsohn decomposition, partitioned matrix, submodular function, modular lattice.

1 Introduction

The maximum stable set problem in bipartite graphs is one of the fundamental and well-solved combinatorial optimization problems. We address in this paper the following algebraic generalization of the bipartite stable set problem. We are given a matrix $A$ partitioned into submatrices as

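$$A=\left(\begin{array}{cccc}A_{11}&A_{12}&\cdots&A_{1\nu}\\ A_{21}&A_{22}&\cdots&A_{2\nu}\\ \vdots&\vdots&\ddots&\vdots\\ A_{\mu 1}&A_{\mu 2}&\cdots&A_{\mu\nu}\end{array}\right),$$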

where $A_{\alpha\beta}$ is an $m_{\alpha}\times n_{\beta}$ matrix over field ${\bf F}$ for $\alpha=1,2,\ldots,\mu$ and $\beta=1,2,\ldots,\nu$ . Such a matrix is called a partitioned matrix of type $(m_{1},m_{2},\ldots,m_{\mu};n_{1},n_{2},\ldots,n_{\nu})$ . The maximum vanishing subspace problem (MVSP) is to maximize

$$\sum_{\alpha=1}^{\mu}\dim X_{\alpha}+\sum_{\beta=1}^{\nu}\dim Y_{\beta} \tag{1.1}$$

over vector subspaces $X_{\alpha}\subseteq{\bf F}^{m_{\alpha}}$ for $\alpha=1,2,\ldots,\mu$ and $Y_{\beta}\subseteq{\bf F}^{n_{\beta}}$ for $\beta=1,2,\ldots,\nu$ satisfying

$$A_{\alpha\beta}(X_{\alpha},Y_{\beta})=\{0\}\quad(1\leq\alpha\leq\mu,\ 1\leq\beta\leq\nu), \tag{1.2}$$

where each submatrix $A_{\alpha\beta}$ is regarded as a bilinear form ${\bf F}^{m_{\alpha}}\times{\bf F}^{n_{\beta}}\to{\bf F}$ by

$$(u,v)\mapsto u^{\top}A_{\alpha\beta}v.$$

A tuple $(X_{1},X_{2},\ldots,X_{\mu},Y_{1},Y_{2},\ldots,Y_{\nu})$ satisfying (1.2) is called a vanishing subspace, and is called maximum if it has the maximum dimension, where the dimension is defined as (1.1).
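Condition (1.2) only needs to be checked on basis vectors, since each block is a bilinear form. The following is a minimal sketch of such a check, not part of the paper: the function names, the small instance, and the use of exact rational arithmetic are assumptions made for illustration, and each listed basis is assumed to be linearly independent.

```python
from fractions import Fraction

def bilinear(u, A, v):
    """Evaluate u^T A v for a block A given as a list of rows."""
    return sum(Fraction(u[i]) * A[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

def is_vanishing(blocks, X_bases, Y_bases):
    """Check condition (1.2) on basis vectors; bilinearity extends it to the spans."""
    for a, X in enumerate(X_bases):
        for b, Y in enumerate(Y_bases):
            if any(bilinear(u, blocks[a][b], v) != 0 for u in X for v in Y):
                return False
    return True

def dimension(X_bases, Y_bases):
    """Objective (1.1), assuming every listed basis is linearly independent."""
    return sum(len(X) for X in X_bases) + sum(len(Y) for Y in Y_bases)

# Hypothetical instance with mu = 1, nu = 2 and block sizes (2; 2, 1).
blocks = [[[[1, 0], [0, 0]],      # A_11 (2 x 2)
           [[0], [1]]]]           # A_12 (2 x 1)
X_bases = [[(0, 1)]]              # X_1 = span{e_2}
Y_bases = [[(1, 0), (0, 1)], []]  # Y_1 = F^2, Y_2 = {0}

print(is_vanishing(blocks, X_bases, Y_bases), dimension(X_bases, Y_bases))
```

Enlarging $Y_{2}$ to the whole of ${\bf F}^{1}$ in this instance breaks the condition, because $e_{2}^{\top}A_{12}\neq 0$.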

MVSP generalizes the maximum stable set problem on bipartite graphs. Indeed, consider the case $m_{\alpha}=n_{\beta}=1$ for each $\alpha,\beta$, so that each submatrix is a scalar. Then each vector subspace is $\{0\}$ or ${\bf F}$, and its dimension is $0$ or $1$. The condition (1.2) says that one of $X_{\alpha}$ and $Y_{\beta}$ is $\{0\}$ if $A_{\alpha\beta}$ is a nonzero scalar. Thus MVSP is the maximum stable set problem on a bipartite graph on vertices $a_{1},a_{2},\ldots,a_{\mu},b_{1},b_{2},\ldots,b_{\nu}$ such that edge $a_{\alpha}b_{\beta}$ is present if and only if $A_{\alpha\beta}$ is a nonzero scalar.
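The scalar-block case can be solved directly. The sketch below is an illustration rather than the paper's algorithm: it relies on König's theorem, which states that a maximum stable set in a bipartite graph has size equal to the number of vertices minus the size of a maximum matching, and it uses a simple augmenting-path matching routine. The function names and the sample matrix are assumptions.

```python
def max_matching(adj, n_right):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm)."""
    match_right = [-1] * n_right  # match_right[b] = left vertex matched to b, or -1

    def augment(a, seen):
        for b in adj[a]:
            if b not in seen:
                seen.add(b)
                if match_right[b] == -1 or augment(match_right[b], seen):
                    match_right[b] = a
                    return True
        return False

    return sum(augment(a, set()) for a in range(len(adj)))

def mvsp_scalar_blocks(A):
    """Optimal MVSP value when every block A[a][b] is a scalar."""
    mu, nu = len(A), len(A[0])
    # Edge a_alpha b_beta exists exactly when the scalar block is nonzero.
    adj = [[b for b in range(nu) if A[a][b] != 0] for a in range(mu)]
    return mu + nu - max_matching(adj, nu)

A = [[1, 1, 0],
     [0, 1, 0]]
print(mvsp_scalar_blocks(A))
```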

A linear algebraic interpretation of MVSP is explained as follows. Consider a transformation of $A$ of the form

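$$\left(\begin{array}{cccc}E_{1}^{\top}&O&\cdots&O\\ O&E_{2}^{\top}&\ddots&\vdots\\ \vdots&\ddots&\ddots&O\\ O&\cdots&O&E_{\mu}^{\top}\end{array}\right)\left(\begin{array}{cccc}A_{11}&A_{12}&\cdots&A_{1\nu}\\ A_{21}&A_{22}&\cdots&A_{2\nu}\\ \vdots&\vdots&\ddots&\vdots\\ A_{\mu 1}&A_{\mu 2}&\cdots&A_{\mu\nu}\end{array}\right)\left(\begin{array}{cccc}F_{1}&O&\cdots&O\\ O&F_{2}&\ddots&\vdots\\ \vdots&\ddots&\ddots&O\\ O&\cdots&O&F_{\nu}\end{array}\right), \tag{1.3}$$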

where $E_{\alpha}$ is a nonsingular $m_{\alpha}\times m_{\alpha}$ matrix for $\alpha=1,2,\ldots,\mu$ and $F_{\beta}$ is a nonsingular $n_{\beta}\times n_{\beta}$ matrix for $\beta=1,2,\ldots,\nu$ . If the resulting matrix contains a zero submatrix of $c$ rows and $d$ columns, then from $E_{\alpha}$ and $F_{\beta}$ we obtain a vanishing subspace of dimension $c+d$ . Conversely, from a vanishing subspace of dimension $b$ , we can find a transformation of form (1.3) such that the resulting matrix contains a zero submatrix of $c$ rows and $d$ columns with $c+d=b$ . Thus MVSP is nothing but the problem of finding a transformation (1.3) of $A$ such that the resulting matrix has the largest zero submatrix.

Ito, Iwata, and Murota [27] studied a canonical block-triangular form under transformation (1.3), which generalizes the classical Dulmage-Mendelsohn decomposition [11]; see also [34]. They formulated a problem equivalent to MVSP, though MVSP itself was explicitly introduced in a recent paper [21]. For several basic special cases [11, 21, 36], MVSP can be solved in polynomial time via Gaussian elimination, bipartite matching, and matroid intersection algorithms, and a canonical block-triangular form is also obtained accordingly. These works lie at the crossroads of numerical computation and combinatorial optimization. Ito, Iwata, and Murota [27, p.1252] raised the open problem of solving (a problem equivalent to) MVSP and obtaining a canonical block-triangular form in polynomial time.

The contribution of this paper is about this problem. We consider a natural weighted generalization of MVSP. We are further given nonnegative weights $C_{\alpha},D_{\beta}$ for $1\leq\alpha\leq\mu$ and $1\leq\beta\leq\nu$ . The weighted maximum vanishing subspace problem (WMVSP) asks to maximize

$$\sum_{\alpha}C_{\alpha}\dim X_{\alpha}+\sum_{\beta}D_{\beta}\dim Y_{\beta}$$

over all vanishing subspaces $(X_{1},X_{2},\ldots,X_{\mu},Y_{1},Y_{2},\ldots,Y_{\nu})$ . Let $m:=\sum_{\alpha}m_{\alpha}$ and $n:=\sum_{\beta}n_{\beta}$ , i.e., $A=(A_{\alpha\beta})$ is an $m\times n$ matrix. The main result is the pseudo-polynomial time solvability of WMVSP:

Theorem 1.1.

Suppose that arithmetic operations on ${\bf F}$ can be done in constant time. WMVSP can be solved in time polynomial in $m,n$ and $W$ , where $W$ is the maximum of weights $C_{\alpha},D_{\beta}$ .

The algorithm in this theorem is applicable to the case where ${\bf F}$ is a finite field (with $\log|{\bf F}|$ fixed). However, if ${\bf F}$ is the rational field ${\bf Q}$, then the required bit length is not bounded, even though our algorithm requires only a polynomial number of arithmetic operations.

The significance, implications, and novel proof techniques of this result are explained below.

Submodular optimization on modular lattice.

MVSP (or WMVSP) can be viewed as a submodular function minimization (SFM) on the lattice of all vector subspaces of a vector space. Such a lattice is a typical instance of a modular lattice. Submodular optimization on modular lattices is a challenging field in combinatorial optimization. Kuivinen [30, 31] proved a good characterization of SFM on the product ${\cal L}^{n}$ of a modular lattice ${\cal L}$, where ${\cal L}$ is finite and is a part of the input. In this setting, Fujishige, Király, Makino, Takazawa, and Tanigawa [15] proved the polynomial time solvability of SFM on ${\cal L}^{n}$ where ${\cal L}$ is a modular lattice of rank $2$. In the valued-CSP setting where a submodular function is given as a sum of submodular functions with a fixed number of variables, the tractability criterion of Kolmogorov, Thapper, and Živný [29] implies that SFM on ${\cal L}^{n}$ is solvable in polynomial time. In contrast with these results, our SFM is defined on an infinite modular lattice governed by linear algebraic machinery. Our result is a step toward understanding this type of discrete optimization problem over a lattice of vector subspaces.

Beyond Euclidean convexity: Outline of the proof.

No reasonable LP/convex relaxation (allowing infiniteness) is known for MVSP and WMVSP. This is a main reason for the difficulty. Beyond Euclidean convexity, our proof employs a method of non-Euclidean convex optimization, more specifically, convex optimization on CAT(0)-spaces. Here a CAT(0)-space is a nonpositively curved metric space enjoying various fascinating properties analogous to those of the Euclidean space; see [8]. One of the important features of a CAT(0)-space is the unique geodesic property: every pair of points can be joined by a unique geodesic. Through the unique geodesics, several convexity concepts (e.g., convex functions) are naturally introduced. Computational and algorithmic theory on CAT(0)-spaces is another challenging research field; see e.g., [2, 4, 38]. Our proof exploits the convexity of CAT(0)-spaces to obtain the polynomial time complexity in discrete optimization.

As is well-known, a (usual) submodular function on Boolean lattice $\{0,1\}^{n}$ is extended to a convex function on hypercube $[0,1]^{n}$ in the Euclidean space, via Lovász extension [32]. This fact enables us to apply Euclidean convex optimization methods (e.g., the ellipsoid method) to various problems related to submodular functions. Analogous to the embedding $\{0,1\}^{n}\hookrightarrow[0,1]^{n}$ , a modular lattice ${\cal L}$ is embedded into a suitable continuous metric space $K({\cal L})$ , called the orthoscheme complex [7]. Figure 1 illustrates the orthoscheme complex of a modular lattice of rank 2, which is obtained by gluing Euclidean triangles along one common side.
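To make the Euclidean template concrete, the following sketch evaluates the Lovász extension by the usual sorting formula: order the coordinates decreasingly and accumulate marginal values of $f$ along the resulting chain of sets. It is illustrative only; the function names and the sample submodular function (the cut function of a three-vertex path) are assumptions, not objects from the paper.

```python
def lovasz_extension(f, x):
    """Evaluate the Lovász extension of a set function f at x in [0,1]^n."""
    order = sorted(range(len(x)), key=lambda i: -x[i])  # coordinates in decreasing order
    value = f(frozenset())
    prefix = set()
    for i in order:
        before = f(frozenset(prefix))
        prefix.add(i)
        # Weight the marginal gain of adding i by its coordinate x[i].
        value += x[i] * (f(frozenset(prefix)) - before)
    return value

def path_cut(S):
    """Cut function of the path 0 - 1 - 2, a standard submodular function."""
    return sum((u in S) != (v in S) for u, v in [(0, 1), (1, 2)])

# On 0/1 vectors the extension agrees with f itself.
print(lovasz_extension(path_cut, (1, 0, 1)), path_cut({0, 2}))
```

On indicator vectors the extension reproduces $f$, and between them it interpolates along the simplices of the hypercube; this simplex structure is what the orthoscheme complex imitates for a modular lattice.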

It is shown in [9, 20] that $K({\cal L})$ is a CAT(0)-space. In this setting, a submodular function is extended to a convex function on $K({\cal L})$ [22]. Consequently, our problem WMVSP becomes a convex optimization problem over a CAT(0)-space.

We solve this continuous optimization problem by using a CAT(0)-space version of the proximal point algorithm (PPA). The Euclidean PPA is a well-known simple iterative algorithm to minimize a convex function $f$, which computes the proximal point operator $J_{\lambda}^{f}(z)$ of the current point $z$, updates $z\leftarrow J_{\lambda}^{f}(z)$, and repeats. The PPA is naturally defined on a CAT(0)-space. Bačák [4] showed that the sequence $(z_{\ell})$ generated by PPA converges to a minimizer of $f$; see also [6]. We apply a version of PPA to our CAT(0)-space relaxation of WMVSP. By using a recent result of Ohta and Pálfia [37] on the rate of convergence, we show that after a polynomial number of iterations, a maximum vanishing subspace is obtained from the current point $z_{\ell}$. We prove that the proximal operator in each step can be computed in polynomial time. This is the most technical but intriguing part of the proof.

Block-triangularization of partitioned matrix.

Let us return to the original motivation of MVSP. A maximal chain of the maximum vanishing subspaces provides, via an appropriate change of bases, the most refined block-triangularization under transformation (1.3), which we call the DM-decomposition [21, 27]. Solving MVSP is not enough to obtain the DM-decomposition. We here introduce a reasonably coarse block-triangularization, which we call a quasi DM-decomposition. A quasi DM-decomposition still generalizes known important special cases, such as CCF for mixed matrices [36]. We show that a quasi DM-decomposition can be obtained in polynomial time by solving WMVSP with varying weights. We believe that obtaining a quasi DM-decomposition is the limit of what can be achieved by combinatorial or optimization methods. The step from a quasi DM-decomposition to the DM-decomposition seems to be a matter of numerical analysis/computation, and includes the common invariant subspace problem, which is an extremely difficult numerical computational problem (see e.g., [3, 25]).

Relation to Edmonds’ problem and the recent development.

After finishing the first version of this paper, we found that our result is closely related to the recent remarkable development [10, 16, 17, 23, 24] on Edmonds’ problem. We briefly explain this relation. Edmonds’ problem [12] asks, given a vector space ${\cal A}$ of $m\times n$ matrices, to determine the maximum rank over matrices in ${\cal A}$. Suppose that the matrix space ${\cal A}$ is given by a basis $A_{1},A_{2},\ldots,A_{N}$. Then the problem is to determine

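$$\mathop{\rm rank}{\cal A}:=\max\left\{\mathop{\rm rank}\sum_{i=1}^{N}z_{i}A_{i}\mid z_{i}\in{\bf F}\ (i=1,2,\ldots,N)\right\}. \tag{1.4}$$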

As noted in [33], there is a weak duality. For a nonnegative integer $c$, a $c$-shrunk subspace is a vector subspace $X\subseteq{\bf F}^{m}$ with

$$\dim X-\dim\sum_{i=1}^{N}A_{i}(X)\geq c,$$

where $A_{i}$ is viewed as ${\bf F}^{m}\to({\bf F^{*}})^{n}$ by $x\mapsto x^{\top}A_{i}$ for dual ${\bf F}^{*}$ of ${\bf F}$ . For a $c$ -shrunk subspace $X$ , via a basis transformation (including bases of $X$ and $\sum_{i=1}^{N}A_{i}(X)$ ), all $A_{i}$ (i.e., all matrices in ${\cal A}$ ) are transformed to have the zero block of size $n+c$ in the same position. Consequently, $\mathop{\rm rank}{\cal A}$ is bounded by $m-c$ . The relation between shrunk subspaces and vanishing subspaces is explained as follows, where a vanishing subspace $(X,Y)$ in this setting is meant as a pair $(X,Y)$ of subspaces in ${\bf F}^{m}$ and ${\bf F}^{n}$ with $A_{i}(X,Y)=\{0\}$ for all $i$ . For a $c$ -shrunk subspace $X$ , we obtain a vanishing subspace $(X,Y)$ with $\dim X+\dim Y\geq n+c$ for $Y:=(\sum_{i=1}^{N}A_{i}(X))^{\bot}=\bigcap_{i=1}^{N}(A_{i}(X))^{\bot}$ , where $(\cdot)^{\bot}$ means the orthogonal complement. Conversely, $X$ for a vanishing subspace $(X,Y)$ is a $(\dim X+\dim Y-n)$ -shrunk subspace. Summarizing, we obtain the following weak duality relation:

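$$\mathop{\rm rank}{\cal A}\leq\min\{m-c\mid\mbox{there is a $c$-shrunk subspace}\} \tag{1.5}$$

$$\phantom{\mathop{\rm rank}{\cal A}}=m+n-\max\{\dim X+\dim Y\mid\mbox{$(X,Y)$ is a vanishing subspace}\}. \tag{1.6}$$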

Now MVSP is nothing but the problem on the right hand side in (1.6) for the case where $A_{i}$ is defined as the $m\times n$ matrix such that the $(\alpha,\beta)$ -th block is $A_{\alpha\beta}$ and other blocks are zero.

The inequality in (1.5) is strict in general. In 2004, Gurvits [19] considered the decision version of Edmonds’ problem, and introduced the Edmonds-Rado property for a square matrix space ${\cal A}$, which is the property that ${\cal A}$ contains a nonsingular matrix if and only if there is no $1$-shrunk subspace (or equivalently if there is no vanishing subspace $(X,Y)$ with $\dim X+\dim Y>n$). This is the equality case of (1.5). He developed a polynomial time algorithm to solve the decision version of Edmonds’ problem in the case of ${\bf F}={\bf Q}$.
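A standard example of this strictness, given here as an aside and not taken from the paper, is the space of $3\times 3$ skew-symmetric matrices ($m=n=3$, $N=3$). Every matrix in the space is singular, yet no $1$-shrunk subspace exists:

```latex
A_1=\begin{pmatrix}0&1&0\\-1&0&0\\0&0&0\end{pmatrix},\quad
A_2=\begin{pmatrix}0&0&1\\0&0&0\\-1&0&0\end{pmatrix},\quad
A_3=\begin{pmatrix}0&0&0\\0&0&1\\0&-1&0\end{pmatrix}.
% Commutative rank: expanding the determinant gives
\det(z_1A_1+z_2A_2+z_3A_3)
  =\det\begin{pmatrix}0&z_1&z_2\\-z_1&0&z_3\\-z_2&-z_3&0\end{pmatrix}
  =-z_1z_2z_3+z_1z_2z_3=0,
\qquad\text{so } \mathop{\rm rank}{\cal A}=2 .
% Shrunk subspaces: for X=\mathrm{span}\{x\} with x\neq 0, the vectors x^{\top}A_i span
% the 2-dimensional space \{y : y^{\top}x=0\}; for \dim X\geq 2 they span all of {\bf F}^3.
\dim X-\dim\sum_{i=1}^{3}A_i(X)=
\begin{cases}0 & (\dim X\in\{0,3\}),\\ -1 & (\dim X\in\{1,2\}),\end{cases}
\qquad\text{so the largest feasible $c$ is } 0 \text{ and the bound } m-c \text{ equals } 3>2 .
```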

Since 2015, there has been a significant development on this problem; the (long) introduction of [16] gives an exciting account of this development. Ivanyos, Qiao, and Subrahmanyam [23] noticed that the right hand side of (1.5) coincides with the non-commutative rank [13] of the linear form $\sum_{i}z_{i}A_{i}$ over the free skew field generated by non-commutative indeterminates $z_{i}$, whereas (1.4) is equal to the rank of $\sum_{i}z_{i}A_{i}$ over the rational function field ${\bf F}(z_{1},z_{2},\ldots,z_{N})$ for commutative indeterminates $z_{i}$. They showed that the non-commutative rank (nc-rank) of ${\cal A}$ is also given by

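$$\max\left\{\frac{1}{d}\mathop{\rm rank}\sum_{i=1}^{N}Z_{i}\otimes A_{i}\mid d\geq 1,\ \mbox{$d\times d$ matrix $Z_{i}$ over ${\bf F}$ for $i=1,2,\ldots,N$}\right\}, \tag{1.7}$$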

where $\otimes$ is the Kronecker product. Namely (1.7) is viewed as the primal problem of the right hand side of (1.5), for which strong duality holds. They also presented the first deterministic algorithm to compute the nc-rank. Garg, Gurvits, Oliveira, and Wigderson [16] showed for ${\bf F}={\bf Q}$ that Gurvits’ algorithm in [19] can be used to compute the nc-rank in polynomial time, where a substitution $Z_{i}$ and shrunk subspace $X$ attaining the nc-rank are not obtained by this algorithm. Derksen and Makam [10] gave a polynomial bound on $d$ in (1.7) via invariant theoretic arguments. By using this result, Ivanyos, Qiao, and Subrahmanyam [24] finally proved that a substitution $Z_{i}$ and shrunk subspace $X$ attaining the nc-rank can be computed in polynomial time. In particular, by using their algorithm, MVSP can be solved in polynomial time for any field ${\bf F}$. Garg, Gurvits, Oliveira, and Wigderson [17] used Gurvits’ algorithm to solve the feasibility problem on the Brascamp-Lieb inequalities in pseudo-polynomial time. This problem is essentially WMVSP with a single row block ($\mu=1$) and ${\bf F}={\bf Q}$.
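The blow-up formula (1.7) can be tried by hand on a small case. The sketch below is illustrative only and is not one of the algorithms cited above: the helper names `rank`, `kron`, and `blow_up`, the choice of the $3\times 3$ skew-symmetric basis, and the particular substitutions are assumptions, and exact arithmetic over ${\bf Q}$ stands in for a general field. With $d=1$ the rank is $2$, while the $d=2$ substitution $Z_{1}=I$, $Z_{2}=E_{12}$, $Z_{3}=E_{21}$ gives rank $6$, i.e. the value $6/2=3$ in (1.7).

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix over Q by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def kron(Z, A):
    """Kronecker product of Z and A, both given as lists of rows."""
    return [[z * a for z in zrow for a in arow] for zrow in Z for arow in A]

def blow_up(Zs, As):
    """The matrix sum_i Z_i (x) A_i appearing in (1.7)."""
    terms = [kron(Z, A) for Z, A in zip(Zs, As)]
    return [[sum(t[i][j] for t in terms) for j in range(len(terms[0][0]))]
            for i in range(len(terms[0]))]

# Basis of the 3x3 skew-symmetric matrices (an assumed test instance).
As = [[[0, 1, 0], [-1, 0, 0], [0, 0, 0]],
      [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
      [[0, 0, 0], [0, 0, 1], [0, -1, 0]]]

Z1 = [[[1]], [[1]], [[1]]]                                      # d = 1: scalars
Z2 = [[[1, 0], [0, 1]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]]     # d = 2: I, E12, E21

print(rank(blow_up(Z1, As)), Fraction(rank(blow_up(Z2, As)), 2))
```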

Gurvits’ algorithm is analytic, and is based on matrix scaling inspired by quantum information theory. The algorithm by Ivanyos, Qiao, and Subrahmanyam is based on the second Wong sequence for matrix spaces, which is a vector space analogue of an augmenting path in bipartite matching. Our algorithm for WMVSP is built on submodularity on modular lattices and continuous optimization on CAT(0)-spaces, and is completely different from both algorithms. Apart from the drawback concerning bit length, the conceptual simplicity and flexibility of our algorithm are notable. Indeed, our algorithm is easily adapted to compute the nc-rank in the same complexity (Remark 3.13). We believe that the approach presented in this paper will add a new perspective to this exciting development on Edmonds’ problem.

Organization.

The rest of this paper is organized as follows. In Section 2, we summarize convex optimization on CAT(0)-space, submodular function, modular lattice, orthoscheme complex, and their interplay. In Section 3, we first reduce WMVSP to a convex optimization on a CAT(0)-space, and apply PPA to prove Theorem 1.1. In Section 4, we explain implications on block-triangularization of partitioned matrix.

2 Preliminaries

2.1 Convex optimization on CAT(0)-space

2.1.1 CAT(0)-space

Let $S$ be a metric space, and let $d:S\times S\to{\bf R}_{+}$ denote the distance function of $S$. Let $\mathop{\rm diam}S:=\sup_{x,y\in S}d(x,y)$ denote the diameter of $S$. A path in $S$ is a continuous map $\gamma$ from $[0,1]$ to $S$. The length of a path $\gamma$ is defined as $\sup\sum_{i=0}^{n-1}d(\gamma(t_{i}),\gamma(t_{i+1}))$ over $0=t_{0}<t_{1}<t_{2}<\cdots<t_{n}=1$ and $n>0$. We say that a path $\gamma$ connects $x,y\in S$ if $\gamma(0)=x$ and $\gamma(1)=y$. A geodesic is a path $\gamma$ satisfying $d(\gamma(s),\gamma(t))=d(\gamma(0),\gamma(1))|s-t|$ for every $s,t\in[0,1]$. A metric space $S$ is called a geodesic metric space if any two points in $S$ are connected by a geodesic, and is said to be uniquely geodesic if any two points in $S$ are connected by a unique geodesic. For points $x,y$ in $S$, let $[x,y]$ denote the image of a geodesic $\gamma$ connecting $x,y$ (such a geodesic need not be unique). For $t\in[0,1]$, the point $p$ on $[x,y]$ with $d(x,p)/d(x,y)=t$ is formally written as $(1-t)x+ty$. A geodesic triangle of $x,y,z\in S$ is the union $[x,y]\cup[y,z]\cup[z,x]$. For such a triangle, there exist points $\bar{x},\bar{y},\bar{z}$ in the Euclidean plane ${\bf R}^{2}$ such that $d(x,y)=\|\bar{x}-\bar{y}\|_{2}$, $d(y,z)=\|\bar{y}-\bar{z}\|_{2}$, and $d(z,x)=\|\bar{z}-\bar{x}\|_{2}$. For $p\in[x,y]$, the comparison point of $p$ is the unique point $\bar{p}$ in $[\bar{x},\bar{y}]$ with $d(x,p)=\|\bar{x}-\bar{p}\|_{2}$; comparison points on the other two sides are defined analogously. A geodesic metric space is called CAT(0) if for every geodesic triangle $\Delta=[x,y]\cup[y,z]\cup[z,x]$ and every $p,q\in\Delta$, it holds that $d(p,q)\leq\|\bar{p}-\bar{q}\|_{2}$. Intuitively, this definition says that any triangle in $S$ is at least as thin as the corresponding triangle in the Euclidean plane. See Figure 2.

Proposition 2.1 ([8, Proposition 1.4]).

A CAT $(0)$ -space is uniquely geodesic.

2.1.2 Convex function

Let $S$ be a CAT(0)-space. A function $f:S\to{\bf R}$ is said to be convex if it satisfies

$$f((1-t)x+ty)\leq(1-t)f(x)+tf(y)$$

for every $x,y\in S$ and $t\in[0,1]$. A function $f:S\to{\bf R}$ is said to be strongly convex with parameter $\kappa>0$ if it satisfies

$$f((1-t)x+ty)\leq(1-t)f(x)+tf(y)-\frac{\kappa}{2}t(1-t)d(x,y)^{2}$$

for every $x,y\in S$ and $t\in[0,1]$. A function $f:S\to{\bf R}$ is said to be $L$-Lipschitz with parameter $L\geq 0$ if it satisfies

$$|f(x)-f(y)|\leq L\,d(x,y)$$

for every $x,y\in S$.

Lemma 2.2.

For any $z\in S$ , the function $x\mapsto d(z,x)^{2}$ is strongly convex with $\kappa=2$ , and is $L$ -Lipschitz with $L\leq 3\mathop{\rm diam}S$ .

The former follows directly from the definition of CAT(0)-space. The latter follows from $d(z,x)^{2}-d(z,y)^{2}\leq(d(z,y)+d(x,y))^{2}-d(z,y)^{2}=(2d(z,y)+d(x,y))d(x,y)\leq(3\mathop{\rm diam}S)d(x,y)$ .
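For the first claim, the step left implicit can be spelled out as follows. This is a sketch of the standard argument, written with the comparison-triangle notation of Section 2.1.1: take the comparison triangle of $(z,x,y)$ and let $p:=(1-t)x+ty$ with comparison point $\bar{p}=(1-t)\bar{x}+t\bar{y}$.

```latex
\begin{align*}
d(z,p)^{2}
 &\leq \|\bar{z}-\bar{p}\|_{2}^{2}
   && \text{(CAT(0) inequality for $z$ and $p\in[x,y]$)}\\
 &= (1-t)\|\bar{z}-\bar{x}\|_{2}^{2}+t\|\bar{z}-\bar{y}\|_{2}^{2}
    -t(1-t)\|\bar{x}-\bar{y}\|_{2}^{2}
   && \text{(identity in the Euclidean plane)}\\
 &= (1-t)\,d(z,x)^{2}+t\,d(z,y)^{2}-\frac{2}{2}\,t(1-t)\,d(x,y)^{2},
\end{align*}
% which is the strong convexity inequality for x \mapsto d(z,x)^2 with \kappa = 2.
```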

2.1.3 Proximal point algorithm

Let $S$ be a complete CAT(0)-space (which is also called an Hadamard space). For a convex function $f:S\to{\bf R}$ and $\lambda>0$ the resolvent of $f$ is a map $J_{\lambda}^{f}:S\to S$ defined by

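$$J_{\lambda}^{f}(x):=\mathop{\rm argmin}_{y\in S}\left(f(y)+\frac{1}{2\lambda}d(x,y)^{2}\right)\quad(x\in S).$$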

Since the function $y\mapsto f(y)+\frac{1}{2\lambda}d(x,y)^{2}$ is strongly convex with parameter $1/\lambda>0$, the minimizer is uniquely determined, and $J_{\lambda}^{f}$ is well-defined. The proximal point algorithm (PPA) iterates the update $x\leftarrow J_{\lambda}^{f}(x)$. This simple algorithm generates a sequence converging to a minimizer of $f$ under a mild assumption; see [4, 6]. The splitting proximal point algorithm (SPPA) [5, 6], which we will use, minimizes a convex function $f:S\to{\bf R}$ of the following form

$$f:=\sum_{i=1}^{N}f_{i},$$

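where each $f_{i}:S\to{\bf R}$ is a convex function. Consider a sequence $(\lambda_{k})_{k=0,1,2,\ldots}$ of positive numbers satisfying

$$\sum_{k=0}^{\infty}\lambda_{k}=\infty,\quad\sum_{k=0}^{\infty}\lambda_{k}^{2}<\infty.$$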

Splitting Proximal Point Algorithm (SPPA)

$\bullet$ Let $x_{0}\in S$ be an initial point.

$\bullet$ For $k=0,1,2,\ldots$, repeat the following:

$$x_{kN+i}:=J_{\lambda_{k}}^{f_{i}}(x_{kN+i-1})\quad(i=1,2,\ldots,N).$$

Bačák [5] showed that the sequence generated by SPPA converges to a minimizer of $f$ if $S$ is locally compact. Ohta and Pálfia [37] proved the sublinear convergence of SPPA if $f$ is strongly convex, where $S$ is not necessarily locally compact.
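As a toy illustration of SPPA, consider the simplest Hadamard space, the real line, with $f_{i}(y)=|y-c_{i}|$, whose resolvent has a closed form (move toward $c_{i}$ by $\lambda$, stopping at $c_{i}$). The sketch below is not the setting of the paper: the function names, the centers, and the step sizes $\lambda_{k}=1/(k+1)$ are assumptions chosen so that $\sum\lambda_{k}=\infty$ and $\sum\lambda_{k}^{2}<\infty$. Here $f$ is convex but not strongly convex, so only the convergence statement for locally compact spaces applies, not the rate.

```python
def resolvent_abs(c, lam, x):
    """Resolvent J_lam^f(x) on the real line for f(y) = |y - c|."""
    if abs(x - c) <= lam:
        return c                      # the minimizer snaps to c when x is close enough
    return x - lam if x > c else x + lam

def sppa(centers, x0, cycles):
    """Cyclic SPPA for f(y) = sum_i |y - c_i| with step sizes lambda_k = 1/(k+1)."""
    x = x0
    for k in range(cycles):
        lam = 1.0 / (k + 1)
        for c in centers:             # x_{kN+i} := J_{lambda_k}^{f_i}(x_{kN+i-1})
            x = resolvent_abs(c, lam, x)
    return x

# The minimizer of |y| + |y - 1| + |y - 5| is the median 1.0.
print(sppa([0.0, 1.0, 5.0], x0=4.0, cycles=2000))
```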

Theorem 2.3 ([37]).

Suppose that $f$ is strongly convex with parameter $\epsilon>0$ and each $f_{i}$ is $L$ -Lipschitz. Let $x^{*}$ be the unique minimizer of $f$ . For $0<a<1$ , define the sequence $(\lambda_{k})$ by

$$\lambda_{k}:=\frac{1-a}{\epsilon(k+1)}.$$

Then the sequence $(x_{\ell})$ generated by SPPA satisfies

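$$d(x_{kN},x^{*})^{2}\leq\frac{1}{(k+2)^{1-a}}\left(d(x_{0},x^{*})^{2}+h(a)\frac{L^{2}N(N+1)}{\epsilon^{2}}\right)\quad(k=1,2,\ldots),$$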

where $h(a):=2^{2-a}(1-a)^{2}(1+a)/a$ .

Note that Ohta and Pálfia stated this theorem assuming $L\geq 1$ but this condition is not used in their proof, and does not affect our argument.
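To get a feeling for the bound in Theorem 2.3, one can evaluate its right hand side and solve for the number of cycles that guarantees a target accuracy. The helper names and the sample parameter values below are assumptions for illustration; `d0` stands for $d(x_{0},x^{*})$, and floating-point arithmetic is used, so values very close to the threshold may be affected by rounding.

```python
import math

def h(a):
    """The constant h(a) = 2^(2-a) (1-a)^2 (1+a) / a from Theorem 2.3."""
    return 2 ** (2 - a) * (1 - a) ** 2 * (1 + a) / a

def sppa_bound(k, a, d0, L, N, eps):
    """Right hand side of the bound on d(x_{kN}, x*)^2 after k cycles."""
    return (d0 ** 2 + h(a) * L ** 2 * N * (N + 1) / eps ** 2) / (k + 2) ** (1 - a)

def cycles_needed(delta, a, d0, L, N, eps):
    """Smallest k >= 1 for which the bound guarantees d(x_{kN}, x*)^2 <= delta."""
    c = d0 ** 2 + h(a) * L ** 2 * N * (N + 1) / eps ** 2
    return max(1, math.ceil((c / delta) ** (1 / (1 - a)) - 2))

# Hypothetical parameters: a = 1/2, d0 = 1, L = 1, N = 2, eps = 1, target 0.1.
k = cycles_needed(0.1, 0.5, 1.0, 1.0, 2, 1.0)
print(k, sppa_bound(k, 0.5, 1.0, 1.0, 2, 1.0))
```

The required number of cycles grows like $(C/\delta)^{1/(1-a)}$ for a constant $C$ polynomial in $L$, $N$, $1/\epsilon$, and $d(x_{0},x^{*})$.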

2.2 Modular lattice

A lattice ${\cal L}$ is a partially ordered set such that every pair $p,q$ of elements has a meet $p\wedge q$ (greatest common lower bound) and a join $p\vee q$ (least common upper bound). Let $\preceq$ denote the partial order. By $p\prec q$ we mean $p\preceq q$ and $p\neq q$. A pairwise comparable subset of ${\cal L}$, arranged as $p_{0}\prec p_{1}\prec\cdots\prec p_{k}$, is called a chain (from $p_{0}$ to $p_{k}$), where $k$ is called the length. In this paper, we only consider lattices in which every chain has finite length. Let ${\bf 0}$ and ${\bf 1}$ denote the minimum and maximum elements of ${\cal L}$, respectively. The rank $r(p)$ of an element $p$ is defined as the maximum length of a chain from ${\bf 0}$ to $p$. The rank of the lattice ${\cal L}$ is defined as the rank of ${\bf 1}$.

A lattice ${\cal L}$ is called modular if for every triple $x,a,b$ of elements with $x\preceq b$ , it holds $x\vee(a\wedge b)=(x\vee a)\wedge b$ . It is known that a modular lattice is exactly such a lattice that satisfies

$$r(p)+r(q)=r(p\wedge q)+r(p\vee q)\quad(p,q\in{\cal L}). \tag{2.8}$$

An element of rank $1$ is called an atom. A modular lattice ${\cal L}$ is said to be complemented if every element can be represented as a join of atoms. A lattice ${\cal L}$ is said to be distributive if $x\vee(y\wedge z)=(x\vee y)\wedge(x\vee z)$ and $x\wedge(y\vee z)=(x\wedge y)\vee(x\wedge z)$ hold for every triple $x,y,z$ of elements. A distributive lattice is a modular lattice. A complemented distributive lattice is exactly a Boolean lattice, which is a lattice isomorphic to the poset $2^{\{1,2,\ldots,n\}}$ of all subsets of $\{1,2,\ldots,n\}$ with the inclusion order $\subseteq$ .

A function $f:{\cal L}\to{\bf R}$ is said to be submodular if

$$f(p)+f(q)\geq f(p\wedge q)+f(p\vee q)\quad(p,q\in{\cal L}).$$

Let $\check{\cal L}$ denote the opposite lattice of ${\cal L}$ , where $\check{\cal L}$ and ${\cal L}$ are equal as a set, and the partial order of $\check{\cal L}$ is the reverse of that of ${\cal L}$ . For a complemented modular lattice ${\cal L}$ , the opposite lattice $\check{\cal L}$ is also a complemented modular lattice.

A canonical example of a complemented modular lattice is the family ${\cal L}$ of all subspaces of a vector space $U$ , where the partial order is the inclusion order with $\wedge=\cap$ , and $\vee=+$ . The rank of a subspace $X\in{\cal L}$ is equal to the dimension $\dim X$ . The following equality of dimension is well-known:

$$\dim X+\dim X^{\prime}=\dim(X\cap X^{\prime})+\dim(X+X^{\prime})\quad(X,X^{\prime}\in{\cal L}).$$
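The subspace lattice and its dimension equality can be checked exhaustively in a tiny case. The sketch below enumerates all subspaces of ${\bf F}_{2}^{3}$; the choice of the field ${\bf F}_{2}$, the bitmask encoding of vectors, and the function names are assumptions made for illustration.

```python
from itertools import combinations

def span(gens):
    """All vectors of the GF(2)-span of gens; vectors are bitmask integers."""
    S = {0}
    for g in gens:
        S |= {s ^ g for s in S}       # adding a generator doubles the span or leaves it unchanged
    return frozenset(S)

def dim(X):
    """Dimension of a GF(2)-subspace stored as its full set of 2^dim vectors."""
    return len(X).bit_length() - 1

n = 3
vectors = range(1, 2 ** n)
subspaces = {span(g) for k in range(n + 1) for g in combinations(vectors, k)}

for X in subspaces:
    for Y in subspaces:
        meet, join = X & Y, span(X | Y)   # meet = intersection, join = sum of subspaces
        assert dim(X) + dim(Y) == dim(meet) + dim(join)

print(len(subspaces))  # number of subspaces of GF(2)^3
```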

2.2.1 Basic properties

Let ${\cal L}$ be a modular lattice of rank $n$ , and let $r$ be the rank function of ${\cal L}$ . For $k>0$ , we denote $p\prec_{k}q$ if $p\preceq q$ and $r(q)-r(p)=k$ .

Lemma 2.4.

For $p,p^{\prime},q,q^{\prime}\in{\cal L}$ with $p\prec_{k}p^{\prime}$ and $q\prec_{1}q^{\prime}$ , it holds that

[TABLE]

In particular, the function $u\mapsto r(p^{\prime}\wedge u)-r(p\wedge u)$ is nondecreasing and takes values from $0$ to $k$ .

Proof.

First note that $r(p^{\prime}\wedge q^{\prime})-r(p^{\prime}\wedge q)\in\{0,1\}$ and $r(p\wedge q^{\prime})-r(p\wedge q)\in\{0,1\}$ . Indeed, suppose that $r(p^{\prime}\wedge q^{\prime})-r(p^{\prime}\wedge q)>0$ . Then $p^{\prime}\wedge q^{\prime}\not\preceq q$ and hence $(p^{\prime}\wedge q^{\prime})\vee q=q^{\prime}$ . By (2.8), we have $r(p^{\prime}\wedge q^{\prime})+r(q)=r(q^{\prime})+r(p^{\prime}\wedge q)$ , and $r(p^{\prime}\wedge q^{\prime})-r(p^{\prime}\wedge q)=r(q^{\prime})-r(q)=1$ .

Thus it suffices to consider the case of $p^{\prime}\wedge q^{\prime}=p^{\prime}\wedge q$ (i.e., $r(p^{\prime}\wedge q^{\prime})=r(p^{\prime}\wedge q)$ ). Then $p\wedge p^{\prime}\wedge q^{\prime}=p\wedge p^{\prime}\wedge q$ implies $p\wedge q^{\prime}=p\wedge q$ (i.e., $r(p\wedge q^{\prime})=r(p\wedge q)$ ), as required. ∎

In the case where ${\cal L}$ is complemented, a base is a set of $n$ atoms $a_{1},a_{2},\ldots,a_{n}$ with $a_{1}\vee a_{2}\vee\cdots\vee a_{n}={\bf 1}$ . The sublattice $\langle a_{1},a_{2},\ldots,a_{n}\rangle$ generated by a base $\{a_{1},a_{2},\ldots,a_{n}\}$ is called a frame, which is isomorphic to a Boolean lattice $2^{\{1,2,\ldots,n\}}$ by

[TABLE]

Lemma 2.5 (see e.g.,[18]).

Let ${\cal C}$ and ${\cal D}$ be (maximal) chains in $\cal L$ . The sublattice generated by ${\cal C}$ and ${\cal D}$ is distributive. If ${\cal L}$ is complemented, then there is a frame $\langle a_{1},a_{2},\ldots,a_{n}\rangle\subseteq{\cal L}$ containing $\cal C$ and $\cal D$ .

A complemented modular lattice is viewed as a spherical building of type A [1]. The latter property in this lemma corresponds to an axiom of buildings and is particularly important for us; we provide a proof based on [1, Section 4.3].

Proof.

Suppose that ${\cal C}=({\bf 0}=p_{0}\prec p_{1}\prec\cdots\prec p_{n}={\bf 1})$ and ${\cal D}=({\bf 0}=q_{0}\prec q_{1}\prec\cdots\prec q_{n}={\bf 1})$ . We first show

Claim.

There exists a bijection $\sigma$ on $\{1,2,\ldots,n\}$ such that $p_{i-1}\wedge q_{\sigma(i)-1}\prec_{1}p_{i}\wedge q_{\sigma(i)}$ for each $i$ .

Assume the claim. By complementarity, for each $i$ , we can choose an atom $a_{i}$ such that $(p_{i-1}\wedge q_{\sigma(i)-1})\vee a_{i}=p_{i}\wedge q_{\sigma(i)}$ . Then it holds $p_{i-1}\vee a_{i}=p_{i}$ and $q_{\sigma(i)-1}\vee a_{i}=q_{\sigma(i)}$ . Consequently, all $p_{i}$ and $q_{i}$ are represented as joins of $a_{1},a_{2},\ldots,a_{n}$ . See Figure 3 for intuition.

We prove the claim. By Lemma 2.4, for each $i\in\{1,2,\ldots,n\}$ there uniquely exists $j\in\{1,2,\ldots,n\}$ such that

[TABLE]

In particular, it holds that

[TABLE]

Here $p_{i}\wedge q_{j}\not\preceq q_{j-1}$ must hold; if $p_{i}\wedge q_{j}\preceq q_{j-1}$ , then $p_{i}\wedge q_{j}=p_{i}\wedge p_{i}\wedge q_{j}\preceq p_{i}\wedge q_{j-1}$ , which contradicts (2.12). Thus $(p_{i}\wedge q_{j})\vee q_{j-1}=q_{j}$ , implying $p_{i}\wedge q_{j-1}\prec_{1}p_{i}\wedge q_{j}$ (by (2.8)). By (2.12), it necessarily holds that $p_{i-1}\wedge q_{j-1}=p_{i-1}\wedge q_{j}\prec_{1}p_{i}\wedge q_{j}$ .

Thus we can define the map $\sigma$ by associating $i$ with $\sigma(i):=j$ with property (2.11). This map is injective, and hence bijective. Indeed, by (2.11), we have $r(p_{i}\wedge q_{j})-r(p_{i}\wedge q_{j-1})-r(p_{i-1}\wedge q_{j})+r(p_{i-1}\wedge q_{j-1})=1$ , and

[TABLE]

This means that interchanging the roles of $i,j$ yields the inverse map of $\sigma$ . ∎
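In a Boolean lattice the claim can be checked directly. The following is only an illustrative sketch: it assumes that both maximal chains are given by orderings of the ground set, and the helper names `chain_from_order`, `sigma_from_chains`, `covers_by_one`, and `check_all` are hypothetical. For each $i$ it searches for the index $j$ at which the rank expression used in the proof equals $1$.

```python
from itertools import permutations


def chain_from_order(order):
    """Maximal chain p_0 < p_1 < ... < p_n obtained by adding elements in the given order."""
    return [frozenset(order[:i]) for i in range(len(order) + 1)]


def sigma_from_chains(P, Q):
    """Return sigma as a dict i -> j for two maximal chains of a Boolean lattice.

    The index j is selected by the condition
    r(p_i ^ q_j) - r(p_i ^ q_{j-1}) - r(p_{i-1} ^ q_j) + r(p_{i-1} ^ q_{j-1}) = 1,
    where the rank r is the cardinality and the meet ^ is the intersection.
    """
    n = len(P) - 1
    sigma = {}
    for i in range(1, n + 1):
        hits = [j for j in range(1, n + 1)
                if len(P[i] & Q[j]) - len(P[i] & Q[j - 1])
                - len(P[i - 1] & Q[j]) + len(P[i - 1] & Q[j - 1]) == 1]
        if len(hits) != 1:
            raise ValueError("expected exactly one index j for i = %d" % i)
        sigma[i] = hits[0]
    return sigma


def covers_by_one(P, Q, sigma):
    """Check that p_{i-1} ^ q_{sigma(i)-1} is covered by p_i ^ q_{sigma(i)} for every i."""
    return all(
        P[i - 1] & Q[sigma[i] - 1] < P[i] & Q[sigma[i]]
        and len(P[i] & Q[sigma[i]]) - len(P[i - 1] & Q[sigma[i] - 1]) == 1
        for i in sigma
    )


def check_all(n):
    """Verify the claim for every pair of maximal chains of the subsets of {1,...,n}."""
    chains = [chain_from_order(list(p)) for p in permutations(range(1, n + 1))]
    for P in chains:
        for Q in chains:
            sigma = sigma_from_chains(P, Q)
            if sorted(sigma.values()) != list(range(1, n + 1)):
                return False
            if not covers_by_one(P, Q, sigma):
                return False
    return True
```

In this special case $\sigma(i)$ is simply the position, in the second ordering, of the element added at step $i$ of the first ordering. For chains of vector subspaces the cardinalities would be replaced by dimensions obtained from rank computations.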

Suppose that ${\cal L}$ is the lattice of all vector subspaces of a vector space, and that we are given two chains ${\cal C}$ and ${\cal D}$ of vector subspaces, where each subspace $X$ in the chains is given by a matrix $A$ with ${\rm Im}A=X$ and/or a matrix $B$ with $\ker B=X$ . The above proof can then be implemented via rank computation (Gaussian elimination), which yields vectors $a_{1},a_{2},\ldots,a_{n}$ with ${\cal C},{\cal D}\subseteq\langle a_{1},a_{2},\ldots,a_{n}\rangle$ in polynomial time.

Let $U$ and $V$ be vector spaces of dimension $m$ and $n$ , respectively, and let $A:U\times V\to{\bf F}$ be a bilinear form. Let ${\cal L}$ and ${\cal M}$ be the lattices of all vector subspaces of $U$ and of $V$ , respectively. Consider the opposite $\check{\cal M}$ . Define $R=R^{A}:{\cal L}\times\check{\cal M}\to{\bf Z}$ by

$$R(X,Y):=\mathop{\rm rank}A|_{X\times Y}\quad((X,Y)\in{\cal L}\times\check{\cal M}),\tag{2.13}$$

where $A|_{X\times Y}:X\times Y\to{\bf F}$ is the restriction of $A$ to $X\times Y$ , and $\mathop{\rm rank}$ is the rank of the matrix representation. Then $R$ is submodular; an equivalent statement is in [28, Lemma 4.2].

Lemma 2.6.

For $(X,Y),(X^{\prime},Y^{\prime})\in{\cal L}\times{\cal M}$ , it holds

[TABLE]

Thus $R$ is a submodular function on ${\cal L}\times\check{\cal M}$ .

Proof.

By Lemma 2.5, there is a base $\{a_{1},a_{2},\ldots,a_{m}\}$ of ${\cal L}$ with $X,X^{\prime},X\cap X^{\prime},X+X^{\prime}\subseteq\langle a_{1},a_{2},\ldots,a_{m}\rangle$ , and there is a base $\{b_{1},b_{2},\ldots,b_{n}\}$ of ${\cal M}$ with $Y,Y^{\prime},Y\cap Y^{\prime},Y+Y^{\prime}\subseteq\langle b_{1},b_{2},\ldots,b_{n}\rangle$ . Consider the matrix representation $A=(a_{ij})$ with respect to these bases, i.e., $a_{ij}:=A(a_{i},b_{j})$ . For $I\subseteq\{1,2,\ldots,m\}$ and $J\subseteq\{1,2,\ldots,n\}$ , let $A[I,J]:=(a_{ij}:i\in I,j\in J)$ be the submatrix of $A$ with row set $I$ and column set $J$ . Then (2.14) follows from the well-known rank inequality

[TABLE]

for $I,I^{\prime}\subseteq\{1,2,\ldots,m\}$ and $J,J^{\prime}\subseteq\{1,2,\ldots,n\}$ ; see [35, Proposition 2.1.9]. ∎
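The rank inequality for submatrices can be checked by brute force on small examples. The sketch below assumes the inequality in the form $\mathop{\rm rank}A[I,J]+\mathop{\rm rank}A[I^{\prime},J^{\prime}]\geq\mathop{\rm rank}A[I\cap I^{\prime},J\cup J^{\prime}]+\mathop{\rm rank}A[I\cup I^{\prime},J\cap J^{\prime}]$ and works over the rationals; the function names are hypothetical and indices are 0-based.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) over the rationals, by Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    ncols = len(M[0]) if M else 0
    rk = 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(rk + 1, len(M)):
            factor = M[i][c] / M[rk][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[rk])]
        rk += 1
    return rk


def all_subsets(k):
    """All subsets of {0,...,k-1} as frozensets."""
    return [frozenset(c) for size in range(k + 1) for c in combinations(range(k), size)]


def rank_inequality_holds(A):
    """Check the submatrix rank inequality for all index sets I, I', J, J'."""
    m, n = len(A), len(A[0])
    rows, cols = all_subsets(m), all_subsets(n)
    # rho[I, J] = rank of the submatrix with row set I and column set J
    rho = {(I, J): rank([[A[i][j] for j in sorted(J)] for i in sorted(I)])
           for I in rows for J in cols}
    return all(
        rho[I, J] + rho[I2, J2] >= rho[I & I2, J | J2] + rho[I | I2, J & J2]
        for I in rows for I2 in rows for J in cols for J2 in cols
    )
```

The exhaustive check is exponential in the matrix size and is meant only as a sanity test, not as part of the algorithm.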

2.2.2 Orthoscheme complex

The $n$ -dimensional orthoscheme is the simplex in ${\bf R}^{n}$ with vertices

$$0,\ e_{1},\ e_{1}+e_{2},\ \ldots,\ e_{1}+e_{2}+\cdots+e_{n},$$

where $e_{i}$ is the $i$ th unit vector; see Figure 4 for the $3$ -dimensional orthoscheme.

An orthoscheme complex, introduced by Brady and McCammond [7] in the context of geometric group theory, is a metric simplicial complex obtained by gluing orthoschemes. Let ${\cal L}$ be a modular lattice of rank $n$ . Let $F({\cal L})$ be the free ${\bf R}$ -module over ${\cal L}$ , i.e., the set of formal (finite) linear combinations $x=\sum_{p\in{\cal L}}\lambda(p)p$ such that each coefficient $\lambda(p)$ is in ${\bf R}$ and the set of elements $p$ with nonzero coefficient, which we call the support of $x$ , is finite. Let $K({\cal L})$ be the subset of elements $x=\sum_{p\in{\cal L}}\lambda(p)p\in F({\cal L})$ such that $\lambda(p)\geq 0$ for $p\in{\cal L}$ , $\sum_{p\in{\cal L}}\lambda(p)=1$ , and the support of $x$ is a chain of ${\cal L}$ . Namely $K({\cal L})$ is the geometric realization of the order complex of ${\cal L}$ . The subset of $K({\cal L})$ consisting of formal combinations of some chain ${\cal C}$ is called a simplex of $K({\cal L})$ . For a maximal simplex $\sigma$ corresponding to a maximal chain ${\cal C}=p_{0}\prec p_{1}\prec\cdots\prec p_{n}$ , define a map $\varphi_{\sigma}$ from $\sigma$ to the $n$ -dimensional orthoscheme by

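$$\varphi_{\sigma}(x):=\sum_{i=1}^{n}\lambda_{i}(e_{1}+e_{2}+\cdots+e_{i})\quad\Bigl(x=\sum_{i=0}^{n}\lambda_{i}p_{i}\in\sigma\Bigr).$$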

Then a metric $d_{\sigma}$ on each simplex $\sigma$ is defined by

$$d_{\sigma}(x,y):=\|\varphi_{\sigma}(x)-\varphi_{\sigma}(y)\|_{2}\quad(x,y\in\sigma).\tag{2.15}$$

The length of a path $\gamma:[0,1]\to K({\cal L})$ is defined as $\sup\sum_{i=0}^{m-1}d_{\sigma_{i}}(\gamma(t_{i}),\gamma(t_{i+1}))$ , where the $\sup$ is taken over all $0=t_{0}<t_{1}<t_{2}<\cdots<t_{m}=1$ $(m\geq 1)$ such that $\gamma([t_{i},t_{i+1}])$ belongs to a simplex $\sigma_{i}$ for each $i$ . The metric on $K({\cal L})$ is (well-)defined as above. The resulting metric space $K({\cal L})$ is called the orthoscheme complex of ${\cal L}$ . Then $K({\cal L})$ is a complete geodesic metric space (by Bridson’s theorem [8, Theorem 7.19]).

Theorem 2.7 ([9, 20]).

Let ${\cal L}$ be a modular lattice of rank $n$ . The orthoscheme complex $K({\cal L})$ is a complete CAT(0)-space.

Lemma 2.8 ([7, 9]).

Let ${\cal L}$ and ${\cal M}$ be modular lattices. Define a metric $d$ on $K({\cal L})\times K({\cal M})$ by

[TABLE]

Then $K({\cal L})\times K({\cal M})$ is isometric to $K({\cal L}\times{\cal M})$ , where the isometry $\phi:K({\cal L})\times K({\cal M})\to K({\cal L}\times{\cal M})$ is given by the following algorithm:

Input:

$(x,y)\in K({\cal L})\times K({\cal M})$ .

Output:

$z=\phi(x,y)\in K({\cal L}\times{\cal M})$ .

0:

Let $z:=0$

1:

If $(x,y)=(0,0)$ , then return $z$ .

2:

Choose the maximum element $p$ from the support of $x$ and the maximum element $q$ from the support of $y$ .

3:

Let $\lambda$ be the minimum of the coefficient of $p$ in $x$ and that of $q$ in $y$ . Let $x\leftarrow x-\lambda p$ , $y\leftarrow y-\lambda q$ , and $z\leftarrow z+\lambda(p,q)$ . Go to 1.
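The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: points are stored as dictionaries from lattice elements to coefficients, the elements are `frozenset`s ordered by inclusion (so that the maximum of a chain is the largest set), and the name `phi` and the `key_x`/`key_y` parameters are hypothetical. For an opposite lattice such as $\check{\cal M}$ one would pass a key that reverses the order.

```python
from fractions import Fraction


def phi(x, y, key_x=len, key_y=len):
    """Map (x, y) in K(L) x K(M) to z in K(L x M).

    x, y: dicts {lattice element: coefficient}; each support is a chain and the
    coefficients sum to 1. key_x / key_y rank the elements along the chain
    (a larger key means a larger element).
    Returns z as a dict {(p, q): coefficient}.
    """
    x = {p: Fraction(c) for p, c in x.items() if c != 0}
    y = {q: Fraction(c) for q, c in y.items() if c != 0}
    z = {}
    while x and y:                      # step 1: stop once everything is consumed
        p = max(x, key=key_x)           # step 2: maximum elements of the supports
        q = max(y, key=key_y)
        lam = min(x[p], y[q])           # step 3: move the common weight to (p, q)
        z[(p, q)] = z.get((p, q), 0) + lam
        x[p] -= lam
        y[q] -= lam
        if x[p] == 0:
            del x[p]
        if y[q] == 0:
            del y[q]
    return z
```

Exact rational arithmetic is used so that both coefficient vectors are exhausted at the same step; with floating-point input a tolerance would be needed instead of the exact zero tests.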

The orthoscheme complex of a Boolean lattice is a Euclidean cube as follows, where $1_{X}\in\{0,1\}^{n}$ is the characteristic vector of $X\subseteq\{1,2,\ldots,n\}$ defined by $(1_{X})_{i}=1\Leftrightarrow i\in X$ .

Lemma 2.9 ([7, 9]).

Let ${\cal L}$ be a Boolean lattice $2^{\{1,2,\ldots,n\}}$ . The orthoscheme complex $K({\cal L})$ is isometric to the $n$ -cube $[0,1]^{n}$ in ${\bf R}^{n}$ , where an isometry is given by

[TABLE]

Lemma 2.10 ([9]).

Let ${\cal L}$ be a complemented modular lattice of rank $n$ , and let ${\cal F}$ be a frame of ${\cal L}$ . Then $K({\cal F})\simeq[0,1]^{n}$ is an isometric subcomplex of $K({\cal L})$ .

Corollary 2.11.

Let ${\cal L}$ be a complemented modular lattice of rank $n$ . Then $\mathop{\rm diam}K({\cal L})=\sqrt{n}$ .

Proof.

For two points $x,y\in K({\cal L})$ , there is a frame ${\cal F}$ such that $x,y\in K({\cal F})$ (by Lemma 2.5). Since $K({\cal F})\simeq[0,1]^{n}$ and $K({\cal F})$ is an isometric subspace, the distance $d(x,y)$ is bounded by the diameter $\sqrt{n}$ of $[0,1]^{n}$ , which is attained by $x={\bf 0}$ and $y={\bf 1}$ . ∎

A frame ${\cal F}=\langle a_{1},a_{2},\ldots,a_{n}\rangle$ is isomorphic to the Boolean lattice $2^{\{1,2,\ldots,n\}}$ by $a_{i_{1}}\vee a_{i_{2}}\vee\cdots\vee a_{i_{k}}\mapsto\{i_{1},i_{2},\ldots,i_{k}\}$ . Also the subcomplex $K({\cal F})$ is viewed as an $n$ -cube $[0,1]^{n}$ , and a point $x$ in $K({\cal F})$ is viewed as $x=(x_{1},x_{2},\ldots,x_{n})\in[0,1]^{n}$ via the isometry of Lemma 2.9. This $n$ -dimensional vector $(x_{1},x_{2},\ldots,x_{n})$ is called the ${\cal F}$ -coordinate of $x$ . From the ${\cal F}$ -coordinate $(x_{1},x_{2},\ldots,x_{n})$ , the original expression of $x$ is recovered by sorting $x_{1},x_{2},\ldots,x_{n}$ in decreasing order as $x_{i_{1}}\geq x_{i_{2}}\geq\cdots\geq x_{i_{n}}$ , and letting

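$$x=(1-x_{i_{1}}){\bf 0}+\sum_{k=1}^{n}(x_{i_{k}}-x_{i_{k+1}})(a_{i_{1}}\vee a_{i_{2}}\vee\cdots\vee a_{i_{k}}),\tag{2.17}$$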

where $x_{i_{n+1}}:=0$ .

2.2.3 Lovász extension

We here introduce the Lovász extension for a function on a modular lattice ${\cal L}$ . For a function $f:{\cal L}\to{\bf R}$ , the Lovász extension $\overline{f}:K({\cal L})\to{\bf R}$ of $f$ is defined by

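$$\overline{f}(x):=\sum_{p\in{\cal L}}\lambda(p)f(p)\quad\Bigl(x=\sum_{p\in{\cal L}}\lambda(p)p\in K({\cal L})\Bigr).$$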

In the case where ${\cal L}=2^{\{1,2,\ldots,n\}}$ , this definition of the Lovász extension coincides with the original one [14, 32] by $K({\cal L})\simeq[0,1]^{n}$ (Lemma 2.9).
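In the Boolean case both the passage from ${\cal F}$ -coordinates to the formal sum and the evaluation of the Lovász extension reduce to a sort. The following sketch assumes 0-based indices, a set function `f` taking a `frozenset`, and the usual rule that $\overline{f}$ is the coefficient-weighted sum of the values of $f$ on the chain; the function names are hypothetical.

```python
from fractions import Fraction


def formal_sum(x):
    """Decompose x in [0,1]^n into a list of (coefficient, subset) along a chain.

    Coordinates are sorted in decreasing order x_{i_1} >= ... >= x_{i_n};
    the empty set gets 1 - x_{i_1} and {i_1,...,i_k} gets x_{i_k} - x_{i_{k+1}},
    with x_{i_{n+1}} := 0. Zero coefficients are dropped.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i], reverse=True)
    vals = [x[i] for i in order] + [0]
    terms = [(1 - vals[0], frozenset())]
    terms += [(vals[k - 1] - vals[k], frozenset(order[:k])) for k in range(1, n + 1)]
    return [(c, S) for c, S in terms if c != 0]


def coordinates(terms, n):
    """Inverse direction: sum of coefficient times characteristic vector."""
    return [sum(c for c, S in terms if i in S) for i in range(n)]


def lovasz_extension(f, x):
    """Evaluate the Lovasz extension of the set function f at x in [0,1]^n."""
    return sum(c * f(S) for c, S in formal_sum(x))
```

For the cardinality function `len`, which is the rank function of the Boolean lattice, `lovasz_extension(len, x)` returns the sum of the coordinates of `x`.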

Theorem 2.12 ([22]).

Let ${\cal L}$ be a modular lattice. For a function $f:{\cal L}\to{\bf R}$ , the following conditions are equivalent:

(1)

$f$ is submodular.

(2)

$\overline{f}$ is convex.

Sketch of proof.

For two points $x,y\in K({\cal L})$ , there is a frame ${\cal F}$ such that $K({\cal F})$ contains $x,y$ . Also $K({\cal F})$ is an isometric subspace of $K({\cal L})$ . Therefore the geodesic $[x,y]$ belongs to $K({\cal F})$ . Hence, a function on $K({\cal L})$ is convex if and only if it is convex on $K({\cal F})$ for every frame ${\cal F}$ . For any frame ${\cal F}$ , the restriction of a submodular function $f:{\cal L}\to{\bf R}$ to ${\cal F}$ is a usual submodular function on Boolean lattice ${\cal F}\simeq 2^{\{1,2,\ldots,n\}}$ . Hence $\overline{f}:K({\cal F})\to{\bf R}$ is viewed as the usual Lovász extension by $[0,1]^{n}\simeq K({\cal F})$ , and is convex. ∎

The rank function $r$ is submodular. The Lovász extension $\overline{r}$ of $r$ can be expressed via the $l_{1}$ -metric on $K({\cal L})$ . Here the $l_{1}$ -metric $d_{1}$ is obtained by replacing $\|\cdot\|_{2}$ by $\|\cdot\|_{1}$ in (2.15), i.e.,

[TABLE]

The $l_{1}$ -metric on $K({\cal L})$ is denoted by $d_{1}$ . The function $x\mapsto d_{1}({\bf 0},x)$ is simply written as $d_{1}$ .

Lemma 2.13.

The Lovász extension $\overline{r}$ of the rank function $r$ is equal to $d_{1}$ .

Proof.

For $x=\sum_{i=0}^{n}\lambda_{i}p_{i}\in K({\cal L})$ , consider the simplex $\sigma$ formed by $p_{i}$ ’s. Then we have

[TABLE]

∎

The following lemma will be used to obtain a minimizer of a function on ${\cal L}$ from an approximate minimizer of its Lovász extension.

Lemma 2.14.

Let $f:{\cal L}\to{\bf Z}$ be an integer-valued function, and let $p^{*}\in{\cal L}$ be a minimizer of $f$ . For $x\in K({\cal L})$ , if $\overline{f}(x)-f(p^{*})<1$ , then there exists a minimizer of $f$ in the support of $x$ .

Proof.

Suppose that $x=\sum_{i}\lambda_{i}p_{i}$ . Suppose to the contrary that all $p_{i}$ ’s satisfy $f(p_{i})>f(p^{*})$ . Then $f(p_{i})\geq f(p^{*})+1$ . Hence $\overline{f}(x)=\sum_{i}\lambda_{i}f(p_{i})\geq\sum_{i}\lambda_{i}(f(p^{*})+1)=f(p^{*})+1$ . However this contradicts $\overline{f}(x)-f(p^{*})<1$ . ∎
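The rounding step suggested by this lemma is a single pass over the support. The sketch below uses a toy integer-valued function on subsets and hypothetical helper names; it only illustrates the statement and is not tied to any particular lattice representation.

```python
from fractions import Fraction


def lovasz_value(f, x):
    """Value of the Lovasz extension at x = sum of coeff * element (dict form)."""
    return sum(c * f(p) for p, c in x.items())


def best_in_support(f, x):
    """Element of the support of x with the smallest f-value."""
    return min((p for p, c in x.items() if c > 0), key=f)


def toy_f(S):
    """Toy integer-valued function, minimized exactly by the 2-element sets."""
    return (len(S) - 2) ** 2


# A point whose Lovasz value is within 1 of the minimum value 0 of toy_f.
toy_x = {
    frozenset(): Fraction(1, 10),
    frozenset({0, 1}): Fraction(8, 10),
    frozenset({0, 1, 2}): Fraction(1, 10),
}
```

Here `lovasz_value(toy_f, toy_x)` is below $1$, so by the lemma `best_in_support(toy_f, toy_x)` must be a minimizer of `toy_f`.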

The following lemma will be used to estimate the Lipschitz constant of the Lovász extension.

Lemma 2.15.

The Lovász extension $\overline{f}$ of $f:{\cal L}\to{\bf R}$ is $L$ -Lipschitz with

$$L\leq 2\sqrt{n}\max_{p\in{\cal L}}|f(p)|.$$

Proof.

We first show that the restriction $\overline{f}|_{\sigma}$ of $\overline{f}$ to any maximal simplex $\sigma$ is $L$ -Lipschitz with $L\leq 2\sqrt{n}\max_{p\in{\cal L}}|f(p)|$ . Suppose that $\sigma$ corresponds to a chain ${\bf 0}=p_{0}<p_{1}<\cdots<p_{n}={\bf 1}$ . Let $x=\sum_{k}\lambda_{k}p_{k}$ and $y=\sum_{k}\mu_{k}p_{k}$ be points in $\sigma$ . For $k=0,1,2,\ldots,n$ , define $u_{k}$ and $v_{k}$ by

[TABLE]

Then $d_{\sigma}(x,y)$ is given by

[TABLE]

Letting $C:=\max_{p\in{\cal L}}|f(p)|$ , we have

[TABLE]

where we let $u_{0}=v_{0}:=1$ and $u_{n+1}=v_{n+1}:=0$ . Thus $\overline{f}|_{\sigma}$ is $2\sqrt{n}C$ -Lipschitz.

Next we show that $\overline{f}$ is $2\sqrt{n}C$ -Lipschitz. For any $x,y\in K({\cal L})$ , choose the geodesic $\gamma$ between $x$ and $y$ , and $0=t_{0}<t_{1}<\cdots<t_{m}=1$ such that $\gamma([t_{i},t_{i+1}])$ belongs to simplex $\sigma_{i}$ . Then we have

[TABLE]

∎

3 Maximum vanishing subspace problem

3.1 CAT(0)-space relaxation

Suppose that we are given an instance of WMVSP: a partitioned matrix $A=(A_{\alpha\beta})$ of type $(m_{1},m_{2},\ldots,m_{\mu};n_{1},n_{2},\ldots,n_{\nu})$ and nonnegative integer weights $C_{\alpha},D_{\beta}$ for $1\leq\alpha\leq\mu$ and $1\leq\beta\leq\nu$ . Let $m=\sum_{\alpha}m_{\alpha}$ and $n=\sum_{\beta}n_{\beta}$ . First we formulate WMVSP as an unconstrained submodular function minimization over a complemented modular lattice. Let ${\cal L}_{\alpha}$ and ${\cal M}_{\beta}$ denote the lattices of all vector subspaces of ${\bf F}^{m_{\alpha}}$ and of ${\bf F}^{n_{\beta}}$ , respectively. Let $R_{\alpha\beta}:=R^{A_{\alpha,\beta}}$ ; see (2.13) for the definition of $R$ . Then the condition (1.2) is written as

$$R_{\alpha\beta}(X_{\alpha},Y_{\beta})=0\quad(1\leq\alpha\leq\mu,\ 1\leq\beta\leq\nu).\tag{3.1}$$

By using $R_{\alpha\beta}$ as penalty terms, WMVSP is equivalent to the following unconstrained problem:

[TABLE]

where the penalty parameter $M>0$ is chosen as

[TABLE]

Lemma 3.1.

Any optimal solution of WMVSPR is optimal for WMVSP.

Proof.

It suffices to show that any optimal solution of WMVSPR satisfies the condition (3.1). Indeed, if $R_{\alpha\beta}(X_{\alpha},Y_{\beta})>0$ for $(X_{1},\ldots,X_{\mu},Y_{1},\ldots,Y_{\nu})$ and some $\alpha,\beta$ then the objective value of WMVSPR is positive, and $(X_{1},\ldots,X_{\mu},Y_{1},\ldots,Y_{\nu})$ is never optimal (since the trivial solution $(\{0\},\ldots,\{0\},\{0\},\ldots,\{0\})$ has the objective value zero). ∎

By (2.10), Lemmas 2.6 and 2.13, we have:

Lemma 3.2.

The objective function of WMVSPR is submodular on $\prod_{\alpha}{\cal L}_{\alpha}\times\prod_{\beta}\check{\cal M}_{\beta}$ , where the Lovász extension is given by

[TABLE]

Recall that $d_{1}$ is the function $x\mapsto d_{1}({\bf 0},x)$ . In particular, WMVSPR is equivalent to the following continuous optimization on CAT(0) space:

[TABLE]

where $K(\prod_{\alpha}{\cal L}_{\alpha}\times\prod_{\beta}\check{\cal M}_{\beta})$ is considered as $\prod_{\alpha}K({\cal L}_{\alpha})\times\prod_{\beta}K(\check{\cal M}_{\beta})$ by Lemma 2.8. By Theorem 2.12, $\overline{{\rm WMVSP}}_{R}$ is a convex optimization problem.

Lemma 3.3.

The objective function of $\overline{{\rm WMVSP}}_{R}$ is convex.

3.2 Proximal point algorithm for MVSP

We are going to apply SPPA to the following perturbed problem of $\overline{\rm WMVSP}_{R}$ :

[TABLE]

where the function $x\mapsto d({\bf 0},x)^{2}$ is denoted by $d^{2}$ , and the parameter $\epsilon>0$ is chosen as

[TABLE]

The main reason to consider $\overline{\rm WMVSP}^{+}_{R}$ is the strong-convexity of the objective function. By Lemma 2.2, we have:

Lemma 3.4.

The objective function of $\overline{{\rm WMVSP}}_{R}^{+}$ is strongly convex with parameter $2\epsilon$ .

Let $g$ and $\tilde{g}$ denote the objective functions of $\overline{\rm WMVSP}_{R}$ and of $\overline{\rm WMVSP}^{+}_{R}$ , respectively.

Lemma 3.5.

Let $z^{*}$ and $\tilde{z}$ be minimizers of $g$ and $\tilde{g}$ , respectively. For every point $z$ , it holds that

$$g(z)-g(z^{*})\leq\tilde{g}(z)-\tilde{g}(\tilde{z})+2\epsilon(m+n).$$

Proof.

This follows from $g(z)-g(z^{*})=g(z)-g(\tilde{z})+g(\tilde{z})-g(z^{*})\leq\tilde{g}(z)-\tilde{g}(\tilde{z})+\epsilon d^{2}(\tilde{z})+\tilde{g}(\tilde{z})-\tilde{g}(z^{*})+\epsilon d^{2}(z^{*})\leq\tilde{g}(z)-\tilde{g}(\tilde{z})+2\epsilon(m+n)$ , where we use $\mathop{\rm diam}K({\cal L}_{\alpha})=\sqrt{m_{\alpha}}$ and $\mathop{\rm diam}K({\cal M}_{\beta})=\sqrt{n_{\beta}}$ (Corollary 2.11). ∎

To apply SPPA, we regard the objective function $\tilde{g}$ as the sum $\sum_{i=1}^{N}f_{i}$ with $N=\mu+\nu+\mu\nu$ , where $f_{i}$ is defined by

[TABLE]

for $z=(x_{1},x_{2},\ldots,x_{\mu},y_{1},y_{2},\ldots y_{\nu})$ , $\alpha\in\{1,2,\ldots,\mu\}$ , and $\beta\in\{1,2,\ldots,\nu\}$ .

Theorem 3.6.

Let $(z_{\ell})$ be the sequence obtained by SPPA applied to $\tilde{g}=\sum_{i=1}^{N}f_{i}$ with $a:=1/2$ . For $\ell=\Omega(W^{8}m^{9}n^{9}(m+n)^{24})$ , the support of $z_{\ell}$ contains a minimizer of WMVSP.

Proof.

We first show that each summand $f_{i}$ is $L$ -Lipschitz with

[TABLE]

By Lemma 2.15, the Lipschitz constant of $d_{1}$ is $O(m_{\alpha}^{3/2})$ on $K({\cal L}_{\alpha})$ , and $O(n_{\beta}^{3/2})$ on $K({\cal M}_{\beta})$ . By Lemma 2.2, the Lipschitz constant of $d^{2}$ is $O(m_{\alpha})$ on $K({\cal L}_{\alpha})$ , and $O(n_{\beta})$ on $K({\cal M}_{\beta})$ . If $f_{i}=-C_{\alpha}d_{1}+\epsilon d^{2}$ or $-D_{\beta}d_{1}+\epsilon d^{2}$ , then the Lipschitz constant of $f_{i}$ is $O(W(n+m))$ . On the other hand, the Lipschitz constant of $f_{i}=M\overline{R_{\alpha\beta}}$ is $O(W(m+n)\min\{m_{\alpha},n_{\beta}\}(m_{\alpha}+n_{\beta})^{1/2})=O(W(m+n)^{5/2})$ .

By Theorem 2.3,

[TABLE]

Thus we have

[TABLE]

Thus, for $k=\Omega(W^{8}m^{8}n^{8}(m+n)^{24})$ , it holds $\tilde{g}(z_{kN})-\tilde{g}(\tilde{z})<1/2$ . By Lemma 3.5, we have $g(z_{kN})-g(z^{*})<1$ . By Lemma 2.14, the support of $z_{kN}$ contains a minimizer of WMVSP. ∎

By Lemma 2.14, after a polynomial number of iterations, a minimizer exists in the support of $z_{\ell}$ , where $z_{\ell}$ should be represented as a formal sum in $K(\prod_{\alpha}{\cal L}_{\alpha}\times\prod_{\beta}\check{\cal M}_{\beta})$ via the algorithm in Lemma 2.8.

Thus, our remaining task to prove Theorem 1.1 is to show that the resolvent of each summand can be computed in polynomial time.

3.3 Computation of resolvents

First we consider the resolvent of $-C_{\alpha}d_{1}+\epsilon d^{2}$ or $-D_{\beta}d_{1}+\epsilon d^{2}$ . This is an optimization problem over the orthoscheme complex of a single lattice. Let ${\cal L}$ be a complemented modular lattice of rank $n$ . It suffices to consider the following problem.

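$${\rm P1:}\quad{\rm Min.}\ -Cd_{1}(x)+\epsilon d^{2}(x)+\frac{1}{2\lambda}d(x,x^{0})^{2}\quad{\rm s.t.}\ x\in K({\cal L}),$$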

where $C\geq 0$ , $\epsilon\geq 0$ , $\lambda>0$ , and $x^{0}\in K({\cal L})$ .

Lemma 3.7.

Suppose that $x^{0}$ belongs to a maximal simplex $\sigma$ . Then the minimizer $x^{*}$ of P1 exists in $\sigma$ .

Proof.

Let $x^{0}=\sum_{i=0}^{n}\lambda_{i}p_{i}$ , where $\sigma$ corresponds to maximal chain $\{p_{i}\}$ . Let $x^{*}=\sum_{i}\mu_{i}q_{i}$ be the unique minimizer of P1. Consider a frame ${\cal F}=\langle a_{1},a_{2},\ldots,a_{n}\rangle$ containing chains $\{p_{i}\}$ and $\{q_{i}\}$ . Let $(x^{0}_{1},x^{0}_{2},\ldots,x^{0}_{n})$ and $(x^{*}_{1},x^{*}_{2},\ldots,x^{*}_{n})$ be ${\cal F}$ -coordinates of $x^{0}$ and $x^{*}$ , respectively. In $K({\cal F})\simeq[0,1]^{n}$ , the objective function of P1 is written as

[TABLE]

We can assume that $p_{i}=a_{1}\vee a_{2}\vee\cdots\vee a_{i}$ by relabeling. Then $x^{0}_{1}\geq x^{0}_{2}\geq\cdots\geq x^{0}_{n}$ . Suppose that $x^{0}_{i}>x^{0}_{i+1}$ . Then $x^{*}_{i}\geq x^{*}_{i+1}$ must hold. If $x^{*}_{i}<x^{*}_{i+1}$ , then interchanging the $i$ -coordinate and $(i+1)$ -coordinate of $x^{*}$ gives rise to another point in $K({\cal F})$ having a smaller objective value, contradicting the optimality of $x^{*}$ . Suppose that $x^{0}_{i}=x^{0}_{i+1}$ . If $x^{*}_{i}\neq x^{*}_{i+1}$ , then replace both $x_{i}^{*}$ and $x_{i+1}^{*}$ by $(x^{*}_{i}+x^{*}_{i+1})/2$ to decrease the objective value, which is a contradiction. Thus $x^{*}_{1}\geq x^{*}_{2}\geq\cdots\geq x^{*}_{n}$ . By (2.17), the original coordinate is written as $x^{*}=(1-x^{*}_{1}){\bf 0}+\sum_{i=1}^{n}(x^{*}_{i}-x^{*}_{i+1})(a_{1}\vee a_{2}\vee\cdots\vee a_{i})=\sum_{i}(x^{*}_{i}-x^{*}_{i+1})p_{i}$ (with $x_{0}^{*}=1$ and $x^{*}_{n+1}=0$ ). This means that $x^{*}$ belongs to $\sigma$ . ∎

As seen in the proof, to solve P1, it suffices to choose an arbitrary frame ${\cal F}$ containing the chain $\{p_{i}\}$ for $x^{0}=\sum_{i}\lambda_{i}p_{i}$ , and consider the following very easy Euclidean convex optimization problem:

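$${\rm P1}^{\prime}{\rm :}\quad{\rm Min.}\ \sum_{i=1}^{n}\left(-Cx_{i}+\epsilon x_{i}^{2}+\frac{1}{2\lambda}(x_{i}-x^{0}_{i})^{2}\right)\quad{\rm s.t.}\ 0\leq x_{i}\leq 1\ (i=1,2,\ldots,n),$$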

where $x$ and $x^{0}$ are represented in the ${\cal F}$ -coordinate. Then the optimal solution $x^{*}$ of P1′ is obtained coordinate-wise. Namely $x^{*}_{i}$ is $0$ , $1$ , or $(x_{i}^{0}+\lambda C)/(1+2\epsilon\lambda)$ for each $i$ .

Summarizing, P1 can be solved as follows: choose any frame ${\cal F}$ containing $\{p_{i}\}$ (for $x^{0}=\sum_{i}\lambda_{i}p_{i}$ ), obtain the ${\cal F}$ -coordinate of $x^{0}$ , solve P1′ to obtain minimizer $x^{*}\in[0,1]^{n}$ , and recover $x^{*}$ in $K({\cal L})$ by (2.17).
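The coordinate-wise step can be sketched as follows. The code assumes that, in ${\cal F}$ -coordinates, the objective is $\sum_{i}\bigl(-Cx_{i}+\epsilon x_{i}^{2}+\frac{1}{2\lambda}(x_{i}-x^{0}_{i})^{2}\bigr)$ over $[0,1]^{n}$ ; this form is an assumption, chosen because its stationary point is the value $(x_{i}^{0}+\lambda C)/(1+2\epsilon\lambda)$ stated above. The function names are hypothetical.

```python
def p1_prime_objective(x, x0, C, eps, lam):
    """Assumed separable objective of the Euclidean problem in F-coordinates."""
    return sum(-C * xi + eps * xi ** 2 + (xi - x0i) ** 2 / (2 * lam)
               for xi, x0i in zip(x, x0))


def solve_p1_prime(x0, C, eps, lam):
    """Coordinate-wise minimizer over [0,1]^n.

    Each coordinate is a one-dimensional strictly convex quadratic, so its
    constrained minimizer is the stationary point clamped to [0, 1].
    """
    return [min(1.0, max(0.0, (x0i + lam * C) / (1 + 2 * eps * lam)))
            for x0i in x0]
```

Since $C\geq 0$ and $x^{0}_{i}\geq 0$ , the stationary point is never negative, so the lower clamp matters only as a safeguard against rounding.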

Theorem 3.8.

The resolvent of $-Cd_{1}+\epsilon d^{2}$ is computed in polynomial time.

Next we consider the computation of the resolvent of $M\overline{R_{\alpha\beta}}$ . Let $U$ and $V$ be vector spaces over field ${\bf F}$ of dimensions $m$ and $n$ , respectively. Let $A:U\times V\to{\bf F}$ be a bilinear form. Let ${\cal L}$ and ${\cal M}$ be the (complemented modular) lattices of all vector subspaces of vector spaces $U$ and $V$ , respectively, where the partial order is the inclusion order. Let $\check{\cal M}$ be the opposite lattice, which is also complemented modular. Recall the submodular function $R:{\cal L}\times\check{\cal M}\to{\bf Z}$ defined by (2.13), and let $\overline{R}:K({\cal L}\times\check{\cal M})\to{\bf R}$ be the Lovász extension of $R$ . For the computation of the resolvent of $M\overline{R_{\alpha\beta}}$ , it suffices to consider the following problem:

[TABLE]

where $\lambda>0$ , $x^{0}\in K({\cal L})$ , and $y^{0}\in K(\check{\cal M})$ . Recall Lemma 2.8 for $K({\cal L}\times\check{\cal M})\simeq K({\cal L})\times K(\check{\cal M})$ . As in the case of P1, we reduce P2 to a convex optimization over $[0,1]^{m}\times[0,1]^{n}$ by taking a special frame $\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ of ${\cal L}\times\check{\cal M}$ .

For $X\in{\cal L}$ , let $X^{\bot}$ denote the subspace in $\check{\cal M}$ defined by

[TABLE]

Namely $X^{\bot}$ is the orthogonal subspace of $X$ with respect to the bilinear form $A$ . For $Y\in\check{\cal M}$ , let $Y^{\bot}\in{\cal L}$ be defined similarly.

Let $r:=\mathop{\rm rank}A$ . An $A$ -orthogonal frame ${\cal F}=\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ is a frame of ${\cal L}\times\check{\cal M}$ satisfying the following conditions:

•

$\langle e_{1},e_{2},\ldots,e_{m}\rangle$ is a frame of ${\cal L}$ .

•

$\langle f_{1},f_{2},\ldots,f_{n}\rangle$ is a frame of $\check{\cal M}$ .

•

$e_{r+1}\vee e_{r+2}\vee\cdots\vee e_{m}=V^{\bot}$ .

•

$f_{1}\vee f_{2}\vee\cdots\vee f_{r}=U^{\bot}$ ( $\Leftrightarrow$ $f_{1}\cap f_{2}\cap\cdots\cap f_{r}=U^{\bot}$ ).

•

$f_{i}={e_{i}}^{\bot}$ for $i=1,2,\ldots,r$ .

For an $A$ -orthogonal frame ${\cal F}=\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ , the Lovász extension $\overline{R}$ of $R$ takes a much simpler form on $K({\cal F})$ as follows, where the proof is given in Section 3.4.

Theorem 3.9.

Let ${\cal F}=\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ be an $A$ -orthogonal frame. The restriction of the Lovász extension $\overline{R}$ to $K({\cal F})\simeq[0,1]^{m}\times[0,1]^{n}$ is written as

$$\overline{R}(x,y)=\sum_{i=1}^{r}\max\{0,x_{i}-y_{i}\},$$

where $(x_{1},x_{2},\ldots,x_{m})$ is the $\langle e_{1},e_{2},\ldots,e_{m}\rangle$ -coordinate of $x$ and $(y_{1},y_{2},\ldots,y_{n})$ is the $\langle f_{1},f_{2},\ldots,f_{n}\rangle$ -coordinate of $y$ .

The main ingredient in solving P2 is the following, where the proof is given in Section 3.4. Figure 5 illustrates an $A$ -orthogonal frame in this theorem.

Theorem 3.10.

Let ${\cal X}$ and ${\cal Y}$ be maximal chains corresponding to maximal simplices containing $x^{0}$ and $y^{0}$ , respectively.

(1)

There exists an $A$ -orthogonal frame ${\cal F}=\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ satisfying

[TABLE]

and such a frame can be found in polynomial time.

(2)

For an $A$ -orthogonal frame ${\cal F}$ satisfying (3.4), the minimizer $(x^{*},y^{*})$ of P2 exists in $K({\cal F})$ .

Assume Theorems 3.9 and 3.10. For an $A$ -orthogonal frame satisfying (3.4), the problem P2 is equivalent to

[TABLE]

Again this problem is easily solved coordinate-wise. Obviously $x^{*}_{i}=x^{0}_{i}$ and $y^{*}_{i}=y^{0}_{i}$ for $i>r$ . For $i\leq r$ , $(x^{*}_{i},y^{*}_{i})$ is the minimizer of the following $2$ -dimensional problem:

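$${\rm Min.}\ \max\{0,x_{i}-y_{i}\}+\frac{1}{2\lambda}\left((x_{i}-x^{0}_{i})^{2}+(y_{i}-y^{0}_{i})^{2}\right)\quad{\rm s.t.}\ 0\leq x_{i}\leq 1,\ 0\leq y_{i}\leq 1.$$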

Obviously this can be solved in constant time.
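A closed form for the two-dimensional step can be sketched as follows. It assumes the objective $\max\{0,x-y\}+\frac{1}{2\lambda}\bigl((x-x^{0})^{2}+(y-y^{0})^{2}\bigr)$ over $[0,1]^{2}$ with $x^{0},y^{0}\in[0,1]$ , which is the form suggested by Lemma 3.15 on an $A$ -orthogonal frame; the function names are hypothetical.

```python
def objective_2d(x, y, x0, y0, lam):
    """Assumed two-dimensional objective for one index i <= r."""
    return max(0.0, x - y) + ((x - x0) ** 2 + (y - y0) ** 2) / (2 * lam)


def solve_2d(x0, y0, lam):
    """Minimizer of objective_2d over [0,1]^2, assuming 0 <= x0, y0 <= 1 and lam > 0.

    Three cases, obtained from the subgradient optimality condition:
    - x0 <= y0: the penalty term is inactive and the point stays in place;
    - x0 - y0 >= 2*lam: both coordinates move by lam towards each other;
    - otherwise: the minimizer lies on the diagonal x = y, at the midpoint.
    All three candidates stay inside [0,1]^2, so no clamping is needed.
    """
    if x0 <= y0:
        return (x0, y0)
    if x0 - y0 >= 2 * lam:
        return (x0 - lam, y0 + lam)
    mid = (x0 + y0) / 2
    return (mid, mid)
```

The case distinction takes a constant number of arithmetic operations, in line with the constant-time claim above.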

Thus we can solve P2 as follows. Choose an $A$ -orthogonal frame ${\cal F}$ satisfying (3.4), solve P2′ to obtain the minimizer $(x^{*},y^{*})\in[0,1]^{m}\times[0,1]^{n}$ , and recover $(x^{*},y^{*})$ in $K({\cal L})\times K({\cal M})$ .

Theorem 3.11.

The resolvent of $\overline{R}$ is computed in polynomial time.

Combining Theorems 3.6, 3.8, and 3.11, the proof of Theorem 1.1 is completed.

Remark 3.12.

In the above SPPA, the required bit-length for coefficients of $z\in K({\cal L}\times\check{\cal M})$ is bounded polynomially in $n,m,W$ . Indeed, the transformation between the original coordinate and an ${\cal F}$ -coordinate corresponds to multiplying by a triangular matrix with entries in $\{0,1,-1\}$ ; see (2.17). In each iteration $k$ , the optimal solution of quadratic problem P1′ or P2′ is obtained by adding (fixed) rational functions in $n,m,W,k$ to (current points) $x_{i}^{0},y_{i}^{0}$ and multiplying by a (fixed) $2\times 2$ rational matrix in $n,m,W,k$ . Consequently the bit increase is bounded as required.

On the other hand, the bit-length estimation for a basis of a vector subspace appearing in the algorithm is not clear.

Remark 3.13.

Our algorithm is easily adapted to compute nc-rank in the same time complexity; see Introduction for nc-rank. Indeed, by (1.6), it suffices to solve

[TABLE]

where ${\cal L}$ and ${\cal M}$ are the lattices of all vector subspaces of ${\bf F}^{m}$ and ${\bf F}^{n}$ , respectively, and $M:=n+m+1$ . As above, we may consider the following perturbed continuous relaxation:

[TABLE]

where $\epsilon:=1/(4(n+m))$ . Setting $f_{i}(x,y):=M\overline{R^{A_{i}}}(x,y)$ for $1\leq i\leq N$ , $f_{N+1}(x,y):=-d_{1}(x)+\epsilon d^{2}(x)$ , and $f_{N+2}(x,y):=-d_{1}(y)+\epsilon d^{2}(y)$ , the SPPA and the above analysis are applicable.

3.4 Proof

We start with basic properties of $(\cdot)^{\bot}$ , which follow from elementary linear algebra.

Lemma 3.14.

(1)

If $X\subseteq X^{\prime}$ , then $X^{\bot}\supseteq{X^{\prime}}^{\bot}$ and $r(X^{\bot})-r({X^{\prime}}^{\bot})\leq r(X^{\prime})-r(X).$

(2)

$(X+X^{\prime})^{\bot}=X^{\bot}\cap{X^{\prime}}^{\bot}$ .

(3)

$X^{\bot\bot}\supseteq X$ .

(4)

$X^{\bot\bot\bot}=X^{\bot}$ .

Next we give an alternative expression of $R$ by using $(\cdot)^{\bot}$ . Let $\check{r}$ be the rank function of $\check{\cal M}$ . Namely $\check{r}(Y)=n-\dim Y$ .

Lemma 3.15.

$R(X,Y)=r(X)-r(X\wedge Y^{\bot})=\check{r}(Y\vee X^{\bot})-\check{r}(Y)(=r(Y)-r(Y\cap X^{\bot}))$ .

Proof.

Consider bases $\{a_{1},a_{2},\ldots,a_{\ell}\}$ of $X$ and $\{b_{1},b_{2},\ldots,b_{\ell^{\prime}}\}$ of $Y$ . We can assume that $\{a_{k+1},a_{k+2},\ldots,a_{\ell}\}$ is a base of $X\cap Y^{\bot}$ . Consider the matrix representation $(A(a_{i},b_{j}))$ of $A|_{X\times Y}$ with respect to the bases. Since $A(a_{i},Y)=\{0\}$ for $i>k$ , the submatrix consisting of the $(k+1)$ -th to $\ell$ -th rows is a zero matrix. On the other hand, the submatrix consisting of the first $k$ rows must have full row rank $k$ ; otherwise a nontrivial linear combination of $a_{1},a_{2},\ldots,a_{k}$ would belong to $X\cap Y^{\bot}$ , contradicting the choice of the bases. Thus the rank $R(X,Y)$ of $(A(a_{i},b_{j}))$ is $k=\ell-(\ell-k)=r(X)-r(X\wedge Y^{\bot})$ . The same consideration shows the second equality. ∎

Proof of Theorem 3.9.

An $A$ -orthogonal frame $\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle=\langle e_{1},e_{2},\ldots,e_{m}\rangle\times\langle f_{1},f_{2},\ldots,f_{n}\rangle$ is naturally identified with Boolean lattice $2^{\{1,2,\ldots,m\}}\times 2^{\{1,2,\ldots,n\}}$ (by $(X,Y)\mapsto\bigvee_{i\in X}e_{i}\vee\bigvee_{j\in Y}f_{j}$ ). Then $r=|\cdot|$ , $\check{r}=|\cdot|$ , and $\vee=\cup$ . Notice that ${e_{i}}^{\bot}=f_{i}$ if $i\leq r$ and ${e_{i}}^{\bot}=V$ if $i>r$ . The latter fact follows from $e_{i}\subseteq V^{\bot}\Rightarrow{e_{i}}^{\bot}\supseteq V^{\bot\bot}\supseteq V$ . This implies that $X^{\bot}=X\cap\{1,2,\ldots,r\}$ for $X\in 2^{\{1,2,\ldots,m\}}$ . By Lemma 3.15, we have

[TABLE]

Identify $2^{\{1,2,\ldots,m\}}\times 2^{\{1,2,\ldots,n\}}$ with $\{0,1\}^{m}\times\{0,1\}^{n}$ by $(X,Y)\mapsto(1_{X},1_{Y})$ . Then $R$ is also written as

[TABLE]

The Lovász extension $\overline{R}$ is equal to the function obtained from $R$ by extending the domain to $[0,1]^{m}\times[0,1]^{n}$ .
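The closed form can be sanity-checked numerically on the Boolean lattice. The sketch below assumes, following Lemma 3.15 and $X^{\bot}=X\cap\{1,2,\ldots,r\}$ , that $R(X,Y)=|Y\cup(X\cap\{1,\ldots,r\})|-|Y|$ on the frame, and compares its Lovász extension (computed by sorting all $m+n$ coordinates) with $\sum_{i\leq r}\max\{0,x_{i}-y_{i}\}$ . Indices are 0-based and all names are hypothetical.

```python
import random
from fractions import Fraction


def lovasz(f, point):
    """Lovasz extension of a set function f at a point given as {ground element: value}."""
    order = sorted(point, key=lambda g: point[g], reverse=True)
    vals = [point[g] for g in order] + [0]
    total = (1 - vals[0]) * f(frozenset())
    for k in range(1, len(order) + 1):
        total += (vals[k - 1] - vals[k]) * f(frozenset(order[:k]))
    return total


def R_boolean(S, r):
    """R(X, Y) = |(X intersect {0,...,r-1}) minus Y| for S encoding the pair (X, Y)."""
    X = {i for (tag, i) in S if tag == "e"}
    Y = {j for (tag, j) in S if tag == "f"}
    return len({i for i in X if i < r} - Y)


def closed_form(x, y, r):
    """Sum over i < r of max(0, x_i - y_i)."""
    return sum(max(0, x[i] - y[i]) for i in range(r))


def check_closed_form(m, n, r, trials=200, seed=0):
    """Compare both expressions on random rational points of [0,1]^m x [0,1]^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [Fraction(rng.randint(0, 10), 10) for _ in range(m)]
        y = [Fraction(rng.randint(0, 10), 10) for _ in range(n)]
        point = {("e", i): x[i] for i in range(m)}
        point.update({("f", j): y[j] for j in range(n)})
        if lovasz(lambda S: R_boolean(S, r), point) != closed_form(x, y, r):
            return False
    return True
```

Exact rational arithmetic makes the comparison an equality test; the agreement reflects that the Lovász extension is linear in the function and that each summand depends on only two coordinates.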

Proof of Theorem 3.10 (1).

By Lemma 2.5, we can find (in polynomial time) a frame $\langle e_{1},e_{2},\ldots,e_{m}\rangle$ containing two chains ${\cal X}$ and ${\cal Y}^{\bot}$ . Suppose that ${\cal X}=\{X_{i}\}_{i=0}^{m}$ and ${\cal Y}=\{Y_{i}\}_{i=0}^{n}$ . We can assume that $e_{r+1}\vee e_{r+2}\vee\cdots\vee e_{m}={Y_{0}}^{\bot}=V^{\bot}$ by relabeling. Let $f_{i}:={e_{i}}^{\bot}$ for $i=1,2,\ldots,r$ . Then $f_{1}\vee f_{2}\vee\cdots\vee f_{r}=U^{\bot}$ holds. Indeed, by Lemma 3.14 (2), we have $U^{\bot}=(e_{1}\vee e_{2}\vee\cdots\vee e_{m})^{\bot}={e_{1}}^{\bot}\vee{e_{2}}^{\bot}\vee\cdots\vee{e_{m}}^{\bot}=f_{1}\vee f_{2}\vee\cdots\vee f_{r}\vee V\vee\cdots\vee V=f_{1}\vee f_{2}\vee\cdots\vee f_{r}$ , where the joins of the ${e_{i}}^{\bot}$ and $f_{i}$ are taken in $\check{\cal M}$ (i.e., they are intersections) and $V$ is the minimum element of $\check{\cal M}$ .

Consider the chain ${\cal Y}^{\bot\bot}$ in $\check{\cal M}$ . Then ${\cal Y}^{\bot\bot}\subseteq\langle f_{1},f_{2},\ldots,f_{r}\rangle$ . Indeed, each $Y_{i}^{\bot}$ is a join of a subset of $e_{1},e_{2},\ldots,e_{m}$ . Taking $(\cdot)^{\bot}$ as above, $Y_{i}^{\bot\bot}$ is represented as a join of a subset of $f_{1},f_{2},\ldots,f_{r}$ . Consider a consecutive pair $Y_{i-1},Y_{i}$ in ${\cal Y}$ . Consider ${Y_{i-1}}^{\bot\bot}$ and ${Y_{i}}^{\bot\bot}$ . Then, by Lemma 3.14 (3), ${Y_{i-1}}^{\bot\bot}\preceq Y_{i-1}$ and ${Y_{i}}^{\bot\bot}\preceq Y_{i}$ . Suppose that ${Y_{i-1}}^{\bot\bot}\neq{Y_{i}}^{\bot\bot}$ . Then ${Y_{i-1}}^{\bot\bot}\prec_{1}{Y_{i}}^{\bot\bot}$ (by Lemma 3.14 (1)). Thus for some $f_{j}$ $(1\leq j\leq r)$ , it holds ${Y_{i}}^{\bot\bot}=f_{j}\vee{Y_{i-1}}^{\bot\bot}$ . Here $f_{j}\not\preceq Y_{i-1}$ must hold. Otherwise ${Y_{i-1}}^{\bot\bot}\succeq{f_{j}}^{\bot\bot}={e_{j}}^{\bot\bot\bot}=f_{j}$ , which contradicts ${Y_{i-1}}^{\bot\bot}\prec_{1}{Y_{i}}^{\bot\bot}=f_{j}\vee{Y_{i-1}}^{\bot\bot}$ . Thus $Y_{i}=Y_{i-1}\vee f_{j}$ . Therefore, for each $i$ with ${Y_{i-1}}^{\bot\bot}={Y_{i}}^{\bot\bot}$ , we can choose an atom $f$ with $Y_{i}=f\vee Y_{i-1}$ to add to $f_{1},f_{2},\ldots,f_{r}$ , and obtain a required frame $\langle f_{1},f_{2},\ldots f_{n}\rangle$ (containing ${\cal X}^{\bot}$ and ${\cal Y}$ ).

Proof of Theorem 3.10 (2).

The proof is long. An outline of the proof with an intuition is explained as follows:

• Imagine the geodesic $\gamma$ emanating from $(x^{0},y^{0})$ to the minimizer $(x^{*},y^{*})$ of P2.

• In the generic case, the geodesic meets maximal simplices $K_{0},K_{1},K_{2},\ldots,K_{\ell}\subseteq K({\cal L}\times\check{\cal M})$ in order so that $K_{i}\cap K_{i+1}$ has dimension $n+m-1$ . This yields a sequence (gallery) of corresponding maximal chains ${\cal C}_{0},{\cal C}_{1},{\cal C}_{2},\ldots,{\cal C}_{\ell}$ in ${\cal L}\times\check{\cal M}$ .

• This gallery must have a special property (Lemma 3.19), called the $A$ -orthogonality, which we will introduce.

• On the other hand, any $A$ -orthogonal gallery ${\cal C}_{0},{\cal C}_{1},{\cal C}_{2},\ldots,{\cal C}_{\ell}$ belongs to the product of sublattices generated by ${\cal X}\cup{\cal Y}^{\bot}$ and ${\cal Y}\cup{\cal X}^{\bot}$ , where ${\cal X}$ and ${\cal Y}$ are the projections of the initial ${\cal C}_{0}$ to ${\cal L}$ and to $\check{\cal M}$ , respectively (Lemma 3.18).

• In the generic case, the above imply that $(x^{*},y^{*})$ belongs to the product of sublattices generated by ${\cal X}\cup{\cal Y}^{\bot}$ and ${\cal Y}\cup{\cal X}^{\bot}$ , where ${\cal X}$ and ${\cal Y}$ are the supports of $x^{0}$ and $y^{0}$ , respectively. This implies Theorem 3.10 (2).

• By perturbation, we remove the genericity assumption (Lemma 3.20).

To formulate the $A$ -orthogonality, we start with a general lemma of a modular lattice.

Lemma 3.16.

Let ${\cal L}$ be a modular lattice. Let $p,p^{\prime}\in{\cal L}$ with $p\prec_{2}p^{\prime}$ , and let ${\cal C}$ be a chain such that

[TABLE]

is nonempty. Then there is a unique element $u^{*}$ with $p\prec_{1}u^{*}\prec_{1}p^{\prime}$ such that for all $q\in{\cal C}_{1}$ and all $u\neq u^{*}$ with $p\prec_{1}u\prec_{1}p^{\prime}$ it holds

[TABLE]

where $u^{*}$ is equal to $p\vee(p^{\prime}\wedge q)(=p^{\prime}\wedge(p\vee q))$ for all $q\in{\cal C}_{1}$ .

Intuitively speaking, this $u^{*}$ is the element closest to ${\cal C}$ among elements $u$ with $p\prec_{1}u\prec_{1}p^{\prime}$ ; see Figure 6. The element $u^{*}$ plays an important role, and is denoted by $g(p,p^{\prime},{\cal C})$ .

Proof.

Let $q\in{\cal C}_{1}$ . Then $p\wedge q\prec_{1}p^{\prime}\wedge q$ . Let $u^{*}:=p\vee(p^{\prime}\wedge q)$ . Then $p\prec_{1}u^{*}\prec_{1}p^{\prime}$ and $p^{\prime}\wedge q=u^{*}\wedge q$ . Thus

[TABLE]

Consider $u\neq u^{*}$ with $p\prec_{1}u\prec_{1}p^{\prime}$ . If $p^{\prime}\wedge q\preceq u$ , then $p^{\prime}\wedge q\preceq u\wedge u^{*}=p$ and $p^{\prime}\wedge q=p\wedge q$ , which contradicts $p\wedge q\prec_{1}p^{\prime}\wedge q$ . Thus $p^{\prime}\wedge q\not\preceq u$ . Therefore $u\vee(p^{\prime}\wedge q)=p^{\prime}$ , and $p\wedge q\preceq u\wedge q\prec_{1}p^{\prime}\wedge q$ . Necessarily $p\wedge q=u\wedge q$ . Hence we have

[TABLE]

which also implies the second equality by $r(u)=r(u^{*})$ .

Next we show that $u^{*}$ is independent of $q$ . Consider another $q^{\prime}\in{\cal C}_{1}$ . We may assume that $q\prec q^{\prime}$ . Let $u^{**}:=p\vee(p^{\prime}\wedge q^{\prime})$ . If $u^{*}$ and $u^{**}$ are different, then $p^{\prime}\wedge q\preceq u^{*}\wedge u^{**}=p$ ; this is a contradiction (to $p\wedge q\prec_{1}p^{\prime}\wedge q$ ). ∎

In the case of a Boolean lattice, $g$ is simply described as follows:

Lemma 3.17.

Suppose that ${\cal L}$ is a Boolean lattice $2^{\{1,2,\ldots,m\}}$ . For $X,X^{\prime}\subseteq\{1,2,\ldots,m\}$ with $X^{\prime}\setminus X=\{a,b\}$ and a chain ${\cal C}$ , it holds $g(X,X^{\prime},{\cal C})=X\cup\{a\}$ if and only if ${\cal C}$ has a member $Z$ with $a\in Z\not\ni b$ .

We say that ${\cal C}$ contains $a$ before $b$ if ${\cal C}$ has a member $Z$ with $a\in Z\not\ni b$ .
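The Boolean case of Lemma 3.17 can be sketched in a few lines of Python. This is only an illustration: the function name `g_boolean` is hypothetical, and the sketch assumes (following the proof of Lemma 3.16) that the relevant members of the chain are those containing exactly one of $a,b$ , so that $u^{*}=p\vee(p^{\prime}\wedge q)$ becomes $X\cup(X^{\prime}\cap Z)$ .

```python
def g_boolean(X, X_prime, chain):
    """Sketch of g(X, X', C) in the Boolean lattice 2^{1,...,m}.

    X, X_prime: sets with X' \\ X = {a, b}.
    chain: iterable of sets (a chain under inclusion).
    Returns X ∪ {a} if the chain contains a before b, X ∪ {b} if it
    contains b before a, and None if no member separates a from b
    (g is then not defined).
    """
    X, X_prime = frozenset(X), frozenset(X_prime)
    extra = X_prime - X
    if not (X <= X_prime and len(extra) == 2):
        raise ValueError("need X ⊆ X' with |X' \\ X| = 2")
    for Z in chain:
        Z = frozenset(Z)
        # Z separates a and b iff it contains exactly one of them.
        if len(extra & Z) == 1:
            # u* = p ∨ (p' ∧ q) reads X ∪ (X' ∩ Z) for sets.
            return X | (X_prime & Z)
    return None
```

For instance, with $X=\{1\}$ , $X^{\prime}=\{1,2,3\}$ and the chain $\emptyset\subset\{3\}\subset\{2,3\}\subset\{1,2,3\}$ , the chain contains $3$ before $2$ , so the sketch returns $\{1,3\}$ .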

Let ${\cal L},{\cal M}$ and $A$ be as before. Let ${\cal C}=\{(X_{i},Y_{i})\}_{i=0}^{n+m}$ be a maximal chain of ${\cal L}\times\check{\cal M}$ . Then it holds that

[TABLE]

where

[TABLE]

for each $i\in\{1,2,\ldots,m+n\}$ . Let ${\cal X}_{\cal C}$ denote the maximal chain of ${\cal L}$ obtained by projecting ${\cal C}$ to ${\cal L}$ :

[TABLE]

Similarly, let ${\cal Y}_{\cal C}$ denote the maximal chain of $\check{\cal M}$ defined by

[TABLE]

Let ${\cal C}^{\prime}=\{(X_{i}^{\prime},Y_{i}^{\prime})\}$ be another maximal chain of ${\cal L}\times\check{\cal M}$ . Two chains ${\cal C}$ and ${\cal C^{\prime}}$ are said to be adjacent if $|{\cal C}\cap{\cal C}^{\prime}|=m+n-1$ . If ${\cal C}$ and ${\cal C}^{\prime}$ are adjacent, then there uniquely exists an index $i\in\{1,2,\ldots,m+n\}$ such that $(X_{j},Y_{j})=(X_{j}^{\prime},Y_{j}^{\prime})$ holds for all $j\in\{0,1,2,\ldots,m+n\}\setminus\{i\}$ and one of the following holds for $(X_{i},Y_{i})\neq(X_{i}^{\prime},Y_{i}^{\prime})$ :

(0) $X_{i-1}\prec_{1}X_{i+1}$ , $Y_{i-1}\prec_{1}Y_{i+1}$ , and $\{(X_{i},Y_{i}),(X^{\prime}_{i},Y^{\prime}_{i})\}=\{(X_{i-1},Y_{i+1}),(X_{i+1},Y_{i-1})\}$ .

(1) $X_{i-1}\prec_{1}X_{i}\prec_{1}X_{i+1}$ , $X_{i-1}\prec_{1}X^{\prime}_{i}\prec_{1}X_{i+1}$ , and $Y_{i-1}=Y_{i}=Y_{i}^{\prime}=Y_{i+1}$ .

(2) $Y_{i-1}\prec_{1}Y_{i}\prec_{1}Y_{i+1}$ , $Y_{i-1}\prec_{1}Y^{\prime}_{i}\prec_{1}Y_{i+1}$ , and $X_{i-1}=X_{i}=X_{i}^{\prime}=X_{i+1}$ .

${\cal C}$ and ${\cal C}^{\prime}$ are said to be $0$-adjacent if (0) holds, ${\cal L}$ -adjacent if (1) holds, and ${\cal M}$ -adjacent if (2) holds.
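The three kinds of adjacency can be told apart by looking only at the neighbours of the single position where the two chains differ. The following Python sketch illustrates this; the function name `classify_adjacency` and the encoding of a maximal chain as a list of pairs $(X_{i},Y_{i})$ of hashable objects are assumptions made for the example, and the inputs are assumed to be valid maximal chains.

```python
def classify_adjacency(chain, other):
    """Classify two maximal chains of L x M-check.

    Each chain is a list of pairs (X_i, Y_i), i = 0, ..., m + n.
    Returns "0" for case (0), "L" for case (1), "M" for case (2),
    and None if the chains do not differ in exactly one interior position.
    """
    if len(chain) != len(other):
        return None
    diff = [i for i, (p, q) in enumerate(zip(chain, other)) if p != q]
    if len(diff) != 1 or not 0 < diff[0] < len(chain) - 1:
        return None
    i = diff[0]
    # Does the X-component (resp. Y-component) change between i-1 and i+1?
    x_moves = chain[i - 1][0] != chain[i + 1][0]
    y_moves = chain[i - 1][1] != chain[i + 1][1]
    if x_moves and y_moves:
        return "0"  # one step in L and one step in M, taken in either order
    return "L" if x_moves else "M"
```

Only equality tests are used, so the sketch does not depend on the direction of the partial order on $\check{\cal M}$ .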

Also ${\cal C}$ and ${\cal C}^{\prime}$ are said to be $A$ -orthogonally ${\cal L}$ -adjacent from ${\cal C}$ to ${\cal C}^{\prime}$ if (1) holds with

[TABLE]

and $A$ -orthogonally ${\cal M}$ -adjacent from ${\cal C}$ to ${\cal C}^{\prime}$ if (2) holds with

[TABLE]

Intuitively speaking, if ${\cal C}$ and ${\cal C}^{\prime}$ are $A$ -orthogonally ${\cal L}$ -adjacent from ${\cal C}$ to ${\cal C}^{\prime}$ , then the transition from ${\cal C}$ to ${\cal C}^{\prime}$ is close to ${{\cal Y}_{\cal C}}^{\bot}$ (with nonincreasing $R$ ).

A sequence $({\cal C}_{0},{\cal C}_{1},\ldots,{\cal C}_{\ell})$ is called a gallery if for each $i\in\{1,2,\ldots,\ell\}$ , ${\cal C}_{i-1}$ and ${\cal C}_{i}$ are adjacent, and is called an $A$ -orthogonal gallery if for each $i\in\{1,2,\ldots,\ell\}$ , ${\cal C}_{i-1}$ and ${\cal C}_{i}$ are $0$-adjacent, or $A$ -orthogonally ${\cal L}$ - or ${\cal M}$ -adjacent from ${\cal C}_{i-1}$ to ${\cal C}_{i}$ .

Lemma 3.18.

For an $A$ -orthogonal gallery $({\cal C}={\cal C}_{0},{\cal C}_{1},\ldots,{\cal C}_{\ell})$ , it holds

[TABLE]

where $\langle{\cal Z}\rangle$ denotes the sublattice generated by ${\cal Z}$ .

Proof.

Let ${\cal X}_{k}:={\cal X}_{{\cal C}_{k}}$ and ${\cal Y}_{k}:={\cal Y}_{{\cal C}_{k}}$ . Since ${\cal C}_{k}\subseteq\langle{\cal X}_{k}\cup{{\cal Y}_{k}}^{\bot}\rangle\times\langle{{\cal X}_{k}}^{\bot}\cup{\cal Y}_{k}\rangle$ , it suffices to show

[TABLE]

It is obvious when ${\cal C}_{k}$ and ${\cal C}_{k+1}$ are $0$-adjacent, since ${\cal X}_{k}={\cal X}_{k+1}$ and ${\cal Y}_{k}={\cal Y}_{k+1}$ . We may assume that ${\cal C}_{k}$ and ${\cal C}_{k+1}$ are ${\cal L}$ -adjacent. It suffices to show that $X_{i}^{\prime}\in{\cal X}_{k+1}\setminus{\cal X}_{k}$ belongs to $\langle{\cal X}_{k}\cup{{\cal Y}_{k}}^{\bot}\rangle$ , and that ${X_{i}^{\prime}}^{\bot}$ belongs to $\langle{{\cal X}_{k}}^{\bot}\cup{\cal Y}_{k}\rangle$ . The former claim follows from Lemma 3.16, by which $X_{i}^{\prime}$ is represented as $X_{i-1}\vee(X_{i+1}\wedge{Y_{j}}^{\bot})$ for some ${Y_{j}}^{\bot}\in{{\cal Y}_{k}}^{\bot}$ .

We show the latter claim. By ${X_{i-1}}^{\bot}\preceq{X^{\prime}_{i}}^{\bot}\preceq{X_{i+1}}^{\bot}$ and $r({X_{i-1}}^{\bot})-r({X_{i+1}}^{\bot})\leq r(X_{i+1})-r(X_{i-1})=2$ (Lemma 3.14), it suffices to consider the case where ${X_{i-1}}^{\bot}\prec_{1}{X^{\prime}_{i}}^{\bot}\prec_{1}{X_{i+1}}^{\bot}$ . By $r(X^{\prime}_{i})-r(X_{i}^{\prime}\wedge{Y_{j}}^{\bot})=r(X_{i})-r(X_{i}\wedge{Y_{j}}^{\bot})-1$ and Lemma 3.15, it holds that

[TABLE]

This in turn implies that $r(Y_{j})-r(Y_{j}\cap{X^{\prime}_{i}}^{\bot})=r(Y_{j})-r(Y_{j}\cap{X_{i}}^{\bot})-1$ . By Lemma 3.16, ${X_{i}^{\prime}}^{\bot}$ must be $g({X_{i-1}}^{\bot},{X_{i+1}}^{\bot},{\cal Y}_{k})$ , and belongs to $\langle{{\cal X}_{k}}^{\bot}\cup{\cal Y}_{k}\rangle$ as above. ∎

For a geodesic $[z,z^{\prime}]\subseteq K({\cal L}\times\check{\cal M})$ and $t\in[0,1]$ , let $z^{t}:=(1-t)z+tz^{\prime}$ , and let $K_{t}$ denote the simplex containing $z^{t}$ in its relative interior. The collection $\{K_{t}\}_{t\in[0,1]}$ of simplices is finite, since $[z,z^{\prime}]$ belongs to (finite complex) $K({\cal F})\simeq[0,1]^{n+m}$ for some frame ${\cal F}$ . A geodesic $[z,z^{\prime}]$ is said to be generic if $K_{0}$ has dimension $n+m$ , and $K_{t}$ has dimension $n+m$ or $n+m-1$ for $t\in(0,1)$ . A generic geodesic $[z,z^{\prime}]$ gives rise to a gallery $({\cal C}_{0},{\cal C}_{1},\ldots,{\cal C}_{\ell})$ as follows. Let ${\cal C}_{0}$ be the maximal chain corresponding to the simplex $K_{0}$ containing $z$ in its interior. For some $t_{1}>0$ , the point $z^{t_{1}}$ reaches the boundary of $K_{0}$ , which is a face of $K_{0}$ having dimension $n+m-1$ . For $t\in(t_{1},t_{2})$ for some $t_{2}>t_{1}$ , the point $z^{t}$ lies in the next maximal simplex $K_{1}$ adjacent to $K_{0}$ . Let ${\cal C}_{1}$ denote the maximal chain corresponding to $K_{1}$ . Then ${\cal C}_{0}$ and ${\cal C}_{1}$ are adjacent. As $t\rightarrow 1$ , we obtain a gallery $({\cal C}_{0},{\cal C}_{1},\ldots,{\cal C}_{\ell})$ . The main lemma is the following.

Lemma 3.19.

Let $(x^{*},y^{*})$ be the minimizer of P2. If geodesic $[(x^{0},y^{0}),(x^{*},y^{*})]$ is generic, then the corresponding gallery $({\cal C}_{0},{\cal C}_{1},\ldots,{\cal C}_{\ell})$ is $A$ -orthogonal.

In particular, if $[(x^{0},y^{0}),(x^{*},y^{*})]$ is generic, then the chain ${\cal C}_{\ell}$ including the support of the minimizer $(x^{*},y^{*})$ belongs to the sublattice $\langle{\cal X}\cup{\cal Y}^{\bot}\rangle\times\langle{\cal X}^{\bot}\cup{\cal Y}\rangle$ (by Lemma 3.18), which is contained in an $A$ -orthogonal frame ${\cal F}$ satisfying (3.4). This proves Theorem 3.10 (2) in the generic case.

Proof.

We may assume $\ell\geq 1$ . Let $\ell^{\prime}$ be the minimum index such that $({\cal C}_{\ell^{\prime}},{\cal C}_{\ell^{\prime}+1},\ldots,{\cal C}_{\ell})$ is $A$ -orthogonal. If $\ell^{\prime}=0$ , then the gallery is $A$ -orthogonal as required. Suppose to the contrary that $\ell^{\prime}>0$ . We may assume that ${\cal C}_{\ell^{\prime}-1}$ and ${\cal C}_{\ell^{\prime}}$ are ${\cal L}$ -adjacent and are not $A$ -orthogonal. Let ${\cal X}=\{X_{i}\}_{i=0}^{n+m}:={\cal X}_{{\cal C}_{\ell^{\prime}-1}}$ , ${\cal X}^{\prime}:={\cal X}_{{\cal C}_{\ell^{\prime}}}$ and ${\cal Y}:={\cal Y}_{{\cal C}_{\ell^{\prime}-1}}={\cal Y}_{{\cal C}_{\ell^{\prime}}}$ . For some $j$ , we have ${\cal X}^{\prime}={\cal X}\setminus\{X_{j}\}\cup\{X_{j}^{\prime}\}$ , where $X_{j}^{\prime}\neq g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ (or $g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ is not defined). Consider an $A$ -orthogonal frame $\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ containing $\langle{\cal X}^{\prime}\cup{\cal Y}^{\bot}\rangle\times\langle{{\cal X}^{\prime}}^{\bot}\cup{\cal Y}\rangle$ .

For some $t\in(0,1)$ , the point $(x^{t},y^{t})=(1-t)(x^{0},y^{0})+t(x^{*},y^{*})$ belongs to the intersection of maximal simplices corresponding to ${\cal C}_{\ell^{\prime}-1}$ and ${\cal C}_{\ell^{\prime}}$ . By Lemma 3.18, the frame $\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle$ contains ${\cal C}_{\ell}$ . Regard $\langle e_{1},e_{2},\ldots,e_{m}\rangle$ as $2^{\{1,2,\ldots,m\}}$ . Then $X_{j+1}\setminus X_{j-1}=\{a,b\}$ , $X_{j}^{\prime}=\{a\}\cup X_{j-1}$ , and $\tilde{X}_{j}:=\{b\}\cup X_{j-1}$ for distinct elements $a,b\in\{1,2,\ldots,m\}$ . Also $K(\langle e_{1},e_{2},\ldots,e_{m},f_{1},f_{2},\ldots,f_{n}\rangle)\simeq[0,1]^{m}\times[0,1]^{n}$ contains both $(x^{t},y^{t})$ and $(x^{*},y^{*})$ . Now $[(x^{t},y^{t}),(x^{*},y^{*})]$ is the segment in $[0,1]^{m}\times[0,1]^{n}$ .

Consider $x^{t}$ and $x^{*}$ in the $\langle e_{1},e_{2},\ldots,e_{m}\rangle$ -coordinate. In the original coordinate $x^{t}=\sum_{i}\lambda_{i}X_{i}$ , the coefficient $\lambda_{j}$ of $X_{j}$ is zero. Thus $x^{t}_{a}=x^{t}_{b}$ holds. In $x^{t+\epsilon}$ for small $\epsilon>0$ , the coefficient of $X^{\prime}_{j}$ becomes positive. This means that $x^{t+\epsilon}_{a}>x^{t+\epsilon}_{b}$ . Consequently $x_{a}^{*}>x_{b}^{*}$ holds. Let $\tilde{x}$ be obtained from $x^{*}$ by interchanging the $a$ -th and $b$ -th coordinates of $x^{*}$ . By $x^{t}_{a}=x^{t}_{b}$ , it holds

[TABLE]

By $d(x^{0},\tilde{x})\leq d(x^{0},x^{t})+d(x^{t},\tilde{x})=d(x^{0},x^{t})+d(x^{t},x^{*})=d(x^{0},x^{*})$ , we have

[TABLE]

Case 1: $g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ is defined. We are going to show

[TABLE]

which contradicts the unique optimality of $(x^{*},y^{*})$ . Notice that $\tilde{X}_{j}=g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ also belongs to $\langle e_{1},e_{2},\ldots,e_{m}\rangle$ (since it is generated by $X_{j-1}$ , $X_{j+1}$ , and ${\cal Y}^{\bot}$ ). Since $X_{j}^{\prime}=\{a\}\cup X_{j-1}\neq g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ , by Lemma 3.17, chain ${\cal Y}^{\bot}$ contains $b$ before $a$ . Then ${\cal Y}^{\bot\bot}$ contains $b$ before $a$ . In particular, $a\in\{1,2,\ldots,r\}$ must hold. Consider $y^{t}$ in the $\langle f_{1},f_{2},\ldots,f_{n}\rangle$ -coordinate.

Case 1-1: $b\in\{1,2,\ldots,r\}$ . In this case, ${\cal Y}$ also contains $b$ before $a$ , since ${\cal Y}$ is obtained from ${\cal Y}^{\bot\bot}$ by adding elements $r+1,r+2,\ldots,n$ (see the proof of Theorem 3.10 (1)). Thus

[TABLE]

Case 1-1-1: $y_{b}^{*}\geq y_{a}^{*}$ . Recall Theorem 3.9 that $\overline{R}(x^{*},y^{*})$ is given by

[TABLE]

By $x_{a}^{*}>x^{*}_{b}$ and $y_{b}^{*}\geq y_{a}^{*}$ , it is easy to verify

[TABLE]

For example, if $x_{a}^{*}\geq y_{b}^{*}\geq x_{b}^{*}\geq y_{a}^{*}$ , the LHS is $x_{a}^{*}-y_{a}^{*}$ and the RHS is $(x_{b}^{*}-y_{a}^{*})+(x_{a}^{*}-y_{b}^{*})\leq x_{a}^{*}-y_{a}^{*}$ . Thus we obtain

[TABLE]

By (3.5), we have contradiction (3.6).
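The exchange inequality used in Case 1-1-1 can be checked numerically. The sketch below is an assumption-laden illustration: the displayed inequality itself is not reproduced here, so its shape is inferred from the worked example in the text, namely $\max\{0,x_{a}-y_{a}\}+\max\{0,x_{b}-y_{b}\}\geq\max\{0,x_{b}-y_{a}\}+\max\{0,x_{a}-y_{b}\}$ whenever $x_{a}>x_{b}$ and $y_{b}\geq y_{a}$ . The function names are hypothetical.

```python
from itertools import product


def pos(t):
    """Positive part max{0, t}."""
    return max(0, t)


def exchange_gap(xa, xb, ya, yb):
    """LHS minus RHS of the (assumed) exchange inequality."""
    lhs = pos(xa - ya) + pos(xb - yb)
    rhs = pos(xb - ya) + pos(xa - yb)
    return lhs - rhs


def check_grid(steps=6):
    """Brute-force check on an integer grid (the inequality is scale-invariant)."""
    grid = range(steps + 1)
    return all(
        exchange_gap(xa, xb, ya, yb) >= 0
        for xa, xb, ya, yb in product(grid, repeat=4)
        if xa > xb and yb >= ya
    )
```

The check reflects the convexity of $t\mapsto\max\{0,t\}$ : the two arguments on the left are the largest and smallest of the four differences, and both sides have the same sum of arguments. A grid check is of course not a proof.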

Case 1-1-2: $y_{b}^{*}<y_{a}^{*}$ .

Let $\tilde{y}$ be obtained from $y^{*}$ by interchanging the $a$ -th and $b$ -th coordinates of $y^{*}$ . Clearly

[TABLE]

Since $y^{t}_{b}>y^{t}_{a}$ and $y_{b}^{*}<y_{a}^{*}$ , it must hold $y^{t^{\prime}}_{a}=y^{t^{\prime}}_{b}$ for some $t^{\prime}>t$ , and hence $d(y^{t^{\prime}},y^{*})=d(y^{t^{\prime}},\tilde{y})$ . Thus we have

[TABLE]

Then we obtain a contradiction:

[TABLE]

Case 1-2: $b\in V^{\bot}$ , i.e., $b>r$ . By $\max\{0,x^{*}_{a}-y^{*}_{a}\}\geq\max\{0,x^{*}_{b}-y^{*}_{a}\}=\max\{0,\tilde{x}_{a}-y^{*}_{a}\}$ , we have $\overline{R}(x^{*},y^{*})\geq\overline{R}(\tilde{x},y^{*})$ , and obtain the contradiction (3.6).

Case 2: $g(X_{j-1},X_{j+1},{\cal Y}^{\bot})$ is not defined. In this case, both $a$ and $b$ belong to $V^{\bot}$ . Namely $a,b>r$ holds. Then $\overline{R}(\tilde{x},y^{*})=\overline{R}(x^{*},y^{*})$ . Thus we obtain (3.6). ∎

Finally we remove the genericity assumption.

Lemma 3.20.

For $z^{0}=(x^{0},y^{0})\in K({\cal L}\times\check{\cal M})$ , and a maximal simplex $K$ containing $z^{0}$ , there is $z\in K$ such that

(1) $[z,J^{\overline{R}}_{\lambda}(z)]$ is generic, and

(2) $J^{\overline{R}}_{\lambda}(z^{0})$ is contained in the simplex containing $J^{\overline{R}}_{\lambda}(z)$ in its relative interior.

Thus we can choose points $x$ and $y$ from any maximal simplices corresponding to ${\cal X}$ and ${\cal Y}$ such that $[(x,y),J_{\lambda}^{\overline{R}}(x,y)]$ is generic, and the simplex $K^{*}$ containing $J_{\lambda}^{\overline{R}}(x,y)$ in its relative interior contains $(x^{*},y^{*})=J_{\lambda}^{\overline{R}}(x^{0},y^{0})$ . Therefore, for any $A$ -orthogonal frame ${\cal F}$ satisfying (3.4), the subcomplex $K({\cal F})$ contains $J_{\lambda}^{\overline{R}}(x,y)$ and $K^{*}\ni(x^{*},y^{*})$ .

Proof.

First notice that $J=J_{\lambda}^{\overline{R}}$ is nonexpansive [26], and is continuous. Let $B(z,\epsilon)$ denote the open ball with center $z$ and radius $\epsilon>0$ . For sufficiently small $\epsilon>0$ and every $u\in B(z^{0},\epsilon)$ , the simplex $K^{*}$ containing $J(u)$ as its relative interior also contains $J(z^{0})$ . Thus, by perturbing $z^{0}$ , we can assume in advance that $z^{0}$ belongs to the interior of $K$ . Let $\epsilon>0$ be sufficiently small so that $B(z^{0},\epsilon)\subseteq K$ . We can replace $z^{0}$ by a point $z^{\prime}$ in $B(z^{0},\epsilon)$ that maximizes the dimension of the simplex $K^{*}$ containing $J(z^{\prime})$ as its relative interior. Then we can assume that for sufficiently small $\epsilon>0$ , the image $J(B(z^{0},\epsilon))$ belongs to the relative interior of $K^{*}(\ni J(z^{0}))$ .

It suffices to show that there is $z\in B(z^{0},\epsilon)$ such that $[z,J(z)]$ is generic. Consider a frame ${\cal F}$ containing the supports of $z$ and $K^{*}$ . Regard $K({\cal F})\simeq[0,1]^{n+m}$ . Consider the affine hull of $K^{*}$ , which is represented by linear equation $Au=b$ . In $K^{*}$ , Lovász extension $\overline{R}$ is a linear function $u\mapsto c^{\top}u$ . For every $u\in B(z^{0},\epsilon)$ , resolvent $J(u)$ is the unique minimizer of

[TABLE]

This is an equality-constrained quadratic program. By the Lagrange multiplier method, we obtain an explicit formula of $J$ :

[TABLE]

where $c^{\prime}$ is a constant vector. Consider geodesic (segment) $[u,J(u)]$ . For each $t\in(0,1)$ , define $\varphi_{t}:B(z^{0},\epsilon)\to[0,1]^{n+m}$ by

[TABLE]

Here $A^{\top}(AA^{\top})^{-1}A$ is a projection, and each of its eigenvalues is $0$ or $1$ . Hence $(I-tA^{\top}(AA^{\top})^{-1}A)$ is nonsingular for $t\in(0,1)$ . This implies that $\varphi_{t}(B(z^{0},\epsilon))$ is an open neighborhood of $\varphi_{t}(z)$ for $t\in(0,1)$ . Suppose that open segment $(z^{0},J(z^{0}))$ meets simplices $F_{1},F_{2},\ldots,F_{\ell}$ of dimension at most $n+m-2$ . Now $\epsilon$ is small. For every $u\in B(z^{0},\epsilon)$ , any simplex of dimension at most $n+m-2$ which $(u,J(u))$ can meet is one of $F_{1},F_{2},\ldots,F_{\ell}$ . For $i\in\{1,2,\ldots,\ell\}$ , the set of points $u\in B(z^{0},\epsilon)$ with $\varphi_{t}(u)\in F_{i}$ belongs to an affine subspace of dimension $n+m-2$ . Consequently, the set of points $u\in B(z^{0},\epsilon)$ with $\varphi_{t}(u)\in F_{i}$ for some $t\in(0,1)$ , i.e., $(u,J(u))$ meets $F_{i}$ , must belong to a hypersurface ${\cal H}_{i}$ (of dimension $n+m-1$ ). Therefore, choose $z$ from $B(z^{0},\epsilon)\setminus\bigcup_{i=1}^{\ell}{\cal H}_{i}$ . Then $(z,J(z))$ meets none of simplices $F_{1},F_{2},\ldots,F_{\ell}$ . Namely $[z,J(z)]$ is generic, as required. ∎

4 Block-triangularization of partitioned matrix

In this section, we present implications of Theorem 1.1 on a block-triangularization of a partitioned matrix.

4.1 DM-decomposition

Let $A=(A_{\alpha\beta})$ be a partitioned matrix as above. Consider MVSP for $A$ . A vanishing subspace $(X_{1},X_{2},\ldots,X_{\mu},Y_{1},Y_{2},\ldots,Y_{\nu})$ is simply denoted by $(X,Y)$ , where $X$ and $Y$ denote tuples of subspaces $X_{\alpha}$ and $Y_{\beta}$ , respectively. We say that $(X,Y)$ is a vanishing subspace with dimension $\dim X+\dim Y$ , where $\dim X:=\sum_{\alpha}\dim X_{\alpha}$ and $\dim Y:=\sum_{\beta}\dim Y_{\beta}$ . Formally speaking, $(X,Y)$ represents subspace $\bigoplus_{\alpha}X_{\alpha}\times\bigoplus_{\beta}Y_{\beta}$ of $\bigoplus_{\alpha}{\bf F}^{m_{\alpha}}\times\bigoplus_{\beta}{\bf F}^{n_{\beta}}$ on which the bilinear form $\bigoplus_{\alpha}{\bf F}^{m_{\alpha}}\times\bigoplus_{\beta}{\bf F}^{n_{\beta}}\to{\bf F}$ defined by

[TABLE]

vanishes, where $u_{\alpha}$ (resp. $v_{\beta}$ ) is the natural projection of $u$ to ${\bf F}^{m_{\alpha}}$ (resp. ${\bf F}^{n_{\beta}}$ ). A vanishing subspace of a maximum dimension is called a maximum vanishing subspace, abbreviated as an mv-subspace.

Let ${\cal S}={\cal S}_{A}$ denote the modular lattice of all mv-subspaces for $A$ , where the partial order is given by $(X,Y)\preceq(X^{\prime},Y^{\prime})$ if and only if $X_{\alpha}\subseteq X_{\alpha}^{\prime}$ and $Y_{\beta}\supseteq Y_{\beta}^{\prime}$ for each $\alpha,\beta$ . Consider a chain $(X^{0},Y^{0})\prec(X^{1},Y^{1})\prec\cdots\prec(X^{\ell},Y^{\ell})$ of mv-subspaces. For each $\alpha$ , choose a base $E_{\alpha}=\{e^{\alpha}_{1},e^{\alpha}_{2},\ldots,e^{\alpha}_{m_{\alpha}}\}$ of ${\bf F}^{m_{\alpha}}$ such that $E_{\alpha}^{k}=\{e^{\alpha}_{1},e^{\alpha}_{2},\ldots,e^{\alpha}_{k_{\alpha}}\}$ , for some $k_{\alpha}$ , is a base of $X_{\alpha}^{k}$ for $k=1,2,\ldots,\ell$ . For each $\beta$ , choose a base $F_{\beta}=\{f^{\beta}_{1},f^{\beta}_{2},\ldots,f^{\beta}_{n_{\beta}}\}$ of ${\bf F}^{n_{\beta}}$ such that $F_{\beta}^{k}=\{f^{\beta}_{1},f^{\beta}_{2},\ldots,f^{\beta}_{k_{\beta}}\}$ , for some $k_{\beta}$ , is a base of $Y_{\beta}^{k}$ for $k=1,2,\ldots,\ell$ . Then $\bigcup_{\alpha}E_{\alpha}$ is regarded as a base of ${\bf F}^{m}$ via canonical injection ${\bf F}^{m_{\alpha}}\hookrightarrow\bigoplus_{\alpha}{\bf F}^{m_{\alpha}}$ , and $\bigcup_{\beta}F_{\beta}$ is regarded as a base of ${\bf F}^{n}$ similarly. Also $\bigcup_{\alpha}E_{\alpha}^{k}$ is a base of $X^{k}$ , and $\bigcup_{\beta}F_{\beta}^{k}$ is a base of $Y^{k}$ . Then the change of the bases gives rise to a transformation of the form (1.3). By rearranging rows and columns, we obtain the following block-triangular form:

[TABLE]

where the diagonal block $D_{k}$ is a square matrix of size $\dim X^{k+1}-\dim X^{k}=\dim Y^{k}-\dim Y^{k+1}$ for $k=1,2,\ldots,\ell$ , $D_{0}$ is a matrix of $\dim X^{0}$ rows and $n-\dim Y^{0}(<\dim X^{0})$ columns and $D_{\ell+1}$ is a matrix of $m-\dim X^{\ell}(<\dim Y^{\ell})$ rows and $\dim Y^{\ell}$ columns.

For any vanishing subspace $(X,Y)$ of $A$ , recall from the introduction that the following inequality holds:

[TABLE]

In particular, $m+n-\mathop{\rm rank}A$ is an upper bound of the maximum vanishing dimension, though it is not attained in general. Ito, Iwata, and Murota [27] mainly focus on the case where this bound is attained. In this case, the resulting block-triangular form (4.1) satisfies the rank-condition that each $D_{k}$ is of row- or column-full rank. Such a block-triangular form is called proper. Here we do not impose properness on the decomposition (4.1).

The DM-decomposition of $A$ is the most refined block triangularization such that the chain of mv-subspaces is taken to be maximal in ${\cal S}$ . The original DM-decomposition [11] corresponds to the case of $m_{\alpha}=n_{\beta}=1$ for all $\alpha,\beta$ . The combinatorial canonical form (CCF) for a multilayered mixed matrix [36] corresponds to the case of $n_{\beta}=1$ for all $\beta$ . There are polynomial time algorithms (based on bipartite matching and matroid union) to obtain DM-decompositions for these cases, whereas no polynomial time algorithm is known for the general case.

MVSP asks for one mv-subspace. On the other hand, the DM-decomposition needs a maximal chain of mv-subspaces. Therefore, solving MVSP is not enough to obtain the DM-decomposition.

On the difficulty of DM-decomposition.

Obtaining the DM-decomposition cannot avoid issues of numerical analysis/computation and the algebraic closedness of the base field ${\bf F}$ . Consider the following partitioned matrix of type $(n,n;n,n)$

$$\left(\begin{array}{cc}A&B\\ C&D\end{array}\right),\tag{4.2}$$

where $A,B,C,D$ are all nonsingular. Finding the DM-decomposition of this matrix reduces to the eigenvalue problem as follows. Suppose that $(X_{1},X_{2},Y_{1},Y_{2})$ is a vanishing subspace. By the nonsingularity of the submatrices, it must hold that $\dim X_{\alpha}+\dim Y_{\beta}\leq n$ . Consequently, trivial vanishing subspaces $(\{0\},\{0\},{\bf F}^{n},{\bf F}^{n})$ and $({\bf F}^{n},{\bf F}^{n},\{0\},\{0\})$ are maximum with dimension $2n$ . Suppose that $(X_{1},X_{2},Y_{1},Y_{2})$ is an mv-subspace. Then it must hold $\dim X_{1}=\dim X_{2}=n-\dim Y_{1}=n-\dim Y_{2}$ . Moreover, from $AY_{1}=(X_{1})^{\bot}=BY_{2}$ and $CY_{1}=(X_{2})^{\bot}=DY_{2}$ , we obtain

$$C^{-1}DB^{-1}AY_{1}=Y_{1},$$

where $(\cdot)^{\bot}$ means the orthogonal subspace with respect to the standard inner product. If such $Y_{1}$ is given, then we can recover mv-subspace $(X_{1},X_{2},Y_{1},Y_{2})$ . This implies that finding a maximal chain of mv-subspaces is equivalent to finding a maximal chain of invariant subspaces of matrix $C^{-1}DB^{-1}A$ . In the case where the base field ${\bf F}$ is algebraically-closed, the Schur decomposition finds such a chain of invariant subspaces and triangularizes $C^{-1}DB^{-1}A$ by a similarity transformation, where the resulting triangular form has all eigenvalues in diagonals. Consequently, we obtain a maximal chain of mv-subspaces and the DM-decomposition with four diagonal blocks of size $2\times 2$ . In particular, the DM-decomposition may change when ${\bf F}$ is not algebraically-closed and the matrix is considered in an extension field of ${\bf F}$ . A simple example of such a matrix (over ${\bf Q}$ ) is given in [28, 6.2].
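The recovery of an mv-subspace from an invariant subspace can be illustrated on a small rational example. The following Python sketch assumes $n=2$ , block rows $(A,B)$ and $(C,D)$ , and a one-dimensional invariant subspace $Y_{1}$ spanned by an eigenvector $v$ of $C^{-1}DB^{-1}A$ ; the matrices and all helper names are hypothetical choices made only for this example. It sets $Y_{2}=B^{-1}AY_{1}$ , $X_{1}=(AY_{1})^{\bot}$ , $X_{2}=(CY_{1})^{\bot}$ and checks that all four bilinear forms vanish.

```python
from fractions import Fraction


def mat_vec(M, v):
    """Matrix-vector product for small dense matrices."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def inv2(M):
    """Exact inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def perp2(v):
    """A vector spanning the orthogonal complement of span(v) in F^2."""
    return [-v[1], v[0]]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def apply_M(A, B, C, D, v):
    """Compute C^{-1} D B^{-1} A v."""
    return mat_vec(inv2(C), mat_vec(D, mat_vec(inv2(B), mat_vec(A, v))))


def mv_subspace_from_eigenvector(A, B, C, D, v):
    """Spanning vectors (x1, x2, y1, y2) built from a candidate eigenvector v."""
    y1 = list(v)
    y2 = mat_vec(inv2(B), mat_vec(A, y1))  # Y2 = B^{-1} A Y1
    x1 = perp2(mat_vec(A, y1))             # X1 = (A Y1)^perp
    x2 = perp2(mat_vec(C, y1))             # X2 = (C Y1)^perp
    return x1, x2, y1, y2


def vanishes(A, B, C, D, x1, x2, y1, y2):
    """Check that every block vanishes on the corresponding pair."""
    forms = [(x1, A, y1), (x1, B, y2), (x2, C, y1), (x2, D, y2)]
    return all(dot(x, mat_vec(M, y)) == 0 for x, M, y in forms)


A = [[1, 0], [0, 1]]
B = [[1, 1], [0, 1]]
C = [[2, 0], [0, 3]]
D = [[1, 0], [0, 1]]
v = [3, 1]  # eigenvector of C^{-1} D B^{-1} A with eigenvalue 1/3
```

Here `vanishes(A, B, C, D, *mv_subspace_from_eigenvector(A, B, C, D, v))` holds, giving a nontrivial vanishing subspace of dimension $4=2n$ . If $v$ is replaced by a vector that is not an eigenvector, such as $(1,1)$ , the last form $x_{2}^{\top}Dy_{2}$ fails to vanish.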

A more difficult situation occurs. Consider the following partitioned matrix of type $(n,n,n;n,n,n)$

[TABLE]

where all submatrices are nonsingular. By the same argument, the maximum vanishing dimension is $3n$ . Also, if $(X_{1},X_{2},X_{3},Y_{1},Y_{2},Y_{3})$ is an mv-subspace, then $Y_{1}$ must satisfy

[TABLE]

Namely $Y_{1}$ is a common invariant subspace of three matrices. Therefore the problem of finding the DM-decomposition includes the common invariant subspace problem. This extremely difficult problem is the subject of ongoing research in numerical analysis/computation (see e.g., [3, 25]), and, as far as we know, no satisfactory algorithm has been obtained yet.

4.2 Quasi DM-decomposition

Here we introduce the concept of quasi DM-decomposition, which is a block-triangular form coarser than the DM-decomposition but does not depend on base field ${\bf F}$ and still generalizes important special cases (the original DM-decomposition and CCF). It turns out that a quasi DM-decomposition corresponds exactly to a chain of mv-subspaces detectable by solving WMVSP, and is obtained in polynomial time. We believe that a quasi DM-decomposition is the limit of what can be obtained by combinatorial or optimization methods.

Let $A=(A_{\alpha\beta})$ be a partitioned matrix as above, and ${\cal S}$ the lattice of all mv-subspaces for $A$ . A vanishing subspace $(X,Y)$ is said to be trivial if $X_{\alpha}=0$ for each $\alpha$ or $Y_{\beta}=0$ for each $\beta$ . Other vanishing subspaces are said to be nontrivial. $A$ is called DM-irreducible if ${\cal S}$ consists only of trivial mv-subspaces, and called DM-regular if ${\cal S}$ contains both of the trivial mv-subspaces, or equivalently, if the maximum vanishing dimension is equal to both $m$ and $n$ . In particular, a DM-regular matrix is necessarily a square matrix. In the DM-decomposition (4.1), each diagonal block $D_{k}$ is DM-irreducible, and is DM-regular if $1\leq k\leq\ell$ .

To formulate quasi DM-decomposition, we introduce the notion of the quasi DM-irreducibility. Partitioned matrix $A$ is called quasi DM-irreducible if for each nontrivial mv-subspace $(X,Y)\in{\cal S}$ there are positive integers $k,\ell$ with $k<\ell$ such that for all $\alpha,\beta$ it holds

[TABLE]

This means that any nontrivial mv-subspace of a quasi DM-irreducible matrix has a common ratio of dimensions in ${\bf F}^{m_{\alpha}}\times{\bf F}^{n_{\beta}}$ for all $\alpha,\beta$ . Obviously the quasi DM-irreducibility is a weaker notion than the DM-irreducibility. If $A$ is quasi DM-irreducible and admits a nontrivial mv-subspace $(X,Y)$ , then $\max(m,n)\leq\sum_{\alpha}\dim X_{\alpha}+\sum_{\beta}\dim Y_{\beta}=(k/\ell)\sum_{\alpha}m_{\alpha}+((\ell-k)/\ell)\sum_{\beta}n_{\beta}=(k/\ell)m+((\ell-k)/\ell)n\leq\max(m,n)$ , and necessarily the maximum vanishing dimension is equal to $n=m$ , which implies that $A$ is DM-regular. In particular, the DM-irreducibility and quasi DM-irreducibility are the same for a non-square partitioned matrix.

For $n\geq 2$ , any $n\times n$ nonsingular matrix $A$ , viewed as a partitioned matrix of type $(n;n)$ , is not DM-irreducible but quasi DM-irreducible. Indeed, for any proper nonzero subspace $X$ , $(X,(XA)^{\bot})$ is a nontrivial mv-subspace with $(n-\dim(XA)^{\bot})/n=(n-(n-\dim X))/n=\dim X/n$ . Also, a partitioned matrix of form (4.2) is quasi DM-irreducible and not DM-irreducible if ${\bf F}$ is algebraically closed. More generally, any partitioned matrix consisting of $n\times n$ nonsingular submatrices, such as (4.3), is quasi DM-irreducible.

A quasi DM-decomposition of $A$ is a block-triangular form (4.1) such that each diagonal block is quasi DM-irreducible. The quasi DM-decomposition still generalizes an important special case of CCF ( $n_{\beta}=1$ for all $\beta$ ). This fact follows from:

Lemma 4.1.

Suppose that $A$ is DM-regular with $\gcd(m_{1},m_{2},\ldots,m_{\mu},n_{1},n_{2},\ldots,n_{\nu})=1$ . Then $A$ is DM-irreducible if and only if $A$ is quasi DM-irreducible.

Proof.

If $A$ admits a nontrivial mv-subspace as in (4.4), then $\ell$ becomes a common divisor of $m_{\alpha},n_{\beta}$ , which is greater than $1$ . ∎
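The divisibility argument in this proof can be made concrete. The sketch below assumes, in line with the computation following (4.4), that a nontrivial mv-subspace of a quasi DM-irreducible matrix satisfies $\dim X_{\alpha}=(k/\ell)m_{\alpha}$ and $\dim Y_{\beta}=((\ell-k)/\ell)n_{\beta}$ ; the function name `admissible_ratios` is hypothetical. Since all these dimensions must be integers, the reduced denominator of $k/\ell$ has to divide every block size, so the possible ratios are exactly $j/g$ with $0<j<g$ , where $g$ is the gcd of the block sizes.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def admissible_ratios(row_sizes, col_sizes):
    """Ratios k/l in (0, 1) for which all k*m_alpha/l and (l-k)*n_beta/l are integers."""
    g = reduce(gcd, list(row_sizes) + list(col_sizes))
    # A reduced ratio p/q is admissible iff q divides g, i.e. it equals j/g.
    return sorted({Fraction(j, g) for j in range(1, g)})
```

When the gcd is $1$ the list is empty, which is the content of Lemma 4.1: no nontrivial mv-subspace can have a common dimension ratio. For block sizes $(2,4;6)$ the only admissible ratio is $1/2$ .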

The main result of this section is the following.

Theorem 4.2.

A quasi DM-decomposition of a partitioned matrix over ${\bf F}$ can be obtained in polynomial time, provided arithmetic operations on ${\bf F}$ can be done in constant time.

The rest of this section is devoted to the proof of this theorem. The algorithm is based on a simple recursive idea: Find a nontrivial mv-subspace for $A$ by solving WMVSP with special weights. If a nontrivial mv-subspace $(X,Y)$ is found, then decompose $A$ into two matrices $A^{X,Y^{c}}$ and $A^{X^{c},Y}$ , and recurse into $A^{X,Y^{c}}$ and into $A^{X^{c},Y}$ .

Now suppose that we are given one (nontrivial) mv-subspace $(X,Y)$ . The two partitioned matrices $A^{X,Y^{c}}$ and $A^{X^{c},Y}$ are constructed as follows. For each $\alpha$ , choose a complement $U_{\alpha}$ of $X_{\alpha}$ . For each $\beta$ , choose a complement $V_{\beta}$ of $Y_{\beta}$ . Let $A^{X,Y^{c}}_{\alpha\beta}$ be the matrix representation of the restriction of $A_{\alpha\beta}$ to $X_{\alpha}\times V_{\beta}$ . Let $A^{X,Y^{c}}:=(A^{X,Y^{c}}_{\alpha\beta})$ be the partitioned matrix consisting of the nonempty matrices among them. Define $A^{X^{c},Y}:=(A^{X^{c},Y}_{\alpha\beta})$ similarly.

Lemma 4.3.

Let $(X,Y)$ be an mv-subspace for $A$ , and let $(X^{\prime},Y^{\prime})$ be a vanishing subspace for $A$ such that $(X^{\prime},Y^{\prime})\preceq(X,Y)$ . The following conditions are equivalent:

(1) $(X^{\prime},Y^{\prime})$ is an mv-subspace for $A$ .

(2) $(X^{\prime},Y^{\prime})$ is represented as $(X^{\prime},Y+Q)$ with an mv-subspace $(X^{\prime},Q)$ for $A^{X,Y^{c}}$ .

Proof.

Let $Q_{\beta}:=V_{\beta}\cap Y^{\prime}_{\beta}$ for $\beta$ . Then $Y^{\prime}_{\beta}=Q_{\beta}+Y_{\beta}$ . $A_{\alpha\beta}(X^{\prime}_{\alpha},Y^{\prime}_{\beta})=A_{\alpha\beta}(X^{\prime}_{\alpha},Y_{\beta})+A_{\alpha\beta}(X^{\prime}_{\alpha},Q_{\beta})=A_{\alpha\beta}^{X,Y^{c}}(X^{\prime}_{\alpha},Q_{\beta})$ (since $A_{\alpha\beta}(X^{\prime}_{\alpha},Y_{\beta})\subseteq A_{\alpha\beta}(X_{\alpha},Y_{\beta})=\{0\}$ ). Thus $A_{\alpha\beta}(X^{\prime}_{\alpha},Y^{\prime}_{\beta})=\{0\}$ if and only if $A_{\alpha\beta}^{X,Y^{c}}(X^{\prime}_{\alpha},Q_{\beta})=\{0\}$ . The claim follows from this fact and $\dim Y^{\prime}_{\beta}=\dim Q_{\beta}+\dim Y_{\beta}$ . ∎

Next we consider finding a nontrivial mv-subspace by solving WMVSP. An mv-subspace $(X,Y)$ is called extremal if $(X,Y)$ is the unique optimal solution of WMVSP for some weights $C_{\alpha},D_{\beta}$ .

The minimal and maximal mv-subspaces are extremal.

Lemma 4.4.

(1) Define weights $C_{\alpha},D_{\beta}$ by

[TABLE]

Then an optimal solution of WMVSP is unique, and is equal to the maximal mv-subspace.

(2) Define weights $C_{\alpha},D_{\beta}$ by

[TABLE]

Then an optimal solution of WMVSP is unique, and is equal to the minimal mv-subspace.

Proof.

It suffices to prove (1). Let $(X,Y)$ be the unique maximal mv-subspace, and let $(X^{\prime},Y^{\prime})$ be an arbitrary vanishing subspace. Then

[TABLE]

If $(X^{\prime},Y^{\prime})$ is not an mv-subspace, then (4.7) $\geq m+1-m>0$. If $(X^{\prime},Y^{\prime})$ is a nonmaximal mv-subspace, then $\dim X>\dim X^{\prime}$, and (4.7) $>0$. Thus $(X,Y)$ is the unique optimal solution of WMVSP. ∎

Therefore we may focus on a DM-regular partitioned matrix.

Lemma 4.5.

Suppose that $A$ is DM-regular.

(1)

For $\alpha^{\prime}\in\{1,2,\ldots,\mu\}$ , define weights $C_{\alpha},D_{\beta}$ by

[TABLE]

Then any optimal solution of WMVSP is an mv-subspace.

(2)

For $\beta^{\prime}\in\{1,2,\ldots,\nu\}$ , define weights $C_{\alpha},D_{\beta}$ by

[TABLE]

Then any optimal solution of WMVSP is an mv-subspace.

Proof.

It suffices to prove (1). Let $(X,Y)$ be an optimal solution of WMVSP, and let $(X^{\prime},Y^{\prime})$ be a vanishing subspace. Then, letting $M:=2m_{\alpha^{\prime}}+1$ , we have

[TABLE]

From this, we have

[TABLE]

where we use $\dim Y^{\prime}\leq n=m$. This implies that $\dim X+\dim Y\geq\dim X^{\prime}+\dim Y^{\prime}$. Thus $(X,Y)$ is an mv-subspace. ∎

Theorem 4.6.

Suppose that $A$ is DM-regular. The following conditions are equivalent:

(1)

$A$ is quasi DM-irreducible.

(2)

There is no extremal nontrivial mv-subspace.

(3)

For each $\alpha^{\prime}\in\{1,2,\ldots,\mu\}$, the trivial mv-subspaces are optimal to WMVSP with weights (4.11), and, for each $\beta^{\prime}\in\{1,2,\ldots,\nu\}$, the trivial mv-subspaces are optimal to WMVSP with weights (4.14).

Proof.

(1) $\Rightarrow$ (2). Let $(X,Y)$ be a nontrivial mv-subspace (of dimension $n=m$ ). Then there are positive integers $k,\ell$ satisfying (4.4). For any weights $C_{\alpha},D_{\beta}$ , we have

[TABLE]

Here $\sum_{\alpha}C_{\alpha}m_{\alpha}$ and $\sum_{\beta}D_{\beta}n_{\beta}$ are the weights of the two trivial mv-subspaces. This means that $(X,Y)$ is never the unique optimal solution of WMVSP.

(2) $\Rightarrow$ (3). Let $(X,Y)$ be an optimal solution of WMVSP under weights (4.11). By Lemma 4.5, $(X,Y)$ is an mv-subspace. If $(X,Y)$ has weight greater than that of the trivial mv-subspaces, then $(X,Y)$ is nontrivial, and this implies the existence of an extremal mv-subspace other than the trivial ones. The case of weights (4.14) is similar.

(3) $\Rightarrow$ (1). Suppose that $A$ is not quasi DM-irreducible. There is a nontrivial mv-subspace $(X,Y)$ such that one of the following holds:

(i)

$\dim X_{\alpha}/m_{\alpha}\neq\dim X_{\alpha^{\prime}}/m_{\alpha^{\prime}}$ for some $\alpha,\alpha^{\prime}$ .

(ii)

$\dim Y_{\beta}/n_{\beta}\neq\dim Y_{\beta^{\prime}}/n_{\beta^{\prime}}$ for some $\beta,\beta^{\prime}$ .

(iii)

$\dim X_{\alpha}/m_{\alpha}\neq(n_{\beta}-\dim Y_{\beta})/n_{\beta}$ for some $\alpha,\beta$ .

We may assume that (i) or (ii) holds. Indeed, suppose that (iii) holds and neither (i) nor (ii) holds. Then there are positive integers $k,\ell,k^{\prime},\ell^{\prime}$ with $k/\ell\neq k^{\prime}/\ell^{\prime}$ such that $\dim X_{\alpha}/m_{\alpha}=k/{\ell}$ and $\dim Y_{\beta}/n_{\beta}=(\ell^{\prime}-k^{\prime})/\ell^{\prime}$ hold for all $\alpha,\beta$. Thus we have

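$$\dim X+\dim Y=\frac{k}{\ell}\,m+\frac{\ell^{\prime}-k^{\prime}}{\ell^{\prime}}\,n=n+\left(\frac{k}{\ell}-\frac{k^{\prime}}{\ell^{\prime}}\right)n\neq n\qquad(\text{using } m=n).$$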

This is a contradiction since the maximum vanishing dimension is $n$ .

We may assume that (i) holds. Let $\kappa_{\alpha}:=\dim X_{\alpha}/m_{\alpha}$ and $\kappa:=\dim X/m=(n-\dim Y)/n$ . Let $\alpha^{\prime}$ denote an index $\alpha$ having the maximum $\kappa_{\alpha}$ . Then we have

[TABLE]

Consider the optimal value of WMVSP with weights (4.11) for index $\alpha^{\prime}$ , which is given by

[TABLE]

where we let $M:=2m_{\alpha^{\prime}}+1$ and use $n=m$ and $\kappa_{\alpha^{\prime}}>\kappa$. Here $M^{2}m^{2}+Mmm_{\alpha^{\prime}}$ is the weight of the trivial mv-subspaces. In particular, the trivial mv-subspaces are not optimal. ∎

Now we are ready to describe an algorithm to obtain a quasi DM-decomposition. The algorithm outputs a chain of mv-subspaces corresponding to a quasi DM-decomposition, which we call a q-DM chain.

Algorithm: q-DM

Input:

A partitioned matrix $A$ .

Output:

A q-DM chain ${\cal C}$ of mv-subspaces for $A$ .

1:

Solve WMVSP for $A$ under weights (4.5) to obtain the maximal mv-subspace $(X_{\max},Y_{\min})$ .

2:

Solve WMVSP for $A$ under weights (4.6) to obtain the minimal mv-subspace $(X_{\min},Y_{\max})$ .

3:

Let $A\leftarrow(A^{X_{\max},Y_{\min}^{c}})^{X^{c}_{\min},Y_{\max}}$ , which is DM-regular.

4:

Call q-DMreg for input $A$ to obtain a q-DM chain $\{(X^{k},Y^{k})\}_{k}$ for $A$, where each $X^{k}$ (resp. $Y^{k}$) is viewed as a subspace of a complement of $X_{\min}$ (resp. $Y_{\min}$).

5:

Return ${\cal C}:=\{(X^{k}+X_{\min},Y^{k}+Y_{\min})\}_{k}$ .

Algorithm: q-DMreg

Input:

A DM-regular partitioned matrix $A$ .

Output:

A q-DM chain ${\cal C}$ of mv-subspaces for $A$ .

1:

For each $\alpha^{\prime}\in\{1,2,\ldots,\mu\}$ , solve WMVSP for weights (4.11), and for each $\beta^{\prime}\in\{1,2,\ldots,\nu\}$ , solve WMVSP for weights (4.14).

2:

If some optimal solution $(X,Y)$ of WMVSP found in step 1 has weight greater than that of the trivial mv-subspaces, then do the following:

2.1:

Call q-DMreg for input $A^{X^{c},Y}$ to obtain a q-DM chain $\{(Z^{k},Y^{k})\}_{k}$ of $A^{X^{c},Y}$ .

2.2:

Call q-DMreg for input $A^{X,Y^{c}}$ to obtain a q-DM chain $\{(X^{\ell},W^{\ell})\}_{\ell}$ of $A^{X,Y^{c}}$ .

2.3:

Return ${\cal C}:=\{(Z^{k}+X,Y^{k})\}_{k}\cup\{(X^{\ell},W^{\ell}+Y)\}_{\ell}$ .

3:

Otherwise, $A$ is quasi DM-irreducible. Return the two trivial mv-subspaces.

The correctness of this algorithm follows from Lemmas 4.3, 4.4, 4.5 and Theorem 4.6. The algorithm solves WMVSP polynomially many times. Since the weights $C_{\alpha},D_{\beta}$ are always bounded by a polynomial in $n,m$, the pseudo-polynomial bound of Theorem 1.1 shows that each WMVSP can be solved in polynomial time. Consequently, the whole algorithm runs in polynomial time. This proves Theorem 4.2.
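One way to make "polynomially many" explicit is the following count, given as a sketch under an assumption not spelled out above: a nontrivial mv-subspace $(X,Y)$ of a DM-regular matrix with $s$ rows and $s$ columns satisfies $0<\dim X<s$, so that $A^{X,Y^{c}}$ and $A^{X^{c},Y}$ have sizes $\dim X$ and $s-\dim X$, respectively. The symbol $T(s)$ is introduced here only for this count.

```latex
% T(s): number of calls of q-DMreg on a DM-regular matrix of size s.
% A call either stops (step 3) or splits into two strictly smaller instances (steps 2.1, 2.2).
T(s) \;\le\; 1 + T(\dim X) + T(s-\dim X), \qquad T(1) = 1
\quad\Longrightarrow\quad T(s) \;\le\; 2s-1 \quad \text{(induction on } s\text{)}.
% Each call solves at most \mu + \nu instances of WMVSP in step 1,
% and q-DM itself solves two more in its steps 1 and 2:
\#\{\text{WMVSP instances solved}\} \;\le\; 2 + (2s-1)(\mu+\nu), \qquad s \le \min\{m,n\}.
```

The induction step is $1+(2a-1)+(2(s-a)-1)=2s-1$ for $a=\dim X$.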

Acknowledgments

We thank Kazuo Murota, Satoru Iwata, Satoru Fujishige, and Yuni Iwamasa for helpful comments. The work was partially supported by JSPS KAKENHI Grant Numbers 25280004, 26330023, 26280004, 17K00029.

Bibliography


[1] P. Abramenko and K. S. Brown, Buildings: Theory and Applications, Springer, New York, 2008.
[2] F. Ardila, M. Owen, and S. Sullivant, Geodesics in CAT(0) cubical complexes, Advances in Applied Mathematics 48 (2012), 142–163.
[3] D. Arapura and C. Peterson, The common invariant subspace problem: an approach via Gröbner bases, Linear Algebra and its Applications 384 (2004), 1–7.
[4] M. Bačák, The proximal point algorithm in metric spaces, Israel Journal of Mathematics 194 (2013), 689–701.
[5] M. Bačák, Computing medians and means in Hadamard spaces, SIAM Journal on Optimization 24 (2014), 1542–1566.
[6] M. Bačák, Convex Analysis and Optimization in Hadamard Spaces, De Gruyter, Berlin, 2014.
[7] T. Brady and J. McCammond, Braids, posets and orthoschemes, Algebraic and Geometric Topology 10 (2010), 2277–2314.
[8] M. R. Bridson and A. Haefliger, Metric Spaces of Non-positive Curvature, Springer-Verlag, Berlin, 1999.
