Matrix scaling limits in finitely many iterations
Melvyn B. Nathanson

TL;DR
This paper studies the alternate row and column scaling (Sinkhorn) algorithm and gives explicit examples of positive row stochastic matrices that become doubly stochastic after exactly one column scaling, so that the Sinkhorn limit is reached in finitely many iterations.
Contribution
It constructs a two-parameter family of positive matrices that are row stochastic but not column stochastic and that become doubly stochastic after a single column scaling. It also shows that every matrix in this family has determinant 0.
Findings
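Positive row stochastic matrices that become doubly stochastic after one column scaling
Explicit construction of such matrices for every size $n \geq 3$
Every matrix in the constructed family has determinant 0, which blocks the pull-back argument for producing matrices that need one additional scaling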
Department of Mathematics
Lehman College (CUNY)
Bronx, NY 10468
Abstract.
The alternate row and column scaling algorithm applied to a positive $n \times n$ matrix $A$ converges to a doubly stochastic matrix $S(A)$, sometimes called the Sinkhorn limit of $A$. For every positive integer $n \geq 3$, a two parameter family of row but not column stochastic $n \times n$ positive matrices is constructed that become doubly stochastic after exactly one column scaling.
Key words and phrases:
Matrix scaling, Sinkhorn limits.
2010 Mathematics Subject Classification:
11C20, 11B75, 11J68, 11J70.
1. The alternate scaling algorithm
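A positive matrix is a matrix with positive coordinates. A nonnegative matrix is a matrix with nonnegative coordinates. Let $\operatorname{diag}(x_1,\ldots,x_n)$ denote the $n \times n$ diagonal matrix with coordinates $x_1,\ldots,x_n$ on the main diagonal. The diagonal matrix is positive if its coordinates are positive. If $A$ is an $m \times n$ positive matrix, if $X$ is an $m \times m$ positive diagonal matrix, and if $Y$ is an $n \times n$ positive diagonal matrix, then $XA$, $AY$, and $XAY$ are $m \times n$ positive matrices.
Let $A = (a_{i,j})$ be an $m \times n$ matrix. The $i$th row sum of $A$ is
\[
\operatorname{row}_i(A) = \sum_{j=1}^n a_{i,j}.
\]
The $j$th column sum of $A$ is
\[
\operatorname{col}_j(A) = \sum_{i=1}^m a_{i,j}.
\]
The matrix $A$ is row stochastic if it is nonnegative and $\operatorname{row}_i(A) = 1$ for all $i \in \{1,\ldots,m\}$. The matrix $A$ is column stochastic if it is nonnegative and $\operatorname{col}_j(A) = 1$ for all $j \in \{1,\ldots,n\}$. The matrix $A$ is doubly stochastic if it is both row stochastic and column stochastic.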
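The definitions above translate directly into a few predicates. The following is a minimal sketch in Python, assuming a matrix is stored as a list of rows with exact `Fraction` entries. The function names and the sample matrix are illustrative choices made here and do not come from the paper.

```python
from fractions import Fraction as F

def row_sums(A):
    """Row sums of a matrix given as a list of rows."""
    return [sum(row) for row in A]

def col_sums(A):
    """Column sums of a matrix given as a list of rows."""
    return [sum(col) for col in zip(*A)]

def is_nonnegative(A):
    return all(a >= 0 for row in A for a in row)

def is_row_stochastic(A):
    return is_nonnegative(A) and all(s == 1 for s in row_sums(A))

def is_col_stochastic(A):
    return is_nonnegative(A) and all(s == 1 for s in col_sums(A))

def is_doubly_stochastic(A):
    return is_row_stochastic(A) and is_col_stochastic(A)

# Illustrative matrix (an assumption of this sketch): rows sum to 1,
# but the column sums are 3/4 and 5/4.
A = [[F(1, 2), F(1, 2)],
     [F(1, 4), F(3, 4)]]
print(is_row_stochastic(A), is_col_stochastic(A))  # True False
```

Exact rational arithmetic is used so that the tests `s == 1` are meaningful; with floating point entries a tolerance would be needed instead.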
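Let $A$ be a nonnegative $m \times n$ matrix whose row sums satisfy $\operatorname{row}_i(A) > 0$ and whose column sums satisfy $\operatorname{col}_j(A) > 0$ for all $i$ and $j$. Define the $m \times m$ positive diagonal matrix
\[
X(A) = \operatorname{diag}\left( \frac{1}{\operatorname{row}_1(A)}, \ldots, \frac{1}{\operatorname{row}_m(A)} \right).
\]
Multiplying $A$ on the left by $X(A)$ multiplies each coordinate in the $i$th row of $A$ by $1/\operatorname{row}_i(A)$, and so
\[
\operatorname{row}_i(X(A)A) = \sum_{j=1}^n \frac{a_{i,j}}{\operatorname{row}_i(A)} = \frac{\operatorname{row}_i(A)}{\operatorname{row}_i(A)} = 1
\]
for all $i \in \{1,\ldots,m\}$. The process of multiplying $A$ on the left by $X(A)$ to obtain the row stochastic matrix $X(A)A$ is called row scaling. We have $X(A)A = A$ if and only if $A$ is row stochastic if and only if $X(A) = I$. Note that the row stochastic matrix $X(A)A$ is not necessarily column stochastic.
Similarly, we define the $n \times n$ positive diagonal matrix
\[
Y(A) = \operatorname{diag}\left( \frac{1}{\operatorname{col}_1(A)}, \ldots, \frac{1}{\operatorname{col}_n(A)} \right).
\]
Multiplying $A$ on the right by $Y(A)$ multiplies each coordinate in the $j$th column of $A$ by $1/\operatorname{col}_j(A)$, and so
\[
\operatorname{col}_j(AY(A)) = \sum_{i=1}^m \frac{a_{i,j}}{\operatorname{col}_j(A)} = \frac{\operatorname{col}_j(A)}{\operatorname{col}_j(A)} = 1
\]
for all $j \in \{1,\ldots,n\}$. The process of multiplying $A$ on the right by $Y(A)$ to obtain the column stochastic matrix $AY(A)$ is called column scaling. We have $AY(A) = A$ if and only if $Y(A) = I$ if and only if $A$ is column stochastic. The column stochastic matrix $AY(A)$ is not necessarily row stochastic.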
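Row scaling and column scaling can be written without forming the diagonal scaling matrices: dividing each row by its row sum has the same effect as the left multiplication, and dividing each column by its column sum has the same effect as the right multiplication. The sketch below assumes a list-of-rows representation with `Fraction` entries. The names `row_scale` and `col_scale` and the sample matrix are illustrative and not from the paper.

```python
from fractions import Fraction as F

def row_scale(A):
    """Divide every entry by the sum of its row (row scaling)."""
    return [[a / sum(row) for a in row] for row in A]

def col_scale(A):
    """Divide every entry by the sum of its column (column scaling)."""
    sums = [sum(col) for col in zip(*A)]
    return [[a / s for a, s in zip(row, sums)] for row in A]

# Illustrative positive matrix (an assumption of this sketch).
A = [[F(1), F(2)],
     [F(3), F(4)]]

R = row_scale(A)  # every row of R sums to 1
C = col_scale(A)  # every column of C sums to 1

print([sum(row) for row in R])        # row sums of the row scaled matrix
print([sum(col) for col in zip(*C)])  # column sums of the column scaled matrix
print([sum(row) for row in C])        # row sums of C need not equal 1
```

In this example the column scaled matrix has first row sum 1/4 + 1/3 = 7/12, which illustrates that a column stochastic matrix is not necessarily row stochastic.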
Let $A$ be a positive $n \times n$ matrix. Alternately row scaling and column scaling the matrix $A$ produces an infinite sequence of matrices that converges to a doubly stochastic matrix $S(A)$. This result (due to Sinkhorn [7], Sinkhorn and Knopp [8], Brualdi, Parter, and Schneider [1], Menon [4], Letac [3], Tverberg [9], and others) is classical.
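The alternate scaling process can be simulated numerically. The following is a rough sketch in floating point, assuming a stopping rule that the paper does not specify: iteration stops once every row sum and column sum is within a tolerance `tol` of 1. The function name, the tolerance, the iteration cap, and the sample matrix are all illustrative assumptions.

```python
def row_scale(A):
    """Divide every entry by the sum of its row."""
    return [[a / sum(row) for a in row] for row in A]

def col_scale(A):
    """Divide every entry by the sum of its column."""
    sums = [sum(col) for col in zip(*A)]
    return [[a / s for a, s in zip(row, sums)] for row in A]

def alternate_scaling(A, tol=1e-12, max_scalings=10_000):
    """Alternate row and column scaling, starting with a row scaling.

    Returns the final matrix and the number of scalings performed.
    Stops when all row and column sums are within tol of 1, or when
    max_scalings is reached.
    """
    scalings = 0
    while scalings < max_scalings:
        A = row_scale(A) if scalings % 2 == 0 else col_scale(A)
        scalings += 1
        sums = [sum(r) for r in A] + [sum(c) for c in zip(*A)]
        if all(abs(s - 1.0) <= tol for s in sums):
            break
    return A, scalings

# Illustrative positive matrix (an assumption of this sketch).
S, scalings = alternate_scaling([[1.0, 2.0], [3.0, 4.0]])
print(S, scalings)
```

A floating point run only approximates the limit. Deciding whether a matrix reaches a doubly stochastic matrix after finitely many scalings requires exact arithmetic.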
Nathanson [5, 6] proved that if $A$ is a $2 \times 2$ positive matrix that is not doubly stochastic but becomes doubly stochastic after a finite number $L$ of scalings, then $L$ is at most 2, and the $2 \times 2$ row stochastic matrices that require exactly one scaling were computed explicitly. An open question was to describe $n \times n$ matrices with $n \geq 3$ that are not doubly stochastic but become doubly stochastic after finitely many scalings. Ekhad and Zeilberger [2] discovered the following row stochastic but not column stochastic $3 \times 3$ matrix, which requires exactly one column scaling to become doubly stochastic:
[TABLE]
Column scaling produces the doubly stochastic matrix
[TABLE]
The following construction generalizes this example. For every $n \geq 3$, there is a two parameter family of row stochastic $n \times n$ matrices that require exactly one column scaling to become doubly stochastic.
Let be an matrix. For , we denote the th row of by
[TABLE]
Theorem 1.
Let and be positive integers, and let . Let and be positive real numbers such that
[TABLE]
and let
[TABLE]
The matrix such that
[TABLE]
is row stochastic but not column stochastic. The matrix obtained from after one column scaling is doubly stochastic.
Proof.
If
[TABLE]
then
[TABLE]
If
[TABLE]
then
[TABLE]
Thus, the matrix is row stochastic.
If
[TABLE]
then
[TABLE]
and if
[TABLE]
then
[TABLE]
Thus, the matrix is not column stochastic.
The column scaling matrix for is the positive diagonal matrix
[TABLE]
For the column scaled matrix , we have the following row sums. If
[TABLE]
then
[TABLE]
If
[TABLE]
then
[TABLE]
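Thus, the column scaled matrix is row stochastic. Because column scaling always produces a column stochastic matrix, the column scaled matrix is doubly stochastic. This completes the proof. ∎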
For example, let and , and let be positive real numbers such that
[TABLE]
[TABLE]
The matrix
[TABLE]
is row stochastic but not column stochastic. By Theorem 1, column scaling produces a doubly stochastic matrix. Choosing and , we obtain the matrix (1).
Let , , and . Choosing
[TABLE]
we obtain the row but not column stochastic matrix
[TABLE]
Column scaling produces the doubly stochastic matrix
[TABLE]
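The conclusion of Theorem 1 can be checked mechanically for any candidate matrix: verify that it is row stochastic and not column stochastic, column scale it once, and test whether the result is doubly stochastic. The sketch below uses exact rational arithmetic. The $3 \times 3$ matrix in it is an illustrative example chosen for this sketch; it is not claimed to be matrix (1) or to be written in the parametrization of Theorem 1.

```python
from fractions import Fraction as F

def col_scale(A):
    """Divide every entry by the sum of its column."""
    sums = [sum(col) for col in zip(*A)]
    return [[a / s for a, s in zip(row, sums)] for row in A]

def is_row_stochastic(A):
    return all(a >= 0 for row in A for a in row) and all(sum(row) == 1 for row in A)

def is_col_stochastic(A):
    return all(a >= 0 for row in A for a in row) and all(sum(col) == 1 for col in zip(*A))

# Illustrative matrix (an assumption of this sketch).
# Row sums are 1; column sums are 6/5, 3/5, 6/5.
A = [[F(1, 5), F(1, 5), F(3, 5)],
     [F(2, 5), F(1, 5), F(2, 5)],
     [F(3, 5), F(1, 5), F(1, 5)]]

B = col_scale(A)

print(is_row_stochastic(A), is_col_stochastic(A))  # True False
print(is_row_stochastic(B), is_col_stochastic(B))  # True True
```

Here the column scaled matrix has rows (1/6, 1/3, 1/2), (1/3, 1/3, 1/3), and (1/2, 1/3, 1/6), each of which sums to 1.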
Theorem 2.
Every matrix $A$ constructed in Theorem 1 satisfies $\det(A) = 0$.
Proof.
There are three cases.
If or , then has two equal columns and .
If or , then has two equal rows and .
If and , then
[TABLE]
and
[TABLE]
This completes the proof. ∎
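The two structural facts used in the proof, that a matrix with two equal columns or two equal rows has determinant 0, are easy to confirm on small examples with exact arithmetic. The following sketch computes determinants by cofactor expansion along the first row, which is adequate only for small matrices. The function name `det` and all three sample matrices are illustrative assumptions, not matrices from the paper.

```python
from fractions import Fraction as F

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

# Row stochastic matrix with two equal columns (illustrative).
equal_cols = [[F(1, 4), F(1, 4), F(1, 2)],
              [F(1, 3), F(1, 3), F(1, 3)],
              [F(1, 8), F(1, 8), F(3, 4)]]

# Row stochastic matrix with no equal rows or columns whose first and
# third rows add up to twice the second row (illustrative).
dependent_rows = [[F(1, 5), F(1, 5), F(3, 5)],
                  [F(2, 5), F(1, 5), F(2, 5)],
                  [F(3, 5), F(1, 5), F(1, 5)]]

# Row stochastic matrix with nonzero determinant, for contrast (illustrative).
invertible = [[F(1, 2), F(1, 3), F(1, 6)],
              [F(1, 6), F(1, 2), F(1, 3)],
              [F(1, 4), F(1, 4), F(1, 2)]]

print(det(equal_cols), det(dependent_rows), det(invertible))
```

The second matrix shows that a determinant can vanish even when no two rows and no two columns are equal, which is why the proof needs a separate case.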
Theorem 2 is important for the following reason. Let be an positive matrix. If , then (by solving a system of linear equations) there exist a unique diagonal matrix and a column stochastic matrix such that . If for all , then is invertible. Setting , we have . Because is positive, the matrices and can be adjusted so that is a positive diagonal matrix and is a positive matrix. If is row stochastic, then is the row scaling matrix associated to . Thus, if is a row stochastic matrix such that column scaling produces a doubly stochastic matrix, then we have pulled back to a column stochastic matrix , and we have increased by 1 the number of scalings needed to get a doubly stochastic matrix.
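The pull-back step described above amounts to solving one linear system. A hedged sketch: for an invertible row stochastic matrix, look for row multipliers $d_1,\ldots,d_n$ such that the matrix with rows multiplied by $d_i$ has all column sums equal to 1. This means solving the transposed system with right-hand side $(1,\ldots,1)$. The names `solve`, `pull_back`, and `d`, and the sample matrix, are assumptions made for this sketch. The sample matrix is not claimed to become doubly stochastic after one column scaling.

```python
from fractions import Fraction as F

def solve(M, b):
    """Solve M x = b exactly by Gauss-Jordan elimination.

    Entries must be Fractions and M must be invertible.
    """
    n = len(M)
    aug = [list(row) + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        # Find a row at or below position c with a nonzero pivot.
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def pull_back(A):
    """Return (d, B), where B has rows d[i] * A[i] and column sums 1."""
    n = len(A)
    transpose = [list(col) for col in zip(*A)]
    d = solve(transpose, [F(1)] * n)
    B = [[d[i] * a for a in A[i]] for i in range(n)]
    return d, B

# Illustrative invertible row stochastic matrix (an assumption of this sketch).
A = [[F(1, 2), F(1, 3), F(1, 6)],
     [F(1, 6), F(1, 2), F(1, 3)],
     [F(1, 4), F(1, 4), F(1, 2)]]

d, B = pull_back(A)
print(d)
print([sum(col) for col in zip(*B)])  # column sums of B
```

When every multiplier is positive, as in this example, the matrix `B` is positive and column stochastic, and row scaling `B` returns `A`. If some multiplier were zero or negative, `B` would not be a positive matrix and the pull-back would fail.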
Unfortunately, by Theorem 2, the matrices constructed above all have determinant 0.
2. Open problems
(1) Does there exist a positive row stochastic but not column stochastic matrix $A$ with nonzero determinant such that $A$ becomes doubly stochastic after one column scaling?
(2) Let $A$ be a positive row stochastic but not column stochastic matrix that becomes doubly stochastic after one column scaling. Does imply that $A$ has the shape of matrix (4)?
(3) Here is the inverse problem: Let $A$ be an $n \times n$ row stochastic matrix. Does there exist a column stochastic matrix $B$ such that row scaling $B$ produces $A$ (equivalently, such that multiplying $B$ on the left by its row scaling matrix gives $A$)? Compute $B$.
(4) Modify the above problems so that the matrices are required to have rational coordinates.
(5) Determine if, for positive integers $L$ and $n$, there exists a positive $n \times n$ matrix that requires exactly $L$ scalings to reach a doubly stochastic matrix.
(6) Classify all matrices for which the alternate scaling algorithm terminates in finitely many steps.
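Questions about the exact number of scalings can be explored experimentally with rational arithmetic. The sketch below counts the scalings that actually change the matrix before it becomes exactly doubly stochastic. It assumes the process starts with a row scaling, and it gives up after a small number of steps, because the numerators and denominators of exact iterates can grow very quickly. The name `scalings_needed`, the step cap, and the sample matrices are illustrative assumptions.

```python
from fractions import Fraction as F

def row_scale(A):
    """Divide every entry by the sum of its row."""
    return [[a / sum(row) for a in row] for row in A]

def col_scale(A):
    """Divide every entry by the sum of its column."""
    sums = [sum(col) for col in zip(*A)]
    return [[a / s for a, s in zip(row, sums)] for row in A]

def is_doubly_stochastic(A):
    return (all(sum(row) == 1 for row in A)
            and all(sum(col) == 1 for col in zip(*A)))

def scalings_needed(A, max_steps=6):
    """Number of scalings that change A before it is doubly stochastic.

    Returns None if the matrix is still not doubly stochastic after
    max_steps alternating steps (row scaling first).
    """
    count = 0
    for step in range(max_steps + 1):
        if is_doubly_stochastic(A):
            return count
        scaled = row_scale(A) if step % 2 == 0 else col_scale(A)
        if scaled != A:
            count += 1
        A = scaled
    return None

# Illustrative matrices (assumptions of this sketch).
one_step = [[F(1, 5), F(1, 5), F(3, 5)],
            [F(2, 5), F(1, 5), F(2, 5)],
            [F(3, 5), F(1, 5), F(1, 5)]]
generic = [[F(1), F(2)],
           [F(3), F(4)]]

print(scalings_needed(one_step))  # 1
print(scalings_needed(generic))   # None
```

A return value of `None` only says that the search gave up. For the second matrix, however, the limit has irrational entries, so no rational iterate can be doubly stochastic.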
References
- [1] R. A. Brualdi, S. V. Parter, and H. Schneider, The diagonal equivalence of a nonnegative matrix to a stochastic matrix, J. Math. Anal. Appl. 16 (1966), 31–50.
- [2] S. B. Ekhad and D. Zeilberger, Answers to some questions about explicit Sinkhorn limits posed by Mel Nathanson, arXiv:1902.10783, 2019.
- [3] G. Letac, A unified treatment of some theorems on positive matrices, Proc. Amer. Math. Soc. 43 (1974), 11–17.
- [4] M. V. Menon, Reduction of a matrix with positive elements to a doubly stochastic matrix, Proc. Amer. Math. Soc. 18 (1967), 244–247.
- [5] M. B. Nathanson, Alternate minimization and doubly stochastic matrices, arXiv:1812.11935, 2018.
- [6] M. B. Nathanson, Matrix scaling, explicit Sinkhorn limits, and arithmetic, arXiv:1902.04544, 2019.
- [7] R. Sinkhorn, A relationship between arbitrary positive matrices and doubly stochastic matrices, Ann. Math. Statist. 35 (1964), 876–879.
- [8] R. Sinkhorn and P. Knopp, Concerning nonnegative matrices and doubly stochastic matrices, Pacific J. Math. 21 (1967), 343–348.
