Fast Differentiable Matrix Square Root and Inverse Square Root
Yue Song, Nicu Sebe, Wei Wang

TL;DR
This paper introduces two efficient methods for computing differentiable matrix square roots and inverse square roots, significantly speeding up computations in computer vision tasks while maintaining or improving performance.
Contribution
The authors propose two methods for the forward pass of the differentiable matrix square root and inverse square root, the Matrix Taylor Polynomial (MTP) and the Matrix Padé Approximant (MPA), both of which run faster than the traditional SVD and Newton-Schulz approaches.
Findings
Both proposed methods are faster than SVD and the Newton-Schulz (NS) iteration.
Achieve competitive or better performance in real-world vision tasks.
Validated across multiple applications like batch normalization and vision transformers.
Abstract
Computing the matrix square root and its inverse in a differentiable manner is important in a variety of computer vision tasks. Previous methods either adopt the Singular Value Decomposition (SVD) to explicitly factorize the matrix or use the Newton-Schulz iteration (NS iteration) to derive the approximate solution. However, neither method is computationally efficient enough in both the forward pass and the backward pass. In this paper, we propose two more efficient variants to compute the differentiable matrix square root and the inverse square root. For the forward propagation, one method is to use Matrix Taylor Polynomial (MTP), and the other method is to use Matrix Padé Approximants (MPA). The backward gradient is computed by iteratively solving the continuous-time Lyapunov equation using the matrix sign function. A series of numerical tests show that both methods yield…
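The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and the function names `sqrtm_taylor` and `lyapunov_via_sign`, the default `degree` and `iters` values, and the normalization by the Frobenius norm are assumptions made for illustration. The forward pass truncates the power series of (I - Z)^(1/2) for a symmetric positive definite input. The backward pass solves the Lyapunov equation B X + X B = C by reading X off the matrix sign of a block matrix. The abstract does not say which sign iteration is used, so this sketch uses the classical Newton iteration with an explicit inverse for brevity.

```python
import numpy as np


def sqrtm_taylor(A, degree=20):
    """Approximate A^(1/2) for symmetric positive definite A (illustrative sketch).

    Writes A = ||A||_F * (I - Z) with Z = I - A / ||A||_F, whose eigenvalues
    lie in [0, 1), and truncates the series
    (I - Z)^(1/2) = sum_k binom(1/2, k) * (-Z)^k at the given degree.
    """
    n = A.shape[0]
    norm = np.linalg.norm(A, "fro")
    Z = np.eye(n) - A / norm
    result = np.eye(n)   # k = 0 term
    power = np.eye(n)    # running value of (-Z)^k
    coeff = 1.0          # running value of binom(1/2, k)
    for k in range(1, degree + 1):
        coeff *= (0.5 - (k - 1)) / k
        power = power @ (-Z)
        result = result + coeff * power
    return np.sqrt(norm) * result


def lyapunov_via_sign(B, C, iters=30):
    """Solve B X + X B = C for X, assuming all eigenvalues of B are positive.

    Uses sign([[B, C], [0, -B]]) = [[I, 2X], [0, -I]]. The sign is computed
    with the classical Newton iteration H <- (H + H^-1) / 2, an assumption
    of this sketch rather than a detail taken from the abstract.
    """
    n = B.shape[0]
    H = np.block([[B, C], [np.zeros((n, n)), -B]])
    for _ in range(iters):
        H = 0.5 * (H + np.linalg.inv(H))
    return 0.5 * H[:n, n:]
```

For a symmetric square root S = A^(1/2) and an upstream gradient G with respect to S, calling `lyapunov_via_sign(S, G)` returns the X that satisfies S X + X S = G, which is the gradient with respect to A. The Taylor series converges slowly when A is ill-conditioned, because the largest eigenvalue of Z then approaches 1, so a higher `degree` may be needed in that case.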
Taxonomy
Topics: Matrix Theory and Algorithms · Model Reduction and Neural Networks · Electromagnetic Scattering and Analysis
