Fast Evaluation of Truncated Neumann Series by Low-Product Radix Kernels
Piyush Sao

TL;DR
This paper develops new low-product radix kernels for efficiently evaluating truncated Neumann series, significantly reducing the number of matrix multiplications needed in dense matrix computations.
Contribution
It constructs the first exact kernel beyond radix 5, a 3-product radix-9 kernel with rational coefficients, and introduces a residual-based framework for approximate kernels, improving evaluation efficiency.
Findings
The radix-9 kernel reduces the matrix-product count by 21% compared to repeated squaring.
Numerical optimization yields a radix-15 kernel with minimal spillover.
Experiments confirm theoretical product-count savings and runtime improvements.
Abstract
Truncated Neumann series $S_k(A)=I+A+\cdots+A^{k-1}$ are used in approximate matrix inversion and polynomial preconditioning. In dense settings, matrix-matrix products dominate the cost of evaluating $S_k$. Naive evaluation needs $k-1$ products, while splitting methods reduce this to $O(\log k)$. Repeated squaring, for example, uses $2\log_2 k$ products, so further gains require higher-radix kernels that extend the series by $m$ terms per update. Beyond the known radix-5 kernel, explicit higher-radix constructions were not available, and the existence of exact rational kernels was unclear. We construct radix kernels for $T_m(B)=I+B+\cdots+B^{m-1}$ and use them to build faster series algorithms. For radix 9, we derive an exact 3-product kernel with rational coefficients, which is the first exact construction beyond radix 5. This kernel yields $5\log_9 k=1.58\log_2…
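To make the product counts in the abstract concrete, the following minimal sketch compares naive (Horner) evaluation of the truncated series I + A + ... + A^(k-1) with repeated squaring, which spends two products per doubling. It does not reproduce the paper's radix-9 kernel; it only illustrates the baseline that kernel is measured against and checks the quoted 1.58 and 21% figures arithmetically. The helper names (`identity`, `matmul`, `add`, `neumann_horner`, `neumann_squaring`) are hypothetical, and exact rational arithmetic on a tiny matrix is an assumption made so the two results can be compared exactly.

```python
from fractions import Fraction
from math import log2

def identity(n):
    # n x n identity matrix with exact rational entries
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    # Plain dense matrix product on lists of rows
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def neumann_horner(A, k):
    """Naive evaluation by Horner's rule; returns (S_k, product count)."""
    I = identity(len(A))
    S, products = I, 0
    for _ in range(k - 1):
        S = add(I, matmul(A, S))  # S_{j+1} = I + A S_j
        products += 1             # counts the trivial first product A @ I too
    return S, products

def neumann_squaring(A, k):
    """Repeated squaring for k a power of two; returns (S_k, product count)."""
    if k < 1 or k & (k - 1):
        raise ValueError("k must be a power of two in this sketch")
    S, P, products, j = identity(len(A)), A, 0, 1  # S_1 = I, P = A^1
    while j < k:
        S = add(S, matmul(P, S))  # S_{2j} = S_j + A^j S_j doubles the term count
        P = matmul(P, P)          # A^j -> A^(2j); the last squaring is unused
        products += 2
        j *= 2
    return S, products

A = [[Fraction(1, 2), Fraction(1, 3)], [Fraction(1, 5), Fraction(1, 7)]]
S_naive, n_naive = neumann_horner(A, 16)
S_fast, n_fast = neumann_squaring(A, 16)

# Radix-9 count from the abstract: 5 log_9 k = (5 / log2 9) log_2 k
radix9_per_log2 = 5 / log2(9)
saving = 1 - radix9_per_log2 / 2  # relative to 2 log_2 k for repeated squaring
```

For k = 16 the simple loop above spends 15 products in the naive variant and 8 in the squaring variant, and both return the same matrix. The ratio 5 / log2(9) rounds to 1.58, which is where the roughly 21% saving over repeated squaring comes from.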
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms · Tensor decomposition and applications
