TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks

Xiang Meng; Mehdi Makni; Rahul Mazumder

arXiv:2505.23949 · cs.LG · June 2, 2025

TL;DR

This paper presents a scalable, efficient algorithm for generating transposable N:M sparse masks in neural networks, enabling better hardware acceleration and compression without sacrificing model performance.

Contribution

We introduce a novel, scalable solver for transposable N:M masks using optimal transport, enabling application to billion-parameter models and arbitrary N:M ratios.

Findings

01 Achieves up to 100x speedup over existing methods.

02 Maintains model performance close to dense models with 16:32 sparsity.

03 Outperforms standard 2:4 sparse models in experiments.

Abstract

Network pruning reduces the computational requirements of large neural networks, with N:M sparsity -- retaining only N out of every M consecutive weights -- offering a compelling balance between compressed model quality and hardware acceleration. However, N:M sparsity only accelerates forward-pass computations, as N:M patterns are not preserved during matrix transposition, limiting efficiency during training, where both the forward and backward passes are computationally intensive. While transposable N:M sparsity has been proposed to address this limitation, existing methods for finding transposable N:M sparse masks either fail to scale to large models or are restricted to M=4, which results in a suboptimal compression-accuracy trade-off. We introduce an efficient solver for transposable N:M masks that scales to billion-parameter models. We formulate mask generation as optimal transport problems and solve through…
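To make the abstract's definitions concrete, here is a minimal NumPy sketch of a standard magnitude-based N:M mask and a check for whether a mask is transposable. It is not the paper's solver: the function names (`nm_mask_rows`, `is_nm`, `is_transposable_nm`) and the example matrix are hypothetical, and "valid" is assumed to mean at most N nonzeros in every group of M consecutive entries along a row.

```python
import numpy as np

def nm_mask_rows(w, n, m):
    """Keep the n largest-magnitude entries in each group of m consecutive
    weights along every row (standard, non-transposable N:M pruning)."""
    rows, cols = w.shape
    assert cols % m == 0, "number of columns must be divisible by m"
    groups = np.abs(w).reshape(rows, cols // m, m)
    order = np.argsort(-groups, axis=-1)           # descending by magnitude
    mask = np.zeros_like(groups, dtype=bool)
    np.put_along_axis(mask, order[..., :n], True, axis=-1)
    return mask.reshape(rows, cols)

def is_nm(mask, n, m):
    """True if every group of m consecutive entries along each row has
    at most n nonzeros (assumed validity criterion)."""
    rows, cols = mask.shape
    if cols % m != 0:
        return False
    counts = mask.reshape(rows, cols // m, m).sum(axis=-1)
    return bool((counts <= n).all())

def is_transposable_nm(mask, n, m):
    """A mask is transposable N:M if both it and its transpose are N:M."""
    return is_nm(mask, n, m) and is_nm(mask.T, n, m)

# Hypothetical weights: the two largest entries of every row sit in columns 0 and 1.
w = np.array([[9.0, 8.0, 1.0, 2.0],
              [7.0, 6.0, 2.0, 1.0],
              [9.0, 5.0, 1.0, 3.0],
              [8.0, 7.0, 3.0, 2.0]])

mask = nm_mask_rows(w, n=2, m=4)

# A hand-built mask with exactly two nonzeros in every row and every column.
block = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=bool)

print(is_nm(mask, 2, 4), is_nm(mask.T, 2, 4), is_transposable_nm(block, 2, 4))
```

In this example the row-wise 2:4 mask keeps columns 0 and 1 in every row, so its transpose has rows that are entirely nonzero and fails the 2:4 check, while the block-structured mask passes in both orientations. Finding a transposable mask that also preserves as much weight magnitude as possible is the harder problem the paper addresses.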

Videos

TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks · SlidesLive

Taxonomy

Topics: Vehicle License Plate Recognition · DNA and Biological Computing · graph theory and CDMA systems

Methods: Entropy Regularization · Pruning