DOTResize: Reducing LLM Width via Discrete Optimal Transport-based Neuron Merging

Neha Verma; Kenton Murray; Kevin Duh

arXiv:2507.04517·cs.LG·February 26, 2026

Open Access

TL;DR

DOTResize introduces a novel method for compressing large language models by using optimal transport to reproject and merge neurons, preserving useful information and reducing computational costs.

Contribution

The paper proposes a neuron width reduction technique based on discrete optimal transport. Rather than discarding neurons, as pruning methods do, it recombines their information into the reduced model.

Findings

1. Achieves measurable reductions in computational cost.
2. Serves as an effective add-on to existing pruning techniques.
3. Maintains model performance while reducing size.

Abstract

Structured pruning methods designed for Large Language Models (LLMs) generally focus on identifying and removing the least important components to optimize model size. However, in this work, we question this prevalent approach by instead exploring how to recombine information from structures designated for pruning back into the reduced model. We focus specifically on neuron width reduction, frame this problem as a Discrete Optimal Transport problem, and propose DOTResize, a novel Transformer compression method that uses optimal transport theory to transform and compress model width. To ensure applicability within the Transformer architecture, we motivate and incorporate necessary entropic regularization and matrix factorization techniques into the transportation maps produced by our method. Unlike pruning-based approaches, which discard neurons based on importance measures, DOTResize…
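To make the idea of merging neurons with entropic-regularized discrete optimal transport more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names `sinkhorn` and `merge_map`, the choice of target neurons by index (`keep_idx`), the squared-distance cost on activations, and the uniform marginals are all assumptions made for illustration, and the matrix factorization step mentioned in the abstract is omitted.

```python
import numpy as np

def sinkhorn(cost, a, b, reg=0.1, n_iters=200):
    """Entropic-regularized OT plan between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

def merge_map(acts, keep_idx, reg=0.1):
    """Hypothetical helper: build a (d, k) map mixing d source neurons into k targets.

    acts: (n_samples, d) activations; keep_idx: indices of the k target neurons
    (how targets are chosen is an assumption of this sketch).
    """
    src = acts.T                     # (d, n_samples)
    tgt = acts[:, keep_idx].T        # (k, n_samples)
    # Mean squared difference between activation profiles, scaled to [0, 1].
    cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).mean(axis=-1)
    cost = cost / cost.max()
    d, k = cost.shape
    plan = sinkhorn(cost, np.full(d, 1.0 / d), np.full(k, 1.0 / k), reg)
    # Normalize columns so each target is a convex combination of sources.
    return plan / plan.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 8))     # toy activations for 8 neurons
keep_idx = np.array([0, 2, 4, 6])    # reduce width from 8 to 4
T = merge_map(acts, keep_idx)
merged = acts @ T                    # (256, 4) merged activations
```

In this sketch every source neuron contributes some mass to the reduced layer, in contrast to pruning, where the columns of `acts` outside `keep_idx` would simply be dropped. A smaller `reg` gives a sharper, closer-to-one-to-one plan, while a larger `reg` spreads mass more evenly.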

Taxonomy

Topics: Natural Language Processing Techniques