Model Compression with Exact Budget Constraints via Riemannian Manifolds

Michael Helcig; Dan Alistarh

arXiv:2605.00649·cs.LG·May 8, 2026

TL;DR

This paper introduces Riemannian Constrained Optimization (RCO), a novel method for model compression under exact budget constraints that leverages the geometry of a smooth manifold to enable efficient first-order optimization.

Contribution

The paper proposes RCO, a new optimization approach that enforces exact budget constraints in model compression by exploiting Riemannian manifold geometry, avoiding hyperparameter tuning.

Findings

01. RCO matches or exceeds the state of the art on synthetic and LLM benchmarks.

02. RCO often requires less wall-clock time than existing methods.

03. The method enables direct optimization of the true loss under exact budget constraints.

Abstract

Assigning one of K options to each of N groups under a total cost budget is a recurring problem in efficient AI, including mixed-precision quantization, non-uniform pruning, and expert selection. The objective, typically model loss, depends jointly on all assignments and does not decompose across groups, preventing combinatorial solvers from directly optimizing the true objective and forcing reliance on proxy formulations. Methods such as evolutionary search evaluate the actual loss but lack gradient information, while penalty-based approaches enforce the budget only approximately and often require extensive hyperparameter tuning. We present a new approach by showing that, under softmax relaxation, the budget constraint defines a smooth Riemannian manifold in logit space with unusually simple geometry. The normal vector admits a closed-form expression, shifting logits along the cost…
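
The geometry described in the abstract can be illustrated with a minimal sketch. It assumes the relaxed budget is the expected cost under per-group softmax probabilities, C(θ) = Σ_i Σ_k p_ik c_ik, in which case the normal to the constraint surface C(θ) = B is the gradient p_ik (c_ik − E_i[c]). The function names, the cost table, and the Euclidean tangent projection are illustrative assumptions, not taken from the paper or the RCO repository.

```python
import math


def softmax(row):
    """Numerically stable softmax over one group's logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def expected_cost(logits, costs):
    """Relaxed budget (assumed form): sum_i sum_k p[i][k] * costs[i][k]."""
    return sum(
        p * c
        for row, cost_row in zip(logits, costs)
        for p, c in zip(softmax(row), cost_row)
    )


def budget_normal(logits, costs):
    """Gradient of expected_cost w.r.t. the logits: p_ik * (c_ik - E_i[c])."""
    normal = []
    for row, cost_row in zip(logits, costs):
        probs = softmax(row)
        mean_cost = sum(p * c for p, c in zip(probs, cost_row))
        normal.append([p * (c - mean_cost) for p, c in zip(probs, cost_row)])
    return normal


def project_to_tangent(grad, normal):
    """Remove the component of grad along normal (Euclidean metric).

    Assumes costs differ within at least one group, so normal is nonzero.
    """
    dot = sum(g * n for gr, nr in zip(grad, normal) for g, n in zip(gr, nr))
    norm_sq = sum(n * n for nr in normal for n in nr)
    scale = dot / norm_sq
    return [[g - scale * n for g, n in zip(gr, nr)] for gr, nr in zip(grad, normal)]


# Hypothetical example: 2 groups, 3 options each, costs of 1, 2, and 4 units.
logits = [[0.0, 1.0, -0.5], [0.3, -0.2, 0.8]]
costs = [[1.0, 2.0, 4.0], [1.0, 2.0, 4.0]]
loss_grad = [[0.5, -1.0, 0.2], [0.1, 0.4, -0.3]]  # stand-in for a real loss gradient

normal = budget_normal(logits, costs)
tangent_step = project_to_tangent(loss_grad, normal)
```

A step along `tangent_step` leaves the expected cost unchanged to first order, which is why a first-order method can follow the loss gradient while staying near the budget surface. How the method returns exactly to the surface after a finite step is not covered by this sketch.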

Code & Models

Repositories

IST-DASLab/RCO (GitHub)

Models

ISTA-DASLab/Qwen3-Coder-Next-RCO-pruned (Hugging Face model)
