Compression Aware Certified Training

Changming Xu; Gagandeep Singh

arXiv:2506.11992·cs.LG·June 16, 2025

Open Access·3 Reviews

TL;DR

CACTUS is a unified training framework that accounts for compression (pruning and quantization) during certified training, so that the resulting networks retain high certified accuracy after they are compressed.

Contribution

The paper introduces CACTUS, a method that combines compression with certified robustness training and outperforms existing approaches in accuracy and certified robustness across multiple datasets.

Findings

01

CACTUS maintains high certified accuracy after compression.

02

It achieves state-of-the-art results for pruning and quantization.

03

CACTUS is effective across a variety of datasets and input specifications.

Abstract

Deep neural networks deployed in safety-critical, resource-constrained environments must balance efficiency and robustness. Existing methods treat compression and certified robustness as separate goals, compromising either efficiency or safety. We propose CACTUS (Compression Aware Certified Training Using network Sets), a general framework for unifying these objectives during training. CACTUS models maintain high certified accuracy even when compressed. We apply CACTUS for both pruning and quantization and show that it effectively trains models which can be efficiently compressed while maintaining high accuracy and certifiable robustness. CACTUS achieves state-of-the-art accuracy and certified performance for both pruning and quantization on a variety of datasets and input specifications.
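The "network sets" idea in the abstract can be pictured as training against a small family of compressed copies of the same network rather than against the dense network alone. The following is a minimal sketch, not the authors' implementation: it assumes the set is built by magnitude pruning at a few ratios and that per-network losses are simply averaged, neither of which the abstract specifies. The names `magnitude_prune`, `compression_set`, `set_loss`, and `loss_fn` are hypothetical, and a toy squared norm stands in for whatever standard or certified loss would be used in practice.

```python
import numpy as np

def magnitude_prune(weights, ratio):
    """Zero out the `ratio` fraction of entries with the smallest magnitude.

    Assumption: plain magnitude pruning; ties at the threshold are all pruned.
    """
    flat = np.abs(weights).ravel()
    k = int(ratio * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude acts as the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(weights) <= threshold] = 0.0
    return pruned

def compression_set(weights, ratios):
    """Hypothetical compression set: one pruned copy per pruning ratio."""
    return [magnitude_prune(weights, r) for r in ratios]

def set_loss(weights, ratios, loss_fn):
    """Average a per-network loss over every member of the compression set."""
    losses = [loss_fn(w) for w in compression_set(weights, ratios)]
    return float(np.mean(losses))

# Toy stand-in for a training loss (assumption, for illustration only).
toy_loss = lambda w: float(np.sum(w ** 2))

w = np.array([0.1, -2.0, 0.5, 3.0])
print(magnitude_prune(w, 0.5))           # the two smallest-magnitude weights are zeroed
print(set_loss(w, [0.0, 0.5], toy_loss))  # mean loss over dense and half-pruned copies
```

Including ratio `0.0` keeps the uncompressed network in the set, so the averaged loss still reflects dense performance alongside the pruned copies.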

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01·Rating 4·Confidence 3

Strengths

Deploying robust models on resource-limited devices is an important and timely research direction. The joint training objective is clearly defined and implemented. The use of compression sets and curriculum-based loss weighting is technically reasonable. Experiments are carefully executed and include ablations (AWP radius, compression-set size). CACTUS consistently outperforms sequential baselines in certified accuracy under compression. The paper is well written, equations are clean, and imp

Weaknesses

The work overlooks Gui et al. (2019), "Model Compression with Adversarial Robustness: A Unified Optimization Framework", which already introduced ATMC, a unified optimization framework combining model compression (pruning and quantization) with adversarial training. While ATMC focused on empirical rather than certified robustness, the underlying idea (joint optimization of robustness and compression) is the same. A clearer connection to this prior line of work would strengthen the paper’s positioning

Reviewer 02·Rating 8·Confidence 4

Strengths

- Clear problem formulation unifying certified training with compression; objective over a compression set is well motivated.
- Theory for quantization: a clean reduction from quantization to weight-bounded perturbations via AWP with a formal upper-bound guarantee.
- Consistent empirical gains under compression: across pruning and quantization, CACTUS improves certified accuracy versus robust baselines; integration with multiple certified-training losses shows method generality.
- Ablations beyo

Weaknesses

- Scope of headline comparisons: By design, CACTUS is strongest when compressed; for $\delta = 0$ or unquantized, SABR typically wins. This is expected but should be emphasized alongside deployment guidance.
- Condition discrepancy: Main text, Theorem 4.1 states $q_{step} \leq \eta$ while Appendix Theorem D.1 states $q_{step} \leq 2\eta$. The bound is fine, yet the precise requirement should be consistently stated.
- Compute cost: Training time overhead is non-trivial; although addressed in Append
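The two step-size conditions Reviewer 02 contrasts can be checked numerically. This is a minimal sketch that assumes a uniform round-to-nearest quantizer without clipping and an elementwise ($\ell_\infty$) weight-perturbation radius $\eta$; the excerpt confirms neither, and `quantize` and `max_quantization_shift` are hypothetical names. Under those assumptions each weight moves by at most $q_{step}/2$, so $q_{step} \leq 2\eta$ is enough to keep the quantized weights within $\eta$ of the originals, and $q_{step} \leq \eta$ is a stricter condition that is also sufficient.

```python
import numpy as np

def quantize(weights, q_step):
    """Uniform round-to-nearest quantization onto the grid {k * q_step}.

    Assumption: no clipping range, so every weight has a nearest grid point.
    """
    return np.round(weights / q_step) * q_step

def max_quantization_shift(weights, q_step):
    """Largest elementwise change that quantization makes to the weights."""
    return float(np.max(np.abs(quantize(weights, q_step) - weights)))

rng = np.random.default_rng(0)
weights = rng.normal(size=10_000)
eta = 0.05  # hypothetical perturbation radius

for q_step in (eta, 2 * eta):
    shift = max_quantization_shift(weights, q_step)
    # Round-to-nearest moves each weight by at most half a step.
    assert shift <= q_step / 2 + 1e-12
    # Hence both conditions keep the quantized weights inside the eta-ball.
    assert shift <= eta + 1e-12
    print(f"q_step={q_step:.2f}  max shift={shift:.4f}  within eta: {shift <= eta + 1e-12}")
```

If the paper's quantizer or perturbation norm differs from these assumptions, the factor of two may not carry over, which is one reason the requirement should be stated consistently.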

Reviewer 03·Rating 2·Confidence 4

Strengths

1. This paper is the first work to address certified robustness for model compression.
2. The background section is well-written, providing a clear foundation to understand the problem.

Weaknesses

1. **Writing and Notation Quality**. The paper suffers from many writing and notation issues that significantly affect readability and clarity.
   - Several key symbols are used before being defined, such as $\theta$, $Q(\cdot)$, $\Delta$, and $\eta$.
   - The notation convention stated in Section 2 is inconsistent with later usage. For example, the authors claim lowercase bold letters denote vectors, but $\theta$, which may represent network parameters, should arguably be bold ($\boldsymbol{\theta}$)

Taxonomy

Topics: Robotic Mechanisms and Dynamics