PQuantML: A Tool for End-to-End Hardware-aware Model Compression
Roope Niemi, Anastasiia Petrovych, Arghya Ranjan Das, Enrico Lupi, Chang Sun, Dimitrios Danopoulos, Marlon Joshua Helbing, Mia Liu, Sebastian Dittmeier, Michael Kagan, Vladimir Loncar, Maurizio Pierini

TL;DR
PQuantML is an open-source, hardware-aware model compression library that simplifies training neural networks with pruning and quantization for deployment in latency-critical environments.
Contribution
It introduces a unified interface for end-to-end hardware-aware model compression, supporting multiple pruning methods and fixed-point quantization.
Findings
Achieves significant parameter and bit-width reduction while maintaining accuracy.
Evaluated on jet tagging, a real-time LHC data processing task.
Outperforms existing tools like QKeras and HGQ in compression effectiveness.
Abstract
PQuantML is a new open-source, hardware-aware neural network model compression library tailored to end-to-end workflows. Motivated by the need to deploy performant models to environments with strict latency constraints, PQuantML simplifies the training of compressed models by providing a unified interface for applying pruning and quantization, either jointly or individually. The library implements multiple pruning methods with different granularities, as well as fixed-point quantization with support for High-Granularity Quantization. We evaluate PQuantML on representative tasks such as jet substructure classification (so-called jet tagging), an on-edge problem related to real-time LHC data processing. Using various pruning methods with fixed-point quantization, PQuantML achieves substantial parameter and bit-width reductions while maintaining accuracy. The resulting compression is further…
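To make the two techniques named in the abstract concrete, here is a minimal pure-Python sketch of unstructured magnitude pruning followed by signed fixed-point quantization. It is not PQuantML's API: the function names `magnitude_prune` and `fixed_point_quantize`, the bit-width convention (total bits including one sign bit), round-to-nearest, and saturation on overflow are all assumptions chosen for illustration, and the weights are made-up values.

```python
import math

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper: unstructured pruning applied once, with no retraining.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices in ascending order of |w|; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]

def fixed_point_quantize(x, total_bits, int_bits):
    """Round x onto a signed fixed-point grid, saturating at the range limits.

    Assumed convention: total_bits = 1 sign bit + int_bits + fractional bits.
    """
    frac_bits = total_bits - 1 - int_bits
    step = 2.0 ** -frac_bits          # spacing between representable values
    lo = -(2.0 ** int_bits)           # most negative representable value
    hi = 2.0 ** int_bits - step       # most positive representable value
    q = math.floor(x / step + 0.5) * step  # round to nearest grid point
    return min(max(q, lo), hi)        # saturate instead of wrapping around

# Illustrative weights only; not taken from the paper.
weights = [0.82, -0.05, 0.31, -1.70, 0.02, 0.47]
sparse = magnitude_prune(weights, sparsity=0.5)
compressed = [fixed_point_quantize(w, total_bits=6, int_bits=1) for w in sparse]
print(compressed)  # [0.8125, 0.0, 0.0, -1.6875, 0.0, 0.5]
```

In this sketch every weight shares one bit width. Calling `fixed_point_quantize` with a different `total_bits` for each weight would mimic the finer granularity that High-Granularity Quantization refers to, although how those bit widths are chosen during training is outside the scope of this example.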
