Inlier-Centric Post-Training Quantization for Object Detection Models

Minsu Kim; Dongyeun Lee; Jaemyung Yu; Jiwan Hur; Giseop Kim; Junmo Kim

arXiv:2602.03472 · cs.CV · February 4, 2026

Open Access · 3 Reviews

TL;DR

This paper introduces InlierQ, a post-training quantization method that separates informative inliers from task-irrelevant anomalies, improving object detection accuracy with only 64 calibration samples.

Contribution

InlierQ is a novel anomaly-aware quantization approach that uses gradient-aware saliency and the EM algorithm to preserve useful features during model compression.
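To make the saliency-plus-EM idea concrete, here is a minimal sketch and not the authors' implementation. It assumes each activation volume has already been reduced to one scalar saliency score, that the scores follow a two-component 1D Gaussian mixture, and that a volume counts as an inlier when its posterior probability of belonging to the high-saliency component reaches a threshold tau. The function names, the synthetic scores, and the value tau = 0.5 are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(scores, n_iter=50, var_floor=1e-6):
    """Fit a two-component 1D Gaussian mixture with EM (illustrative assumption).

    Component 0 starts on the lower half of the sorted scores, component 1 on
    the upper half. Returns (means, posterior of component 1 for each score).
    """
    n = len(scores)
    ordered = sorted(scores)
    half = n // 2
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    grand_mean = sum(scores) / n
    grand_var = max(sum((s - grand_mean) ** 2 for s in scores) / n, var_floor)
    variances = [grand_var, grand_var]
    weights = [0.5, 0.5]
    resp_high = [0.5] * n

    for _ in range(n_iter):
        # E-step: posterior probability that each score came from component 1.
        resp_high = []
        for s in scores:
            p_low = weights[0] * gaussian_pdf(s, means[0], variances[0])
            p_high = weights[1] * gaussian_pdf(s, means[1], variances[1])
            total = p_low + p_high
            resp_high.append(p_high / total if total > 0.0 else 0.5)

        # M-step: re-estimate weights, means, and variances from the posteriors.
        n_high = sum(resp_high)
        n_low = n - n_high
        if n_high < 1e-9 or n_low < 1e-9:
            break  # one component collapsed; keep the last valid parameters
        pairs = list(zip(resp_high, scores))
        means = [
            sum((1.0 - r) * s for r, s in pairs) / n_low,
            sum(r * s for r, s in pairs) / n_high,
        ]
        variances = [
            max(sum((1.0 - r) * (s - means[0]) ** 2 for r, s in pairs) / n_low, var_floor),
            max(sum(r * (s - means[1]) ** 2 for r, s in pairs) / n_high, var_floor),
        ]
        weights = [n_low / n, n_high / n]

    return means, resp_high


# Synthetic stand-ins for saliency scores: 200 low-saliency volumes
# (background-like) followed by 50 high-saliency volumes (object-like).
low_scores = [0.05 + 0.001 * i for i in range(200)]
high_scores = [0.9 + 0.004 * i for i in range(50)]
scores = low_scores + high_scores

tau = 0.5  # hypothetical posterior threshold
means, posterior_high = fit_two_gaussians(scores)
is_inlier = [p >= tau for p in posterior_high]
print("component means:", means)
print("inliers kept:", sum(is_inlier), "of", len(scores))
```

Raising tau in this sketch keeps fewer volumes as inliers and lowering it keeps more, which is one way a single threshold could control the inlier ratio.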

Findings

1. Reduces quantization error on COCO and nuScenes benchmarks
2. Improves detection accuracy for camera- and LiDAR-based models
3. Requires only 64 calibration samples

Abstract

Object detection is pivotal in computer vision, yet its immense computational demands make deployment slow and power-hungry, motivating quantization. However, task-irrelevant morphologies such as background clutter and sensor noise induce redundant activations (or anomalies). These anomalies expand activation ranges and skew activation distributions toward task-irrelevant responses, complicating bit allocation and weakening the preservation of informative features. Without a clear criterion to distinguish anomalies, suppressing them can inadvertently discard useful information. To address this, we present InlierQ, an inlier-centric post-training quantization approach that separates anomalies from informative inliers. InlierQ computes gradient-aware volume saliency scores, classifies each volume as an inlier or anomaly, and fits a posterior distribution over these scores using the…
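The range-expansion effect described in the abstract can be illustrated with a small uniform quantizer. This is a sketch under stated assumptions, not the paper's procedure: a plain min-max quantizer at 4 bits, synthetic activations in [0, 1] standing in for inliers, and a single large value standing in for an anomaly. All names and numbers are hypothetical.

```python
def quantize_dequantize(values, lo, hi, n_bits):
    """Uniformly quantize values to 2**n_bits levels on [lo, hi], then dequantize."""
    levels = 2 ** n_bits - 1
    step = (hi - lo) / levels
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)  # values outside the range saturate
        index = round((clipped - lo) / step)
        out.append(lo + index * step)
    return out


def mean_squared_error(reference, approx):
    return sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)


# Synthetic activations: 101 inliers spread over [0, 1] plus one anomaly at 50.
inliers = [i / 100 for i in range(101)]
anomaly = [50.0]
activations = inliers + anomaly
n_bits = 4

# Case 1: the quantization range covers everything, including the anomaly.
full_range = quantize_dequantize(activations, min(activations), max(activations), n_bits)
inlier_mse_full = mean_squared_error(inliers, full_range[: len(inliers)])

# Case 2: the range is set from the inliers only, so the anomaly saturates.
inlier_range = quantize_dequantize(activations, min(inliers), max(inliers), n_bits)
inlier_mse_clipped = mean_squared_error(inliers, inlier_range[: len(inliers)])

inlier_step = (max(inliers) - min(inliers)) / (2 ** n_bits - 1)
print("inlier MSE, range includes anomaly:", inlier_mse_full)
print("inlier MSE, range from inliers only:", inlier_mse_clipped)
```

In the first case the step size is set by the anomaly, so every inlier falls into the lowest quantization bin and the inlier error is large. In the second case the inliers get the full set of levels, at the cost of a large error on the saturated anomaly, which is acceptable only if that value really is task-irrelevant.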

Peer Reviews

Decision: ICLR 2026 Poster

Reviewer 01 · Rating 6 · Confidence 3

Strengths

1. The paper is presented fairly clearly.
2. The motivation is well-articulated, and the proposed InlierQ method is reasonably designed.

Weaknesses

1. In Equation (7), is the supervision applied to the top-$K$ entries for each channel of the heatmap? If so, the summation indices over $K$ and $C$ might be reversed in the equation.
2. Equation (12) is described as “explicitly discards anomalous activations and focuses only on the curvature of inlier distributions.” Does this imply that in Equation (6), $\lambda_I = 1$ and $\lambda_O = 0$? If so, by directly discarding background activations and focusing only on high-gradient regions, could t

Reviewer 02 · Rating 4 · Confidence 3

Strengths

- The method is clearly described with equations and an algorithmic flow (Algorithm 1), making the paper easy to follow and reproduce.
- The results cover both 2D and 3D object detection models (COCO, nuScenes), showing robustness under different modalities and architectures.
- The authors identify that quantization error can be dominated by high-magnitude anomalies or uninformative background activations, which is indeed an important real-world problem for low-bit quantization in detection mode

Weaknesses

- The claim that existing quantization approaches treat all activations uniformly is inaccurate. A substantial body of prior work has explicitly or implicitly modelled activation importance. Although the authors mention outlier-suppression methods such as SmoothQuant, QDrop, and SVDQuant in Related Work, the distinction they claim is that these works only relax amplitudes while their method decomposes activations into inliers and anomalies. However, many existing PTQ methods already model activ

Reviewer 03 · Rating 6 · Confidence 2

Strengths

1. Conceptually novel with clear theoretical grounding, using an inlier-centric optimization that allocates bit precision to task-relevant activations.
2. Strong engineering practicality, since it is label-free and training-free, and as a plug-in PTQ module it needs only 64 calibration samples.
3. Broad applicability across modalities and architectures, covering camera-based 2D detection, camera-based 3D detection, and LiDAR-based 3D detection.

Weaknesses

1. Gains are limited at higher bits. Under W8A8 the performance is close to full precision or baseline PTQ, so the advantage is less pronounced.
2. Sensitivity to hyperparameters and unresolved robustness questions. The threshold τ controls the inlier ratio and the final accuracy, which may require retuning across datasets and detection heads.
3. Modest improvement on 2D detection tasks. Ablations indicate that Inlier and Anomaly Sets are less separable in 2D, which reduces the benefit.

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Infrared Target Detection Methodologies