TL;DR
This paper introduces a novel calibration method for deep neural networks that improves reliability under distribution shifts by leveraging frequency-domain analysis and gradient-based rectification, without needing target domain data.
Contribution
It proposes a frequency-based filtering strategy combined with gradient rectification to enhance calibration robustness without access to target domain information.
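The reviews below describe the rectification as projection-based, with in-distribution (ID) calibration treated as a hard constraint. A minimal sketch of one common projection rule follows, assuming flattened gradient vectors: when the main gradient conflicts with the ID calibration gradient, the conflicting component is removed. The name `rectify_gradient` and the exact rule are illustrative assumptions, not the paper's confirmed procedure.

```python
import numpy as np

def rectify_gradient(g_main, g_id, eps=1e-12):
    """Project g_main so it no longer opposes g_id (hypothetical helper).

    g_main: gradient of the main objective (assumed: loss on filtered inputs).
    g_id:   gradient of the in-distribution calibration loss.
    """
    g_main = np.asarray(g_main, dtype=float)
    g_id = np.asarray(g_id, dtype=float)
    dot = float(np.dot(g_main, g_id))
    if dot >= 0.0:
        # No conflict: following g_main does not increase the ID loss to first order.
        return g_main.copy()
    # Conflict: subtract the component of g_main that points against g_id.
    return g_main - dot / (float(np.dot(g_id, g_id)) + eps) * g_id

g = rectify_gradient([1.0, -1.0], [0.0, 1.0])
print(g)  # the component opposing g_id is removed
```

A projection of this kind needs no weighting coefficient between the two losses, which matches the reviewers' remark that the approach avoids hyperparameter tuning.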
Findings
Significantly improves calibration under distribution shift.
Maintains strong in-distribution calibration performance.
Effective on synthetic and real-world datasets like CIFAR-10/100-C and WILDS.
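Calibration quality of this kind is commonly summarized with the expected calibration error (ECE). The sketch below shows the standard binned estimator, assuming equal-width confidence bins; the bin count and the function name are illustrative choices, and the paper's exact evaluation settings are not stated here.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Right-closed bins (lo, hi]; a confidence of exactly 0 falls in no bin.
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# An overconfident model: 95% confidence but only half the predictions correct.
print(expected_calibration_error([0.95, 0.95, 0.95, 0.95], [1, 0, 1, 0]))
```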
Abstract
Deep neural networks often produce overconfident predictions, undermining their reliability in safety-critical applications. This miscalibration is further exacerbated under distribution shift, where test data deviates from the training distribution due to environmental or acquisition changes. While existing approaches improve calibration through training-time regularization or post-hoc adjustment, their reliance on access to or simulation of target domains limits their practicality in real-world scenarios. In this paper, we propose a novel calibration framework that operates without access to target domain information. From a frequency-domain perspective, we identify that distribution shifts often distort high-frequency visual cues exploited by deep models, and introduce a low-frequency filtering strategy to encourage reliance on domain-invariant features. However, such information…
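The low-frequency filtering idea can be illustrated with a block-wise DCT low-pass filter, which the reviews below mention. This is a minimal sketch, assuming a single-channel image, 8x8 blocks, and a square mask that keeps the lowest `keep` frequencies per axis; the block size, mask shape, and function names are assumptions rather than the paper's settings.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses a different normalization
    return c

def blockwise_lowpass(img, block=8, keep=4):
    """Zero out high-frequency DCT coefficients in each block (hypothetical helper)."""
    h, w = img.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly into blocks"
    c = dct_matrix(block)
    mask = np.zeros((block, block))
    mask[:keep, :keep] = 1.0  # keep only the low-frequency corner
    # Rearrange into an array of blocks with shape (h/block, w/block, block, block).
    blocks = img.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    coeffs = c @ blocks @ c.T              # 2D DCT of every block
    filtered = c.T @ (coeffs * mask) @ c   # inverse DCT after masking
    return filtered.transpose(0, 2, 1, 3).reshape(h, w)

rng = np.random.default_rng(0)
image = rng.random((32, 32))
smooth = blockwise_lowpass(image, block=8, keep=2)
print(smooth.shape)
```

Smaller values of `keep` discard more high-frequency detail such as fine texture, while `keep == block` leaves the image unchanged.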
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The use of DCT-based block-wise low-pass filtering is a creative tool for encouraging robustness to shift, going beyond standard pixel-level or data augmentation strategies.
2. The proposed projection-based gradient rectification is well-motivated and avoids hyperparameter tuning.
1. The proposed method improves upon the compared method on metrics other than accuracy.
2. Performance is poor on most metrics for the TinyImageNet dataset, including the results in the appendix. This reflects poorly on the efficacy of the proposed approach.
- The challenge of improving out-of-domain calibration performance while retaining ID calibration performance is highly relevant because of its practical significance.
- Imposing ID calibration as a hard constraint via gradient projection to sustain ID calibration performance is interesting and requires no weighting parameters.
- The related work provides good coverage of recent and relevant methods in model calibration.
- The experimental comparison is shown with different baselines and i
- To compute the main gradient, why are both the low-pass filtered and the original images used together as a hybrid?
- Is it possible to use a loss other than Soft-ECE and expect similar or even better OOD and ID calibration performance?
- The weighted sum in Table 3 performs very close to the FGR idea. Is there an explanation for that?
- It is not obvious how filtering the rectification improves OOD calibration performance.
- The ID calibration performance is not better than other methods in many cases (
(1) Exploring domain shift calibration is important.
(2) The paper is well written and easy to follow.
(1) Why DCT is used is not clear. As described in the introduction, "learning to recognize 'birds' based on special texture (e.g., green leafy patterns) rather than shape. Motivated by this, we apply Discrete Cosine Transform (DCT) filtering to isolate low-frequency image components, encouraging the model to rely on shape-related information that is more consistent across distributions." However, I do not think this adequately motivates the use of DCT. More analysis is needed.
(2) Why use gradient rect
- Clarity of presentation: The main argument and logical flow of the paper are easy to follow and clearly articulated throughout.
- The proposed method does not require access to the target distribution: This approach overcomes a major limitation of conventional methods that rely on access to the target distribution.
- Limitations of frequency-based assumptions in distribution shift: The method developed in this paper assumes that invariant features are primarily embedded in low-frequency components, whereas spurious features arise from high-frequency regions. However, this assumption does not consistently reflect real-world conditions. For example, distribution shifts may result from variations in low-frequency aspects such as overall image tone (e.g., brightness or hue), or from frequency-independent struc
