LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models
Priyank Pathak, Shyam Marjit, Shruti Vyas, Yogesh S Rawat

TL;DR
This paper introduces LR0.FM, a benchmark and metric for evaluating and improving the robustness of visual-language foundation models against low-resolution images, revealing key factors affecting performance and proposing a simple enhancement strategy.
Contribution
The paper presents LR0.FM, a new benchmark and metric for low-resolution robustness, along with a novel method LR-TK0 to improve model robustness without retraining.
Findings
Model size correlates with robustness to resolution loss.
Pre-training dataset quality impacts robustness more than size.
Fine-tuned and high-resolution models are less robust to LR images.
Abstract
Visual-language foundation models (FMs) exhibit remarkable zero-shot generalization across diverse tasks, largely attributed to extensive pre-training on large-scale datasets. However, their robustness on low-resolution/pixelated (LR) images, a common challenge in real-world scenarios, remains underexplored. We introduce LR0.FM, a comprehensive benchmark evaluating the impact of low resolution on the zero-shot classification performance of 10 FMs across 66 backbones and 15 datasets. We propose a novel metric, Weighted Aggregated Robustness, to address the limitations of existing metrics and better evaluate model performance across resolutions and datasets. Our key findings show that: (i) model size positively correlates with robustness to resolution degradation, (ii) pre-training dataset quality is more important than its size, and (iii) fine-tuned and higher resolution models are less…
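As a rough illustration of the kind of evaluation the abstract describes, the sketch below pixelates an image by downsampling and then upsampling it, and compares accuracy on low-resolution inputs with accuracy on the originals. Everything here is an assumption made for illustration: the block-average/nearest-neighbour degradation, the function names, and the simple `accuracy_retention` ratio are hypothetical and are not the paper's pipeline or its Weighted Aggregated Robustness metric, which this summary does not define.

```python
from typing import List, Sequence

# Grayscale image as rows of pixel intensities (illustrative representation).
Image = List[List[float]]


def downsample(img: Image, factor: int) -> Image:
    """Average-pool non-overlapping factor x factor blocks."""
    h, w = len(img), len(img[0])
    if factor < 1 or h % factor or w % factor:
        raise ValueError("image sides must be divisible by a positive factor")
    out: Image = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            block = [img[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(img: Image, factor: int) -> Image:
    """Repeat every pixel factor times along both axes."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def pixelate(img: Image, factor: int) -> Image:
    """Assumed LR simulation: shrink, then restore the original size."""
    return upsample_nearest(downsample(img, factor), factor)


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def accuracy_retention(acc_lr: float, acc_hr: float) -> float:
    """Hypothetical ratio of LR accuracy to full-resolution accuracy.

    NOT the paper's Weighted Aggregated Robustness metric.
    """
    if acc_hr <= 0:
        raise ValueError("full-resolution accuracy must be positive")
    return acc_lr / acc_hr
```

In use, a zero-shot classifier would be run once on the original images and once on `pixelate(img, factor)` outputs, and the two accuracies passed to `accuracy_retention`; a value near 1 would indicate little loss under this particular degradation.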
Taxonomy
Topics: Hydraulic Fracturing and Reservoir Analysis · Domain Adaptation and Few-Shot Learning
