SciLT: Long-tailed Image Classification under Scientific Image Domains

Jiahao Chen; Bing Su

arXiv:2604.03687·cs.CV·May 12, 2026

TL;DR

This paper introduces SciLT, a framework for scientific long-tailed image classification that leverages multi-level features and dual supervision to improve performance across class distributions.

Contribution

The paper proposes SciLT, a novel method that exploits multi-level representations and dual-supervision learning to enhance scientific long-tailed recognition.

Findings

01 Fine-tuning foundation models yields limited gains on scientific data.

02 Penultimate-layer features are crucial for tail class recognition.

03 SciLT outperforms existing methods across multiple scientific benchmarks.

Abstract

Long-tailed recognition has benefited from foundation models and fine-tuning paradigms, yet existing studies and benchmarks are mainly confined to natural image domains, where pre-training and fine-tuning data share similar distributions. In contrast, scientific images exhibit distinct visual characteristics and supervision signals, raising questions about the effectiveness of fine-tuning foundation models in such settings. In this work, we investigate scientific long-tailed recognition under a purely visual and fine-tuning paradigm. Experiments on three scientific benchmarks show that fine-tuning foundation models yields limited gains, and reveal that penultimate-layer features play an important role, particularly for tail classes. Motivated by these findings, we propose SciLT, a framework that exploits multi-level representations through adaptive feature fusion and dual-supervision…
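The abstract is truncated here and does not say how the fusion or the dual supervision is implemented. As a rough illustration only, the sketch below shows one generic way to fuse two feature levels with a learned scalar gate and to supervise two classification heads at once. It is not the SciLT method: the function names, the sigmoid gate, the assumption that both feature vectors share one dimension, and the unweighted sum of the two cross-entropy losses are all hypothetical choices made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the true class index `label`."""
    return -math.log(softmax(logits)[label])


def linear(weights, x):
    """Bias-free linear head: `weights` holds one row per class."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def gated_fusion(penultimate, final, gate_logit):
    """Blend two same-sized feature vectors with a sigmoid gate.

    Hypothetical fusion rule: g * penultimate + (1 - g) * final,
    where g = sigmoid(gate_logit) would be learned in practice.
    """
    g = 1.0 / (1.0 + math.exp(-gate_logit))
    return [g * p + (1.0 - g) * f for p, f in zip(penultimate, final)]


def dual_supervised_loss(penultimate, final, label, w_fused, w_final, gate_logit):
    """Sum of two cross-entropy terms (an assumed form of dual supervision).

    One head classifies the fused features, the other classifies the
    final-layer features alone; both see the same ground-truth label.
    """
    fused = gated_fusion(penultimate, final, gate_logit)
    loss_fused = cross_entropy(linear(w_fused, fused), label)
    loss_final = cross_entropy(linear(w_final, final), label)
    return loss_fused + loss_final
```

With a gate logit of 0 the two feature levels are averaged, and with all-zero head weights each head predicts a uniform distribution, which makes the sketch easy to sanity-check by hand before swapping in real features from a backbone.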
