Super-class guided Transformer for Zero-Shot Attribute Classification

Sehyung Kim; Chanhyeong Yang; Jihwan Park; Taehoon Song; Hyunwoo J. Kim

arXiv:2501.05728 · cs.CV · January 17, 2025

TL;DR

SugaFormer is a transformer-based framework that uses super-classes and knowledge transfer strategies to improve the scalability and generalizability of zero-shot attribute classification, achieving state-of-the-art results on three benchmarks.

Contribution

The paper introduces SugaFormer, which uses super-classes for query reduction and multi-context decoding, along with knowledge transfer strategies for enhanced zero-shot attribute classification.
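
As a rough illustration of why grouping attributes under super-classes reduces the number of queries, the sketch below counts queries under two schemes: one query per attribute versus one query per super-class. It is not the authors' implementation. The attribute names, the mapping `ATTRIBUTE_TO_SUPER_CLASS`, and both helper functions are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical attribute -> super-class mapping (illustrative only, not from the paper).
ATTRIBUTE_TO_SUPER_CLASS = {
    "red": "color",
    "blue": "color",
    "green": "color",
    "wooden": "material",
    "metal": "material",
    "striped": "pattern",
    "dotted": "pattern",
}


def group_by_super_class(attr_to_super):
    """Group attribute names under their super-class, keeping insertion order."""
    groups = defaultdict(list)
    for attribute, super_class in attr_to_super.items():
        groups[super_class].append(attribute)
    return dict(groups)


def query_counts(attr_to_super):
    """Return (number of per-attribute queries, number of per-super-class queries)."""
    groups = group_by_super_class(attr_to_super)
    return len(attr_to_super), len(groups)


groups = group_by_super_class(ATTRIBUTE_TO_SUPER_CLASS)
per_attribute, per_super_class = query_counts(ATTRIBUTE_TO_SUPER_CLASS)
print(groups)
print(per_attribute, per_super_class)
```

With this toy mapping, seven attribute-level queries collapse into three super-class-level queries. The gap widens as the attribute vocabulary grows while the number of super-classes stays small.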

Findings

01. Achieves state-of-the-art performance on three benchmarks.

02. Effectively generalizes across datasets in zero-shot settings.

03. Improves scalability for large attribute sets.

Abstract

Attribute classification is crucial for identifying specific characteristics within image regions. Vision-Language Models (VLMs) have been effective in zero-shot tasks by leveraging their general knowledge from large-scale datasets. Recent studies demonstrate that transformer-based models with class-wise queries can effectively address zero-shot multi-label classification. However, poor utilization of the relationship between seen and unseen attributes makes the model lack generalizability. Additionally, attribute classification generally involves many attributes, making maintaining the model's scalability difficult. To address these issues, we propose Super-class guided transFormer (SugaFormer), a novel framework that leverages super-classes to enhance scalability and generalizability for zero-shot attribute classification. SugaFormer employs Super-class Query Initialization (SQI) to…

Code & Models

Repositories

mlvlab/SugaFormer
PyTorch · Official

Videos

Super-Class Guided Transformer for Zero-Shot Attribute Classification · Underline

Taxonomy

Topics: Optical Systems and Laser Technology · Infrared Target Detection Methodologies