# Distilling knowledge from multiple foundation models for zero-shot image classification

**Authors:** Siqi Yin, Lifan Jiang

PMC · DOI: 10.1371/journal.pone.0310730 · PLOS ONE · 2024-09-20

## TL;DR

This paper presents a new zero-shot image classification framework that uses knowledge distillation from multiple foundation models to recognize unseen categories.

## Contribution

The approach uses ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts, then distills knowledge from CLIP and DINO to classify test images without additional training data.

## Key findings

- The proposed method achieves AUROC scores above 96% on every test dataset.
- It significantly improves zero-shot classification accuracy over previous approaches.
- Experiments cover MNIST, SVHN, CIFAR-10, CIFAR-100, and TinyImageNet.
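
For readers unfamiliar with the metric, the following is a minimal sketch of how an AUROC value can be computed from two groups of scores, using the pairwise-ranking definition (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as half). The function name `auroc`, the toy scores, and the choice of which group counts as positive are all hypothetical; this summary does not state how the paper defines its positives and negatives.

```python
def auroc(scores_pos, scores_neg):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. Both inputs are plain lists of floats.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical confidence scores, for illustration only.
positive_scores = [0.9, 0.8, 0.7, 0.6]
negative_scores = [0.65, 0.3, 0.2]

value = auroc(positive_scores, negative_scores)
print(f"AUROC = {value:.4f}")
```

An AUROC of 1.0 means every positive outranks every negative, and 0.5 corresponds to random ranking.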

## Abstract

Zero-shot image classification enables the recognition of new categories without requiring additional training data, thereby enhancing the model’s generalization capability when specific training data are unavailable. This paper introduces a zero-shot image classification framework to recognize new categories that are unseen during training by distilling knowledge from foundation models. Specifically, we first employ ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts. Then, the test image is aligned with text and reference images using CLIP and DINO to calculate the logits. Finally, the predicted logits are aggregated according to their confidence to produce the final prediction. Experiments are conducted on multiple datasets, including MNIST, SVHN, CIFAR-10, CIFAR-100, and TinyImageNet. The results demonstrate that our method can significantly improve classification accuracy compared to previous approaches, achieving AUROC scores of over 96% across all test datasets. Our code is available at https://github.com/1134112149/MICW-ZIC.
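
The alignment and aggregation steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the hand-written vectors stand in for embeddings that models such as CLIP and DINO would produce, and the function names, the temperature value, and the confidence rule (the maximum softmax probability of each branch) are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def cosine_logits(query, candidates, temperature=0.07):
    """Cosine similarity of one query vector to each candidate row, scaled by 1/temperature.

    The temperature value is an assumption, not taken from the paper.
    """
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return (c @ q) / temperature


def aggregate_by_confidence(logit_sets):
    """Weight each branch's probabilities by its confidence.

    Assumption: confidence is the branch's maximum softmax probability,
    and the weights are normalized to sum to 1.
    """
    probs = [softmax(logits) for logits in logit_sets]
    confidence = np.array([p.max() for p in probs])
    weights = confidence / confidence.sum()
    return sum(w * p for w, p in zip(weights, probs))


# Toy stand-ins for per-class text embeddings and synthesized reference-image embeddings.
text_embeddings = np.array([[1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [0.0, 0.0, 1.0]])
reference_embeddings = np.array([[0.9, 0.1, 0.0],
                                 [0.1, 0.9, 0.1],
                                 [0.0, 0.2, 0.9]])

# Toy stand-ins for the test image embedded in each space.
test_in_text_space = np.array([0.1, 0.9, 0.2])
test_in_image_space = np.array([0.2, 0.8, 0.1])

text_logits = cosine_logits(test_in_text_space, text_embeddings)
image_logits = cosine_logits(test_in_image_space, reference_embeddings)

final_probs = aggregate_by_confidence([text_logits, image_logits])
predicted_class = int(np.argmax(final_probs))
print(predicted_class, final_probs)
```

Because the weights are normalized, the aggregated output remains a probability distribution over the candidate classes; a branch that is more confident about its top class contributes more to the final prediction.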

## Full-text entities

- **Genes:** TOP1 [NCBI Gene 101093174]
- **Chemicals:** DINO (-)
- **Species:** Canis lupus familiaris (dog, subspecies) [taxon 9615], Felis catus (cat, species) [taxon 9685], Ailurus fulgens (lesser panda, species) [taxon 9649]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11414985/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11414985/full.md

## References

42 references — full list in the complete paper: https://tomesphere.com/paper/PMC11414985/full.md

---
Source: https://tomesphere.com/paper/PMC11414985