AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment

Umair Nawaz; Muhammad Awais; Hanan Gani; Muzammal Naseer; Fahad Khan; Salman Khan; Rao Muhammad Anwer

arXiv:2410.01407·cs.CV·October 3, 2024·2 cites

TL;DR

AgriCLIP is a specialized vision-language model for agriculture and livestock that leverages a large domain-specific dataset and a combined training approach to improve zero-shot performance on related tasks.

Contribution

The paper introduces ALive, a new large-scale agricultural dataset, and a training pipeline that combines contrastive and self-supervised learning for domain-specific vision-language modeling.
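To make the contrastive half of that pipeline concrete, here is a minimal sketch of a symmetric CLIP-style contrastive loss over a batch of paired image and text embeddings. It is not taken from the AgriCLIP repository: the function names, the toy embeddings, and the fixed temperature of 0.07 are assumptions for illustration only, and the self-supervised component is not shown.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss.

    The temperature is a fixed assumed value here; it is hypothetical
    and not a setting reported by the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))

# Toy batch: pair i of images matches pair i of texts.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = clip_contrastive_loss(image_embs, matched_texts)
loss_swapped = clip_contrastive_loss(image_embs, swapped_texts)
```

In this toy batch the correctly paired embeddings yield a loss close to zero, while the swapped pairing is penalized heavily, which is the behavior the contrastive objective is meant to encourage.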

Findings

01 Achieved a 7.8% improvement in zero-shot classification accuracy over standard CLIP.

02 Demonstrated effectiveness across 20 downstream agricultural and livestock tasks.

03 Made the dataset and code available for future research.
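For readers unfamiliar with the zero-shot classification setup these findings refer to, the sketch below shows the general idea: an image embedding is compared against one text embedding per class prompt, and the most similar class wins. The embeddings, class labels, and function names are hypothetical placeholders; in practice the vectors would come from the model's image and text encoders.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_emb, class_text_embs):
    """Return the best-matching label and the per-class similarity scores.

    class_text_embs maps a label to the embedding of its text prompt.
    No classifier is trained on the target classes.
    """
    scores = {
        label: cosine_similarity(image_emb, text_emb)
        for label, text_emb in class_text_embs.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores

# Hypothetical 2-D embeddings standing in for encoder outputs.
class_text_embs = {
    "healthy leaf": [1.0, 0.1],
    "nutrient deficient leaf": [0.1, 1.0],
}
image_emb = [0.2, 0.9]

predicted_label, scores = zero_shot_classify(image_emb, class_text_embs)
```

Zero-shot accuracy is then the fraction of test images whose predicted label matches the ground truth, computed without fine-tuning on the downstream dataset.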

Abstract

Capitalizing on the vast amount of image-text data, large-scale vision-language pre-training has demonstrated remarkable zero-shot capabilities and has been utilized in several applications. However, models trained on general everyday web-crawled data often exhibit sub-optimal performance for specialized domains, likely due to domain shift. Recent works have tackled this problem for some domains (e.g., healthcare) by constructing domain-specialized image-text data. However, constructing a dedicated large-scale image-text dataset for the sustainable area of agriculture and livestock is still open to research. Further, this domain requires fine-grained feature learning due to the subtle nature of the downstream tasks (e.g., nutrient deficiency detection, livestock breed classification). To address this, we present AgriCLIP, a vision-language foundational model dedicated to the domain of agriculture…

Code & Models

Repositories

umair1221/agriclip
PyTorch · Official

Taxonomy

Topics: Cooperative Studies and Economics

Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training