CMAViT: Integrating Climate, Management, and Remote Sensing Data for Crop Yield Estimation with Multimodal Vision Transformers

Hamid Kamangir; Brent S. Sams; Nick Dokoozlian; Luis Sanchez; J. Mason Earles

arXiv:2411.16989 · cs.CV · November 27, 2024

Open Access

TL;DR

This paper presents CMAViT, a multimodal vision transformer that integrates climate, management, and remote sensing data for accurate vineyard crop yield prediction, outperforming traditional models.

Contribution

Introduction of CMAViT, a novel multi-modal transformer that combines spatial, temporal, and management data for pixel-level crop yield estimation.

Findings

01  Achieved R² of 0.84 and MAPE of 8.22% on unseen data.

02  Outperformed traditional models like UNet-ConvLSTM.

03  Modality ablation showed each data type's importance for accuracy.
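
For readers unfamiliar with the two metrics reported in the findings, the following is a minimal sketch of how R² and MAPE are commonly computed. The function names and the sample yield values are hypothetical and chosen for illustration only; they are not taken from the paper, and the authors' exact evaluation code may differ.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (division by the observed value).
    """
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical observed and predicted yields (arbitrary units), NOT paper data.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 38.0]

print("R2:", r2_score(observed, predicted))
print("MAPE (%):", mape(observed, predicted))
```

A higher R² means the predictions explain more of the variance in observed yield, while a lower MAPE means smaller relative errors on average.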

Abstract

Crop yield prediction is essential for agricultural planning but remains challenging due to the complex interactions between weather, climate, and management practices. To address these challenges, we introduce a deep learning-based multimodal model called Climate-Management Aware Vision Transformer (CMAViT), designed for pixel-level vineyard yield prediction. CMAViT integrates both spatial and temporal data by leveraging remote sensing imagery and short-term meteorological data, capturing the effects of growing season variations. Additionally, it incorporates management practices, which are represented in text form, using a cross-attention encoder to model their interaction with time-series data. This innovative multimodal transformer, tested on a large dataset from 2016-2019 covering 2,200 hectares and eight grape cultivars including more than 5 million vines, outperforms traditional…
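
The abstract mentions a cross-attention encoder linking management text to time-series data. As a rough illustration of the general mechanism only, here is a dependency-free sketch of single-head scaled dot-product cross-attention. It assumes time-step embeddings act as queries and text-token embeddings act as keys and values; that direction, the toy vectors, and the omission of learned projection matrices are all assumptions made for brevity, not details confirmed by the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (no learned weights).

    queries: T vectors of length d (assumed here: time-step embeddings)
    keys, values: M vectors (assumed here: management text-token embeddings)
    Returns (outputs, weights): T output vectors and T rows of M weights.
    """
    d = len(queries[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of value vectors, one output channel at a time.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy, hypothetical embeddings: 3 time steps attending over 2 text tokens.
time_steps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
management_tokens = [[1.0, 0.0], [0.0, 1.0]]

outputs, weights = cross_attention(time_steps, management_tokens, management_tokens)
for step, w in enumerate(weights):
    print("time step", step, "attention over text tokens:", w)
```

Each time step ends up with a mixture of the text-token vectors, weighted by similarity. A full transformer layer would add learned query, key, and value projections, multiple heads, residual connections, and normalization.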

Taxonomy

Topics: Remote Sensing in Agriculture · Remote Sensing and Land Use · Remote Sensing and LiDAR Applications

Methods: Attention Is All You Need · Label Smoothing · Dropout · Linear Layer · Byte Pair Encoding · Adam · Residual Connection · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings