Temporal-Spatial Tubelet Embedding for Cloud-Robust MSI Reconstruction using MSI-SAR Fusion: A Multi-Head Self-Attention Video Vision Transformer Approach
Yiqun Wang, Lujun Li, Meiru Yue, Radu State

TL;DR
This paper introduces a novel Video Vision Transformer framework with temporal-spatial fusion embedding for improved multispectral imagery reconstruction in cloud-covered regions, leveraging tubelet extraction and multi-modal fusion.
Contribution
It proposes a temporal-spatial fusion embedding using tubelets in a ViViT-based framework, reducing information loss and enhancing reconstruction accuracy in MSI data affected by clouds.
Findings
Achieved 2.23% lower MSE on MSI reconstruction compared to baseline.
Realized 10.33% improvement with SAR-MSI fusion over non-fused methods.
Demonstrated robustness in cloud-affected agricultural monitoring scenarios.
Abstract
Cloud cover in multispectral imagery (MSI) significantly hinders early-season crop mapping by corrupting spectral information. Existing Vision Transformer (ViT)-based time-series reconstruction methods, such as SMTS-ViT, often employ coarse temporal embeddings that aggregate entire sequences, causing substantial information loss and reducing reconstruction accuracy. To address these limitations, this study proposes a Video Vision Transformer (ViViT)-based framework with temporal-spatial fusion embedding for MSI reconstruction in cloud-covered regions. Non-overlapping tubelets are extracted via 3D convolution with a constrained temporal span, ensuring local temporal coherence while reducing cross-day information degradation. Both MSI-only and SAR-MSI fusion scenarios are considered in the experiments. Comprehensive experiments on 2020 Traill County data demonstrate notable…
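The tubelet extraction described in the abstract can be illustrated with a minimal sketch. A 3D convolution whose kernel size equals its stride applies one linear projection to each non-overlapping block, so the sketch below reshapes the time series into blocks and multiplies by a projection matrix. This is not the authors' implementation: the function name `tubelet_embed`, the tensor layout, and all sizes (temporal span `t`, patch size `p`, channel count `C`, embedding width `D`) are assumptions chosen for illustration.

```python
import numpy as np


def tubelet_embed(video, weight, bias, t, p):
    """Embed non-overlapping tubelets of size (t, p, p).

    Equivalent to a 3D convolution with kernel size = stride = (t, p, p).

    video:  (T, H, W, C) image time series (hypothetical layout)
    weight: (t * p * p * C, D) projection matrix
    bias:   (D,) projection bias
    returns (T // t, H // p, W // p, D) grid of tokens
    """
    T, H, W, C = video.shape
    if T % t or H % p or W % p:
        raise ValueError("T, H, W must be divisible by t, p, p")
    nt, nh, nw = T // t, H // p, W // p
    # Split every axis into (block index, offset within block).
    x = video.reshape(nt, t, nh, p, nw, p, C)
    # Move the block indices to the front: (nt, nh, nw, t, p, p, C).
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    # Flatten each tubelet into one vector and project it to D dimensions.
    x = x.reshape(nt, nh, nw, t * p * p * C)
    return x @ weight + bias


# Hypothetical sizes: 12 dates, 32x32 pixels, 10 bands, 64-dim tokens.
rng = np.random.default_rng(0)
T, H, W, C, D = 12, 32, 32, 10, 64
t, p = 2, 8  # each tubelet spans 2 dates and an 8x8 pixel window
video = rng.normal(size=(T, H, W, C))
weight = rng.normal(size=(t * p * p * C, D)) * 0.02
bias = np.zeros(D)

tokens = tubelet_embed(video, weight, bias, t, p)
print(tokens.shape)  # (6, 4, 4, 64)
```

With a small `t`, each token mixes only a few adjacent acquisition dates, in contrast to an embedding that aggregates the whole sequence at once. In a trained model the projection would be learned rather than drawn at random as it is here.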
Taxonomy
Topics: Remote Sensing in Agriculture · Remote-Sensing Image Classification · Advanced Image Fusion Techniques
