Tile Compression and Embeddings for Multi-Label Classification in GeoLifeCLEF 2024

Anthony Miyaguchi; Patcharapong Aphiwetsa; Mark McDuffie

arXiv:2407.06326 · cs.CV · July 10, 2024

TL;DR

This paper presents a multi-faceted approach combining frequency-domain compression, neighborhood models, and self-supervised learning to improve plant species classification from remote sensing data in the GeoLifeCLEF 2024 competition.

Contribution

It introduces the use of DCT-based data compression and LSH-based neighborhood models, along with tile2vec embeddings, for enhanced multi-label classification in geospatial data.
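To make the idea of an LSH-based neighborhood model concrete, here is a minimal sketch of random-hyperplane hashing in NumPy. It is an illustration under assumptions, not the authors' implementation: the class name `RandomHyperplaneLSH`, the bit count, and the toy vectors are all hypothetical, and the paper's pipeline may use a different LSH family or library. Vectors that fall on the same side of every random hyperplane share a bucket, so a query only needs to be compared against the items in its own bucket.

```python
from collections import defaultdict

import numpy as np


class RandomHyperplaneLSH:
    """Toy sign-random-projection LSH index (hypothetical helper)."""

    def __init__(self, dim: int, n_bits: int = 8, seed: int = 0):
        rng = np.random.default_rng(seed)
        # One random hyperplane (normal vector) per hash bit.
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)

    def _key(self, vec) -> tuple:
        # Each bit records which side of a hyperplane the vector lies on.
        bits = (self.planes @ np.asarray(vec, dtype=float)) >= 0
        return tuple(bits.tolist())

    def add(self, item_id, vec) -> None:
        self.buckets[self._key(vec)].append(item_id)

    def candidates(self, vec) -> list:
        # Approximate neighbors: everything hashed to the same bucket.
        return list(self.buckets.get(self._key(vec), []))


lsh = RandomHyperplaneLSH(dim=2, n_bits=4, seed=0)
lsh.add("a", [1.0, 2.0])
lsh.add("b", [-1.0, -2.0])
same_direction = lsh.candidates([2.0, 4.0])
```

In a neighborhood model of this kind, the species labels of the returned candidates could then be aggregated (for example by frequency) to form a prediction; that aggregation step is not shown here.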

Findings

01. Best competition model achieved a leaderboard score of 0.152
02. Best post-competition score improved to 0.161
03. Source code and models are publicly available

Abstract

As the DS@GT team, we explore methods for the multi-label classification task posed by the GeoLifeCLEF 2024 competition, which aims to predict the presence and absence of plant species at specific locations using spatial and temporal remote sensing data. Our approach uses frequency-domain coefficients from the Discrete Cosine Transform (DCT) to compress and pre-compute the raw input data for convolutional neural networks. We also investigate nearest-neighbor models built with locality-sensitive hashing (LSH), both for prediction and to aid the self-supervised contrastive learning of embeddings through tile2vec. Our best competition model used geolocation features and achieved a leaderboard score of 0.152; our best post-competition score was 0.161. Source code and models are available at https://github.com/dsgt-kaggle-clef/geolifeclef-2024.
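The DCT compression step can be sketched as follows. This is a minimal NumPy illustration under assumptions, not the code from the repository: the function names `dct_matrix`, `compress_tile`, and `reconstruct_tile`, the 8×8 tile size, and the choice of keeping a square low-frequency block are all hypothetical. The sketch applies an orthonormal 2D DCT-II to a single-band tile, keeps only the top-left `keep × keep` coefficients, and reconstructs an approximation from them.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; rows are frequencies."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] /= np.sqrt(2.0)  # rescale the DC row so the matrix is orthonormal
    return basis


def compress_tile(tile: np.ndarray, keep: int) -> np.ndarray:
    """2D DCT of a single-band tile, truncated to the lowest keep x keep coefficients."""
    rows, cols = tile.shape
    coeffs = dct_matrix(rows) @ tile @ dct_matrix(cols).T
    return coeffs[:keep, :keep]


def reconstruct_tile(coeffs: np.ndarray, shape: tuple) -> np.ndarray:
    """Inverse 2D DCT, treating all discarded coefficients as zero."""
    rows, cols = shape
    keep_r, keep_c = coeffs.shape
    return dct_matrix(rows)[:keep_r].T @ coeffs @ dct_matrix(cols)[:keep_c]


# Hypothetical smooth 8x8 tile: a linear gradient in both directions.
tile = np.add.outer(np.arange(8.0), np.arange(8.0))
small = compress_tile(tile, keep=4)           # 16 numbers instead of 64
approx = reconstruct_tile(small, tile.shape)  # lossy approximation of the tile
```

Because smooth imagery concentrates most of its energy in low frequencies, the truncated coefficient block can be stored once and fed to a network in place of the raw pixels, which is the general motivation for pre-computing the transform.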

Code & Models

Repositories

dsgt-kaggle-clef/geolifeclef-2024
PyTorch · Official

Taxonomy

Topics: Text and Document Classification Technologies · Web Data Mining and Analysis · Geographic Information Systems Studies

Methods: Discrete Cosine Transform · Contrastive Learning