SAGDA: Open-Source Synthetic Agriculture Data for Africa
Abdelghani Belgaid, Oumnia Ennaji

TL;DR
SAGDA is an open-source Python toolkit that generates and augments synthetic agricultural data to improve machine learning applications in African agriculture, addressing data scarcity issues.
Contribution
The paper introduces SAGDA, a toolkit for generating, augmenting, and validating synthetic data, tailored to ML applications in African agriculture.
Findings
Enhanced yield prediction through data augmentation
Improved fertilizer recommendation accuracy
Demonstrated effectiveness of synthetic data in ML models
Abstract
Data scarcity in African agriculture hampers machine learning (ML) model performance, limiting innovations in precision agriculture. The Synthetic Agriculture Data for Africa (SAGDA) library, a Python-based open-source toolkit, addresses this gap by generating, augmenting, and validating synthetic agricultural datasets. We present SAGDA's design and development practices, highlighting its core functions: generate, model, augment, validate, visualize, optimize, and simulate, as well as their roles in applications of ML for agriculture. Two use cases are detailed: yield prediction enhanced via data augmentation, and multi-objective NPK (nitrogen, phosphorus, potassium) fertilizer recommendation. We conclude with future plans for expanding SAGDA's capabilities, underscoring the vital role of open-source, data-driven practices for African agriculture.
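The generate, augment, and validate steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not SAGDA's API: the function names only mirror the verbs in the abstract, and the toy yield-response formula, value ranges, noise levels, and tolerance are all assumptions chosen for illustration.

```python
import random
import statistics

# NOTE: hypothetical stand-ins for illustration only; this is NOT SAGDA's API.

def generate(n, seed=0):
    """Draw n synthetic field records from a toy yield-response model (assumed)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rainfall = rng.uniform(300.0, 1200.0)  # mm per season (assumed range)
        nitrogen = rng.uniform(0.0, 150.0)     # kg N per hectare (assumed range)
        # Toy response: yield rises with rainfall and nitrogen,
        # with diminishing returns on nitrogen.
        yield_t = 0.5 + 0.002 * rainfall + 0.03 * nitrogen - 0.0001 * nitrogen ** 2
        yield_t += rng.gauss(0.0, 0.2)         # measurement noise
        rows.append({"rainfall": rainfall, "nitrogen": nitrogen,
                     "yield": max(yield_t, 0.0)})
    return rows

def augment(rows, copies=3, rel_noise=0.05, seed=1):
    """Return the original rows plus `copies` jittered variants of each row."""
    rng = random.Random(seed)
    out = list(rows)
    for row in rows:
        for _ in range(copies):
            # Multiplicative Gaussian jitter on every field, clamped at zero.
            out.append({k: max(v * (1.0 + rng.gauss(0.0, rel_noise)), 0.0)
                        for k, v in row.items()})
    return out

def validate(real, synthetic, tolerance=0.25):
    """Flag, per column, whether the synthetic mean stays within a relative
    tolerance of the real mean."""
    report = {}
    for key in real[0]:
        real_mean = statistics.fmean(r[key] for r in real)
        synth_mean = statistics.fmean(r[key] for r in synthetic)
        report[key] = abs(synth_mean - real_mean) <= tolerance * abs(real_mean)
    return report

small = generate(20)        # stands in for a scarce field dataset
augmented = augment(small)  # 20 originals + 3 jittered copies each
print(len(augmented))       # 80
print(validate(small, augmented))
```

Comparing column means is a deliberately crude validation check; a fuller comparison would also look at variances and at the relationships between columns.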
Taxonomy
Topics: Smart Agriculture and AI
