Design and Evaluation of a Scalable Data Pipeline for AI-Driven Air Quality Monitoring in Low-Resource Settings
Richard Sserujongi, Daniel Ogenrwot, Nicholas Niwamanya, Noah Nsimbe, Martin Bbaale, Benjamin Ssempala, Noble Mutabazi, Raja Fidel Wabinyai, Deo Okure, Engineer Bainomugisha

TL;DR
This paper introduces a scalable, open-source data pipeline designed for real-time and batch air quality data processing in resource-constrained environments, supporting AI-driven analytics and calibration.
Contribution
It presents a modular, cloud-native architecture using open-source tools for robust air quality data management in low-resource settings, including innovative calibration and deployment strategies.
Findings
Supports over 400 devices with millions of measurements monthly
Achieves low latency and high throughput in constrained conditions
Provides a reusable blueprint for environmental data platforms
Abstract
The increasing adoption of low-cost environmental sensors and AI-enabled applications has accelerated the demand for scalable and resilient data infrastructures, particularly in data-scarce and resource-constrained regions. This paper presents the design, implementation, and evaluation of the AirQo data pipeline: a modular, cloud-native Extract-Transform-Load (ETL) system engineered to support both real-time and batch processing of heterogeneous air quality data across urban deployments in Africa. Built using open-source technologies such as Apache Airflow, Apache Kafka, and Google BigQuery, the pipeline integrates diverse data streams from low-cost sensors, third-party weather APIs, and reference-grade monitors to enable automated calibration, forecasting, and accessible analytics. We demonstrate the pipeline's ability to ingest, transform, and distribute millions of air quality…
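To make the extract, calibrate, and load flow described in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the AirQo implementation: the field names (`pm2_5`, `pm2_5_calibrated`), the function names, and the use of a simple least-squares line fitted against a co-located reference monitor are all hypothetical, and the paper's actual calibration method is not given in this excerpt.

```python
from statistics import mean


def extract(raw_rows):
    """Keep only rows with a numeric PM2.5 reading (hypothetical schema)."""
    return [r for r in raw_rows if isinstance(r.get("pm2_5"), (int, float))]


def fit_linear_calibration(sensor, reference):
    """Least-squares fit of reference ~ slope * sensor + intercept.

    Assumption: a plain linear model stands in for whatever calibration
    the real pipeline uses.
    """
    mx, my = mean(sensor), mean(reference)
    sxx = sum((x - mx) ** 2 for x in sensor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(sensor, reference))
    slope = sxy / sxx
    return slope, my - slope * mx


def transform(rows, slope, intercept):
    """Add a calibrated value to each row without mutating the input."""
    return [{**r, "pm2_5_calibrated": slope * r["pm2_5"] + intercept} for r in rows]


def load(rows, sink):
    """Append rows to a sink; a list stands in for a warehouse table."""
    sink.extend(rows)


# Co-located training pairs: low-cost sensor vs. reference monitor (made-up numbers).
slope, intercept = fit_linear_calibration([10.0, 20.0, 30.0], [12.0, 22.0, 32.0])

raw = [{"device": "a1", "pm2_5": 15.0}, {"device": "a2", "pm2_5": None}]
warehouse = []
load(transform(extract(raw), slope, intercept), warehouse)
```

In a deployment like the one the abstract describes, each function would plausibly map to a scheduled task or a stream consumer; the list-based sink here only keeps the sketch self-contained.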
Taxonomy
Topics: Air Quality Monitoring and Forecasting
