CAAL: Confidence-Aware Active Learning for Heteroscedastic Atmospheric Regression
Fei Jiang, Jiyang Xia, Junjie Yu, Mingfei Sun, Hugh Coe, David Topping, Dantong Liu, Zhenhui Jessie Li, Zhonghua Zheng

TL;DR
This paper introduces CAAL, a confidence-aware active learning framework that selects which samples to label when estimating hard-to-measure atmospheric particle properties from noisy, heteroscedastic data, so that a limited labeling budget is spent more efficiently.
Contribution
The paper presents a novel active learning approach that decouples uncertainty estimation and incorporates aleatoric uncertainty to enhance sample selection in heteroscedastic regression tasks.
Findings
CAAL outperforms standard active learning methods on both simulated and real data.
The decoupled training stabilizes uncertainty estimates in noisy conditions.
Dynamic weighting improves sample efficiency in high-cost data collection.
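As a rough illustration of why aleatoric uncertainty matters for sample selection, the sketch below ranks unlabeled candidates by ensemble disagreement (a common proxy for epistemic uncertainty) and discounts candidates whose predicted noise variance is high. It is not the CAAL acquisition function, which this summary does not spell out: the names `confidence_aware_scores` and `select_batch`, the weight `lam`, and the scoring rule itself are hypothetical, and `lam` is held fixed here rather than weighted dynamically.

```python
import numpy as np


def confidence_aware_scores(ensemble_means, aleatoric_var, lam=1.0):
    """Score unlabeled candidates (hypothetical rule, not taken from the paper).

    ensemble_means: shape (n_models, n_candidates), mean prediction of each model.
    aleatoric_var:  shape (n_candidates,), predicted noise variance sigma^2(x).
    lam:            illustrative weight on the noise penalty.
    """
    ensemble_means = np.asarray(ensemble_means, dtype=float)
    aleatoric_var = np.asarray(aleatoric_var, dtype=float)
    # Disagreement between ensemble members approximates epistemic uncertainty.
    epistemic_var = ensemble_means.var(axis=0)
    # Assumed discount: a label drawn from a very noisy region is less informative.
    return epistemic_var / (1.0 + lam * aleatoric_var)


def select_batch(scores, k):
    """Return the indices of the k highest-scoring candidates."""
    return np.argsort(-np.asarray(scores), kind="stable")[:k]


# Two models, three candidates. Candidates 0 and 1 have equal disagreement,
# but candidate 1 sits in a region with high predicted noise.
means = [[0.0, 0.0, 0.0],
         [2.0, 2.0, 0.2]]
noise = [0.0, 9.0, 0.0]
scores = confidence_aware_scores(means, noise)
chosen = select_batch(scores, k=2)
```

With a purely epistemic criterion, candidates 0 and 1 would tie. Under the assumed discount, candidate 0 is ranked first because a label obtained there is expected to be less corrupted by noise.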
Abstract
Quantifying the impacts of air pollution on health and climate relies on key atmospheric particle properties such as toxicity and hygroscopicity. However, these properties typically require complex observational techniques or expensive particle-resolved numerical simulations, limiting the availability of labeled data. We therefore estimate these hard-to-measure particle properties from routinely available observations (e.g., air pollutant concentrations and meteorological conditions). Because routine observations only indirectly reflect particle composition and structure, the mapping from routine observations to particle properties is noisy and input-dependent, yielding a heteroscedastic regression setting. With a limited and costly labeling budget, the central challenge is to select which samples to measure or simulate. While active learning is a natural approach, most acquisition…
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Gaussian Processes and Bayesian Inference · Air Quality and Health Impacts
