Meteorology-Driven GPT4AP: A Multi-Task Forecasting LLM for Atmospheric Air Pollution in Data-Scarce Settings
Prasanjit Dey, Soumyabrata Dev, Bianca Schoen-Phelan

TL;DR
GPT4AP is a multi-task, data-efficient forecasting model built on GPT-2 that reports strong air pollution prediction results under limited training data, cross-station domain shift, and long forecast horizons.
Contribution
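This paper introduces GPT4AP, a parameter-efficient, multi-task forecasting framework for air pollution prediction. It keeps the self-attention and feed-forward layers of a pre-trained GPT-2 backbone frozen and adapts only lightweight positional and output modules, combined with rank-stabilized low-rank adaptation (rsLoRA).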
Findings
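GPT4AP outperforms baseline models in few-shot and zero-shot settings; with 10% of the training data it reaches an average MSE/MAE of 0.686/0.442, compared with 0.728/0.530 for DLinear.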
The model maintains competitive accuracy in long-term forecasting.
It demonstrates improved generalization across different monitoring stations.
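To make the few-shot comparison concrete, the sketch below defines the two reported metrics and converts the averages quoted in the abstract (GPT4AP 0.686/0.442, DLinear 0.728/0.530) into relative error reductions. The function names and the `REPORTED` dictionary are illustrative and not taken from the paper, and the relative-reduction figure is a derived quantity that the paper itself may not report.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, model):
    """Fraction by which `model` lowers the error of `baseline`."""
    return (baseline - model) / baseline


# Average (MSE, MAE) in the 10% few-shot regime, as quoted in the abstract.
REPORTED = {
    "GPT4AP": (0.686, 0.442),
    "DLinear": (0.728, 0.530),
}

if __name__ == "__main__":
    gpt_mse, gpt_mae = REPORTED["GPT4AP"]
    dl_mse, dl_mae = REPORTED["DLinear"]
    print(f"MSE reduction vs DLinear: {relative_reduction(dl_mse, gpt_mse):.1%}")
    print(f"MAE reduction vs DLinear: {relative_reduction(dl_mae, gpt_mae):.1%}")
```

Under these numbers the gap to DLinear is roughly 6% in MSE and 17% in MAE, so the improvement is noticeably larger in absolute error than in squared error.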
Abstract
Accurate forecasting of air pollution is important for environmental monitoring and policy support, yet data-driven models often suffer from limited generalization in regions with sparse observations. This paper presents Meteorology-Driven GPT for Air Pollution (GPT4AP), a parameter-efficient multi-task forecasting framework based on a pre-trained GPT-2 backbone and Gaussian rank-stabilized low-rank adaptation (rsLoRA). The model freezes the self-attention and feed-forward layers and adapts lightweight positional and output modules, substantially reducing the number of trainable parameters. GPT4AP is evaluated on six real-world air quality monitoring datasets under few-shot, zero-shot, and long-term forecasting settings. In the few-shot regime using 10% of the training data, GPT4AP achieves an average MSE/MAE of 0.686/0.442, outperforming DLinear (0.728/0.530) and ETSformer…
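The abstract does not spell out how rsLoRA is wired into the backbone, so the following is only a minimal sketch of the general technique rather than the authors' implementation. It shows a frozen weight matrix with a trainable low-rank update scaled by alpha / sqrt(r), the rank-stabilized factor, in place of the alpha / r used by standard LoRA. Reading "Gaussian" as a Gaussian initialization of the `A` matrix is an assumption, as are the class name `RsLoRALinear`, the standard deviation of 0.02, and the toy dimensions.

```python
import math
import random


def lora_scale(alpha, rank):
    """Standard LoRA scaling factor: alpha / r."""
    return alpha / rank


def rslora_scale(alpha, rank):
    """Rank-stabilized LoRA scaling factor: alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RsLoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: never updated in this sketch
        # Assumption: A is Gaussian-initialized and B starts at zero,
        # so the adapter contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = rslora_scale(alpha, rank)

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        """Number of entries in A and B, the only trainable tensors here."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    layer = RsLoRALinear([[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]], rank=4, alpha=8.0)
    print(layer.forward([1.0, 2.0, 3.0]))
    print(lora_scale(8.0, 4), rslora_scale(8.0, 4))
```

Because `B` starts at zero, the layer initially reproduces the frozen backbone's output exactly. The square-root scaling keeps the size of the update from shrinking as the rank grows, which is the motivation for the rank-stabilized variant.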
