Visibility nowcasting in South Korea: a machine learning approach to class imbalance and distribution shift
Bong Gyun Shin, Chan Sik Lee, Hyesun Suh

TL;DR
This paper presents a machine learning framework for nowcasting atmospheric visibility in South Korea. It addresses class imbalance and distribution shift to improve prediction accuracy for transportation safety and air quality management.
Contribution
It combines data augmentation (SMOTENC and CTGAN) with ensemble modeling to handle class imbalance and temporal distribution shift in visibility prediction.
Findings
Model performance declined due to distributional shift between training and testing periods.
SMOTENC and CTGAN effectively addressed data imbalance issues.
Quantitative analysis confirmed the impact of external environmental factors on model accuracy.
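One common way to put a number on a distribution shift between a training period and a test period is the two-sample Kolmogorov-Smirnov statistic, computed per feature. The sketch below is illustrative only: the summary does not say which shift measure the authors used, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(train, test):
    """Two-sample KS statistic: largest gap between the two empirical CDFs.

    0.0 means the samples have identical empirical distributions;
    1.0 means their value ranges do not overlap at all.
    """
    a, b = sorted(train), sorted(test)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so checking every observed value is enough to find the maximum.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical values for one feature in the training and test periods.
train_feature = [12.0, 15.0, 18.0, 20.0]
test_feature = [6.0, 9.0, 12.0, 15.0]
print(ks_statistic(train_feature, test_feature))
```

A value near 0 for a feature suggests the two periods look alike on that feature, while larger values flag features whose distribution moved between periods.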
Abstract
Atmospheric visibility is a critical variable for transportation safety and air quality management; however, accurate prediction remains challenging due to the complex interactions between meteorological conditions and air pollutants, as well as the rarity of low-visibility events. This study introduces a machine learning framework to nowcast visibility in six major South Korean cities. To handle the imbalance in the 2018-2020 training data, we applied the Synthetic Minority Over-sampling Technique with Nominal and Continuous (SMOTENC) and Conditional Tabular Generative Adversarial Network (CTGAN). An ensemble approach combining machine learning and deep learning models was then used and evaluated on a 2021 test dataset. The results revealed a marked decline in predictive performance in the test set compared to the cross-validation phase. This degradation was attributed to a…
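To make the oversampling idea concrete, here is a simplified sketch of how a SMOTE-NC style method creates one synthetic minority-class row: continuous features are interpolated between a minority sample and one of its neighbors, and each categorical feature takes the most common value among the neighbors. It is not the authors' pipeline. Neighbor search is omitted (the neighbors are assumed to be given), and the function name, feature layout, and values are hypothetical.

```python
import random
from collections import Counter


def synthesize(sample, neighbors, cat_idx, rng):
    """Build one synthetic row from a minority sample and its neighbors.

    sample    : list of feature values for one minority-class row
    neighbors : list of rows assumed to be its nearest minority neighbors
    cat_idx   : set of column indices holding categorical features
    rng       : random.Random instance, passed in for reproducibility
    """
    neighbor = rng.choice(neighbors)  # partner row for interpolation
    lam = rng.random()                # interpolation weight in [0, 1)
    new_row = []
    for j, value in enumerate(sample):
        if j in cat_idx:
            # Categorical: most frequent value among the neighbors.
            votes = Counter(row[j] for row in neighbors)
            new_row.append(votes.most_common(1)[0][0])
        else:
            # Continuous: point on the segment between sample and neighbor.
            new_row.append(value + lam * (neighbor[j] - value))
    return new_row


# Hypothetical low-visibility rows: [visibility_km, weather_code]
minority_row = [1.0, "fog"]
nearest = [[2.0, "haze"], [3.0, "haze"], [2.5, "fog"]]
synthetic = synthesize(minority_row, nearest, cat_idx={1}, rng=random.Random(0))
print(synthetic)
```

In practice a maintained implementation such as the SMOTENC class in the imbalanced-learn package would handle neighbor search and resampling ratios; the sketch only shows the row-generation step.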
