# Understanding machine learning weather prediction by designing a cost-efficient model with knowledge-oriented modules

**Authors:** Minjong Cheon, Jeong-Hwan Kim, Yumi Choi, Yo-Hwan Choi, Seon-Yu Kang, Jeong-Gil Lee, Yoo-Geun Ham, Jin Young Kim, Daehyun Kang

PMC · DOI: 10.1038/s41598-025-32366-3 · Scientific Reports · 2025-12-15

## TL;DR

This paper introduces KARINA, a cost-efficient machine learning weather prediction model that performs competitively with Pangu-Weather and GraphCast and surpasses ECMWF IFS at lead times of up to 10 days, at a much lower training cost.

## Contribution

The novel KARINA model combines Geocyclic Padding and SENet modules with a ConvNeXt backbone to improve weather forecasting efficiency and interpretability.
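This summary does not spell out how Geocyclic Padding is defined, so the following is only a minimal sketch of the general idea of padding a latitude-longitude grid in a way that respects the sphere. It assumes that longitude wraps around periodically and that a row "beyond" a pole is the mirrored row shifted by 180 degrees of longitude. The function name `geocyclic_pad`, the north-to-south row order, and the mirroring convention are illustrative assumptions, not the authors' implementation.

```python
def geocyclic_pad(field, pad):
    """Pad a lat-lon grid by `pad` cells on every side (illustrative sketch).

    `field` is a list of rows ordered north to south; each row runs west to
    east and covers the full 360 degrees of longitude.
    """
    n_lat, n_lon = len(field), len(field[0])
    if n_lon % 2 != 0:
        raise ValueError("n_lon must be even so a 180-degree shift is whole cells")
    if not 0 <= pad <= min(n_lat, n_lon):
        raise ValueError("pad must be between 0 and the grid size")
    half = n_lon // 2

    def across_pole(row):
        # Assumption: crossing a pole lands on the opposite meridian,
        # i.e. the same row shifted by 180 degrees of longitude.
        return row[half:] + row[:half]

    # Rows past the north pole mirror the top rows; likewise for the south.
    north = [across_pole(field[i]) for i in range(pad - 1, -1, -1)]
    south = [across_pole(field[n_lat - 1 - i]) for i in range(pad)]
    rows = north + [list(r) for r in field] + south

    # Longitude is periodic, so wrap each row around east-west.
    return [r[n_lon - pad:] + r + r[:pad] for r in rows]


grid = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11]]
padded = geocyclic_pad(grid, 1)
```

Compared with zero padding, this kind of padding keeps convolution kernels near the date line and the poles from seeing artificial edges, which is consistent with the reported benefit for horizontal advection.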

## Key findings

- KARINA performs competitively with data-driven models such as Pangu-Weather and GraphCast at a much lower training cost, and surpasses the ECMWF IFS numerical forecast at lead times of up to 10 days.
- Geocyclic Padding enhances horizontal advection modeling, while SENet captures atmospheric convection dynamics effectively.
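To make the SENet finding concrete, here is a minimal, dependency-free sketch of a generic squeeze-and-excitation step: each channel is averaged over the grid, passed through a small bottleneck, and the resulting gate in (0, 1) rescales that channel. How KARINA wires this into its ConvNeXt blocks is not described in this summary; the function name `squeeze_excite`, the weight shapes, and the omission of bias terms are assumptions made for illustration.

```python
import math


def squeeze_excite(x, w1, w2):
    """Generic squeeze-and-excitation on a [C][H][W] nested list (sketch).

    w1 has shape [C_reduced][C] and w2 has shape [C][C_reduced].
    Bias terms are omitted for brevity (an illustrative simplification).
    Returns the rescaled tensor and the per-channel gates.
    """
    # Squeeze: global average over the spatial grid, one value per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]

    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
        for row in w2
    ]

    # Scale: multiply every grid point of a channel by that channel's gate.
    y = [[[v * g for v in r] for r in ch] for ch, g in zip(x, gates)]
    return y, gates


# Two channels on a 1x2 grid, with hand-picked hypothetical weights.
x = [[[1.0, 3.0]], [[2.0, 2.0]]]
w1 = [[0.5, -0.5]]     # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates
y, gates = squeeze_excite(x, w1, w2)
```

If the channels are stacked atmospheric variables and pressure levels (an assumption here), the gates let the network reweight whole variables or levels based on the global state, which is one plausible reading of why the module helps with convection.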

## Abstract

Deep learning-based models are gaining prevalence in global weather forecasting, surpassing the performance of existing numerical models. However, training these models with high-resolution global weather data requires massive computational resources, making it difficult to conduct extensive experiments to understand the model processes. In addition, the reason for region- or variable-dependent accuracy in the machine learning models, along with the extra predictability provided by each component, remains unknown. Therefore, we propose a novel data-driven model named KARINA, which combines Geocyclic Padding and SENet modules with the ConvNeXt backbone to enhance weather forecasting while minimizing training resources. Despite its much lower training cost, KARINA achieved competitive performance compared to the recently developed data-driven models such as Pangu-Weather and GraphCast, while surpassing the numerical weather prediction of ECMWF IFS at a lead time of up to 10 days. The efficient training process and KARINA’s modular structure allow us to demonstrate the effectiveness of Geocyclic Padding and SENet through comprehensive trials. Geocyclic Padding significantly improves the modeling of horizontal advection, while SENet particularly captures the dynamics of atmospheric convection. These findings suggest that incorporating knowledge-oriented techniques can lead to reliable performance. This paper presents a framework for gaining a deeper understanding of the model mechanism and proposes ways to improve machine learning weather prediction models.

The online version contains supplementary material available at 10.1038/s41598-025-32366-3.

## Full-text entities

- **Chemicals:** carbon dioxide (MESH:D002245), carbon (MESH:D002244)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12819519/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12819519/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/PMC12819519/full.md

---
Source: https://tomesphere.com/paper/PMC12819519