# Forecasting and analyzing seasonal GHI for a SAPV system in extreme Indian climatic regions

**Authors:** Aadyasha Patel, Gnana Swathika O. V.

PMC · DOI: 10.1038/s41598-025-23168-8 · Scientific Reports · 2025-11-12

## TL;DR

This paper compares three machine learning models for forecasting seasonal Global Horizontal Irradiance (GHI) at four climatically distinct Indian locations and finds that Gaussian Process Regression (GPR) is the most accurate of the three.

## Contribution

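The study compares three ML models, Efficient Linear Regression (ELR), Regression Trees (RT) and Gaussian Process Regression (GPR), for seasonal GHI forecasting at four climatically distinct Indian locations (Chennai, Jaisalmer, Leh and Mawsynram), and reports that GPR performs best.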

## Key findings

- GPR outperforms ELR and RT in seasonal GHI prediction, achieving the lowest RMSE (0.0030) and MAE (0.0022) of the three models.
- GPR captures nearly all dataset variability, with an R² of 0.9999.
- ELR is the least reliable of the three models for seasonal GHI forecasting in the studied regions, while RT gives reasonably accurate forecasts.
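The three metrics named in the findings above have standard definitions, sketched below in plain Python. The sample values are hypothetical and chosen only for illustration; they are not data from the paper, and the paper's own evaluation pipeline is not shown in this summary.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted GHI values (illustration only).
observed = [0.10, 0.40, 0.70, 0.90]
predicted = [0.12, 0.38, 0.71, 0.88]

print(f"RMSE={rmse(observed, predicted):.4f}")
print(f"MAE={mae(observed, predicted):.4f}")
print(f"R2={r2(observed, predicted):.4f}")
```

Lower RMSE and MAE indicate smaller prediction errors, and an R² close to 1 means the predictions account for almost all of the variance in the observations.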

## Abstract

Long-term average solar radiation prediction through seasonal Global Horizontal Irradiance (GHI) forecasting increasingly relies on Machine Learning (ML) to uncover complex relationships between historical GHI and climatic parameters, surpassing traditional methods. This study investigates seasonal GHI forecasting across four climatically distinct Indian locations, namely Chennai, Jaisalmer, Leh and Mawsynram, using data from the National Solar Radiation Database. Following rigorous data quality assessment and feature selection based on Spearman’s correlation and Mutual Information, eight key features are identified for training three ML models: Efficient Linear Regression (ELR), Regression Trees (RT) and Gaussian Process Regression (GPR). The GPR model demonstrates superior accuracy in seasonal GHI prediction across diverse Indian climatic conditions, making it well suited to reliable and cost-efficient Stand-Alone Photovoltaic (SAPV) systems. The GPR model achieves the lowest Root Mean Square Error (RMSE) of 0.0030, Mean Absolute Error (MAE) of 0.0022 and the highest Coefficient of Determination (R²) of 0.9999, corresponding to a reported reduction of 189.1% in RMSE and 190.09% in MAE and an improvement of 20.56% in R² compared to ELR. Furthermore, the GPR model surpasses the RT model with a reported reduction of 124.05% in RMSE and 111.1% in MAE, together with an improvement of 0.2604% in R². While RT provides reasonably accurate forecasts, ELR shows the least reliability. These results confirm GPR’s exceptional precision and its ability to capture nearly all dataset variability.
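The abstract mentions ranking candidate features by Spearman's correlation before training. The snippet below is a minimal sketch of that idea in plain Python, not the authors' procedure: the feature names and values are hypothetical, the eight features actually selected are not listed in this summary, and the Mutual Information step is omitted.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_features(features, target):
    """Sort features by absolute Spearman correlation with the target."""
    scores = {name: spearman(column, target) for name, column in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical sample: names and numbers are illustrative, not from the paper.
ghi = [120, 340, 560, 610, 480]
features = {
    "temperature": [18, 24, 31, 33, 29],
    "humidity": [80, 65, 40, 35, 50],
    "wind_speed": [3.0, 2.0, 4.0, 1.0, 5.0],
}

for name, rho in rank_features(features, ghi):
    print(f"{name}: rho={rho:+.2f}")
```

Ranking by absolute value keeps strongly negative relationships (here, the hypothetical humidity column) alongside strongly positive ones. Because Spearman's correlation only detects monotonic dependence, pairing it with a second criterion such as Mutual Information, as the abstract describes, can also surface non-monotonic relationships.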

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12612247/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12612247/full.md

## References

12 references — full list in the complete paper: https://tomesphere.com/paper/PMC12612247/full.md

---
Source: https://tomesphere.com/paper/PMC12612247