# Irradiance dataset in the south of Colombia from 2013 to 2023 in 5-minutes intervals

**Authors:** John Barco-Jiménez, Daniel Rosero, Andrés Zambrano, Francisco Eraso-Checa, Miller Ruales, José Camilo Eraso

PMC · DOI: 10.1016/j.dib.2025.112063 · Data in Brief · 2025-09-13

## TL;DR

This paper introduces an 11-year irradiance dataset (2013 to 2023, 5-minute resolution) from San Juan de Pasto in southern Colombia, intended for solar energy forecasting and photovoltaic system design.

## Contribution

The paper provides a rigorously preprocessed, high-resolution irradiance dataset spanning 2013 to 2023 for San Juan de Pasto, Colombia.

## Key findings

- The dataset contains 603,495 irradiance records at 5-minute intervals, cleaned by removing NaN values and outliers, correcting inconsistent dates, and filling gaps with averaged data.
- The dataset can serve as training input for AI models that forecast irradiance, and it supports photovoltaic system sizing through indicators such as Hours of Peak Sunlight (HPS) and seasonal trends.
- It can be used for educational purposes and to study solar energy variability and climate patterns in the region.

## Abstract

This article presents an extensive irradiance dataset collected in San Juan de Pasto, in southern Colombia, using a Davis Vantage PRO 2 meteorological station. The dataset spans 11 years, from 2013 to 2023, with measurements taken at 5-minute intervals, yielding 603,495 irradiance records, each paired with a timestamp.

The construction of the dataset required a rigorous preprocessing stage. This stage included the removal of erroneous values (NaN) and outliers, the identification of missing entries, and the correction of inconsistencies in the date records. Missing values were addressed through gap-filling procedures based on averaged data, complemented by visual inspections using graphical representations. The cleaned dataset was exported after ensuring data integrity, accuracy, and consistency, which are essential for reliable analysis and subsequent modeling.
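The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' actual pipeline: the function names, the `MAX_PLAUSIBLE` outlier threshold, and the choice to fill a gap with the mean of the same time of day on other days are all assumptions, since the abstract only states that gap filling was "based on averaged data".

```python
import math
from datetime import datetime, timedelta
from statistics import mean

STEP = timedelta(minutes=5)   # sampling interval stated in the abstract
MAX_PLAUSIBLE = 1500.0        # W/m^2; assumed outlier threshold, not from the paper


def clean_records(records):
    """Drop NaN/None and out-of-range values; keep the first value per timestamp."""
    cleaned = {}
    for ts, value in records:
        if value is None or math.isnan(value):
            continue
        if not 0.0 <= value <= MAX_PLAUSIBLE:
            continue
        cleaned.setdefault(ts, value)
    return cleaned


def find_missing(cleaned, start, end):
    """List the 5-minute grid timestamps in [start, end] that have no record."""
    missing = []
    ts = start
    while ts <= end:
        if ts not in cleaned:
            missing.append(ts)
        ts += STEP
    return missing


def fill_gaps(cleaned, missing):
    """Fill each gap with the mean of the same time of day on other days (assumed rule)."""
    by_time = {}
    for ts, value in cleaned.items():
        by_time.setdefault(ts.time(), []).append(value)
    filled = dict(cleaned)
    for ts in missing:
        same_time = by_time.get(ts.time())
        if same_time:  # leave the gap open if no reference values exist
            filled[ts] = mean(same_time)
    return filled


# Tiny synthetic example (values are made up for illustration).
day1, day2, day3 = (datetime(2013, 1, d, 12, 0) for d in (1, 2, 3))
records = [
    (day1, 800.0),
    (day1 + STEP, float("nan")),   # erroneous value
    (day1 + 2 * STEP, 5000.0),     # outlier
    (day2, 700.0),
    (day2 + STEP, 600.0),
    (day2 + 2 * STEP, 650.0),
    (day3 + STEP, 500.0),
]
cleaned = clean_records(records)
missing = find_missing(cleaned, day1, day1 + 2 * STEP)
filled = fill_gaps(cleaned, missing)
```

In this example the two rejected readings on the first day become gaps, which are then filled from the other days' values at 12:05 and 12:10. With the real dataset a dataframe library would likely be more convenient, and the visual inspection mentioned in the abstract would still be needed to confirm that the filled values look plausible.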

This dataset is valuable for building training datasets used as input for artificial intelligence models to perform short-, medium-, and long-term irradiance forecasting. For instance, Barco-Jiménez et al. (2021) utilized a portion of this dataset to develop multitemporal irradiance predictions. These predictive models can be applied in various domains, including energy management, grid optimization, and solar energy production planning. Furthermore, the dataset supports statistical analyses that provide insights for appropriately sizing photovoltaic systems through indicators such as Hours of Peak Sunlight (HPS), maximum and minimum irradiance values, average daily and monthly irradiance, and seasonal trends. These indicators play a fundamental role in the optimization of photovoltaic system performance, contributing to cost reduction and enhancing energy efficiency across rural, residential, and commercial applications.
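As an illustration of the sizing indicators mentioned above, the sketch below computes daily HPS and its monthly average from 5-minute irradiance samples. It assumes the conventional definition of peak sun hours, that is, daily irradiation in Wh/m² divided by a reference irradiance of 1000 W/m², and it assumes the samples are in W/m²; the paper's exact definition and units may differ. The function names and the synthetic values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SAMPLE = timedelta(minutes=5)
STEP_HOURS = SAMPLE.total_seconds() / 3600.0   # 5 minutes expressed in hours
REFERENCE_IRRADIANCE = 1000.0                  # W/m^2, conventional peak-sun reference


def daily_hps(records):
    """Sum irradiance * sample duration per day (Wh/m^2), then divide by 1000 W/m^2."""
    energy_wh = defaultdict(float)
    for ts, irradiance in records:
        energy_wh[ts.date()] += irradiance * STEP_HOURS
    return {day: wh / REFERENCE_IRRADIANCE for day, wh in energy_wh.items()}


def monthly_mean_hps(hps_by_day):
    """Average the daily HPS values within each (year, month)."""
    groups = defaultdict(list)
    for day, hps in hps_by_day.items():
        groups[(day.year, day.month)].append(hps)
    return {month: sum(vals) / len(vals) for month, vals in groups.items()}


# Synthetic example: two days with six hours of constant irradiance each.
start = datetime(2023, 6, 1, 9, 0)
records = [(start + i * SAMPLE, 600.0) for i in range(72)]
records += [(start + timedelta(days=1) + i * SAMPLE, 500.0) for i in range(72)]

hps = daily_hps(records)          # about 3.6 h on day one, about 3.0 h on day two
monthly = monthly_mean_hps(hps)   # about 3.3 h for June 2023
```

The rectangular sum treats each sample as constant over its 5-minute interval, which is a simplification; a trapezoidal rule would be a reasonable alternative.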

This dataset supports photovoltaic system design and studies on solar energy variability and climate patterns in the region. Analysis of irradiance fluctuations over time provides insights into the influence of atmospheric conditions on solar energy availability. This information is essential for enhancing the reliability of solar power systems and effectively integrating renewable energy sources into existing power grids. The dataset can also be used in educational settings to teach data analysis techniques and renewable energy concepts, providing students and researchers with a practical resource for hands-on learning.
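One simple way to look at irradiance fluctuations is to summarize each period with a few variability statistics. The sketch below is only an illustration: the metrics chosen here (coefficient of variation and the largest change between consecutive 5-minute samples) and the function names are assumptions, not the analysis used in the paper, and the sample values are synthetic.

```python
from statistics import mean, pstdev


def ramp_rates(values):
    """Change in irradiance between consecutive samples (W/m^2 per 5-minute step)."""
    return [b - a for a, b in zip(values, values[1:])]


def variability_summary(values):
    """Mean, coefficient of variation, and largest absolute ramp of a series."""
    avg = mean(values)
    return {
        "mean": avg,
        "cv": pstdev(values) / avg,
        "max_abs_ramp": max(abs(r) for r in ramp_rates(values)),
    }


# Synthetic 5-minute series: a steady clear-sky stretch and a stretch with passing clouds.
clear_sky = [600.0, 620.0, 640.0, 660.0, 680.0]
passing_clouds = [600.0, 200.0, 650.0, 150.0, 700.0]

clear_stats = variability_summary(clear_sky)
cloudy_stats = variability_summary(passing_clouds)
```

Large ramps and a high coefficient of variation in the second series reflect the kind of cloud-driven fluctuation that matters when integrating solar generation into a power grid.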

## Full-text entities

- **Chemicals:** silicon (MESH:D012825)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12538570/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12538570/full.md

## References

7 references — full list in the complete paper: https://tomesphere.com/paper/PMC12538570/full.md

---
Source: https://tomesphere.com/paper/PMC12538570