Hybrid AI Models for Short-Term Photovoltaic Forecasting: A Systematic Review of Architectures, Performance, and Deployment Challenges
Joan M. Saltos, M. Gabriela Intriago Cedeño, Ney R. Balderramo Velez, Germán T. Ramos León, A. Cano-Ortega

TL;DR
This paper reviews hybrid AI models for predicting solar energy output, highlighting their structures, performance, and challenges in real-world use.
Contribution
The paper introduces a novel classification of hybrid AI models and identifies key factors affecting their performance and deployment.
Findings
Optimized and decomposition-based hybrids offer the best balance of effectiveness and efficiency.
Weather and historical PV data are most commonly used inputs for forecasting.
High computational costs and data quality issues remain major barriers to deployment.
Abstract
The rapid integration of solar photovoltaic (PV) systems into electrical grids has increased the demand for accurate short-term forecasts to ensure stability and improve operations. Although hybrid artificial intelligence (AI) models are increasingly proposed to address this challenge, a systematic synthesis of their structures, effectiveness, and readiness for real-world use is lacking. This paper provides a detailed analysis of 58 peer-reviewed articles (2020–2025) focused on hybrid models for short-term (1–24 h) solar photovoltaic power forecasting. We propose a classification that groups hybrids into four categories: AI-AI (28%), AI with optimization (21%), decomposition-based (17%), and image-based (7%). Our research indicates that weather conditions (34%) and historical photovoltaic energy records (32%) are the most frequent inputs, and that…
University of Jaén
Taxonomy
Topics: Solar Radiation and Photovoltaics · Photovoltaic System Optimization Techniques · Energy Load and Power Forecasting
1. Introduction
Solar photovoltaic energy has emerged as one of the fastest-growing renewable electricity technologies and has great potential worldwide [1]. According to recent estimates by the International Renewable Energy Agency (IRENA), global photovoltaic solar energy capacity is expected to reach 5234 GW by 2030, indicating significant growth and accelerating the shift toward low-carbon energy systems. Figure 1a illustrates installed capacity per year in 2024, and Figure 1b illustrates total cumulative capacity for 2024 [2].
However, the natural fluctuation of solar radiation generates irregularities that impact the reliability and consistency of electrical systems, especially in situations where there is a large incorporation of renewable energies [3]. This challenge has increased the need for more accurate, robust, and flexible forecasting techniques for solar photovoltaic energy, especially for time periods in which operational decisions react significantly to variations [4,5].
In this field, hybrid models have become one of the most promising areas of study [6]. Recent research indicates significant advances in the accuracy of predictions [7], reduction in economic losses and lower costs due to penalties [8], efficient transfer of attributes in situations with limited data [8], and optimization in daily dispatch [9].
These approaches integrate information acquired through irradiance sensors, weather sensors, electrical sensors, and image-based sensors, allowing for a more comprehensive characterization of the dynamic behaviour of photovoltaic systems [10]. Several studies have demonstrated that incorporating multisensory data into hybrid architectures significantly reduces errors in multiple performance metrics [11], improves the stability of predictions under variable atmospheric conditions [12], and enhances the ability of models to adapt to real operating environments with high solar penetration.
In addition to atmospheric variables, a growing number of studies incorporate electrical sensors, such as current, voltage, and active power, as fundamental inputs in hybrid photovoltaic prediction models. These variables, measured directly in inverters, smart meters, and SCADA systems, exhibit a high correlation with the actual operating state of the PV system, enabling the capture of losses, thermal effects, and performance degradation that cannot be inferred solely from irradiance. Recent studies demonstrate that integrating historical electrical data with meteorological variables and deep learning techniques improves the accuracy of short-term forecasts and enables advanced applications such as self-consumption management, optimal dispatch, and intra-day planning [5,13]. In this context, electrical sensors act as a direct link between the physical behavior of the system and predictive models based on artificial intelligence.
Complementary to this, image-based sensors, such as sky cameras and satellite sensors, have become a key tool in modern photovoltaic prediction approaches, particularly in ultra-short-term horizons. These sensors enable the capture of the spatial and temporal evolution of cloudiness, providing anticipatory information that is not available through traditional point sensors. Recent literature highlights that hybrid models that combine image data with convolutional and recurrent neural networks achieve substantial improvements in irradiance and PV power prediction compared to approaches based exclusively on weather sensors [14,15,16]. Consequently, vision sensors constitute an essential component of new-generation hybrid architectures, reinforcing the predictive capacity of models in highly dynamic scenarios [17].
The growing importance of hybrid models is attributed to the inherent limitations of physical, statistical, and machine learning methods when used in isolation [18]. Hybrid structures combine the complementary benefits of numerical weather prediction, statistical learning, and artificial intelligence, facilitating better performance in different weather situations [19].
Despite their capabilities, hybrid models still face various technical and operational challenges, such as the appropriate choice of base models, noise and anomalies in irradiance and photovoltaic energy measurements, uncertainty in weather forecasts, unreliable confidence estimates, complicated hyperparameter tuning, rapid weather changes, and high computing costs [20].
This review provides an organized, quantitative, and architecture-focused synthesis of hybrid models for short-term photovoltaic power forecasting (1–24 h). The primary contributions are as follows:
- It classifies the hybrid models reported in 58 recent studies into four major classes and provides the first comprehensive comparison of them in terms of accuracy, computational cost, complexity, and data needs.
- It gives a quantitative view of how methodologies have evolved: 28% of studies combine AI with other AI models, 21% combine AI with optimization techniques, 17% use decomposition-based hybrids, and 7% use image-based techniques.
- It groups the inputs into 14 parameter groups and finds that climate variables, historical PV production, and irradiance indices are the most important, which helps explain how high-performance models are built.
- It shows that the United States (29%) and China (22%) together account for about 51% of contributions, while Australia, Morocco, and India are significant regions for validation under varied climatic conditions.
- It presents evidence that optimized and decomposition-based hybrids provide the best balance between accuracy and robustness, while image-based hybrids are superior for very short-term forecasts.
The review is organized around four research questions (RQs), each focusing on a key aspect: (1) the input parameters and hybrid architectures used in photovoltaic power forecasting (PVPF) models, (2) the impact of hybrid architectures on predictive performance, (3) the limitations and challenges associated with input data, and (4) barriers to real-world implementation. It closes with future research directions. By combining technological developments, limitations, and emerging trends, this study provides a comprehensive guide for academics and experts involved in renewable energy forecasting and grid integration.
Existing reviews on solar forecasting provide important overviews of statistical, machine learning, deep learning, numerical weather prediction, and image-based techniques, along with their forecasting horizons and practical applications. However, their treatment of hybrid models is superficial and fragmented: they do not clearly classify hybrid architectures, compare their integration methods, or examine their advantages and disadvantages relative to standalone methods. A methodical and focused review of hybrid forecasting strategies is therefore still needed to structure the field, highlight performance advantages, and guide the development of more accurate and robust next-generation hybrid models. Table 1 shows a comparative analysis of recent literature on photovoltaic forecasting.
The rest of this document is organized as follows: Section 2 describes the methodology used to select and evaluate the literature; Section 3 sets out the theoretical foundations of solar energy hybrid forecasting; Section 4 examines the input parameters and hybrid architectures identified; Section 5 reviews enabling technologies and their limitations; Section 6 analyses the results in terms of performance; Section 7 presents a critical analysis of the implications and limitations; and Section 8 summarizes the conclusions and suggests future directions for advanced hybrid models suitable for real-world energy systems.
2. Methodology
This systematic review follows the methodological guidelines of Kitchenham and Charters [29], which are widely recognized in the scientific community for their thoroughness, clarity, and evidence-based approach [30]. Adhering to these standards offers several significant advantages, such as the organized development of protocols, reduced bias in study selection, and greater clarity and repeatability of procedures [31]. The main objective of this analysis is to compile a comprehensive review of the relevant literature published between 2020 and 2025, focusing on short-term solar energy forecasting using hybrid techniques [32]. To ensure transparency, reproducibility, and methodological rigor, the study selection process followed the PRISMA 2020 guidelines. The identification, screening, eligibility assessment, and inclusion stages are summarized in Figure 2. A total of 195 records were screened after removing duplicates and ineligible entries. Following title and abstract evaluation, 93 reports were sought for retrieval, of which 86 were assessed for eligibility. After applying predefined inclusion and exclusion criteria (primarily short-term forecasting scope, hybrid AI methodology, and sufficient methodological detail), 58 studies were included in the review.
The methodology is organized into three main phases, as illustrated in Figure 3: (1) Development of the review protocol, (2) Literature search and execution, and (3) Selection and synthesis of studies. This structured approach ensures a transparent, replicable, and systematic process for identifying, evaluating, and synthesizing relevant research.
The research questions guiding this review are illustrated in Figure 4.
A comprehensive keyword search plan was developed to systematically identify relevant literature in three main areas: (1) solar photovoltaic forecasting, (2) short-term forecasting horizons, and (3) hybrid methodologies. The search covered synonyms, acronyms, and different terminological variants in each field. To refine the search and tailor it to the study’s focus, additional terms related to modelling techniques, data processing methods, and evaluation metrics were incorporated. This method ensured the inclusion of publications relevant to the predefined research questions, while preserving methodological rigour. The final search can be found in Table 2.
Following the application of the search string to the title, abstract, and author keyword fields, a total of 661 articles were initially identified. The search was conducted using a date filter for the period 2020–2025 in ScienceDirect and IEEE Xplore. The results were limited to review articles and original research articles within the energy and engineering subject areas.
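The record counts reported for the selection process (661 identified, 195 screened, 93 sought for retrieval, 86 assessed for eligibility, 58 included) can be checked with simple arithmetic. The sketch below is only an illustration and is not part of the review protocol; the stage labels and the helper name `attrition` are assumptions chosen for readability.

```python
# Record counts quoted in Section 2; the stage labels are illustrative.
stages = [
    ("identified by database search", 661),
    ("screened after removing duplicates and ineligible entries", 195),
    ("reports sought for retrieval", 93),
    ("reports assessed for eligibility", 86),
    ("studies included in the review", 58),
]


def attrition(stage_counts):
    """Return (from_stage, to_stage, removed, retained_fraction) per transition."""
    rows = []
    for (name_a, n_a), (name_b, n_b) in zip(stage_counts, stage_counts[1:]):
        rows.append((name_a, name_b, n_a - n_b, n_b / n_a))
    return rows


for src, dst, removed, kept in attrition(stages):
    print(f"{src} -> {dst}: removed {removed}, retained {kept:.1%}")

# Share of initially identified records that reached the final synthesis.
overall = stages[-1][1] / stages[0][1]
print(f"overall inclusion rate: {overall:.1%}")
```

Tabulating the flow in this way makes it easy to confirm that each stage is no larger than the one before it and to see where most records were excluded.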
Figure 5 shows how the selected publications on short-term photovoltaic power forecasting (PVPF) using hybrid models are distributed over time and by type between 2020 and 2025. The number of publications increases clearly over the period; peer-reviewed articles are the most common type each year and peak in 2025. This growth reflects rising academic interest and active research in energy management using hybrid PVPF, possibly motivated by advances in artificial intelligence-driven data analysis and the increasing need for sustainable energy solutions.
From a geographical perspective, only a limited number of countries are leaders in research on forecasting for hybrid photovoltaic systems. Figure 6 illustrates the global distribution of publications dealing with these models. The United States, China and India account for most of the contributions, indicating their strong commitment to incorporating renewable energies and their considerable installed capacity in photovoltaics. Australia and Morocco are also recognized as important contributors, especially due to their abundant solar resources and increased investment in solar technologies. On the other hand, countries such as Canada, Germany, the Netherlands, Turkey and Brazil show more modest, albeit relevant, efforts in advancing these prediction models.
3. Forecasting Models
3.1. Solar Forecasting
The essential characteristics of photovoltaic solar energy production are crucial for solar energy simulation and forecasting. This section explains the important aspects of solar forecasting, including the variables relevant to forecasting and the time frames for forecasting.
(A) Forecast horizon
The forecast horizon formally describes the prospective time interval over which photovoltaic energy production is estimated. It is widely recognized in the literature that the uncertainty and magnitude of error in forecasts generally increase with the length of the time horizon [33]. Forecast horizons are conventionally classified into four main categories:
- Very short-term forecast: Predictions ranging from seconds to minutes, and typically not exceeding 30 min [34].
- Short-term forecast: Predictions covering periods from 1 h to 48 h, or up to a week in advance [35].
- Medium-term forecast: Projections covering a period ranging from one week to one month in advance [36].
- Long-term forecast: Estimates covering a period ranging from one month to one year in advance. These forecasts play a key role in strategic planning for electricity generation, transmission infrastructure development and distribution network management [37].
The corresponding applications for decision-making are shown schematically in Figure 7.
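The four horizon categories listed above can be expressed as a simple lookup. The following is a minimal sketch under stated assumptions: the function name `classify_horizon` is hypothetical, the interval between 30 min and 1 h (which the list leaves unassigned) is treated as short-term, and a month and a year are approximated as 30 and 365 days.

```python
from datetime import timedelta

# Upper bounds follow the four categories in the text. Assigning the
# 30 min to 1 h gap to "short-term", and using 30-day months and
# 365-day years, are assumptions of this sketch.
HORIZON_CLASSES = [
    (timedelta(minutes=30), "very short-term"),
    (timedelta(weeks=1), "short-term"),
    (timedelta(days=30), "medium-term"),
    (timedelta(days=365), "long-term"),
]


def classify_horizon(horizon: timedelta) -> str:
    """Map a forecast lead time to one of the four conventional classes."""
    if horizon <= timedelta(0):
        raise ValueError("forecast horizon must be positive")
    for upper_bound, label in HORIZON_CLASSES:
        if horizon <= upper_bound:
            return label
    return "beyond long-term"


for lead in (timedelta(minutes=10), timedelta(hours=6),
             timedelta(days=14), timedelta(days=200)):
    print(lead, "->", classify_horizon(lead))
```

Because the boundaries differ between authors, keeping them in a single table such as `HORIZON_CLASSES` makes it straightforward to substitute another convention.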
Short-term photovoltaic (PV) forecasting is defined as the estimation of energy production over a period of 60 min to 6 h [38]. This temporal division is based on the characteristics of meteorological processes that influence solar irradiance at different scales [39]. In the range of one minute to one hour (ultra-short-term forecast), the movement of individual clouds and short-term fluctuations in atmospheric turbulence dominate [40]. In the range of one to three hours (very short-term forecast), mesoscale weather systems and local convection patterns exert a significant influence. Over a period of three to six hours (short-term forecast), synoptic systems increasingly determine the behavior of solar irradiance [41]. This temporal stratification determines the most appropriate modelling techniques for each period. Methods based on persistence and satellite imagery are best suited to immediate forecasts, while numerical weather prediction (NWP) models and machine learning (ML) techniques are more effective for longer forecast periods [42]. Figure 8 illustrates the relationship between short-term forecast horizons, forecast models, and dominant solar intermittency phenomena [43].
(B) Spatial Dimension in Short-Term Photovoltaic Forecasting
In addition to temporal aspects, consideration of space is vital to the accuracy and robustness of solar energy prediction models. The geographical scale used for prediction, whether at the level of individual solar installations or broader regional and grid aggregations, has a notable impact on patterns of variability and model effectiveness.
When focusing on individual plants, the accuracy of predictions is heavily influenced by local weather behavior, such as rapid changes in cloud cover, differences in microclimates, the influence of shading, and variations in sunlight received at specific locations. In these cases, the use of high-resolution data, such as sky cameras, ground-based solar sensors, and site-specific meteorological data, is crucial. Hybrid approaches at this level typically require a variety of inputs and systems that can respond quickly to accurately reflect variability within an hour [44].
In comparison, forecasting at an aggregate or regional level benefits from spatial smoothing. When several solar installations are spread over a large region, localized changes in sunlight tend to offset one another, which reduces overall variability and lowers normalized forecast errors. Consequently, simpler hybrid configurations can perform well at the regional level compared with the highly intricate models needed for individual plants [45]. Nevertheless, large-scale forecasting introduces new obstacles, including the need to model spatial correlations, synchronize data across a network of distributed sensors, and coordinate with energy management systems at the transmission level. Inputs obtained from satellites and numerical weather prediction (NWP) models are becoming increasingly significant for broader geographic scopes.
Thus, efficient hybrid PV forecasting systems need to take spatial granularity into account as a design factor, ensuring that model complexity, characteristics of inputs, and computational structure are aligned with the scale of deployment intended [20].
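The spatial smoothing effect described for regional aggregation can be illustrated with a small simulation. This is a toy sketch and not a result from the reviewed studies: each plant's normalized output is modelled as a constant plus independent Gaussian noise, and the function name and all numerical values are assumptions. Real neighbouring plants are partly correlated, so they smooth less than this idealized case suggests.

```python
import random
import statistics


def relative_variability(n_sites: int, n_steps: int = 2000, seed: int = 0) -> float:
    """Coefficient of variation of the fleet-average output for n_sites plants.

    Each plant's normalized output is 0.5 plus independent Gaussian noise
    (standard deviation 0.2). Independence between sites is a simplifying
    assumption of this sketch.
    """
    rng = random.Random(seed)
    fleet_average = []
    for _ in range(n_steps):
        outputs = [0.5 + rng.gauss(0.0, 0.2) for _ in range(n_sites)]
        fleet_average.append(sum(outputs) / n_sites)
    return statistics.pstdev(fleet_average) / statistics.fmean(fleet_average)


for n in (1, 4, 16, 64):
    print(f"{n:>3} sites: relative variability = {relative_variability(n):.3f}")
```

Under the independence assumption the relative variability falls roughly with the square root of the number of sites, which is one way to see why simpler models can suffice at the regional scale.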
3.2. Photovoltaic Solar Power Forecasting Techniques
The forecasting of solar energy production or solar irradiance is a non-linear problem influenced by meteorological conditions [46]. Identifying parameters that accurately capture these non-linear trends is challenging. The process of forecasting photovoltaic (PV) energy production generally consists of three main steps. First, energy properties are extracted and influencing factors are analysed. Next, a suitable forecasting method is selected and the model is optimized [47]. Photovoltaic forecasting methods can be classified into three groups according to the methodology used: physical, statistical, and hybrid. This section briefly describes the different solar forecasting methods, focusing on hybrid approaches.
(1) Statistical models
Prediction techniques based on previous solar radiation records are classified into two groups: statistical techniques and learning (artificial intelligence) techniques. Examples of statistical techniques include the study of seasonality, ARIMA models, multiple regression, and exponential smoothing. In contrast, learning techniques encompass fuzzy inference systems, genetic algorithms, neural networks, and machine learning, among others. Figure 9 shows a categorization of statistical models according to their fundamental principles.
This classification summarizes the diversity of approaches used in statistical methods for short-term photovoltaic energy forecasting.
Statistical models are trained on available data to extract relevant patterns that enable accurate predictions. The quantity and quality of data are the most influential factors in forecast accuracy. The two main categories of data-driven models are time series (TS) and machine learning (ML) models [48]. AI-based statistical models are widely used for solar energy forecasting [49]. As a result, various intelligent computing technologies are increasingly employed as alternatives to conventional techniques. Table 3 compiles recent publications that use AI models for PV power forecasting, outlining their temporal horizon, temporal resolution, input parameters, forecasting models, and key insights.
Based on the information presented in Table 3, various AI models demonstrate a high capacity for capturing nonlinear relationships in solar power forecasting. However, their performance is dependent on large volumes of data and substantial computational resources.
ANN-based approaches have emerged as a prominent alternative for modelling complex nonlinear relationships between climatic variables and power generation. The following section synthesizes relevant applications and findings of ANN models used for short-term PV forecasting, with a focus on key variations including MLP, RNN, CNN, LSTM, and GRU architectures.
CNN models are widely applied in the energy sector for short- and medium-term solar radiation prediction [67]. Recurrent neural networks (RNNs) are frequently used to predict solar irradiance, with one study reporting 97% accuracy and an RMSE of 4% [58]. Deep multi-layer configurations, especially LSTM networks, are used to forecast short-term energy generation [68]. Using past records of solar irradiance, solar energy, and numerical weather predictions, a GRU-based model has been created to forecast long-term solar energy [69]. For meteorological variables such as temperature and solar irradiance, the K-NN algorithm, based on pattern recognition, is a reliable method for predicting daily sequences from historical meteorological data [70]. SVM is used to estimate photovoltaic power using a time series analysis approach [71]. RF is applied for the prediction of photovoltaic energy generation [72].
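Error figures such as the RMSE quoted above depend on how the metric is normalized, which the cited studies do not always state. The sketch below shows one common way to compute RMSE and a capacity-normalized RMSE and to compare a forecast against a persistence baseline. The data are hypothetical, and normalizing by rated capacity is an assumption; normalizing by the mean or the range of the measurements is also common.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def nrmse_percent(actual, predicted, reference):
    """RMSE as a percentage of a reference value (here, rated capacity)."""
    return 100.0 * rmse(actual, predicted) / reference


# Hypothetical hourly PV output in kW for a plant rated at 100 kW.
measured = [0.0, 12.0, 45.0, 78.0, 90.0, 70.0, 30.0, 5.0]
forecast = [0.0, 15.0, 40.0, 80.0, 85.0, 75.0, 28.0, 6.0]
# Persistence baseline: each hour is predicted by the previous measurement.
persistence = [measured[0]] + measured[:-1]

print(f"model nRMSE:       {nrmse_percent(measured, forecast, 100.0):.2f} %")
print(f"persistence nRMSE: {nrmse_percent(measured, persistence, 100.0):.2f} %")
```

Reporting a model's error next to a persistence baseline on the same data helps readers judge whether a low RMSE reflects model skill or simply an easy forecasting period.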
(2) Physical models
Physical models represent how photovoltaic (PV) modules transform solar energy into electricity. These models estimate daily energy production based on anticipated weather data for a specific day [73]. The essential input elements include solar radiation, cloud cover and ambient temperature. Mathematical equations are used to determine the expected amount of energy generated by the solar panels [74]. Scientists can customize these models for different locations by incorporating local information such as panel tilt, particular weather patterns in the area, and historical energy generation data [75]. However, these physical models are most efficient under stable weather conditions; sudden fluctuations in weather variables can affect their accuracy. One example of a physical forecasting method involves the use of satellite imagery. This technique calculates solar energy production by analysing surface solar radiation records obtained by satellites in geostationary orbit [76]. Figure 10 presents the general outline of the physical models, which integrate climate forecasts with system information and measurement data to model solar irradiation on photovoltaic panels and anticipate the generated energy.
Figure 10 highlights how prediction based on physical models depends heavily on the accuracy of meteorological data and system characterization.
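The text does not give the equations behind the physical models. As an illustration only, the sketch below uses a widely used simplified formulation: cell temperature from the nominal operating cell temperature (NOCT) approximation, and power scaled linearly with irradiance and corrected for temperature. The function names, the default NOCT of 45 °C, the temperature coefficient of -0.4 %/°C, and the 0.9 system derate are assumptions, not values from the reviewed studies.

```python
def cell_temperature(ambient_c: float, irradiance_wm2: float,
                     noct_c: float = 45.0) -> float:
    """Estimate cell temperature (deg C) with the NOCT approximation."""
    return ambient_c + (noct_c - 20.0) / 800.0 * irradiance_wm2


def pv_power_kw(irradiance_wm2: float, ambient_c: float, rated_kw: float,
                temp_coeff_per_c: float = -0.004,
                system_derate: float = 0.9) -> float:
    """Estimate AC power (kW) from plane-of-array irradiance and air temperature.

    rated_kw is the array rating at standard test conditions
    (1000 W/m^2 and a cell temperature of 25 deg C).
    """
    if irradiance_wm2 <= 0.0:
        return 0.0
    t_cell = cell_temperature(ambient_c, irradiance_wm2)
    thermal_factor = 1.0 + temp_coeff_per_c * (t_cell - 25.0)
    power = rated_kw * (irradiance_wm2 / 1000.0) * thermal_factor * system_derate
    return max(power, 0.0)


# Hypothetical forecast inputs: 800 W/m^2 and 20 deg C for a 100 kW array.
print(f"cell temperature: {cell_temperature(20.0, 800.0):.1f} deg C")
print(f"estimated power:  {pv_power_kw(800.0, 20.0, 100.0):.2f} kW")
```

Because the output is a direct function of the forecast irradiance and temperature, any error in those meteorological inputs propagates straight into the power estimate, which is the dependence highlighted for Figure 10.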
(3) Hybrid models
The essential concept of hybrid methods is based on combining models with different theoretical foundations and predictive capabilities. This collaboration allows these approaches to address different aspects of solar variability that influence the accuracy of forecasts at various time scales and in different climatic situations [77]. By combining complementary techniques, hybrid models offer a more detailed description of the natural uncertainty of meteorological and photovoltaic systems, as well as providing greater adaptability to changing weather conditions. For example, scientists have created models that fuse deep neural networks with numerical weather prediction models, leveraging both the generalization capabilities of artificial intelligence and the physical basis of atmospheric modelling [78].
Hybrid artificial neural network (ANN) models that use metaheuristic approaches combine the predictive power of ANNs with the optimization capabilities of metaheuristic algorithms. This approach aims to overcome the limitations of ANNs alone, particularly their susceptibility to getting stuck in local minima during training and their sensitivity to initial parameters and network architecture [79]. Long short-term memory (LSTM) and gated recurrent unit (GRU) recurrent neural networks have demonstrated high efficiency in forecasting applications [80]. To address the challenges associated with search and convergence in inverse problems based on evolutionary computation, optimizations have been applied using hybrid models that employ techniques such as Flower Search Optimizations (FSO) and Particle Swarm Optimisation (PSO) [81]. Methods for identifying optimal deep learning model parameters using Genetic Algorithms (GAs) are also widely adopted for short-term forecasting [82].
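To make the idea of metaheuristic tuning concrete, the following is a minimal sketch of a basic global-best particle swarm optimizer applied to a stand-in objective. It is not the method of any cited study: the function names, swarm settings, and the quadratic `toy_validation_loss` are all assumptions. In a real hybrid model the objective would train the forecasting network with the candidate hyperparameters and return its validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(position) inside box bounds with global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val


# Hypothetical stand-in for a validation loss over two hyperparameters:
# log10(learning rate) and hidden-layer size.
def toy_validation_loss(params):
    log_lr, hidden_units = params
    return (log_lr + 3.0) ** 2 + ((hidden_units - 64.0) / 32.0) ** 2


best_params, best_loss = pso_minimize(
    toy_validation_loss, bounds=[(-5.0, -1.0), (8.0, 256.0)])
print(f"log10(lr) = {best_params[0]:.2f}, "
      f"hidden units = {best_params[1]:.0f}, loss = {best_loss:.4f}")
```

Each objective evaluation in practice means a full training run, so the swarm size and iteration count largely determine the computational cost that such hybrids incur.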
Table 4 summarizes a selection of studies from 2020 to 2025 on short-term, very short-term, intra-hourly and daily photovoltaic energy forecasts, with an emphasis on hybrid approaches and AI-based model optimization techniques. Each entry is classified by year of publication, time horizon and resolution, input parameters, forecasting model, country of origin of the photovoltaic data, and relevant observations. This compilation facilitates the identification of methodological and regional trends in photovoltaic forecasting, allowing for the comparison of hybrid approaches in different climates and data sources, and highlights improvements in accuracy relative to reference models.
3.3. Domain Applications of Short-Term Photovoltaic Forecasting
Short-term forecasting of electricity generated by solar panels is proving to be a key issue for the management and design of electrical systems affected by fluctuating renewable energy sources. As the amount of decentralized solar energy grows in different geographical areas, the ability to accurately forecast electricity production over periods ranging from a few minutes to a few hours is proving crucial to maintaining grid stability, optimizing operational efficiency, and ensuring system security [23].
Intra-hourly forecasts are fundamental for frequency and voltage management, as well as for automatic generation control in electrical systems with a high share of solar energy [127]. In today’s grid environments, these forecasts feed directly into Automatic Generation Control (AGC) and Energy Management Systems (EMSs), enabling real-time rebalancing in response to variability. Furthermore, short-term forecasts are used to improve the scheduling of thermal power units, battery utilization, and energy storage strategies in microgrids (MGs) and isolated systems [103].
Very short-term forecasts are especially necessary to ensure real-time balance between load, generation, and storage, particularly in hybrid configurations where multiple distributed energy sources operate simultaneously [128].
In expansive solar power facilities, the incorporation of hybrid AI frameworks into SCADA systems is becoming more prevalent for the purposes of ongoing surveillance, detecting anomalies, and planning operations for the following day. In such industrial settings, simply having accurate predictions is not enough; forecasting systems must also adhere to stringent latency standards, maintain high dependability, and show robustness against issues like sensor drift, measurement inaccuracies, and disruptions in communication. Furthermore, aligning varied data feeds from weather sensors, electricity meters, and satellite or aerial imaging systems poses significant technical difficulties within distributed IoT-Cloud setups.
In the context of microgrids and remote energy networks, forecast results must be closely integrated with Battery Management Systems (BMSs) and inverter control methods to ensure immediate power equilibrium while reducing dependence on fossil fuel backup generation. This integration necessitates that forecasting models are not only precise but also efficient in computation and able to function on limited-resource edge devices that are deployed on-site.
Models utilizing satellite-based deep neural networks (DNNs) that can adapt across various PV locations without needing local measurement tools have been suggested for areas with minimal infrastructure. These methods have shown prediction inaccuracies that are on par with those from models trained locally, enabling broader application in developing regions where monitoring systems are scarce. From an industrial angle, this ability to generalize lessens installation expenses, streamlines system setup, and boosts scalability.
Beyond assessing theoretical performance, the actual execution of hybrid forecasting models in working photovoltaic facilities involves tackling further engineering challenges, such as the following:
- Interoperable integration with SCADA systems and grid management platforms.
- Compliance with grid codes related to active and reactive power control.
- Curtailment management under grid congestion conditions.
- Predictive operation and maintenance (O&M) through deviation analysis between forecasted and actual generation.
- Secure and reliable data transmission within distributed IoT architectures.
Thus, the practical use of hybrid AI systems for short-term solar forecasting in industry requires a holistic methodology that combines algorithmic accuracy, computational efficiency, interoperability, robustness, and scalability. These factors elevate short-term forecasting beyond mere statistical modeling, making it a vital functional element of contemporary digitalized energy systems.
4. Internet of Things and Cloud-Based Architecture for Hybrid Solar Prediction Systems
In recent years, combining IoT technology with photovoltaic systems has transformed how solar energy is monitored, managed, and optimized, enabling real-time tracking, remote operation, predictive maintenance, and data-driven optimization. The essential elements of a solar monitoring system are the monitored variables, sensor types, control units, data transmission methods, software applications, and monitoring methods.
IoT architectures applied to energy monitoring follow a multi-layered model composed of a sensing layer, edge processing layer, communication layer, and application layer [127]. Figure 11 illustrates the block diagram of an IoT-based PV monitoring system.
4.1. Sensing Layer
The sensing layer gathers raw readings from devices that track variables such as photovoltaic panel voltage, array current, instantaneous power, solar irradiance, module temperature, and humidity.
In [127], Hall effect current sensors (ACS712), voltage dividers, LM35 temperature sensors, and DHT temperature and humidity sensors were connected to Arduino or ESP8266 microcontrollers for digital conversion and initial processing.
The integrity of the data obtained at this stage is crucial for the subsequent training of hybrid predictive models, as persistent measurement errors can impair the model’s ability to generalize.
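The conversion and plausibility checks performed at this stage can be sketched as follows. This is a minimal example under stated assumptions: a 10-bit ADC with a 5 V reference (as on an Arduino Uno; the ESP8266 ADC has a different input range), the LM35 scale factor of 10 mV/°C, and the 185 mV/A sensitivity of the 5 A ACS712 variant. The function names and the plausibility limits are hypothetical.

```python
ADC_MAX = 1023   # assumed 10-bit ADC
V_REF = 5.0      # assumed ADC reference voltage, volts


def adc_to_volts(counts):
    """Convert raw ADC counts to volts."""
    return counts * V_REF / ADC_MAX


def lm35_celsius(counts):
    """LM35 outputs 10 mV per degree Celsius."""
    return adc_to_volts(counts) / 0.010


def acs712_amps(counts, sensitivity_v_per_a=0.185, zero_offset_v=V_REF / 2):
    """ACS712 output sits near VCC/2 at zero current;
    0.185 V/A corresponds to the 5 A variant."""
    return (adc_to_volts(counts) - zero_offset_v) / sensitivity_v_per_a


def is_plausible(value, low, high):
    """Simple range check applied before a reading is stored."""
    return low <= value <= high


temp_c = lm35_celsius(61)
print(round(temp_c, 1), is_plausible(temp_c, -20.0, 90.0))
```

Readings that fail the range check would be discarded or marked as invalid rather than silently passed to model training.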
4.2. Edge Processing Layer
The Edge layer is the primary element of the system’s distributed architecture, as shown in Figure 12. The PV system sensors provide the data, which are then passed to the data sensing layer for pre-processing and semantic enrichment. The Edge layer comprises two processing sub-layers arranged hierarchically: because it only makes sense to enrich qualified or properly selected data, pre-processing always precedes data enrichment. The streaming processes are then analysed further by examining the capabilities of each layer more closely [128]. This layer can serve as a first stage of noise reduction before the data are fed to the hybrid physical-intelligent model.
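One simple way to implement the noise-reduction stage mentioned above is a rolling median, which suppresses isolated spikes and dropouts. The sketch below is only an example of such a filter, not the method used in [128]; the window length and the sample values are assumptions.

```python
from statistics import median


def rolling_median(samples, window=3):
    """Causal rolling median: each output uses the current sample
    and up to window - 1 preceding samples."""
    out = []
    for i in range(len(samples)):
        start = max(0, i - window + 1)
        out.append(median(samples[start:i + 1]))
    return out


# Hypothetical irradiance readings in W/m^2; 0.0 is a spurious dropout.
irradiance = [800.0, 805.0, 0.0, 810.0, 812.0]
print(rolling_median(irradiance))  # [800.0, 802.5, 800.0, 805.0, 810.0]
```

A causal window is used so that the filter can run on streaming data without waiting for future samples.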
4.3. Communication Layer
The communication layer of the IoT, also referred to as the network or transport layer, serves as the fundamental framework that facilitates the exchange of information among devices, gateways, and cloud services.
It accommodates a variety of protocols including Wi-Fi, BLE, Zigbee, LoRaWAN, and cellular networks like 5G, which link sensors to processing units. Essential functions of this layer encompass addressing, routing, and ensuring dependable packet delivery [127].
The selection of protocol affects the reliability of data transmission needed for immediate forecasts. Communication techniques may differ based on geographic and energy circumstances. As mentioned by [129], GSM is a more effective option for remote rural microgrids where Wi-Fi access is limited. ZigBee is employed for energy-efficient local networks as noted by [130]. In urban settings and IoT smart grid scenarios, Wi-Fi (ESP8266) is utilized. LoRa is preferred for extensive areas requiring minimal power usage.
4.4. Application Layer
The application layer represents the highest level within the IoT protocol framework, engaging directly with user software, applications, and devices to facilitate data formatting, assurance of security, and communication of messages [131].
The growing Internet of Things offers a chance to drastically improve the oversight of solar energy production and facility functions. For this purpose, a remote monitoring system is essential, leveraging the Internet of Things to collect and send data. This system consists of elements like a data gateway, data gathering, and display for a cloud service. Data acquired are saved in the cloud, allowing for a visual depiction of the monitored variables. To improve the use of solar energy, IoT-driven monitoring systems have been created to enable real-time data collection and analysis of solar metrics for efficiency forecasting and reliable power generation [132].
These findings highlight the necessity of considering recent climate variations when estimating solar output in near real-time, especially in isolated locations and areas that differ from worldwide patterns.
Table 5 indicates that while many IoT-based PV monitoring systems have been introduced, the majority of their designs concentrate mainly on data collection and visualization. There are few studies that incorporate edge-level processing alongside cloud-based hybrid AI forecasting models. Additionally, only a limited number of contributions assess communication reliability and its effect on the accuracy of short-term predictions. This deficiency underscores the necessity for fully integrated IoT-Edge-Cloud intelligent frameworks for photovoltaic forecasting, which serves as the primary motivation for this review.
4.5. System-Level Integration Challenges in IoT-Based PV Forecasting Architectures
Although IoT applications significantly improve monitoring and prediction in photovoltaic systems, there are still several system-level integration issues that have not been adequately addressed in research.
4.5.1. Communication Latency and Forecasting Accuracy
Short-term photovoltaic forecasting, especially at intra-hourly and minute intervals, requires reliable real-time data flows. However, communication latency caused by wireless protocols such as Wi-Fi, GSM, and LoRaWAN can cause desynchronisation between meteorological and electrical data. Even minimal delays can negatively affect model performance, especially in hybrid artificial intelligence architectures that rely on coordinated data from multiple sources. Thus, the accuracy of predictions is not only based on the algorithm, but is also influenced by the robustness and predictability of the IoT communication infrastructure used [133].
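The desynchronisation problem can be handled at ingestion time by pairing samples from different streams only when their timestamps are close enough. The following sketch assumes two time-sorted lists of `(unix_time_s, value)` tuples and a 30 s tolerance; both the data layout and the tolerance are illustrative assumptions.

```python
from bisect import bisect_left


def align_nearest(power, weather, tolerance_s=30):
    """Pair each power sample with the nearest weather sample in time.
    Pairs further apart than tolerance_s are dropped."""
    if not weather:
        return []
    w_times = [t for t, _ in weather]
    pairs = []
    for t, p in power:
        j = bisect_left(w_times, t)
        # The nearest neighbour is either just before or at/after position j.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(weather)]
        best = min(candidates, key=lambda k: abs(w_times[k] - t))
        if abs(w_times[best] - t) <= tolerance_s:
            pairs.append((t, p, weather[best][1]))
    return pairs


power = [(0, 1.2), (60, 1.5), (120, 1.4)]            # (time s, kW)
weather = [(5, 610.0), (58, 640.0), (200, 630.0)]    # (time s, W/m^2)
print(align_nearest(power, weather))
# [(0, 1.2, 610.0), (60, 1.5, 640.0)]
```

The power sample at 120 s is dropped because the closest weather reading is 62 s away, which makes the effect of latency on the usable training set explicit.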
4.5.2. Packet Loss and Data Integrity
In distributed solar power systems located in rural or remote areas, packet loss and unstable connectivity are common. Missing or corrupted sensor data can introduce bias during training and limit the ability of machine learning models to generalize.
Therefore, rigorous data validation, redundancy mechanisms, and edge-level buffering strategies are essential to ensure the quality of datasets before transferring them to the cloud [134].
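Edge-level buffering can be sketched as a bounded store-and-forward queue that holds readings while the uplink is down and delivers them in order once it recovers. The class name, the queue size, and the `send` callback are hypothetical; a real deployment would also persist the queue across power cycles.

```python
from collections import deque


class EdgeBuffer:
    """Bounded store-and-forward buffer for sensor readings."""

    def __init__(self, max_len=1000):
        # When the buffer is full, the oldest reading is discarded.
        self.queue = deque(maxlen=max_len)

    def add(self, reading):
        self.queue.append(reading)

    def flush(self, send):
        """Send buffered readings in order. `send` returns True on success.
        Stops at the first failure and keeps the remaining readings."""
        sent = 0
        while self.queue:
            if not send(self.queue[0]):
                break
            self.queue.popleft()
            sent += 1
        return sent


buf = EdgeBuffer(max_len=3)
for seq in range(4):
    buf.add({"seq": seq, "power_kw": 1.0 + seq})

print(buf.flush(lambda reading: False))  # link down: 0 readings sent
print(buf.flush(lambda reading: True))   # link up: 3 readings sent
```

Including a sequence number in each reading, as in the example, lets the cloud side detect which samples were lost when the buffer overflowed.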
4.5.3. Interoperability and Data Standardization
Photovoltaic monitoring systems that use IoT technology typically combine different sensors, microcontrollers, connection devices, and cloud services. The absence of uniform standards in data formats and communication interfaces complicates communication between dispersed nodes.
To achieve seamless integration, it is essential to use standardised protocols (such as MQTT and HTTP/REST) and structured formats (such as JSON or CSV). Without adequate interoperability, the large-scale expansion of distributed forecasting architectures remains limited [135].
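A shared payload definition is one practical step towards such interoperability. The sketch below encodes and validates a JSON telemetry message using only the Python standard library. The field names are hypothetical and do not follow any particular published schema; the resulting string could be published over MQTT or sent in an HTTP request body.

```python
import json

# Hypothetical field set: name -> accepted Python type(s)
REQUIRED_FIELDS = {
    "site_id": str,
    "timestamp_utc": str,            # ISO 8601 string, e.g. "2025-01-01T12:00:00Z"
    "power_kw": (int, float),
    "irradiance_w_m2": (int, float),
}


def validate_reading(reading):
    """Raise ValueError if a required field is missing or has the wrong type."""
    for name, expected in REQUIRED_FIELDS.items():
        if name not in reading:
            raise ValueError(f"missing field: {name}")
        if not isinstance(reading[name], expected):
            raise ValueError(f"wrong type for field: {name}")
    return reading


def encode_reading(reading):
    """Validate and serialise a reading to a JSON string."""
    return json.dumps(validate_reading(reading), sort_keys=True)


def decode_reading(payload):
    """Parse a JSON string and validate its contents."""
    return validate_reading(json.loads(payload))


msg = encode_reading({
    "site_id": "pv-site-01",
    "timestamp_utc": "2025-01-01T12:00:00Z",
    "power_kw": 3.2,
    "irradiance_w_m2": 640,
})
print(decode_reading(msg)["power_kw"])  # 3.2
```

Validating on both the sending and receiving side means that a node with outdated firmware fails loudly instead of contaminating the dataset.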
4.5.4. Scalability of Distributed Sensor Networks
With the rise in monitored solar power setups, the importance of system scalability grows. Centralized cloud systems might face limits in bandwidth and higher processing demands.
Hybrid edge-cloud models address this challenge by shifting initial filtering and feature extraction activities to edge nodes, lessening reliance on the cloud and enhancing the overall efficiency of the system [136].
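The bandwidth saving from edge-side feature extraction can be illustrated with a short sketch that reduces each block of raw samples to a few summary statistics before upload. The choice of features and the window length are assumptions; the reviewed studies may use different ones.

```python
from statistics import mean, pstdev


def window_features(samples, window):
    """Summarise consecutive non-overlapping windows of raw samples.
    A trailing partial window is ignored."""
    features = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        features.append({
            "mean": mean(chunk),
            "min": min(chunk),
            "max": max(chunk),
            "std": pstdev(chunk),
        })
    return features


raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # hypothetical raw samples
feats = window_features(raw, window=3)
print(len(raw), "samples ->", len(feats), "feature records")
```

With one-second sampling and a 60-sample window, for example, each node would upload one record per minute instead of sixty raw values.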
4.5.5. Cybersecurity and Data Privacy
IoT-enabled photovoltaic architectures are exposed to cyberattacks such as identity spoofing, data manipulation, and denial-of-service incidents. Since predictive models rely on accurate sensor data, manipulated information can lead to erroneous operational decisions.
Consequently, robust encryption, validation methods, and secure communication channels are essential elements in predictive energy systems that leverage the IoT [137].
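One widely used validation method is a keyed message authentication code, which lets the receiver detect tampered sensor messages. The sketch below uses HMAC-SHA256 from the Python standard library. The shared key and payload are placeholders; key distribution and transport encryption (for example TLS) are outside the scope of this example.

```python
import hashlib
import hmac
import json


def sign(payload: bytes, key: bytes) -> str:
    """Return the HMAC-SHA256 tag of the payload as a hex string."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, tag: str, key: bytes) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(payload, key), tag)


key = b"replace-with-a-provisioned-secret"   # placeholder key
payload = json.dumps({"site_id": "pv-site-01", "power_kw": 3.2},
                     sort_keys=True).encode("utf-8")
tag = sign(payload, key)

tampered = payload.replace(b"3.2", b"9.9")
print(verify(payload, tag, key), verify(tampered, tag, key))  # True False
```

An authentication tag protects integrity only; it does not hide the content of the message, so it complements rather than replaces encryption.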
4.5.6. FAIR Data Compliance and Data Exchange
Ensuring that solar datasets produced by the IoT adhere to FAIR Standards (Findable, Accessible, Interoperable, and Reusable) is essential for reproducibility in science and collaboration in large-scale predictive research [138].
Achieving this typically involves:
- Structured metadata annotation.
- Persistent identifiers (DOIs) for datasets.
- Open and machine-readable formats.
- Public or controlled-access repositories.
The methodical incorporation of data sources that align with FAIR Standards promotes clarity, enables better comparison of hybrid artificial intelligence models, and optimizes data exchange between different institutions.
5. RQ1: What Are the Input Variables and Most Commonly Used Techniques That Stand out in Hybrid Models for Short-Term Solar Forecasting?
First, as common input elements, meteorological characteristics like temperature, relative humidity, and wind speed clearly dominate. Furthermore, historical photovoltaic energy production data are frequently utilized.
It should be mentioned that the most popular hybrid models are AI-AI combinations, which are followed by AI combinations with signal decomposition and optimization algorithms. The distribution of the hybrid model categories and the relative frequency of the input factors utilized in the study are compared in Figure 13.
Figure 13b shows solar irradiance as a third important factor, while images and other statistical data are used less frequently as input parameters in hybrid models. This highlights the four most commonly used parameters in the development of hybrid models.
By gathering information on the input parameters and the different hybrid modelling techniques used, Figure 14 shows the distribution of input parameters across different categories of hybrid models. This perspective offers a deeper understanding of hybrid modelling strategies, revealing that the choice of input data influences the effectiveness and applicability of each hybrid forecasting approach.
6. RQ2: What Enabling Technologies Are Used in Hybrid Models for Short-Term Photovoltaic Energy Prediction, and What Are Their Limitations?
The analysis shows that deep learning architectures form the computational core of most hybrid models, and that their effectiveness is often improved through evolutionary optimization for parameter tuning and signal decomposition for data refinement. Notably, the choice of enabling technology exhibits strong horizon-dependence: Vision-Based methods are particularly effective for minute-ahead forecasts, whereas Hybrid Statistical-ML frameworks show promise for probabilistic day-ahead predictions. Table 6 systematically categorizes the enabling technologies employed in hybrid models for short-term PV power forecasting, synthesizing their roles and limitations.
7. RQ3: Impact of Hybrid Architectures on the Performance of Short-Term Photovoltaic Energy Forecasting
Hybrid models outperform standalone models in forecasting accuracy. AI-AI combinations predominate in this area, especially variants of recurrent neural networks (LSTM, GRU) integrated with optimization methods (GA, PSO, WOA) or signal decomposition techniques (CEEMDAN, VMD). These hybrid approaches consistently outperform individual models, with reported reductions in mean squared error ranging from 9% to 96%. However, these gains come with trade-offs. While optimization-enhanced hybrids (e.g., GWO-MLP [64]) improve accuracy by 15–30%, they increase computational cost by 40–60% due to population-based searching. Similarly, decomposition-based hybrids (e.g., EMD-SCA-ELM [70]) excel in very short-term horizons (≤15 min) but add pre-processing overhead that may not be justifiable for longer forecasts.
Figure 15 summarizes the comparative performance of the main categories of hybrid PV forecasting models using a qualitative heatmap. It contrasts five model families (AI–AI, AI–Optimization, Decomposition-Based, Statistical–AI, and Vision-Based) across four dimensions: accuracy, computational cost, model complexity, and robustness. In addition, an embedded table reports the forecasting horizon and temporal resolution; since these are expressed as ranges rather than ordinal variables, they are presented separately and are not included in the heatmap.
The effectiveness of hybrid architectures is strongly horizon dependent:
- Ultra-short-term (≤1 h): Decomposition-based and vision-based hybrids perform best.
- Short-term (1–6 h): Optimization-enhanced RNNs are most effective.
- Day-ahead: Data-fusion hybrids combining NWP with AI show superior reliability.
As illustrated in Figure 15, no single architecture dominates across all metrics. AI-AI hybrids achieve peak accuracy but with high complexity, while optimization-based hybrids offer better computational efficiency, and statistical-AI hybrids provide superior interpretability.
8. RQ4: Challenges of Hybrid Models for Short-Term Photovoltaic Energy Prediction
The main barriers identified in the reviewed literature are detailed below, along with proposals for overcoming them. Hybrid models that incorporate optimization and signal decomposition require substantial computational resources for training, which hinders their real-time integration and validation.
Analyses show that optimization-enhanced hybrids (e.g., GWO-MLP) can increase computational load by 40% to 60% compared to individual models. Furthermore, decomposition-based pre-processing introduces significant latency, particularly for very short-term forecasts. This limits scalability for utilities with limited IT budgets or edge deployment scenarios, especially in distributed solar energy systems.
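The source of this overhead can be made concrete with a simplified Grey Wolf Optimizer (GWO) sketch. In an optimization-enhanced hybrid, each fitness evaluation would presumably correspond to training and validating a model, so the cost grows with the number of wolves times the number of iterations. The code below is a minimal pure-Python version under stated assumptions: the leaders are taken from the current population at each iteration, and the fitness function is a toy quadratic standing in for a validation loss. It is not the implementation used in [64].

```python
import random


def gwo_minimize(fitness, bounds, n_wolves=12, n_iter=40, seed=0):
    """Simplified Grey Wolf Optimizer. Requires n_wolves >= 3.
    Returns (best_position, best_fitness, number_of_evaluations)."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_x, best_f = None, float("inf")
    evaluations = 0
    for t in range(n_iter):
        scored = []
        for w in wolves:
            scored.append((fitness(w), w))   # one (costly) evaluation per wolf
            evaluations += 1
        scored.sort(key=lambda s: s[0])
        if scored[0][0] < best_f:
            best_f, best_x = scored[0][0], list(scored[0][1])
        leaders = [scored[k][1] for k in range(3)]   # alpha, beta, delta
        a = 2.0 * (1 - t / n_iter)                   # decreases linearly from 2 towards 0
        new_wolves = []
        for w in wolves:
            pos = []
            for d, (lo, hi) in enumerate(bounds):
                x = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    x += leader[d] - A * D
                # Average the three leader-guided moves and clip to the bounds.
                pos.append(min(hi, max(lo, x / 3)))
            new_wolves.append(pos)
        wolves = new_wolves
    return best_x, best_f, evaluations


def toy_loss(x):
    """Stand-in for a validation loss over two normalised hyperparameters."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


best_x, best_f, n_evals = gwo_minimize(toy_loss, [(0.0, 1.0), (0.0, 1.0)])
print(n_evals)  # 480 fitness evaluations (12 wolves x 40 iterations)
```

With these settings the search performs 480 fitness evaluations, which is why population-based tuning becomes expensive when each evaluation involves training a neural network.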
Prediction accuracy depends on high-quality, well-defined input data (e.g., irradiance, sky images, NWP). Low-quality data limit the model’s ability to generalise and increase the pre-processing burden, particularly in hybrid models that use sky images and satellite data.
The lack of uniformity in evaluation measures (such as RMSE, MAE, and skill score) and the variety of experimental conditions complicate fair comparison and replicability across studies. Only 34% of the studies reviewed used multiple error metrics, and fewer than 20% provided comprehensive quantification of uncertainty or prediction intervals.
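For reference, the three measures named above can be computed as in the following sketch. The skill score is written here in one common form, 1 − RMSE_model / RMSE_reference, with a persistence forecast as the reference; individual studies may define it differently. The sample values are hypothetical.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def skill_score(y_true, y_pred, y_ref):
    """1 - RMSE_model / RMSE_reference; positive values mean the model
    beats the reference forecast."""
    return 1.0 - rmse(y_true, y_pred) / rmse(y_true, y_ref)


actual = [10.0, 12.0, 15.0, 14.0]        # measured power, kW
model = [11.0, 12.0, 14.0, 14.0]         # hybrid model forecast, kW
persistence = [9.0, 10.0, 12.0, 15.0]    # previous observed value, kW

print(mae(actual, model))                                  # 0.5
print(round(rmse(actual, model), 4))                       # 0.7071
print(round(skill_score(actual, model, persistence), 4))   # 0.6349
```

Reporting all three against the same reference forecast is one way to make results from different studies comparable.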
Many hybrid models are developed in isolation and are not designed to interface with existing SCADA, EMS, or data file systems. Only 12% of the studies reviewed mentioned integration with grid management platforms or real-time data streams.
Table 7 summarizes the main barriers hindering the implementation of hybrid photovoltaic forecasting models, along with proposed solutions and supporting evidence drawn from the reviewed literature. This synthesis provides a clear roadmap for researchers and practitioners seeking to develop next-generation, grid-ready forecasting systems.
9. Conclusions and Future Directions
This literature review examines how hybrid architectures affect short-term solar energy prediction, a field relevant to sustainable energy management companies, renewable energy researchers, and the further development of energy management techniques in smart grids. Guided by four research questions, it provides a structured and comprehensive account of how hybrid models are being applied, which technologies predominate in their architectures, what challenges and limitations they face, and where these models are headed.
RQ1 examines which hybrid models are most widely used and which input parameters they are trained on. The analysis shows that AI-AI hybrids are the most widely used and that their main inputs are historical data and meteorological parameters. Combining AI-AI architectures with optimization can improve predictions using the same amount of input data and emerges as a promising direction.
RQ2 identifies that deep learning models are highly complex and require large amounts of data, while optimization algorithms generate high computational costs. To overcome these difficulties, we suggest creating lighter deep learning architectures for use on local devices, developing decomposition methods that adapt to specific weather patterns, and establishing benchmark standards to ensure reproducible optimization results. Furthermore, future studies should investigate federated learning models that utilize distributed solar data while maintaining privacy, as well as physics-based artificial intelligence to improve the generalization ability of models in extreme weather situations.
RQ3 addresses how the families of hybrid models identified in the review perform in short-term solar energy prediction. The review highlights that optimization-enhanced AI hybrids achieve the highest accuracy for 1–6 h-ahead forecasts compared with the other hybrid families, although at a high computational cost. AI-AI hybrids stand out for their high model complexity, but they are also among the most accurate models without incurring such high computational costs. Future research should develop hybrids that combine the AI-AI family with optimization methods to improve model stability while keeping computational cost contained and accuracy high.
Finally, RQ4 allowed us to identify the direction in which solutions to the complexities of hybrid models should be oriented. The review proposes a hybrid architecture that combines CNN-LSTM optimized with GWO, which includes the optimization of historical data processing in order to improve the model’s learning. This architecture aligns with the growing need for more accurate prediction systems that can not only process historical data but also use images to strengthen predictions with more input parameters.
To conclude the contributions presented in this review, a research roadmap based on the four research questions is included. Current applications of hybrid models focus mainly on improving forecast accuracy, at the price of greater model complexity and computational cost. Signal processing technologies or the use of optimized models are proposed as a solution. Based on these findings, future research should aim to (i) extend the application of hybrid models to medium- and long-term horizons, (ii) publish real-world case studies that apply the hybrid models identified here, focusing on practical applications to bridge the gap between academic research and industrial practice, (iii) develop CNN-LSTM-GWO hybrid architectures that adapt to different dynamic climatic environments, and (iv) promote approaches that standardize data collection and the evaluation of existing architectures. This roadmap seeks to orient future initiatives towards hybrid models that are more scalable, intelligent, and accurate and less complex.
References
- [1] Tian Q., Li J., Xie Z., Li P., Wang Y., Chen D., Zheng Y. A novel metric for quantifying solar irradiance stability: Mapping solar irradiance variability to photovoltaic power generation. Renew. Energy 2025, 239, 122035. doi:10.1016/j.renene.2024.122035
- [2] Said Z., Alshehhi A.A., Mehmood A. Predictions of UAE’s renewable energy mix in 2030. Renew. Energy 2018, 118, 779–789. doi:10.1016/j.renene.2017.11.075
- [3] Puziem A.S., Diawuo F.A., Acheampong P., Anabadongo M.A., Abdulai D. Time series forecast of power output of a 50 MWp solar farm in Ghana. Sol. Compass 2025, 14, 100111. doi:10.1016/j.solcom.2025.100111
- [4] Yang D., Wang W., Gueymard C.A., Hong T., Kleissl J., Huang J., Perez M.J., Perez R., Bright J.M., Xia X. A review of solar forecasting, its dependence on atmospheric sciences and implications for grid integration: Towards carbon neutrality. Renew. Sustain. Energy Rev. 2022, 161, 112348. doi:10.1016/j.rser.2022.112348
- [5] Maciel J.N., Ledesma J.J.G., Junior O.H.A. Hybrid prediction method of solar irradiance applied to short-term photovoltaic energy generation. Renew. Sustain. Energy Rev. 2024, 192, 114185. doi:10.1016/j.rser.2023.114185
- [6] Tian Z., Chen Y., Wang G. Enhancing PV power forecasting accuracy through nonlinear weather correction based on multi-task learning. Appl. Energy 2025, 386, 125525. doi:10.1016/j.apenergy.2025.125525
- [7] Negash T., Weldemikael N., Ghebregziabiher M., Tedla Y., István S., István F. Addressing photovoltaic (PV) forecasting challenges: Satellite-driven data models for predicting actual PV generation using hybrid (LSTM-GRU) model. Energy Rep. 2025, 14, 2141–2156. doi:10.1016/j.egyr.2025.08.034
- [8] Zhuo Y., Long H., Wu Z., Gu W. LFTL: Lightweight feature transfer learning with channel-independent LSTM for distributed PV forecasting. Energy AI 2025, 22, 100616. doi:10.1016/j.egyai.2025.100616
