Differential Evolution Algorithm based Hyper-Parameters Selection of Transformer Neural Network Model for Load Forecasting
Anuvab Sen, Arul Rhik Mazumder, Udayon Sen

TL;DR
This paper explores the use of Differential Evolution to optimize hyperparameters of a Transformer-based neural network for load forecasting, demonstrating improved accuracy over traditional models.
Contribution
It introduces a novel approach combining Transformer models with Differential Evolution for hyperparameter tuning in load forecasting tasks.
Findings
Metaheuristic-optimized Transformer models outperform traditional methods.
Differential Evolution effectively finds optimal hyperparameters for load forecasting.
Enhanced models show lower MSE and MAPE in experiments.
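For reference, the two error metrics named in the findings can be sketched as below. This is a minimal illustration using the standard definitions of mean squared error and mean absolute percentage error; the function names and the toy values are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between observed and forecast load."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero, which is reasonable for load data
    but should be checked for any other series.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

# Toy example (hypothetical values, not from the paper).
observed = [100.0, 200.0]
forecast = [110.0, 190.0]
print(mse(observed, forecast), mape(observed, forecast))
```

MSE penalizes large errors more heavily, while MAPE expresses the error relative to the size of the load, so the two are usually reported together.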
Abstract
Accurate load forecasting plays a vital role in numerous sectors, but capturing the complex dynamics of power systems remains a challenge for traditional statistical models. For these reasons, time-series models (ARIMA) and deep-learning models (ANN, LSTM, GRU, etc.) are commonly deployed and often achieve better results. In this paper, we analyze the efficacy of the recently developed Transformer-based Neural Network model in load forecasting. Transformer models have the potential to improve load forecasting because their Attention Mechanism lets them learn long-range dependencies. We apply several metaheuristics, namely Differential Evolution, to find the optimal hyperparameters of the Transformer-based Neural Network and produce accurate forecasts. Differential Evolution provides scalable, robust, global solutions to non-differentiable,…
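The hyperparameter search described in the abstract can be illustrated with a minimal sketch of the classic DE/rand/1/bin variant of Differential Evolution. This is not the authors' implementation: the function names, the default settings (`pop_size`, `F`, `CR`, `generations`), the search bounds, and the toy objective are all assumptions. In a real setting the objective would train a Transformer with the candidate hyperparameters and return its validation error; here a cheap quadratic stands in for that step.

```python
import numpy as np

def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    low, high = bounds[:, 0], bounds[:, 1]
    dim = len(bounds)

    # Random initial population inside the bounds.
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct members other than i.
            candidates = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(candidates, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), low, high)

            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])

            # Greedy selection: keep the trial only if it is no worse.
            f_trial = objective(trial)
            if f_trial <= fitness[i]:
                pop[i], fitness[i] = trial, f_trial

    best = int(np.argmin(fitness))
    return pop[best], float(fitness[best])

def toy_validation_loss(x):
    """Hypothetical stand-in for 'train the model, return validation error'.

    x[0] is log10(learning rate), x[1] is the dropout rate; the minimum
    is placed at (-3.0, 0.1) purely for illustration.
    """
    return (x[0] + 3.0) ** 2 + (x[1] - 0.1) ** 2

# Assumed search ranges: log10(lr) in [-5, -1], dropout in [0, 0.5].
search_bounds = [(-5.0, -1.0), (0.0, 0.5)]
best_x, best_loss = differential_evolution(toy_validation_loss, search_bounds)
print(best_x, best_loss)
```

Because the selection step only compares objective values, the objective does not need to be differentiable, which is what makes this family of methods applicable to hyperparameter tuning. Integer-valued hyperparameters such as the number of attention heads would need rounding inside the objective, which this sketch omits.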
Taxonomy
Topics: Energy Load and Power Forecasting · Neural Networks and Applications · Image and Signal Denoising Methods
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Linear Layer · Dropout · Sigmoid Activation · Adam · Gated Recurrent Unit · Label Smoothing
