Forecasting with Deep Learning

Gissel Velarde

arXiv:2302.12027 · cs.LG · February 24, 2023


TL;DR

This paper explores deep learning for time series forecasting, showing that it works well on series with repeating patterns but struggles with less structured series such as stock prices, and it provides an open-source implementation.

Contribution

It presents a forecasting method based on LSTM and GRU networks and evaluates it on two datasets: synthetic weekly-activity series and stock market closing prices.

Findings

01 Deep learning models perform well on patterned time series.

02 Models struggle with unstructured data like stock prices.

03 Open-source implementation is provided.

Abstract

This paper presents a method for time series forecasting with deep learning and its assessment on two datasets. The method starts with data preparation, followed by model training and evaluation. The final step is a visual inspection. Experimental work demonstrates that a single time series can be used to train deep learning networks if time series in a dataset contain patterns that repeat even with a certain variation. However, for less structured time series such as stock market closing prices, the networks perform just like a baseline that repeats the last observed value. The implementation of the method as well as the experiments are open-source.


Code & Models

Repositories

Alebuenoaz/LSTM-and-GRU-Time-Series-Forecasting


Taxonomy

Topics: Time Series Analysis and Forecasting · Stock Market Forecasting Methods · Complex Systems and Time Series Analysis

Full text

Data & Analytics, Vodafone


Author’s accepted manuscript of: Velarde, G. (2022). Forecasting with Deep Learning [White Paper]. Vodafone. The Data Digest, 2(8).



Keywords: Forecasting · Deep Learning · Machine Learning · Time Series

1 Introduction

This paper presents a method based on two related deep learning architectures: Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Both are Recurrent Neural Networks (RNNs), which are known to model dependencies over time [4] and are therefore relevant to time series forecasting. LSTM controls which content it keeps through its input, forget, and output gates [3]. GRU uses reset and update gates [2]. Since their inception, both networks have been used extensively in problems of a sequential nature.

Although there are classical methods for time series forecasting, such as Autoregressive Integrated Moving Average (ARIMA), this report focuses on deep learning networks. Previous studies have shown that LSTM outperforms ARIMA on financial data [5] and that, among various deep learning models, LSTM and GRU deliver low forecasting error [1]. The method is summarized next; a detailed description can be found in [6].

2 Method

The method consists of data preparation, model training, evaluation, and visual inspection, as seen in Fig. 2. Data preparation consists of normalization, the definition of the train and test partition, and the selection of a time series for training. First, each time series in the dataset is normalized between 0 and 1. Then, a time series of length $Q$ samples is prepared as in Fig. 2, where $w$ is the window size, $f$ is the number of steps ahead to forecast, and $N$ is the number of training samples. The remaining samples are used for testing.
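The data preparation step can be illustrated with a minimal sketch. The function names (min_max_normalize, denormalize, make_windows) are hypothetical and not taken from the paper or its repository, and the exact window layout of the original figure is assumed here to be the usual one: each sample pairs $w$ consecutive values with the $f$ values that follow them.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; also return (lo, hi) so the scaling can be undone."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        raise ValueError("a constant series cannot be min-max normalized")
    return [(x - lo) / span for x in series], (lo, hi)


def denormalize(values, bounds):
    """Map normalized values back to the original scale."""
    lo, hi = bounds
    return [v * (hi - lo) + lo for v in values]


def make_windows(series, w, f):
    """Pair each window of w past values with the f values that follow it."""
    inputs, targets = [], []
    for start in range(len(series) - w - f + 1):
        inputs.append(series[start:start + w])
        targets.append(series[start + w:start + w + f])
    return inputs, targets


# Toy example: 6 samples, window w=3, horizon f=2 gives 6 - 3 - 2 + 1 = 2 pairs.
normalized, bounds = min_max_normalize([10, 12, 14, 16, 18, 20])
X, y = make_windows(normalized, w=3, f=2)
```

With this layout, a series of length $Q$ yields $Q - w - f + 1$ input/target pairs before any train and test partition is applied.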

Model training consists of training either an LSTM or a GRU network with one layer of 128 units, followed by a dense layer that outputs the $f$-step-ahead forecast. The networks are trained for 200 epochs with the Adam optimizer and a Mean Squared Error (MSE) loss function.
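The architecture described above could be expressed as follows. This is a sketch that assumes TensorFlow's Keras API; the framework, the function name build_forecaster, the single input feature per time step, and all settings not stated in the text (batch size, learning rate, activations) are assumptions left at library defaults, not details confirmed by the paper.

```python
import tensorflow as tf


def build_forecaster(cell, w, f, units=128):
    """One recurrent layer (LSTM or GRU) followed by a dense layer with f outputs."""
    recurrent = {"lstm": tf.keras.layers.LSTM, "gru": tf.keras.layers.GRU}[cell]
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(w, 1)),  # w past values, one feature per step (assumed)
        recurrent(units),              # 128 units, as stated in the text
        tf.keras.layers.Dense(f),      # one output per forecast step
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


model = build_forecaster("gru", w=60, f=20)
# Training, assuming X has shape (N, 60, 1) and y has shape (N, 20):
# model.fit(X, y, epochs=200)
```

Switching between the two networks only changes the recurrent layer, which keeps a comparison between LSTM and GRU on equal footing.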

Evaluation consists of measuring Root Mean Squared Error (RMSE) and Directional Accuracy (DA) between actual and predicted values on the test set. Finally, each time series is denormalized and plotted so that visual inspection can help interpret the numerical results.
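A small sketch of the two metrics follows. RMSE is standard. The text does not define DA, so the version below is an assumption: it counts how often the direction of change between consecutive predictions matches the direction of change between consecutive actual values. Other definitions exist, for example comparing each prediction against the last observed value, and the paper's exact formula may differ.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(actual, predicted):
    """Share of steps where predicted and actual series move in the same direction.

    Assumed definition: compares the sign of consecutive differences.
    """
    hits = sum(
        sign(actual[t + 1] - actual[t]) == sign(predicted[t + 1] - predicted[t])
        for t in range(len(actual) - 1)
    )
    return hits / (len(actual) - 1)


actual = [1.0, 2.0, 3.0, 2.0, 4.0]
predicted = [1.2, 1.8, 3.5, 3.6, 3.9]
error = rmse(actual, predicted)
da = directional_accuracy(actual, predicted)  # 3 of 4 directions match: 0.75
```

In this toy example the prediction misses only the downward move from 3.0 to 2.0, so three of the four directions agree.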

3 Experiments

The experimental setup can be seen in Fig. 3. The method was tested on two datasets, each with ten time series. The first is the Activities dataset, which contains ten synthetic time series with five days of high activity and two days of low activity. This dataset may resemble, for example, the volume of weekly calls (see Fig. 4). The second is the BANKEX dataset, which contains stock market closing prices of ten financial institutions (see Fig. 5). Fig. 6 shows the effect of normalization between 0 and 1. A window of size $w = 60$ days was used for data preparation with the first time series of each dataset. The last 251 samples of each series were used for testing. Forecasting was performed by LSTM and GRU networks and by a Baseline that simply repeats the last observed value. Each model was evaluated on one-step-ahead and twenty-step-ahead RMSE and DA.
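The test split and the Baseline are simple enough to sketch directly. The function names split_last and baseline_forecast are hypothetical, and how the original implementation builds test windows near the train and test boundary is not stated here, so that detail is left out.

```python
def split_last(series, n_test=251):
    """Hold out the last n_test samples of a series for testing."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


def baseline_forecast(window, f):
    """Baseline: repeat the last observed value for each of the f steps ahead."""
    return [window[-1]] * f


series = [float(i) for i in range(300)]
train, test = split_last(series)  # 49 training samples, 251 test samples
one_step = baseline_forecast([0.2, 0.5, 0.4], f=1)      # [0.4]
twenty_step = baseline_forecast([0.2, 0.5, 0.4], f=20)  # twenty copies of 0.4
```

A Baseline of this kind is a useful reference point because any model that cannot beat it has not learned more than persistence of the last value.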

3.1 Results

Tables 2 to 5 summarise the mean and standard deviation (SD) of RMSE and DA over the ten time series on the test set of each dataset. Close-to-zero RMSE and close-to-one DA are preferred. Tables 2 and 3 present the results on the Activities dataset. The best results are highlighted in blue.

For one-step-ahead forecasts, GRU significantly outperforms LSTM and the Baseline on RMSE. However, both deep learning networks perform equally well on DA and significantly outperform the Baseline. For twenty-step-ahead forecasts, LSTM is the clear winner on both RMSE and DA. On the Activities dataset, the networks prove capable of learning patterns that repeat, even with a certain variation.

Tables 4 and 5 present the results on the BANKEX dataset. In this case, the networks perform just like the Baseline, possibly due to the nature of stock market series. Finally, visual inspection helps understand the numerical results; see Fig. 7 and Fig. 8.

4 Conclusion

This paper showcases a method using LSTM and GRU deep learning networks for time series forecasting with the following highlights:

• It shows that LSTM and GRU networks can be trained for forecasting with a single time series in a dataset of series with patterns that repeat even with certain variation, if the data is properly prepared.

• It shows the performance of the method on two datasets. While the method is appropriate for time series that contain repeating patterns, like those of weekly activities, it is not appropriate for stock market data, possibly because some information is not encoded in the closing price alone, or due to the nature of the problem.

• It is flexible enough to forecast not only one step ahead but also twenty steps ahead.

• In addition to the numerical evaluation provided by RMSE and DA, visual inspection helps understand the numerical results.

• The implementation and results are reproducible and shared as open-source at: https://github.com/Alebuenoaz/LSTM-and-GRU-Time-Series-Forecasting

Dr. Gissel Velarde is a Senior Expert Data Scientist at Vodafone. She holds a Ph.D. degree from Aalborg University for her thesis on Machine Learning-based methods for media analysis, pattern discovery, and classification. In addition, she developed computational creativity models. She has taught Artificial Intelligence, Machine Learning, and Deep Learning courses at the university level. She has also supervised the development of analysis and recommendation systems for media applications. Currently, she leads projects for forecasting and fraud detection systems.

Bibliography

[1] Balaji, A.J., Ram, D.H., Nair, B.B.: Applicability of deep learning models for stock price forecasting: an empirical study on BANKEX data. Procedia Computer Science 143, 947–953 (2018)
[2] Cho, K., Van Merriënboer, B., Bahdanau, D., Bengio, Y.: On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259 (2014)
[3] Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Computation 9(8), 1735–1780 (1997)
[4] Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
[5] Siami-Namini, S., Tavakoli, N., Namin, A.S.: A comparison of ARIMA and LSTM in forecasting time series. In: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). pp. 1394–1401. IEEE (2018)
[6] Velarde, G., Brañez, P., Bueno, A., Heredia, R., Lopez-Ledezma, M.: An open source and reproducible implementation of LSTM and GRU networks for time series forecasting. Engineering Proceedings 18(1), 30 (2022), https://doi.org/10.3390/engproc2022018030