Time Series Forecasting (TSF) Using Various Deep Learning Models
Jimeng Shi, Mahek Jain, Giri Narasimhan

TL;DR
This paper compares deep learning models including RNN, LSTM, GRU, and Transformer for time series forecasting using the Beijing Air Quality dataset, analyzing how window size and prediction horizon affect performance.
Contribution
The paper compares four deep learning models (RNN, LSTM, GRU, and Transformer) against a baseline method for TSF, and examines how the look-back window size and the forecast horizon affect their performance.
Findings
The Transformer outperforms the other models on both MAE and RMSE.
The optimal look-back window is about one day for 1-hour-ahead prediction.
Performance varies with forecast horizon and window size.
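For readers unfamiliar with the two error metrics named in the findings, the following is a minimal sketch of how MAE (mean absolute error) and RMSE (root mean squared error) are conventionally computed. The function names `mae` and `rmse` are illustrative and are not taken from the paper, whose own evaluation code is not shown here.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: the average of |y_true - y_pred|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the average squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

if __name__ == "__main__":
    # Toy values for illustration only, not results from the paper.
    observed = [3.0, 5.0]
    predicted = [0.0, 1.0]
    print("MAE :", mae(observed, predicted))
    print("RMSE:", rmse(observed, predicted))
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason the two are often reported together.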
Abstract
Time Series Forecasting (TSF) is used to predict target variables at a future time point based on what has been learned from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The hourly dataset we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an…
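The fixed-length look-back window and the forecast horizon described in the abstract can be illustrated with a short sketch. It assumes a NumPy array of shape (time steps, features) and a single target column. The function name `make_windows` and its parameters `look_back`, `horizon`, and `target_col` are hypothetical names chosen for illustration, not the authors' preprocessing code.

```python
import numpy as np

def make_windows(series, look_back, horizon, target_col=0):
    """Slice a time series into (input window, future target) pairs.

    series     : array of shape (T,) or (T, F), ordered in time
    look_back  : number of past steps fed to the model
    horizon    : how many steps ahead the target lies (1 = next step)
    target_col : index of the feature to predict (an assumption here)
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a 1-D series as one feature
    n_samples = len(series) - look_back - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for this look_back and horizon")
    # Each input is look_back consecutive rows starting at index i.
    X = np.stack([series[i:i + look_back] for i in range(n_samples)])
    # Each target lies horizon steps after the last row of its window.
    y = np.array([series[i + look_back + horizon - 1, target_col]
                  for i in range(n_samples)])
    return X, y

if __name__ == "__main__":
    toy = np.arange(10)  # stand-in for an hourly measurement series
    X, y = make_windows(toy, look_back=3, horizon=2)
    print(X.shape, y.shape)
    print(X[0].ravel(), "->", y[0])
```

With hourly data, a one-day look-back would correspond to `look_back=24` and a 1-hour-ahead prediction to `horizon=1` under this convention.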
Taxonomy
Topics: Air Quality Monitoring and Forecasting · Traffic Prediction and Management Techniques · Forecasting Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Dense Connections · Softmax
