# Curriculum Learning in Deep Neural Networks for Financial Forecasting

**Authors:** Allison Koenecke, Amita Gajewar

arXiv: 1904.12887 · 2020-01-28

## TL;DR

This paper applies curriculum learning to deep neural networks for financial time series forecasting and reports an approximately 30% improvement in overall test accuracy over Microsoft Finance's in-production ensemble baseline.

## Contribution

The paper's novel contribution is the application of curriculum learning to neural network models built for time series forecasting. On Microsoft's real-world revenue data, the curriculum learning LSTM-based model performs best among the models tested.

## Key findings

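- Approximately 30% improvement in overall accuracy on test data over the in-production baseline, an ensemble of traditional statistical and machine learning techniques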
- Curriculum learning LSTM performs best among tested models
- The proposed methods can reasonably be implemented on medium-sized data without overfitting

## Abstract

For any financial organization, computing accurate quarterly forecasts for various products is one of the most critical operations. As the granularity at which forecasts are needed increases, traditional statistical time series models may not scale well. We apply deep neural networks in the forecasting domain by experimenting with techniques from Natural Language Processing (Encoder-Decoder LSTMs) and Computer Vision (Dilated CNNs), as well as incorporating transfer learning. A novel contribution of this paper is the application of curriculum learning to neural network models built for time series forecasting. We illustrate the performance of our models using Microsoft's revenue data corresponding to Enterprise, and Small, Medium & Corporate products, spanning approximately 60 regions across the globe for 8 different business segments, and totaling in the order of tens of billions of USD. We compare our models' performance to the ensemble model of traditional statistics and machine learning techniques currently used by Microsoft Finance. With this in-production model as a baseline, our experiments yield an approximately 30% improvement in overall accuracy on test data. We find that our curriculum learning LSTM-based model performs best, showing that it is reasonable to implement our proposed methods without overfitting on medium-sized data.
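As a rough illustration of the general idea behind curriculum learning (training on easier examples first, then gradually adding harder ones), here is a minimal sketch. It is not the authors' implementation: the difficulty score (coefficient of variation), the helper names `difficulty` and `curriculum_stages`, and the toy data are all assumptions made for this example.

```python
from statistics import mean, pstdev


def difficulty(series):
    """Hypothetical difficulty score: coefficient of variation (noisier = harder)."""
    m = mean(series)
    return pstdev(series) / abs(m) if m else float("inf")


def curriculum_stages(datasets, n_stages=3):
    """Return cumulative training pools, easiest series first.

    datasets: dict mapping a series name to its list of values.
    Stage k holds the easiest ceil(k / n_stages * N) series, so each
    stage keeps the earlier (easier) series and adds harder ones.
    """
    ordered = sorted(datasets, key=lambda name: difficulty(datasets[name]))
    n = len(ordered)
    stages = []
    for k in range(1, n_stages + 1):
        cutoff = -(-k * n // n_stages)  # ceiling division
        stages.append(ordered[:cutoff])
    return stages


# Toy data, made up for illustration (not taken from the paper).
toy = {
    "volatile": [10, 200, 5, 150],
    "steady": [100, 101, 99, 100],
    "seasonal": [80, 120, 85, 115],
}
stages = curriculum_stages(toy, n_stages=3)
for i, pool in enumerate(stages, start=1):
    # A real pipeline would continue training the model on `pool` here.
    print(f"stage {i}: {pool}")
```

In a real forecasting setup, each stage would continue training the same network on its pool, so the model sees the noisiest series only after fitting the more regular ones. The choice of difficulty measure is the main design decision and would need to be validated against held-out data.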

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.12887/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1904.12887/full.md

## References

18 references — full list in the complete paper: https://tomesphere.com/paper/1904.12887/full.md

---
Source: https://tomesphere.com/paper/1904.12887