# Efficient Optimization of Echo State Networks for Time Series Datasets

**Authors:** Jacob Reinier Maat, Nikos Gianniotis, Pavlos Protopapas

arXiv: 1903.05071 · 2019-03-13

## TL;DR

This paper tunes the hyperparameters of Echo State Networks (ESNs) with Bayesian optimization. For large time series datasets, it clusters similar series so that a single tuned ESN serves each cluster, which reduces the number of models needed.

## Contribution

The work introduces a Bayesian hyperparameter optimization approach for ESNs and a clustering strategy in which one tuned ESN models a group of similar time series, avoiding the cost of tuning hyperparameters for each series individually.

## Key findings

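- Given a limited budget of function evaluations, Bayesian optimization outperforms grid search for ESN hyperparameter tuning.
- Assigning one tuned ESN to each cluster of similar time series significantly reduces the number of ESNs needed to model a dataset, without significantly sacrificing predictive performance.
- The approach is demonstrated on synthetic datasets and on real-world astronomical light curves from the MACHO survey.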
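To make the "limited budget of function evaluations" idea concrete, here is a minimal sketch of a Bayesian optimization loop in Python with NumPy. It is not the authors' implementation: the Gaussian process surrogate with an RBF kernel, the expected improvement acquisition, the fixed `length_scale`, and the toy one-dimensional objective `toy_validation_error` (a stand-in for the validation error of an ESN as a function of one hyperparameter) are all assumptions chosen for illustration.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel between two 1-D arrays of points.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    # Posterior mean and standard deviation of a zero-mean GP with unit prior variance.
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_new)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def expected_improvement(mean, std, best):
    # Expected improvement for minimization.
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, low, high, n_init=3, budget=15):
    # Deterministic initial design, then one new evaluation per iteration.
    x_obs = np.linspace(low, high, n_init)
    y_obs = np.array([objective(x) for x in x_obs])
    candidates = np.linspace(low, high, 201)
    while len(x_obs) < budget:
        y_centered = y_obs - y_obs.mean()  # match the zero-mean GP prior
        mean, std = gp_posterior(x_obs, y_centered, candidates)
        ei = expected_improvement(mean, std, y_centered.min())
        x_next = candidates[np.argmax(ei)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    i = int(np.argmin(y_obs))
    return float(x_obs[i]), float(y_obs[i]), x_obs, y_obs

def toy_validation_error(x):
    # Hypothetical objective: pretend the best hyperparameter value is 0.7.
    return (x - 0.7) ** 2

best_x, best_y, x_obs, y_obs = bayes_opt(toy_validation_error, 0.0, 1.0)
print(best_x, best_y)
```

A grid search with the same budget would spend its 15 evaluations at fixed, evenly spaced points, whereas the loop above chooses each new point using everything observed so far. In a real setting each call to the objective would train and validate an ESN, which is why the number of evaluations is the quantity worth economizing.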

## Abstract

Echo State Networks (ESNs) are recurrent neural networks that only train their output layer, thereby precluding the need to backpropagate gradients through time, which leads to significant computational gains. Nevertheless, a common issue in ESNs is determining their hyperparameters, which are crucial in instantiating a well-performing reservoir, but are often set manually or using heuristics. In this work we optimize the ESN hyperparameters using Bayesian optimization which, given a limited budget of function evaluations, outperforms a grid search strategy. In the context of large volumes of time series data, such as light curves in the field of astronomy, we can further reduce the optimization cost of ESNs. In particular, we wish to avoid tuning hyperparameters per individual time series as this is costly; instead, we want to find ESNs with hyperparameters that perform well not just on individual time series but rather on groups of similar time series without sacrificing predictive performance significantly. This naturally leads to a notion of clusters, where each cluster is represented by an ESN tuned to model a group of time series of similar temporal behavior. We demonstrate this approach both on synthetic datasets and real-world light curves from the MACHO survey. We show that our approach results in a significant reduction in the number of ESN models required to model a whole dataset, while retaining predictive performance for the series in each cluster.
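The statement that an ESN trains only its output layer can be illustrated with a small NumPy sketch. This is a generic textbook-style ESN, not the paper's code: the reservoir size, the `spectral_radius`, `input_scaling`, and `ridge` values, the washout length, and the sine-wave series are all assumptions picked for the example. The recurrent and input weights are drawn at random and left fixed; only the linear readout `w_out` is fitted, here by ridge regression, so no gradients are propagated through time.

```python
import numpy as np

def make_reservoir(n_units, spectral_radius, input_scaling, seed=0):
    # Random, fixed recurrent and input weights (never trained).
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(n_units, n_units))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    W_in = input_scaling * rng.uniform(-1.0, 1.0, size=n_units)
    return W, W_in

def run_reservoir(W, W_in, u):
    # Drive the reservoir with the scalar input sequence u and record every state.
    states = np.zeros((len(u), W.shape[0]))
    x = np.zeros(W.shape[0])
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + W_in * u_t)
        states[t] = x
    return states

def fit_readout(states, targets, ridge=1e-6):
    # Ridge regression: the only trained parameters of the network.
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)

# Toy task: one-step-ahead prediction of a sine wave.
t = np.arange(600)
series = np.sin(0.1 * t)
W, W_in = make_reservoir(n_units=50, spectral_radius=0.9, input_scaling=0.5)
states = run_reservoir(W, W_in, series[:-1])
targets = series[1:]

washout, split = 100, 400  # discard the initial transient, then train/test split
w_out = fit_readout(states[washout:split], targets[washout:split])
pred = states[split:] @ w_out
test_mse = float(np.mean((pred - targets[split:]) ** 2))
print(test_mse)
```

Quantities such as `spectral_radius`, `input_scaling`, and `ridge` are the kind of hyperparameters that are often set by hand or by heuristics, and the held-out error `test_mse` is the sort of score a hyperparameter search would try to minimize.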

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.05071/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1903.05071/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1903.05071/full.md

---
Source: https://tomesphere.com/paper/1903.05071