Consistency in Echo-State Networks
Thomas Lymburn, Alexander Khor, Thomas Stemler, Débora C. Corrêa, Michael Small, Thomas Jüngling

TL;DR
This paper explores the concept of consistency in echo-state networks, providing a method to measure how reliably these networks respond to inputs, which enhances understanding of their dynamic properties.
Contribution
It introduces a novel application of the consistency concept to echo-state networks and proposes a replica test to quantify their echo-state property.
Findings
Consistency levels vary with network parameters
The replica test effectively measures the echo-state property
Insights into the functional dependency of networks on inputs
Abstract
Consistency is an extension of generalized synchronization that quantifies the degree of functional dependency of a driven nonlinear system on its input. We apply this concept to echo-state networks, which are an artificial-neural-network version of reservoir computing. Through a replica test, we measure the consistency levels of the high-dimensional response, yielding a comprehensive portrait of the echo-state property.
Thomas Lymburn
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Alexander Khor
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Thomas Stemler
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Débora C. Corrêa
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Michael Small
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Mineral Resources, CSIRO, Kensington, Western Australia 6151, Australia
Thomas Jüngling
Complex Systems Group, Department of Mathematics and Statistics, Faculty of Engineering and Mathematical Sciences, The University of Western Australia, Crawley, Western Australia 6009, Australia
Preprint: UWA/001-DMW
When a nonlinear dynamical system is externally modulated by an information-carrying signal, its erratic response hides an intricate property: Consistency. It is difficult to estimate from time series whether or not the variability in the output is entirely determined by the driving signal. For autonomous chaotic systems it is well-known that their inherent instability gives rise to a certain level of unpredictability. For a driven system, this means that a part of the variability of its output does not depend on the drive. Consistency quantifies the degree of this dependency through a replica test. The nonlinear system is repeatedly driven by the same signal, and the corresponding responses are compared. We apply this concept to echo-state networks, a class of artificial neural networks with a fixed random internal connectivity. Such networks have been successfully utilized for sequential processing tasks like nonlinear time series prediction and spoken digit recognition. Studying the consistency property allows for a more comprehensive understanding of the dynamical response and for tailoring the network systematically towards enhanced functionality and a wider range of applications.
I Introduction
Synchronization is a common phenomenon in interacting nonlinear oscillators that has been studied for almost three decades Boccaletti et al. (2002); Arenas et al. (2008). The mutual or directed interaction can lead to several forms of entrainment of the trajectories. Different degrees of relationship have been discussed, such as complete synchronization (CS) and phase synchronization (PS) Pikovsky et al. (2001). For generalized synchronization (GS) Rulkov et al. (1995); Moskalenko et al. (2012), however, only the presence and form of a functional relationship have been analyzed, and no corresponding weaker form of synchronization has received considerable attention Giacomelli et al. (2010). This least-studied case may be the most prevalent in natural systems and is also highly relevant for novel forms of neuro-inspired computation.
When a nonlinear system is driven by an external signal, like neurons excited by real-world stimuli, the nature of the dependency between drive and response is an important and challenging aspect. In the field of reservoir computing (RC), which overlaps with recurrent neural networks (RNN), dynamical systems are employed to tailor functions on sequential data Lukoševičius and Jaeger (2009); Appeltant et al. (2011); Konkoli (2018). In contrast to feedforward structures, which constitute the majority of present artificial neural networks (ANN), dynamical systems are known to develop instabilities. This property is generally undesired and hard to control, which is one of the reasons why RC and RNN remain comparatively marginal. In this work, we connect a concept from nonlinear science, namely consistency, and the associated replica scheme to echo-state networks (ESN) Jaeger (2001); Jaeger and Haas (2004), which are a particular flavor of RC based on RNN.
The concept of consistency emerged mainly within the last decade as an approach to introduce nonlinear science methodology to a broader domain in which the response of a nonlinear dynamical system to arbitrary signals plays a crucial role Uchida et al. (2004, 2008); Kanno and Uchida (2012); Oliver et al. (2015); Nakayama et al. (2016); Bueno et al. (2017); Jüngling et al. (2018). Consistency is based on the replica test, in which a nonlinear dynamical system is repeatedly driven with the same signal. The test is an adaptation of the Abarbanel test for GS Abarbanel et al. (1996), in which at least two identical units are driven simultaneously. In each version of the replica test, the different responses, which in theory differ only in their initial conditions, are compared, typically by means of a correlation coefficient which measures the degree of consistency Uchida et al. (2004); Jüngling et al. (2018). The term consistency has not yet been defined rigorously, and there are currently different possible interpretations. We elaborate on this issue in Sec. II. A major goal of this work is to contribute to an enhanced understanding of consistency, in particular for large dynamical systems such as the ESN.
Echo-state networks are a computationally feasible RC paradigm, which is distinguished from other RNNs by a simplified training procedure Jaeger (2001); Lukoševičius and Jaeger (2009). The main idea of RC is to utilize the response of a large dynamical system, the reservoir, to generate nonlinear features in a high-dimensional space. Reservoir computing generally employs physical dynamical systems and thus belongs to the field of unconventional computation (UC) Konkoli (2018). The reservoir in ESN is a numerical model, typically given by a realization of a random network of dynamical nodes with sigmoid activation functions. Without driving signal the reservoir is typically designed to reside in a stable steady state, and the transient activation during injection of the input is recorded. This random nonlinear embedding of the signal in a large dimension facilitates certain regression or classification tasks Pathak et al. (2017); Lu et al. (2017); Zimmermann and Parlitz (2018).
Consistency is a property of the nonlinear response of ESN which is related to conditional stability. Stability in a driven nonlinear system typically refers to the spectrum of conditional Lyapunov exponents (CLE) and the corresponding Lyapunov vectors, most importantly the largest exponent Pecora and Carroll (1990); Pyragas (1997). All these quantities depend on properties of both the driven dynamical system and the driving signal Heiligenthal et al. (2011). The CLE may often appear to be tightly linked to consistency; however, the two are complementary characteristics of the driven system Oliver et al. (2015). In ESN, conditional stability is a synonym for the echo-state property, which is a well-known central characteristic that is often referred to as a necessary criterion for the function of the network as a reservoir. Under ideal conditions, the echo-state property is equivalent to complete consistency, which refers to identical responses to repetitions of the driving signals Oliver et al. (2015); Jüngling et al. (2018). A few attempts to obtain a deeper insight into the echo-state property have been presented Verstraeten and Schrauwen (2009); Yildiz et al. (2012); Manjunath and Jaeger (2013), as well as a mean-field theory for the signal propagation Massar and Massar (2013), and the use of reservoirs in non-stationary regimes Marquez et al. (2018).
In this work, we employ the ESN in a parameter regime beyond the typical ranges that guarantee the echo-state property, as well as with intrinsic noise, which has a similar effect. The replica test is implemented for realizations of Gaussian white noise as a scalar driving signal. We distinguish between the micro-level consistency of the reservoir nodes and the emergent consistency at the readout level, thus yielding a comprehensive portrait of the consistency property. This perspective is important for the response characterization in neuronal microcircuits, in which many factors lead to a noisy micro-level but still allow for a robust functionality; see also the concept of coarse coding MacLennan (2018); Sanger (1996); Rumelhart et al. (1986). Moreover, our approach is applicable to neuro-inspired technical applications in which experimental access to the reliability of the response systems is required. We elaborate on the general consistency property in Sec. II. In Sec. III, we investigate the relationship between consistency and the fading memory in an ESN. We finally introduce a consistency profile in Sec. IV, which demonstrates how an injected signal propagates and fades in the fluctuating neural medium.
II The consistency property
The replica test and the consistency measure have so far been applied only to low-dimensional systems and scalar time series. When transferring the concept to complex dynamical systems like an ESN, one encounters several new aspects. We approach these with a general driven system acting as a reservoir
[TABLE]
Here, the state of the reservoir evolves in continuous time; we will also refer to the reservoir as a network of nodes representing its degrees of freedom Appeltant et al. (2011); Grigoryeva et al. (2015). The reservoir is driven by a multivariate signal, and a set of fixed parameters controls the internal wiring of the reservoir, the input injection, and the shape of the nonlinearity.
In a basic replica test, an identical copy of the reservoir is simultaneously driven with the same signal but starts from different initial conditions. Alternatively, the same reservoir may be driven repeatedly, which in theory yields the same result but poses different experimental challenges Oliver et al. (2015). For a scalar reservoir, or for a scalar observation of the reservoir, consistency is then measured by the consistency correlation, which is the Pearson correlation coefficient between the two responses. For the multivariate response of a large reservoir, one may define a consistency correlation for each node
[TABLE]
where the average is taken with respect to time and both responses are normalized to zero mean and unit variance. An average over all nodes
[TABLE]
then accounts for the ‘global’ consistency. It is worth noting that a different replica test could also be designed for each node in the network separately, or for an arbitrary group of nodes as a subset of the whole reservoir. This would define a different consistency measure which depends on the selection of the subset. Such a measure would, for instance, make it possible to locate the source of inconsistency. Nevertheless, we will focus in this work only on the replica test for the whole reservoir as outlined above and illustrated in Fig. 1.
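As an illustration, the per-node consistency correlation and its average over nodes can be sketched in a few lines of Python. The array layout (time along the first axis), the function names, and the toy data are assumptions made for this sketch and do not come from the original work.

```python
import numpy as np

def consistency_correlations(x, x_replica):
    """Pearson correlation between each node's response and its replica.

    x, x_replica: arrays of shape (T, N) holding the two responses,
    with time along axis 0 and nodes along axis 1 (assumed layout).
    """
    xa = (x - x.mean(axis=0)) / x.std(axis=0)
    xb = (x_replica - x_replica.mean(axis=0)) / x_replica.std(axis=0)
    return (xa * xb).mean(axis=0)  # one value per node

def global_consistency(x, x_replica):
    """Average of the per-node consistency correlations."""
    return consistency_correlations(x, x_replica).mean()

# Toy data (hypothetical): a shared component plus independent noise.
rng = np.random.default_rng(0)
T, N = 20000, 5
common = rng.standard_normal((T, N))
x = common + 0.5 * rng.standard_normal((T, N))
x_rep = common + 0.5 * rng.standard_normal((T, N))
print(consistency_correlations(x, x_rep))
print(global_consistency(x, x_rep))
```

In this toy setting the shared component has unit variance and the independent part has variance 0.25, so each node's consistency correlation should lie close to 1 / 1.25 = 0.8.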
The supervised learning procedure in RC creates a set of nodes in a separate readout layer. The readout signal is typically a linear superposition of the reservoir nodes
[TABLE]
The components of the readout vector are typically obtained by ridge regression with respect to a target signal. For the sake of simplicity, we will omit the bias term in the following discussions. A separate consistency measure can be determined for these readout nodes by
[TABLE]
where the replica readout is obtained by applying the same readout vector to the replica reservoir, and normalization is applied to both readouts. The readout can be considered an emergent quantity which, owing to the training adjustments, is a special projection of the reservoir dynamics. Its consistency level thus plays a distinguished role compared with the consistency levels of the individual nodes.
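The readout consistency can be illustrated with a minimal sketch: a readout vector is trained by ridge regression on one response and then applied unchanged to the replica. The regularization strength `alpha`, the function names, and the toy "reservoir" below are hypothetical choices for illustration only.

```python
import numpy as np

def ridge_readout(x, y, alpha=1e-6):
    """Readout vector w minimizing ||x w - y||^2 + alpha ||w||^2 (no bias term)."""
    n = x.shape[1]
    return np.linalg.solve(x.T @ x + alpha * np.eye(n), x.T @ y)

def readout_consistency(x, x_replica, w):
    """Pearson correlation between the readout and the replica readout."""
    z, z_rep = x @ w, x_replica @ w
    return np.corrcoef(z, z_rep)[0, 1]

# Toy data (hypothetical): every node carries the same signal s,
# but with a different amount of independent noise.
rng = np.random.default_rng(0)
T = 20000
s = rng.standard_normal(T)
sigma = np.array([0.1, 1.0, 1.0, 1.0])
x = s[:, None] + sigma * rng.standard_normal((T, 4))
x_rep = s[:, None] + sigma * rng.standard_normal((T, 4))

w = ridge_readout(x, s)
node_levels = [np.corrcoef(x[:, i], x_rep[:, i])[0, 1] for i in range(4)]
print(np.mean(node_levels), readout_consistency(x, x_rep, w))
```

Because the trained readout weights the least noisy node most strongly, its consistency in this toy example exceeds the average over the individual nodes, which is the sense in which the readout is a special projection.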
A broader notion of consistency can be found throughout the community in which a system is said to be consistent if similar inputs lead to similar outputs. However, this idea overlaps with the approximation property Maass et al. (2002), and measures of similarity are not yet specified. We restrict our investigation to only the output similarity given exact repetitions of the input, as described above, in order to probe for the degree of functional dependency. This way, the consistency property is distinguished from the approximation property. Generalizing from the consistency correlation , a reasonable measure of similarity among the responses to repeated inputs is given by the consistency correlation of any observable of the system. The typical reservoir readouts are the special case in which the projection is linear. This notion is distinct from consistency in readouts which are a filtered function of the reservoir state, for instance . Future work may be oriented towards a general consistency concept including filtered signals, to account for phenomena like rate coding in neuronal circuits where the timing of individual spikes is inconsistent with respect to certain reference signals. In neuroscience, consistency on the level of spike timing is known as reliability Mainen and Sejnowski (1995); Goldobin and Pikovsky (2006). Despite the similarity between consistency and reliability, however, the two concepts are not identical due to the different context and measure of functional dependency. In nonlinear science, consistency can also be compared to synchronization due to common drive, where noise is often chosen as a driving signal Teramae and Tanaka (2004); Goldobin and Pikovsky (2005); Pimenova et al. (2016). What distinguishes consistency from synchronization phenomena is that it is a property of a single system subject to a driving signal. Moreover, the consistency property is inherent to the system even if the signal is not repeatedly presented.
We illustrate our consistency concept using the example of an ESN driven by noise. This reservoir updates in discrete time and reads
[TABLE]
where is applied element-wise. The internal connectivity is summarized in the matrix , and the input injection is contained in . We create a network with nodes and connect nodes randomly with a probability of . The weight of each connection is then chosen from a normal distribution , and the weight is zero if there is no connection. The resulting matrix is then scaled by global factor to achieve a desired spectral radius . The spectral radius is a key parameter in the design of ESN which can be thought of as the internal gain of the dynamical system. The input connections in are created in a way that the input is injected into each node, with the weight for each connection taken from a uniform distribution between and . is a vector of biases which shifts the operating window of each node to different regions of the -nonlinearity. The bias for each node is set to one here, meaning . The input is chosen to be a scalar () IID random variable taken from a normal distribution with zero mean and unit variance, .
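The construction described above can be sketched as follows. Since the numerical values are not recoverable from the text, the network size, connection probability, spectral radius, and input-weight range used here are placeholders, and the `tanh` update form is an assumption based on the standard ESN formulation; only the unit bias and the Gaussian IID scalar input are taken from the description.

```python
import numpy as np

def make_esn(n_nodes=100, p=0.1, rho=0.9, seed=0):
    """Random sparse reservoir; all default values are placeholders."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n_nodes, n_nodes)) < p          # random wiring
    W = np.where(mask, rng.standard_normal((n_nodes, n_nodes)), 0.0)
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))    # set the spectral radius
    w_in = rng.uniform(-1.0, 1.0, size=n_nodes)        # assumed input-weight range
    b = np.ones(n_nodes)                               # bias of one for each node
    return W, w_in, b

def run_esn(W, w_in, b, u, x0):
    """Drive the reservoir with the scalar sequence u, starting from state x0."""
    states = np.empty((len(u), len(x0)))
    x = x0
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + w_in * u_t + b)            # assumed update rule
        states[t] = x
    return states

# Replica test: same drive, different initial conditions.
W, w_in, b = make_esn()
rng = np.random.default_rng(1)
u = rng.standard_normal(1000)                          # IID Gaussian scalar input
x = run_esn(W, w_in, b, u, rng.uniform(-1, 1, 100))
x_rep = run_esn(W, w_in, b, u, rng.uniform(-1, 1, 100))
print(np.max(np.abs(x[-1] - x_rep[-1])))
```

With the placeholder spectral radius below one, the two replicas are expected to converge onto the same trajectory, so the printed difference should be negligible; larger spectral radii can be explored by changing `rho`.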
In order to visualize sections through the input-output relationship, i.e. between the drive and the response , we first create a sufficiently long reference sequence as a single realization of the noise process. The ESN is then repeatedly driven with a variation of this sequence, in which only a single element takes a different value in each run. We select a few nodes to plot the dependency of on this variable for different lags . Figure 2 shows these sections for selected nodes in a regime of low consistency at together with the same sections in the same network and drive, but at where the response is completely consistent. The consistent response reveals slices through a functional . In the low-consistency regime, however, the dependency is blurred to an extent which depends on the individual node as well as on the state of the drive. In both cases, we recognize the -nonlinearity in the instantaneous response, and higher nonlinear features for increasing lag. For an exact repetition of the drive, the variability in the response can be seen by taking a vertical slice of the response portraits shown in Fig. 2. In the consistent case the slice reduces to a point, whereas in the inconsistent case it is a distribution describing the values the response may take for a certain input sequence. The consistency correlation thus measures the degree to which the total variability through time arises from a functional dependence as compared to the chaotic variability.
III Memory and Consistency
Recurrent neural networks allow the input signal to propagate through the network for multiple timesteps, meaning that the current state of the network contains information about the history of the input. This ability to store past information is a key component of RNN which enables them to be powerful tools for computation on sequential data. For ESN, understanding the relationship between the memory profile and the hyper-parameters of the network is important for optimal, task specific reservoir design. In this section we investigate the memory of ESN in the context of consistency.
The linear reconstruction task and the associated memory capacity (MC) measure as introduced by Jaeger Jaeger (2001) are commonly used to quantify the fading memory. The ESN is trained to reconstruct the input timesteps ago, meaning that the training target is , . The reconstruction accuracy at lag is an indicator of the amount of information held in the network about the input at that lag. With the reconstruction from the ESN reading , the accuracy is measured by the correlation coefficient
[TABLE]
where the usual normalizations apply. This performance measure as a function of is the memory profile of the ESN, as shown in Fig. 3. The memory capacity is an integral over all lags which measures the total linear memory of the reservoir Jaeger (2001)
[TABLE]
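The memory profile and memory capacity can be sketched as below. The sketch assumes that the capacity is the sum of squared correlation coefficients over lags, following Jaeger's definition; the train/test split, the regularization `alpha`, and the delay-line "reservoir" used as toy data are hypothetical choices made so that the expected result is easy to check.

```python
import numpy as np

def memory_profile(x, u, max_lag, alpha=1e-6):
    """Correlation between u(t - k) and its linear reconstruction from x(t).

    x: reservoir states, shape (T, N); u: scalar input, shape (T,).
    The first half of the usable samples trains the readout and the
    second half evaluates it.
    """
    T, n = x.shape
    xv = x[max_lag:]                        # states for which all lags exist
    half = len(xv) // 2
    profile = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        target = u[max_lag - k:T - k]       # target[i] = u(t - k) for t = max_lag + i
        w = np.linalg.solve(xv[:half].T @ xv[:half] + alpha * np.eye(n),
                            xv[:half].T @ target[:half])
        profile[k] = np.corrcoef(xv[half:] @ w, target[half:])[0, 1]
    return profile

def memory_capacity(profile):
    """Sum of squared correlations over lags (assumed definition)."""
    return float(np.sum(profile ** 2))

# Toy "reservoir" (hypothetical): a delay line storing the last three inputs.
rng = np.random.default_rng(0)
u = rng.standard_normal(5000)
x = np.column_stack([np.roll(u, d) for d in range(3)])
profile = memory_profile(x, u, max_lag=5)
print(profile, memory_capacity(profile))
```

For the delay line, the profile should be close to one for lags 0 to 2 and close to zero beyond, giving a capacity near three. Replacing `x` with the states of an actual reservoir yields the fading profile discussed in the text.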
A key result with regard to ESN memory is that the MC is maximized at the edge of chaos, where the maximal Lyapunov exponent becomes positive and the reservoir dynamics change from a stable to an unstable regime. This has informed part of the design strategy for ESN, which is to scale the spectral radius of the internal weight matrix to just before this point in order to maximize memory.
We investigate here the memory of an ESN in the stable regime, at the transition to instability, and in the unstable regime. The ESN is set up as in Sec. II, but with a size of nodes and wiring probability . We perform the reconstruction task for a spectral radius of , which encompasses the aforementioned regimes with different levels of consistency. The uncorrelated input ensures that any memory observed is purely from transient activation in the reservoir, rather than due to any autocorrelation already present in the input. We measure the global consistency according to Eq. (3). The values are averaged over 10 different realizations after decay of transients. The results of this experiment are shown in Fig. 3 and Fig. 4.
Considering consistency in the terminology of ESN, complete consistency () is equivalent to the echo-state property, and the transition to inconsistency at is equivalent to the edge of chaos. The results in Fig. 4 support the notion that the memory capacity of an ESN is maximized approximately at the onset of inconsistency. Starting from no connectivity (), memory increases with increasing until the network transitions into an unstable regime, leading to a decrease in consistency and, in turn, a decrease in memory. However, even deep in the inconsistent regime the network still performs relatively well, with the accuracy of the reproduction for large lags () being greater than that of the network in the fully consistent regime (Fig. 3). For the memory capacity measure, the highly inconsistent network performs comparably to the reservoir scaled to (Fig. 4), which was the previous best practice in reservoir design. This is a surprising result, as one may expect that a significant loss of consistency should be accompanied by a comparable overall loss of reconstruction accuracy. Particularly for large lags, the effect of inconsistency is expected to accumulate, because the signal-to-noise ratio is effectively reduced with every propagation step. However, it is at large lags that the performance of the inconsistent reservoir is the strongest relative to the consistent and transition cases. Thus it seems wrong to assume that inconsistency simply acts like additive noise at each node. There must instead be a mechanism which enables the input to propagate through an inconsistent reservoir in a way that is recoverable by a linear readout.
We obtain further insight into the modes of signal propagation by a different numerical experiment. Starting from an ESN which is initially in the fully consistent regime, say , we induce inconsistency by introducing a source of noise at each node. The update equation for the reservoir becomes
[TABLE]
where is a parameter that determines the amount of noise, and . Figure 5 shows the memory profile of two ESN, each in an inconsistent regime with similar levels of global consistency (). One ESN has inconsistency naturally due to instability in the autonomous reservoir dynamics caused by a large spectral radius. The other has inconsistency introduced via noise. The results show that on the memory test the standard reservoir in an inconsistent regime performs better than a reservoir spiked with noise. Figure 5 also shows the square root of the output consistency, , according to Eq. (5) with depending on . For the outputs associated with reproducing recent inputs, the output consistency is much larger than the consistency of the network as a whole. This means that there are readout projections for which the response of the ESN is highly consistent, even if the network as a whole is inconsistent. We also see that the accuracy of the memory reconstruction is closely bounded by the values. This is in agreement with consistency theory Jüngling et al. (2018), meaning that the trained readout exploits consistency optimally.
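A minimal sketch of the noise-induced case is given below. It assumes that independent Gaussian noise enters inside the `tanh` argument of each node with amplitude `noise_amp`; the exact placement of the noise term, as well as the network size, wiring probability, spectral radius, and run length, are assumptions chosen for illustration.

```python
import numpy as np

def replica_consistency(noise_amp, n_nodes=50, p=0.1, rho=0.9,
                        T=3000, washout=500, seed=0):
    """Global consistency of a reservoir with intrinsic noise (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n_nodes, n_nodes)) < p
    W = np.where(mask, rng.standard_normal((n_nodes, n_nodes)), 0.0)
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    w_in = rng.uniform(-1.0, 1.0, n_nodes)
    b = np.ones(n_nodes)
    u = rng.standard_normal(T)              # the same drive for both replicas

    runs = []
    for _ in range(2):                      # two replicas, independent noise
        x = rng.uniform(-1.0, 1.0, n_nodes)
        traj = np.empty((T, n_nodes))
        for t in range(T):
            xi = rng.standard_normal(n_nodes)
            x = np.tanh(W @ x + w_in * u[t] + b + noise_amp * xi)
            traj[t] = x
        runs.append(traj[washout:])         # discard transients

    a, c = runs
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    c = (c - c.mean(axis=0)) / c.std(axis=0)
    return float((a * c).mean())            # average over time and nodes

print(replica_consistency(0.0), replica_consistency(0.5))
```

Without noise the stable reservoir is expected to be completely consistent, while increasing `noise_amp` lowers the global consistency. Sweeping `noise_amp` is one way to match the consistency level of a chaotic reservoir for a comparison like the one in Fig. 5.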
IV Consistency Profile
The disturbance of the propagating signal due to chaos turns out to be less dramatic than that caused by a source of noise at every node. This is hardly surprising when we take into account that chaos effectively populates only a few degrees of freedom according to the attractor dimension. We follow this idea by first calculating the conditional Lyapunov spectrum for an ESN Verstraeten and Schrauwen (2009). The transfer of the dimension to a driven system is possible if we interpret chaotic dimensionality as additional degrees of freedom superposed on the signal response. Figure 6 shows the CLE calculated via the Gram-Schmidt procedure Eckmann and Ruelle (1985) together with the global consistency and the conditional attractor dimension , which is the Kaplan-Yorke dimension from the CLE Frederickson et al. (1983). As expected, the transition to inconsistency corresponds to the crossing of the maximal Lyapunov exponent from negative to positive. However, even for large inconsistency approximately of the Lyapunov exponents remain negative, and the chaotic dimension is still small compared to the state-space dimension . Thus the ESN is still effectively stable in a large portion of the available directions. These more stable directions may have a higher level of consistency than others, which gives a first idea of why a signal can still propagate through a globally inconsistent network for many timesteps.
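The conditional Lyapunov spectrum and the Kaplan-Yorke dimension can be sketched as follows. The sketch uses repeated QR decomposition, which is numerically equivalent to Gram-Schmidt reorthonormalization, and assumes a `tanh` update with input weights `w_in` and bias `b`; these names and the small dense demo reservoir are hypothetical.

```python
import numpy as np

def conditional_lyapunov_spectrum(W, w_in, b, u, washout=200):
    """Conditional Lyapunov exponents of x(t+1) = tanh(W x + w_in u + b)."""
    n = W.shape[0]
    x = np.zeros(n)
    Q = np.eye(n)
    log_r = np.zeros(n)
    steps = 0
    for t, u_t in enumerate(u):
        x = np.tanh(W @ x + w_in * u_t + b)
        J = (1.0 - x ** 2)[:, None] * W       # Jacobian of the update along the orbit
        Q, R = np.linalg.qr(J @ Q)            # reorthonormalize the tangent vectors
        if t >= washout:
            # floor avoids log(0) if the Jacobian happens to be singular
            log_r += np.log(np.maximum(np.abs(np.diag(R)), 1e-300))
            steps += 1
    return np.sort(log_r / steps)[::-1]       # exponents in descending order

def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke dimension from a Lyapunov spectrum."""
    lam = np.sort(np.asarray(exponents, dtype=float))[::-1]
    csum = np.cumsum(lam)
    if csum[0] < 0:
        return 0.0                            # fully stable: no chaotic dimension
    k = int(np.max(np.nonzero(csum >= 0)[0])) # last index with non-negative partial sum
    if k == len(lam) - 1:
        return float(len(lam))
    return (k + 1) + csum[k] / abs(lam[k + 1])

# Small dense demo reservoir (hypothetical parameters).
rng = np.random.default_rng(0)
n = 20
W = rng.standard_normal((n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
spectrum = conditional_lyapunov_spectrum(W, rng.uniform(-1, 1, n), np.ones(n),
                                         rng.standard_normal(2000))
print(spectrum[0], kaplan_yorke_dimension(spectrum))
```

A negative largest exponent corresponds to the consistent regime with zero conditional dimension. Increasing the spectral radius of `W` is the natural way to probe the transition described in the text.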
The Lyapunov dimension provides only a very basic argument on the distribution of signal and noise in the response of the reservoir. In general, Lyapunov exponents are only weakly related to correlations in dynamical systems. This is because the attractor dimension is a topological property, whereas correlations are geometrical properties of the dynamics. The chaotic degrees of freedom of the reservoir are not confined to a trivial subspace, as would be the case if the reservoir were a linear system, e.g. by omitting the -nonlinearity. Lyapunov vectors from the nonlinear Eq. (6) are time dependent and effectively distribute the chaotic instabilities over all degrees of freedom. In the following, we introduce a comprehensive characterization of the distribution of signal and chaos (noise), including the effect of regularization, based on principal-component analysis (PCA).
We first apply the PCA directly to the full response of the ESN under various conditions. This is done by performing singular-value decomposition on the covariance matrix for ,
[TABLE]
Note that in contrast to the correlation functions before, here we do not apply normalization. The columns of form an orthonormal set of vectors in the direction of the principal components (PCs) of the response. The diagonal matrix contains the sizes of the principal components, which measure the extent of the response of in the corresponding PC-direction. Figure 7 shows the PC profiles for different ESN responses. Moreover, by taking the PC directions as readouts, , we calculate the corresponding readout consistency correlations for these directions and thus obtain a consistency distribution. We apply this procedure to a network in the completely consistent () and in the inconsistent () regime (Fig. 7a-b). We further consider the effect of regularization, by adding measurement noise to the reservoir state, , , equivalent to performing ridge regression (Fig. 7c).
For the completely consistent case we find = 1 for each readout, as expected. The effect of regularization is to remove the smallest principal components of the response, leaving only those directions active in which the added noise is small relative to the propagated input signal. The resulting consistency distribution reveals a new way to describe the capacity of the ESN, which turns out to be significantly smaller than the number of degrees of freedom of the reservoir. This is important because ridge regression is commonly used when training ESN, and in physical cases of reservoir computing there is often measurement noise which acts in a similar way to regularization. For the inconsistent case, the directions of larger response tend to have consistency above the global consistency , and vice versa. However, even in the direction with greatest consistency, the response does not reach the consistency level observed for readouts in the memory profile (Fig. 5).
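The PCA-based consistency distribution can be sketched in Python as follows. The function name, array layout, and the two-dimensional toy response are assumptions for illustration; the covariance is left unnormalized, as stated in the text, while the consistency correlation of each principal-component readout is normalized.

```python
import numpy as np

def pc_consistency_profile(x, x_replica):
    """PC sizes of the response and the consistency along each PC direction.

    x, x_replica: responses of shape (T, N) to the same drive.
    """
    xc = x - x.mean(axis=0)
    xr = x_replica - x_replica.mean(axis=0)
    C = xc.T @ xc / len(xc)                   # covariance matrix, no normalization
    U, s, _ = np.linalg.svd(C)                # columns of U are the PC directions
    z, z_rep = xc @ U, xr @ U                 # PC directions used as readouts
    gamma2 = (z * z_rep).mean(axis=0) / (z.std(axis=0) * z_rep.std(axis=0))
    return s, gamma2

# Toy response (hypothetical): a consistent part along the first axis only,
# plus isotropic independent noise in each replica.
rng = np.random.default_rng(0)
T = 40000
c = np.column_stack([2.0 * rng.standard_normal(T), np.zeros(T)])
x = c + rng.standard_normal((T, 2))
x_rep = c + rng.standard_normal((T, 2))
sizes, levels = pc_consistency_profile(x, x_rep)
print(sizes, levels)
```

In this toy case the large principal component (variance near 5) carries a consistency near 4/5, while the small one (variance near 1) is purely inconsistent, mirroring the tendency of larger response directions to be more consistent.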
We describe in the following how to trace directions of particular consistency levels. The notion of consistency directly leads to a new characteristic set of readouts, different from the orthogonal set obtained from the PCA. We perform an alternative PCA on the consistent component of the reservoir response Jüngling et al. (2018). The consistent component can be found via an average over an ensemble of replica states. Following naturally from this, an inconsistent component can be defined for each replica as the difference between the replica state and the consistent component. The only underlying assumption in this decomposition is ergodicity, which we found to be satisfied to a high degree in large reservoirs. Thus the inconsistent component becomes equivalent to a realization of a noise-like process which, for long enough time, does not correlate with either the consistent component or other realizations of this process Jüngling et al. (2018). This allows us to relate the variance of the consistent component to the covariance of two replica states (Eq. (11)).
[TABLE]
This result extends to the covariance matrix of the consistent component, which is equal to the cross-covariance matrix between two replicas in the long time limit. Thus, through PCA of the cross-covariance matrix, we can access the principal components of the consistent part of the response, which then can be compared with the full response of the ESN. The cross-covariance matrix of two replica responses and reads
[TABLE]
where is the covariance matrix of , and no normalizations are applied. Our numerical experiments confirm that the ergodicity assumption behind this equality holds well, and finite-size effects are negligible with reasonable time-series lengths. We perform SVD on both (Eq. (10)) and (Eq. (12)) to find the principal components, using the symmetry and positive definiteness of the covariance matrices
[TABLE]
From here, we aim to measure the relative orientation of the signal response within the full response, which can be geometrically illustrated as two nested ellipsoids.
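The step from the decomposition into consistent and inconsistent components to the cross-covariance matrix can be written out explicitly. The notation below is assumed for this sketch, since the original symbols are not recoverable here: x and x' denote two centered replica states, c the consistent component, n and n' the inconsistent components, and angle brackets a long-time average.

```latex
% Assumed notation: x, x' replica states; c consistent component;
% n, n' inconsistent components; <.> long-time average of centered signals.
\begin{align}
  \mathbf{x} &= \mathbf{c} + \mathbf{n}, \qquad
  \mathbf{x}' = \mathbf{c} + \mathbf{n}', \\
  \langle \mathbf{x}\,\mathbf{x}'^{\top} \rangle
    &= \langle \mathbf{c}\,\mathbf{c}^{\top} \rangle
     + \langle \mathbf{c}\,\mathbf{n}'^{\top} \rangle
     + \langle \mathbf{n}\,\mathbf{c}^{\top} \rangle
     + \langle \mathbf{n}\,\mathbf{n}'^{\top} \rangle
     \approx \langle \mathbf{c}\,\mathbf{c}^{\top} \rangle, \\
  \langle \mathbf{x}\,\mathbf{x}^{\top} \rangle
    &\approx \langle \mathbf{c}\,\mathbf{c}^{\top} \rangle
     + \langle \mathbf{n}\,\mathbf{n}^{\top} \rangle .
\end{align}
```

The three mixed terms in the second line vanish in the long-time limit because the inconsistent components are uncorrelated with the consistent component and with each other. The third line shows why the consistent response can be pictured as an ellipsoid nested inside that of the full response.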
To demonstrate this method, we consider a simple two-dimensional test system comprising a consistent and an inconsistent component:
[TABLE]
All . The results of PCA on this test system are shown in Fig. 8a, where the components span ellipses. Besides the main axes of the full and consistent responses, we also show the directions of maximum and minimum consistency. These are the directions in which the ratio between inner and outer ellipse are maximal or minimal, respectively. The consistency directions do not align with the orthogonal PC axes of either ellipse. In order to define the consistency directions, we introduce a coordinate transformation which normalizes the full response components
[TABLE]
We apply to the two replica states and to get and . The bar notation indicates the transformed states. This transformation preserves the relative proportions relevant for consistency. The normalized geometry is shown in Fig. 8b. The full response component ellipse has been transformed into a unit circle. The directions of maximum and minimum consistency here align with the principal components of the consistent response, and moreover, also with the components of the inconsistent response (not shown). In other words, the result of the isotropic reservoir response in the new coordinates is the consistent and the inconsistent part being complementary with a shared set of principal axes. Moreover, the diagonal elements of in these coordinates are the consistency levels in the corresponding directions.
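The normalization step and the resulting consistency directions can be sketched as below. The whitening matrix built from the eigendecomposition of the full covariance is one possible choice of normalizing transformation and is an assumption of this sketch, as are the function name and the two-dimensional toy system, which is not the test system of the original work.

```python
import numpy as np

def consistency_directions(x, x_replica):
    """Consistency levels and directions in coordinates that whiten the full response."""
    xc = x - x.mean(axis=0)
    xr = x_replica - x_replica.mean(axis=0)
    n_samples = len(xc)
    C_full = xc.T @ xc / n_samples            # covariance of the full response
    C_cross = xc.T @ xr / n_samples           # estimate of the consistent covariance
    C_cross = 0.5 * (C_cross + C_cross.T)     # symmetrize the finite-sample estimate
    evals, evecs = np.linalg.eigh(C_full)
    whiten = np.diag(evals ** -0.5) @ evecs.T # maps the full response to unit covariance
    C_bar = whiten @ C_cross @ whiten.T       # consistent covariance in new coordinates
    levels, dirs = np.linalg.eigh(C_bar)
    order = np.argsort(levels)[::-1]
    return levels[order], dirs[:, order], whiten

# Toy system (hypothetical): axis-aligned consistent part, rotated noise part.
rng = np.random.default_rng(0)
n_samples = 50000
angle = np.pi / 6
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
c = rng.standard_normal((n_samples, 2)) * np.array([2.0, 0.5])
x = c + (rng.standard_normal((n_samples, 2)) * np.array([1.0, 0.3])) @ R.T
x_rep = c + (rng.standard_normal((n_samples, 2)) * np.array([1.0, 0.3])) @ R.T

levels, dirs, whiten = consistency_directions(x, x_rep)
print(levels)          # maximum and minimum consistency
print(dirs)            # corresponding directions in the whitened coordinates
```

In the whitened coordinates the full response has unit covariance, so the eigenvalues of the transformed cross-covariance matrix are directly the consistency levels along its eigenvectors. Sorting them in descending order gives a consistency profile for higher-dimensional responses.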
We apply this method to an ESN with 100 nodes and a global consistency of . By plotting the consistency along each principal component, we get the consistency profile (Fig. 9). In approximately of directions the consistency is larger than the global consistency of the ESN, with some directions maintaining a very high level of consistency. This suggests that it is through these directions of higher consistency that the input signal is able to propagate in a manner that is recoverable by the linear readout.
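A consistency profile of this kind can be sketched for a generic 100-node ESN with a replica test. Everything specific in the code below is an assumption made for illustration: the tanh update rule, the spectral radius, the additive state noise used as the source of inconsistency, and the variance-weighted global measure `tr(C_cross) / tr(C_full)`. None of these values or definitions is taken from the paper, so the numbers it produces are not a reproduction of Fig. 9.

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, n_steps, washout = 100, 6000, 1000

# Hypothetical ESN: random recurrent weights rescaled to a chosen spectral
# radius, random input weights, tanh nodes, additive state noise.
w = rng.standard_normal((n_nodes, n_nodes)) / np.sqrt(n_nodes)
w *= 0.9 / np.max(np.abs(np.linalg.eigvals(w)))
w_in = rng.uniform(-1.0, 1.0, n_nodes)
noise_level = 0.2
drive = rng.uniform(-1.0, 1.0, n_steps)

def run_replica():
    """Drive one replica with the shared input from its own initial state."""
    x = rng.uniform(-1.0, 1.0, n_nodes)
    states = np.empty((n_steps, n_nodes))
    for t in range(n_steps):
        x = np.tanh(w @ x + w_in * drive[t]
                    + noise_level * rng.standard_normal(n_nodes))
        states[t] = x
    return states[washout:]  # discard the transient

xa, xb = run_replica(), run_replica()
xa -= xa.mean(axis=0)
xb -= xb.mean(axis=0)
n = xa.shape[0]

# Full-response covariance (pooled) and symmetrized replica cross-covariance.
c_full = (xa.T @ xa + xb.T @ xb) / (2 * n)
c_cross = xa.T @ xb / n
c_cross = 0.5 * (c_cross + c_cross.T)

# Whiten the full response, then diagonalize the consistent part.
s, q = np.linalg.eigh(c_full)
transform = (q / np.sqrt(s)).T
levels = np.linalg.eigvalsh(transform @ c_cross @ transform.T)[::-1]

# Assumed global measure: variance-weighted consistency.
global_consistency = np.trace(c_cross) / np.trace(c_full)
fraction_above = np.mean(levels > global_consistency)
print(f"global: {global_consistency:.3f}, fraction above: {fraction_above:.2f}")
```

Plotting `levels` against the component index gives the profile. `fraction_above` is the share of directions whose consistency exceeds the assumed global value.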
V Conclusion
We have applied the concept of consistency to echo-state networks as a new way of characterizing the echo-state property. In particular, we have assessed the performance of ESNs on a memory task across a range of consistency regimes. We found that inconsistency is not as destructive to fading memory as expected: inconsistent reservoirs perform comparably to consistent reservoirs and can even outperform them. The reason signals survive in the inconsistent regime lies in the distribution of signal and noise. We introduced the consistency profile, based on principal-component analysis, as a portrait of the high-dimensional response. We found that a few directions of high consistency are always present, bypassing the chaotic or noise-induced fluctuations. Our method is applicable to an arbitrary reservoir computer, including physical media in which noise is inherently present.
While complete consistency is always desirable for reservoir design, we found that in a typical ESN with regularization the response is poorly exploited. Our findings may thus give rise to unsupervised pre-training methods which aim to maximize the consistent dimension in a response subject to chaos, noise, and regularization. Furthermore, one may think of shared processing of multiple input channels, in which the distinct inputs act as sources of inconsistency for each other rather than as complementary channels from which the desired output is synthesized. This is likely to be the case in many biological examples, where a single computational unit receives multiple stimuli. Our method may help to understand how computational capacity is distributed and routed in such systems. In summary, the consistency profile, including the effects of regularization, may lead to an enhanced understanding of computational capacity in noisy neuronal microcircuits and may also prove useful in unsupervised optimization procedures for reservoir design.
Acknowledgements.
TL is supported by the Australian Government Research Training Program at The University of Western Australia. AK was supported by the Hackett Postgraduate Scholarship. MS is supported by Australian Research Council Discovery Project DP180100718.
References
- Boccaletti et al. (2002) S. Boccaletti, J. Kurths, G. Osipov, D. Valladares, and C. Zhou, Physics Reports 366, 1 (2002).
- Arenas et al. (2008) A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, and C. Zhou, Physics Reports 469, 93 (2008).
- Pikovsky et al. (2001) A. Pikovsky, M. Rosenblum, and J. Kurths, Synchronization: A Universal Concept in Nonlinear Sciences (Cambridge, 2001).
- Rulkov et al. (1995) N. F. Rulkov, M. M. Sushchik, L. S. Tsimring, and H. D. I. Abarbanel, Phys. Rev. E 51, 980 (1995).
- Moskalenko et al. (2012) O. I. Moskalenko, A. A. Koronovskii, A. E. Hramov, and S. Boccaletti, Phys. Rev. E 86, 036216 (2012).
- Giacomelli et al. (2010) G. Giacomelli, S. Barland, M. Giudici, and A. Politi, Phys. Rev. Lett. 104, 194101 (2010).
- Lukoševičius and Jaeger (2009) M. Lukoševičius and H. Jaeger, Computer Science Review 3, 127 (2009).
- Appeltant et al. (2011) L. Appeltant, M. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. Mirasso, and I. Fischer, Nature Comm. 2, 468 (2011).
- Konkoli (2018) Z. Konkoli, “Reservoir computing,” in Unconventional Computing: A Volume in the Encyclopedia of Complexity and Systems Science, Second Edition, edited by A. Adamatzky (Springer US, New York, NY, 2018) pp. 619–629.
- Jaeger (2001) H. Jaeger, The “echo state” approach to analysing and training recurrent neural networks, GMD Report 148 (GMD - Forschungszentrum Informationstechnik, 2001).
- Jaeger and Haas (2004) H. Jaeger and H. Haas, Science 304, 78 (2004).
- Uchida et al. (2004) A. Uchida, R. McAllister, and R. Roy, Phys. Rev. Lett. 93, 244102 (2004).
- Uchida et al. (2008) A. Uchida, K. Yoshimura, P. Davis, S. Yoshimori, and R. Roy, Phys. Rev. E 78, 1 (2008).
- Kanno and Uchida (2012) K. Kanno and A. Uchida, Phys. Rev. E 86, 066202 (2012).
- Oliver et al. (2015) N. Oliver, T. Jüngling, and I. Fischer, Phys. Rev. Lett. 114, 123902 (2015).
- Nakayama et al. (2016) J. Nakayama, K. Kanno, and A. Uchida, Opt. Express 24, 8679 (2016).
- Bueno et al. (2017) J. Bueno, D. Brunner, M. C. Soriano, and I. Fischer, Opt. Express 25, 2401 (2017).
- Jüngling et al. (2018) T. Jüngling, M. C. Soriano, N. Oliver, X. Porte, and I. Fischer, Phys. Rev. E 97, 042202 (2018).
- Abarbanel et al. (1996) H. D. I. Abarbanel, N. F. Rulkov, and M. M. Sushchik, Phys. Rev. E 53, 4528 (1996).
- Pathak et al. (2017) J. Pathak, Z. Lu, B. R. Hunt, M. Girvan, and E. Ott, Chaos: An Interdisciplinary Journal of Nonlinear Science 27, 121102 (2017).
- Lu et al. (2017) Z. Lu, J. Pathak, B. Hunt, M. Girvan, R. Brockett, and E. Ott, Chaos: An Interdisciplinary Journal of Nonlinear Science 27, 041102 (2017).
- Zimmermann and Parlitz (2018) R. S. Zimmermann and U. Parlitz, Chaos: An Interdisciplinary Journal of Nonlinear Science 28, 043118 (2018).
- Pecora and Carroll (1990) L. M. Pecora and T. L. Carroll, Phys. Rev. Lett. 64, 821 (1990).
- Pyragas (1997) K. Pyragas, Phys. Rev. E 56, 5183 (1997).
- Heiligenthal et al. (2011) S. Heiligenthal, T. Dahms, S. Yanchuk, T. Jüngling, V. Flunkert, I. Kanter, E. Schöll, and W. Kinzel, Phys. Rev. Lett. 107, 234102 (2011).
- Verstraeten and Schrauwen (2009) D. Verstraeten and B. Schrauwen, in Artificial Neural Networks – ICANN 2009, edited by C. Alippi, M. Polycarpou, C. Panayiotou, and G. Ellinas (Springer Berlin Heidelberg, Berlin, Heidelberg, 2009) pp. 985–994.
- Yildiz et al. (2012) I. B. Yildiz, H. Jaeger, and S. J. Kiebel, Neural Networks 35, 1 (2012).
- Manjunath and Jaeger (2013) G. Manjunath and H. Jaeger, Neural Computation 25, 671 (2013).
- Massar and Massar (2013) M. Massar and S. Massar, Phys. Rev. E 87, 042809 (2013).
- Marquez et al. (2018) B. A. Marquez, L. Larger, M. Jacquot, Y. K. Chembo, and D. Brunner, Scientific Reports 8, 3319 (2018).
- MacLennan (2018) B. J. MacLennan, “Analog computation,” in Unconventional Computing: A Volume in the Encyclopedia of Complexity and Systems Science, Second Edition, edited by A. Adamatzky (Springer US, New York, NY, 2018) pp. 3–33.
- Sanger (1996) T. D. Sanger, Journal of Neurophysiology 76, 2790 (1996), PMID: 8899646.
- Rumelhart et al. (1986) D. E. Rumelhart, J. L. McClelland, and PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations (MIT Press, 1986), A Bradford Book.
- Grigoryeva et al. (2015) L. Grigoryeva, J. Henriques, L. Larger, and J.-P. Ortega, Scientific Reports 5, 12858 (2015).
- Maass et al. (2002) W. Maass, T. Natschläger, and H. Markram, Neural Computation 14, 2531 (2002).
- Mainen and Sejnowski (1995) Z. Mainen and T. Sejnowski, Science 268, 1503 (1995).
- Goldobin and Pikovsky (2006) D. S. Goldobin and A. Pikovsky, Phys. Rev. E 73, 061906 (2006).
- Teramae and Tanaka (2004) J.-n. Teramae and D. Tanaka, Phys. Rev. Lett. 93, 204103 (2004).
- Goldobin and Pikovsky (2005) D. S. Goldobin and A. Pikovsky, Phys. Rev. E 71, 045201 (2005).
- Pimenova et al. (2016) A. V. Pimenova, D. S. Goldobin, M. Rosenblum, and A. Pikovsky, Scientific Reports 6, 38518 (2016).
- Eckmann and Ruelle (1985) J. P. Eckmann and D. Ruelle, in The Theory of Chaotic Attractors (Springer, 1985) pp. 273–312.
- Frederickson et al. (1983) P. Frederickson, J. L. Kaplan, E. D. Yorke, and J. A. Yorke, Journal of Differential Equations 49, 185 (1983).
