Role of assortativity in predicting burst synchronization using echo state network

Mousumi Roy, Abhishek Senapati, Swarup Poria, Arindam Mishra*, Chittaranjan Hens
* arindammishra@gmail.com

date: 2021-10-11

In this study, we use a reservoir-computing-based echo state network (ESN) to predict the collective burst synchronization of neurons. Specifically, we investigate the ability of the ESN to predict the burst synchronization of an ensemble of Rulkov neurons placed on a scale-free network. We show that a limited number of nodal time series used as input to the machine can capture the real trend of burst synchronization in this network. Further, we investigate the proper selection of nodal inputs for degree-degree (positively and negatively) correlated networks. We show that for a disassortative network, a degree-based selection of input nodes plays no significant role in the machine's prediction. However, in the case of an assortative network, training the machine with the information (i.e., the time series) of low-degree nodes gives better results in predicting the burst synchronization. Finally, we explain the underlying mechanism responsible for this difference in prediction in a degree-correlated network.

Reservoir computing (RC) is a simple yet extremely efficient machine learning methodology for predicting temporal data. Several techniques, e.g., the echo state network (ESN) and the liquid state machine (LSM), have been developed on the basis of the reservoir computing paradigm. The seminal work by Jaeger et al. [1, 2] opened up a new direction in the field of ESN-based machine learning [3-8]. Due to its simplicity, the ESN has emerged as a valuable tool in areas ranging from neuroscience [9, 10], speech recognition [11], language processing [12], and robotics [13] to stock market prediction [14], inference of connectivity [15, 16], network classification [17], and even prediction of recent COVID-19 epidemic trends [18]. Some studies have also focused on suitable choices of reservoir weights [4, 19, 20] and on the optimal range of hyperparameters [21-30], which are crucial for a good prediction. On the other hand, despite the development of diverse nonlinear and statistical tools [31, 32], understanding ergodicity and analyzing chaotic features still pose challenges for researchers in the field of nonlinear dynamics. In this backdrop, it has been shown that an ESN-based machine learning approach can be highly useful for the efficient prediction of nonlinear data due to its simple architecture and fast computation. For instance, Pathak et al. [33] showed that a reservoir-computing-based ESN can indeed predict large spatiotemporal chaotic data, with good prediction efficiency up to a few multiples of the Lyapunov time scale. Since then, ESN-based prediction has received growing attention and has been used in several aspects of nonlinear dynamics, for example, prediction of chaotic time series data [33-37], spatiotemporal dynamics [38], determination of Lyapunov exponents [39, 40], dynamics of multiscale systems [35], prediction of critical transitions [41], and the critical range for the efficiency of the reservoir [42], to name a few.
Another interesting aspect is the collective or macroscopic behavior of ensembles of interacting oscillators, such as synchronization, quenching of oscillations, chimera states, etc. [43-47], which can be efficiently identified and extracted with machine learning tools. The critical parameter for amplitude death [48] and the onset of generalized synchronization can easily be captured using an ESN-based approach [49, 50]. First-order and second-order phase transitions of a system of non-identical oscillators can be predicted through an ESN as well [51]. There are also other machine learning approaches for the identification of chimera and solitary states [52, 53], generalized synchronization [54], prediction of extreme events [55, 56], and forecasting of COVID-19 spread [57]; however, we adopt the ESN for our study because of its lower computational cost. Motivated by such diverse applications of ESNs to nonlinear systems, we aim here to predict the collective spiking and bursting dynamics in a network of non-identical neurons. In particular, we develop a modified architecture of the ESN to capture the onset as well as the strength of burst synchronization. Note that spiking is a repeated firing state of neurons, whereas bursting is a bunch of spikes [58-63]. Each burst is followed by a quiescent state before the next burst occurs, and bursting can be periodic or chaotic. A recent study [64] revealed that, for a mixed population of coupled oscillators, an ESN needs at least one time series from each group to achieve a satisfactory prediction of spiking and bursting as well as of the onset of generalized synchronization. However, that prediction of bursting dynamics was limited to a complete (uncorrelated) graph, and both the original and the machine-generated burst profiles were periodic. Also, a slight increase of the coupling drives such a system into two tight cluster states, and the machine then predicts the entire time signals more accurately. In contrast, a broad spectrum of randomness in the system parameters (desynchronized chaotic neurons) will not generate cluster synchronization immediately after the coupling is switched on; rather, the phases of the neurons may start to synchronize earlier [61]. Identifying such phase synchronization (in particular, the bursts of each oscillator starting and ending at the same time) motivates us to ask whether machine learning (ML), specifically the ESN, can predict the burst synchronization of non-identical neurons. On the other hand, the network architecture encoded in neuronal connections is not regular but rather follows a complex structure [65-68]. Thus, we aim here to predict the burst synchronization of neurons connected in a complex network through our modified ESN approach. To delve deeper, we investigate the emergence of bursts in assortative as well as disassortative networks [69, 70]. Assortative mixing is associated with several collective phenomena, such as functional networks in the human cortex [65], spreading phenomena on complex networks [71, 72], information networks [73], etc. Interaction among similar individuals is also tied to network resilience, making a system more stable under external perturbations. In this regard, the key questions we raise here are: what is the minimum number of input neurons required for an efficient prediction, and, depending on the level of assortativity of the original system, which nodes should be used as inputs to the machine?
As an attempt to address these questions, we concentrate on making an ESN-based prediction of the degree of burst synchronization of a neuronal system on a complex network with positive and negative degree-degree correlation. We consider an ensemble of Rulkov neurons, diffusively coupled through a heterogeneous scale-free network. It is well known that such a system attains a phase-synchronized state through a continuous phase transition [74]. In this setting, in an ESN environment, we train the machine using the time series data of a few nodes to obtain the predicted time series of the remaining nodes. Here, instead of the whole time series, we focus on the onset of bursts and try to predict the route of the synchronization transition. Furthermore, we explore whether any particular choice of input nodes results in a better prediction. In this context, since assortative mixing is a local network characteristic with a significant impact on the collective dynamics of the system, we use it as the basis for selecting the input nodes used to train the machine. Surprisingly, this aspect of finding an optimal choice of input nodes is still not well explored. Therefore, in this study, we try to elucidate how the degree-dependent assortativity of the original system creates a discrepancy in the predicted results. Moreover, we examine how the results can be improved by using different groups of training nodes. Finally, we explain these findings based on the dynamics of the original system. A brief outline of this study follows. In Sec. II, a detailed description of the reservoir system is presented. We briefly describe the Rulkov map based model and the structure of the underlying complex network in Sec. III. Sec. IV is devoted to defining quantities such as the assortativity, the synchronization order parameter, and the root mean square error that are used throughout the study to quantify the efficiency of the prediction. Next, we organize all the results in Sec. V and end with a conclusion in Sec. VI.

In this study, we consider a conventional ESN, a discrete-time leaky tanh system that updates its internal dynamics using the formula

v(t+1) = (1 - α) v(t) + α tanh( W_res v(t) + W_in u(t) ),   (1)

where W_res (∈ R^{N_res × N_res}) is the reservoir matrix. It is chosen as a sparse matrix that determines the connectivity weights within the reservoir, with its nonzero elements randomly chosen from [-1, 1]. We fix the spectral radius of W_res at ρ = 0.95 and rescale the matrix accordingly (we keep the spectral radius below unity to prevent the reservoir from settling onto multiple fixed-point, periodic, or even chaotic attractor modes, thus ensuring the echo state property). v (∈ R^{N_res}) represents the dynamics of the internal reservoir nodes. Here, we use a system of Rulkov neurons to investigate the efficiency of the ESN-generated prediction process; a detailed description of the model is given in Sec. III. From this system, to train the ESN we use the time series of K chosen nodes for the first t_train time points (after ignoring a considerable transient time); this is the term u(t) in Eq. 1, defined as u(t) = [u_1(t), u_2(t), ..., u_K(t)]^T, the vector of the K chosen input time series at time t. W_in (∈ R^{N_res × K}) is the corresponding weight matrix connecting the inputs with the reservoir; we randomly take the elements of W_in from the uniform distribution on [-0.5, 0.5]. The leaking parameter α can take values from 0 to 1; in this study, we set α = 0.09 and N_res = 1000. Once the training is done, our target is to predict the dynamics of the remaining N - K nodes from the machine-generated time series for the next t_test time points.
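To make the architecture concrete, a minimal sketch of this update rule follows. The parameter values are those quoted above; the roughly 5% reservoir density and the random seed are our assumptions, since the text only states that W_res is sparse.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    N_res, K = 1000, 15        # reservoir size and number of input nodes (values from the text)
    alpha, rho = 0.09, 0.95    # leaking parameter and target spectral radius

    # Sparse reservoir matrix with nonzero entries drawn from [-1, 1], rescaled so
    # that its spectral radius equals rho < 1 (echo state property).
    W_res = rng.uniform(-1.0, 1.0, (N_res, N_res))
    W_res[rng.random((N_res, N_res)) > 0.05] = 0.0
    W_res *= rho / np.max(np.abs(np.linalg.eigvals(W_res)))

    # Input weights drawn uniformly from [-0.5, 0.5]
    W_in = rng.uniform(-0.5, 0.5, (N_res, K))

    def reservoir_step(v, u):
        """Leaky-tanh update of Eq. (1)."""
        return (1.0 - alpha) * v + alpha * np.tanh(W_res @ v + W_in @ u)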
During the training period, at each time instant the reservoir states are collected into a matrix X (∈ R^{(N_res+K+1) × t_train}), whose columns are defined as X(t) = [1; u(t); v(t)]. This gives us a system of linear equations for the readout weights,

Y = W_out X,   (2)

where Y (∈ R^{(N-K) × t_train}) is the time series of the targeted N - K nodes over the training period (t = 1, 2, 3, ..., t_train). Therefore, we can find

W_out = Y X^T ( X X^T + λ I )^{-1},   (3)

where λ is the ridge regression parameter introduced to avoid overfitting during the training process. Now, for testing, we obtain Y_target by employing W_out on X at the next t_test time points. First we feed u(t), the time series of the initially chosen K input nodes for t = t_train+1, t_train+2, ..., t_train+t_test, into Eq. 1. At every time instant we get v(t) as well as X(t) = [1; u(t); v(t)], and using Eq. 2 we finally obtain the targeted result Y_target(t) = W_out X(t).

We consider an ensemble of N diffusively coupled chaotic Rulkov neurons [62]. At every time instant the system follows the equations

x_i(t+1) = α_i / (1 + x_i(t)^2) + y_i(t) + ε Σ_j C_ij [ x_j(t) - x_i(t) ],   (4)
y_i(t+1) = y_i(t) - σ [ x_i(t) - β ],   (5)

where α_i is randomly chosen from the uniform distribution on [4.1, 4.5]. We take β = -1.5 and σ = 0.003. For this parameter set, an isolated node follows chaotic bursting dynamics (as shown in Fig. 1). Here, ε is the coupling strength and C is the adjacency matrix of a Barabási-Albert (BA) scale-free network [75]. For this study, we consider N = 500 and average degree 6. Initially, we consider the original system (Eqs. 4-5) connected through an uncorrelated network with assortativity A = 0. The assortativity A is defined as the degree-degree Pearson correlation,

A = [ M^{-1} Σ_i j_i h_i - ( M^{-1} Σ_i (j_i + h_i)/2 )^2 ] / [ M^{-1} Σ_i (j_i^2 + h_i^2)/2 - ( M^{-1} Σ_i (j_i + h_i)/2 )^2 ],   (6)

where j_i and h_i are the degrees of the nodes connected by the i-th link, and M is the total number of links in the network.

Next, we train the machine using the time series of t_train = 25000 data points (avoiding the first 20000 transient time points) of K chosen nodes. Our target is to predict the time series of the remaining N - K nodes for the next t_test = 10000 time instants. In Fig. 2, we plot the time series of a randomly chosen node (see Figs. 2(a)-(d)) and the corresponding mean field of the whole system (see Figs. 2(e)-(h)) from both the original and machine-predicted data at several coupling strengths (ε = 0.02, 0.04, 0.06, 0.12), where the mean field is defined as

x̄(t) = (1/N) Σ_{i=1}^{N} x_i(t).

Here, we randomly choose K = 15 input nodes to train the machine. We use gray and red to present the original and machine-predicted data, respectively. Interestingly, we observe that the machine is not able to predict the individual time series with perfect amplitude matching (see Figs. 2(a)-(d)), but the mean field (see Figs. 2(e)-(h)) and the onsets of bursts are captured considerably well. At considerably low coupling strength (ε = 0.02), the machine's prediction is very poor, but with increasing coupling strength an improvement in the predicted result is noticeable. For comparatively higher coupling strengths (ε = 0.06, 0.12), even though exact amplitude prediction is not achieved, the machine can accurately reproduce the mean field of the whole system (see Figs. 2(g),(h)). We see that, instead of the individual time series, the machine captures the onset of bursts as well as the overall mean-field dynamics quite well at various coupling strengths, which motivates us to focus on predicting the degree of burst synchronization of the whole system.
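For concreteness, the readout training and testing procedure described at the beginning of this section can be sketched as follows, continuing the reservoir-update sketch above. The value of the ridge parameter λ is our assumption, since the text does not quote one.

    def train_readout(U_train, Y_train, lam=1e-6):
        """Drive the reservoir with the K input series and fit W_out by ridge regression.
        U_train: (K, t_train) inputs; Y_train: (N-K, t_train) targets."""
        t_train = U_train.shape[1]
        v = np.zeros(N_res)
        X = np.zeros((1 + K + N_res, t_train))
        for t in range(t_train):
            v = reservoir_step(v, U_train[:, t])
            X[:, t] = np.concatenate(([1.0], U_train[:, t], v))   # X(t) = [1; u(t); v(t)]
        # W_out = Y X^T (X X^T + lam I)^(-1), Eq. (3)
        W_out = Y_train @ X.T @ np.linalg.inv(X @ X.T + lam * np.eye(X.shape[0]))
        return W_out, v

    def predict(U_test, W_out, v):
        """Testing phase: keep feeding the true inputs of the K nodes into Eq. (1)
        and read out the remaining N-K nodes with W_out (Eq. 2)."""
        Y_pred = np.zeros((W_out.shape[0], U_test.shape[1]))
        for t in range(U_test.shape[1]):
            v = reservoir_step(v, U_test[:, t])
            Y_pred[:, t] = W_out @ np.concatenate(([1.0], U_test[:, t], v))
        return Y_pred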
As the machine cannot capture the individual amplitudes of the oscillators, we have to rely on phases in order to predict burst synchronization. We measure the phase θ from the time series of every neuron by introducing a Poincaré section at x_0 = -1. Every time the time series of x crosses the line x_0 = -1 in the upward direction, we take that point as the starting point of a new burst (see Fig. 1). We also impose the restriction that two successive bursts must be separated by at least 60 time points. Then, mathematically, the phase of a chaotic oscillator θ_j(t) is defined as [43]

θ_j(t) = 2πn + 2π (t - t_{n,j}) / (t_{n+1,j} - t_{n,j}),   t_{n,j} ≤ t < t_{n+1,j},

where t_{n,j} is the starting time point of the n-th burst of the j-th neuron. In Fig. 1, we present the time series of four randomly chosen nodes from the uncoupled system; the red dots mark the starting points of new bursts. We take the data after discarding a considerable number of transient time points. Now, with the phases of the oscillators, we can quantify the degree of burst synchronization using the absolute value of Kuramoto's order parameter [76],

r(t) = (1/N) | Σ_{j=1}^{N} e^{iθ_j(t)} |,   (7)

where θ_j(t) is the phase of the j-th neuron at time t. For this calculation, we separately use the original and the machine-generated time series over the t_test time points. Averaging Eq. 7 over a large time interval, we get the final averaged order parameter

R = ( 1 / (t_f - t_i) ) Σ_{t=t_i}^{t_f} r(t),   (8)

where t_f is the final time point and t_i is the first time point after ignoring the transient period. This global synchronization order parameter approaches R → 1 for a perfectly phase-synchronized state and, in contrast, R → 0 for an entirely non-phase-synchronized state. In order to quantify the quality of a prediction, we compute the root mean square error (RMSE) between the original and predicted time series of r(t),

RMSE = sqrt( (1/t_test) Σ_t [ r_orig(t) - r_pred(t) ]^2 ).   (9)

In Fig. 3, we take the same uncorrelated network as in Fig. 2 and plot the values of R with respect to ε for different numbers of input nodes. The original (black circle) and machine-predicted (green square) order parameters are shown for K = 5, 15 and 25 (see Figs. 3(a)-(c)). We randomly choose the input nodes and observe that with increasing values of K the prediction improves gradually. In Fig. 3(d), the corresponding RMSE confirms the results shown in Figs. 3(a)-(c): the RMSE gradually decreases with increasing K. Therefore, for an uncorrelated network, increasing the number of inputs helps the machine to generate an improved prediction. But what happens if the network has a degree-degree correlation between the nodes, and is there then any preference for choosing specific input nodes? We address these questions in the next sections.
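Before moving on, the model and the quantities defined above can be assembled into a self-contained sketch. It simulates a small coupled Rulkov network (Eqs. 4-5), detects burst onsets on the Poincaré section x = -1, builds the phases, and evaluates r(t) and R; the unnormalized diffusive coupling form, the initial conditions, and the chosen ε are assumptions of this sketch.

    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(seed=2)

    # Barabasi-Albert network with N = 500 and average degree ~6 (m = 3)
    N = 500
    G = nx.barabasi_albert_graph(N, 3, seed=2)
    C = nx.to_numpy_array(G)
    deg = C.sum(axis=1)

    alpha_i = rng.uniform(4.1, 4.5, N)     # heterogeneity of the neurons
    beta, sigma = -1.5, 0.003
    eps = 0.04                             # one of the coupling strengths explored above

    def simulate(T, transient=20000):
        """Coupled Rulkov maps, Eqs. (4)-(5)."""
        x = rng.uniform(-1.0, 1.0, N)
        y = rng.uniform(-3.0, -2.5, N)
        traj = np.zeros((N, T))
        for t in range(transient + T):
            x_new = alpha_i / (1.0 + x**2) + y + eps * (C @ x - deg * x)
            y = y - sigma * (x - beta)
            x = x_new
            if t >= transient:
                traj[:, t - transient] = x
        return traj

    def burst_phases(traj, x0=-1.0, refractory=60):
        """Phase via linear interpolation between successive burst onsets, detected
        as upward crossings of the Poincare section x = x0 (Sec. IV)."""
        n_osc, T = traj.shape
        theta = np.full((n_osc, T), np.nan)
        for j in range(n_osc):
            up = np.where((traj[j, :-1] < x0) & (traj[j, 1:] >= x0))[0] + 1
            onsets = []
            for t in up:                          # enforce the 60-step burst separation
                if not onsets or t - onsets[-1] >= refractory:
                    onsets.append(t)
            for n in range(len(onsets) - 1):
                t0, t1 = onsets[n], onsets[n + 1]
                theta[j, t0:t1] = 2*np.pi*n + 2*np.pi*(np.arange(t0, t1) - t0)/(t1 - t0)
        return theta

    def rmse(r_orig, r_pred):
        """Root mean square error between two r(t) series, Eq. (9)."""
        return np.sqrt(np.mean((np.asarray(r_orig) - np.asarray(r_pred))**2))

    traj = simulate(T=10000)
    theta = burst_phases(traj)
    valid = ~np.isnan(theta).any(axis=0)                     # times where all phases exist
    r_t = np.abs(np.exp(1j * theta[:, valid]).mean(axis=0))  # Kuramoto r(t), Eq. (7)
    R = r_t.mean()                                           # time-averaged R, Eq. (8)
    print(f"burst synchronization order parameter R = {R:.3f}")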
In this section, we address the question of whether the type of nodes we select to train the machine has any effective influence on the predicted results, and whether this has a connection with the assortativity of the original network. Assortativity (A) is one of the important characteristics of a scale-free network, quantifying the tendency of similar-degree nodes to connect. We make a network assortative by removing links between high- and low-degree nodes and reconnecting pairs of similar-degree nodes; by reversing this process, we can make a network disassortative. We follow the algorithm proposed in [77] to obtain a network with a desired level of degree assortativity, measured by Pearson's correlation coefficient as described in Eq. 6. At different levels of assortativity, the local network topology is reconstructed without affecting the degree distribution of the whole network, but this local rearrangement of links has an effective impact on the global scenario. This leads us to investigate the effect of the assortativity index on the input-node selection process during the training of the ESN.

Therefore, we consider networks with different levels of assortativity and use different groups of nodes for training. We consider nodes with degree less than 6 as low-degree nodes and nodes with degree greater than 15 as high-degree nodes; the intermediate nodes have degrees between 6 and 15. At a globally synchronized state, considering large values of K (the number of input nodes) is absolutely unnecessary: feeding the data of only one node is enough for the ESN to predict the phases of the remaining nodes (since they are all synchronized).

FIG. 3: ((a)-(c)) Global synchronization order parameter R with respect to ε from the original (black circle) and machine-generated (green square) data for different values of K. Here, we take an uncorrelated network; input nodes are randomly chosen from the whole system. (d) RMSE with respect to ε. At low coupling strength, an increasing number of inputs makes for a better prediction.

Therefore, we concentrate on the desynchronized or partially synchronized state of the system, i.e., on low coupling strengths (ε ≲ 0.06). We start by considering a disassortative (negatively correlated) network with A = -0.3. In Fig. 4, we plot the values of R from both the original and the machine-generated time series with respect to ε for different numbers of inputs. Here, we make a preference in choosing the training nodes depending on their degree: high (degree > 15) or low (degree < 6).

FIG. 4: A disassortative (A = -0.3) network is considered. The global synchronization order parameter R from the original (black) and machine-predicted (green) data is depicted with respect to ε. We choose the inputs from the high ((a)-(c)) and low ((e)-(g)) degree groups. In the last column, we present the RMSE with the variation of ε for different numbers of (d) high- and (h) low-degree inputs; with an increasing number of inputs we obtain a better prediction. (i)-(k) RMSE for different groups of input nodes. Selecting lower- or higher-degree nodes for training does not show any distinct effect on the efficiency of the prediction process.

In the upper panel (see Figs. 4(a)-(d)), the time series of high-degree nodes are used to train the machine, and we gradually increase the number of input nodes (K) from 5 to 25 (see Figs. 4(a)-(c)). We use black and green for the original and machine-predicted data. It is observed that the increment in the number of inputs helps the ESN to capture the synchronization level of the whole system. Fig. 4(d) confirms this observation, as the RMSE decreases with increasing K. Here, red, blue and yellow are used for K = 5, 15 and 25. Increasing the number of inputs gradually decreases the RMSE in the desynchronized or partially phase-synchronized state. In the lower panel (see Figs. 4(e)-(h)), the inputs are chosen from the low-degree nodes, and we find results similar to those for the high-degree inputs. In the last row, we compare the RMSE for the different groups of inputs at fixed values of K. Blue and yellow are used for the high- and low-degree groups, and we observe that the RMSE takes almost the same values for these two groups of input nodes (see Figs. 4(i)-(k)). This leads us to the conclusion that, for a disassortative network, a degree-based preference in selecting the training nodes has no significant impact on the machine-predicted results; only an increasing number of inputs helps the ESN to make a better prediction, as before for an uncorrelated network.
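The networks with prescribed assortativity used in these experiments can be produced by degree-preserving double edge swaps in the spirit of the algorithm of [77] described at the beginning of this section. The sketch below is a simple greedy variant; the acceptance rule and the checking interval are simplifications of our own.

    def rewire_to_assortativity(G, target_A, tol=0.01, max_steps=200000, seed=3):
        """Degree-preserving double edge swaps that push the degree-degree
        correlation of G toward target_A."""
        rng = np.random.default_rng(seed)
        G = G.copy()
        A = nx.degree_assortativity_coefficient(G)
        for step in range(max_steps):
            edges = list(G.edges())
            i, j = rng.choice(len(edges), 2, replace=False)
            (a, b), (c, d) = edges[i], edges[j]
            nodes = sorted({a, b, c, d}, key=G.degree)
            if len(nodes) < 4:
                continue
            if target_A > A:    # assortative push: pair the two low- and the two high-degree nodes
                e1, e2 = (nodes[0], nodes[1]), (nodes[2], nodes[3])
            else:               # disassortative push: pair extremes together
                e1, e2 = (nodes[0], nodes[3]), (nodes[1], nodes[2])
            if G.has_edge(*e1) or G.has_edge(*e2):
                continue        # avoid multi-edges; degrees are preserved by construction
            G.remove_edges_from([(a, b), (c, d)])
            G.add_edges_from([e1, e2])
            if step % 500 == 0:
                A = nx.degree_assortativity_coefficient(G)
                if abs(A - target_A) < tol:
                    break
        return G

    G_assort = rewire_to_assortativity(G, target_A=0.3)   # reuses G from the sketch above
    print(nx.degree_assortativity_coefficient(G_assort))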
Now, we consider assortative mixing, i.e., a bias of a node in favor of attaching to nodes of similar characteristics. In Fig. 5 we consider a degree-correlated network with A = 0.3. Here, we again choose different groups of input nodes to examine the effect of degree correlation on the node selection process. The upper, middle and lower rows represent the results for high (degree > 15), intermediate (6 ≤ degree ≤ 15) and low (degree < 6) degree inputs, respectively. As in the previous figure, we use black and green to represent R corresponding to the original and machine-generated data. Similar to the disassortative case, we find better predictions with increasing values of K for each group of inputs. The last column, showing the RMSE, verifies this result (see Figs. 5(d),(h),(l)). But here an interesting result is observed depending on the degree of the input nodes. Consider the first column (see Figs. 5(a),(e),(i)), where K = 5. We find that training with high-degree nodes does not produce the desired result. In comparison with the high-degree nodes, the intermediate nodes give a better prediction, and the low-degree nodes give a more accurate result than the intermediate or high-degree inputs. In Fig. 5(m), we plot the RMSE and observe a small variation among the results corresponding to the different degree inputs. We use blue, green and yellow for the high-, intermediate- and low-degree inputs, respectively. The discrepancy in the predicted results becomes clearer for increasing values of K, i.e., for K = 15 and K = 25: the RMSE for the low-degree input nodes remains considerably low in comparison with the high and intermediate groups (see Figs. 5(n),(o)). Therefore, we notice that in an assortative network, if the low-degree nodes are used to train the machine, the ESN gives better results than with the other nodes. We discuss the reason behind this phenomenon in a later section. We now explore the overall effect of the assortativity index of the original system on the ESN-predicted results in Fig. 6. We fix K = 15 and use the groups of high (degree > 15) and low (degree < 6) degree nodes for training. In Fig. 6, we initially consider an uncorrelated network (A = 0) and gradually increase the degree correlation up to a certain level (A = 0.3) to find the relation between the assortativity index of the original system and the RMSE for the different groups of input nodes. We plot the RMSE corresponding to the considered groups of training nodes for different values of A; blue and magenta are used for the high- and low-degree inputs, respectively. In Fig. 6(a), we plot the RMSE between the synchronization order parameters of the original and machine-generated data for A = 0 and observe that the machine prediction is almost the same for the two separate input groups. In Figs. 6(b)-(d), we plot the same for A = 0.1, A = 0.2 and A = 0.3. Interestingly, examining this figure, we find that training with higher-degree nodes fails to give the desired prediction for a degree-correlated network, while the lower-degree input nodes give comparatively better results. A difference in the values of the RMSE arises between the two distinct groups of inputs, and this difference gradually grows with increasing assortativity of the original system. Therefore, for a degree-correlated system, low-degree training nodes help the machine to make a more accurate prediction.
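Both the input-selection experiments above and the mechanism analysis in the next section reduce to partitioning the nodes by degree and evaluating the order parameter on a subset. A minimal sketch, reusing G and theta from the simulation sketch earlier; the thresholds are those of the text.

    rng = np.random.default_rng(seed=4)

    def degree_groups(G, low=6, high=15):
        """Partition nodes: degree < 6 (low), 6..15 (intermediate), > 15 (high)."""
        deg = dict(G.degree())
        return {"low":          [n for n, k in deg.items() if k < low],
                "intermediate": [n for n, k in deg.items() if low <= k <= high],
                "high":         [n for n, k in deg.items() if k > high]}

    def group_order_parameter(theta, nodes):
        """Time-averaged Kuramoto order parameter (Eqs. 7-8) restricted to a node
        subset, as used to compare the synchronization paths of the degree groups."""
        th = theta[np.asarray(nodes), :]
        cols = ~np.isnan(th).any(axis=0)    # keep times where all group phases are defined
        return np.abs(np.exp(1j * th[:, cols]).mean(axis=0)).mean()

    groups = degree_groups(G)
    inputs = rng.choice(groups["low"], size=15, replace=False)   # e.g. K = 15 low-degree inputs
    print(group_order_parameter(theta, groups["high"]),
          group_order_parameter(theta, groups["low"]))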
In this section, we explore the reason why using lower-degree nodes as inputs to the ESN results in a better prediction for a highly assortative network, and how this effect fades with decreasing degree correlation in the network topology of the original system. In Fig. 6, we chose the training nodes from two distinct groups depending on their degree; we now investigate the reason behind the discrepancy in the obtained results. Since increasing assortativity restructures the local connectivity of the network based on the degrees of the individual nodes, with similar-degree nodes tending to connect to each other, we explore the collective synchronization behavior of the original system. The paths to global synchronization of these two separate groups, along with that of the whole system, are depicted in Fig. 7. Here, blue, green and magenta are used for the whole system, the group of high-degree (degree > 15) nodes and the group of low-degree (degree < 6) nodes, respectively. In Fig. 7(a), we observe the paths to the synchronized state for a disassortative network (A = -0.3). Sufficiently many connections between high- and low-degree nodes help the whole system synchronize together: the synchronization paths are almost the same for the separate groups and for the whole network. Therefore, no choice of input nodes has a distinct effect on the results. However, in Fig. 7(b), for a highly assortative network (A = 0.3), we observe that the different groups follow different paths to synchronization. The group of high-degree nodes forms a cluster and synchronizes quickly in comparison with the whole network; this may be the reason for not obtaining the desired prediction when we train the machine with high-degree nodes. On the other hand, the group of low-degree nodes follows almost the same path as the whole network. Therefore, for a degree-correlated network, training with low-degree nodes gives a better prediction of the synchronization level of the whole network.

FIG. 6: Networks with A = 0, 0.1, 0.2 and 0.3 are considered; the difference between the RMSE for the groups of higher- and lower-degree nodes becomes larger with increasing assortativity index. Therefore, using low-degree nodes is found to be effective for training.

FIG. 7: Disassortative (A = -0.3) and assortative (A = 0.3) networks are considered. Blue, green and magenta are used for the whole system and the groups of high- and low-degree nodes, respectively.

In this study, we attempted to predict the burst synchronization of a neuronal system coupled through a heterogeneous scale-free network using an ESN-based machine learning approach. We train the machine with the time series of a few nodes and let the machine predict the series of the remaining nodes of the system. Interestingly, the ESN cannot precisely predict the amplitude of the time series of each node; instead, the onset of bursts as well as the mean field are predicted quite well. Since assortativity is a crucial characteristic of a heterogeneous network, we investigated the effect of this characteristic on the ESN-predicted results. During the training session, we chose inputs of different degrees, and, interestingly, the different groups of input nodes were found to differ in their effectiveness for the prediction process. In a disassortative network, no such dissimilarity is observed among the different (degree-based) groups of inputs. However, for a degree-correlated network, the lower-degree input nodes turn out to be very effective.
If we train the ESN using the dynamics of low-degree nodes, the machine can correctly predict the route of the global phase synchronization order parameter toward the synchronized state, in contrast to training with higher-degree inputs. We also analyzed the possible reason behind this result from the synchronization behavior of the original system: since the lower-degree nodes are very large in number and dominate the collective synchronized regime of the whole system in an assortative network, the low-degree nodes are able to give the preferred result. This investigation might provide practical insights in several fields. As the ESN is a model-independent technique, this result may be used in any area where only the information of some lower-degree nodes of a heterogeneous system is available; if the concern is to get an idea of the collective behavior of the whole system, the ESN can easily build up the desired prediction. From a neuroscience viewpoint, burst-synchronized activity in some regions of the human cortex is related to human memory and cognition, so our findings may be useful for understanding the mechanism of the human memory retrieval process.

Acknowledgments: ... on the basis of the budget approved by the Saxon State Parliament. C.H. is supported by the DST-INSPIRE Faculty grant (code No. IFA17-PH193).