1 Introduction

Coma is a serious medical condition characterized by a prolonged state of unconsciousness in which an individual shows little or no response [9, 10]. The electroencephalogram (EEG) is a medical exam that records the brain’s electrical activity. It represents the summation of many excitatory and inhibitory postsynaptic potentials produced in the pyramidal layer of the cerebral cortex [29]. Data from this exam are commonly used for monitoring and diagnosing various diseases, such as schizophrenia [24] and epilepsy [26].

The adoption of EEG data for disease detection tasks often relies on well-known machine learning (ML) algorithms such as adaptive boosting, support vector machines and tree-based algorithms, which compare portions of the exam for one or more electrodes [17, 22, 28]. To enable the application of ML models to examination data, traditional approaches apply various filters to all channels to extract spectral attributes [5]. The resulting attribute vectors are then processed by the aforementioned algorithms.

In addition to disease-related monitoring and diagnosis, EEG data can be used for behavioral and emotional analysis of patients. Some studies show that good results in emotion prediction can be obtained by applying feature extraction methods to the EEG time series, followed by ML models [20, 27]. These studies achieved good predictive results by combining a set of attributes extracted from the time series resulting from the EEG examination, most of the time mixing attributes from the time and frequency domains. To further improve the results, other studies combine time series attributes with the output of facial expression detection [14, 30].

Other studies show the impact that a musical stimulus can have on brain activity [15, 25], and even the relationship between such musical stimuli and the behavioral recovery of comatose patients [7]. Research on the therapeutic potential of musical stimulation in coma patient recovery often employs a comparative approach, analyzing the temporal signal from each electrode before and after the stimulus. These comparisons involve the extraction of statistical variables from the time series data. Additionally, ML models have been applied to comprehensively assess the impact generated by such stimuli on patients’ neurological responses and recovery trajectories [23].

Most approaches to EEG analysis involve extracting information from descriptive statistical variables or using simple signal pre-processing methods [18, 19]. Other well-known approaches include Recurrent Neural Networks (RNNs), such as echo state networks and Long Short-Term Memory (LSTM) [1, 3, 4, 21, 25, 32, 33]. LSTM is particularly effective for handling temporal data and managing relationships between states. In EEG analysis tasks, LSTM architectures often include a regularization layer during the training process, an architectural choice that enhances the model’s ability to accurately interpret and analyze the examination results.

LSTMs yield satisfactory results when applied to EEG exams for the aforementioned applications. However, they miss a significant opportunity: exploring the scalp position of each electrode and the relationships between channels. Graph Neural Networks (GNNs) are designed to work with graph-structured data, and recent works applying GNNs to emotion prediction and other general EEG classification tasks [8, 34] have obtained promising results. Beyond such positive outcomes, GNNs can also explore connections involving the spatial positioning of the electrodes (nodes) and the correlations between the graph data features, a scarcely explored topic in the literature.

In this work, we investigate four GNN architectures for the Prognosis of Patients in Coma (PPC) using EEG data. Our graph classification approach incorporates convolutional layers specifically designed for graph data, allowing the exploration of spatial relationships between electrodes thanks to the representation of the EEG examination as a graph structure. To represent sequential EEG data as graphs, we also develop EEGraph, a modeling strategy in which each vertex in the graph corresponds to an electrode and the edges are constructed from the electrodes’ positions on the scalp.

While the PPC task is continuously evolving, we recognize certain opportunities for improvement. For example, the topological structure of EEG channels is often underutilized in the formulation of EEG representations [2]. To address this, our study focuses on leveraging the graph structure to uncover more informative patterns and relationships among the electrodes, leading to enhanced prognostic accuracy. By addressing such a challenge, we aim to contribute to the PPC task from EEG signals, an approach compatible with most Brazilian public hospitals in terms of cost and equipment availability. Our experiments with real data show that both the proposed GNN architectures and the graph modeling strategy improve the predictive results regarding the outcome of comatose patients. In summary, our main contributions can be stated as follows:

  • Development of four GNN architectures based on convolutional dynamics to predict PPC outcomes. The proposed architectures aim to explore both nonlinear transformations and local connections of the vertices.

  • Design of EEGraph, a modeling strategy to transform raw EEG data into graph structures. EEGraph seeks to exploit the connections between the vertices (electrodes) of graph structures as much as possible.

  • Development of an experimental setup to compare the proposed architectures against LSTM, providing evidence of their satisfactory results.

The remainder of this work is organized as follows: Sect. 2 presents the background required to understand the proposed architectures. Section 3 describes the proposed approaches. Section 4 shows the experimental results, and Sect. 5 brings the conclusion of the work as well as forthcoming investigations.

2 Background

This section presents a brief description of the mathematical concepts related to the GNN and LSTM layers investigated in this paper.

2.1 Graph Neural Networks

For a better understanding of GNNs and their respective convolutional layers used in this work, it is important to reiterate the definition of the graph data structure on which these networks operate. A graph \( \mathcal {G} = (\mathcal {V}, \mathcal {E}) \) is the data representation formed by connecting vertices through edges, denoted respectively by \( \mathcal {V} \) and \( \mathcal {E} \). Each edge can carry a weight value \( \mathcal {W} \). An adjacency matrix \( \mathcal {A} \) can be used to represent an overview of the final structure of the graph.

  • Graph Convolutional Network (GCN) has its layer propagation behavior governed by the following equation [16]:

    $$\begin{aligned} H^{(l + 1)} = \sigma ( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)})~, \end{aligned}$$
    (1)

with the variables related to a certain undirected graph \( \mathcal {G} \), where \( \tilde{A} \) is the adjacency matrix with added self-loops, \( \tilde{D} \) is the diagonal degree matrix of \( \tilde{A}\), given by \(\tilde{D}_{ii} = \sum _{j} \tilde{A}_{ij}\), \( W^{(l)}\) is the weight matrix of the neural network at layer l, and \(\sigma ()\) represents the activation function at the \(l^{th}\) layer. \(H^{(l + 1)}\) is the resulting matrix at the end of the \( l^{th}\) layer, where \(H^{(0)}\) corresponds to the initial inputs, called X, and \(H^{(n)}\), with n as the last layer, contains the results of the final step of the learning process, called Y.

Equation (1) summarizes the three stages of GCN that precede the prediction: feature propagation, linear transformation and nonlinear activation. The feature propagation stage updates the entire graph as follows:

$$\begin{aligned} \tilde{H}^{(l)} \leftarrow \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l - 1)}~. \end{aligned}$$
(2)

After the first stage, the result of Eq. (2) is combined with the weight matrix \( W^{(l)}\) at each iteration, yielding the linear transformation stage. Finally, the function \(\sigma ()\) is applied to complete the process:

$$\begin{aligned} H^{(l)} \leftarrow \sigma ( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l-1)} W^{(l)})~. \end{aligned}$$
(3)

Finally, GCN prediction is performed by applying a softmax layer in order to normalize the results obtained during the update process.
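To make the propagation rule concrete, the following minimal NumPy sketch (not part of the original formulation) implements one layer of Eq. (3) on a toy three-node graph, assuming \(\tanh\) as the activation \(\sigma\). In practice, a GNN library such as PyTorch Geometric would provide this layer.

```python
import numpy as np

def gcn_layer(A, H, W, activation=np.tanh):
    """One GCN propagation step: H' = sigma(D~^{-1/2} A~ D~^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])          # adjacency with self-loops
    d = A_tilde.sum(axis=1)                   # degrees of A_tilde
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D~^{-1/2}
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt     # normalized propagation matrix
    return activation(S @ H @ W)

# toy path graph 0-1-2 with 2 input and 2 output features (illustrative only)
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H0 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])
W = np.eye(2)
H1 = gcn_layer(A, H0, W)
```

Stacking several such calls, followed by a softmax over the last layer's output, reproduces the full GCN pipeline described above.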

  • Simple Graph Convolution (SGC) differs from GCN by removing the sequence of nonlinear transformations between the layers of the GCN structure and by further exploring the feature propagation stage [31]. The main objective of the feature propagation stage in SGC is to exploit the potential of the relations between neighboring vertices via the configuration of the adjacency matrix. Therefore, Eq. (2), previously used in GCN layers, can be rewritten for SGC architectures as follows:

    $$\begin{aligned} \tilde{H}^{(l)} \leftarrow (\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}})^{l} H^{(0)}~. \end{aligned}$$
    (4)

From Eq. (4), the product with the weight matrix \( W^{(l)}\) is given by:

$$\begin{aligned} \tilde{H}^{(l)} \leftarrow (\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}})^{l} H^{(0)} W^{(l)}~. \end{aligned}$$
(5)

To obtain the normalized value of the SGC predictions for the classes, a softmax function is applied over Eq. (5).
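As a hedged illustration of Eqs. (4)–(5), the NumPy sketch below propagates the features l hops in a single precomputation and then applies one linear map followed by softmax; the two-node toy graph and identity weights are assumptions for demonstration only.

```python
import numpy as np

def softmax(Z):
    e = np.exp(Z - Z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sgc_predict(A, X, W, l=2):
    """SGC (Eq. 5): propagate l hops once, then a single linear map + softmax."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt           # normalized adjacency
    return softmax(np.linalg.matrix_power(S, l) @ X @ W)

# toy graph: two connected nodes, identity features, two classes
A = np.array([[0.0, 1.0], [1.0, 0.0]])
probs = sgc_predict(A, X=np.eye(2), W=np.eye(2), l=2)
```

Because the nonlinearities are removed, \((\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2})^{l}X\) can be computed once before training, which is precisely what makes SGC cheap.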

  • Sample and Aggregate Convolutional Network (SAGE) introduces an approach for aggregating feature information from a node’s local neighborhood, enhancing the representation learning for graph-structured data. The propagation behavior of the SAGE layer is given by Eqs. (6) and (7), which collectively describe the sampling and aggregation processes integral to the model [12].

The first step in the SAGE layer involves sampling a fixed-size set of neighbors for each node, which can be described by:

$$\begin{aligned} \mathcal {N}(v) = \text {Sample}(\tilde{\mathcal {N}}(v), K)~, \end{aligned}$$
(6)

where \(\mathcal {N}(v)\) represents the set of neighbors sampled for node \( v \), \(\tilde{\mathcal {N}}(v)\) is the full set of neighbors of \( v \), and \( K \) is the number of neighbors to sample.

Once the neighbors are sampled, the next step is to aggregate their features using an aggregation function \( \text {AGG} \), which can be mean, LSTM-based, or pooling-based. The aggregated feature vector is then concatenated with the node’s own feature vector and transformed through a weight matrix and non-linear activation function, as follows:

$$\begin{aligned} h_v^{(l+1)} = \sigma \left( W^{(l)} \cdot \text {CONCAT} \left( h_v^{(l)}, \text {AGG} \left( \{ h_u^{(l)}, \forall u \in \mathcal {N}(v) \} \right) \right) \right) ~, \end{aligned}$$
(7)

in which \( h_v^{(l)} \) denotes the feature vector of node \( v \) at layer \( l \), \( \text {CONCAT} \) is the concatenation operation, \( \text {AGG} \) represents the chosen aggregation function, \( W^{(l)} \) is the weight matrix for layer \( l \), and \( \sigma \) is the activation function.

The final node representations can be used for downstream tasks such as node classification or link prediction, often incorporating a softmax layer for normalization and classification purposes.
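The sampling and aggregation steps of Eqs. (6)–(7) can be sketched as follows; mean aggregation, the toy path graph, and the weight values are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def sage_layer(neighbors, H, W, K=2, seed=0):
    """One GraphSAGE step (Eqs. 6-7): sample up to K neighbors per node,
    mean-aggregate their features, concatenate with the node's own
    features, then apply a nonlinear transformation."""
    rng = np.random.default_rng(seed)
    out = []
    for v, nbrs in enumerate(neighbors):
        sampled = rng.choice(nbrs, size=min(K, len(nbrs)), replace=False)
        agg = H[sampled].mean(axis=0)           # AGG = mean
        z = np.concatenate([H[v], agg])         # CONCAT(h_v, AGG(...))
        out.append(np.tanh(W @ z))              # sigma(W . z)
    return np.vstack(out)

# toy path graph 0-1-2, 2 features per node, output dimension 3
neighbors = [[1], [0, 2], [1]]
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = np.full((3, 4), 0.1)                        # maps concat (2f=4) -> 3
H_next = sage_layer(neighbors, H, W)
```

Note that W acts on the concatenated vector, so its input dimension is twice the feature dimension.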

  • Graph Convolutional Network II (GCNII) extends the traditional GCN by introducing initial residual and identity mapping, which address the limitations of deeper GNNs. The propagation behavior of GCNII is governed by Eqs. (8) and (9), which incorporate these novel characteristics [6].

The initial residual connection allows each layer to directly utilize the input features of the first layer, enhancing information flow and mitigating the vanishing gradient problem. From Eq. 1, this can be defined as follows:

$$\begin{aligned} H^{(l+1)} = \sigma \left( (1 - \alpha ) \left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right) + \alpha H^{(0)} \right) ~, \end{aligned}$$
(8)

where \(\alpha \) is a hyperparameter controlling the balance between the initial residual connection and the current layer’s transformation.

The identity mapping characteristic further stabilizes the training of deep graph networks by incorporating identity mappings, which preserve the information from the previous layer, such as follows:

$$\begin{aligned} H^{(l+1)} = \sigma \left( (1 - \alpha ) \left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right) + \alpha H^{(0)} + \beta H^{(l)} \right) ~, \end{aligned}$$
(9)

where \(\beta \) is another hyperparameter that controls the contribution of the identity mapping term \(H^{(l)}\), further enhancing the model’s capacity to maintain feature representation throughout layers.

The final node representations generated by GCNII are typically passed through a softmax layer for normalization and classification purposes.
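A minimal NumPy sketch of the GCNII update in Eq. (9) is shown below; the \(\alpha = 0.1\) and \(\beta = 0.5\) values match the hyperparameters adopted later in Sect. 3.3, while the toy graph, identity weights and \(\tanh\) activation are assumptions for illustration.

```python
import numpy as np

def gcnii_layer(A, H, H0, W, alpha=0.1, beta=0.5):
    """GCNII update (Eq. 9): initial residual to H0 (alpha term)
    plus identity mapping of the previous layer H (beta term)."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    return np.tanh((1 - alpha) * (S @ H @ W) + alpha * H0 + beta * H)

# toy path graph 0-1-2; at the first layer H coincides with H0
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
H0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
H1 = gcnii_layer(A, H0, H0, W=np.eye(2))
```

Even in deep stacks, the \(\alpha H^{(0)}\) term keeps every layer anchored to the input features, which is what mitigates over-smoothing.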

2.2 Long Short-Term Memory

LSTM is a type of RNN architecture designed to address the vanishing gradient problem and capture long-term dependencies in sequential data [11, 13]. LSTM uses specialized memory units to store and retrieve information selectively; unlike traditional RNNs, its stronger memory mechanism allows it to retain information for longer periods. The LSTM model adopted for comparison purposes in the present work consists of five key functions [11]:

  1. Forget Gate: Denoted by \(f_t\), it computes the amount of previously stored information to forget. It is calculated by applying a sigmoid activation function (\(\sigma \)) to a linear combination of the input \(x_t\), the previous hidden state \(h_{t-1}\), and the previous cell state \(C_{t-1}\), using the weight matrices \(W_{xf}\), \(W_{hf}\), \(W_{cf}\) and the bias vector \(b_f\).

    $$\begin{aligned} f_t = \sigma (W_{xf}x_t + W_{hf}h_{t-1} + W_{cf}C_{t-1} + b_f) \end{aligned}$$
    (10)
  2. Input Gate: Denoted by \(i_t\), it determines the amount of new information to be stored in the current memory state. It is calculated by applying a sigmoid activation function (\(\sigma \)) to a linear combination of \(x_t\), \(h_{t-1}\), and \(C_{t-1}\), using the weight matrices \(W_{xi}\), \(W_{hi}\), \(W_{ci}\) and the bias vector \(b_i\).

    $$\begin{aligned} i_t = \sigma (W_{xi}x_t + W_{hi}h_{t-1} + W_{ci}C_{t-1} + b_i) \end{aligned}$$
    (11)
  3. Candidate Cell State: Denoted by \(\widetilde{C}_t\), it represents the new information that can be added to the memory. It is calculated by applying a hyperbolic tangent activation function to a linear combination of \(x_t\) and \(h_{t-1}\), using the weight matrices \(W_{xc}\), \(W_{hc}\) and the bias vector \(b_c\).

    $$\begin{aligned} \widetilde{C}_t = \tanh (W_{xc}x_t + W_{hc}h_{t-1} + b_c) \end{aligned}$$
    (12)
  4. Cell State Update: It combines the forget gate activation vector (\(f_t\)), the input gate activation vector (\(i_t\)), and the candidate cell state (\(\widetilde{C}_t\)) to update the current memory state (\(C_t\)). Element-wise multiplication (\(\odot \)) is performed between the forget gate and the previous memory state, and between the input gate and the candidate cell state.

    $$\begin{aligned} C_t = f_t \odot C_{t-1} + i_t \odot \widetilde{C}_t \end{aligned}$$
    (13)
  5. Output Gate: Denoted by \(o_t\), it controls the amount of information to be output from the memory state. It is calculated by applying a sigmoid activation function (\(\sigma \)) to a linear combination of \(x_t\), \(h_{t-1}\), and the current cell state \(C_t\), using the weight matrices \(W_{xo}\), \(W_{ho}\), \(W_{co}\) and the bias vector \(b_o\).

    $$\begin{aligned} o_t = \sigma (W_{xo}x_t + W_{ho}h_{t-1} + W_{co}C_t + b_o) \end{aligned}$$
    (14)

The LSTM model involves various parameters, including the weight matrices (\(W_{x\cdot }\), \(W_{h\cdot }\), \(W_{c\cdot }\)), the bias vectors (\(b_f\), \(b_i\), \(b_c\), \(b_o\)), and the hidden state \(h_t\). These parameters are learned during the training process and play a crucial role in the network’s ability to capture and model complex temporal dependencies.
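The five functions above can be sketched as a single time step in NumPy. The sketch follows Eqs. (10)–(14); the final hidden-state update \(h_t = o_t \odot \tanh(C_t)\) is the standard LSTM rule and, like the toy dimensions and random weights, is an assumption not stated explicitly in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM time step following Eqs. (10)-(14)."""
    f = sigmoid(p["Wxf"] @ x_t + p["Whf"] @ h_prev + p["Wcf"] @ C_prev + p["bf"])
    i = sigmoid(p["Wxi"] @ x_t + p["Whi"] @ h_prev + p["Wci"] @ C_prev + p["bi"])
    C_hat = np.tanh(p["Wxc"] @ x_t + p["Whc"] @ h_prev + p["bc"])   # Eq. (12)
    C = f * C_prev + i * C_hat                                      # Eq. (13)
    o = sigmoid(p["Wxo"] @ x_t + p["Who"] @ h_prev + p["Wco"] @ C + p["bo"])
    h = o * np.tanh(C)            # standard hidden-state update (assumed)
    return h, C

# toy dimensions: 1 input feature, 2 hidden units; small random weights
rng = np.random.default_rng(0)
p = {k: rng.normal(scale=0.1, size=(2, 1) if k.startswith("Wx") else (2, 2))
     for k in ["Wxf", "Whf", "Wcf", "Wxi", "Whi", "Wci",
               "Wxc", "Whc", "Wxo", "Who", "Wco"]}
p.update({b: np.zeros(2) for b in ["bf", "bi", "bc", "bo"]})
h, C = lstm_step(np.array([0.5]), np.zeros(2), np.zeros(2), p)
```

Iterating `lstm_step` over the EEG samples of one channel yields the sequence of hidden states that a full LSTM layer would produce.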

3 Model Description

This section provides a detailed description of all the components developed to enable GNN learning over EEG data. We also present the dataset adopted in this study, the formulation and pseudocode of the modeling approach to represent EEG data as graphs, the selection of hyper-parameters for the GNN networks, and the validation process employed. An overview of the complete pipeline can be seen in Fig. 1, and all its steps are detailed in this section.

Fig. 1.

Flowchart of the complete pipeline. (a) Electroencephalogram before the preprocessing step; (b) EEG exam as a graph, with each electrode’s respective temporal values as vertices, and the patient’s actual outcome stored as the target class of the graph; (c) Graph structure after modeling the exam data: each electrode is a node, and all edges have the same weight (w = 1); (d) Graph Neural Network representation, created based on convolution principles; and (e) the final result (outcome probability) of the pipeline.

3.1 EEG Dataset

The dataset used in this study is private. With the approval of the Research Ethics Committee of the Federal University of Uberlândia (UFU), EEG exams recorded from comatose patients in the intensive care unit of the UFU Clinical Hospital were collected. Table 1 shows the class distribution. EEG exams follow the 10–20 electrode system, whose disposition along the scalp is shown in Fig. 2. After preprocessing, each EEG exam is denoted by 10 two-second segments selected through visual inspection by a specialized physician in the field. Along with the recordings, the datasets were labeled regarding the outcome of the comatose patient, considering the following groups:

  • Favorable outcome is related to patients with a positive outcome;

  • Unfavorable outcome is related to patients with clinical or brain death.

Table 1. Outcomes Final Class Distribution
Fig. 2.

Electrodes position on patients’ cortex.

EEG data were collected at sampling frequencies from 100 to 400 Hz for the electrodes [FP1, FP2, F7, F3, FZ, F4, F8, T3, C3, CZ, C4, T4, T5, P3, PZ, P4, T6, O1, O2]. To address the differences in sampling frequency across exams, the amount of data collected at 400 Hz was used as the reference.

3.2 EEGraph: Modeling EEG Data Into Graphs

As illustrated by Figs. 1(b) and 1(c), we develop EEGraph, a modeling strategy to represent EEG exam data as graphs. The graph data structure, as seen in the Background section, requires the definition of vertices and edges in order to leverage the power of GNN frameworks. Therefore, EEGraph is divided into two main steps: Node Construction and Edge Construction.

In the Node Construction step, each electrode is denoted by a graph vertex (\( \mathcal {V} \)), whose attributes contain its respective temporal EEG records. Figure 1(b) illustrates the step in EEGraph responsible for mapping the EEG records of each electrode to vertices in the graph structure. At this stage, the temporal values of each electrode are assigned to the respective vertex as an attribute vector and, simultaneously, the respective patient outcome is assigned as the target class of the final graph data structure.

In the Edge Construction step, a matrix P, defined by (15), is built as a reference for the creation of the edges between nodes (electrodes), as exemplified in Fig. 1(c). All links are created by considering the electrodes’ positions along the scalp, as shown in Fig. 2, to effectively leverage the spatial relationships between the nodes and represent them as connections. The padding elements, denoted by \(\times \) in (15), are used only to facilitate the visual interpretation and the implementation of the neighboring heuristic described next. Although they are not directly present in the construction of the graph structures, they play an important role: they give the EEGraph strategy a positional reference for each vertex during edge construction and allow the definition of different levels of neighboring relationships.

$$\begin{aligned} P = \begin{bmatrix} \times & F_{p1} & \times & F_{p2} & \times \\ F_{7} & F_{3} & F_{z} & F_{4} & F_{8} \\ T_{3} & C_{3} & C_{z} & C_{4} & T_{4} \\ T_{5} & P_{3} & P_{z} & P_{4} & T_{6} \\ \times & O_{1} & \times & O_{2} & \times \\ \end{bmatrix} \end{aligned}$$
(15)

We enhance our EEGraph modeling strategy by designing an edge construction heuristic able to take advantage of different levels of neighboring relationships among the nodes. Such a heuristic is named h-EEGraph, in which h defines the maximum geodesic distance to be considered in the creation of adjacent edges. Figure 3 shows the graph constructed by 1-EEGraph, in which only adjacent vertices are connected, ignoring the padding elements in matrix P. Figure 4 shows the connections of the \(F_{p1}\) electrode/node considering one, two and three levels of neighborhood, by setting the maximum geodesic distance h to 1, 2 and 3, respectively. Formally, the edge construction step is given by:

$$\begin{aligned} A(v_i,v_j) = {\left\{ \begin{array}{ll} 1, & \text {if } d(v_i, v_j) \le h~,\\ 0, & \text {otherwise,} \end{array}\right. } \end{aligned}$$
(16)

in which A refers to the adjacency matrix obtained from our h-EEGraph modeling, and \(d(v_i, v_j)\) is the geodesic distance between nodes \(v_i\) and \(v_j\).
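The construction of A in Eq. (16) can be sketched directly from matrix P. The text does not make the grid metric explicit; the sketch below assumes \(d(v_i, v_j)\) is the Chebyshev (king-move) distance between positions in P, with padding cells excluded, an interpretation consistent with the neighborhood levels of Figs. 3 and 4 but an assumption nonetheless.

```python
import numpy as np

# Matrix P from Eq. (15); None marks the padding positions (x)
P = [
    [None, "FP1", None, "FP2", None],
    ["F7", "F3", "FZ", "F4", "F8"],
    ["T3", "C3", "CZ", "C4", "T4"],
    ["T5", "P3", "PZ", "P4", "T6"],
    [None, "O1", None, "O2", None],
]

def h_eegraph_adjacency(P, h=1):
    """Build the h-EEGraph adjacency matrix of Eq. (16), assuming the
    geodesic distance is the Chebyshev distance on the grid of P."""
    pos = {P[r][c]: (r, c) for r in range(5) for c in range(5) if P[r][c]}
    names = list(pos)
    A = np.zeros((len(names), len(names)), dtype=int)
    for i, u in enumerate(names):
        for j, v in enumerate(names):
            if i == j:
                continue
            d = max(abs(pos[u][0] - pos[v][0]), abs(pos[u][1] - pos[v][1]))
            if d <= h:                     # A(vi, vj) = 1 iff d(vi, vj) <= h
                A[i, j] = 1
    return names, A

names, A1 = h_eegraph_adjacency(P, h=1)    # 1-EEGraph
```

Under this assumption, 1-EEGraph connects FP1 to F7, F3 and FZ, while FP1–FP2 only becomes an edge from h=2 onward.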

Fig. 3.

Electrodes structured as a graph.

Fig. 4.

Example of the FP1 channel method connections.

3.3 Neural Networks Techniques

The GNN frameworks investigated in this study are GCN [16], SGC [31], SAGE [12] and GCNII [6]. For comparative purposes, we considered an LSTM network [33], a widely adopted technique for sequential data analysis. Section 2 presents a brief description of each of the techniques under comparison.

Specifically, all GNN models considered three convolutional layers. In the SAGE layers we adopted the mean aggregation function, and for the GCNII layers we adopted the values 0.1 and 0.5 for the \(\alpha \) and \(\beta \) hyperparameters, respectively. The models were trained for 100 epochs with a batch size of 64 and the Adaptive Moment Estimation (Adam) optimizer with a learning rate of \(10^{-3}\). The parameters and structures of the created GNNs were similarly designed to ensure a fair comparison between them. Regarding the LSTM architecture, we identified the optimal configuration, which consisted of a single LSTM layer with 20 neurons, trained over 100 epochs, yielding the best predictive performance.

Table 2. Performance Comparison: Metric Value/Standard Deviation

4 Experimental Results

In order to ensure a reliable and fair comparison between the results generated by the network architectures, the stratified K-Fold method was applied. The dataset was partitioned into 10 folds while maintaining proportionality between the classes, using 8 partitions for training, 1 for validation and 1 for test. Table 1 shows a considerable imbalance between the number of records for each outcome, a factor of great weight in the choice of the metrics presented in Table 2. To deal with this imbalance, the F1-Score was used as the most important metric for both the selection of hyper-parameters and the evaluation of the predictive performance of the neural network (NN for short in the table) architectures. The reported Recall and Precision values were chosen to facilitate the understanding of the trade-off existing in each architecture, making evident any model fit problems with the data. Finally, accuracy was included so that it could always be compared with the reference F1-Score, ensuring that the balance between metrics was preserved and that there were no problems related to the adaptation of the models to the class imbalance.
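The stratified splitting protocol can be sketched as below. This is a minimal pure-NumPy illustration (in practice a routine such as scikit-learn's `StratifiedKFold` would be used); the 70/30 toy class imbalance is an assumption chosen only to mirror an imbalanced two-class setting.

```python
import numpy as np

def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds preserving class proportions.
    Each evaluation round then uses 8 folds for training, 1 for
    validation and 1 for testing, as in the experimental protocol."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        for pos, sample in enumerate(idx):
            folds[pos % k].append(int(sample))   # round-robin per class
    return folds

# imbalanced toy labels: 70 favorable (0) vs 30 unfavorable (1)
labels = np.array([0] * 70 + [1] * 30)
folds = stratified_folds(labels, k=10)
```

Each fold then holds the same 70/30 class ratio as the full dataset, which is what makes per-fold F1-Score comparisons meaningful under imbalance.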

Table 3. Statistical analysis using the Wilcoxon test to evaluate the best configuration among the combinations of GNNs and EEGraph neighborhood levels. Adopting a confidence level of 95% (\(\alpha \) = 0.05), the symbols >, <, and \(\approx \) indicate that the variable in the column is respectively better, worse, or equivalent to the variable in the row. Inner h=4 results omitted without affecting the analysis.

The results in Table 2 show a significant improvement for all proposed GNN architectures in the target metric F1-Score when compared to the reference LSTM architecture. In addition to an improvement of almost 18% in the target metric of the study and a better balance between precision and recall, a better fit of the GNN architectures compared to the LSTM is also noticeable, since the precision of the LSTM network shows signs of bias. The EEGraph strategy for the graph structure is shown in the table according to the number of neighborhood levels (h), considered from 1 to 4. With the exception of GCNII, smaller neighborhood levels such as 1 and 2 usually provide better results in terms of the target metric F1-Score, as well as in terms of Accuracy and Recall. This means that a strategy that increments the number of edges based solely on neighborhood levels, without considering weight differences during the edge construction step, does not contribute to enhancing the network learning for the given problem. On the other hand, it also indicates that the relationships among adjacent electrodes contribute decisively to better predictive performance.

Table 3 shows the Wilcoxon statistical test between all the architectures used in the experiment, making explicit the superiority of all GNN techniques in comparison to LSTM. The results of the SGC algorithm with h=1 are statistically superior to all other GNN and EEGraph configurations. For h=2 and h=3, SGC and GCN are statistically equivalent and outperform all other architectures, demonstrating the potential of both techniques.

Table 4. Statistical analysis using the Wilcoxon test to evaluate the models regardless of the EEGraph modeling strategy. Adopting a confidence level of 95% (\(\alpha \) = 0.05), the symbols >, <, and \(\approx \) indicate that the variable in the column is respectively better, worse, or equivalent to the variable in the row.
Table 5. Statistical analysis using the Wilcoxon test to evaluate the EEGraph modeling strategy, regardless of the model architecture. Adopting a confidence level of 95% (\(\alpha \) = 0.05), the symbols >, <, and \(\approx \) indicate that the variable in the column is respectively better, worse, or equivalent to the variable in the row.

When comparing the models regardless of the h-EEGraph configuration, as observed in Table 4, the SGC model consistently outperforms GCN, SAGE, and GCNII, indicating it is the most effective model in this context. GCN also performs well, surpassing SAGE and GCNII. SAGE shows better performance than GCNII but is outperformed by both GCN and SGC.

In addition to the previous statistical analysis, one can see in Table 2 that the GCNII model has considerably lower predictive performance than all other GNNs. It is important to note that we used only a single hyperparameter configuration for the GCNII architecture, which could be a potential limitation in fully evaluating its performance, as it may indicate a stronger dependence on the choice of the \(\alpha \) and \(\beta \) hyperparameters, related respectively to the initial residual and identity mapping contributions. For that reason, we removed the GCNII results from the next statistical analysis, as we are interested in identifying the contribution of different neighborhood levels in our EEGraph modeling strategy. Table 5 shows that the one-level neighborhood (h=1) outperforms the other neighborhood configurations across the GCN, SGC and SAGE GNNs. Conversely, the h=4 configuration is less effective than the others, possibly due to the difficulty of propagating salient information over a denser graph.

In summary, the results presented in this section highlight the substantial potential of leveraging the links between electrodes in EEG classification tasks. Furthermore, they also show that the EEGraph modeling strategy allows easy integration of EEG data with GNNs, a powerful machine learning framework. Overall, the results confirm that the combination of SGC and 1-EEGraph is the best-performing configuration.

5 Conclusion

In this study, we developed four GNN architectures based on convolutional dynamics to predict PPC results. These architectures effectively explored both nonlinear transformations and local connections between the vertices. Furthermore, we proposed EEGraph, a modeling strategy to transform raw EEG data into graph structures, which leverages the connections between the vertices (electrodes) of the graph structures to the fullest extent. Our experimental setup provided a comprehensive comparison between the proposed GNN architectures and an LSTM architecture, demonstrating the satisfactory performance of our proposals.

In conclusion, the results obtained from our study demonstrate that the proposed GNN architectures outperformed the LSTM in terms of various performance metrics. The comparison was carried out considering different modelings of the EEG data as a graph. This accomplishment aligns with the primary objective of our article, which aimed to explore the spatial structure of EEG and establish its potential contribution to performance improvement. Throughout the experiments, it was observed that the GNN architectures consistently achieved higher accuracy, F1-score, and recall values compared to the LSTM. This suggests that leveraging the topological characteristics of EEG data through graph-based approaches can effectively enhance the performance of EEG analysis tasks.

Looking forward, further studies will focus on developing novel layers tailored specifically to the GNN architecture to achieve even better performance on the prognosis of patients in coma. By customizing the GNN architecture to better suit the characteristics of EEG data, we can potentially uncover additional improvements in the accuracy and effectiveness of EEG analysis techniques. In the future, we will also refine the modeling strategy to represent EEG data as graph structures and extend its application to other public datasets. This refinement will focus on employing network structural optimization techniques to enhance the accuracy and efficiency of the graph-based models, thereby improving the overall predictive performance and applicability in real-world scenarios.