key: cord-0127150-teay3tao authors: Zhu, Qi; Zhang, Chao; Park, Chanyoung; Yang, Carl; Han, Jiawei title: Shift-Robust Node Classification via Graph Adversarial Clustering date: 2022-03-07 journal: nan DOI: nan sha: 16f1c8da09c83e86f361c267c8b69c255cc7691d doc_id: 127150 cord_uid: teay3tao

Graph Neural Networks (GNNs) are the de facto node classification models for graph-structured data. However, at test time these algorithms assume no data shift, i.e., $\Pr_{\text{train}}(X,Y) = \Pr_{\text{test}}(X,Y)$. Domain adaptation methods can be adopted under data shift, yet most of them are designed only to encourage similar feature distributions between source and target data; conditional shift on the classes can still affect such adaptation. Fortunately, graphs exhibit homophily across different data distributions. In response, we propose Shift-Robust Node Classification (SRNC) to address these limitations. We introduce an unsupervised clustering GNN on the target graph that groups similar nodes by graph homophily, and an adversarial loss with label information on the source graph is imposed on the clustering objective. A shift-robust classifier is then optimized on the training graph and on adversarial samples from the target graph, which are generated by the clustering GNN. We conduct experiments on both open-set shift and representation shift, demonstrating the superior accuracy of SRNC when generalizing to test graphs with data shift. SRNC is consistently better than the previous state-of-the-art domain adaptation algorithm on graphs, which progressively uses model predictions on the target graph for training.

Graph Neural Networks (GNNs) [10; 24; 6] have achieved enormous success in node classification. Recently, researchers have observed sub-optimal performance when the training data exhibit distributional shift against the testing data [33]. In real-world graphs, GNNs suffer poor generalization from two major kinds of shift: (1) open-set shift: new classes emerge in the graph (e.g., a new COVID variant in community transmission); (2) close-set shift: the time-augmented test graph has a substantially different class distribution (e.g., time-evolving Twitter and academic graphs).

In machine learning, domain adaptation helps a model generalize to the target domain when there is data shift between source and target. For open-set shift, Out-of-Distribution (OOD) detection methods [7] calibrate supervised classifiers to detect samples belonging to other domains (e.g., different datasets) or unseen classes. OpenWGL [29] and PGL [15] aim for open-set classification on graphs, and both progressively add pseudo labels for "unseen" classes. For close-set shift, the inferior performance caused by non-IID training data was first discussed for GNNs in [16; 33]. More general close-set shift is observed in time-evolving (dynamic) graphs, where the training graph can be viewed as a non-IID sub-graph of the test graph. In light of this issue, the most well-known approach is domain-invariant representation learning [13; 14; 30], namely, learning feature representations that are indistinguishable between source and target domains. However, the limitation of these methods becomes obvious when the conditional distribution of labels given features changes ($\Pr_{\text{train}}(Y \mid X) \neq \Pr_{\text{test}}(Y \mid X)$), known as conditional shift [31]. Despite the popularity and importance of GNNs, only a few attempts aim to make GNNs robust to either open-set shift [29; 15] or close-set shift [33].
None of the previous work incorporates graph-specific designs, and they still suffer from the aforementioned conditional shift (see the results of domain-invariant baselines in Table 3 for details). For example, different graphs exhibit varying degrees of graph homophily (e.g., neighbors sharing the same label), whose potential against distribution shifts has not been explored. In this paper, we propose a unified domain adaptation framework for shift-robust node classification, where the unlabeled target (testing) graph is used to facilitate model generalization. As shown in Figure 1, both shifts can be interpreted as specific distribution shifts in the plot. A major reason behind the ineffectiveness of existing domain adaptation techniques for GNNs is the lack of modeling of the classes in the target data, i.e., $\Pr(X^t, Y^t)$. Without modeling the class distribution on the target, it is impossible to mitigate conditional shift when it happens. Motivated by graph homophily, we propose to use graph clustering to identify latent classes [3; 23] by breaking less frequent (possibly heterophilous) edges between potential clusters (right side of Figure 1).

Our proposed model, which we call SRNC (Section 4.1), features two graph neural networks: a shift-robust classification GNN $P_\theta$ and an adversarial clustering GNN $Q_\Phi$ (Section 4.2). In SRNC optimization, the two modules improve each other through adversarial training: (1) The clustering structure inferred by the clustering GNN on source data, $\Pr_\Phi(C^s \mid X^s)$, should be close to the training conditional distribution $\Pr_{\text{train}}(Y^s \mid X^s)$; for instance, in the upper part of Figure 2, the KL-divergence between these two probabilities is minimized to push the clustering result to be consistent with the training data. (2) In the other direction, the classifier is optimized on both the training graph and the target graph, where the adversarial samples are drawn from the clustering GNN to improve generalization on the target graph. To the best of our knowledge, this is the first domain adaptation framework that models the joint target probability $\Pr(x_i^t, y_i^t)$ on graphs. Modeling the joint probability avoids tackling conditional shift on $\Pr(y_i^t \mid x_i^t)$ directly, and our real-world experiments (see Section 5.3) demonstrate that only minimizing the feature distribution discrepancy leads to negative transfer. Furthermore, convergence of the iterative optimization is theoretically guaranteed in Section 4.3.

To summarize, we present a domain adaptation framework for the two most common kinds of data shift in graph-structured data. Our experiments show that the shift-robust classifier trained in this way can detect more than 70% of open-set samples on three widely used benchmarks. For close-set shift, we use an early snapshot (prior to 2011) of the OGB-Arxiv [8] network as the training graph and test on more recent graphs (14-16/16-18/18-20). SRNC reduces the negative effect of shift by at least 30% in testing. We further show that progressively using the classifier's predictions as pseudo labels on the target graph [29; 15] (our ablation without the clustering component) is much worse than the proposed method.

Domain Adaptation. Domain adaptation transfers machine learning models trained on a source domain to a related target domain. The H-divergence [1] was first developed as a distance function between a model's performance on the source and target domains to describe how similar they are. Discrepancy measures [17] were later introduced to bound the generalization gap between source and target domains.
To bridge the gap between source and target, domain-invariant representation learning aims to minimize these discrepancy measures on source and (unlabeled) target data via adversarial learning [4] or regularization (e.g., CMD [30], MMD [13; 14]). We also use unlabeled target data in our framework, but we propose a clustering component that generates adversarial samples specifically for graph-structured data.

Open-set Classification. Open-set recognition and classification [20; 2] require classifiers to detect unknown objects during testing. A closely related problem is out-of-distribution detection [7], which focuses on detecting test samples drawn from a distribution different from the in-distribution (training) data seen by the classifier. These studies address the over-confidence of deep neural networks on unknown classes, e.g., OpenMax [2] and DOC [22]. On graph-structured data specifically, OpenWGL [29] employs an uncertainty loss in the form of a graph reconstruction loss [11] on unlabeled data, and PGL [15] extends a previous domain adaptation framework to the open-set scenario with graph neural networks. Yet none of the open-set learning work on graphs has explored modeling the class distribution on the target graph through graph homophily.

Distribution Shift on GNNs. There has been a recent surge of interest in the generalization of GNNs [5; 26; 16]. The generalization and stability of GCNs are found to be related to graph filters [26]. Our work mainly focuses on the distribution shift between specific training and testing data. To this end, SRGNN [33] first adopts two shift-robust techniques, CMD and importance sampling, for non-IID training data. However, these previous explorations assume no conditional shift, so the adaptation is limited or even negative in real-world situations.

$G = \{V, A, X\}$ is defined as a graph with nodes $V$, their features ($X \in \mathbb{R}^{|V| \times |F|}$), and edges between nodes (e.g., an adjacency matrix $A \in \mathbb{R}^{|V| \times |V|}$). A GNN encoder takes the node features $X$ and the adjacency matrix $A$ as input, aggregates neighborhood information, and outputs a representation $h_v^k$ for each node $v$. In its $k$-th layer, for node $i$, the GNN encoder aggregates neighbor information from layer $k-1$ into a neighborhood representation $z$ and updates the node representation:
$$z_i^k = \mathrm{AGG}\big(\{h_j^{k-1} : j \in N_i\}\big), \qquad h_i^k = \sigma\big(W^k \cdot [h_i^{k-1}; z_i^k]\big),$$
where $h_i^k \in \mathbb{R}^d$ is the hidden representation at each layer, $N_i$ is the neighborhood of node $i$, and $h^0 = X$. Graph homophily [32] indicates that densely connected nodes tend to have similar properties. Recent studies have shown graph homophily to be a key factor in the success of graph neural networks for node classification, where nodes with similar representations are likely to have the same label. Among the various possible causes of distribution shift, we are interested in two major shifts in this paper: (1) open-set shift, where new classes arise at test time, i.e., $|\mathcal{Y}_{\text{train}}| < |\mathcal{Y}_{\text{test}}|$; and (2) close-set shift, where the joint probability changes between training and testing, $\Pr_{\text{train}}(H, Y) \neq \Pr_{\text{test}}(H, Y)$.

In this section, we present our framework for shift-robust node classification. SRNC contains two adversarial GNN modules: (1) a shift-robust classification GNN $P_\theta$ (Section 4.1) and (2) an unsupervised clustering GNN $Q_\Phi$ (Section 4.2). The classification GNN is optimized on the labeled source graph and on pseudo labels (adversarial samples) from the target graph for better generalization against possible distributional shifts. Meanwhile, the clustering GNN is optimized on the target graph and regularized with training data from the source. Finally, in Section 4.3 and Algorithm 1, we summarize how we optimize SRNC via adversarial training between the two modules, and we provide a convergence analysis of the training via Expectation-Maximization (EM).
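To ground the two modules, below is a minimal PyTorch sketch of the classification GNN $P_\theta$ and the clustering GNN $Q_\Phi$ as simple two-layer GCNs over a dense normalized adjacency. This is an illustrative skeleton, not the authors' DGL-based implementation; the class name `GCN`, the `normalize_adj` helper, and the dimension choices are assumptions for exposition.

```python
import torch
import torch.nn as nn

def normalize_adj(A: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalize a dense adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + torch.eye(A.size(0), device=A.device)
    d_inv_sqrt = A_hat.sum(dim=1).clamp(min=1.0).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)

class GCN(nn.Module):
    """Two-layer GCN: each layer aggregates neighbor features, then applies a linear map."""
    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim)
        self.w2 = nn.Linear(hidden_dim, out_dim)

    def forward(self, A_norm: torch.Tensor, X: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.w1(A_norm @ X))   # layer 1: aggregate, transform, nonlinearity
        return self.w2(A_norm @ h)            # layer 2: output logits (classes or clusters)

# Classification GNN P_theta outputs the N seen classes plus one unknown class (open-set setting);
# clustering GNN Q_Phi outputs C >= N clusters. Sizes below are placeholders.
num_feats, hidden, num_classes, num_clusters = 128, 256, 7, 16
P_theta = GCN(num_feats, hidden, num_classes + 1)
Q_phi = GCN(num_feats, hidden, num_clusters)
# Usage: A_norm = normalize_adj(A); class_logits = P_theta(A_norm, X); cluster_logits = Q_phi(A_norm, X)
```

The two networks share nothing but the input graph: $P_\theta$ is trained with labels and pseudo labels, while $Q_\Phi$ is trained with the clustering objective described next.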
In the remainder of the paper, we refer to the training data as the source and the testing data as the target interchangeably.

We first propose the domain adaptation loss for shift-robust semi-supervised classification under distribution shift. The classifier $P_\theta$ minimizes a negative log-likelihood loss $L_{\theta,S}$ on the training data $(x_i^s, y_i^s)$ and a shift-robust loss $L_{\theta,T}$ on adversarial samples $(x_i^t, y_i^t)$ from the target (testing) data. The sampling process in the second term, $(x_j^t, y_j^t) \sim Q_\Phi$, first uniformly samples the same number of nodes as the training set, $\{x_j^t\}_{j=1}^{|D^s|}$, from the target graph and then obtains their pseudo labels from the clustering GNN. We discuss how to align the identities of the clusters $\{1..C\}$ and the classes $\{1..N\}$ at the beginning of model optimization (Section 4.3). In the first row of Figure 2, $P_\theta$ is first trained on the source graph and then jointly optimized on the adversarial target samples $\{x_j^t, y_j^t\}$. Note that the source and target graph can be the same (open-set shift) or different (close-set shift); our framework works for both kinds of distribution shift that we deem essential in this paper. For open-set shift, we set the number of clusters larger than the number of known classes, i.e., $C > N$, and $P_\theta$ classifies target data into the $N$ seen classes plus the new class $N+1$. In the second term of Equation 3, the samples $y_j^t \in C \setminus N$ from unaligned clusters are mapped into the unified unknown class $N+1$. For close-set shift, we simply set the number of clusters equal to the number of classes in the two modules.¹

¹We study the distribution shift in the final latent space $H$ and refer to $\Pr(H, Y)$ as the joint probability in this paper.

In this section, we explain how to approximate the target data distribution $\{x_i^t, y_i^t\} \sim P^t$ in Equation 3 and its connection with graph homophily and clustering. Uniform sampling from the target $P^t$ requires an unbiased estimate of $P^t$; although $y^t$ is unknown, similar nodes are likely to appear in the same cluster by graph homophily. In graph clustering, to model node $k$'s cluster membership $c_k$, we have $Q_\Phi(c_k \mid X, A) \sim \mathrm{Categorical}(\Phi)$. We thus parameterize $Q_\Phi$ with the outputs of another GNN:
$$Q_\Phi(c_k \mid X, A) = \mathrm{softmax}\big(\mathrm{GNN}_\Phi(X, A)_k\big). \quad (4)$$
We first describe the process of graph clustering on the target graph (uncolored left part of Figure 2); then we introduce how to align those clusters $c \sim Q_\Phi$ with the classes on the source graph (colored right part of Figure 2). Given the cluster membership matrix $C \in \{0,1\}^{|V| \times C}$, the modularity measure [19] $S$ quantifies the divergence between the number of intra-cluster edges and the expected number in a random graph:
$$S = \frac{1}{2m} \sum_{i,j} \Big[A_{ij} - \frac{d_i d_j}{2m}\Big]\, \delta(c_i, c_j),$$
where $c_i$ and $c_j$ are the cluster memberships of nodes $i$ and $j$, $d \in \mathbb{R}^{|V|}$ is the degree vector, and $m$ is the number of edges. By maximizing the modularity, the nodes are densely connected within each cluster. Instead of optimizing the binary cluster membership matrix $C$ (following [23]), our $\mathrm{GNN}_\Phi$ optimizes a real-valued cluster assignment matrix $Q_\Phi(y_i \mid X, A)$, in matrix form:
$$\max_\Phi \; \frac{1}{2m}\,\mathrm{Tr}\!\Big(C^\top A\, C - \frac{C^\top d\, d^\top C}{2m}\Big), \quad C = Q_\Phi(\cdot \mid X, A).$$
The optimized cluster distribution is the soft cluster membership output by the GNN in Equation 4. Given the cluster distribution $Q_\Phi^t(c_j \mid x_j)$ on the target graph, we are interested in its correlation with the true $P^t(y_i^t \mid x_i^t)$. Here we assume a good clustering algorithm works consistently well on source and target with respect to graph homophily, that is, if $Q_\Phi^s(c_j \mid x_j) \approx P^s(y_i^s \mid x_i^s)$ on the source graph, then $Q_\Phi^t(c_j \mid x_j) \approx P^t(y_i^t \mid x_i^t)$ on the target graph. To this end, we first need to build a mapping from the graph clusters $C$ to the class labels $Y$ using the source graph.
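Before turning to that mapping, here is a concrete reference for the relaxed modularity objective above, implemented in PyTorch in the spirit of the soft-assignment relaxation of [23]. The function name and the dense-adjacency formulation are illustrative assumptions; the authors' exact objective (e.g., any additional regularization against cluster collapse) may differ.

```python
import torch

def soft_modularity_loss(assign_logits: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Negative (relaxed) modularity for a soft cluster assignment.

    assign_logits: [n, K] cluster logits from the clustering GNN (Q_Phi).
    A:             [n, n] dense adjacency matrix of the (target) graph.
    Minimizing this loss maximizes modularity, i.e., it encourages densely
    connected nodes to share a cluster.
    """
    C = torch.softmax(assign_logits, dim=-1)          # soft membership matrix, rows sum to 1
    d = A.sum(dim=1, keepdim=True)                    # node degrees, [n, 1]
    two_m = A.sum().clamp(min=1.0)                    # 2m, the total degree
    # Tr(C^T A C): intra-cluster edge mass under the soft assignment.
    intra = torch.trace(C.t() @ A @ C)
    # Tr(C^T d d^T C) / (2m): expected intra-cluster edge mass in a random graph
    # with the same degree sequence (the null model of Newman's modularity).
    expected = torch.trace(C.t() @ d @ d.t() @ C) / two_m
    modularity = (intra - expected) / two_m
    return -modularity

# Usage: loss_cluster = soft_modularity_loss(Q_phi(A_norm_t, X_t), A_t); loss_cluster.backward()
```

Descending this loss drives the soft assignments toward densely connected groups, which is exactly the homophily signal SRNC relies on when drawing pseudo labels on the target graph.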
The number of clusters $K$ is set to be no smaller than the number of classes $N$ ($K \geq N$). Thus, we can build a bipartite mapping between the clusters $\{c_k\}_{k=1}^{K}$ and the classes $\{y_n\}_{n=1}^{N}$. We use the KL divergence to measure the distance between cluster $c_k$ and class $y_n$ over the source data. Specifically, we search for the optimal mapping $M \in \{0,1\}^{K \times N}$ by solving a linear sum assignment problem [12] over this cost:
$$\min_M \sum_k \sum_n T_{k,n}\, M_{k,n},$$
where $T_{k,n}$ is the divergence between cluster $c_k$ and class $y_n$ on the source data. Thus we can map some clusters to classes, i.e., $C_L \to Y$, $C_L \subseteq C$, $|C_L| = N$. After this step, the clusters on the source graph are "colored" in Figure 2. In order to achieve a similar class distribution on the source data ($Q_\Phi^s(c_j \mid x_j) \approx P^s(y_i^s \mid x_i^s)$), we add an adversarial regularization term on clustering,
$$L_{\Phi,\theta,S} = \frac{1}{B}\sum_{i=1}^{B} \mathrm{KL}\big(P^s(y_i^s \mid x_i^s, A^s)\,\big\|\, Q_\Phi(c \mid x_i^s, A^s)\big). \quad (8)$$
If the labels are all known in the source data, we randomly sample $B$ nodes and $P^s(y_i^s \mid x_i^s, A^s)$ is a one-hot vector; otherwise (semi-supervised), we use the current classifier's inference probability on the source data, $P_\theta(y \mid x_i^s, A^s)$. Following Equation 3, the overall loss for classification is
$$L_\theta = L_{\theta,S} + L_{\theta,\Phi,T} + \alpha\, L_{\theta,\Phi,S}, \quad (9)$$
where the last term (adversarial samples $(x_i^s, y_i^s)$ on the unlabeled portion of the training graph) is added to ensure training convergence. We set $\alpha = 1$ for semi-supervised and $\alpha = 0$ for fully supervised source data. Similar to traditional domain adaptation [1], unlabeled target data are involved through $L_{\theta,\Phi,T}$. Likewise, the overall loss for adversarial clustering includes both source and target data,
$$L_\Phi = L_{\Phi,T} + L_{\Phi,\theta,S}. \quad (10)$$

Joint Optimization. As the loss functions show, each module requires inference results (with frozen parameters) from the other. We therefore initialize both models with $L_{\theta,S}$ and $L_{\Phi,T}$, that is, pre-training the classifier on the labeled source graph and the clustering on the unlabeled target graph. We then iteratively train the classification module and the clustering module with respect to $L_\theta$ and $L_\Phi$. In $L_{\theta,\Phi,T}$, we sample the same number of nodes as the labeled source data $D^s$ from the target graph. In the adversarial regularization loss $L_{\Phi,\theta,S}$, we train $\mathrm{GNN}_\Phi$ for $T$ steps and sample $B$ nodes at each step. Algorithm 1 summarizes the joint optimization.

Convergence. Using variational EM [18], we now discuss the convergence of graph adversarial clustering. For simplicity, we denote $G = (X^s, A^s)$ as the inputs on the source data. Essentially, we optimize the evidence lower bound (ELBO) of the log-likelihood (Equation 11). In the $t$-th E-step, regarding Equation 11, the optimal variational distribution is $Q_\Phi^{(t+1)}(y \mid G) = P_\theta^{(t)}(y \mid G)$; in practice, we approximate this update by sampling $B$ nodes on the source graph, which is exactly the adversarial regularization term (Equation 8). In the $t$-th M-step, we update the classifier's output distribution on the source, $P_\theta^{(t+1)}$, using $Q_\Phi^{(t+1)}(y \mid G)$: to estimate the expectation term in Equation 11, we sample $(x_j^s, y_j^s) \sim Q_\Phi$ to compute the log-likelihood and update the classifier parameters $\theta$, given the alignment between clusters and classes in Section 4.2. Interestingly, the same sampling process on the target yields $L_{\theta,\Phi,T}$, and we add $L_{\theta,\Phi,S}$ to the final classifier loss to accomplish the M-step. Therefore, at each episode, the joint optimization is equivalent to performing variational EM alternately. Convergence is proven in the original paper [18].
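To make the alternation concrete, the sketch below shows one episode of the joint optimization in the style of Algorithm 1: clusters are first aligned to classes on the source graph with a linear sum assignment (Equation 7), the classifier is then updated on labeled source nodes plus pseudo-labeled target samples drawn from $Q_\Phi$ (the M-step), and the clustering GNN is finally updated on the target graph with the source-side regularizer (the E-step). This is a hedged sketch, not the released implementation: the `src`/`tgt` data containers, the cross-entropy-style construction of the divergence cost matrix, and the reuse of `soft_modularity_loss` from the previous sketch are all assumptions.

```python
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

def align_clusters_to_classes(q_src, y_src, num_classes):
    """Map clusters to classes on labeled source nodes (Eq. 7).
    q_src: [n_s, K] soft cluster assignments; y_src: [n_s] class labels.
    Returns {cluster -> class}; unmatched clusters are later treated as the unknown class."""
    one_hot = F.one_hot(y_src, num_classes).float()                 # [n_s, N]
    profile = q_src.t() @ one_hot                                   # [K, N] class mass per cluster
    profile = profile / profile.sum(dim=1, keepdim=True).clamp(min=1e-12)
    cost = -torch.log(profile.clamp(min=1e-12))                     # low cost when cluster ~ class
    rows, cols = linear_sum_assignment(cost.cpu().numpy())          # optimal bipartite matching
    return {int(k): int(c) for k, c in zip(rows, cols)}

def srnc_episode(P_theta, Q_phi, opt_theta, opt_phi, src, tgt,
                 num_classes, steps_T=50, batch_B=256):
    """One episode of the alternating optimization (a sketch of Algorithm 1).
    `src`/`tgt` are assumed to hold a normalized adjacency `A`, a raw adjacency `A_raw`,
    features `X`, labeled indices `idx`, and labels `y` (source only)."""
    unknown = num_classes                                           # index of the unknown class N+1
    with torch.no_grad():
        q_s = torch.softmax(Q_phi(src.A, src.X), dim=-1)
        c2y = align_clusters_to_classes(q_s[src.idx], src.y[src.idx], num_classes)
        y2c = {y: c for c, y in c2y.items()}
        # Adversarial target samples (x_j^t, y_j^t) ~ Q_Phi: uniform nodes, cluster-based pseudo labels.
        q_t = torch.softmax(Q_phi(tgt.A, tgt.X), dim=-1)
        sample = torch.randint(0, tgt.X.size(0), (len(src.idx),))
        clusters = torch.multinomial(q_t[sample], 1).squeeze(1)
        pseudo_y = torch.tensor([c2y.get(int(c), unknown) for c in clusters])
    # M-step: classifier on labeled source plus pseudo-labeled target (Eq. 3 / Eq. 9).
    loss_theta = F.cross_entropy(P_theta(src.A, src.X)[src.idx], src.y[src.idx]) \
               + F.cross_entropy(P_theta(tgt.A, tgt.X)[sample], pseudo_y)
    opt_theta.zero_grad(); loss_theta.backward(); opt_theta.step()
    # E-step: clustering GNN, modularity on target plus the source regularizer (Eq. 8 / Eq. 10).
    for _ in range(steps_T):
        batch = src.idx[torch.randint(0, len(src.idx), (batch_B,))]
        target_cluster = torch.tensor([y2c[int(y)] for y in src.y[batch]])
        log_q = torch.log_softmax(Q_phi(src.A, src.X), dim=-1)
        loss_phi = soft_modularity_loss(Q_phi(tgt.A, tgt.X), tgt.A_raw) \
                 + F.nll_loss(log_q[batch], target_cluster)
        opt_phi.zero_grad(); loss_phi.backward(); opt_phi.step()
```

Running such an episode repeatedly corresponds to the variational-EM alternation above; the number of episodes $E$ stays small in practice.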
Complexity. Compared with standard node classification, our extra computation comes from graph adversarial clustering; we analyze the additional cost of $Q_\Phi$ here. Let $O(\Phi)$ be the time $\mathrm{GNN}_\Phi$ takes to compute a single node embedding and $T$ the time to compute the cluster membership of each node. Assuming there are $|V^t|$ nodes in the target graph, the pre-training of clustering takes $O(|V^t| \cdot (\Phi + T))$. In addition, the adversarial regularization costs $O(2 \cdot E \cdot B \cdot \Phi)$, where $E$ is the number of episodes the algorithm takes to converge. Overall, the extra complexity is linear in the size of the target graph, which is reasonable for domain adaptation.

In this section, we empirically evaluate SRNC under both open-set and close-set shift.

Datasets. We perform node classification on four benchmark networks (see Table 2): Cora, Citeseer, PubMed [21], and ogb-arxiv [8]. We conduct open-set shift experiments on the first three datasets and close-set shift experiments on ogb-arxiv, because ogb-arxiv already exhibits close-set shift across different time periods.

Compared algorithms. We compare SRNC with strong baselines on open-set shift and close-set shift. Since many domain adaptation algorithms cannot deal with open-set shift, we apply the previous state-of-the-art open-set method, Progressive Graph Learning (PGL) [15], on different GNNs: GCN [10], GraphSAGE [6], SGC [28], GAT [24], and DGI [25]. On close-set shift, we report results of multiple close-set domain adaptation methods on two different GNNs, (1) GCN and (2) DGI [25]. These close-set domain adaptation methods are as follows:
1. DANN [4] minimizes the feature representation discrepancy with an adversarial domain classifier. We add the adversarial head at the last hidden layer (with activation) of the GCN and the MLP.
2. CMD [30] matches the means and higher-order moments between source and target in the latent space.
3. SRGNN [33] is a recent shift-robust algorithm for localized training data, which combines CMD and importance sampling to improve generalization on the target.
To provide a comprehensive study of SRNC, we design two ablations of our algorithm: SRNC w.o. Φ, which removes the clustering GNN and instead samples pseudo unknown nodes from low-confidence predictions, and SRNC Ep.1, which runs a single optimization episode. Notice that under open-set shift, the target data $D^t$ has one more class than the training data: the unknown class ($N_{\text{test}} = N_{\text{train}} + 1$).

Parameter Settings and Reproducibility. In SRNC, we use GCN [10] and add a self-loop to every node in the graph. We use the DGL [27] implementations of the different GNN architectures. The Adam [9] optimizer is used for training, with learning rate 0.01 and weight decay 0.0005. The hidden size of all GNN models, including SRNC, is 256. We set the number of clusters to 16 in the cluster $\mathrm{GNN}_\Phi$ for the open-set experiments and equal to the number of classes in the close-set experiments. Note that none of Cora, Citeseer, and PubMed has 16 classes, so this choice does not favor our method and keeps the comparison fair. We create synthetic open-set shift by removing three classes on Cora/Citeseer and one class on PubMed from the training data. In testing, nodes from the masked classes are all re-labeled as the unknown class. For each known class, we randomly sample 20 nodes for training and report the mean and standard deviation of micro-F1 and macro-F1 over 10 runs. In addition, we report the performance of a GCN with full class visibility, GCN(IID), in Table 1, together with the relative performance drop of the other methods in micro-F1 (∆F1). The validation set contains nodes from the unknown class and is only used to select the best hyperparameters for both PGL [15] and SRNC. For example, in PGL, we use the validation set to pick the best threshold $\alpha$ among episodes $\{\alpha_k\}$ and label nodes with lower probabilities as the unknown class. SRNC has an explicit class for the unknown, so we use the micro-F1 on validation to select the batch size $B$ and the step size $T$ for $\mathrm{GNN}_\Phi$.
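For reproducibility of the open-set setup described above, the following sketch shows one way to construct the synthetic split: mask a few classes, relabel them as a single unknown class for evaluation, and sample 20 labeled nodes per known class. The helper name and the re-indexing scheme are illustrative assumptions rather than the authors' exact protocol.

```python
import torch

def make_open_set_split(y: torch.Tensor, masked_classes, num_train_per_class: int = 20):
    """Build a synthetic open-set shift from a fully labeled graph.

    y: [n] original integer labels.  masked_classes: classes removed from training.
    Returns (y_openset, train_idx, unknown_id); masked classes are merged into one
    unknown class that appears only at test time."""
    num_orig = int(y.max()) + 1
    known = [c for c in range(num_orig) if c not in set(masked_classes)]
    remap = torch.full((num_orig,), len(known), dtype=torch.long)   # default target: unknown class
    for new_id, c in enumerate(known):
        remap[c] = new_id                                            # re-index known classes 0..N-1
    y_openset = remap[y]
    unknown_id = len(known)
    # Sample 20 labeled training nodes per *known* class; unknown nodes are never labeled.
    train_parts = []
    for c in range(len(known)):
        idx_c = torch.nonzero(y_openset == c).squeeze(1)
        perm = torch.randperm(idx_c.numel())[:num_train_per_class]
        train_parts.append(idx_c[perm])
    train_idx = torch.cat(train_parts)
    return y_openset, train_idx, unknown_id

# Example (hypothetical class ids): mask three classes on a Cora-like label vector.
# y_open, train_idx, unk = make_open_set_split(y, masked_classes=[4, 5, 6])
```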
On average, SRNC outperforms all the other representative GNN+PGL combinations by at least 4% in micro-F1 and 2% in macro-F1. Among the baselines, the most competitive is GraphSAGE+PGL, probably because self-information is most sensitive to distributional shifts and GraphSAGE specifically promotes the node's own information. PubMed is the largest graph among the three datasets, and its open-set shift is larger. The experiments also show that end-to-end supervised GNNs (GCN) are more robust than unsupervised GNNs (DGI) when there is open-set shift. From the comparison between our own ablations, we find that the graph adversarial clustering (see SRNC w.o. Φ) contributes the most to the performance gains. This ablation randomly samples pseudo unknown nodes from low-confidence predictions; in other words, drawing samples from unseen clusters is the key to better open-set generalization. Moreover, the iterative optimization brings a further improvement (1-3% F1) on top of the SRNC Ep.1 ablation, since variational EM takes several rounds to converge in practice.

On ogb-arxiv, we split the train, validation, and test nodes by year. In total there are 12,974 papers before 2011 (train), 28,151 (validation), 28,374 (2014-2016), 51,241 (2016-2018), and 48,603 (2018-2020). In testing, each test graph contains all previous nodes, so the training graph is a subset of every test graph. This dynamic-graph setting is close to real-world applications, where a model is trained on an earlier snapshot (source) and deployed on real-time graphs (target). First, we ask whether unsupervised representation learning is more robust to distribution shifts. We choose DGI as the unsupervised GNN approach and apply a DGI encoder optimized on the test graph to obtain training and testing node embeddings. From Table 3, we observe that the performance of DGI is worse than GCN, which is consistent with the open-set results: unsupervised GNNs are more vulnerable to distribution shifts.

Parameter Sensitivity. Among all the hyper-parameters, the pre-defined number of clusters $|C|$ seems to be the most crucial in the open-set scenario. We therefore report the performance of SRNC with varied numbers of clusters, i.e., 7 (optimal) vs. 16 (used in the experiments), in Table 4. Both micro-F1 and macro-F1 are further boosted when $|C| = 7$, since seven is the ground-truth number of classes. On the one hand, SRNC is not sensitive to the choice of $|C|$ and is better than the baselines regardless, especially when there are multiple missing classes (closer to the real-world setting); on the other hand, an accurate estimate of $|C|$ can facilitate model training.

Use Case. Our model places no specific requirement on the classification GNN architecture. Hence, when unseen classes or substantial close-set shifts are expected in testing, our graph adversarial clustering can be plugged in and jointly optimized with a convergence guarantee. The clustering output can help find potential unseen classes, and we provide a case study on the DBLP academic network in the Appendix.

In this paper, we proposed a general framework, SRNC, to make graph neural networks robust to distributional shift. Different from existing work on either open-set shift or close-set shift, SRNC works for both scenarios with a novel graph adversarial clustering component. Accordingly, SRNC employs a latent variable model (i.e., the cluster GNN) to capture the latent structure and allows the clustering GNN to improve the generalization of the shift-robust classification GNN.
Future work includes applying the model to applications such as medical diagnosis and molecular dynamics, where robustness to distribution shift is in critical need.

In SRNC, the adversarial regularization term on clustering (Equation 8) also calibrates the cluster $\mathrm{GNN}_\Phi$. The quality of the clustering is a key step towards successful open-set classification. Therefore, in Table 5 we report the clustering performance on two datasets against Constrained-KMeans with different node embeddings. Three classes are masked during the training of SRNC. The superior performance of SRNC on ACC and NMI validates that the adversarial regularization on clustering also helps generalization to unseen labels. In particular, without $L_{\Phi,\theta,S}$, our cluster GNN (modularity-based) reports numbers similar to DGI, as both methods are GNN-based.

Dataset. DBLP is a computer science bibliography website. We construct a heterogeneous network with three types of nodes: (A)uthor, (P)aper, and (T)erm. There are five types of papers labeled by the corresponding venues²: Data Mining (DM), Database (DB), Artificial Intelligence (AI), Natural Language Processing (NLP), and Computer Vision (CV). We sampled 50,000 papers between 2010 and 2015, then used the venue information to label 5,567 of them into 5 classes. After filtering out rare authors and terms, there are 20,601 papers, 5,158 authors, and 2,000 terms in total. In DBLP, more than 70% of the roughly 20,000 papers are instances of unseen classes; in other words, the majority of nodes in the DBLP graph belong to unseen classes.

²DM: SIGKDD, ICDM; DB: SIGMOD, VLDB; NLP: ACL, EMNLP, NAACL; AI: AAAI, IJCAI, ICML; CV: ECCV, ICCV, CVPR.

Unseen Class Discovery. In the main paper, we describe the use case of SRNC to discover new classes and facilitate the design of the label space. In Table 6, we show the papers in the DBLP dataset that are assigned to the (unmatched) clusters discovered by our cluster GNN $Q_\Phi$ and by Constrained-KMeans on DGI embeddings. Interestingly, we find two shared clusters, about "Computational Mathematics" and "Networks & Distributed Systems", from the two algorithms. The results show that ad-hoc unsupervised clustering on node embeddings produces erroneous cluster memberships even for the top-ranked items. For example, almost half of the papers in its "Networks & Distributed Systems" cluster come from computer architecture, whereas the unseen classes suggested by our adversarially regularized clustering are much more consistent.

Effectiveness of iterative optimization. In Figure 4, we demonstrate the learning procedure on the DBLP dataset. Each column shows the predicted labels from the classification GNN (upper) and the cluster assignments from the cluster GNN (lower). We visualize the node embeddings from the cluster GNN in two dimensions using t-SNE. In this example, we set the number of clusters to six. In Figures 4a and 4d, the pre-trained classification GNN predicts all papers into the 5 seen classes, while our cluster GNN shows 6 initial clusters. During training, the clusters in Figure 4e gradually align with the classifier's predictions (shown in the same color), and the classifier in Figure 4b starts to recognize nodes of the unseen class using samples from the cluster corresponding to the unseen class (yellow).

In Section 5.3 of the main paper, different algorithms suffer from close-set shift between the training and testing-time graphs. Although it would be hard to directly demonstrate the conditional shift on $\Pr(Y \mid X)$, we provide the label distribution over the 40 classes in Figure 3.
As we can observe, the distribution changes dramatically between train/val and the three test splits. Thus, a shift-robust classification algorithm is in great need for real-world node classification.

Table 6: Top paper titles from unmatched clusters on the DBLP dataset. We choose the two most similar clusters for comparison; (x) indicates a paper that is not consistent with the area of the other papers in the same cluster.
[Table 6 layout: for each unseen cluster ("Computational Mathematics" and "Networks & Distributed Systems"), two columns of top-ranked paper titles, one from the Cluster GNN and one from Constrained-KMeans (DGI).]

References
[1] A theory of learning from different domains.
[2] Towards open set deep networks.
[3] Spectral clustering with graph neural networks for graph pooling.
[4] Domain-adversarial training of neural networks. The Journal of Machine Learning Research.
[5] Generalization and representational limits of graph neural networks.
[6] Inductive representation learning on large graphs.
[7] A baseline for detecting misclassified and out-of-distribution examples in neural networks.
[8] Open graph benchmark: Datasets for machine learning on graphs.
[9] Adam: A method for stochastic optimization.
[10] Semi-supervised classification with graph convolutional networks.
[11] Variational graph autoencoders.
[12] The Hungarian method for the assignment problem.
[13] Learning transferable features with deep adaptation networks. ICML'15.
[14] Deep transfer learning with joint adaptation networks.
[15] Progressive graph learning for open-set domain adaptation.
[16] Subgroup generalization and fairness of graph neural networks.
[17] Domain adaptation: Learning bounds and algorithms.
[18] A view of the EM algorithm that justifies incremental, sparse, and other variants.
[19] Modularity and community structure in networks.
[20] Probability models for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] Collective classification in network data.
[22] Deep open classification of text documents.
[23] Graph clustering with graph neural networks.
[24] Graph attention networks.
[25] Deep graph infomax.
[26] Stability and generalization of graph convolutional neural networks.
[27] Deep graph library: Towards efficient and scalable deep learning on graphs.
[28] Simplifying graph convolutional networks.
[29] OpenWGL: Open-world graph learning.
[30] Central moment discrepancy (CMD) for domain-invariant representation learning.
[31] On learning invariant representations for domain adaptation.
[32] Beyond homophily in graph neural networks: Current limitations and effective designs.