title: Generating Synthetic Mobility Networks with Generative Adversarial Networks
authors: Mauro, Giovanni; Luca, Massimiliano; Longa, Antonio; Lepri, Bruno; Pappalardo, Luca
date: 2022-02-22

The increasingly crucial role of human displacements in complex societal phenomena, such as traffic congestion, segregation, and the diffusion of epidemics, is attracting the interest of scientists from several disciplines. In this article, we address mobility network generation, i.e., generating a city's entire mobility network, a weighted directed graph in which nodes are geographic locations and weighted edges represent people's movements between those locations, thus describing the entire set of mobility flows within a city. Our solution is MoGAN, a model based on Generative Adversarial Networks (GANs) to generate realistic mobility networks. We conduct extensive experiments on public datasets of bike and taxi rides to show that MoGAN outperforms the classical Gravity and Radiation models regarding the realism of the generated networks. Our model can be used for data augmentation and for performing simulations and what-if analysis.

The increasing complexity of urban environments [1, 2] and the crucial role played by human displacements in the diffusion of epidemics, not least the COVID-19 pandemic [3, 4, 5, 6, 7, 8], have created a great deal of interest around the study of individual and collective human mobility [9, 10, 11]. The prevention of detrimental collective phenomena such as traffic congestion, air pollution, segregation, and the spread of epidemics, which is crucial to make our cities inclusive, safe, resilient, and sustainable [12, 13, 14], depends on how accurately we can predict and simulate people's movements within an urban environment.
In this regard, a particularly challenging task is generating realistic mobility flows, i.e., flows of people among a set of geographic locations given their demographic and geographic characteristics (e.g., population and distance) [9, 15, 11, 10, 16]. Traditionally, flow generation is addressed through the Gravity model [17, 18, 19, 11, 20], the Radiation model [21, 11, 10], and their variants [15, 11, 22, 23]. The Gravity model assumes that the number of travelers between two locations (flow) increases with the locations' populations while decreasing with the distance between them. The Radiation model is a parameter-free model that only requires information about geographic locations (e.g., population) and their intervening opportunities. The Gravity and the Radiation models are designed to generate single flows between pairs of locations and are typically used to complete a network in which some mobility flows are missing.

In this paper, we address mobility network generation, a variation of flow generation that consists of generating a city's entire mobility network. A mobility network is a weighted directed graph in which nodes are geographic locations and weighted edges represent people's movements between those locations, thus describing the entire set of mobility flows within a city. Our solution to mobility network generation, MoGAN (Mobility Generative Adversarial Network), is based on Generative Adversarial Networks (GANs) [24], deep learning architectures composed of a discriminator, which maximizes the probability of correctly classifying real and artificial mobility networks, and a generator, which maximizes the probability of fooling the discriminator by producing artificial mobility networks that the discriminator classifies as real.
The choice of GANs is motivated by the fact that mobility networks can be represented as weighted adjacency matrices, similarly to how images are typically represented, and by the effectiveness of GANs in generating realistic images [24, 25, 26, 27]. While several papers show that GANs can generate individual mobility trajectories [9, 28, 29, 30, 31, 32, 33, 34] with a realism comparable to or better than mechanistic mobility models [35, 36, 37, 11], to what extent GANs can generate realistic mobility flows has never been explored in the literature. We train MoGAN on a set of real mobility networks and develop a tailored evaluation methodology to test the model's effectiveness in generating realistic mobility networks. We conduct extensive experiments on four public mobility datasets, describing flows of bikes and taxis in New York City and Chicago, US, to demonstrate that MoGAN generates synthetic mobility networks that are considerably more realistic than those generated by the Gravity and the Radiation models. Our results show that our model can synthesize aggregated movements within a city into a realistic generator, which can be used for data augmentation and for performing simulations and what-if analysis.

Mobility network generation consists of generating a realistic mobility network, i.e., a weighted directed graph in which nodes are locations and edges represent flows between those locations. The locations are obtained by discretizing the geographic space with a spatial tessellation, i.e., a covering of the bi-dimensional space using a countable number of geometric shapes called tiles, with no overlaps and no gaps [9]. In mobility networks, nodes are tiles of the spatial tessellation and edges are flows of people among these tiles.
Formally, we define a mobility network as a weighted directed graph G = (V, E, w), where:
• V is the set of nodes, i.e., tiles of the spatial tessellation;
• w : V × V → N is a function that assigns to each pair of nodes the number of people moving between the two nodes (mobility flow);
• E ⊆ V × V is the set of the weighted directed edges in the network.

A mobility network may contain self-loops (edges in which the origin and destination coincide), which describe movements of people within the same tile. Here, we represent a mobility network as a weighted adjacency matrix A of size n × n, with n = |V|. Thus, an element a_ij ∈ A represents the number of people moving from node i to node j, with i, j ∈ V.

A generative model of mobility networks M is any algorithm able to generate a set of n synthetic mobility networks X_M = {Ĝ_1, . . . , Ĝ_n}, which describe the set of mobility flows on a given spatial tessellation. The realism of M is evaluated with respect to:
1. A set of network patterns K = {s_1, . . . , s_m} that describe some statistical properties of mobility networks. A realistic set X_M of synthetic mobility networks is expected to reproduce as many of these mobility patterns as possible.

To solve the problem of mobility network generation, we design MoGAN (Mobility Generative Adversarial Network), a deep learning architecture based on Deep Convolutional Generative Adversarial Networks (DCGANs) [27]. MoGAN consists of a generator G, which learns how to produce new synthetic mobility networks, and a discriminator D, which has the task of distinguishing between real and fake (artificial) mobility networks. G and D are trained in an adversarial manner: D maximizes the probability of correctly classifying real and fake mobility networks; G maximizes the probability of fooling D, i.e., of producing fake mobility networks that D classifies as real. Both D and G are Convolutional Neural Networks (CNNs), which have been shown to be effective in capturing spatial patterns in the data [9].
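For reference, the adversarial game described above is commonly written as the minimax objective of the original GAN formulation [24]. The notation below is an assumption for illustration: x denotes a real 64 × 64 adjacency matrix drawn from the training data, z the noise vector, and p_data and p_z their distributions. The text here does not state which exact variant of this objective MoGAN optimizes.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

D pushes D(x) toward 1 on real networks and D(G(z)) toward 0 on generated ones, while G pushes D(G(z)) toward 1.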
During the training phase, G repeatedly takes a 1 × 100 noise vector as input and applies a series of transposed convolutions, which upsample the input vector to generate a 64 × 64 adjacency matrix representing a mobility network. Then, D takes a set of real and generated 64 × 64 matrices as input and performs a binary classification task to classify these matrices as real or fake. The above process is repeated for a certain number of epochs and stopped when some criteria are met (see Supplementary Note 1). Once MoGAN is trained, G can be used to generate as many mobility networks as desired. A visual representation of the networks generated during the training phase can be found in Supplementary Note 2.

Figure 1: The generator (a Convolutional Neural Network or CNN) performs transposed convolution operations that upsample the input random noise vector, transforming it into a 64 × 64 adjacency matrix representing a mobility network. The discriminator (a CNN) takes as input both the generated mobility networks and the real ones from the training set and performs a series of convolutional operations that end with a probability, for each sample, of being fake or real. The classification error is then backpropagated to update the weights of both the discriminator and the generator.

We compare MoGAN with the Gravity and the Radiation models, two classical approaches for the generation of mobility flows [11, 9, 15, 21]. The singly-constrained Gravity model [17, 18, 19, 11] prescribes that the expected flow, ȳ, between an origin location l_i and a destination location l_j is generated according to the following equation:

$$\bar{y}(l_i, l_j) = O_i \, p_{ij} = O_i \, \frac{m_j^{\beta_1} f(r_{ij})}{\sum_{k} m_k^{\beta_1} f(r_{ik})} \qquad (1)$$

where O_i is the number of people leaving location l_i, m_j is the population of location l_j, p_ij is the probability of observing a trip (unit flow) from location l_i to location l_j, β_1 is a parameter, and f(r_ij) is the deterrence function, which is a function of the distance r_ij between the two locations.
Typically, the deterrence function f(r_ij) can be either an exponential, f(r) = e^{β_2 r}, or a power-law function, f(r) = r^{β_2}, where β_2 is another parameter. These parameters can be fitted from a subset of available flows.

The Radiation model [21, 11] is a parameter-free model that aims to generate flows between locations given their characteristics (e.g., population) and the intervening opportunities among them. The choice of the destination consists of two steps: (i) each opportunity in every location is assigned a fitness z, sampled from a distribution p(z), that represents the quality of the opportunity for the traveler; (ii) the traveler ranks the opportunities according to their distance from the origin location and chooses the nearest location with a fitness higher than a certain threshold. As a result, the mean flow between two locations l_i and l_j is calculated as:

$$\bar{y}(l_i, l_j) = O_i \, \frac{1}{1 - \frac{m_i}{M}} \, \frac{m_i m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})} \qquad (2)$$

where O_i is the number of people leaving location l_i, m_i and m_j are the opportunities in l_i and l_j, M is the sum of all the opportunities, and s_ij is the number of opportunities in a circle of radius r_ij.

Note that the Gravity and the Radiation models do not solve mobility network generation directly. While MoGAN, once trained, can generate an entire mobility network, the Gravity and the Radiation models are designed to generate single flows between pairs of locations. To generate a mobility network using the Gravity and the Radiation models, we proceed as follows: (i) we take a real mobility network; (ii) for each node, we compute its relevance m_i and total outflow O_i; and (iii) we use m_i and O_i in Equations 1 and 2. For the Gravity model, we also fit the parameters β_1 and β_2 from the real mobility network, assuming a power-law deterrence function. For both the Gravity and Radiation models, we use the implementations available in the library scikit-mobility [38], which provides methods to fit parameters and generate flows from locations' relevance and outflow.
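The two baselines can be sketched directly in NumPy to make the role of relevance, outflow, and distance concrete. This is an illustrative sketch, not the scikit-mobility implementation used in the paper: the function names, the default values of beta1 and beta2, the zero weight given to self-flows, and the convention that s_ij excludes the origin and destination opportunities are all assumptions.

```python
import numpy as np

def gravity_flows(relevance, outflow, dist, beta1=1.0, beta2=-2.0):
    """Singly-constrained Gravity model with power-law deterrence f(r) = r**beta2."""
    m = np.asarray(relevance, dtype=float)
    O = np.asarray(outflow, dtype=float)
    r = np.asarray(dist, dtype=float)
    # Deterrence function; pairs at distance 0 (self-flows) get zero weight here.
    f = np.zeros_like(r)
    np.power(r, beta2, out=f, where=r > 0)
    scores = (m ** beta1)[None, :] * f               # m_j^beta1 * f(r_ij)
    p = scores / scores.sum(axis=1, keepdims=True)   # p_ij, each row sums to 1
    return O[:, None] * p                            # expected flow O_i * p_ij

def radiation_flows(relevance, outflow, dist):
    """Radiation model with the finite-size normalization 1 / (1 - m_i / M)."""
    m = np.asarray(relevance, dtype=float)
    O = np.asarray(outflow, dtype=float)
    r = np.asarray(dist, dtype=float)
    n, M = len(m), m.sum()
    flows = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Opportunities within radius r_ij of l_i, minus origin and destination.
            s_ij = m[r[i] <= r[i, j]].sum() - m[i] - m[j]
            flows[i, j] = (O[i] / (1.0 - m[i] / M)) * m[i] * m[j] / (
                (m[i] + s_ij) * (m[i] + m[j] + s_ij))
    return flows

# Toy example: four locations on a line (hypothetical values).
x = np.array([0.0, 1.0, 3.0, 7.0])
dist = np.abs(x[:, None] - x[None, :])
relevance = np.array([10.0, 20.0, 30.0, 40.0])
outflow = np.array([5.0, 8.0, 6.0, 9.0])
grav = gravity_flows(relevance, outflow, dist)
rad = radiation_flows(relevance, outflow, dist)
```

In both sketches the flows leaving each location add up to its outflow O_i, which is what "singly-constrained" means for the Gravity model and what the normalization term provides for the Radiation model when distances from each origin are distinct.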
To train and test MoGAN and the baselines, we use four real-world public datasets, which describe trips with taxis and bikes in New York City and Chicago during 2018 and 2019 (730 days). Two datasets contain daily information regarding the use of bike-sharing services: the Citi Bike Dataset for New York City [39] and the Divvy Bike Dataset for Chicago [40]. Each record describes the coordinates of a ride's starting and ending stations, and the starting and ending times. We remove trips with a duration lower than 60 seconds because they could be false starts or users trying to re-dock a bike to ensure it is secure [39, 40].

We also use two datasets containing daily information about the movements of taxis: the New York City taxi dataset [41] and the Chicago taxi dataset [42]. Each record describes a ride's starting and ending locations and the starting and ending times. Both datasets are already preprocessed to remove dummy and noisy rides. In the Chicago taxi dataset, we know the GPS points corresponding to the starting and ending points of each taxi trajectory. In the New York City taxi dataset, we only know the trajectories' starting and ending zones, i.e., administrative areas in New York City. We use an administrative area's centroid as a taxi ride's reference starting or ending point.

We select the island of Manhattan for New York City and the central districts for Chicago (see Supplementary Figure S3) and split the selected zones into 64 equally sized square tiles (1840 meters per side for New York City, 1405 meters per side for Chicago). For each dataset, we count the daily number of taxis or bikes moving between each pair of tiles to obtain an origin-destination matrix representing the daily mobility network. We compute the relevance of each location (tile), which is needed for generating flows in the Gravity and the Radiation models, as the total number of daily drop-offs in that location. Table 1 shows some statistics about the datasets used in our study.
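The construction of a daily origin-destination matrix from trip records can be sketched as follows. This is a minimal illustration under stated assumptions: the 64 tiles are arranged as an 8 × 8 regular grid over a bounding box expressed in the same units as the coordinates, points outside the box are clipped to the border tiles, and the function names and toy trips are hypothetical.

```python
import numpy as np

def tile_index(x, y, bbox, grid=8):
    """Map coordinates to a tile id in 0..grid*grid-1 on a regular grid over bbox."""
    min_x, min_y, max_x, max_y = bbox
    col = np.clip(((x - min_x) / (max_x - min_x) * grid).astype(int), 0, grid - 1)
    row = np.clip(((y - min_y) / (max_y - min_y) * grid).astype(int), 0, grid - 1)
    return row * grid + col

def od_matrix(origin_tiles, destination_tiles, n_tiles=64):
    """Count trips between every pair of tiles (self-loops included)."""
    A = np.zeros((n_tiles, n_tiles), dtype=np.int64)
    # np.add.at accumulates correctly when the same (origin, destination) pair repeats.
    np.add.at(A, (origin_tiles, destination_tiles), 1)
    return A

# Three toy trips in an 8 x 8 unit box: two from tile 0 to tile 63, one within tile 0.
bbox = (0.0, 0.0, 8.0, 8.0)
start_x, start_y = np.array([0.5, 0.5, 0.5]), np.array([0.5, 0.5, 0.5])
end_x, end_y = np.array([7.5, 7.5, 0.6]), np.array([7.5, 7.5, 0.6])

A = od_matrix(tile_index(start_x, start_y, bbox), tile_index(end_x, end_y, bbox))
relevance = A.sum(axis=0)   # daily drop-offs per tile, as used by the baselines
outflow = A.sum(axis=1)     # daily departures per tile
```

Building one such matrix per day yields the 730 daily mobility networks per dataset described above.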
As an example, Figure 2

We develop a tailored approach to evaluate the realism of the mobility networks generated by MoGAN. For each dataset, we construct a mobility network for each day, obtaining 730 real mobility networks in total. We split the 730 networks into a training set (584 networks) and a test set (146 networks). We train MoGAN on the training set and generate 146 synthetic mobility networks (synthetic set). We evaluate the model's realism by computing the difference between each network in the synthetic set and each network in the test set, thus obtaining 146 × 146 = 21,316 values. If the generated mobility networks are realistic, they should differ from the real networks to the same extent that real networks differ among themselves. To stress this aspect, we create a set of 146 mobility networks (mixed set), in which half of the networks are chosen uniformly at random from the test set and the other half are chosen uniformly at random from the synthetic set. We then compute the pairwise difference between every possible pair of mobility networks in the mixed set.

A crucial aspect is how to compute the difference between two mobility networks, considering that directed weighted networks are hard to compare, even in the case of known node-correspondence (i.e., networks with the same nodes but different edges) [43]. We compute this difference in two ways. The first one consists of computing an error metric between two networks' adjacency matrices. In our experiments, we try three error metrics: RMSE, CPC, and CD. The Root Mean Square Error (RMSE) [9, 15] is defined as:

$$RMSE(A, B) = \sqrt{\frac{1}{n} \sum_{i,j} (a_{ij} - b_{ij})^2}$$

where a_ij and b_ij are the elements (flows) in position (i, j) of the two networks' adjacency matrices A and B, and n is the number of elements of the matrices (64 × 64). Note that RMSE is substantially equivalent to the Frobenius norm (see Supplementary Note 4).
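The evaluation scheme above can be sketched with a few helper functions. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, the toy networks are random Poisson matrices, and sampling without replacement when building the mixed set is an assumption (the text only says "uniformly at random").

```python
from itertools import combinations

import numpy as np

def rmse(A, B):
    """Root Mean Square Error between two adjacency matrices of equal shape."""
    return np.sqrt(np.mean((A - B) ** 2))

def cross_differences(set_a, set_b, metric):
    """Difference between each network in set_a and each network in set_b."""
    return np.array([metric(a, b) for a in set_a for b in set_b])

def pairwise_differences(nets, metric):
    """Difference between every unordered pair of networks in one set."""
    return np.array([metric(a, b) for a, b in combinations(nets, 2)])

def mixed_set(test, synthetic, rng):
    """Half of the networks drawn from the test set, half from the synthetic set."""
    half = len(test) // 2
    t_idx = rng.choice(len(test), size=half, replace=False)
    s_idx = rng.choice(len(synthetic), size=len(test) - half, replace=False)
    return [test[i] for i in t_idx] + [synthetic[i] for i in s_idx]

rng = np.random.default_rng(0)
test_nets = [rng.poisson(5.0, size=(64, 64)).astype(float) for _ in range(6)]
synth_nets = [rng.poisson(5.0, size=(64, 64)).astype(float) for _ in range(6)]

cross = cross_differences(synth_nets, test_nets, rmse)   # 6 x 6 = 36 values
mixed = mixed_set(test_nets, synth_nets, rng)
within_mixed = pairwise_differences(mixed, rmse)         # 6 choose 2 = 15 values

# RMSE is the Frobenius norm of the difference matrix divided by sqrt(n).
A, B = test_nets[0], synth_nets[0]
frobenius_scaled = np.linalg.norm(A - B) / np.sqrt(A.size)
```

With the paper's set sizes (146 test and 146 synthetic networks), `cross` would contain the 21,316 values mentioned above.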
The Common Part of Commuters (CPC), also known as the Sørensen-Dice index [11, 15, 44], a well-established measure of the similarity between real and generated flows, is defined as:

$$CPC(A, B) = \frac{2 \sum_{i,j} \min(a_{ij}, b_{ij})}{\sum_{i,j} a_{ij} + \sum_{i,j} b_{ij}}$$

The Cut Distance (CD) [45] is based on the notion of cut weight, widely used in network theory [43], and measures how much a network is bipartite. The cut norm ||A||_C of a real matrix A = (a_ij), i ∈ R, j ∈ S, with a set of rows indexed by R and a set of columns indexed by S, is the maximum over all I ⊂ R, J ⊂ S of the quantity |Σ_{i∈I, j∈J} a_ij|. The Cut Distance (CD) between two adjacency matrices A and B is the cut norm of their difference:

$$CD(A, B) = \max_{S \subset V} \frac{1}{|V|} \left| e_A(S, S^C) - e_B(S, S^C) \right|$$

with |V| being the number of nodes (64, in our case), e_G(S, T) = Σ_{i∈S, j∈T} w_ij being the cut weight of adjacency matrix G with weights w_ij, i.e., the sum of the weights of the edges that start in S and end in T, and S^C = V \ S [46]. Maximizing this quantity is a computationally heavy problem, so we use the Semidefinite Program (SDP) approximation proposed by Chan and Sun [47]. For calculating CD, we use the Python implementation available in the library cutnorm [48].

The second approach to computing the difference between two mobility networks consists of comparing their distributions of edge weights and weight-distances. Edge weights indicate the values (flows) of the adjacency matrices describing the two mobility networks. Weight-distances indicate the combination of an edge's weight (flow) and the distance between the two nodes composing the edge. We compute the weighted-distance adjacency matrix of a mobility network as A/(d + ε), where A is the network's weighted adjacency matrix and d is the distance matrix having the same dimension and node ordering as A and representing the geographic distances between all pairs of nodes. We add the residual term ε = 0.8 to the denominator to avoid dividing by zero, which occurs only for elements on the diagonal of the adjacency matrices.
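The CPC and the weighted-distance matrix are simple enough to sketch in NumPy. The function names are hypothetical, and applying the residual term to every element of the denominator (not just the diagonal) is an assumption based on the A/(d + ε) form given above.

```python
import numpy as np

def cpc(A, B):
    """Common Part of Commuters: 1 for identical flows, 0 for no overlap."""
    total = A.sum() + B.sum()
    if total == 0:
        return 0.0
    return 2.0 * np.minimum(A, B).sum() / total

def weight_distance(A, d, eps=0.8):
    """Element-wise flow divided by distance; eps avoids division by zero."""
    return A / (d + eps)

# Toy 2-node example (hypothetical values).
A = np.array([[2.0, 0.0], [1.0, 3.0]])
B = np.array([[1.0, 1.0], [1.0, 1.0]])
d = np.array([[0.0, 2.0], [2.0, 0.0]])

similarity = cpc(A, B)      # 2 * (1 + 0 + 1 + 1) / (6 + 4) = 0.6
W = weight_distance(A, d)   # the self-loop with flow 2 becomes 2 / 0.8 = 2.5
```

The Cut Distance is omitted here because it relies on the SDP approximation provided by the cutnorm package.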
Given two mobility networks, the more similar their distributions of edge weights or weight-distances are, the more similar the two mobility networks are. We measure the similarity between two distributions using the Jensen-Shannon divergence [49, 36]:

$$JS(P \,\|\, Q) = \frac{1}{2} KL(P \,\|\, M) + \frac{1}{2} KL(Q \,\|\, M)$$

where P and Q are two density distributions, M = ½ (P + Q), and KL is the Kullback-Leibler divergence (KL) [50, 51], defined as:

$$KL(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$

Results

Table 2 shows, for each model, the JS divergence between the CPC distributions of the mixed and test sets and the JS divergence between the CPC distributions of the synthetic and test sets. To compute the improvement in performance of MoGAN with respect to the Gravity and the Radiation models, for each metric, each set, and each baseline, we define the quantity:

$$\Delta = \frac{JS_{(baseline)} - JS_{(MoGAN)}}{JS_{(baseline)}}$$

where JS_(MoGAN) is the JS divergence between the set (synthetic or mixed) of networks generated by MoGAN and the test set, while JS_(baseline) is the JS divergence between the set (synthetic or mixed) of networks generated by the baselines (Gravity or Radiation) and the test set. Table 2 shows that, according to the CPC, MoGAN outperforms the other models on all datasets, with a relative improvement of up to 86% on the Gravity model and 91% on the Radiation model over the mixed set, and a relative improvement of up to 49% on the Gravity model and 37% on the Radiation model over the synthetic set. We report the results for RMSE, CD, weights distribution, and weight-distances distribution in Supplementary Notes 6, 7 and 8.

MoGAN's JS divergences between the mixed and test sets and between the synthetic and test sets are the lowest for each dataset, meaning that our model produces the most overlapping distributions (see Table 2). Our results also show that the difference (in terms of CD, CPC, or RMSE) between a real network and a synthetic one is similar to the difference between two real networks or two synthetic networks.
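The comparison of distributions through the Jensen-Shannon divergence can be sketched as below. This is a minimal sketch under assumptions: the two samples are binned on a shared histogram with an arbitrary number of bins, the logarithm is taken in base 2 so that the divergence lies in [0, 1], and the relative improvement is taken to be the reduction in JS divergence relative to the baseline. None of these choices is confirmed by the text.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence for discrete distributions (base-2 logarithm)."""
    mask = p > 0                      # terms with p(x) = 0 contribute nothing
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

def js_divergence(x, y, bins=50):
    """JS divergence between the empirical distributions of two samples."""
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)                 # m > 0 wherever p > 0 or q > 0
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def relative_improvement(js_model, js_baseline):
    """Assumed form of the improvement: reduction in JS relative to the baseline."""
    return (js_baseline - js_model) / js_baseline

same = js_divergence(np.array([0.0, 0.1, 0.2]), np.array([0.0, 0.1, 0.2]), bins=10)
disjoint = js_divergence(np.array([0.0, 0.1, 0.2]), np.array([0.8, 0.9, 1.0]), bins=10)
gain = relative_improvement(0.1, 0.5)
```

In the evaluation described above, `x` and `y` would be, for example, the CPC values computed within the test set and within the mixed set.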
These results indicate that MoGAN generates realistic mobility networks that are, to a certain extent, indistinguishable from real ones. Figure 5 shows the distributions of the pairwise similarities among the edge weights for the synthetic, mixed, and test sets built over the four datasets. For each dataset, we report the performance of MoGAN, the Gravity model, and the Radiation model. Again, MoGAN significantly outperforms the baselines, except for two cases (mixed set of NYC and CHI taxi) in which the Gravity model and MoGAN achieve similar performance. We find a similar result for the weight-distances (see Supplementary Note 7).

In this work, we presented MoGAN, a deep-learning-based model for generating realistic urban mobility networks. Our experiments, conducted on four public datasets representing flows of bikes and taxis in New York City and Chicago, show that the networks generated by MoGAN are more realistic than those generated by the Gravity and the Radiation models.

Although MoGAN's performance is encouraging, it has some limitations. Being based on Deep Convolutional Generative Adversarial Networks (DCGAN) [27], MoGAN can generate 64 × 64 adjacency matrices, that is, mobility networks with 64 locations and up to 4,096 flows. As a future improvement, we plan to extend MoGAN's architecture to generate mobility networks with an arbitrary number of nodes. Other technical improvements may be the use of Graph Neural Networks (GNNs) [52] for the generator and the discriminator, which would better capture the network dependencies and include other information related to the locations (e.g., population or relevance), and the use of the Wasserstein loss [53], which has been shown to improve GANs in several contexts [54, 55].
Finally, it would also be interesting to test MoGAN's effectiveness on cities of different sizes and shapes, as well as on the generation of individual mobility trajectories or individual mobility networks [56, 57, 58], which represent the aggregated movements of single individuals among a city's locations.

In the meantime, our study demonstrates the great potential of artificial intelligence to improve solutions to crucial problems in human mobility, such as the generation of realistic mobility networks. Our model can synthesize aggregated movements within a city into a realistic generator, which can be used for data augmentation and for performing simulations and what-if analysis. Given the flexibility of the training phase, our model can be easily extended to synthesize specific types of mobility, such as aggregated movements during workdays, weekends, specific periods of the year, or in the presence of pandemic-driven mobility restrictions, events, and natural disasters.

The code to train/test MoGAN and reproduce our analyses, and the links to the datasets used in our experiments, can be found at https://github.com/jonpappalord/GAN-flow.

We thank Daniele Fadda and Eleonora Cappuccio for the visualization suggestions. We also thank René Ferretti and Dante Milonga for the inspiration.

We modify the Deep Convolutional Generative Adversarial Network (DCGAN) model [27] to work over bi-dimensional matrices of dimension 64 × 64 as if they were gray-scale mono-channel images, but without constraining the values to be in the range [0, 255]. We use the Rectified Linear Unit (ReLU) as activation function. Consistently with the proposed architecture, a generator G takes a noise vector z as input and applies a series of transposed convolutions with fixed values for the parameters, in order to perform the upsampling operations and generate a 64 × 64 matrix. As in the original DCGAN architecture, each block of Upsample-Convolution-LeakyReLU of the model is followed by a batch normalization.
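A DCGAN-style generator that maps a 100-dimensional noise vector to a 64 × 64 single-channel matrix can be sketched in PyTorch as follows. This is not the authors' architecture: the number of channels per layer, the 4 × 4 kernels, the placement of batch normalization, and the final ReLU (chosen here so that the generated flows are non-negative and unbounded) are assumptions for illustration.

```python
import torch
from torch import nn

class Generator(nn.Module):
    """Sketch of a DCGAN-like generator: noise vector -> 1 x 64 x 64 matrix."""

    def __init__(self, latent_dim=100, base=64):
        super().__init__()

        def block(c_in, c_out, stride, padding):
            # Transposed convolution (upsampling) + batch norm + ReLU.
            return [
                nn.ConvTranspose2d(c_in, c_out, 4, stride, padding, bias=False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            ]

        self.net = nn.Sequential(
            *block(latent_dim, base * 8, 1, 0),   # 1 x 1   -> 4 x 4
            *block(base * 8, base * 4, 2, 1),     # 4 x 4   -> 8 x 8
            *block(base * 4, base * 2, 2, 1),     # 8 x 8   -> 16 x 16
            *block(base * 2, base, 2, 1),         # 16 x 16 -> 32 x 32
            nn.ConvTranspose2d(base, 1, 4, 2, 1, bias=False),  # 32 x 32 -> 64 x 64
            nn.ReLU(),                            # non-negative, unbounded output
        )

    def forward(self, z):
        # Reshape (batch, latent_dim) noise into (batch, latent_dim, 1, 1) "images".
        return self.net(z.view(z.size(0), -1, 1, 1))

generator = Generator(base=16)                    # small width to keep the demo light
sample = generator(torch.randn(4, 100))           # four synthetic adjacency matrices
```

Each row of `sample[k, 0]` can then be read as the outgoing flows of one of the 64 tiles.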
A discriminator D performs a binary classification task using a sigmoid activation function as its last layer. The convolutional process gradually reduces the size of the matrix, learning different peculiarities at each epoch. We train MoGAN using a batch size of 146 to create 584/146 = 4 equally sized batches and avoid an uncontrollable variance in both G and D. We fix the two parameters controlling the Adam optimizer, responsible for the backward optimization, to b1 = 0.5 and b2 = 0.999. We fix the latent dimension of the noise vector z to 100. We find no significant improvement when changing the values of the above parameters. The minimum number of epochs required for our model to perform at its best over the four datasets is 6000. We use PyTorch [59] for implementing MoGAN. An explanation of the training loop can be found in Supplementary Note V.

In detail, iterating through the epochs and, for each epoch, iterating through the minibatches, we:
• Define a vector of valid labels (ones) and a vector of synthetic labels (zeros) having the same dimension as the current minibatch.
• Set the gradients of G's weights to zero. We do this before starting the backpropagation, since PyTorch accumulates the gradients on subsequent backward passes.
• Generate a minibatch of synthetic images (output of the generator).
• Calculate G's loss as the loss between the output of D over the generated images and the valid labels vector.
• Backpropagate the error calculated with this loss and update G's weights.
• Set D's gradients to zero.
• Calculate D's loss as the average of two losses. The first one is the loss between the output of D over the real images of the minibatch and the valid labels. The second one is the loss between the output of D over the generated images and the synthetic labels.
• Backpropagate the error calculated with this loss and update D's weights.

Note that we use only one loss, both for G and D, as in [24].
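The steps listed above can be sketched as a PyTorch training loop. This is an illustrative sketch, not the code from the authors' repository: the binary cross-entropy loss, the learning rate of 2e-4, the function name, and the tiny stand-in networks used to keep the example self-contained are assumptions; only the Adam parameters (0.5, 0.999), the latent dimension of 100, and the order of the steps come from the text.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

def train_gan(G, D, real_nets, latent_dim=100, epochs=1, batch_size=146, lr=2e-4):
    bce = nn.BCELoss()  # the single loss function used for both G and D
    opt_g = torch.optim.Adam(G.parameters(), lr=lr, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.5, 0.999))
    loader = DataLoader(real_nets, batch_size=batch_size, shuffle=True)
    history = []
    for _ in range(epochs):
        for real in loader:
            # Valid (ones) and synthetic (zeros) labels for the current minibatch.
            valid = torch.ones(real.size(0), 1)
            fake = torch.zeros(real.size(0), 1)

            # Generator step: G is rewarded when D labels its output as valid.
            opt_g.zero_grad()
            generated = G(torch.randn(real.size(0), latent_dim))
            g_loss = bce(D(generated), valid)
            g_loss.backward()
            opt_g.step()

            # Discriminator step: average of the losses on real and generated data.
            opt_d.zero_grad()
            real_loss = bce(D(real), valid)
            fake_loss = bce(D(generated.detach()), fake)  # detach: do not update G
            d_loss = 0.5 * (real_loss + fake_loss)
            d_loss.backward()
            opt_d.step()

            history.append((g_loss.item(), d_loss.item()))
    return history

# Tiny stand-in networks and random data, only to show that the loop runs.
G = nn.Sequential(nn.Linear(100, 64 * 64), nn.ReLU(), nn.Unflatten(1, (1, 64, 64)))
D = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 1), nn.Sigmoid())
real_nets = torch.rand(8, 1, 64, 64)
history = train_gan(G, D, real_nets, epochs=2, batch_size=4)
```

With 584 training networks and a batch size of 146, each epoch of this loop would perform four generator updates and four discriminator updates.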
Although a single loss function is used for both G and D, changing the values over which it is calculated (generated/real images, valid/synthetic labels) yields different and feasible values, given the nature of the minimax game solved by the GAN's loss.

Supplementary Figure S1 (left) shows the evolution of the scores with the number of epochs. For each epoch, we define as score the average label assigned by D to the mobility networks. D assigns label 1 to the real mobility networks and 0 to the generated (fake) ones. Therefore, the score is a measure of accuracy. In particular, the real score is the average label assigned by D to real mobility networks; the synthetic score is the average label assigned to generated mobility networks. Given an epoch, the score is 1 (0) if D labels all the mobility networks as real (fake). Ideally, at the end of the training process, both the real score and the synthetic score should be around 0.5, meaning that D can no longer determine whether a sample comes from the real distribution or the synthetic one. In the initial phase, D cannot discriminate between real and fake mobility networks, assigning to both of them an average score of 0.5 (Supplementary Figure S1). After about 100 epochs, D learns to classify the generated mobility networks. Then, there is a progressive convergence to 0.5, meaning that D can no longer distinguish real mobility networks from synthetic ones.

The Frobenius norm of the difference matrix A − B is:

$$\|A - B\|_F = \sqrt{\sum_{i,j} (a_{ij} - b_{ij})^2} \qquad (1)$$

The Root Mean Square Error (RMSE) is defined as:

$$RMSE(A, B) = \sqrt{\frac{1}{n} \sum_{i,j} (a_{ij} - b_{ij})^2} \qquad (2)$$

The right-hand sides of Equations (1) and (2) differ only in the 1/n term. Given that in our case n is fixed (and equal to the number of elements of the matrices), studying the distribution of the RMSE is equivalent to studying the distribution of the Frobenius norm.

In Table 3 we show the numerical improvements in terms of JS divergence among distributions for each dataset and model.
As can be inferred from the last four columns, MoGAN achieves the best performance in almost all the cases, with an improvement on the baselines ranging from 12% to 87%. Only in two cases do the baselines produce a synthetic and a mixed set that are more similar to the actual test set.

Supplementary Figure S4 shows the RMSE scores over the four datasets. For each dataset, we show the distributions of the RMSE among mobility networks in the test set (red), the synthetic set (blue), and the mixed set (green), for the Radiation model (left), the Gravity model (center), and MoGAN (right). MoGAN's RMSE distributions over the three sets overlap almost completely in all four scenarios. The distributions of the Gravity and the Radiation models do not overlap, especially for the latter. We also report the numerical improvements in terms of JS divergence among distributions for each dataset and model in Table 4.

In Table 5, we show the numerical improvements in terms of JS divergence among distributions for each dataset and model. As can be inferred from the last four columns, MoGAN's performance is the best in almost all the cases, with an improvement on the baselines ranging from 22% to 90%. Only in two cases do the baselines produce a mixed set that is more similar to the actual test set. The same results hold for the weight-distances distribution.

Table 5: JS divergences of the distributions of the weights. For each model, we report the JS divergence between mixed and test set (column JS_m) and the JS divergence between synthetic and test set (column JS_s). The last four columns represent, respectively, the improvement of our MoGAN model with respect to the Gravity model in terms of mixed and synthetic set (columns ∆_m,G and ∆_s,G) and the improvement of our model with respect to the Radiation model in terms of mixed and synthetic set (columns ∆_m,R and ∆_s,R).

In Supplementary Figure S5, we repeat the analysis for the weight-distances.
MoGAN's good performance suggests that it can reproduce a notable law of mobility networks: the closer two nodes are, the higher the weight (flow) between them tends to be.

1. The New Science of Cities
2. (So) big data and the transformation of the city
3. Living in a pandemic: changes in mobility routines, social activity and adherence to COVID-19 protective measures
4. COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown
5. Measuring mobility, disease connectivity and individual risk: a review of using mobile phone data and health for travel medicine
6. Assessing the impact of coordinated COVID-19 exit strategies across Europe
7. The effect of human mobility and control measures on the COVID-19 epidemic in China
8. Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle
9. A survey on deep learning for human mobility
10. Urban human mobility: Data-driven modeling and prediction
11. Human mobility: Models and applications
12. Towards integration at last? The sustainable development goals as a network of targets
13. Sustainable development goals (SDGs): Are we successful in turning trade-offs into synergies?
14. United Nations General Assembly: Transforming our world: the 2030 agenda for sustainable development
15. A deep gravity model for mobility flows generation
16. Gravity versus radiation models: On the importance of scale and heterogeneity in commuting flows
17. JB Lippincott & Company, ??? (1867)
18. The P1 P2/D hypothesis: on the intercity movement of persons
19. Systematic comparison of trip distribution laws and models
20. The Gravity Model in Transportation Analysis: Theory and Extensions
21. A universal model for mobility and migration patterns
22. Gravity and scaling laws of city to city migration
23. Universal model of individual and population mobility on diverse spatial scales
24. Generative adversarial nets
25. Generative adversarial networks: An overview
26. NIPS 2016 tutorial: Generative adversarial networks
27. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
28. TrajGANs: Using generative adversarial networks for geo-privacy protection of trajectory data (vision paper). In: Location Privacy and Security Workshop
29. GANs based density distribution privacy-preservation on mobility data. Security and Communication Networks
30. Generative models for simulating mobility trajectories
31. A non-parametric generative model for human trajectories
32. A variational autoencoder based generative model of urban human mobility
33. Learning to simulate human mobility
34. Predicting taxi-passenger demand using streaming data
35. A mechanistic data-driven approach to synthesize human mobility considering the spatial, temporal, and social dimensions together
36. Data-driven generation of spatio-temporal routines in human mobility
37. The TimeGeo modeling framework for urban mobility without travel surveys
38. scikit-mobility: a Python library for the analysis, generation and risk assessment of mobility data
39. CitiBike System Data
40. Divvy System Data
41. TLC Trip Record Data
42. TLC Trip Record Data
43. Comparing methods for comparing networks
44. Systematic comparison of trip distribution laws and models
45. Cut based method for comparing complex networks
46. Approximating the cut-norm via Grothendieck's inequality
47. An optimal SDP algorithm for max-cut, and equally optimal long code tests
48. cutnorm package
49. Jensen-Shannon divergence and Hilbert space embedding
50. Information Theory and Statistics. Courier Corporation
51. Rényi divergence and Kullback-Leibler divergence
52. The graph neural network model
53. Wasserstein generative adversarial networks
54. From GAN to WGAN
55. Improved training of Wasserstein GANs
56. Generating synthetic mobility data for a realistic population with RNNs to improve utility and privacy
57. The purpose of motion: Learning activities from individual mobility networks
58. Unravelling daily human mobility motifs
59. PyTorch: An imperative style, high-performance deep learning library
60. Exploring network structure, dynamics, and function using NetworkX

Luca Pappalardo and Giovanni Mauro thank EU project SoBigData++ grant agreement #871042 for supporting this research. We thank Ramon Ferrer-i-Cancho, Matteo Böhm, Giuliano Cornacchia, and Vasiliki Voukelatou for the useful suggestions.

Supplementary Figure S1 (right) shows how D's and G's losses change with the number of iterations. Initially, G's loss is high and unstable, meaning that it cannot fool D. In contrast, D's loss is close to 0, meaning that it discriminates well between a real and a fake mobility network. After almost 6000 iterations (1500 epochs), G's loss decreases, and the two losses stabilize, converging to a similar value. This result suggests that G has become capable of fooling D, i.e., of generating synthetic mobility networks indistinguishable from real ones.

Supplementary Figure S2 shows four snapshots of the mobility networks generated during the training: the network at the end of the first epoch, and those after 30%, 60%, and 90% of the epochs. After the first training epoch, mobility flows are light or non-existent (Supplementary Figure S2a). As the epochs go by, MoGAN starts identifying the most connected nodes, generating heavier flows between them (Supplementary Figure S2b-d).

In Supplementary Figure S3, we provide a visualization of the tessellations we use for our analysis, for both of the analyzed cities.
We show that calculating the RMSE between two adjacency matrices is equivalent to calculating the Frobenius norm of the difference matrix. As suggested by [43], a norm of the difference matrix is a metric for evaluating how similar two weighted networks with the same node-correspondence are. The Frobenius norm of a matrix A is defined as:

$$\|A\|_F = \sqrt{\sum_{i} \sum_{j} |a_{ij}|^2}$$

Therefore, the Frobenius norm of the difference matrix A − B can be calculated