key: cord-0668992-zl4f9zid
authors: Barannikov, Serguei; Trofimov, Ilya; Sotnikov, Grigorii; Trimbach, Ekaterina; Korotin, Alexander; Filippov, Alexander; Burnaev, Evgeny
title: Manifold Topology Divergence: a Framework for Comparing Data Manifolds
date: 2021-06-08
journal: nan
DOI: nan
sha: b3da8b723d912ffc372f2b7a03fbfbd6d3727c87
doc_id: 668992
cord_uid: zl4f9zid

We develop a framework for comparing data manifolds, aimed, in particular, at the evaluation of deep generative models. We describe a novel tool, Cross-Barcode(P, Q), that, given a pair of distributions in a high-dimensional space, tracks multiscale topological spatial discrepancies between the manifolds on which the distributions are concentrated. Based on the Cross-Barcode, we introduce the Manifold Topology Divergence score (MTop-Divergence) and apply it to assess the performance of deep generative models in various domains (images, 3D shapes, time series) and on different datasets: MNIST, Fashion MNIST, SVHN, CIFAR10, FFHQ, chest X-ray images, market stock data, and ShapeNet. We demonstrate that the MTop-Divergence accurately detects various degrees of mode-dropping, intra-mode collapse, mode invention, and image disturbance. Our algorithm scales well (essentially linearly) with the dimension of the ambient high-dimensional space. It is one of the first TDA-based practical methodologies that can be applied universally to datasets of different sizes and dimensions, including the ones on which the most recent GANs in the visual domain are trained. The proposed method is domain agnostic and does not rely on pre-trained networks.

The geometric perspective on data distributions has been pervasive in machine learning [5, 8, 12, 10, 23, 20]. Reconstructing the data from observations of only a subset of its points has made a significant step forward since the invention of Generative Adversarial Networks (GANs) [13]. Despite the exceptional success of deep generative models, a good assessment of the quality and diversity of generated samples remains a longstanding challenge [7]. According to the well-known Manifold Hypothesis [12], the support of the data distribution P_data is often concentrated on a low-dimensional manifold M_data. We construct a framework for numerically comparing such a distribution P_data with a similar distribution Q_model concentrated on a manifold M_model. Such a distribution Q_model is produced, for example, by a deep generative network in a typical application scenario. The immediate difficulty here is that the manifold M_data is unknown and is described only through discrete sets of samples from the distribution P_data. One standard approach to resolving this difficulty is to approximate the manifold M_data by simplices whose vertices are the sampled points. The simplices approximating the manifold are picked based on the proximity information given by the pairwise distances between sampled points [5, 25]: one fixes a threshold r > 0 and takes the simplices whose edge lengths do not exceed r. The choice of the threshold is essential. If it is too small, only the initial points remain, i.e., 0-dimensional simplices separated from each other. If it is too large, then all possible simplices with sampled points as vertices are included, and their union is simply a big blob representing the convex hull of the sampled points.
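To make this threshold sensitivity concrete, the following minimal sketch (ours, not from the paper) builds the r-neighbourhood graph of a small synthetic sample and counts its connected components for a too-small, a moderate, and a too-large threshold r; NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
# Sample 200 points from a noisy circle: the underlying "manifold" is a 1D loop.
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
points = np.stack([np.cos(angles), np.sin(angles)], axis=1)
points += 0.02 * rng.normal(size=points.shape)

dist = squareform(pdist(points))              # matrix of pairwise distances
for r in (0.01, 0.3, 3.0):                    # too small / moderate / too large
    adjacency = csr_matrix(dist <= r)         # keep edges not exceeding the threshold r
    n_components, _ = connected_components(adjacency, directed=False)
    print(f"r = {r:4.2f}: {n_components} connected component(s)")
# Typical output: many isolated points for r = 0.01, a single circle-like
# component for r = 0.3, and one undifferentiated blob for r = 3.0.
```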
Instead of trying to guess the right value of the threshold, the standard recent approach is to study all thresholds at once. This can be achieved thanks to a mathematical tool called the barcode [2, 10], which quantifies the evolution of topological features over multiple scales. For each value of r, the barcode describes the topology, namely the numbers of holes or voids of different dimensions, of the union of all simplices included up to the threshold r. However, to estimate numerically the degree of similarity between the manifolds M_model, M_data ⊂ R^D, it is important not just to know the numbers of topological features across different scales for the simplicial complexes approximating M_model and M_data, but also to be able to verify that similar topological features are located at similar places and appear at similar scales.

Figure 1: Edges (red) connecting P-points (red) with Q-points (blue), and also P-points between them, are added for three thresholds: α = 0.2, 0.4, 0.6.

Our method measures the differences in the simplicial approximations of the two manifolds, represented by samples P and Q, by constructing sets of simplices that describe discrepancies between the two manifolds. To construct these sets of simplices, we take the edges connecting P-points with Q-points, and also P-points with each other, order them by length, and add them one by one, starting from the shortest edge and gradually increasing the threshold; see Figure 1. We also add the triangles and k-simplices at the threshold at which all of their edges have been added. It is assumed that all edges between Q-points were already in the initial set. In this process we track the births and deaths of topological features, where the topological features are allowed to have boundaries on any simplices formed by Q-points. The longer the lifespan of a topological feature across the change of the threshold, the bigger the discrepancy between the two manifolds described by this feature.

Homology is a tool that permits one to single out topological features that are similar and to decompose any topological feature into a sum of basic topological features. More specifically, in our case, a k-cycle is a collection of k-simplices formed by P- and Q-points such that their boundaries cancel each other, perhaps except for boundary simplices formed only by Q-points. For example, a cycle of dimension k = 1 corresponds to a path connecting a pair of Q-points and consisting of edges passing through a set of P-points. A cycle which is the boundary of a set of (k+1)-simplices is considered trivial. Two cycles are topologically equivalent if they differ by a boundary and by a collection of simplices formed only by Q-points. A union of cycles is again a cycle. Each cycle can be represented by a vector in the vector space in which each simplex corresponds to a generator. In practice, the vector space over {0, 1} is used most often; the union of cycles then corresponds to the sum of vectors. The homology vector space H_k is defined as the quotient of the vector space of all k-cycles modulo the vector space of boundaries and of cycles consisting of simplices formed only by Q-points. A set of vectors forming a basis of this quotient space corresponds to a set of basic topological features, so that any other topological feature is equivalent to a sum of some basic features. Homology is also defined for manifolds and for arbitrary topological spaces; this definition is technical, and we omit it due to limited space, referring to, e.g., [15, 24] for details.
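In standard notation (ours; cf. [15, 24]), the construction just described is the simplicial homology of the complex on P ∪ Q taken relative to the subcomplex spanned by the Q-points, with coefficients in F = Z_2:

```latex
Z_k(P\cup Q,\,Q) \;=\; \{\, c \in C_k(P\cup Q) \;:\; \partial_k c \in C_{k-1}(Q) \,\},
\qquad
B_k \;=\; \operatorname{im}\partial_{k+1} \,+\, C_k(Q),
\qquad
H_k(P\cup Q,\,Q) \;=\; Z_k(P\cup Q,\,Q)\,/\,B_k .
```

A 1-cycle in this sense is exactly a chain of edges whose endpoints are allowed to lie on Q-points, and two cycles are identified when they differ by a boundary plus a chain supported on Q-simplices.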
The relevant properties for us are the following. For each topological space X, the vector spaces H_k(X), k = 0, 1, . . . , are defined. The dimension of the vector space H_k equals the number of independent k-dimensional topological features (holes, voids, etc.). An inclusion Y ⊂ X induces a natural map H_k(Y) → H_k(X). In terms of homology, we would like to verify not just that the dimensions of the homology groups H_*(M_model) and H_*(M_data) are the same, but, more importantly, that the natural maps induced by the embeddings of M_data ∩ M_model into M_data and M_model are as close as possible to isomorphisms. The homology of a pair is the tool that measures how far such maps are from isomorphisms. Given a pair of topological spaces Y ⊂ X, the homology of the pair H_*(X, Y) counts the number of independent topological features in X that cannot be deformed to a topological feature in Y, plus the independent topological features in Y that, after the embedding into X, become deformable to a point. Equivalently, the homology of the pair H_*(X, Y) counts the number of independent topological features in the factor-space X/Y, where all points of Y are contracted to a single point. The important fact for us is that the map H_*(Y) → H_*(X) induced by the embedding is an isomorphism if and only if the homology of the pair H_*(X, Y) is trivial. Moreover, the embedding of simply connected simplicial complexes Y ⊂ X is an equivalence in the homotopy category if and only if the homology of the pair H_*(X, Y) is trivial [33].

To define the counterpart of this construction for manifolds represented by point clouds, we employ the following strategy. Firstly, we replace the pair (M_model ∩ M_data) ⊂ M_model by the equivalent pair M_model ⊂ (M_data ∪ M_model) with the same factor-space. Then, we represent (M_data ∪ M_model) by the union of point clouds P ∪ Q, where the point clouds P, Q are sampled from the distributions P_data, Q_model. Our principal claim here is that taking topologically the quotient of (M_data ∪ M_model) by M_model is equivalent, in the framework of multiscale analysis of topological features, to the following operation on the matrix m_{P∪Q} of pairwise distances of the cloud P ∪ Q: we set to zero all pairwise distances within the subcloud Q ⊂ (P ∪ Q).

Let P = {p_i}, Q = {q_j}, with p_i, q_j ∈ R^D, be two point clouds sampled from two distributions P, Q. To define Cross-Barcode(P, Q), we first construct the following filtered simplicial complex. Let (Γ_{P∪Q}, m_{(P∪Q)/Q}) be the weighted graph with distance-like weights on edges, defined as the complete graph on the union of point clouds P ∪ Q, with the distance matrix given by the pairwise distances in R^D for pairs of points (p_i, p_j) or (p_i, q_j) and with all pairwise distances within the cloud Q set to zero. Our filtered simplicial complex is the Vietoris-Rips complex of (Γ_{P∪Q}, m_{(P∪Q)/Q}). Recall that, given such a graph Γ with a matrix m of pairwise distances between vertices and a parameter α > 0, the Vietoris-Rips complex R_α(Γ, m) is the abstract simplicial complex whose simplices correspond to the non-empty subsets of vertices of Γ whose pairwise distances, as measured by m, are less than α. Increasing the parameter α adds more simplices, and this gives a nested family of collections of simplices known as a filtered simplicial complex. Recall that a simplicial complex is described by a set of vertices V = {v_1, . . . , v_N} and a collection S of k-simplices, i.e., (k+1)-element subsets of the set of vertices V, k ≥ 0.
The set of simplices S must satisfy the condition that for each simplex s ∈ S, all the (k−1)-simplices obtained by deleting a vertex from the vertex set of s also belong to S. A filtered simplicial complex is a family of simplicial complexes S_α with nested collections of simplices: for α_1 < α_2, all simplices of S_{α_1} are also in S_{α_2}.

At the initial moment α = 0, the simplicial complex R_α(Γ_{P∪Q}, m_{(P∪Q)/Q}) has trivial homology H_k for all k > 0, since it contains all simplices formed by Q-points. At α = 0 the dimension of the 0-th homology equals the number of P-points, since no edge between them, or between a P-point and a Q-point, has been added at the beginning. As we increase α, cycles, holes, or voids appear in our complex R_α; then, some combinations of these cycles disappear. The principal theorem of persistent homology [2, 37] implies that it is possible to choose the set of generators in the homology of the filtered complexes H_k(R_α) across all the scales α such that each generator appears at its specific "birth" time and disappears at its specific "death" time. These "births" and "deaths" of topological features in R_α are registered in the barcode of the filtered complex.

Definition. The Cross-Barcode_i(P, Q) is the set of intervals recording the "birth" and "death" times of i-dimensional topological features in the filtered simplicial complex R_α(Γ_{P∪Q}, m_{(P∪Q)/Q}).

Examples of Cross-Barcode_i(P, Q) are shown in Figures 2, 11, 14, 17, and 18. Topological features with a longer "lifespan" are considered essential. Topological features with "birth" = "death" are trivial by definition and do not appear in Cross-Barcode_*(P, Q).

Basic properties of Cross-Barcode_*(P, Q). Here is a list of basic properties of Cross-Barcode_*(P, Q):
- if the two clouds coincide, then Cross-Barcode_*(P, P) = ∅;
- for Q = ∅, Cross-Barcode_*(P, ∅) = Barcode_*(P), the barcode of the single point cloud P itself;
- the norm of Cross-Barcode_i(P, Q), i ≥ 0, is bounded from above by the Hausdorff distance between P and Q, eq. (3). The proof is given in the appendix.

The bound from eq. (3) and the equality Cross-Barcode_*(P, P) = ∅ imply that the closeness of Cross-Barcode_*(P, Q) to the empty set is a natural measure of discrepancy between P and Q. Each Cross-Barcode_i(P, Q) is a list of intervals describing the persistent homology H_i. To measure the closeness to the empty set, one can use statistics of the segments: the sum of lengths, the sum of squared lengths, the number of segments, the maximal length (H_i max), or a specific quantile. We assume that various characteristics of different H_i could be useful in various cases, but the cross-barcodes for H_0 and H_1 can be calculated relatively fast. Our MTop-Divergence(P, Q) is based on the sum of lengths of segments in Cross-Barcode_1(P, Q); see Section 2.6 for details. The sum of lengths of segments in Cross-Barcode_1(P, Q) has an interesting interpretation via the Earth Mover's Distance (EMD). Namely, it is easy to prove (see Appendix B.2) that the EMD between the Relative Living Times (RLT) histogram of Cross-Barcode_1(P, Q) and the histogram of the empty barcode, multiplied by the parameter α_max from the definition of RLT (see, e.g., [19]), coincides with the sum of lengths of segments in H_1. This ensures the standard good stability properties of this quantity.
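A minimal sketch (ours) of this computation, assuming the CPU `ripser` package from scikit-tda; the paper itself relies on faster scripts such as the GPU-accelerated implementations [35, 4]. The sketch builds the matrix m_{(P∪Q)/Q}, reads Cross-Barcode_1(P, Q) off the H1 persistence diagram, and returns the sum of bar lengths.

```python
import numpy as np
from ripser import ripser                 # assumption: scikit-tda `ripser` is installed
from scipy.spatial.distance import cdist


def cross_barcode_h1(P: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """H1 part of Cross-Barcode(P, Q): the persistence pairs of the Vietoris-Rips
    complex built on the matrix m_{(P u Q)/Q}, i.e. pairwise distances on P u Q
    with the Q-vs-Q block set to zero."""
    n_p = len(P)
    m = np.zeros((n_p + len(Q), n_p + len(Q)))
    m[:n_p, :n_p] = cdist(P, P)           # P-P distances
    m[:n_p, n_p:] = cdist(P, Q)           # P-Q distances
    m[n_p:, :n_p] = m[:n_p, n_p:].T       # symmetric Q-P block
    # The Q-Q block stays zero: all simplices inside Q appear at threshold 0.
    return ripser(m, distance_matrix=True, maxdim=1)["dgms"][1]


def mtop_div_once(P: np.ndarray, Q: np.ndarray) -> float:
    """Sum of lengths of the segments in Cross-Barcode_1(P, Q) (one run)."""
    bars = cross_barcode_h1(P, Q)
    bars = bars[np.isfinite(bars[:, 1])]  # guard against infinite bars
    return float(np.sum(bars[:, 1] - bars[:, 0]))
```

Swapping the two arguments gives the other direction of the score discussed below; the repeat-and-average loop over random subsamples (Algorithm 2) is sketched after the algorithm description.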
Our metric can be applied in two settings: to the pair of distributions (P_data, Q_model), in which case we denote our score MTop-Div(D,M), and to the pair (Q_model, P_data), in which case our score is denoted MTop-Div(M,D). These two variants of the Cross-Barcode, and of the MTop-Divergence, are related to the concepts of precision and recall. The two variants can be analyzed separately or combined, e.g., averaged.

To calculate the score that evaluates the similarity between two distributions, we employ the following algorithm. First, we compute Cross-Barcode_1(P, Q) on point clouds P, Q of sizes b_P, b_Q sampled from the two distributions P, Q. For this, we calculate the matrices m_P, m_{P,Q} of pairwise distances within the cloud P and between the clouds P and Q. Then the algorithm constructs the Vietoris-Rips filtered simplicial complex from the matrix m_{(P∪Q)/Q}, which is the matrix of pairwise distances in P ∪ Q with the block of pairs of points from the cloud Q replaced by zeroes and with the other blocks given by m_P, m_{P,Q}. The next step is to calculate the barcode of the constructed filtered simplicial complex. This step, and the previous step of constructing the filtered complex from the matrix m_{(P∪Q)/Q}, can be done using one of the fast scripts¹, some of which are optimized for GPU acceleration; see, e.g., [35, 4]. The calculation of the barcode from the filtered complex is based on the persistence algorithm bringing the filtered complex to its "canonical form" [2]. Next, the sum of lengths, or another numerical characteristic, of Cross-Barcode_1(P, Q) is computed. This computation is then run a sufficient number of times to obtain the mean value of the chosen characteristic. In our experiments we have found that for common datasets 10 to 100 repetitions are generally sufficient. Our method is summarized in Algorithms 1 and 2.

Algorithm 1: Cross-Barcode(P, Q). Output: the intervals representing "births" and "deaths" of topological discrepancies.
Algorithm 2: MTop-Divergence(P, Q), see Section 2.6 for details; default suggested values b_P = 1000, b_Q = 10000, n = 100. Output: a number representing the discrepancy between the distributions P, Q.

Complexity. Algorithm 1 requires the computation of the two matrices of pairwise distances, m_P and m_{P,Q}, which takes O(b_P (b_P + b_Q) D) operations. After that, the complexity of the barcode computation does not depend on the dimension D of the data. In general, the persistence algorithm is at worst cubic in the number of simplices involved. In practice, the boundary matrix is sparse in our case, and thanks also to GPU optimization, the computation of the cross-barcode takes a time similar to that of the previous step even on datasets of large dimensionality. Since only the discrepancies in manifold topology are calculated, the results are quite robust, and a relatively low number of iterations is needed to obtain accurate results. Since the algorithm scales linearly with D, it can be applied to the most recent datasets with D up to 10^7. For example, for D = 3.15 × 10^6 and batch sizes b_P = 10^3, b_Q = 10^4, on an NVIDIA TITAN RTX the GPU-accelerated calculation of pairwise distances took 15 seconds, and the GPU-accelerated calculation of Cross-Barcode_1(P, Q) took 30 seconds.
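A hedged sketch (ours) of the Algorithm 2 loop, with the default values b_P = 1000, b_Q = 10000, n = 100 quoted above; it repeats the single-run computation from the previous sketch on fresh random subsamples and averages the results. The function and variable names are ours, and the CPU `ripser` package again stands in for the GPU-accelerated scripts used in the paper.

```python
import numpy as np
from ripser import ripser                 # assumption: scikit-tda `ripser` is installed
from scipy.spatial.distance import cdist


def mtop_div(data: np.ndarray, model: np.ndarray,
             b_p: int = 1000, b_q: int = 10000, n: int = 100,
             seed: int = 0) -> float:
    """MTop-Div(D, M): mean, over n random subsamples, of the sum of bar lengths
    in Cross-Barcode_1(P, Q) with P drawn from `data` and Q from `model`.
    Call with the arguments swapped to obtain MTop-Div(M, D)."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n):
        P = data[rng.choice(len(data), size=b_p, replace=False)]
        Q = model[rng.choice(len(model), size=b_q, replace=False)]
        m = np.zeros((b_p + b_q, b_p + b_q))   # same construction as in the previous sketch
        m[:b_p, :b_p] = cdist(P, P)
        m[:b_p, b_p:] = cdist(P, Q)
        m[b_p:, :b_p] = m[:b_p, b_p:].T        # Q-Q block stays zero
        h1 = ripser(m, distance_matrix=True, maxdim=1)["dgms"][1]
        h1 = h1[np.isfinite(h1[:, 1])]
        scores.append(float(np.sum(h1[:, 1] - h1[:, 0])))
    return float(np.mean(scores))
```

The two directions, mtop_div(data_samples, model_samples) and mtop_div(model_samples, data_samples), can be reported separately or averaged, as discussed above.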
We examine the ability of MTop-Div to measure the quality of generative models trained on various datasets. Firstly, we illustrate the behaviour of MTop-Div on simple synthetic datasets (rings, Fig. 2; disks, Fig. 13). Secondly, we show that MTop-Div is superior to the GScore, another topology-based GAN quality measure. We carry out experiments with a mixture of Gaussians, MNIST, CIFAR10, chest X-ray images, and FFHQ. The performance of MTop-Div is on par with FID. For images, MTop-Div is always calculated in pixel space, without utilizing pre-trained networks. Thirdly, we apply MTop-Div to GANs from non-image domains, 3D shapes and time series, where FID is not applicable. We show that MTop-Div agrees with domain-specific measures, such as JSD, MMD, Coverage, and the discriminative score, but better captures the evolution of the generated data manifold during training². As illustrated in Fig. 2, the GScore does not respond to shifts of the distributions' relative position.

¹ Persistent Homology Computation (wiki).
² We additionally calculated IMD [31] for the pairs of point clouds from our experiments; see Appendix E.

One of the common tasks in assessing a GAN's performance is measuring its ability to uncover the variety of modes in the data distribution. A common benchmark for this problem is a mixture of Gaussians; see Fig. 3. We trained two generators with very different performance: the original GAN, which managed to capture all 5 modes of the distribution, and WGAN-GP, which covered only two of them, and poorly. However, the Geometry Score is not sensitive to this difference, since the two point clouds have the same RLT histogram, while the MTop-Div is sensitive to it.

A further experiment uses the MNIST dataset. We compare two point clouds: "5"s vs. vertically flipped "5"s (resembling "2"s). These two clouds are indistinguishable for the Geometry Score, while the MTop-Div is sensitive to such a flip, since it depends on the relative position of the two clouds.

We evaluate the proposed MTop-Div(D,M) using a benchmark with a controllable disturbance level. We take CIFAR10 and apply the following modifications. Firstly, we emulate common issues in GANs (mode drop, mode invention, and mode collapse) by performing class drop, class addition, and intra-class collapse (removal of objects within a class). Secondly, we apply two disturbance types: erasure of a random rectangle and Gaussian noise. As 'real' images we use the test set of CIFAR10; as 'generated' images, a subsample of the train set with the applied modifications. The size of the latter always equals the test set size. Figure 5 shows the results. Ideally, the quality measure should monotonically increase with the disturbance level. We conclude that the Geometry Score is monotone only for 'mode invention' and 'intra-mode collapse', while MTop-Div(D,M) is almost monotone in all the cases. The average Kendall-tau rank correlation between MTop-Div(D,M) and the disturbance level is 0.89, while for the Geometry Score the rank correlation is only 0.36. FID also performs well on this benchmark; it is not shown in Figure 5 for ease of perception. Additionally, we calculated MTop-Div for higher-order Cross-Barcodes; see Appendix D.

We evaluated the performance of the StyleGAN [17] and StyleGAN2 [18] generators trained on the FFHQ dataset³. We generated 20 × 10^3 samples with two truncation levels, ψ = 0.7 and 1.0, and compared them with 20 × 10^3 samples from FFHQ. The truncation trick is known to improve average image quality but decrease image variance. Figure 7 shows the results (see also Table 2 in the Appendix for more data). The ranking via MTop-Div(M,D) is consistent with FID. We also tried to calculate the Geometry Score but found that it takes a prohibitively long time.
Waheed et al. [32] described how to apply GANs to improve COVID-19 detection from chest X-ray images. Following [32], we trained an ACGAN [26] on a dataset consisting of chest X-rays of COVID-19-positive and healthy patients⁴. Next, we studied the training process of the ACGAN. Every 10th epoch we evaluated its performance by comparing real and generated COVID-19-positive chest X-ray images. That is, we calculated FID, MTop-Div(D,M), and a baseline measure, the discriminative score⁵ of a CNN trained to distinguish real vs. generated data. The MTop-Div agrees with FID and the discriminative score. PCA projections show that the generated data approximate the real data well. Figure 20 in the Appendix presents real and generated images. Additionally, we compared real COVID-positive and COVID-negative chest X-ray images; see the horizontal dashed lines in Fig. 6. Counterintuitively, according to FID, real COVID-positive images are closer to real COVID-negative ones than to generated COVID-positive images, probably because FID is overly sensitive to textures. At the same time, the evaluation by MTop-Div is consistent.

We use the proposed MTop-Div score to analyze the training process of a GAN applied to 3D shapes represented by 3D point clouds [1]. For training, we used 6778 objects of the "chair" class from ShapeNet [9]. We trained the GAN for 1000 epochs and tracked the following standard quality measures: Minimum Matching Distance (MMD), Coverage, and the Jensen-Shannon Divergence (JSD) between marginal distributions. To understand the training process in more detail, we computed the PCA decomposition of real and generated objects (Fig. 8, bottom). For computing PCA, each object (a 3D point cloud) was represented by a vector of point frequencies attached to a 3-dimensional 28^3 grid. Figure 8, top, shows that the conventional metrics (MMD, JSD, Coverage) do not represent the training process adequately. While these measures steadily improve, the set of generated objects changes dramatically. At epoch 50, the set of generated objects (green) "explodes" and becomes much more diverse, covering a much larger space than the real objects (red). The conventional quality measures (MMD, JSD, Coverage) ignore this shift, while MTop-Div has a peak at this point. Next, we evaluated the final quality of the GAN by training a classifier to distinguish real and generated objects. A simple MLP with 3 hidden layers achieved 98% accuracy, indicating that the GAN poorly approximates the manifold of real objects. This result is consistent with MTop-Div: at epoch 1000 it is even larger than at epoch 1.

Next, we analyze the training dynamics of TimeGAN [34], a model tailored to multivariate time-series generation. We followed the experimental protocol from [34] and used daily historical market stock data from 2004 to 2019, including as features the volume, high, low, opening, closing, and adjusted closing prices. The baseline evaluation measure is calculated via a classifier (an RNN) trained to distinguish real from generated time series; specifically, the discriminative score equals the accuracy of such a classifier minus 0.5. Fig. 9, top, shows the results. We conclude that the behaviour of MTop-Div is consistent with the discriminative score: both decrease during training. To illustrate the training in more detail, we computed PCA projections of real and generated time series by flattening the time dimension (Fig. 9, bottom). At the 2000th epoch, the point clouds of real (red) and generated (green) time series become close, which is captured by a drop of the MTop-Div score.
At the same time, the discriminative score is not sensitive enough to this phenomenon.

We have proposed a tool, Cross-Barcode_*(P, Q), which records multiscale topological discrepancies between two data manifolds approximated by point clouds. Based on the Cross-Barcode_*(P, Q), we have introduced the Manifold Topology Divergence and applied it to evaluate the performance of GANs in various domains: 2D images, 3D shapes, and time series. We have shown that the MTop-Div correlates well with domain-specific measures and can be used for model selection. It also provides insights into the evolution of the generated data manifold during training and can be used for early stopping. The MTop-Div score is domain agnostic and does not rely on pre-trained networks. We have compared the MTop-Div against 7 established evaluation methods (FID, discriminative score, MMD, JSD, 1-coverage, IMD, and Geometry Score) and found that MTop-Div outperforms many of them and captures subtle differences in data manifolds well. Our methodology permits us to overcome the known TDA scalability issues and to carry out the MTop-Div calculations on the most recent datasets, such as FFHQ, with sizes up to 10^5 and dimensions up to 10^7.

A simplicial complex is a combinatorial object that can be thought of as a higher-dimensional generalization of a graph. A simplicial complex S is a collection of k-simplices, which are finite (k+1)-element subsets of a given set V, for k ≥ 0. The collection of simplices S must satisfy the condition that for each σ ∈ S, σ' ⊂ σ implies σ' ∈ S. A simplicial complex consisting only of 0- and 1-simplices is a graph. Let C_k(S) denote the vector space over a field F whose basis elements are the k-simplices from S, with a choice of ordering of vertices up to an even permutation. In calculations it is most convenient to put F = Z_2. The boundary linear operator ∂_k : C_k(S) → C_{k−1}(S) is defined on basis elements by ∂_k [v_0, . . . , v_k] = Σ_{i=0}^{k} (−1)^i [v_0, . . . , v_{i−1}, v_{i+1}, . . . , v_k] (over Z_2 the signs can be ignored). The k-th homology group H_k(S) is defined as the vector space ker ∂_k / im ∂_{k+1}. The elements c ∈ ker ∂_k are called cycles, the elements c ∈ im ∂_{k+1} are called boundaries, and general elements c ∈ C_k(S) are called chains. The elements of H_k(S) represent various k-dimensional topological features of S; a basis of H_k(S) corresponds to a set of basic topological features.

A filtration on a simplicial complex is defined as a family of simplicial complexes S_α with nested collections of simplices: for α_1 < α_2, all simplices of S_{α_1} are also in S_{α_2}. In practical examples the indices α run through a discrete set α_1 < . . . < α_max. The inclusions S_α ⊆ S_β naturally induce maps on the homology groups, H_k(S_α) → H_k(S_β). The evolution of cycles through the nested family of simplicial complexes S_α is described by barcodes. The principal theorem of persistent homology [2, 36, 37] states that for each dimension there exists a choice of a set of basic topological features across all S_α such that each feature appears in H_k(S_α) at a specific time α = b_j and disappears at a specific time α = d_j. The H_i barcode of the filtered simplicial complex is the record of these times, represented as the collection of segments [b_j, d_j]. The barcodes are defined and calculated by bringing the set of matrices of the boundary operators ∂_k to the "Canonical Form" via a change of basis in C_k preserving the nested family S_α [2, 3].
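As a minimal worked example of these definitions (ours, not from the paper), take the "hollow triangle": three vertices v_1, v_2, v_3, the three edges, and no 2-simplex, over F = Z_2:

```latex
\partial_1 e_{ij} = v_i + v_j, \qquad
\partial_1 =
\begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}
\quad (\text{columns } e_{12}, e_{13}, e_{23}), \qquad
\operatorname{rank} \partial_1 = 2,
\\[4pt]
H_0 = \ker \partial_0 / \operatorname{im} \partial_1 \cong \mathbb{Z}_2, \qquad
H_1 = \ker \partial_1 / \operatorname{im} \partial_2 \cong \mathbb{Z}_2 .
```

The single generator of H_1 is the cycle e_{12} + e_{13} + e_{23}; in a filtration, adding the filled triangle {v_1, v_2, v_3} at a later value of α kills this cycle and closes the corresponding bar [b_j, d_j].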
Let (Γ, m) be a weighted graph with distance-like weights, where m is the symmetric matrix of the weights attached to the edges of the graph Γ. The Vietoris-Rips filtered simplicial complex R_α(Γ, m) of the weighted graph is defined as the nested collection of simplices R_α(Γ, m) = { σ ⊆ Vert(Γ), σ ≠ ∅ : m(v, v') ≤ α for all v, v' ∈ σ }, where Vert(Γ) is the set of vertices of the graph Γ. Even though such weighted graphs do not always come from a set of points in a metric space, barcodes of weighted graphs have been successfully applied in many situations (networks, fMRI, medical data, graph classification, etc.).

Here we gather more details on the construction of the sets of simplices that describe discrepancies between two point clouds P and Q sampled from the two distributions P, Q. As described in Section 2, our basic methodology is to add consecutively the edges between P-points and Q-points and between pairs of P-points. All edges between Q-points are added simultaneously at the beginning, at the threshold α = 0. The PP and PQ edges are sorted by length and are added at the threshold α ≥ 0 corresponding to the length of the edge. This process is visualized in more detail in Figures 10 and 11. A triangle is added at the threshold at which the last of its three edges is added. The 3- and higher k-simplices are added similarly, at the threshold corresponding to the addition of the last of their edges. The added triangles and higher-dimensional simplices are not shown explicitly in Figure 1 for ease of perception, as they can be restored from their edges. Since all simplices within the Q-cloud are added at the very beginning at α = 0, the corresponding cycles formed by the Q-cloud simplices are immediately killed at α = 0 and do not contribute to the Cross-Barcode.

The constructed set of simplices is naturally a simplicial complex, since for any added k-simplex we have also added all its (k−1)-faces obtained by deleting one of the vertices. The threshold α defines a filtration on the obtained simplicial complex, since the simplices added at a smaller threshold α_1 are contained in the set of simplices added at any larger threshold α_2 > α_1. As more edges are added, cycles start to appear. In our case, a cycle is essentially a collection of simplices whose boundary is allowed to be nonzero if that boundary consists of simplices with vertices from Q. For example, a 1-cycle in our case is a path consisting of added edges that can start and end in the Q-cloud and that passes through P-points. This is because any such collection can be completed to a collection with zero boundary, since any cycle from the Q-cloud is a boundary of a sum of simplices from Q added at α = 0. A 1-cycle disappears at the threshold at which a set of triangles is added whose boundary coincides with the 1-cycle plus, perhaps, some edges between Q-points. Notice how the 1-cycle with endpoints in the Q-cloud is born at α = 0.4 in Figure 10, shown in green. It survives at α = 0.6 and is killed at α = 0.8. The process of adding longer edges can be pictured as the building of a "spider's web" that tries to bring the cloud of red points closer to the cloud of blue points. The obstructions to this are quantified by the "lifespans" of cycles, which correspond to the lengths of segments in the barcode. See, e.g., Figure 11, where a 1-cycle exists between α = 0.5 and 0.9; it corresponds to the green segment in the Cross-Barcode.

Figure 11: The process of adding the simplices between the P-cloud (red) and the Q-cloud (blue) and within the P-cloud. Here we show the consecutive addition of edges together with the simultaneous addition of triangles.
All edges and simplices within the Q-cloud are assumed to be added at α = 0 and are not shown here for ease of perception. Notice the 1-cycle that exists between α = 0.5 and 0.9; it corresponds to the green segment in the shown Cross-Barcode.

Remark A.1. To characterise the situation of two data point clouds, one of which is a subcloud of the other, S ⊂ C, it can be tempting to seek a "relative homology" analog of the standard (single) point cloud persistent homology. The reader should be warned that the "relative persistent homology" concept common in the literature and its variants, i.e., the persistent homology of the decreasing sequence of factor-complexes of a fixed complex K → . . . → K/K_i → K/K_{i+1} → . . . → K/K, is irrelevant in the present context. In contrast, our methodology does not involve the construction of factor-complexes, which is generally computationally prohibitive. The point is that the basic concept of a filtered complex naturally contains its own relative analogue via the appropriate use of various filtrations.

A.3 Sub-manifolds and bars in Cross-Barcode_*(P, Q)

It is natural to start analyzing the closeness of the data point cloud P to the data point cloud Q by looking at the matrix of PQ pairwise distances. If there are many points from P whose distance to their closest point from Q is relatively big, then the clouds P and Q are not close. However, in applications it is important to distinguish different situations here. The first case is when all these points remote from Q are close to each other; then this remote cluster of P-points represents a single topological feature distinguishing the cloud P from Q. Another case is when the points remote from Q form several clusters, so that each such cluster represents a separate topological feature. The long bars in the zero-dimensional Cross-Barcode record the lifespans, on the scale of distances, of these clusters of P-points remote from Q. In practice it happens more often that it is not possible to single out a separate cluster of P-points that are all remote from Q. Rather, there are some P-points inside a P-cluster that are close to Q and other P-points from the same cluster that are further away from Q, as in Fig. 1. This situation is captured and quantified by the higher-dimensional topological features distinguishing the cloud P from Q. Intuitively, such an i-dimensional topological feature represents an i-dimensional sub-manifold of the P-cloud whose boundary is close to the Q-cloud but whose interior P-points are remote from the Q-cloud, like the green polygonal chain in Fig. 10 at α = 0.4. Such features are constructed by the algorithm using the distance matrix combinatorics of (i+1)-tuples of P-points or of P- and Q-points. The distances within each of these tuples are less than or equal to the feature's appearance, or birth, threshold. The disappearance, or death, of such a feature, as calculated by the algorithm, corresponds approximately to the scale at which the feature becomes indistinguishable from the Q-cloud. The i-dimensional Cross-Barcode_i(P, Q), i ≥ 1, is the set of segments (bars) recording the birth and death thresholds of such topological features.

A.4 Cross-Barcode_*(P, Q) as obstructions to assigning P-points to the distribution Q

Geometrically, the lowest-dimensional Cross-Barcode_0(P, Q) is the record of a relative hierarchical clustering of the following form.
For a given threshold r, let us consider all points of the point cloud Q, plus the points of the cloud P lying at a distance less than r from a point of Q, as belonging to a single Q-cluster. It is natural to simultaneously form other clusters based on the threshold r, with the rule that if the distance between two points of P is less than r then they belong to the same cluster. When the threshold r is increased, two or more clusters can collide, and the threshold at which this happens corresponds precisely to the "death" time of one or more of the colliding clusters. In the end, for very large r, only the unique Q-cluster survives. Cross-Barcode_0(P, Q) records the survival times of this relative clustering.

Figure 12: Paths/membranes (red) in the void, formed by small intersecting disks around P-points (orange) and ending on Q (blue), are obstacles to the identification of the distribution P with Q. These obstacles are quantified by Cross-Barcode_1(P, Q). Separate clusters are the obstacles quantified by Cross-Barcode_0(P, Q).

Notice that in situations like the one in Figure 12, it is difficult to confidently attribute certain points of P to the same distribution as the point cloud Q, even when they belong to the "big" Q-cluster at a small threshold r, because of the nontrivial topology. Such "membranes" of P-points in void space are obstacles to assigning points from P to the distribution Q. These obstacles are quantified by the segments of the higher barcodes Cross-Barcode_{≥1}(P, Q). The bigger the length of the associated segment in the barcode, the further away from Q the membrane passes.

A.5 More simple synthetic datasets in 2D

Figure 13: The first picture shows two clouds of 1000 points sampled from the uniform distributions on two different disks of radius 1, with a distance of 0.5 between the centers of the disks. The second and third pictures show the dependence of the GScore metric (the GScore is equal to zero independently of the distance between the disks), the maximum length of segments in H0, and the sum of lengths of segments in H1 as functions of the distance between the centers of the disks, averaged over 10 runs. The length of the maximal segment of the H0 barcode grows linearly and equals the distance between the pair of closest points of the two distributions.

As illustrated in Figures 2 and 13, the GScore is unresponsive to changes of the distributions' positions.

A.6 Cross-Barcode and precision-recall

Figure 14: Mode-dropping (bad recall, good precision), illustrated with the clouds P_data (red) and Q_model (blue). The Cross-Barcode_0(P_data, Q_model) contains long intervals, one for each dropped mode, which measure the distance from the dropped data mode to the closest generated mode.

The Cross-Barcode captures well the precision vs. recall aspects of point cloud approximations, contrary to FID, which is known to mix the two aspects. In the case of mode-dropping (bad recall but good precision), the Cross-Barcode_0(P_data, Q_model) contains long intervals, one for each dropped mode, which measure the distance from the dropped data mode to the closest generated mode; this case is illustrated in Figure 14. Analogously, in the case of mode-invention (good recall but bad precision), the Cross-Barcode_0(Q_model, P_data) contains long intervals, one for each invented mode, which measure the distance from the invented model mode to the closest data mode. The mode-invention case is illustrated in Figure 15.
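A small synthetic illustration of this (ours, not from the paper): P has two well-separated modes, the "model" cloud Q covers only one of them, and the longest finite bar in the H0 part of Cross-Barcode(P, Q) is roughly the distance from the dropped mode to Q. The scikit-tda `ripser` package is assumed.

```python
import numpy as np
from ripser import ripser                 # assumption: scikit-tda `ripser` is installed
from scipy.spatial.distance import cdist

rng = np.random.default_rng(1)
mode_a = rng.normal(loc=(0.0, 0.0), scale=0.05, size=(200, 2))
mode_b = rng.normal(loc=(5.0, 0.0), scale=0.05, size=(200, 2))
P = np.concatenate([mode_a, mode_b])                         # "data": two modes
Q = rng.normal(loc=(0.0, 0.0), scale=0.05, size=(400, 2))    # "model": second mode dropped

n_p = len(P)
m = np.zeros((n_p + len(Q), n_p + len(Q)))
m[:n_p, :n_p] = cdist(P, P)
m[:n_p, n_p:] = cdist(P, Q)
m[n_p:, :n_p] = m[:n_p, n_p:].T                               # Q-Q block stays zero

h0 = ripser(m, distance_matrix=True, maxdim=0)["dgms"][0]
finite = h0[np.isfinite(h0[:, 1])]
print("longest finite H0 bar:", finite[:, 1].max())           # roughly 5, the dropped-mode gap
```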
The Bottleneck distance [11], also known as the Wasserstein-∞ distance W_∞, defines the natural norm on the Cross-Barcodes. The bound on this norm stated in the list of basic properties in Section 2 (Proposition 1) is proved as follows.

Proof. Let c ∈ R_{α_0}(Γ_{P∪Q}, m_{(P∪Q)/Q}) be an i-dimensional cycle appearing in the filtered complex at α = α_0. Let us construct a simplicial chain that kills c. Let σ = {x_1, . . . , x_{i+1}} be one of the simplices from c, and let q_j denote the point of Q closest to the vertex x_j. The prism {x_1, q_1, . . . , x_{i+1}, q_{i+1}} can be decomposed into (i+1) simplices p_k(σ) = {x_1, . . . , x_k, q_k, . . . , q_{i+1}}, 1 ≤ k ≤ i+1. The boundary of the prism consists of the two simplices σ and q(σ) = {q_1, . . . , q_{i+1}}, and of the (i+1) similar prisms corresponding to the boundary simplices of σ. If c = Σ_n a_n σ_n, then c = ∂(Σ_n a_n Σ_k p_k(σ_n)) + Σ_n a_n q(σ_n). For any k, j, d(x_j, x_k) ≤ α_0, since c is born at α_0. Therefore, by the triangle inequality, d(x_j, q_k) ≤ d(x_j, x_k) + d(x_k, q_k) ≤ α_0 + sup_{x∈P} d(x, Q). Therefore all simplices p_k(σ_n) appear no later than at α_0 + sup_{x∈P} d(x, Q) in the filtered complex. All vertices of the simplices q(σ_n) are from Q. It follows that the lifespan of the cycle c is no bigger than sup_{x∈P} d(x, Q).

To illustrate Proposition 1, we have verified empirically that Cross-Barcode_*(Q_1, Q_2) diminishes when the number of points in Q_1, Q_2 goes to +∞ and Q_1, Q_2 are sampled from the same uniform distribution on the 2D disk of radius 1. The maximal length of segments in H1, as a function of the number of points in the clouds of equal size, is shown in Figure 16.

The Cross-Barcode for a given homology H_i is a list of birth-death pairs (segments) [b_i, d_i]. The Relative Living Times (RLT) is a discrete distribution RLT(k) over non-negative integers k ∈ {0, 1, . . .}. For a given α_max > 0, RLT(k) is the fraction of "time", that is, of the horizontal axis τ ∈ [0, α_max], during which exactly k segments of the Cross-Barcode contain τ. For equal point clouds, Cross-Barcode_i(P, P) = ∅ and the corresponding RLT is the discrete distribution concentrated at zero. Let us denote by O_0 this discrete distribution corresponding to the empty set. A natural measure of closeness of the distribution RLT to the distribution O_0 is the earth mover's distance (EMD), also called the Wasserstein-1 distance.

Proposition. Let d_i ≤ α_max for all i. Then MTop-Div(P, Q) = α_max · EMD(RLT(k), O_0).

Proof. Let us use all the distinct b_i, d_i to split [0, α_max] into disjoint segments s_j. Each s_j is included in K(j) segments [b_i, d_i] from the Cross-Barcode_i(P, Q). Thus, RLT(k) = (1/α_max) Σ_{j: K(j)=k} |s_j|. At the same time, MTop-Div(P, Q) = Σ_i (d_i − b_i) = Σ_j K(j) |s_j| = α_max Σ_k k · RLT(k) = α_max · EMD(RLT(k), O_0).
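A small numerical check of this proposition (ours): for a hand-written toy barcode with all d_i ≤ α_max, the sum of bar lengths coincides with α_max times the EMD between the RLT distribution and the point mass O_0 (for a distribution on non-negative integers and the point mass at 0, the EMD is simply the mean Σ_k k · RLT(k)).

```python
# Toy Cross-Barcode_1 with all deaths below alpha_max.
bars = [(0.1, 0.4), (0.2, 0.7), (0.5, 0.6)]
alpha_max = 1.0

# Exact RLT: split [0, alpha_max] at all endpoints and count overlapping bars.
breakpoints = sorted({0.0, alpha_max, *[b for b, _ in bars], *[d for _, d in bars]})
rlt = {}
for left, right in zip(breakpoints[:-1], breakpoints[1:]):
    k = sum(b <= left and right <= d for b, d in bars)   # bars covering this sub-segment
    rlt[k] = rlt.get(k, 0.0) + (right - left) / alpha_max

emd_to_zero = sum(k * mass for k, mass in rlt.items())   # EMD(RLT, O_0) = mean of RLT
sum_of_lengths = sum(d - b for b, d in bars)
print(alpha_max * emd_to_zero, sum_of_lengths)           # both print 0.9 (up to float error)
```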
We have made experiments in various settings and on the following datasets:
- Gaussians in 2D: a set of Gaussians in 2D, in comparison with distributions generated by a GAN and a WGAN;
- MNIST: we have observed that the GScore is not sensitive to the flip of the cloud of "fives", while our MTop-Divergence score is sensitive to such a flip, since it depends on the positions of the clouds with respect to each other;
- CIFAR10: we have evaluated our MTop-Div(D,M) using a benchmark with a controllable disturbance level; we have observed that the Geometry Score is monotone only for 'mode invention' and 'intra-mode collapse', while MTop-Div(D,M) is almost monotone in all the cases;
- FFHQ: we have evaluated the quality of the distributions generated by StyleGAN and StyleGAN2, without truncation and with ψ = 0.7 truncation; we observed that the ranking via MTop-Div is consistent with FID;
- ShapeNet⁶: we have studied the training dynamics of the GAN trained on 3D shapes; we observed that MTop-Div is consistent with domain-specific measures (JSD, MMD, Coverage) and that MTop-Div better describes the evolution of the point cloud of generated objects over the epochs;
- Stock data: we have studied the training dynamics of TimeGAN⁷ applied to market stock data; we observed that MTop-Div is consistent with the discriminative score but better captures the evolution of the point cloud of generated objects over the epochs;
- Chest X-ray images: we have studied the training dynamics of ACGAN applied to chest X-ray images; we observed that MTop-Div is more consistent with the discriminative score than FID.

For the computation of FID we used Pytorch-FID⁸. For the computation of the Geometry Score we used the original code⁹, patched to support multi-threading; otherwise it was extremely slow. The RLT computation was averaged over 2500 trials. We calculated persistent homology via ripser++¹⁰. We used the following hyperparameters to compute MTop-Div:
- MNIST: b_P = 10^2, b_Q = 10^3;
- Gaussians: b_P = 10^2, b_Q = 10^3;
- CIFAR10: b_P = 10^3, b_Q = 10^4;
- FFHQ: b_P = 10^3, b_Q = 10^4;
- ShapeNet: b_P = 10^2, b_Q = 10^3;
- Market stock data: b_P = 10^2, b_Q = 10^3;
- Chest X-ray data: b_P = 10^2, b_Q = 10^3.
MTop-Div scores were averaged over 20 runs.

We compared the Geometry Score and MTop-Div in the experiment with mixtures of Gaussians. Table 4 shows the results. We conclude that the MTop-Div is consistent with the visual quality of the GANs' output, while the Geometry Score fails. Figure 17 shows Cross-Barcodes for the experiment with StyleGANs trained on FFHQ. Figure 19 shows one of the Cross-Barcodes in H0 from the experiment with the CIFAR10 dataset, illustrating that the 0-dimensional Cross-Barcode can also be applied. Figure 18 shows the Cross-Barcodes in H1 from the experiments with the GAN¹¹ and WGAN-GP¹² trained on the mixture of Gaussians.

As proposed by a reviewer, we carried out additional experiments with IMD [31] applied to the point clouds from our experiments. IMD is not sensitive to the shift of the rings (Section 3.1) or to the flipping of the digits on MNIST (Section 3.3). For the experiment "Mode dropping on Gaussians" (Section 3.2), IMD incorrectly ranks the poorly performing WGAN-GP (see Fig. 3) higher than the original GAN (Table 4). For the "GAN model selection" experiments (Section 3.5), IMD ranks a better-performing model lower in one case, while the ranking via MTop-Div is consistent with the true GAN performance. For the "Synthetic variations of CIFAR10" experiment (Section 3.4), the average Kendall-tau correlation between the IMD score and the disturbance level is 0.55, which is lower than the same measure for MTop-Div (0.89).

References:
Learning representations and generative models for 3D point clouds
Framed Morse complexes and its invariants
Canonical Forms = Persistence Diagrams. Tutorial
Ripser: efficient computation of Vietoris-Rips persistence barcodes
Laplacian eigenmaps and spectral techniques for embedding and clustering
Pros and cons of GAN evaluation measures
Geometric deep learning: going beyond Euclidean data
An information-rich 3D model repository
An introduction to topological data analysis: fundamental and practical aspects for data scientists
Geometry helps in bottleneck matching and related problems
Deep learning
Generative adversarial networks
A domain agnostic measure for monitoring and evaluating GANs
Algebraic topology
GANs trained by a two time-scale update rule converge to a local Nash equilibrium
A style-based generator architecture for generative adversarial networks
Analyzing and improving the image quality of StyleGAN
Geometry Score: a method for comparing generative adversarial networks
PLLay: efficient topological layer based on persistence landscapes
Improved precision and recall metric for assessing generative models
Are GANs created equal? A large-scale study
Topological autoencoders
Computational Topology for Biomedical Image and Data Analysis: Theory and Applications
Finding the homology of submanifolds with high confidence from random samples
Conditional image synthesis with auxiliary classifier GANs
Skill rating for generative models
Classification accuracy score for conditional generative models
Improved techniques for training GANs
A note on the evaluation of generative models
The shape of data: intrinsic distance for data distributions
CovidGAN: data augmentation using auxiliary classifier GAN for improved COVID-19 detection
Elements of homotopy theory
Time-series generative adversarial networks
GPU-accelerated computation of Vietoris-Rips persistence barcodes
Computing persistent homology. Discrete & Computational Geometry
Computing and comprehending topology: persistence and hierarchical Morse complexes

The dataset is free for non-commercial purposes.

Acknowledgements. The problem statement was developed in the framework of the Skoltech-MIT NGP program. The work of Serguei Barannikov and Evgeny Burnaev was supported by Ministry of Science and Higher Education grant No. 075-10-2021-068. The authors are thankful to Alexei Artemov for help with the 3D shapes experiment.