Dataset correlation inference attacks against machine learning models
Ana-Maria Creţu*, Florent Guépin*, Yves-Alexandre de Montjoye
December 16, 2021
*These authors contributed equally and are listed alphabetically.

Abstract-Machine learning models are increasingly used by businesses and organizations around the world to automate tasks and decision making. Trained on potentially sensitive datasets, machine learning models have been shown to leak information about individuals in the dataset as well as global dataset information. We here take research in dataset property inference attacks one step further by proposing a new attack against ML models: a dataset correlation inference attack, where an attacker's goal is to infer the correlation between input variables of a model. We first show that an attacker can exploit the spherical parametrization of correlation matrices, imposing boundaries on the correlation coefficients, to make an informed guess. This means that using only the correlation between the input variables and the target variable, an attacker can infer the correlation between two input variables much better than a random guess baseline. We propose a second attack which exploits the access to a machine learning model using shadow modelling to refine the guess. Our attack uses Gaussian copula-based generative modelling to generate synthetic datasets with a wide variety of correlations in order to train a meta-model for the correlation inference task.
We evaluate our attack against Logistic Regression and Multi-layer perceptron models and show it to outperform the model-less attack. We find the MLP to be less vulnerable to our attacks. Our results show that the accuracy of the second, machine learning-based attack decreases with the number of variables and converges towards the accuracy of the model-less attack. However, correlations between input variables which are highly correlated with the target variable are more vulnerable regardless of the number of variables. Our work bridges the gap between what can be considered a global leakage about the training dataset and individual-level leakages. When coupled with marginal leakage attacks, it might also constitute a first step towards dataset reconstruction.

Index Terms-property inference attacks, machine learning privacy

Machine learning models are increasingly used by businesses, researchers and organizations to automate processes and decision making. The applications of machine learning are very broad, ranging from research [1], personalized recommendations [2], anomaly detection, spam detection, content moderation [3], [4], automated completion of emails [5], speech or object recognition [6], [7], insurance risk assessment [8], screening of job applicants [9], and predicting healthcare outcomes [10], to medical diagnosis [11]. The growth of machine learning has been driven by decreasing storage costs for data and by people's increasing use of smartphones and the Internet of things, generating a lot of data [12], [13]. In response to the growing impact of machine learning on society, policymakers around the world are now discussing bills to regulate its use [14]. While the performance of machine learning models benefits from large amounts of training data, these data can be very sensitive. Typical datasets used in real-world applications consist of user emails, patient records, voice recordings, images, fine-grained behavioural logs or interaction data. As models are being shared widely through APIs, either commercially or by researchers for reproducibility reasons, their release raises concerns about the potential leakage of sensitive information about the training data. The main release paradigms are white-box, where a model's weights are released, and black-box, where only access to model predictions is allowed. The seminal paper by Shokri et al. [15] showed ML models to be vulnerable to membership inference attacks. The work demonstrated how a model's outputs can be used to retrieve whether a record belongs to the training dataset. Since then, numerous studies have investigated the risk posed by the release of ML models to individual privacy through membership inference attacks [16]-[24], attribute inference attacks [17], [22], [25], [26], link stealing attacks [27] and data extraction [28], [29] or even reconstruction attacks [30]. Recent works on property inference attacks have further drawn attention to the leakage of global dataset properties by ML models [31]-[33]. The term "property" refers in a broad sense to aggregate information about the training dataset, e.g. the proportion of records pertaining to a category that is unrelated to the model's task. Existing work has focused exclusively on properties relating to one variable, with a recent paper showing the marginals of an input variable to be vulnerable even when it is uncorrelated with the target variable [33].
Our work takes research in property inference attacks one step further by studying the potential leakage of correlations between input variables. We develop what is, to our knowledge, the first dataset correlation inference attack, whose aim is to infer the correlations between input variables of a machine learning model. We argue that information that is not limited to a single variable, but describes how two or more variables interact with one another, may leak more information about the dataset, and even leak sensitive information relating to individuals in the dataset. We propose a first attack that does not require access to an ML model trained on the dataset and acts as a new, strong baseline for our second attack. This attack uses the knowledge of the correlations between each input variable and the target variable - knowledge which we argue is unlikely to be a secret - to predict the range of values for the correlation between the input variables. The vulnerability stems from fundamental properties of the linear correlation matrix between n variables. We show that this attack is able to correctly predict the correlation between two input variables (binned into "negative", "low" or "positive") 56% of the time, much better than a random guess (33%). We propose a second attack which exploits the access to a machine learning model trained on the dataset. Our attack is based on the shadow modelling technique [15], [31], as we train a meta-model to infer correlations based on information from the machine learning model. We develop a new methodology addressing the unique challenge of having to generate a variety of shadow datasets such that the correlations between each input variable and the target variable are equal to the attacker's knowledge. Existing PIA work studied simple binary properties for which it is straightforward to generate shadow datasets satisfying e.g. a specific proportion of records pertaining to women [32]. We here generate random correlation matrices under constraints and synthetic shadow datasets having these correlation matrices using Gaussian copulas. Our work also aims for a more realistic setting, using only synthetic data as side information, whereas most prior works assume access to an auxiliary dataset of real records. This attack works well, recovering the correlation coefficients of datasets of three columns with an average accuracy of 96% when access is given to the predictions of a logistic regression and 82% against those of a Multi-Layer Perceptron (MLP). As the number of variables increases to 7-8 variables, the results of the second attack converge to values similar to those of the first attack, achieving an accuracy of 57% for the logistic regression and 56% for the MLP, while the random guess baseline is 33%. Finally, our results show that some correlation values are more vulnerable to our attack than others, independently of the number of columns in the dataset. Our work bridges the gap between what constitutes a dataset vs individual-level leakage. As it shows the retrieval of correlations between the variables to be possible based on the outputs of machine learning models, it might also constitute - when coupled with information about the marginals - a first step toward dataset reconstruction.

In this section, we introduce the concepts and notation used in the paper. First, we describe the concept of a correlation matrix and an algorithm from the literature for generating a random correlation matrix.
Second, we describe the Gaussian copula-based generative model. Third, we describe shadow modelling, a common technique used in attacks against machine learning models.

We describe the mathematical properties satisfied by the matrix of correlations between n variables. We then present the spherical parametrization of correlation matrices [34] and an algorithm using it to generate random correlation matrices [35]. Notation. Let X_1, ..., X_n denote n real-valued random variables. We assume that they are of finite, non-zero variance σ(X_i)^2, i = 1, ..., n. We denote by C ∈ R^{n×n} their Pearson correlation matrix, where C_{ij} = ρ(X_i, X_j). As a reminder, the correlation between two random variables is defined as ρ(X, Y) = Cov(X, Y) / (σ(X) σ(Y)), where Cov(X, Y) denotes the covariance between random variables X and Y: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]. A correlation matrix satisfies the following properties: 1) All elements are valid correlations, i.e., real values between −1 and 1: −1 ≤ C_{ij} ≤ 1. 2) All diagonal entries are equal to 1, i.e., there is perfect correlation between a variable and itself: C_{ii} = 1. 3) The matrix is symmetric, i.e., the correlation between X_i and X_j is the same as the correlation between X_j and X_i: C_{ij} = C_{ji}. 4) The correlation matrix is positive semi-definite: x^T C x ≥ 0, ∀x ∈ R^n.

Pinheiro and Bates [34] introduced in 1996 the spherical parametrization of correlation matrices. The idea is to use the fact that C is positive semi-definite to write its Cholesky decomposition C = B B^T, where B is lower triangular. The matrix B can thus be expressed using spherical coordinates, as illustrated in Fig. 1. Written in closed form, the coefficients of B are equal to b_{1,1} = 1, b_{i,1} = cos θ_{i,1} for i > 1, b_{i,j} = cos θ_{i,j} ∏_{k=1}^{j−1} sin θ_{i,k} for 1 < j < i, and b_{i,i} = ∏_{k=1}^{i−1} sin θ_{i,k} for i > 1. The spherical parametrization of B allows one to describe a correlation matrix using only n(n−1)/2 parameters, namely the angles θ_{i,j}, 1 ≤ j < i ≤ n. Numpacharoen and Atsawarungruangkit [36] introduced an algorithm to generate random correlation matrices based on the spherical parametrization, building on prior work [35], [37]. The key insight is that the correlation coefficients c_{i,j} can be expressed as sums of products between cosines and sines of the angles θ_{i,j}, by developing the computation of c_{i,j} = (B B^T)_{i,j}. As a result, each c_{i,j} lies within a boundary determined by the angles θ_{p,q} for 1 ≤ p ≤ i and 1 ≤ q < j. This insight can be used to generate a valid correlation matrix by sampling the correlation coefficients one by one, uniformly within the boundaries derived from the values previously sampled. Algorithm 1 provides a high-level description of the procedure to generate a random correlation matrix using the boundaries of its coefficients [36]. The correlation coefficients are sampled, in order, from top to bottom and from left to right. The elements of the first column are initialized uniformly at random within their boundaries, namely the interval [−1, 1]: c_{i,1} ∼ U([−1, 1]), i = 2, ..., n (lines 4-5). When sampling the correlation c_{i,j} with i > 1, j ≥ 2, the previously sampled correlation coefficients restrict the values for the angles θ_{p,q}, for 1 ≤ p ≤ i and 1 ≤ q < j, allowing to derive the boundaries for c_{i,j}. For complete details about this procedure, we refer the reader to Algorithm 4 in the Appendix. To ensure that every correlation coefficient in the correlation matrix is equally distributed (i.e., that their CDFs are almost identical), the algorithm shuffles them at the end (lines 9-12).
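To make the procedure concrete, the following is a minimal NumPy sketch of this boundary-based sampling, equivalent to drawing each coefficient uniformly within the bounds implied by the spherical parametrization. The function name, the optional first_col argument (used further below to impose constraints on the first column) and the implementation details are ours and not the paper's; the sketch also omits the final shuffling step of Algorithm 1 (lines 9-12).

import numpy as np

def random_corr_matrix(n, first_col=None, rng=None):
    # Generate a random n x n correlation matrix C = B B^T with B lower
    # triangular, sampling each coefficient uniformly within its boundaries.
    rng = np.random.default_rng(rng)
    C = np.eye(n)
    B = np.zeros((n, n))
    B[0, 0] = 1.0
    if first_col is None:
        first_col = rng.uniform(-1.0, 1.0, size=n - 1)  # c_{i,1} ~ U([-1, 1])
    C[1:, 0] = C[0, 1:] = first_col
    B[1:, 0] = first_col
    for i in range(1, n):
        for j in range(1, i + 1):
            # norm of what remains of row i after its first j entries
            res_i = np.sqrt(max(1.0 - np.sum(B[i, :j] ** 2), 0.0))
            if j == i:
                B[i, i] = res_i  # diagonal entry closes the unit-norm row
                break
            res_j = B[j, j]      # last non-zero entry of the completed row j
            base = B[i, :j] @ B[j, :j]
            # boundaries for c_{i,j} given the already fixed coefficients
            c_ij = rng.uniform(base - res_i * res_j, base + res_i * res_j)
            C[i, j] = C[j, i] = c_ij
            B[i, j] = 0.0 if res_j == 0.0 else (c_ij - base) / res_j
    return C

# Example: a random 5 x 5 correlation matrix; its eigenvalues are non-negative.
# C = random_corr_matrix(5)
# print(np.linalg.eigvalsh(C))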
We describe the Gaussian copula generative model, which can be used to generate datasets of n variables with a variety of dependencies between the variables. Marginal distribution. Consider a random variable X taking values in R. We denote by marginal distribution its cumulative distribution function (CDF): F : R → [0, 1], F(x) = P(X ≤ x). One example is the marginal of a standard normal distribution, which we denote by Φ: Φ(x) = ∫_{−∞}^{x} (2π)^{−1/2} e^{−t²/2} dt. Gaussian multivariate distribution. We denote by N(0, Σ) the Gaussian multivariate distribution with mean 0 and covariance matrix Σ. Its CDF is equal to Φ_Σ(x_1, ..., x_n) = P(Z_1 ≤ x_1, ..., Z_n ≤ x_n), where (Z_1, ..., Z_n) ∼ N(0, Σ). Copulas. Copulas denote the set of multivariate cumulative distribution functions F_C : [0, 1]^n → [0, 1] over continuous random vectors (X_1, ..., X_n) such that the marginal of each variable satisfies F_i(x) = x, i.e., is uniformly distributed in the interval [0, 1]. Sklar's theorem [38], [39] states the fundamental result that for any random variables X_1, ..., X_n with continuous marginals F_1, ..., F_n, their joint probability distribution can be described in terms of the marginals and a copula F_C modeling the dependencies between the variables. To see why, let us consider a continuous random vector (X_1, ..., X_n) with CDF F and marginals F_i. Using the fact that the random variable U_i = F_i(X_i) is uniformly distributed in the interval [0, 1], it follows that the CDF of (U_1, ..., U_n) is equal to P(U_1 ≤ u_1, ..., U_n ≤ u_n) = P(X_1 ≤ F_1^{−1}(u_1), ..., X_n ≤ F_n^{−1}(u_n)). As a result, the copula of (X_1, ..., X_n) can be written as F_C(u_1, ..., u_n) = F(F_1^{−1}(u_1), ..., F_n^{−1}(u_n)). This result can be used to generate samples from F when both the copula and the marginals of the variables are known. A variety of copula-based generative models, including the Gaussian copulas which we use in this paper, assume that the copula belongs to a given restricted family. Gaussian copulas. Given an n-dimensional correlation matrix C, its Gaussian copula is defined as F_C(u_1, ..., u_n) = Φ_C(Φ^{−1}(u_1), ..., Φ^{−1}(u_n)), where Φ_C denotes the CDF of N(0, C). Algorithm 2 (SAMPLEFROMGAUSSIANCOPULAS) describes a procedure to generate a sample Y = (Y_1, ..., Y_n) from a distribution satisfying the following two properties: (1) the marginals of the distribution are F_1, ..., F_n and (2) its dependencies are given by the Gaussian copula F_C. Its inputs are n, the number of variables in the sample, C, the correlation matrix parametrizing the Gaussian copula, and F_1, ..., F_n, the marginals of the distribution to sample from; its output is a sample y = (y_1, ..., y_n) from a distribution with marginals F_1, ..., F_n and dependencies given by the Gaussian copula F_C. The procedure first draws Z following the law of the normal multivariate distribution N(0, C), then returns y_i = F_i^{−1}(Φ(z_i)), i = 1, ..., n. Note that for arbitrary marginals, the correlations between the variables Y_1, ..., Y_n are not necessarily equal to the correlations C. However, when the marginals are standard normals F_i = Φ, i = 1, ..., n, the correlations are the same. Some work exists quantifying the precise relationship between the correlations of Y and C for simple marginal distributions [40].
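The following is a minimal Python sketch of this sampling procedure (our own implementation mirroring the description of Algorithm 2, assuming NumPy and SciPy); the marginals are passed as their inverse CDFs.

import numpy as np
from scipy import stats

def sample_gaussian_copula(C, marginal_ppfs, size, rng=None):
    # Draw `size` samples whose dependence is given by the Gaussian copula of C
    # and whose i-th marginal is defined by its inverse CDF marginal_ppfs[i].
    rng = np.random.default_rng(rng)
    n = C.shape[0]
    z = rng.multivariate_normal(np.zeros(n), C, size=size)   # Z ~ N(0, C)
    u = stats.norm.cdf(z)                                     # map to [0, 1] via Phi
    return np.column_stack([marginal_ppfs[i](u[:, i]) for i in range(n)])

# With standard normal marginals, the sample correlations match C (up to noise):
# C = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, -0.3], [0.2, -0.3, 1.0]])
# data = sample_gaussian_copula(C, [stats.norm.ppf] * 3, size=1000)
# print(np.corrcoef(data, rowvar=False))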
Shadow modelling has been the main technique used to develop attacks against machine learning models [15], [31]. Fig. 2 illustrates the shadow modelling setup. This technique is applied in attacks where an attacker is trying to infer information about the training dataset D_target of a target model M_T. Examples of such information from prior work include various properties of the dataset or whether a given record is in the dataset. The attacker trains a meta-model A to recognise this information from features extracted from k shadow models M_1, ..., M_k having a similar architecture to the target model. The inference task of the meta-model is usually framed as a classification task with classes P_1, ..., P_B. Each shadow model M_j is trained on a shadow dataset D_j, for j = 1, ..., k. The shadow dataset is usually sampled from an auxiliary dataset of real records from a similar distribution to that of the target training dataset. To obtain a balanced training set for the meta-model, the shadow datasets are sampled to ensure that each class P_1, ..., P_B is equally represented among the shadow datasets. For instance, in binary property inference attacks (B = 2), half of the shadow datasets are sampled so that they satisfy property P_1 and the other half so that they satisfy P_2. The features extracted from each shadow model as well as the target model are expected to encode information relevant to the inference task. A vector F_i is extracted from each shadow model M_i, i = 1, ..., k, e.g. its weights, and labelled with the corresponding class as per the property of its training dataset, L_i ∈ {P_1, ..., P_B}. We denote by D_meta = {(F_1, L_1), ..., (F_k, L_k)} the features of all the shadow models, together with their class labels. The meta-model is trained on D_meta and the final prediction from the meta-model is retrieved for the features F_T of the target model.

Fig. 2. Classical shadow modeling architecture. We illustrate the shadow modeling architecture. In a first step, datasets satisfying properties P_1, ..., P_B are generated by the attacker. In a second step, the attacker uses the datasets to train the k shadow models. In a third step, the attacker extracts and labels all the features of the shadow models, to create another dataset D_meta. In a fourth step, the attacker uses D_meta to train the meta-model. Finally, the attacker can query the meta-model, using the features from the target model, to infer properties.

We propose a novel attack targeting the correlations between the input variables of the training dataset of a machine learning model, which we call a dataset correlation inference attack. We consider n random variables: the input variables X_1, ..., X_{n−1} and the target variable Y. We denote by D the joint distribution of the n variables, (X_1, ..., X_{n−1}, Y) ∼ D. We assume the task of predicting the target variable given the input variables to be of interest, such that a target machine learning model M_T would be trained for the task on a dataset sampled from this distribution, D_target ∼ D. We assume that a malicious agent, the attacker, has access to the following auxiliary information: (1) the linear correlations between the input variables and the target variable, ρ(X_i, Y), i = 1, ..., n − 1, and (2) the marginals of the variables, here denoted by ∀x, F_i(x) = Pr(X_i ≤ x), i = 1, ..., n − 1 and ∀y, F_n(y) = Pr(Y ≤ y). The attacker's goal is to infer the correlations between the input variables ρ(X_i, X_j), 1 ≤ i < j ≤ n − 1. Importantly and differently from existing work on property inference attacks, our attack does not assume that the attacker has access to an auxiliary dataset of real records. The information about correlations between the input variables is confidential and can be sensitive depending on the use case. To see why, assume that an attacker learns that in the training dataset of a hypothetical model to predict a person's risk from COVID-19 complications (the target variable Y), the input variables "age" and "has cancer" are highly correlated. For instance, the attacker learns that the Pearson correlation coefficient between the "age" and "has cancer" variables is equal to ρ(X_1, X_2) = 0.9, while in a dataset representative of the wider population ρ(X_1, X_2) = 0.1. The attacker also knows Alice's age and that her record is in the dataset. The attacker can combine these two pieces of information to learn that Alice is likely to have cancer.
We also argue that the retrieval of correlations between the variables, when coupled with information about the marginals, constitutes a first step toward dataset reconstruction. We consider both assumptions to be reasonable. A target machine learning model is trained to infer Y given X. A model trained for this task is expected to learn the dependencies between the input variables X and the target variable Y. An attacker having access to the model could probe different inputs to assess how changes in one of the variables X_i impact the output of the model. Similarly, it has been shown that an attacker could use the existing marginal leakage attack from the literature to retrieve the marginals of each variable [33]. In some cases, a data practitioner might even have disclosed marginal information, e.g. to report summary statistics in a paper or to comply with legal requirements relating to the representativity of the training dataset for the intended use case of the model [14].

We consider three attack scenarios, having different levels of attacker access to a target machine learning model M_T trained on a dataset sampled from the target distribution, D_target ∼ D. No access to the model. Under this attack scenario, the attacker does not have access to the target model M_T. This scenario might arise in situations where the model is not released, yet information about the marginals and the correlations between the input variables and the target variable is made available, e.g. as part of a scientific paper. Black-box access to the model. Under this attack scenario, the attacker has knowledge of (1) the model architecture and training details, e.g., the number of training epochs for a neural network, allowing the attacker to train a similar model from scratch, and (2) query access to the target model M_T, allowing the attacker to retrieve the output probabilities for each class, M_T(x), for inputs x. White-box access to the model. Under this attack scenario, the attacker has complete knowledge of the model. In particular, apart from the architecture and training details, the attacker has access to the weights of the model.

In this section, we present our methodology for the dataset correlation inference attack. First, we formalize in section IV-B the correlation inference task as a classification task with B classes. Second, we describe in section IV-C an approach for generating random correlation matrices under constraints that relies on the procedure described in section II-A. We are thus able to generate a variety of shadow correlation matrices C satisfying the constraints posed by the attacker's knowledge, C_{i,n} = ρ(X_i, Y) for each i = 1, ..., n − 1.
These matrices are used (1) to inform the attacker on the range of values attained empirically by the correlations between pairs of input variables C_{i,j}, 1 ≤ i < j ≤ n − 1 and (2) to generate shadow datasets using Gaussian copulas with a wide variety of correlations between the input variables, all the while satisfying the constraints posed by the attacker's knowledge. Third, we describe in section IV-D an attack that does not require access to a machine learning model trained on a dataset sampled from D in order to infer - better than random - the correlations between the input variables X_1, ..., X_{n−1}. This attack is informed by the range of values attained empirically by the shadow correlation matrices. Fourth, we describe in section IV-E an attack which exploits the access to the target model M_T in order to retrieve more information about the dataset. This attack is based on a novel approach to generate shadow datasets satisfying the attacker's knowledge about the correlations between the input variables and the target variable. Indeed, prior works on property inference attacks only focused on properties relating to the proportion of records in a certain category (e.g., men), in which case it is straightforward to sample shadow datasets satisfying the properties. We here generate k shadow correlation matrices C_1, ..., C_k under constraints relating to the knowledge available to the attacker. Then, we generate k shadow datasets using Gaussian copulas, each parametrized by the corresponding shadow correlation matrix. Finally, we use the shadow modelling technique to train a meta-model A for the correlation inference task. Note that, unlike prior works on property inference attacks, our approach does not require the attacker to have access to a dataset of real records from a distribution similar to the target dataset's distribution D, as the shadow datasets are purely synthetic.

We define the correlation inference task as a classification task and divide the range of correlations [−1, 1] into B bins of equal length. The B bins are formally defined as the intervals [−1 + 2(b−1)/B, −1 + 2b/B) for b = 1, ..., B − 1, together with [−1 + 2(B−1)/B, 1] for b = B. For instance, if B = 3, the bins are [−1, −1/3), [−1/3, 1/3) and [1/3, 1], which we refer to by writing negative, low and high correlation, respectively.

The goal of our approach is to generate correlation matrices subject to a set of constraints on the correlations ρ(X_i, Y) ∀ i ∈ {1, ..., n − 1}. If we were to naively use Alg. 4 with the desired constraints, we would obtain a correlation matrix C that has all constraints, in reverse order, on the first column. Additionally, the resulting matrix C would not have an equal distribution among all the unconstrained correlation coefficients. We have to solve those two challenges when using Alg. 4. To solve the first challenge, we sample a random permutation σ to shuffle the constraints before giving them to Alg. 4 (lines 4-5). By applying the inverse permutation to the output of Alg. 4, we make sure to shuffle all correlations of the form ρ(X_i, X_j), ∀ i, j ∈ {1, ..., n − 1}, while reordering in place the constrained ones (lines 12-14). To solve the second challenge, we use the permutation σ' which maps i to n + 1 − i, i.e., which reverses the order of the indices. We consider a matrix M written in block form as M = (M_1 M_2 ··· M_n), where each M_i represents a column of the matrix M. We refer to the reverse of a vector as the vector read in reverse order: if V = (v_1, ..., v_n), its reverse can be written as V^{−1} = (v_n, ..., v_1).
Applying the permutation σ' to the matrix M (applying σ' first to the columns and then to the rows of the matrix) results in a matrix of the form σ'(M) = (M_n^{−1} M_{n−1}^{−1} ··· M_1^{−1}). We observe that the last column of the matrix is now in first position. The matrix produced by Alg. 4 can be described as C = σ'(C̃), where C̃ is a valid correlation matrix. We call C_1 = (c_{1,1}, c_{σ(n−1),1}, ..., c_{σ(2),1}) the first column of C, which contains all the constrained values. Therefore, applying σ'^{−1} to C gives a matrix C̃ with C_1^{−1} = (c_{σ(2),1}, ..., c_{σ(n−1),1}, c_{1,1}) as its last column (lines 8-10). If we now apply the inverse of the constraint permutation σ to C̃ while keeping the last column in place, we obtain a valid correlation matrix C with the constraints on the last column, in the original order (lines 12-14). The inputs of Algorithm 3 are n, K and constraints = (ρ(X_1, Y), ..., ρ(X_{n−1}, Y)), the set of constraints to be enforced on the last column of the correlation matrix; its output is C ∈ R^{n×n}, a valid correlation matrix satisfying c_{i,n} = ρ(X_i, Y), for i = 1, ..., n − 1. The algorithm first draws a random permutation to shuffle the constraints (σ ← random permutation(n − 1); constraints ← σ(constraints)), fills the matrix via C̃ ← FILLCORRELATIONMATRIX(n, K, constraints), ensures that the constraints are applied to the last column (as, by default, they are applied to the first column), and finally shuffles back the input variables so that the constraints appear in the original order, keeping the target variable fixed.

We present a first dataset correlation inference attack exploiting the knowledge available to the attacker about the correlations between the input variables and the target variable in order to make an informed guess about the correlations between the input variables. We consider a pair of input variables, for instance X_1 and X_2, without loss of generality. The key idea of this attack is that having knowledge of ρ(X_1, Y) and ρ(X_2, Y) restricts the range of possible values for the unknown value ρ(X_1, X_2). To derive bounds on the values that can be attained by ρ(X_1, X_2), our approach generates a large number k of correlation matrices C_1, ..., C_k such that the correlations between the first n − 1 variables and the last one are equal to ρ(X_1, Y), ..., ρ(X_{n−1}, Y). The values generated for the correlation ρ(X_1, X_2) range from m_1 = min_{j=1,...,k} (C_j)_{1,2} to m_2 = max_{j=1,...,k} (C_j)_{1,2}. Given B, the number of target bins, we compute the bin that is most covered by the interval [m_1, m_2]. This approach does not attempt to model the distribution of correlations inside the interval [m_1, m_2]: it assumes it is uniform and returns the majority prediction under this assumption. More specifically, we distinguish between three cases: 1) If both m_1 and m_2 fall inside the same bin, our approach returns that bin. 2) If m_1 and m_2 fall inside different bins, our approach returns the bin that is most covered by the interval [m_1, m_2]. 3) If more than one bin is equally covered by the interval [m_1, m_2], our approach returns one of those bins as its guess, uniformly at random.
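Below is a minimal Python sketch of these two building blocks, reusing the random_corr_matrix helper from the earlier sketch: a simplified constrained matrix generator (which places the target variable first, uses the constraints as the free first column and then reindexes so the target ends up last; unlike Algorithm 3, it does not randomly shuffle the constraints to equalize the distribution of the unconstrained coefficients) and the boundary-based attack, implemented as "return the bin most covered by [m_1, m_2], breaking ties uniformly at random". Function names and the example values are ours.

import numpy as np

def constrained_corr_matrix(constraints, rng=None):
    # Valid correlation matrix whose last column equals the constraints rho(X_i, Y).
    rng = np.random.default_rng(rng)
    n = len(constraints) + 1
    C = random_corr_matrix(n, first_col=np.asarray(constraints), rng=rng)
    perm = list(range(1, n)) + [0]      # move the target from position 0 to position n-1
    return C[np.ix_(perm, perm)]

def boundary_attack(constraints, B=3, k=1500, rng=None):
    # Model-less attack: guess the bin of rho(X1, X2) from the empirical range
    # [m1, m2] attained by k shadow correlation matrices satisfying the constraints.
    rng = np.random.default_rng(rng)
    values = [constrained_corr_matrix(constraints, rng)[0, 1] for _ in range(k)]
    m1, m2 = min(values), max(values)
    edges = np.linspace(-1.0, 1.0, B + 1)
    # length of the overlap between [m1, m2] and each of the B bins
    overlap = np.clip(np.minimum(m2, edges[1:]) - np.maximum(m1, edges[:-1]), 0.0, None)
    best = np.flatnonzero(np.isclose(overlap, overlap.max()))
    return int(rng.choice(best))        # 0 = negative, 1 = low, 2 = high for B = 3

# Example with hypothetical constraints rho(X1, Y) = 0.8 and rho(X2, Y) = 0.7:
# print(boundary_attack([0.8, 0.7]))    # the "high" bin (2) covers most of [m1, m2]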
We present a second dataset correlation inference attack that exploits the access to a machine learning model trained on a dataset D_target ∼ D in order to refine the guess from the previous section IV-D. First, our approach generates k shadow correlation matrices with constraints on the last column equal to ρ(X_i, Y), i = 1, ..., n − 1, using Algorithm 3. Second, our approach generates k synthetic shadow datasets using Gaussian copulas as described in Algorithm 2. The inputs to the algorithm are the marginals of the target distribution D (which we assume to be known by the attacker) and a shadow correlation matrix. We use the shadow modelling technique to first train k shadow models with the same architecture as the target model on the shadow datasets, extract features from the shadow models and finally train the meta-model for the correlation inference task. Feature extraction. We consider three types of features that could be extracted from the target and shadow models. In the black-box attack scenario, the attacker extracts the output probabilities of the model for each class on an auxiliary dataset D_aux sampled from a shadow correlation matrix generated as described above. We henceforth call these features model predictions. Unlike other works on property inference attacks [33], our work uses purely synthetic data to query the target model. In the white-box attack scenario, the attacker extracts the weights of the model. In the particular case where the target architecture is a Multi-layer perceptron (MLP), we also use as features the weights of the model in a canonical form, as described by Ganju et al. [32]. Indeed, MLP architectures have the property that applying a permutation to the neurons in the hidden layers does not change the model predictions. By reordering the weights using one "canonical" permutation, the redundancy can thus be removed, improving the performance of a meta-model trained on these features. We implement the same approach and refer to these features as canonical model weights. Finally, we also combine the model weights with the model predictions and call these features model weights and predictions.
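The sketch below ties the pieces together for the black-box scenario with a logistic regression target, reusing sample_gaussian_copula and constrained_corr_matrix from the earlier sketches: it trains k shadow models on constrained synthetic datasets, extracts their prediction features on an auxiliary synthetic input set X_aux, and trains an MLP meta-model. It is a simplified illustration under the paper's assumptions (standard normal marginals, binary task Y > 0); it does not attempt to balance the correlation bins among the shadow datasets, and the function names are ours.

import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

def train_shadow_model(C, marginal_ppfs, size=1000, rng=None):
    # One shadow logistic regression trained on a synthetic dataset sampled from
    # the Gaussian copula parametrized by the shadow correlation matrix C.
    data = sample_gaussian_copula(C, marginal_ppfs, size, rng)
    X, y = data[:, :-1], (data[:, -1] > 0).astype(int)   # binary task: Y > 0
    return LogisticRegression(solver="liblinear").fit(X, y)

def prediction_features(model, X_aux):
    # Black-box features: the model's output probabilities on the auxiliary inputs.
    return model.predict_proba(X_aux)[:, 1]

def correlation_inference_attack(constraints, target_features, X_aux, B=3, k=1500, rng=None):
    # Shadow-modelling attack on rho(X1, X2): train k shadow models, then a
    # meta-model mapping prediction features to the correlation bin.
    rng = np.random.default_rng(rng)
    marginal_ppfs = [stats.norm.ppf] * (len(constraints) + 1)
    edges = np.linspace(-1.0, 1.0, B + 1)
    feats, labels = [], []
    for _ in range(k):
        C = constrained_corr_matrix(constraints, rng)
        shadow = train_shadow_model(C, marginal_ppfs, rng=rng)
        feats.append(prediction_features(shadow, X_aux))
        labels.append(np.digitize(C[0, 1], edges[1:-1]))  # bin of rho(X1, X2)
    meta = MLPClassifier(hidden_layer_sizes=(50, 20), learning_rate_init=0.005,
                         alpha=0.05, early_stopping=True, max_iter=500)
    meta.fit(np.array(feats), np.array(labels))
    return int(meta.predict([target_features])[0])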
The goal of our experiments is to evaluate the efficacy of the attacks proposed in section IV. First, we describe the data model we use to generate the target correlation matrices used in the evaluation. Second, we describe a grid evaluation setup allowing us to compare the boundary-based and machine learning-based attacks for all possible ranges of the correlation constraints ρ(X_i, Y), i = 1, ..., n − 1. Third, we describe an evaluation setup for the machine learning-based attack under more realistic settings, as we vary the number of variables and randomize the target correlations. Both evaluation setups can be applied to any features extracted from the target models. We generate synthetic datasets of n variables X_1, ..., X_{n−1}, Y using the Gaussian copula generative model. This model takes as input the marginals F_1, ..., F_n for each variable and the correlation matrix C of the variables. Throughout the experiments, we assume that the marginals are all standard normals, F_i = Φ, i = 1, ..., n, and that the correlation matrix is positive definite. We assume that the target model is a binary classification model predicting whether Y > 0 given (X_1, ..., X_{n−1}). The attacker's goal thus becomes the inference of the correlations between the pairs of input variables ρ(X_i, X_j) while knowing the correlations between the input and target variables in the original dataset, ρ(X_i, Y).

The goal of this evaluation is to understand how the correlations between the input variables and the target variable impact the accuracy of the attack. We use n = 3 variables and vary the constraints ρ(X_1, Y) and ρ(X_2, Y) between −1 and 1. More specifically, given the resolution parameter N > 0, we divide [−1, 1] into 2 × N equally sized segments. This leads to a grid G of 4N² equal cells of the form I_1 × I_2, where I_1 and I_2 are two such segments. We evaluate the performance of our attack for all ranges of constraints as follows. For each cell I_1 × I_2 ∈ G (the matrix and dataset generation procedure are illustrated on Fig. 3): 1) We sample k pairs of constraints, uniformly at random, (c̃^j_{1,3}, c̃^j_{2,3}) ∼ U(I_1 × I_2), j = 1, ..., k. 2) We sample k correlation matrices C_j with these constraints, c^j_{1,3} = c̃^j_{1,3} and c^j_{2,3} = c̃^j_{2,3}. We use Alg. 3 to generate the matrices. 3) From each matrix C_j we extract the correlation between the first and second variables, c^j_{1,2}, j = 1, ..., k. 4) We run the boundary-based attack by calculating the range of correlations between the input variables, m_1 = min_{j=1,...,k} c^j_{1,2} and m_2 = max_{j=1,...,k} c^j_{1,2}, and returning the majority prediction under the assumption that the correlations are uniformly distributed in [m_1, m_2]. 5) We run the machine learning-based attacks on models trained on datasets sampled using Gaussian copulas parametrized by the correlation matrices. We generate datasets of size S using Gaussian copulas as described in Algorithm 2. As our marginals are all equal to the standard normal distribution, the correlations of the target dataset are only slightly different from the input correlations C_j, 1 ≤ j ≤ k, due to the sampling error. We train a model M_j on each dataset. We extract the features (see section IV-E for details) and train a meta-model A. To extract the model prediction features, we further generate an auxiliary dataset D^j_aux using the same steps as for the other k datasets. We use 5-fold cross-validation and report the accuracy of the meta-model for the correlation inference task, averaged over the folds.

Fig. 3. For each cell inside the grid (we only show one for simplicity), we draw k times two constraints uniformly at random (one on each dimension of the cell), ρ(X_1, Y) and ρ(X_2, Y). In the second step, we use those correlation constraints to generate k random correlation matrices under constraints, using Alg. 3. In the third step, we make use of each generated valid correlation matrix under constraints by using Alg. 2 to generate our synthetic shadow dataset.

We perform a large-scale evaluation of our dataset correlation inference attacks by randomizing the target correlation matrices. We generate N_T target correlation matrices C_1, ..., C_{N_T} using Algorithm 1 (with no constraints). For each of these targets 1 ≤ t ≤ N_T, we generate k shadow correlation matrices C^1_t, ..., C^k_t with constraints on the last column derived from C_t, and datasets D^1_t, ..., D^k_t. We train a shadow model M^j_t for each 1 ≤ j ≤ k and 1 ≤ t ≤ N_T. For arbitrary numbers of variables, there are (n − 1) × (n − 2)/2 pairs of input variables for which we want to infer their correlations ρ(X_i, X_j). For each such pair (i, j), we train a meta-model A^{i,j}_t on features extracted from the shadow models M^1_t, ..., M^k_t. Finally, we use it to predict the correlation between the i-th and the j-th variables. We compute the accuracy as the fraction of targets t such that the correlation bin predicted by the meta-model A^{i,j}_t is equal to the correlation bin of the target, (C_t)_{i,j}. Target models. The target models we study in this paper are the Logistic Regression and the Multi-layer perceptron (MLP). When the target model is a logistic regression, we use the scikit-learn implementation, setting the solver to liblinear. We use the same architecture as the meta-model. When the target model is an MLP, we use our own implementation in Pytorch. The architecture consists of two hidden layers of sizes 20 and 10. The optimizer is Adam and the learning rate is equal to 0.05.
The model is trained on 90% of the shadow dataset for up to 100 epochs and we use early stopping after 5 epochs of non-improving accuracy on the remaining 10%. We also use an MLP architecture as the meta-model, having two hidden layers of sizes 50 and 20, respectively. The training details are the same as before, except that the learning rate is 0.005, we use L2 weight decay of 0.05 and early stopping after 10 epochs of non-improving accuracy.

In this section, we present the results of our dataset correlation inference attacks under the grid and randomized target evaluation setups. We run the grid evaluation using a resolution parameter of N = 100, resulting in a grid of 40,000 cells. We use k = 1,500 shadow models in each cell. The number of data samples in each shadow dataset is equal to S = 1,000. The size of the auxiliary dataset is equal to |D_aux| = 1,000. Fig. 4A and Fig. 5A show that it is possible for an attacker to infer the correlation between the input variables ρ(X_1, X_2) significantly better than random even when the attacker has no access to the model. Our boundary-based attack succeeds 56.0% of the time on average over all the cells. This is much better than the random baseline over 3 bins, which is at 33%. We can see that the extreme regions of the grid - corresponding to ranges of the constraints ρ(X_1, Y) and ρ(X_2, Y) close to 1 or −1 - are highly vulnerable to our attack, as the accuracy is close to perfect. This is likely due to the fact that when both constraints are close to 1 (in absolute value), the correlation between the input variables always belongs to the same bin. As the constraints decrease to values close to zero, the accuracy drops as well.
We believe that the low accuracy when the constraints are close to 0 across B-D but not A might be due to the fact that the target model is unlikely to learn to predict information about Y when based on uncorrelated input variables. Fig. 5B to E shows an MLP to be less vulnerable to our attack compared to the logistic regression. The accuracy (averaged over all the cells) is equal to 82.2% using the model predictions. The performance drops to 56.2% (almost the same as not having access to the model) when using the model weights. However, using the canonical ordering [32] improves the accuracy to 65.3%. Finally, combining the model predictions and the canonical weights does not improve the performance of the attack, achieving 82.1%. Overall, on the one hand, it is surprising that the MLP, having 441 parameters, i.e. two orders of magnitude more parameters compared to the logistic regression is less vulnerable, as we would have expected it to encode more information. On the other hand, as the parameters of the (non-convex) MLP might correspond to a local, rather than a global minimum of the loss function, there could be many other parameters achieving similar performance, potentially making it harder for the meta model to learn from these features. To summarize, our results show that the degree to which the correlation between two input variables X 1 and X 2 is vulnerable to our attack highly depends on the value of the other coefficients ρ(X 1 , Y ) and ρ(X 2 , Y ) as well as the target model architecture. We run the randomized target evaluation for a number of variables ranging from n = 3 to n = 10. We compute the accuracy of our dataset correlation inference attacks against N T = 1, 000 target correlation matrices. As before, we use k = 1, 500 shadow datasets and a dataset size of S = 1, 000. Fig. 6 shows the accuracy of our dataset correlation inference attacks for varying number of variables, averaged over all the pairs of input variables. We see that the accuracy under the no access to model assumption is roughly constant as we vary the number of variables n. This is expected as the boundarybased attack only uses the constraints regarding ρ(X i , Y ) and ρ(X j , Y ) and no information on the other variables in order to infer ρ(X i , X j ). We see that the accuracy of our machine learning-based attack against the logistic regression drops as the number of variables increases, from 96.2% (n = 3) to 60.0% (n = 6) and down to 50.0% (n = 10), when using the model predictions. When using the model weights, the accuracy drops as well but not as steeply, from 95.0% (n = 3) to 60.0% (n = 6) and 56.0% (n = 10). The early advantage of the model predictions compared to the model weights is lost for larger values of n. We see a similar trend when the target model is an MLP, as the accuracy of the attack drops from 81.9% (n = 3) to 62.4% (n = 6) and 55.5% (n = 10), when using the model predictions. The canonical model weights starts lower and the gap relatively to the standard model weights reduces quickly. Interestingly, while the MLP is less vulnerable to our attack than the logistic regression for n = 3 columns, the accuracy of the best attacks against each of them becomes very similar for n ≥ 6 (60.1% for LR and 62.4% for MLP). Finally, the performance of our machine learning-based attack converges towards the boundary attack-based performance as n increases. However, this is still 22-23% superior to the 33.3% random guess baseline. 
This suggests that the models might similarly learn information about the constraints ρ(X_i, Y), i = 1, ..., n − 1. Finally, we further study the pairs of inputs on which the attack is more likely to succeed, something we observe in the n = 3 case in Fig. 4 and Fig. 5. For each pair of input variables 1 ≤ i < j ≤ n − 1 and for each target correlation matrix, we thus compute the minimum of the constraints (in absolute value), min(|ρ(X_i, Y)|, |ρ(X_j, Y)|), as a lower estimate of the strength of the constraints between the input pair and the target variable. Fig. 7 confirms that in larger dimensions too, input variables that are more correlated with the target variable, i.e., for which min(|ρ(X_i, Y)|, |ρ(X_j, Y)|) is large, are more vulnerable to our attack. Note that across our 1,000 target correlation matrices, the highest bin only contains 122 samples among 36,000 (the total number of pairs of input variables across the 1,000 targets for n = 10). We conducted the same experiment while looking at the maximum of the pair, i.e., max(|ρ(X_i, Y)|, |ρ(X_j, Y)|), as an upper estimate of the strength of the constraints between the input pair and the target variable. Fig. 8 shows that when there is no correlation at all, i.e., X_i, X_j and Y are statistically independent (which, for Gaussian variables, is equivalent to being uncorrelated), the machine learning-based attack does not improve upon the random guess baseline in large dimensions. However, Fig. 8 also shows that when at least one variable is highly correlated with the target (i.e., either |ρ(X_i, Y)| or |ρ(X_j, Y)| is large), then correlations are vulnerable to our attack even in large dimensions. The proportion of pairs in highly correlated classes (max(|ρ(X_i, Y)|, |ρ(X_j, Y)|) ∈ [0.6, 1]) is 45.3% (16,327 out of 36,000). We refer the reader to Fig. 9 in the Appendix for similar results using the average of the constraints between the input variables, (|ρ(X_i, Y)| + |ρ(X_j, Y)|)/2.

The results we present in the paper rely on several assumptions, allowing the attacker to narrow the wide range of possibilities for synthetic dataset generation. First, we assume that the target distribution marginals are continuous and, in particular, standard normal variables. Second, we suppose that the correlations of a dataset sampled by the Gaussian copulas are the same as the correlation matrix parametrizing the copulas. While this is true for variables having standard normal marginals, previous work showed that this is not the case for arbitrary marginals [40]. Third, we suppose throughout our experiments that the attacker has access to the exact value of each correlation ρ(X_i, Y) ∀i. While we discuss in Section III why we consider those assumptions to be realistic enough, we leave relaxing them for future work, e.g. evaluating on target correlation matrices with arbitrary marginals, or correcting the correlations parametrizing the copulas. In this paper, we focus only on the range of the correlation, referred to as "negative", "low" or "positive". While increasing the number of classification bins or framing the correlation inference task as a regression rather than a classification task would allow for a more fine-grained prediction, we preferred this simple setup, allowing us to understand the information leakage, including in the boundary-based attack, and to support our claim that two strongly paired correlations (ρ(X_i, Y) and ρ(X_j, Y)) together leak the correlation between X_i and X_j.
We also believe the range of the correlations to be sensitive information even if the attacker does not learn the precise value of the correlation. In this work, we study a simple setup, where we predict the correlation between each pair of input variables independently. On the one hand, predicting them all together might improve the meta-model accuracy, as the boundaries for the coefficients - when considered together - could be restricted even more. On the other hand, the number of classes to predict increases exponentially with the number of pairs. We believe that the number of shadow datasets required to train the meta-model might have to increase exponentially to obtain similar accuracy. We leave the study of this question for future work. We focused here on two kinds of machine learning models: the logistic regression and the multi-layer perceptron. These models are common choices for learning tasks on tabular datasets such as the ones we considered in this paper. The results of our experiments show that the logistic regression model, surprisingly, leaks more information about the correlations of its training dataset than the MLP. Future work might aim to better understand why this is happening, as well as to study the vulnerability of other machine learning models to dataset correlation inference attacks. Finally, we argue that our work, which shows the retrieval of correlations between the variables to be possible based on the outputs of machine learning models, might constitute - when coupled with information about the marginals - a first step toward dataset reconstruction. For instance, demographic datasets can be well fitted by Gaussian copula distributions [41], parametrized by their marginals and a correlation matrix.

Fig. 6. Accuracy of the correlation inference attack for increasing number of variables. We predict the correlation between each pair of input variables independently from the other pairs. We report the mean and standard deviation over all pairs. The number of targets is 1,000 for each number of variables.

Fig. 7. Accuracy to predict the correlation ρ(X_i, X_j) as we vary the range for min(|ρ(X_i, Y)|, |ρ(X_j, Y)|). We show results for the machine learning-based attack using as features the model predictions (solid lines), for a varying number of variables n. The accuracy is computed over all the (n − 1)(n − 2)/2 pairs of variables. For comparison, we also show the results obtained using the boundary-based attack, under no access to the ML model (dotted lines).

VIII. RELATED WORK

Privacy attacks. Research on the threats posed by the release and sharing of machine learning models has mostly focused on the risks to the privacy of individuals in the training dataset. A variety of attacks have been proposed in the literature studying different threats posed by the release of machine learning models to individuals. Early works showed machine learning models to inherently learn side information during training [15], [42]. Prior work can be broadly categorized into membership inference attacks (MIA), where the adversary aims to infer whether a record belongs to the dataset [15], and attribute inference attacks (AIA), where the goal is to infer one or more sensitive attributes about a record that is partially known to the attacker [17], [22], [26].
The vulnerability of a model to MIA has been connected to overfitting [15], [16], [43] and has been shown to be a threat even in the black-box setting [19] or against generative models [44]. Finally, prior work has also looked into Reconstruction Attacks (RA) [45], [46]. The goal of an RA is to reconstruct a version of the confidential dataset, therefore recovering the records of all users, using the model and public aggregate information. These threat models have been studied in both classic and federated machine learning settings [18], [30], [47], [48]. Dataset property inference attacks. Recently, works have studied whether ML models also leak general information about the training dataset. The goal of such attacks - commonly called Property Inference Attacks (PIA) - is to infer whether the training dataset satisfies a given property, such as "X% of the images consist of attractive faces" or "the dataset contains a higher proportion of records from women" [32], [33].

Fig. 8. Accuracy to predict the correlation ρ(X_i, X_j) as we vary the range for max(|ρ(X_i, Y)|, |ρ(X_j, Y)|). We show results for the machine learning-based attack using as features the model predictions (solid lines), for a varying number of variables n. The accuracy is computed over all the (n − 1)(n − 2)/2 pairs of variables. For comparison, we also show the results obtained using the boundary-based attack, under no access to the ML model (dotted lines).

Existing works have studied the vulnerability of Logistic Regression [31], Multi-Layer Perceptron [32], and CNN [32], [49] models to PIA. Multi-Layer Perceptrons were shown to pose a technical challenge, due to their higher number of parameters, leading to an improvement of the attack [32]. By exploiting the invariance of the MLP to permutations of the neurons in the hidden layers, this work proposes, among other things, a canonical representation of the weights that helps reduce the difficulty of the property inference task. Following this work, a study has been conducted to see how the number of parameters affects PIA performance [49]. Using a PIA, an attacker tries to predict a binary property of the dataset. While the white-box setting has been commonly considered for PIAs, the black-box setting has recently been shown to also be vulnerable to PIAs [33]. However, we identify two serious limitations of existing work, which our work addresses. First, existing works all assume that an attacker would have access to an auxiliary dataset drawn from a similar distribution as the training dataset. We consider this assumption to be very strong and unrealistic in practice. We remove this strong assumption by using only synthetic datasets to train the meta-model, and argue that the side information needed to obtain useful synthetic data can be obtained by other means than having access to part of the original dataset. Second, previous works focused on predicting binary, non-mutually exclusive properties, such as whether the distribution for an attribute is 33:67 or 67:33 [32]. We believe this assumption to be very limiting in scope. Black-box attacks on dataset properties. Concurrent work by Zhang et al. [33] partially addressed some of the limitations of PIAs. This work proposes a different goal for the attacker, as the attacker's goal is to retrieve one property among more than two classes, which can ultimately lead to being able to find the marginal of the sensitive variable.
However, while they focused on the marginals, our work focuses on the correlations between the variables. We believe that information that is not limited to one variable, but describes how different variables interact with one another, may leak more information. Indeed, while marginals describe each variable of the dataset's distribution in isolation, correlations might reveal more about the data holder. Our work also aims for a more realistic setting, using only synthetic data as side information, whereas Zhang et al., like most prior works, assume the use of an auxiliary dataset distributed according to the target's dataset. We believe this assumption to be highly unrealistic, since a strongly protected dataset will not have auxiliary data available without this posing a privacy threat in itself. Synthetic data for shadow modeling. Shokri et al. [15] proposed an approach to create synthetic data using the target model's confidence. Intuitively, a model will be highly confident on records similar to those seen previously during training. They then developed a search-and-sample algorithm to produce synthetic data. This adds an additional constraint to the problem, as the attacker would have to explore the entire space of possible inputs in order to generate useful synthetic records. In our problem, this space is too large to allow such a procedure. Another approach could be to use the side information we know about datasets, as proposed in [15]. Again, this procedure cannot be applied in our case. Indeed, two datasets containing statistical similarities could be very different in terms of their correlations.

In this paper, we first introduced an attack pipeline to retrieve correlation coefficients from the training dataset of a machine learning model. We proposed first a boundary-based attack to retrieve the correlation coefficients and second, a machine learning-based attack. We showed that releasing a machine learning model trained on a dataset makes the correlation coefficients of this dataset vulnerable. For a number of variables inside the dataset equal to 3, we observe that a logistic regression's model predictions leak on average 95.6% of the correlation coefficients, while those of an MLP leak on average 82.2% of the correlation coefficients. As the number of columns increases, the performance of our machine learning-based attack converges towards that of the boundary-based attack. This work sets the first step towards dataset correlation inference attacks against machine learning models. While there are still many parameters of this paper that can be manipulated, our results show the correlation coefficients of the training dataset of a machine learning model to be vulnerable. We argue that the retrieval of correlations between the variables, when coupled with information about the marginals, constitutes a first step toward dataset reconstruction.

Fig. 9. Accuracy to predict the correlation ρ(X_i, X_j) as we vary the range for Avg(|ρ(X_i, Y)|, |ρ(X_j, Y)|). We show results for the machine learning-based attack using as features the model predictions (solid lines), for a varying number of variables n. The accuracy is computed over all the (n − 1)(n − 2)/2 pairs of variables. For comparison, we also show the results obtained using the boundary-based attack, under no access to the ML model (dotted lines).
REFERENCES

Highly accurate protein structure prediction with AlphaFold
Deep neural networks for YouTube recommendations
Moderating content
Expanded protections for children
Subject: Write emails faster with Smart Compose in Gmail
Hey Siri: An on-device DNN-powered voice trigger for Apple's personal assistant
Artificial Intelligence & Autopilot
Machine learning in insurance
Mitigating bias in algorithmic hiring: Evaluating claims and practices, ser. FAT* '20
High-performance medicine: the convergence of human and artificial intelligence
A deep learning ensemble approach for diabetic retinopathy detection
Science in an exponential world
Big data: The next frontier for innovation, competition, and productivity
Membership inference attacks against machine learning models
ML-Leaks: Model and data independent membership inference attacks and defenses on machine learning models
Privacy risk in machine learning: Analyzing the connection to overfitting
Exploiting unintended feature leakage in collaborative learning
Demystifying membership inference attacks in machine learning as a service
LOGAN: Membership inference attacks against generative models
On the privacy properties of GAN-generated samples
ML-Doctor: Holistic risk assessment of inference attacks against machine learning models
Membership inference attacks against recommender systems
Membership leakage in label-only exposures
Evaluation of inference attack models for deep learning on medical data
Black-box model inversion attribute inference attacks on classification models
Stealing links from graph neural networks
The secret sharer: Evaluating and testing unintended memorization in neural networks
Extracting training data from large language models
Updates-Leak: Data set inference and reconstruction attacks in online learning
Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers
Property inference attacks on fully connected neural networks using permutation invariant representations
Leakage of dataset properties in multi-party machine learning
Unconstrained parametrizations for variance-covariance matrices
The most general methodology to create a valid correlation matrix for risk management and option pricing purposes
Generating correlation matrices based on the boundaries of their coefficients
On the generation of correlation matrices
Fonctions de répartition à n dimensions et leurs marges
Random variables, joint distribution functions, and copulas
Calculating correlation coefficient for Gaussian copula
Estimating the success of re-identifications in incomplete datasets using generative models
Model inversion attacks that exploit confidence information and basic countermeasures
Demystifying the membership inference attack
LOGAN: Membership inference attacks against generative models
Reconstruction attack through classifier analysis
Review of artificial intelligence adversarial attack and defense technologies
Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning
Inverting gradients - how easy is it to break privacy in federated learning?
Property inference attacks on convolutional neural networks: Influence and implications of target model's complexity

APPENDIX

A. Algorithm to fill the coefficients of a correlation matrix

Alg. 4 details the procedure to fill the coefficients of a correlation matrix by Numpacharoen and Atsawarungruangkit [36].
In the original work, this procedure initialized the first column uniformly at random. As we adapt this procedure to generate random correlation matrices under constraints, we abstract away the initialization of the first column, and have it take as input a list of values that will be used to initialize the correlations in its first column. The algorithm then samples the remaining coefficients one by one within the boundaries derived from the previously fixed values, checking that each candidate value lies in [−1, 1], sets the remaining entries of a row to zero once the row is complete (b_{i,k} ← 0 for k ∈ {j + 1, ..., n}), and finally converts the lower triangular matrix C into a symmetric matrix having 1 on the diagonal via C ← C + C^T + I_n.

B. Accuracy to predict the correlation ρ(X_i, X_j)

Fig. 9 shows the accuracy of our machine learning-based attack (using model predictions as features) and the accuracy of the boundary-based attack as we vary the range for Avg(|ρ(X_i, Y)|, |ρ(X_j, Y)|).