key: cord-130143-cqkpi32z
authors: Tajan, Louis; Westhoff, Dirk
title: Approach for GDPR Compliant Detection of COVID-19 Infection Chains
date: 2020-07-16
journal: nan
DOI: nan
sha: 
doc_id: 130143
cord_uid: cqkpi32z

While the prospect of tracking mobile device users is widely discussed across European countries as a means to counteract the propagation of COVID-19, we propose a Bloom filter based construction that provides location privacy for users and prevents mass surveillance. We apply a solution based on the Bloom filter data structure that allows a third party, a government agency, to compute privacy-preserving set relations on a mobile telco's access logfile. By computing set relations, the government agency, given the knowledge of two identified persons, has an instrument that provides a (possible) infection chain from the initial to the final infected user, no matter where in the world they are located. The benefit of our approach is that intermediate, possibly infected users can be identified and subsequently contacted by the agency. With such an approach, we state that solely the identities of possibly infected users will be revealed and the location privacy of others will be preserved. To this extent, it meets General Data Protection Regulation (GDPR) requirements in this area.

Cases of COVID-19 have been reported in more than 190 countries, and its spread was characterized as a pandemic by the World Health Organization on 11.03.2020. One of its many side effects is that European democracies are being challenged. Indeed, several countries are collecting location-based data from their own citizens. A state of emergency for health reasons has been declared in countries such as Spain, Portugal, France and Switzerland. Such a situation empowers a government to take actions that it would normally not be allowed to take. For instance, in Milan, Italy, mobile network operators are providing information on users' traffic to public authorities. In Germany, how location-based information should be processed, and for which purposes, is among the most discussed issues. Efforts in Germany to detect infection chains with digital support are twofold. First, the Corona-Warn-App has been deployed and downloaded more than 15 million times in Germany (population of approx. 83 million). It is a tracking app based on Bluetooth in which the smartphone of an infected user subsequently informs all devices that have been in proximity (within beacon reception range at some point in the past). Such an approach is very vulnerable because it requires Bluetooth to be continuously activated. The recently published BIAS [1] and BlueBorne [2] families of attacks have shown that mobile devices with activated Bluetooth can easily be attacked remotely, including through remote code execution, e.g. CVE-2017-0781, CVE-2017-0782 or CVE-2017-14315, which are classified as a severe risk. Moreover, it has been pointed out that the harvesting of contacts via Bluetooth with a tracking app only works properly if the app runs continuously in the foreground, and that at least 60% of smartphone users need to download and continuously use it for it to have an impact on the identification of infection chains.

Second, telco operators would provide access logfiles of mobile network base stations to the Robert Koch Institute (RKI) to support inferring infection chains.

In contrast, the government of the Netherlands decided not to impose a general confinement, on the grounds that it is incompatible with individual freedom.

For these reasons, in the work at hand we propose a construction which combines efficiency in helping public authorities contain the spread of the virus with privacy for citizens. We therefore concentrate on providing a privacy-preserving solution for the second of the efforts currently under way in Germany. Our proposed solution makes use of our previous works [3], [4], which allow a non-trusted third party to privately compute operations and relations on sets using the Bloom filter data structure. Such a data structure allows one to represent a large set of elements in a simple array of bits, which can provide obfuscation and privacy for the set.

We recall that the two main objectives of the GDPR are, firstly, to enhance the protection of personal data when they are processed and, secondly, to empower the companies in charge of this processing. Even if this regulation does not apply to fields such as public health or national security [5], weaving the proposed Bloom filter based private protocols into the investigation of infection chains would limit government agencies to identifying solely users with a high probability of being infected, instead of performing a massive data analysis of all mobile users.

Several approaches from related work allow one to perform computations on pseudonymized, obfuscated or even encrypted data without the need to read them in the clear. We could list homomorphic encryption [6], [7] or multi-party computation [8], [9], which are the most investigated techniques. In [10], we applied our Bloom filter based construction to several use cases of post-mortem mobile device tracking. In our former work [4], we have shown that this alternative approach based on Bloom filters can be used to secure data while preserving the ability to perform relevant tests or computations on the private data. Bloom filters have been used in many different scenarios, as presented in [11]. For instance, Kerschbaum directly encrypts the Bloom filter with homomorphic encryption [7]. In [12], the authors applied the Bloom filter to key exchange mechanisms in wireless sensor network (WSN) environments, while in [13] the authors optimize sensor node broadcasting with the use of Bloom filters.

Regarding privacy-preserving location tracing solutions in the context of the spread of COVID-19, we mention the work of the PEPP-PT consortium [14]. This European team provides standards, technology, and services to countries and developers with the objective of helping to stop the spread of COVID-19.

A government agency, whose role is to reduce the spread of the SARS-CoV-2 virus in its country, knows different pairs of infected persons (A, B). Its objective is to identify all possible paths which link user A to user B, considering the case where the infection of user B is a consequence of user A's infection. By retrieving all possible paths (it could of course also turn out that no path exists and the infections of users A and B are unrelated), the agency can identify all users on these paths who may also be infected by the virus and try to contact them. Indeed, mobile device users who are close to the same mobile base station at the same time could potentially spread the virus if one of them is infected. To do so, the agency analyzes connection data provided by a telco company. The connection logs are collected at the base stations which provide network access to the users' mobile devices.

A. Parties Involved.

Four parties are involved in the scenario:
• Users: may be infected by the SARS-CoV-2 virus. They connect to the base stations to access the mobile network.
• Telco company: provides network access to the users via several base stations. It also provides log data from the network connections to government agencies.
• Base stations: are distributed over several countries, provide network access to the users' mobile devices and collect connection data.
• Government agency: aims to identify "infection chains" in order to contact the possibly infected users and counteract the COVID-19 pandemic.

For each base station j, the telco company first generates and initializes a fresh Bloom filter BFj, represented by an array of bits all set to 0. Any time a user connects to the mobile network using base station j, the following connection information is aggregated and added to BFj:

$(id_i, t^1_i, t^2_i)$

with $id_i$ the user's credentials and $t^1_i$ and $t^2_i$ respectively the starting and ending times of its connection to the access point. Such connection data should be considered sensitive with regard to the location privacy of the users. As will be presented, we consider a Bloom filter based approach which brings privacy to the stored data. Indeed, on the one hand the base stations use usernames to characterize the users, and on the other hand only the telco company can generate and access the connection information from the base stations.

C. Proximity Chain - Infection Chain

As a notation rule, we use ⟨ ⟩ to express proximity chains and [ ] for infection chains. A proximity chain consists of a list of users in which any two successive users have been at the same location at the same time. To establish a proximity chain, these times of contact must be ordered. In other words, in the proximity chain ⟨A, D, F, E, B⟩, the time at which users A and D have been at the same location must precede the time at which users D and F have been at the same location, and so on along the chain.

In addition to being a proximity chain, the list could also represent an infection chain. In this case, all the users composing the chain should have a probability of being infected Pr(Xi) greater than a certain threshold Tr. More concretely, an infection chain [A, X1, ..., Xn, B] is a proximity chain for which it holds that ∀ Xi : Pr(Xi) > Tr; otherwise it is solely a proximity chain.

Therefore, an infection chain [A, X1, ..., Xn, B] represents how the SARS-CoV-2 virus may have spread from an initially infected user A to a subsequently infected user B.

It may happen that one or several sub-chains of a proximity chain ⟨A, X1, ..., Xn, B⟩ are considered infection chains, e.g. [A, X1, ..., Xi] and/or [Xj, ..., B].

We consider the government agency as the principal threat to the application's users. As stated previously, even if the GDPR does not apply to public health and security matters, we aim to place limitations on government agencies. In particular, we would like agencies to be able to identify only users with a high probability of being infected, instead of performing a massive data analysis of all mobile users. As we will present in the following sections, a telco company colluding with the government would allow the agency to access the personal data of all users, and therefore we do not consider such a collusion.

Even though we assume no collusion, we should also point out that users do not trust the telco company. Indeed, they seek to limit the collection of personal data by mobile devices as much as possible.

We also consider that users do not trust any approach that requires Bluetooth to be kept on continuously, since multiple types of attacks could occur, for example remote code execution via the Bleedingbit vulnerabilities [15].

As recently proposed in [4], the Bloom filter construction allows one to privately represent sets of elements and at the same time enables performance-saving computation on them. Precisely because of this performance-saving privacy extension, we argue that our approach is also suitable for massive data sets such as mobile access logfiles. Next, we give background on Bloom filters and the relevant set relation and recall the basic functions of the protocol.

A Bloom filter is a data structure introduced by Burton Howard Bloom in 1970 [16]. It is used to represent a set of elements. With a Bloom filter representing a certain set, one can verify whether an element is a member of this set. Such a data structure consists of an array of m bits associated with k public hash functions. At first, all m bits are initialized to 0. To add an element to the Bloom filter, one computes the hash of this element with each of the k hash functions and then sets to 1 the bit at each position corresponding to a hash value. To test whether an element is included in the Bloom filter, one similarly computes the respective hash values of this element and verifies whether the respective bits are set to 1. If at least one of these bits is set to 0, then we know for sure that the tested element is not a member of the set represented by the Bloom filter (i.e. no false negative can happen when testing an element). On the contrary, with some (minor) probability, the test can return a false positive. Indeed, even if all the verified bits are set to 1, the tested element may not be part of the set represented by the Bloom filter.
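The add and test operations described above can be sketched in a few lines of Python. This is a minimal illustration only: the class name `BloomFilter`, the use of SHA-256 with an index prefix to simulate k public hash functions, and the example element are assumptions made for this sketch and are not taken from the paper.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with m bits and k public hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # all m bits start at 0

    def _positions(self, element: str) -> list[int]:
        # Simulate k hash functions by prefixing the element with the
        # function index (an illustrative choice, not from the paper).
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{element}".encode()).digest(), "big") % self.m
            for i in range(self.k)
        ]

    def add(self, element: str) -> None:
        for pos in self._positions(element):
            self.bits[pos] = 1

    def might_contain(self, element: str) -> bool:
        # False means "certainly absent"; True means "present or false positive".
        return all(self.bits[pos] for pos in self._positions(element))


bf = BloomFilter(m=1024, k=4)
bf.add("user17|08:00|09:30")  # hypothetical connection record
print(bf.might_contain("user17|08:00|09:30"))  # an inserted element is always found
```

A negative answer from `might_contain` is definitive, whereas a positive answer may be a false positive, which matches the asymmetry described in the paragraph above.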

Multiple types of operations can be performed on sets. For privacy reasons, it can be of interest to reveal solely the cardinality of the resulting set instead of its content. Therefore, we propose a solution on adapted Bloom filters (see III-C) that uses one kind of set relation, namely inclusiveness, defined as follows:

Definition 1 (Inclusiveness): Let A and B be finite sets. We consider A included in B, i.e. A ⊆ B, iff all elements of A are included in B: ∀a ∈ A : a ∈ B.

To guarantee full privacy of the sets' content along with their cardinality, we proposed in [4] to modify the Bloom filters approach in two aspects. Firstly, instead of using k public hash functions, we are using a unique HMAC function with k secret keys. Secondly, the exact value of k is kept secret and is privately and randomly generated within two publicly known boundaries. We specify the functions regarding the initialization phase and the inclusiveness protocol. 

Then we count the number of bits set to 1 in the resulting Bloom filter. If it is equal to m, we can conclude that A ⊆ B, provided no false positive occurred. Otherwise we get A ⊈ B with certainty.
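The keyed construction and the counting test can be illustrated with the following hedged sketch. The text at hand does not state which operation produces the "resulting Bloom filter"; the sketch assumes it is the bitwise OR of the complement of the filter of A with the filter of B, which has all m bits set exactly when every bit of A's filter is also set in B's filter. The function names `make_keys`, `build_filter` and `inc`, the key length, and the choice of HMAC-SHA256 are likewise assumptions.

```python
import hashlib
import hmac
import secrets


def make_keys(k_min: int, k_max: int) -> list[bytes]:
    """Draw a secret k in [k_min, k_max] and generate k secret HMAC keys."""
    k = k_min + secrets.randbelow(k_max - k_min + 1)
    return [secrets.token_bytes(32) for _ in range(k)]


def build_filter(elements: set[str], keys: list[bytes], m: int) -> list[int]:
    """Keyed Bloom filter: one HMAC-SHA256 evaluation per secret key."""
    bits = [0] * m
    for element in elements:
        for key in keys:
            digest = hmac.new(key, element.encode(), hashlib.sha256).digest()
            bits[int.from_bytes(digest, "big") % m] = 1
    return bits


def inc(bf_a: list[int], bf_b: list[int]) -> bool:
    """Assumed test: (NOT bf_a) OR bf_b has all m bits set to 1."""
    result = [(1 - a) | b for a, b in zip(bf_a, bf_b)]
    return sum(result) == len(result)


keys = make_keys(k_min=3, k_max=8)  # known to the filter creator only
bf_small = build_filter({"u1", "u2"}, keys, m=512)
bf_large = build_filter({"u1", "u2", "u3"}, keys, m=512)
print(inc(bf_small, bf_large))  # a true inclusion always passes the test
```

Because the party running `inc` only sees bit arrays and never the keys, it cannot build a valid filter for an element of its own choosing, which is the property the modified construction relies on.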

For an evaluation of the correctness and the security of this protocol, we refer the reader to [3]. It is shown that a proper selection of the parameters m and k, considering the number of elements to be inserted, limits the number of overlapping bits in the resulting Bloom filter and enables the third party to correctly conclude on the inclusiveness property of the two sets. Indeed, too large a number of overlapping bits in the resulting Bloom filter would lead to a case of false negative.
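To give a feeling for how m and k interact with the number n of inserted elements, the sketch below evaluates the classical approximations for a standard Bloom filter: the false positive rate (1 - e^(-kn/m))^k and the expected fraction of bits set to 1. These formulas apply to the plain construction and are used here only as an assumption-laden proxy; the exact analysis of the keyed variant is the one given in [3]. The parameter values are hypothetical.

```python
import math


def false_positive_rate(m: int, k: int, n: int) -> float:
    """Classical approximation (1 - e^(-k*n/m))^k for a standard Bloom filter."""
    return (1.0 - math.exp(-k * n / m)) ** k


def expected_fill_ratio(m: int, k: int, n: int) -> float:
    """Expected fraction of bits set to 1 after inserting n elements."""
    return 1.0 - (1.0 - 1.0 / m) ** (k * n)


# Hypothetical sizes: the same 1000 elements in a small and a large filter.
for m in (10_000, 100_000):
    print(m, false_positive_rate(m, k=5, n=1_000), expected_fill_ratio(m, k=5, n=1_000))
```

A larger m for the same n and k lowers both quantities, which is the sense in which a proper parameter choice limits overlapping bits.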

From any two given infected users A and B, the government agency first aims to identify all the proximity chains ⟨A, idX1, ..., idXn, B⟩. In our protocol, we recall that the telco company provides all the relevant Bloom filters to the government agency. We propose to distinguish three cases:

We remark here that we know users A and B but we know neither user X nor his access credential idX, so the government agency has to search in all base stations for all Xj for which the above two inclusiveness tests INC hold. If Pr(idX) > Tr we can denote [A, idX, B].

• CASE 3: the general case ⟨A, idX1, ..., idXn, B⟩:


Our solution consists of having the government agency build a tree data structure representing all the proximity chains starting from user A. From this tree, the agency can easily identify the proximity chains from user A to user B. In the next step of the protocol, the government agency has to evaluate each chain to determine how plausible it is that it is actually an infection chain. We give the outline of this step but not its evaluation function, which we leave to epidemiologists.

We emphasize that at this point, the proximity or infection chains will only reveal usernames of users X1, . . . , Xn and not their real identities. At the very end of the protocol, the government agency will request from the telco company the identities of the intermediate infected users.

To obtain a proximity tree, the government agency starts by creating an empty tree T with user A as root. Then, it runs the recursive algorithm prox_tree(A, A, B, t) presented in Algorithm 1, with t the time from which user A could have started the infection process. The recursive algorithm works as follows. First, it generates the list BSN of base stations that the current node N has been connected to at a time later than t. To test whether a user N has been connected to a base station j (i.e. to test whether $(id_N, t^1_j, t^2_j) \in BF_j$), the government agency receives from the telco company all the Bloom filters composed of each of the 3-tuples $(id_N, t^1_j, t^2_j)$. Then, the government agency performs the inclusiveness test between each received Bloom filter and BFj, the Bloom filter corresponding to the connection logfile of BSj, as INC(BFN,j, BFj). The next step of the algorithm consists of identifying all the users that visited the base stations in BSN at the same time as user N. As before, the telco company generates Bloom filters with the 3-tuples $(id_l, t^1_l, t^2_l)$ for all users l and all time ranges $[t^1_l; t^2_l]$ that overlap the connection time of user N. To determine which users should be listed, the government agency applies the inclusiveness operator between these Bloom filters and BFN, the one composed of the elements from BSN. Finally, every identified user is added to the proximity tree T as a leaf of the current node N, and Algorithm 1 is then recursively run on the leaves.

An additional aspect to take into account while recursively running the algorithm is to consider the upper nodes of the current node in the proximity tree. Indeed, we would like to avoid creating loops in the tree, which are irrelevant when dealing with infection problems; if user A infected user C, it makes no sense to consider user C infecting user A within a short period of time. The algorithm should therefore exclude all the users which are already inserted as upper nodes in the tree. Regarding the tree construction, if we consider that user C has been in proximity of user A and idC is added as a leaf of root A, user A should no longer be considered as a potential leaf of node idC, and so on.
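The recursion, the time ordering of contacts and the exclusion of upper nodes can be sketched as follows. This is a simplified illustration and not Algorithm 1 itself: it works directly on plaintext log entries instead of Bloom filters and INC tests, it returns the chains rather than an explicit tree, and the names `Entry` and `prox_chains` as well as the log data are hypothetical (they are not the toy example of Figure 1).

```python
from typing import NamedTuple


class Entry(NamedTuple):
    station: str
    user: str
    start: int
    end: int


def prox_chains(logs: list[Entry], a: str, b: str, t: int) -> list[list[str]]:
    """Return all proximity chains from a to b whose contacts are ordered in time."""
    chains: list[list[str]] = []

    def visit(node: str, since: int, path: list[str]) -> None:
        for mine in logs:
            if mine.user != node or mine.end < since:
                continue  # not a connection of the current node after `since`
            for other in logs:
                if other.station != mine.station or other.user in path:
                    continue  # different cell, or user already an upper node
                contact = max(mine.start, other.start, since)
                if contact > min(mine.end, other.end):
                    continue  # the two connections do not overlap after `since`
                if other.user == b:
                    chains.append(path + [b])  # target reached: stop exploring here
                else:
                    visit(other.user, contact, path + [other.user])

    visit(a, t, [a])
    return chains


# Hypothetical logfiles of three base stations, times as integers.
logs = [
    Entry("BS1", "A", 0, 3), Entry("BS1", "C", 2, 5), Entry("BS1", "D", 10, 12),
    Entry("BS2", "C", 6, 9), Entry("BS2", "B", 8, 10),
    Entry("BS3", "A", 1, 2), Entry("BS3", "B", 20, 22),
]
print(prox_chains(logs, "A", "B", 0))  # [['A', 'C', 'B']]
```

In this sketch A and B share base station BS3 but at disjoint times, so the only chain found passes through C, whose contact with B (time 8) follows its contact with A (time 2).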

In Figure 1 we give a toy example of our recursive algorithm with seven users A, B, C, D, E, F, G, three base stations BSj1, BSj2, BSj3 and times as integers in [0; 24]. We show the content of the connection logfiles from the three base stations and the proximity tree from user A to user B that has been generated by computing prox_tree(A, A, B, 0). We observe in Figure 1 that two users might be in contact around different base stations.

Algorithm 1: prox_tree(N, A, B, a)

a) Algorithm optimization: With respect to performance, one could consider running the algorithm the opposite way, namely with input B as root. To do so, the algorithm should be modified so that time is considered backwards. It starts at the ending time (24 in our toy example) and we build the proximity tree by going back in time. We refer to this reverse recursive algorithm as reverse_prox_tree.

In Figure 2 we show the proximity tree obtained after computing reverse_prox_tree(B, A, B, 24) from user B, considering time backwards. As expected, the resulting proximity chains are the same as in Figure 1, but we remark that the resulting tree is smaller than the one obtained in Figure 1. In this specific toy example, we notice that obtaining the proximity tree was faster with the reversed algorithm.

Figure 2: Example of a proximity tree obtained from reverse_prox_tree(B, A, B, 24) with the same toy example.

Another aspect we could consider while comparing the two resulting trees is that the order in which the tree is built, and in which the proximity chains are obtained, is also reversed. Indeed, in Figure 1 we first obtain ⟨A, C, G, B⟩, then ⟨A, G, B⟩ (via j1), ⟨A, G, B⟩ (via j3) and finally ⟨A, B⟩. In Figure 2 we see that we obtain the chains in the exact opposite order with reverse_prox_tree. Still aiming to optimize the computation time of our algorithm, in particular when dealing with large numbers of users and base stations, one could simultaneously start the tree generation using the algorithm and its reversed version. In both cases the tree propagates, and every time we find a proximity chain in the tree (meaning N = B, or N = A for reverse_prox_tree) we store the chain in a set S (or S′ for reverse_prox_tree). Then, in each round (i.e. each iteration of the for loop), we test whether the two sets have a common element. If not, we continue. If they have a proximity chain in common, we can stop both algorithms, and the complete set of proximity chains from user A to user B is the union of the sets S and S′.

To illustrate the approach of computing both versions at the same time and, as argued, gain on performance, one could explain:

• If you throw one stone into the water and you want the resulting waves to reach a point at a distance of r meters, then the final circle will encompass many square meters.
• If you throw two stones into the water (one at the original position, the other at the position you want to reach), the resulting waves will meet at a distance of approx. r/2 meters.
• The sum of the areas of these two circles is much smaller than the area of the circle obtained with one stone. For example, with A = π × r^2, r = 10 gives A ≈ 314.159, whereas with r = 5 the two circles together cover approximately 157.08.

Another level of optimization could be considered in order to identify some of the proximity chains faster, for instance to support the immediate start of a localized quarantine. Instead of storing the chains in S and S′, at each propagation round we look at the chains while they are processed, so that we stop both algorithms when:

• prox_tree has built a path ⟨A, X1, ..., Xi⟩
• reverse_prox_tree has built a path ⟨B, Xn, ..., Xj⟩
• and it holds that Xi == Xj

Then the two parts of the proximity chain can be concatenated to create the proximity chain ⟨A, X1, ..., Xi == Xj, ..., Xn, B⟩.
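The concatenation step can be sketched as a small helper. The function name `join_partial_chains` and the list representation of the partial paths are assumptions for illustration; the sketch only checks the meeting condition Xi == Xj and does not re-verify that the contact times on both sides of the meeting node are ordered, which a complete implementation would have to ensure.

```python
from typing import Optional


def join_partial_chains(forward: list[str], backward: list[str]) -> Optional[list[str]]:
    """Join a forward path [A, ..., Xi] with a backward path [B, ..., Xj] if Xi == Xj."""
    if forward[-1] != backward[-1]:
        return None  # the two partial paths have not met yet
    # Keep the shared node once and reverse the backward part so it ends in B.
    return forward + backward[-2::-1]


print(join_partial_chains(["A", "C"], ["B", "G", "C"]))  # ['A', 'C', 'G', 'B']
print(join_partial_chains(["A", "C"], ["B", "G"]))       # None
```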

We refer to Table I to see that if we run both algorithms at the same time in the toy example configuration, we retrieve the proximity chain ⟨A, C, G, B⟩ faster with this second level of optimization.

In Table I we can observe in detail how we retrieve the proximity chains using the two versions of Algorithm 1 and the optimization with the toy example's configuration. As stated previously, reverse_prox_tree(B, A, B, 24) executed much faster than prox_tree(A, A, B, 0). Indeed, the original algorithm ended after 18 rounds while the reverse one stopped after the 9th round. Since it is not possible to predict which of the two will finish first, computing both in parallel optimizes the retrieval. As for the second level of optimization, concatenating two parts of proximity chains allows us to retrieve ⟨A, C, G, B⟩ at round 2, whereas it is discovered at round 6 with prox_tree and at round 9 with reverse_prox_tree. This is of value especially when proximity chains are composed of a high number of intermediate users.

The performance gain obtained with our two levels of optimization is understated due to the very small size of the logfiles in our toy example. One can easily imagine that, applied to a real-life scenario and big data, these optimizations save considerable computation. For example, in another scenario dealing with mobile connection logfiles [10], the authors propose to process such logfiles, and therefore Bloom filters, of up to 10^6 elements.

b) Algorithm decentralization: The European PEPP-PT consortium advocates a decentralized approach, as do the DP3T protocol [17], which relies on Bluetooth, and [18], where decentralization has been investigated. With our presented optimization, we could integrate such a construction by introducing two additional parties besides the ones already presented. We specify that these two additional parties should be extremely powerful in terms of computation and able to perform parallel computing, such as server farms or clusters:

• Computing party 1, which runs prox_tree
• Computing party 2, which runs reverse_prox_tree

This way, in each round the agency only receives the values Xi (from computing party 1) and Xj (from computing party 2) and checks whether Xi == Xj. Only if Xi == Xj does computing party 1 send ⟨A, X1, ..., Xi⟩ and computing party 2 send ⟨Xj, ..., Xn, B⟩ to the agency. With such a construction, multiple parties are involved in the computation and the whole effort does not rest on the government agency.

c) Algorithm complexity: One can easily see by analyzing the results in Figures 1 and 2 that the size of the resulting tree depends on the size of the base stations' logfiles. These logfiles naturally depend on the number of users, and thus of connections, during the particular time. The more base stations and users there are, the more numerous and the fuller the logfiles will be. In our toy example, we have 11 connection entries over all base stations combined, as displayed in Figure 1. They result in trees with 19 and 10 nodes when computing prox_tree and reverse_prox_tree, respectively. We also recall that if we find the final user of the wanted infection chain (user B in our example) in the tree, the algorithm reaches a break instruction and therefore the respective sub-tree is no longer explored. A high activity of this particular user could then reduce the spread of the tree. As seen previously, one of the two algorithms will execute faster, without it being possible to predict which one, and applying the presented optimization could reduce the complexity to that of the faster one.

From all the proximity chains ⟨A, idX1, ..., idXn, B⟩ obtained by performing the aforementioned protocol, the government agency should determine whether users Xi might also be infected. To do so, the agency could estimate each user's probability of being infected and compare it to a threshold (i.e. Pr(Xi) > Tr). Such a probability obviously depends, among other things, on the respective neighbors within the chain. We consider the probability value to be computed by a function infection(previous node, contact time, contact distance, reproduction number, saturation), where saturation denotes the percentage of infected persons within the human population of a region, which obviously changes over time.

More precisely, in Germany the reproduction number R, which is defined as the mean number of people infected by a case, was 3 at the beginning of the COVID-19 crisis and had been reduced to 0.7 by 17.04.2020 (and meanwhile R = 1.1). Clearly this number is only an average, but it still indicates that the inference from a proximity chain to an infection chain very much depends on the concrete time and location entities met during the pandemic wave. Similar numbers also exist for other countries, for instance R = 0.8 for Belgium on 17.04.2020. Another important observation is that since a proximity chain can easily build up over a period of weeks, Pr(Xi) may vary significantly. But only if all probabilities are larger than Tr can the agency at least argue that it has identified a possible infection chain.

It goes without saying that determining the infection function is out of scope. On the one hand, specialists emphasize the high contagiousness of the virus, but on the other hand, two users connecting to the same base station at the same time does not necessarily imply any physical contact between the two.

Without being able to determine the exact probability of a user to be infected by another one, we could propose a model to evaluate the probability of a proximity chain becoming an infection chain. First, we know that users A and B are infected and we would like to determine if user B has been infected due to user A or via another chain and other infection events. Therefore, applying probability theory to such a problem is relevant and reflects the chain characteristic of it.

We define Pr(Xi) as the conditional probability $P(X_i \mid X_{i-1})$ of the event "Xi−1 has infected Xi knowing that Xi−1 is already infected". It holds that $Pr(X_1 \cap \dots \cap X_n) = \prod_{i=1}^{n} Pr(X_i)$.
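The product rule and the threshold criterion for infection chains can be checked numerically with a short sketch. The function names and the probability values below are hypothetical and serve only to illustrate the two definitions; they are not the evaluation model of the paper.

```python
import math


def chain_probability(step_probs: list[float]) -> float:
    """Product of the conditional probabilities Pr(X_i) = P(X_i | X_{i-1})."""
    return math.prod(step_probs)


def is_infection_chain(step_probs: list[float], threshold: float) -> bool:
    """A proximity chain qualifies only if every Pr(X_i) exceeds the threshold Tr."""
    return all(p > threshold for p in step_probs)


probs = [0.5, 0.4, 0.5]  # hypothetical values, not taken from the paper
print(chain_probability(probs))         # 0.1 up to floating-point rounding
print(is_infection_chain(probs, 0.3))   # True
print(is_infection_chain(probs, 0.45))  # False
```

Since every factor is at most 1, each additional intermediate user can only keep or lower the overall probability of the chain.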

Considering a proximity chain ⟨A, X1, ..., Xn, B⟩, there is a clear tendency for the overall probability that user B was infected due to user A to be inversely proportional to the length of the proximity chain. We propose the following probability model for evaluating a proximity chain:

The proximity tree obtained at the previous stage of the protocol contains nodes with users' credentials, and only these usernames are revealed. Only if a proximity chain turns out to be an infection chain will the agency request from the telco company the real identities of the users composing the chain. Therefore, users' identities are revealed solely if the outcome of the infection function so indicates. Moreover, we recall that during the overall process no additional location information of other users listed in the mobile operator's logfile is revealed to the agency. Another way to tune prox_tree and make the overall computation more scalable could be, during the computation of prox_tree and reverse_prox_tree, to follow paths in the proximity tree only as long as they still fit the criterion to also be an infection chain. It could consist of placing the test from equation (2) at line 20 of Algorithm 1, with a break instruction in case the test is not fulfilled.

One may notice that a trivial optimization would be to switch users A and B, in the sense that "the infection of user A comes from user B". In Figure 3 we show the proximity tree obtained from our algorithm by computing prox_tree(B, B, A, 0) with our toy example logfiles. We notice that it results in a very different tree from the one in Figure 1 obtained by prox_tree(A, A, B, 0). If the government agency holds some information on the infection times of users A and B, for example that user A was infected before user B, only one direction should be considered by the agency. To be as efficient as possible, the government agency should perform a final step in the protocol. All the users identified as infected at the previous stage (i.e. all Xi where Pr(Xi) > Tr) should be considered as new users A and B, respectively, in the proposed solution. Indeed, our protocol is initiated with user tuples (A, B) already identified as infected by the agency. The freshly identified users thus extend the list of known infected persons, and the protocol should be applied to them to optimize the search. In this way, as many infected users as possible can be identified and contacted.

We argue that the proposed solution provides privacy for the users by three different means. Firstly, by using only personal credentials as usernames, and secondly, thanks to the Bloom filter construction and its obfuscation feature. Indeed, as explained previously, the real identities of users are neither provided nor stored in the Bloom filters or the logfiles. The telco company uses usernames to distinguish users, and the private mapping is provided to the government agency solely on demand, when a user is identified as being part of an infection chain.

The second aspect of location privacy is given by the Bloom filter based approach from [4], which allows relations among logfiles to be computed while keeping these data sets private. We recall that such an approach uses an HMAC function instead of a set of public hash functions, and therefore only the telco company can create the Bloom filters. To this extent, the government agency cannot try to retrieve the locations of a specific user by generating a Bloom filter with a single element and performing the inclusiveness relation between this Bloom filter and the ones from the base stations. For that reason, using secret keys to generate a valid Bloom filter enhances the privacy of the protocol. Finally, we recall that the secret keys are generated and stored only on the telco company's side and are not required by the government agency to perform our protocol.

The third aspect of location privacy consists of ensuring that no party other than the provider itself (which has this information anyhow) gets the location data of the users. This can easily be done by not revealing which BFi comes from which BSi. This way, the only information revealed to the authority is the contact information of users having entered the same cell during the same time interval.

Providing the concrete location information of this cell is totally irrelevant for the authority to compute the proximity resp. infection chain.

In this work we proposed to use the Bloom filter approach from [4] for a real-life use case, similarly to [10], where we applied it to a post-mortem mobile device tracking scenario. Our detailed protocol supports a government agency in tracking possible COVID-19 infection chains and therefore in identifying plausibly infected mobile users. Throughout the entire protocol, the agency only handles usernames, which do not allow the users' identities to be retrieved, and therefore their privacy is preserved. Solely in the case of a possible infection by the life-threatening SARS-CoV-2 virus will real identities be revealed to the agency, which will then be able to contact the users and provide medical support. In this way, the telco companies act in a GDPR compliant manner and can still guarantee a certain level of location privacy to their clients. We stress that if 'in proximity' data stem from the mobile telco's logfile, it means that two devices have been in the transmission range of the same base station. In the worst case they can still be at a distance of 2×r (easily 500 m or more). However, if the same approach is applied to data collected with the RSSI based Swarm-mapping approach for Android or iOS, then 'in proximity' has a much better accuracy [19]. In particular, the WiFiLocationHarvest file of each mobile device contains timestamp, latitude, longitude, trip-id, speed and course at an accuracy which comes close to that required to check whether two devices got nearer than 2 m (infection distance). Moreover, compared to the promoted app based approach with Bluetooth from Germany's Fraunhofer Institutes and others, in the RSSI based approach the mobile's WLAN and Bluetooth can be off, and yet, simply due to the RSSI measured from the access point, the approach provides the location data of devices equipped with such modern mobile operating systems.

To conclude, our approach may be a good starting point for debating a reasonable GDPR compliant detection of COVID-19 infection chains, since we argue it does not leak additional private information to parties other than those who already have knowledge of our location data.

BIAS: Bluetooth Impersonation AttackS

Bluetooth application-layer packet-filtering for blueborne attack defending

Private set relations with bloom filters for outsourced SLA validation

Solving set relations with secure bloom filters keeping cardinality private

The european union general data protection regulation: What it is and what it means

Efficient private matching and set intersection

Outsourced private set intersection using homomorphic encryption

Privacy-preserving set operations

An efficient bloom filter based solution for multiparty private matching

Retrospective tracking of suspects in gdpr conform mobile access networks datasets

Optimizing bloom filter: Challenges, solutions, and comparisons

Secure and efficient authenticated key exchange mechanism for wireless sensor networks and internet of things using bloom filter

Bloom filter based data collection algorithm for wireless sensor networks

Privacy-Preserving Proximity Tracing

Bleeding bit - exposes enterprises access points and unmanaged devices to undetectable chip level attack

Space/time trade-offs in hash coding with allowable errors

Centralized or decentralized? the contact tracing dilemma

Standortlokalisierung in modernen smartphones - grundlagen und aktuelle entwicklungen