key: cord-0156766-icz8y5ew authors: Lucchini, Lorenzo; Centellegher, Simone; Pappalardo, Luca; Gallotti, Riccardo; Privitera, Filippo; Lepri, Bruno; Nadai, Marco De title: Living in a pandemic: adaptation of individual mobility and social activity in the US date: 2021-07-26 journal: nan DOI: nan sha: 230ebb45f4e54574a7e2895330f3f9ea62e45fce doc_id: 156766 cord_uid: icz8y5ew The non-pharmaceutical interventions (NPIs), aimed at reducing the diffusion of the COVID-19 pandemic, have dramatically influenced our behaviour in everyday life. In this work, we study how individuals adapted their daily movements and person-to-person contact patterns over time in response to the COVID-19 pandemic and the NPIs. We leverage longitudinal GPS mobility data of hundreds of thousands of anonymous individuals in four US states and empirically show the dramatic disruption in people's lives. We find that local interventions did not just impact the number of visits to different venues but also how people experience them. Individuals spend less time in venues, preferring simpler and more predictable routines and reducing person-to-person contact activities. Moreover, we show that the stringency of interventions alone does not explain the number and duration of visits to venues: individual patterns of visits seem to be influenced by the local severity of the pandemic and a risk adaptation factor, which increases people's mobility regardless of the stringency of interventions. The COVID-19 pandemic has prompted many countries to implement numerous Non-Pharmaceutical Interventions (NPIs) such as international travel restrictions, physical distancing mandates, closures of business venues, and stay-at-home orders to prevent the spread of the virus [12, 18, 23, 31, 45, 46, 51]. These policies have had a profound impact on numerous aspects of human life including employment [9, 32], the economy [11, 13, 17, 27] and people's social behaviour [29, 80, 85].
Previous literature has exploited mobile phone data to simulate the evolution of the epidemic [8, 16, 41, 47, 61, 84] and the effectiveness of physical distancing interventions, which reduced people's mobility between and within cities [34, 40, 72, 80, 83] , physical activity [48, 78] and person-to-person contact patterns [10, 50, 66, 70] . Thus, there is no doubt that NPIs proved effective in reducing people's mobility and increasing physical distancing. However, much less is known about an individual's behaviour. Most studies have limited their focus on how much people were moving and how many visits each Point of Interest (POI) experienced (e.g. [15, 40, 48, 50] ), with little evidence on how individual habitual leisure activities and social interactions changed, and adapted, over time during the pandemic. This paper studies the changes in the daily routine activity of people, focusing on the number and the chronological sequences of visits to Points Of Interest (POIs), the time spent in them and the places where person-to-person contacts happen. We combine geographical open data from OpenStreetMap (OSM) with privacy-enhanced longitudinal GPS mobility traces of more than 837,000 anonymous opted-in individuals, measured for nine months from 3 January 2020 to 1 September 2020. Our dataset has an average accuracy of 22 meters and covers 16 hours of activity per day, allowing us to describe human mobility at a fine spatial and temporal granularity while ensuring users' privacy (see SI Appendix B 1 for additional details). We analyse and compare the individual's mobility in four US states with the highest and lowest values of daily COVID-19 death rate and NPIs stringency, specifically on Arizona (many deaths and low stringency), Oklahoma (few deaths and low stringency), Kentucky (few deaths and high stringency), and New York (many deaths and high stringency) (see SI Appendix E for details). 
Our results describe the disruption effect of the COVID-19 pandemic on human behaviour over time at an unprecedented level of detail. Overall, we find that individuals profoundly changed their daily routine activities, preferring shorter and more predictable movements between places. People reduced the number of visits to POIs but also changed their duration, which struggles to recover even during the re-opening phases. Since visits and social interactions at POIs are intertwined, we model the expected number of person-to-person contact activities through co-location events between two individuals. We find that individuals reduce the number and the duration of co-location events more than expected. Notably, this reduction strongly correlates with the daily number of deaths. Finally, we find that the number of visits increases over time even when NPI policies do not change. To explain such contradictory behaviour, we model the visits to POIs and find that people might adapt to the COVID-19 risk and increase the time spent outside their residential area over time regardless of the NPIs stringency.

RESULTS

We explore and characterize human mobility from an individual's set of stop locations, defined as places where a person stays for at least 5 minutes within a distance of 65 meters. From the original GPS data (see Figure 1A), we detect the stop locations of each individual through a combination of the Lachesis [43] and DBSCAN [28] algorithms (see Methods and Figure 1B). As a result, stop locations are described as tuples (lat, lon, start-time, end-time), where an individual stays in a particular location with latitude (lat) and longitude (lon) from start-time to end-time. lat and lon are the mean latitude and longitude values of the GPS points found within the specified distance of 65 meters (we refer to the Methods and SI Appendix C for the details). Then, we focus on the home and work locations of people.
To preserve privacy, the data provider obfuscates users' precise home and work locations by transforming it to the centroid of the corresponding Census Block Group. Thus, we identify the home census block group of users, from now on called Residential area, by looking at the most visited locations during the nights (from 8 pm to 4 am) with a moving time window of 28 days. Similarly, we also identify the Workplace, defined as the most visited census block group during the week (from 9 am to 5 pm), which is not marked Residential. We refer to SI Appendix D for additional details. Finally, we add semantic meaning to individuals' mobility trajectories associating each stop location to the nearest Point of Interest (POI), extracted from OpenStreetMap (OSM) [62] , whenever a POI lies within 65 meters from the stop location (see Figure 1C ). POIs are commonly described as public locations that people may find interesting, for example, for business or recreational activities [3] . Since the OSM POIs taxonomy is not hierarchically organized, we create a humancurated mapping from the OSM tagging system [1] to the Foursquare venue category hierarchy [4] (see Methods and SI Appendix C for details). SI Appendix G 2 shows the popularity of POIs in different states, highlighting the variety of the visiting behaviour in the US. To validate the data provided by Cuebiq and our pre-processing, we compute the correlation of the time-series of visits to POIs, residential areas and workplace areas between our data, Google data [40] , and Foursquare data [33] . We report an average Pearson correlation in New York state of 0.91 and an average Pearson correlation of 0.84 in all four selected states (see SI Appendix H). Starting from the visits to POIs of each individual in the selected US states and periods, we address the following questions: How did the pandemic change how individuals visit POIs? How did their mobility routines change? 
How did the physical social contacts adjust with the pandemic? What are the main factors influencing the visits to POIs throughout the pandemic? Figure 2C shows the distribution of time spent by individuals over time in the state of New York. We group the time spent into five categories: Residential, Workplace, POI, Other (i.e. stop locations not matched with a POI and detected neither as Residential nor as Workplace), and Moving (i.e. time spent moving from place to place). After the stay-at-home order, we find that the percentage of time spent in residential areas increases significantly at the expense of the time spent at Workplace, POI, Other, and Moving. After phase three, we find that the time spent in residential areas gradually reduces while other categories increase. The time spent in each category, however, does not return to pre-pandemic levels. Then, after phase three, people significantly increase the time spent in some categories such as Outdoors & Recreation and Travel & Transport, probably also due to seasonal effects, but other categories such as Arts & Entertainment do not recover to pre-pandemic levels. We refer the readers to SI Appendix A for a concrete example of the reduction of visits in New York City. We also observe significant differences between states. For example, New York experienced the most significant drop in the number of visits (e.g. -75% in Shop & Services), while all other states experienced smaller reductions (e.g. around -40% in Shop & Services for the other states); see SI Appendix G 5 for details. Noteworthy differences are also present in the recovery phases due to the different re-opening measures.
As an example, the state of Arizona experienced a smaller drop in the number of visits during the safe-at-home phase (−60.7% for Food and −37.6% for Shop & Service), but the number of visits recovers better and much faster than the state of New York (due to early re-opening phases and the absence of restrictions after the end of the stay-at-home order on 15 May 2020). However, following a record high in the number of hospitalizations, on 29 June 2020, the government issued a partial reversal of the stay-at-home that halted the recovery and resulted in −22.5% for Food and −11.7% for Shop & Service at the end of the period of study. Interestingly, we observe that individuals allocate the everyday time spent similarly across states, regardless of the NPIs implemented, but again we found differences due to different re-opening strategies. We refer to SI Appendix G 5 for the results in the four states, and to SI Appendix I for a set of additional metrics (i.e. individuals' unique stop locations, diversity of visits, and radius of gyration) supporting our findings. By looking at the aggregated mobility, we have only a partial view of the disruption in people's life due to COVID-19. Thus, we here focus on individual's sequences of visits. We transform the individual's chronological sequence of visits to places into a sequence of symbols (e.g., Food, Residential, Workplace). Then, we apply the Sequitur algorithm [25, 54] to generate a hierarchical representation of the original sequence, compressing repeated occurrences into new symbols called words. Here, each word represents a routine, namely a chronological sequence of two or more places that frequently appears into the patterns of visits of an individual (see Appendix J for an example). Due to computational constraints, we focus our attention on two 4-week periods, before (from 1 February 2020 to 28 February 2020) and during pandemic (from 21 March 2020 to 17 April 2020), and compute the significant routines. 
First, we generate 1000 randomized POI sequences with length equal to the original sequence for each individual. Then, we define an individual's routine as significant if it has a z-score > 2, computed from the occurrences of the real routine with respect to those in the randomized sequences. Finally, we filter out all the non-significant routines. We refer to the Methods for additional details. The selected routines represent meaningful sub-sequences of an individual's mobility and allow us to better understand, at a micro-level, how human mobility preferences changed with COVID-19. To that end, we model the ordered sequences of visits to POIs as a weighted undirected network in which the nodes represent the POI categories, and a link between category $c_1$ and category $c_2$ exists if there is at least one sequence where $c_1$ immediately follows $c_2$ or vice-versa. The weight of the link represents the daily average proportion of sequences containing $[c_1, c_2]$ or $[c_2, c_1]$. Figure 3A-B shows the weighted network of the POI categories in the state of New York, where the size of the links is proportional to the intensity of the relationship between the two POI categories. We observe that all links reduce their intensity (on average -79%) with the exception of Residential ↔ Residential, which increased by 5%. Figure 3C-D shows the distributions of the network weights in the pre-pandemic (from 1 February 2020 to 28 February 2020) and during-pandemic (from 21 March 2020 to 17 April 2020) periods in New York state, from which we excluded all self-loops (e.g., Residential ↔ Residential, Food ↔ Food connections). Figure 24 ). The dramatic change of people's behaviour also emerges from the analysis of the similarity between the characteristic routines of different individuals. We represent each individual's significant routine behaviour through the presence or absence of two-element sequences between POI categories.
Then, we compute the Jaccard similarity between a randomly selected sample of 10,000 individuals (due to computational constraints). Finally, we apply agglomerative hierarchical clustering [77] to find relevant groups of individuals with similar routine behaviour. By comparing Figure 3E and Figure 3F, we observe that Figure 3F contains larger clusters, which means that mobility routines simplify and people's behaviour gets more homogeneous. This result is also quantitatively confirmed by the larger silhouette score [69] Figure 20 ). Similar results apply in all the other states (see SI Appendix J 2). As a proxy to understand how much people engage in physical and social activities, we define a co-location event as an occasion when two individuals stop for at least fifteen minutes and are at most 50 meters apart from each other. These co-location events are aggregated into four different categories depending on the place where the possible social contact took place: (i) Residential, a co-location event where only one of the two individuals has the stop marked as a Residential location; (ii) Workplace, a co-location event that happened in a venue labeled as a workplace for both individuals; (iii) POI, a co-location event where both individuals are in the same POI; and (iv) Other, a co-location event in which the two individuals meet in a place that is neither a Residential area nor a Workplace nor a POI (see SI Appendix K for additional details). We note here that Residential and Workplace co-locations are defined at a coarse level, due to the anonymization process done by Cuebiq. Figure 4A shows the abrupt change of the co-location events in New York state, starting at the school closure day on 15 March and reaching low points of −92%, −67% and −83% for POI, Workplace and Other co-location events respectively.
Interestingly, we observe that Residential co-location events, namely those between people who do not live together, experience the smallest reduction, decreasing at most by 32% from the pre-pandemic levels. During the strictest measures put in place in New York state, we notice that people maintain their co-location events inside other people's residential areas and in places which are not marked as POIs (see Residential and Other in Figure 4A). This presumably happens because of the impossibility of having co-location events in venues such as pubs and restaurants (see Food and Nightlife Spot categories in SI Appendix K 1) due to the NPIs, including physical distancing measures and closures of POIs such as Arts & Entertainment and College & University. Again, we find some differences between the number and the duration of co-location events. First, the duration decreases less than the number of co-location events, reaching low points of −7%, −46%, −5%, −3% for Residential, POI, Workplace and Other co-location events respectively (see SI Appendix K 2). We obtain similar results in the other states (see SI Appendix K), although with some differences. For example, in Oklahoma, Kentucky and Arizona, the Residential co-location events experience a smaller reduction, with a low point around −20%. Notably, we observe a decrease in co-location events from the partial reversal of the reopening from 29 June 2020 in Arizona. We correlate the daily number of co-location events with the NPIs stringency in each state and find a strong negative Spearman correlation (−0.83 with p-value p < 0.001). Similarly, we find that the daily number of co-location events is also negatively correlated with the daily number of new cases and deaths (−0.58 and −0.67 respectively, with p-values p < 0.001). The number of co-location events and the visits to places are inherently connected.
Therefore, as soon as the number and the duration of visits decrease, it becomes less probable to have co-location events. Thus, we estimate the daily expected number of co-location events through a null model and compare it with the observed co-locations. The number of co-location events $e_{i,d}$ occurring at a POI $i$ on a specific day $d$ can be estimated from both the number of individuals visiting the POI, $n_{i,d}$, and the median duration of their stops there, $\bar{d}_{i,d}$. We can then estimate the co-location events as $e_{i,d} = \binom{n_{i,d}}{2}\, p_{i,d}$, where $p_{i,d}$ is the probability of having a co-location event given two individuals visiting POI $i$ on day $d$. We compute $p_{i,d}$ assuming a Uniform distribution for the time interval of the visits of two individuals potentially having a co-location event (see SI Appendix K 3 for additional details). We follow a similar reasoning to model the expected duration of the individual's co-location events. Figure 4B shows that, during the pandemic, the observed number of co-location events at POIs is lower than expected. Similarly, Figure 4C shows that the duration of co-location events is slightly lower than expected. Thus, we compare the deviation from the expected number of co-location events and find a higher Spearman correlation with the daily new cases and deaths (0.66, p < 0.001 and 0.48, p < 0.001, respectively) than with the NPIs stringency (0.28, p < 0.001). While the intensity of the local restrictions is the main driver for reducing the number of co-location events, the discrepancy between the theoretical and observed number of co-location events seems largely driven by the epidemic burden (i.e. daily new cases and deaths). We have previously shown that the recovery in the number of visits to POIs seems to have just a loose connection with the NPIs stringency. To explain this unexpected behaviour, we now shift our attention to the different factors that might influence the number of visits to POIs.
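As an illustration of the null model described above, the following minimal sketch computes an expected number of co-location events from the number of visitors and a typical stop duration. It rests on assumptions that are not taken from the paper: both visits have the same duration, their start times are independent and uniform over the part of the day in which the visit still fits, and a co-location requires an overlap of at least 15 minutes. The function names are hypothetical, and the exact formulation used by the authors is the one in SI Appendix K 3.

```python
from math import comb


def overlap_probability(duration_min, min_overlap_min=15.0, day_min=24 * 60.0):
    """Probability that two visits of equal duration overlap by at least
    `min_overlap_min` minutes, assuming (hypothetically) independent start
    times that are uniform over the window where the visit fits in the day."""
    if duration_min < min_overlap_min:
        return 0.0  # visits are too short to ever overlap long enough
    window = day_min - duration_min           # admissible range of start times
    slack = duration_min - min_overlap_min    # largest start-time gap still allowed
    if window <= 0 or slack >= window:
        return 1.0  # any pair of start times overlaps long enough
    # P(|s1 - s2| <= slack) for s1, s2 independent and uniform on [0, window]
    return 1.0 - (1.0 - slack / window) ** 2


def expected_colocations(n_visitors, duration_min):
    """Number of visitor pairs times the pairwise co-location probability."""
    return comb(n_visitors, 2) * overlap_probability(duration_min)


# Example: 20 visitors with a typical 60-minute stop at the same POI.
print(expected_colocations(20, 60.0))
```

Under these assumptions the estimate grows quadratically with the number of visitors and vanishes when stops are shorter than the overlap threshold, which is why a drop in both visits and durations is expected to reduce co-locations even before any additional behavioural effect.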
Using a multivariate Bayesian linear mixed model, we investigate the combined effect of the NPIs, the daily death ratio, and the weather (i.e., daily maximum temperature and precipitation) on the daily number of visits to POIs in each state. Our model also accounts for the different mobility behaviour of people across states and days of the week, by including a random effect for the state and a random effect for each day of the week. We select, as a baseline, a model that includes as fixed effects only the NPIs stringency and the death ratio over the state population. We evaluate the model through the well-established Bayesian $R^2$ [37] and the PSIS-LOO information criterion [79]. Table I shows that this simple model achieves $R^2$ = 0.67 and PSIS-LOO = 610.60 and, as expected, shows that the NPIs stringency correlates negatively with the mobility of people. Interestingly, we also find that the death ratio influences the number of visits to POIs. Then, we account for seasonal effects that might influence the visits to POIs. The Weather model adds the daily precipitation and maximum temperature to the baseline model. Table I shows that these two variables significantly increase the model's performance ($R^2$ = 0.72, PSIS-LOO = 747.44), with the two metrics growing by 7.46% and 22.41%, respectively. As mentioned in previous sections, despite the absence of significant changes in state restrictions in some phases of the pandemic, we find an increase in the number of visits to POIs. To explain this behaviour, we hypothesise the presence of progressive behavioural relaxation and adaptation to the epidemic risk, also observed by previous literature [50, 68]. We model the effects of this risk adaptation as a function of time with a sigmoid function, fitted by our model. We refer to the Methods and SI Appendix L for additional details.
Table I shows that the Full model provides the highest performance ($R^2$ = 0.78, PSIS-LOO = 846.76) and that the second-most important factor in understanding the daily visits to POIs is the risk adaptation factor. This result also holds when we predict the time spent outside the home and when we hypothesise that the risk adaptation depends on the cumulative NPIs stringency, which varies from state to state (see SI Appendix L).

Table I. Baseline, Weather and Full models of visits to POIs. We report the mean and 95% confidence intervals of all the β coefficients. We report the mean and standard deviation for $R^2$ and PSIS-LOO.

This paper digs into the COVID-19 induced changes in human behaviour at an unprecedented scale and detail. We exploit a privacy-enhanced longitudinal GPS dataset of more than 837,000 anonymous opted-in individuals to show how individuals changed the patterns of visits to places and the person-to-person contact activity over time. We show that, as previously found [40, 67] 74], which are all assumed to be almost universal. Thus, an open question is whether these mobility regularities are "resilient" to altered mobility, such as during a pandemic. Finally, we find that people adapt to the pandemic risk over time, revealing a two-fold behaviour in the visits to POIs. On the one hand, individuals' visits and time spent outside the home are influenced by the NPIs stringency as well as the number of deaths in the state. On the other hand, people increase the time spent outside the residential area and the number and duration of visits to POIs despite no significant changes in the NPIs, even when we account for the deaths and the weather. Multiple reasons may explain this risk adaptation, so we can only speculate about its causes. One hypothesis is that the risk adaptation in the patterns of visits results from a change in the risk assessment. As the pandemic lasts for months, people might get more used to the number of deaths, reduce their self-protection and act less prudently.
However, there is no evidence of a reduction of mask use over time in the US [2], and our results show that the number of co-locations is lower than what is expected from the null model. Thus, people protect themselves against person-to-person contacts. Another hypothesis is motivated by the sustained economic burden, which may lead people to decrease policy adherence to get back to work. Finally, the risk adaptation might also be a consequence of the psychological burden, which reduces the ability or motivation to perform self-protective behaviour [44, 52]. We cannot dismiss any of these hypotheses. However, our evidence aligns well with previous self-reported results [68] and suggests that epidemiological models and public health communication campaigns should consider people's relaxation with respect to governmental orders. Analysing everyday activities from GPS data does not come without limitations. First, the analysis of smartphone data might be biased towards younger adults and fail to capture the mobility of those people who do not carry their phones while visiting places. Second, our Residential and Workplace areas are just an estimate of the real home and work locations, which are obfuscated for privacy reasons by the data provider. Third, we acknowledge that our co-location events are a loose proxy of social interactions, and people might share the same location even without knowing each other (i.e., familiar strangers [53]).

We highlight some of these dates in the plots and we describe the important days for all the states in Appendix E 1.

We use GPS location data provided by Cuebiq, a location intelligence company that shared a dataset consisting of anonymized GPS locations from users who opted in to share the data anonymously for research purposes through a CCPA (California Consumer Privacy Act) compliant framework.
To further preserve privacy, the data provider obfuscates the precise home and work locations of users by transforming them to the centroid of the corresponding Census Block Group. The dataset spans a period of 9 months, from January 2020 to September 2020 (details in SI Appendix C). The data is provided through the Cuebiq Data for Good COVID-19 Collaborative program, which provides access to de-identified and privacy-enhanced mobility data for academic research and humanitarian initiatives only. To ensure the data describes the mobility of people well throughout the pandemic, we filter out all users with less than one month of data before the declaration of national emergency (March 13, 2020) and less than four months after it. We also require that users have 5 hours per day covered by at least one GPS location. The resulting dataset includes more than 837,000 anonymous, opted-in individuals. For all users, we extract their stop events with an algorithm based on Hariharan and Toyama [43]. A stop event is defined as a temporal sequence of GPS coordinates in a radius of $\Delta_s$ meters where a user stayed for at least $\Delta_t$ minutes. The algorithm, its optimization, and its computational complexity are explained in detail in SI Appendix C. To define a stop event, we used $\Delta_s = 65$ meters and $\Delta_t = 5$ minutes due to the distribution of accuracy of the underlying data (see SI Appendix C). For each user, we then define their stop locations as the sequences of stop events that can be considered as part of the same place. To determine a stop location from a sequence of stop events we use the DBSCAN algorithm [28]. With DBSCAN, we group points within a distance of $\epsilon = \Delta_s - 5$ meters to form a cluster with at least minPoints = 1 stop event (see SI Appendix C for more details). We extract all POIs from OpenStreetMap (OSM) [55] and then, due to the lack of structure in OSM POIs, we map each OSM POI to the corresponding Foursquare Venue Category Hierarchy [56].
After the association, each OSM POI is mapped to the Foursquare categorization with 8 first-level categories and 178 second-level categories. Further details are described in the SI Appendix G. Then, for all users, we associate each stop location to its nearest POI whenever the Haversine distance is less than or equal to 65 meters. Since in OpenStreetMap POIs can be represented as Points or Polygons, sometimes nested, we assign POIs to stop locations with the heuristic described in SI Section G. Throughout the analysis we compute the change with the same methodology as the Google mobility reports [40]. Specifically, we compute the percent change as:

$$\text{change}_i = 100 \cdot \frac{v_i - b_{w(i)}}{b_{w(i)}}$$

where $v_i$ is the original value at day $i$ and $b_{w(i)}$ is the median value at day of the week $w(i)$, going from 0 to 6, computed during the baseline period (i.e., before the pandemic). Sequitur is a compression algorithm that reduces a sequence size by introducing new symbols/words when repetitions of short sub-sequences and motifs appear in the original sequence [54]. We represent an individual's mobility through a sequence of symbols that maps a stop location into a category (e.g., Residential, Workplace, Food). From these sequences of symbols we extract recurrent patterns of visits consisting of sub-sequences of length $l \geq 2$ following Di Clemente et al. [25]. We focus our attention on two 4-week periods. The first one starts on February 1, 2020 and excludes January, which might display some unusual patterns due to seasonal effects (e.g., the end of the holiday period). The second one starts at the beginning of each state's emergency response orders, i.e. the "stay-at-home" order (in Kentucky, we selected the "healthy-at-home" order since no "stay-at-home" order was ever issued). This period terminates before the first re-openings to include the most stringent early regulatory phase for each state (for more details on relevant dates see SI Table II).
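The percent change relative to a weekday-specific median baseline, as described earlier in this section, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name, the dictionary-based input and the choice of baseline dates in the example are assumptions, and the sketch does not guard against a zero baseline.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def percent_change(series, baseline_start, baseline_end):
    """Percent change of each daily value with respect to the median value
    observed on the same day of the week during the baseline period.

    `series` maps datetime.date -> value; weekdays follow date.weekday(),
    i.e. 0 is Monday and 6 is Sunday."""
    values_by_weekday = defaultdict(list)
    for day, value in series.items():
        if baseline_start <= day <= baseline_end:
            values_by_weekday[day.weekday()].append(value)
    baseline = {w: median(vals) for w, vals in values_by_weekday.items()}
    return {
        day: 100.0 * (value - baseline[day.weekday()]) / baseline[day.weekday()]
        for day, value in series.items()
        if day.weekday() in baseline
    }


# Example with made-up numbers: two baseline weeks at 100 visits per day,
# followed by a Monday with 50 visits.
start = date(2020, 2, 3)
visits = {start + timedelta(days=k): 100.0 for k in range(14)}
visits[date(2020, 3, 23)] = 50.0
changes = percent_change(visits, start, date(2020, 2, 16))
print(changes[date(2020, 3, 23)])
```

Using a separate baseline per day of the week keeps the regular weekly cycle (for instance, fewer workplace visits at weekends) from being read as a change caused by the pandemic.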
For each individual sequence of visits, we randomly shuffle the sequence 1000 times and apply the Sequitur algorithm to each of the shuffled sequences [54]. For each symbol $s$ (a visit in our case), we compute the mean number of occurrences, $\mu_s$, and its standard deviation, $\sigma_s$, across all random synthetic sequences. We use these quantities to compute the standard score, $z_s = \frac{o_s - \mu_s}{\sigma_s}$, where $o_s$ is the number of occurrences of each symbol $s$ within the original sequences. Significant routines are selected from all the sub-sequences if $z_s$ is greater than 2. We refer to significantly recurrent sub-sequences as "routines". The expected number of co-location events is assumed to depend on both the number of individuals visiting a POI $i$ on day $d$, $n_{i,d}$, and the average duration of their stop there, $\bar{d}_{i,d}$. The first quantity is used to compute the combinations of possible co-locations between different individuals. The second is used to estimate the probability $p_{i,d}$ of having two temporal intervals of duration $\bar{d}_{i,d}$ overlapping by at least 15 minutes over the entire time span of a day. Combined together, our estimate of the number of co-location events, $e_{i,d}$, can be written as:

$$e_{i,d} = \binom{n_{i,d}}{2}\, p_{i,d}.$$

More information about the formulation and computations in support of these models can be found in SI Appendix K 3. We model the daily average number of visits to POIs $Y$ with a Bayesian linear regression formulated for each day of year $d$, state $s$ and weekday $w$ as:

$$Y_{d,s,w} = \alpha_s + \beta_{policy}\, S_d + \beta_{deaths}\, D_{d-1} + \beta_{temp}\, T_d + \beta_{prec}\, P_d + \beta_{adapt}\, \sigma_s(d) + \rho_w + \epsilon$$

where $\epsilon$ is the residual error, $\alpha_s$ is the state-specific intercept, $\beta_{policy}$, $\beta_{deaths}$, $\beta_{temp}$, $\beta_{prec}$ and $\beta_{adapt}$ are the β coefficients for each independent variable in the regression, while $\rho_w$ is the week-day random effect controlled by the variable $w$, the day of the week of $d$ (from 0 to 6, where 5 is Saturday and 6 is Sunday).
$S_d$ is the value of the Stringency Index, measuring local enacted regulations and preventive and informative campaigns (see [42] for more details), $D_{d-1}$ is the death ratio on the previous day (measured over a population of 100k people), $T_d$ is the maximum temperature in degrees Celsius and $P_d$ represents the millimeters of precipitation. $T_d$ and $P_d$ account for the seasonal effects of POI visits. Then, $\sigma_s(d)$ is a standard sigmoid function that models the collective behavioural adaptation of people to the perceived epidemic risk. The sigmoid function depends on time $d$, where $\phi_s$ and $\gamma_s$ model the location and the sharpness of the sigmoid, respectively. All the independent variables are z-score standardized. We extract the daily temperature and precipitation from the PRISM Climate Group [26], which provides the maximum temperature and precipitation on a 4 km grid. We compute the average maximum temperature and precipitation for each state. The average is weighted with the population of each county to account for the number of people who were exposed to the measured temperature and precipitation. We assess the out-of-sample predictive accuracy through the Pareto-smoothed importance sampling Leave-One-Out cross-validation (PSIS-LOO) [79]. This metric overcomes the issues of the Deviance Information Criterion (DIC) [75], such as its lack of consistency and the fact that it is not a proper predictive criterion [76, 79], and it has rapidly become the state of the art for evaluating Bayesian models. The PSIS-LOO is defined in the log score as:

$$\widehat{\text{elpd}}_{\text{psis-loo}} = \sum_{i=1}^{n} \log \left( \frac{\sum_{s=1}^{S} w_i^s\, p(y_i \mid \theta^s)}{\sum_{s=1}^{S} w_i^s} \right)$$

where $n$ is the number of data points, $\theta^s$ are draws from the full posterior $p(\theta \mid y)$, $s = 1, \dots, S$ indexes the $S$ draws, and $w_i^s$ is a vector of weights that are the Pareto-smoothed importance ratios built through an algorithm described in the original PSIS-LOO paper [79]. The best model is associated with the highest PSIS-LOO value. We also report the Bayesian $R^2$ [37] as an additional and easy-to-interpret measure of goodness of fit.
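To make the structure of the regression concrete, the sketch below evaluates its deterministic part for a single day. It is only an illustration under stated assumptions: the logistic form $1 / (1 + e^{-\gamma_s (d - \phi_s)})$ is one common way to parameterize a sigmoid by location and sharpness and may differ from the authors' exact choice (see SI Appendix L), and the function names, argument names and numeric values are hypothetical. Priors, the residual term and posterior inference are left out.

```python
import math


def risk_adaptation(day, phi, gamma):
    """Logistic curve in time (assumed form): `phi` sets the location of the
    midpoint and `gamma` its sharpness. Written to avoid overflow."""
    x = gamma * (day - phi)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def linear_predictor(alpha_s, rho_w, beta, stringency, deaths_prev,
                     temp_max, precipitation, day, phi_s, gamma_s):
    """Deterministic part of the model for one day: state intercept, weekday
    effect, standardized covariates and the risk adaptation term."""
    return (alpha_s + rho_w
            + beta["policy"] * stringency
            + beta["deaths"] * deaths_prev
            + beta["temp"] * temp_max
            + beta["prec"] * precipitation
            + beta["adapt"] * risk_adaptation(day, phi_s, gamma_s))


# Hypothetical coefficients and standardized covariates, for illustration only.
beta = {"policy": -1.0, "deaths": 0.0, "temp": 0.0, "prec": 0.0, "adapt": 2.0}
print(linear_predictor(1.0, 0.5, beta, 2.0, 0.0, 0.0, 0.0,
                       day=150, phi_s=150, gamma_s=0.1))
```

With the covariates held fixed, the adaptation term moves the prediction smoothly from no contribution to a full contribution of `beta["adapt"]` as time passes the midpoint, which is how a rise in visits can appear without any change in stringency.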
Replication code is available on GitHub at https://github.com/denadai2/living-the-pandemic. All the data sources are freely available on the Internet, while the mobility data from Cuebiq can be accessed only through the company's Data for Good initiative [57]. Limitations apply to the availability of these data, due to the rigorous anonymity constraints.

We would like to thank Cuebiq for allowing the use of anonymized data through their COVID19

[15] G Cacciapaglia, C Cot, and F Sannino. Mining Google and Apple mobility data: Twenty-one shades of European social distancing measures for COVID-19. 2020.
[19] Cuebiq. Sensitive points of interest policy - Cuebiq. Accessed on 2021-07-22.
[20] Governor Andrew M. Cuomo. Governor Cuomo announces gatherings of up to 25 people will be allowed in phase three of reopening. https://www.governor.ny.gov/news/governor-cuomo-announces-gatherings-25-people-will-be-allowed-phase-three-reopening-0, 2020.
[21] Governor Andrew M. Cuomo. Governor Cuomo issues guidance on essential services under the 'New York State on PAUSE' executive order. https://www.governor.ny.gov/news/governor-cuomo-issues-guidance-essential-services-under-new-york-state-pause-executive-order, 2020.

The location data is provided by Cuebiq Inc., a location intelligence and measurement company. The dataset was shared within the Cuebiq Data for Good program, which provides access to de-identified and anonymized mobility data for academic and research purposes. The location data used covers users in the US over nine months, from January 2020 to August 2020, and includes only users who have opted in to share their data anonymously. The data is General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) compliant. Furthermore, to increase and preserve users' privacy, Cuebiq obfuscates home and work locations to the census block group level.
The data is collected through the Cuebiq Software Development Kit (SDK), which collects user locations through GPS and Wi-Fi signals on Android and iOS devices. The device determines the location accuracy, which varies from 0 to more than 100 meters. Figure 6 (left) shows the accuracy in meters of the original GPS events in the dataset. The accuracy distribution is bimodal, with one peak around 5 meters and the other at 65 meters. We speculate that the latter peak is caused by Cuebiq's home obfuscation mechanism, which preserves users' privacy. Figure 6 (right) shows the average number of hours per day with at least one GPS location per user. Most users have almost all the hours covered, enabling us to describe human behaviour accurately.

policies are built from regulations made at the state level and below (e.g., indexes also include decisions of county or city governments) [63]. As an aggregate measure of the strength of the enacted regulations, we use the "Stringency Index" [42]. The index is composed of nine different indicators, each quantifying the strength of the enacted policies in a specific regulatory category, concerning both the containment and closure policies and the health care system policies. Here we report the nine indicators (and their respective ranges of values) from which the "Stringency Index" is constructed (for a more detailed description of how these indexes are aggregated, see [42] or [63]):
• school closure (from 0 to 3);
• workplace closure (from 0 to 3);
• cancel public events (from 0 to 2);
• restrictions on gatherings (from 0 to 4);
• public transport closure (from 0 to 2);
• stay-at-home requirements (from 0 to 3);
• restriction of internal movements (from 0 to 2);
• restriction on international travel (from 0 to 4);
• intensity of public information campaigns (from 0 to 3).
A single original GPS location does not contain any information about people's movements, and it is thus impossible to know, from a single point, whether a user is stationary in a place or not. Moreover, the coordinates of GPS points can fluctuate over time even when the user is not moving. Therefore, we apply a stop location detection algorithm to transform a sequence of original GPS data into a sequence of locations in which a user stops. We detect the users' stops with a two-step algorithm composed of detecting (i) stop events and (ii)

To form a stop event we heuristically choose to group locations in a time-ordered fashion. In other words, in this step, we aim at finding all those places at most ∆s meters wide where people stopped for at least ∆t minutes. Each stop event is composed of at least two locations, and each location can belong to at most one stop event. To extract stop events we base our method on Hariharan and Toyama's work [43]. The algorithm is depicted in Algorithm 1 and can be summarised as follows: for each user, we first order their GPS locations by timestamp, and then we select groups of GPS sequences that satisfy the desired spatial (∆s) and temporal (∆t) thresholds to form stop events. The Diameter function computes the greatest distance between points, while the Medoid function selects the GPS location with the minimum distance to all other points in the set.

The computational complexity of the stop event algorithm [43] is $O(n^3)$, because of the repeated Diameter function that computes a distance matrix, whose complexity is $O(n^2)$. To reduce the computational burden of the algorithm, we divide the sequence of points of each user into buckets/chunks. To do so, we use the algorithm we depict in Algorithm 2. Then, we horizontally parallelize the Diameter algorithm across users and chunks, as each chunk can compute the stop locations independently from the others.
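The stop-event step summarised above can be sketched as follows. This is a simplified reading of the time-ordered grouping idea, not a transcription of Algorithm 1: the function names, the `(timestamp, lat, lon)` record layout and the Earth radius constant are assumptions, and the bucketing of Algorithm 2 is left out.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumed value)


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def diameter(points):
    """Greatest pairwise distance among (timestamp, lat, lon) records."""
    return max((haversine_m(p[1:], q[1:])
                for k, p in enumerate(points) for q in points[k + 1:]), default=0.0)


def medoid(points):
    """Record with the minimum total distance to all other records."""
    return min(points, key=lambda p: sum(haversine_m(p[1:], q[1:]) for q in points))


def detect_stop_events(points, delta_s=65.0, delta_t=300.0):
    """points: time-ordered (timestamp_seconds, lat, lon) records of one user.

    Returns (lat, lon, start, end) tuples, one per stop event.
    """
    stops, i, n = [], 0, len(points)
    while i < n:
        # First record that is at least delta_t seconds after record i.
        j = next((k for k in range(i + 1, n)
                  if points[k][0] - points[i][0] >= delta_t), None)
        if j is None:
            break  # no later record can complete a stop of the required duration
        if diameter(points[i:j + 1]) > delta_s:
            i += 1  # the user moved too far: slide the window start
            continue
        # Extend the stop while the group stays within delta_s meters.
        while j + 1 < n and diameter(points[i:j + 2]) <= delta_s:
            j += 1
        m = medoid(points[i:j + 1])
        stops.append((m[1], m[2], points[i][0], points[j][0]))
        i = j + 1  # each record belongs to at most one stop event
    return stops
```

The repeated `diameter` call is what makes the naive version expensive, which is the cost the bucketing step is meant to contain.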
The Diameter $d(i, j)$ is computed through the Haversine great-circle distance between $r_i$ and $r_j$. Given the average radius of the Earth $r$ and two points with latitudes $\varphi_1$, $\varphi_2$ and longitudes $\lambda_1$, $\lambda_2$, the Haversine distance $d$ between them is:

$$d(r_1, r_2) = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos(\varphi_1)\cos(\varphi_2)\sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right).$$

Algorithm 2: Algorithm for dividing a user's stop events into buckets.
Input: Time-ordered list of a user's original GPS positions $R = [r_0, r_1, \dots, r_n]$ and the spatial threshold ∆s.
Output: The set $B$ of buckets of a user's stop events.

Following the distribution of accuracy in Figure 6, we choose ∆s = 65 meters. We also choose ∆t = 5 minutes to detect all stops, from the shortest to the longest ones. We select ε = ∆s − 5 meters to avoid the creation of extremely long, and incorrect, chains of sequential stop events.

To analyze human mobility in detail, we need to classify each user's stop location semantically. For this reason, we computed the most probable residential and workplace areas for each user. Furthermore, to capture possible changes in residential and workplace areas due to the pandemic, we computed these areas multiple times over a moving window of 28 days. We proceed as follows, for

2. Potential work time. We sum the weekday time between 9 am and 5 pm spent in stop $s_i$. Moreover, we assume that a potential work stay should last at least 30 minutes and have a frequency of 5 visits per week. These assumptions are similar to those of previous literature [49]. We choose the aforementioned working hours since they represent the most popular working time in the US [81].

For each day $t$, we label a stop $s_i(t)$ as a Residential stop if it has the largest potential Residential time. Then, we label a stop $s_i(t)$ as a work stop if it is not a Residential stop and it has the largest potential work time.
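The potential work time of a single stay can be sketched as the overlap between the stay and the 9 am to 5 pm window on weekdays. The function name and the use of naive local-time `datetime` objects are assumptions; the sketch applies the 30-minute minimum stay but does not implement the frequency condition of 5 visits per week, nor the residential counterpart.

```python
from datetime import datetime, time, timedelta


def potential_work_seconds(start, end, min_stay=timedelta(minutes=30)):
    """Seconds of a stay [start, end] that fall on weekdays between 9 am and 5 pm.

    start, end: naive datetimes assumed to be in the user's local time.
    Stays shorter than min_stay contribute nothing.
    """
    if end - start < min_stay:
        return 0.0
    total = 0.0
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday = 0 ... Friday = 4
            window_start = datetime.combine(day, time(9))
            window_end = datetime.combine(day, time(17))
            overlap = min(end, window_end) - max(start, window_start)
            if overlap > timedelta(0):
                total += overlap.total_seconds()
        day += timedelta(days=1)
    return total
```

Summing this quantity over all stays at a stop location within the 28-day window gives the potential work time used for labelling; the stop with the largest total that is not the Residential stop becomes the work stop.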
We aim at understanding whether there were differences in the mobility behaviour patterns of individuals in states that put in place different strategies in response to the COVID-19 pandemic. To select these four US states, we took into account (i) the daily COVID death rate and (ii) the stringency measures the states adopted to slow down the spread of the virus. Both pieces of information were retrieved from the OxCGRT dataset [42]. As shown in Fig 7, we selected the states of Arizona (many deaths/low stringency), Oklahoma (few deaths/low stringency), Kentucky (few deaths/high stringency) and New York (many deaths/high stringency).

We retrieved the dates of relevant events concerning the pandemic and the local regulatory response in each state from the state-dedicated pages on the Wikipedia website [82]. Here we summarize the main events, focusing on the first case reported within the state borders and the main regulatory responses. The dates are also reported in Table II.

Arizona
• First case reported: 26 January 2020;
• Universities start to move to online courses: 12 March 2020;
• Governor issues a statewide "stay-at-home" order, barring Arizonans from leaving their houses except for essential activities: 31 March 2020;
• A partial reopening for a selected set of non-essential activities is announced: 4 May 2020;
• The "stay-at-home" order expires: 15 May 2020;
• Following a record high in the number of hospitalizations, a partial reversal of the "stay-at-home" order is announced: 29 June 2020.
Kentucky
• First case reported: 6 March 2020;
• School closure: 16 March 2020;
• Governor issues a ban on mass gatherings: 20 March 2020;
• Non-essential business closure and a "healthy-at-home" order (similar to a "shelter in place" policy) are enacted: 26 March 2020;
• Reopening starts, requiring businesses to follow public health guidelines ("healthy-at-work" initiative): 9 May 2020;
• Restaurants reopening: 22 May 2020;
• Bars reopening, operating (similarly to restaurants) at 50% indoor capacity: 29 June 2020.

New York
• First case reported: 1 March 2020;
• School closure in New York City: 15 March 2020;
• Statewide "stay-at-home" order is declared, and all non-essential businesses are ordered to close: 22 March 2020;
• "Phase one", a county-level partial reopening upon meeting qualifications, is announced: 15 May 2020;
• "Phase two", counties that meet the qualifications can reopen outdoor restaurant activity as well as in-store activities for specific shop categories: 29 May 2020;
• "Phase three", counties that meet the qualifications can reopen indoor restaurant activity and bars at 50% capacity: 24 June 2020.

Oklahoma
• First case reported: 7 March 2020;
• A "safer-at-home" order requiring vulnerable people to remain at their residences except for essential activities is announced; a "school closure" order and a ban on mass gatherings are enacted on the following day: 24 March 2020 (and 25 March 2020);
• "Phase one" of business reopening allows outdoor activities, personal care facilities, restaurants, cinemas, gyms, and places of worship to restart activities subject to physical distancing and sanitation protocols (specific guidelines are issued for the industry): 24 April 2020;
• "Phase two" allows religious ceremonies, bars, and organized sports to restart activities subject to physical distancing: 15 May 2020;
• "Phase three" reopens businesses that had been restricted to appointments only, summer camps and (although limited) visitation at hospitals: 1 June 2020.
For each US state we studied, we performed all our analyses on panels of individuals that:
1. have been seen for at least 7 days before the stay-at-home declaration and have more than 5 hours of activity on average;
2. have been seen for at least 7 days after the stay-at-home declaration and have more than 5 hours of activity on average.

This pre-processing step leaves us with individuals for whom we have information both before and after the stay-at-home declaration, enabling a sounder analysis of how mobility behaviour changed during the pandemic.

After the POI extraction phase, we are left with a total of 3,353,502 POIs for the entire US.

To further enrich our analysis, we added semantic meaning to individuals' trajectories by associating each stop location with its nearest POI, based on the distance between the stop location and the POI. When we associate a POI with a stop location, we prioritize a Point POI over a Polygon POI, checking whether a POI is inside another POI. This situation happens, for example, when an individual is inside a shopping mall. In this case, we have a POI representing the entire shopping mall (Polygon POI), and the distance between the stop location of the individual and the POI is 0. Now, suppose that there is a cafeteria (Point POI) inside the shopping mall, at a distance of 3 meters. In this case, we assign the stop location to the cafeteria, since it gives us finer-grained information on the visit patterns of the individual. In Fig 10 we

"Spiritual Centers", "Governmental Buildings", and "Prisons" were removed due to the sensitive nature of those locations.

Here we report the number of visits and the duration of visits to POIs over time for individuals in the state of New York (Figure 12).

To assess the reliability of the Cuebiq data, we compared the mobility data of Cuebiq to the mobility reports provided by Google and Foursquare.
Google provides the relative change in the number of visits to a specific category by comparing the actual number of visits to a baseline computed for each day of the week, over 5 weeks from 3 January to 6 February 2020 (e.g., 5 values for Monday, 5 values for Tuesday). The only exception is for residential places, for which Google computes the change metric in terms of the fraction of time spent at home on a specific day. We apply the same procedure to our mobility data, and we compare the results with the data provided by Google (comparing residential places and workplaces) and Foursquare (comparing the visits to the different POI categories). We computed the Pearson correlation between the relative change in the number of visits for each US state and category according to Google/Foursquare and Cuebiq. The reliability of the Cuebiq mobility data is shown in Fig. 18, where we can see that the Cuebiq mobility data show high agreement in the changes of mobility patterns with both Google and Foursquare.

As the threat of the COVID-19 pandemic unfolds in different states, there is a consistent reduction in population mobility. This effect is reflected at various levels in different key quantities capturing several aspects of human mobility. Thus, we extract some additional mobility metrics to complement the analysis of the changes in individual mobility patterns presented in the main paper. As shown in Fig 19, for each individual, we computed the following quantities:
• Number of unique stop locations;
• Diversity of visited locations weighted by the number of visits to an individual's locations;
• Diversity of visited locations weighted by the time spent in an individual's locations;
• Radius of gyration weighted by the number of visits to an individual's locations;
• Radius of gyration weighted by the time spent in an individual's locations.

We then aggregate these metrics, averaging over all individuals.
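As an illustration of the last two metrics, a weighted radius of gyration can be sketched as the weighted root-mean-square distance of an individual's locations from their weighted centre. The text does not spell out the formula, so this standard definition, the centre computed as a weighted mean of latitude and longitude (reasonable only over small areas), and all function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumed value)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def radius_of_gyration(locations, weights):
    """Weighted radius of gyration in meters.

    locations: list of (lat, lon) stop locations of one individual
    weights:   number of visits, or time spent, at each location
    """
    total = sum(weights)
    c_lat = sum(w * lat for w, (lat, _) in zip(weights, locations)) / total
    c_lon = sum(w * lon for w, (_, lon) in zip(weights, locations)) / total
    squared = sum(w * haversine_m(lat, lon, c_lat, c_lon) ** 2
                  for w, (lat, lon) in zip(weights, locations))
    return math.sqrt(squared / total)
```

Passing visit counts as `weights` gives the visit-weighted variant, and passing durations gives the time-weighted one.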
Finally, we compute the percentage change for each metric $m$ as:

$$100 \times \frac{m_{w(t)} - m_b}{m_b},$$

where $m_b$ is the median of the aggregated metric $m$ during the baseline period before the pandemic, and $m_{w(t)}$ is the median of the aggregated metric $m$ for a window $w(t)$ for all individuals. To compute the percentage change for all the metrics we described, we used a time window of 2 weeks that we shift by 1 day, setting a baseline period that starts on 3 January 2020 and ends on 29 February 2020. We centre the last window at 29 February 2020, so that the last window included in the baseline period ends on 7 March 2020 (still before the stay-at-home measures).

We introduce the entropy of visited locations as a measure of the diversity of an individual's patterns of visit. Here, $k$ is the total number of unique visited locations of an individual $i$, $p_{ij}$ =

The Sequitur compression algorithm is used in this work as a tool for detecting recurrent patterns in the sequence of location type visits [54]. However, since we are interested in understanding how individual routines changed as the pandemic unfolded, it is important to ensure that the recurrent patterns identified are not spurious and that they meaningfully describe individual routines. Given the nature of mobile phone data, random sequences can be erroneously identified as relevant routines. Following the approach of Di Clemente et al. [25], we remove routines whose number of occurrences in the original sequence does not differ significantly from that in a set of randomized versions of the original sequence. This process consists of generating 1,000 copies of each individual's original sequence of visits (including their homes, workplaces, uncategorized locations, and all venue categories following the temporal order of visits) and independently shuffling each of them.
We then compare the occurrences of the recurrent patterns identified in the original sequence with the average occurrence they show in the randomized sequences. Operationally, we compute the z-score of each recurrent pattern, using the average and the standard deviation of its number of occurrences in the randomized sequences, and keep as "significant routines" only those with a score greater than 2. While this procedure ensures that the routines we analyze are representative of an individual's recurrent patterns, and thus helps in understanding how deeply rooted behaviours were affected by the pandemic, it is also of interest to understand the changes that occurred in the overall sequences of visits. This is done in the next section with a thorough analysis of the direct compression of the original sequence for the two time windows under study.

As (panels C and D). We also note that most of the non-compressible sequences (i.e., with compression ratios equal to 0) are short: during the pandemic, in 71% of the cases the length is smaller than four. Importantly, these short sequences are mostly composed of a single stop category: during the pandemic, in 71% of the cases this category corresponds to an individual's "Residential location".

From the pre-pandemic to the during-pandemic period, we find a global decrease in all the mobility routines. In Figure 24 we report the reduction faced by each link. The closure of Colleges & Universities exposed all the activities linked to that category to the largest reduction. A similar reduction hit the Arts and Entertainment category, due to the restrictions on and closures of museums and other cultural places.

The silhouette score is a metric that compares the cohesion of an object with its own cluster to its distance from other clusters. The silhouette score ranges from -1 to +1. A high value indicates that the object is well matched to its own cluster and, when it holds across objects, that the clustering as a whole is appropriate.
For each unit $i$ in a system, the silhouette is computed from the average distance of the unit to all the units outside its own cluster, $\bar{d}_{out}(i)$, minus the average distance of the same unit to the other components of its own cluster, $\bar{d}_{in}(i)$.

To better understand the routine characteristics of people during the pandemic, we display in Figure 26 the network of subsequent visits, extracted from the top six clusters defined from the routine activity of Sequitur. We choose the clusters to visualize by selecting the top six clusters at 95% of the height of the dendrogram formed by the complete-linkage agglomerative clustering described in the manuscript. For clarity, we filter out the links with less than 5% of intensity. Figure 26 shows that most of the users, who belong to Cluster 1 (24%) and Cluster 2 (9.6%), have their significant locations between Residential and Shops & Services. Other clusters show more diversity in mobility, also including workplaces.

Appendix K: Co-location events

Two individuals are said to be "co-located" if they both have a stop location that is at most 50 meters away from the other's, and these stop locations overlap in date and time for at least 15 minutes. A "co-location event" is thus defined as one event in time when two or more individuals are co-located. These co-location events are grouped into four different categories depending on the place where the possible social contact took place:
• Residential, a co-location event where one and only one of the two individuals has the venue marked as a Residential location;
• Workplace, a co-location event that happened in a venue labeled as a workplace for both individuals;
• POI, a co-location event where both individuals are in the same POI (same ID of the POI);
• Other, a co-location event in which the two individuals meet in a place that is neither a Residential location nor a Workplace nor a POI.
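The co-location criterion defined above (at most 50 meters apart, at least 15 minutes of temporal overlap) can be sketched as a pairwise check between two stops. The dictionary layout of a stop and the function names are assumptions made for illustration; the classification into the four categories is not covered.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumed value)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def are_colocated(stop_a, stop_b, max_dist_m=50.0, min_overlap=timedelta(minutes=15)):
    """True if two stops are close in space and overlap long enough in time.

    Each stop is a dict with 'lat', 'lon' (degrees) and 'start', 'end' (datetimes).
    """
    overlap = min(stop_a["end"], stop_b["end"]) - max(stop_a["start"], stop_b["start"])
    if overlap < min_overlap:
        return False
    distance = haversine_m(stop_a["lat"], stop_a["lon"], stop_b["lat"], stop_b["lon"])
    return distance <= max_dist_m
```

Checking the temporal overlap first avoids computing distances for pairs of stops that never coincide in time.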
As the pandemic unfolds in the US, its impact on the number of visits to POIs grows, reducing their number by about 80% during the first phase. This reduction in the number and duration of visits is expected to strongly impact the number of co-location events and their duration. To this end, we propose two null models that aim at capturing the expected number and duration of co-locations given the reduction in the number of visits to POIs, the time spent at POIs, and the duration of those stops during which a co-location event occurred.

Number of co-location events null-model. To provide an estimate for the daily number of co-location events, we focus on co-locations happening at a single POI $i$ on a specific day $d$. The number of co-location events occurring at that location on a specific day, $e_{i,d}$, can be estimated from both the number of individuals visiting the POI, $n_{i,d}$, and the median duration of their stops there, $\bar{d}_{i,d}$. In this estimate, $p_{i,d}$ is the probability of having a co-location event given two individuals visiting POI $i$ on day $d$; it depends on $\epsilon$, the minimum number of minutes that the two time intervals are required to overlap for a co-location to be counted as a co-location event, and on the duration of a day in minutes.

Co-location event duration null-model. The expected duration of a co-location event is estimated following a similar reasoning. Conditioning on the fact that two individuals are co-located at POI $i$ on day $d$, we use the average length of their stops there, namely $\bar{d}_{i,d}$, to estimate their expected time overlap under a uniformly distributed probability for the displacement of their visits over the entire day. Thus, our null-model assumes that individuals visit POIs without the specific intent of meeting someone.
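The random-visit picture behind the number-of-events null model can be sketched as follows. The closed form below is derived here under a stated assumption, namely that the start times of two visits of equal duration are independent and uniform over the part of the day in which the visit fits; it is not a formula quoted from the paper. Counting pairs with a binomial coefficient is likewise an interpretation of the "combinations of possible co-location" wording, and the function names are hypothetical.

```python
from math import comb

MINUTES_PER_DAY = 24 * 60


def overlap_probability(duration, min_overlap=15.0, day=MINUTES_PER_DAY):
    """P(two equal-length visits overlap by at least min_overlap minutes).

    Assumption: start times are i.i.d. uniform on [0, day - duration].
    The overlap equals duration - |x1 - x2|, so it reaches min_overlap when
    the start-time gap |x1 - x2| is at most duration - min_overlap.
    """
    if duration < min_overlap:
        return 0.0
    span = day - duration            # range available to each start time
    slack = duration - min_overlap   # largest start-time gap still giving enough overlap
    if slack >= span:
        return 1.0
    return 1.0 - (1.0 - slack / span) ** 2


def expected_colocations(n_visitors, duration, min_overlap=15.0, day=MINUTES_PER_DAY):
    """Expected co-location events: number of visitor pairs times overlap probability."""
    return comb(n_visitors, 2) * overlap_probability(duration, min_overlap, day)
```

Under these assumptions, fewer visitors reduce the estimate quadratically through the pair count, while shorter stops reduce it through the overlap probability, which separates the two effects the null model is meant to capture.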
Under this assumption, the expected temporal overlap, $o_{i,d}$, which corresponds to the expected duration of a co-location event, can be derived as the average overlap between two time intervals moving over the same domain, where $x$ is the distance between the centers of the two intervals of equal length $\bar{d}_{i,d}$.

Prior selection: Here, we introduce the prior distributions adopted to fit the full model. The baseline and weather models use the same prior distribution without the additional parameters

Table L shows that this model also better captures the reduction and subsequent recovery of the number of visits to POIs.

Modeling "time not at home": As an additional robustness check, we also tested the performance of these four models in describing the time series of the time not spent in residential areas. In this case too, Tab. L shows the better performance of both models that account for a behavioural adaptation component.

• Map features
• Personal measures taken to avoid COVID-19
• Venues categories - Foursquare
• The scales of human mobility
• Evidence for a conserved quantity in human mobility
• Phase 2 guidance for employers reopening in New York
• Understanding individual human mobility patterns
• The use of mobile phone data to inform analysis of COVID-19 pandemic epidemiology
• A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker)
• Project Lachesis: Parsing and modeling location histories. Geographic Information Science
• Behavioral fatigue: Real phenomenon, naïve construct, or policy contrivance?
• Ranking the effectiveness of worldwide COVID-19 government interventions
• Which interventions work best in a pandemic?
• Integrated vaccination and physical distancing interventions to prevent future COVID-19 waves in Chinese cities
• Effect of COVID-19 response policies on walking behavior in US cities
• The TimeGeo modeling framework for urban mobility without travel surveys
• Reshaping a nation: Mobility, commuting, and contact patterns during the COVID-19 outbreak
• Effect of non-pharmaceutical interventions to contain COVID-19 in China
• The concept of "fatigue
• The familiar stranger: An aspect of urban anonymity. The individual in a social world
• Identifying hierarchical structure in sequences: A linear-time algorithm
• OpenStreetMap contributors. Planet dump retrieved from
• covid-policy-tracker/codebook.md · OxCGRT/covid-policy-tracker · GitHub
• Urban characteristics attributable to density-driven tie formation
• Returners and explorers dichotomy in human mobility
• COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown
• Non-pharmaceutical interventions during the COVID-19 pandemic: A review
• A worldwide assessment of COVID-19 pandemic-policy fatigue
• Silhouettes: A graphical aid to the interpretation and validation of cluster analysis
• Estimating the burden of SARS-CoV-2 in France
• The universal visitation law of human mobility
• COVID-19 lockdown induces disease-mitigating structural changes in mobility networks
• New York City public schools to close to slow spread of coronavirus
• Limits of predictability in human mobility
• Bayesian measures of model complexity and fit
• The deviance information criterion: 12 years on
• Introduction to Data Mining
• Worldwide effect of COVID-19 on physical activity: A descriptive study
• Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC
• Impacts of social distancing policies on mobility and COVID-19 case growth in the US
• Business hours - Wikipedia
• U.S. state and local government responses to the COVID-19 pandemic - Wikipedia
• Early social distancing policies in Europe, changes in mobility & COVID-19 case trajectories: Insights from spring 2020
• Mobile device data reveal the dynamics in a positive relationship between human mobility and COVID-19 infections
• The impact of relaxing interventions on human contact patterns and SARS-CoV-2 transmission in China

[Figure residue: routine labels built from sequences of visit categories (home, work, Arts & Entertainment, College & University, Nightlife Spot, Outdoors & Recreation, Professional & Other Places, Shop & Service, Travel & Transport), and a time axis running from Jan 04 to Aug 29.]

introduced in the full model only. Therefore, for the sake of simplicity, we report here the complete list of priors without differentiating for the three main models presented and discussed in the main