key: cord-339405-sj7dd6jr
authors: Grantz, K. H.; Cummings, D. A. T.; Zimmer, S.; Vukotich, C.; Galloway, D.; Schweizer, M. L.; Guclu, H.; Cousins, J.; Lingle, C.; Yearwood, G. M. H.; Li, K.; Calderone, P. A.; Noble, E.; Gao, H.; Rainey, J.; Uzicanin, A.; Read, J. M.
title: Age-specific social mixing of school-aged children in a US setting using proximity detecting sensors and contact surveys
date: 2020-07-14
journal: medRxiv : the preprint server for health sciences
DOI: 10.1101/2020.07.12.20151696
sha:
doc_id: 339405
cord_uid: sj7dd6jr

Comparisons of the utility and accuracy of methods for measuring social interactions relevant to disease transmission are rare. To increase the evidence base supporting specific methods to measure social interaction, we compared data from self-reported contact surveys and wearable proximity sensors from a cohort of schoolchildren in the Pittsburgh metropolitan area. Although the number and type of contacts recorded by each participant differed between the two methods, we found good correspondence between the two methods in aggregate measures of age-specific interactions. Fewer, but longer, contacts were reported in surveys, relative to the generally short proximal interactions captured by wearable sensors. When adjusted for expectations of proportionate mixing, though, the two methods produced highly similar, assortative age-mixing matrices. These aggregate mixing matrices, when used in simulation, resulted in similar estimates of risk of infection by age. While proximity sensors and survey methods may not be interchangeable for capturing individual contacts, they can generate highly correlated data on age-specific mixing patterns relevant to the dynamics of respiratory virus transmission.

role schoolchildren play in facilitating transmission (44-47).
Schoolchildren generally display highly assortative mixing by age (i.e., they preferentially interact with children of the same age) and high contact rates with adults and the elderly (their parents and grandparents), which may facilitate transmission among schoolchildren and within their surrounding communities (3, 4, 13, 16, 25, 48). Many public health interventions, including school closures and vaccination campaigns, focus on the role of schoolchildren in the spread of respiratory infections (49, 50).

One challenge in drawing links between patterns of social contacts and respiratory disease transmission is the difficulty in empirically measuring patterns of proximal social interaction. Social contacts that can lead to transmission of pathogens can potentially be transient, non-synchronous (i.e., through contamination of the environment), and of varying intensity (2, 51). Multiple methods have been used to measure social contact, the relative disadvantages and advantages of which have been described elsewhere (51). The majority have used interviews or surveys to collect data on self-reported contacts, raising the possibility

contacts was less skewed, but the presence of several high-degree nodes (individuals with many contacts) became increasingly apparent as the minimum number of cumulative contacts (an approximation of contact duration) required to be considered a unique contact was increased. The average duration of a survey-reported contact was 124.3 minutes, compared to just 7.5 minutes for sensor-recorded contacts. There was marked similarity between the distribution of survey-reported in-school contacts (Fig. 1C) and unique sensor-recorded contact events with at least 100 cumulative contacts (Fig. 1G), but the association at an individual level was unclear.
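The cumulative-contact thresholds used above can be illustrated with a minimal sketch. It assumes the 20-second beacon interval reported in the methods, a hypothetical input format of (receiver, sender) pairs with one pair per received beacon, and an inclusive threshold; the function names are illustrative and are not taken from the study's code.

```python
from collections import Counter

BEACON_INTERVAL_S = 20  # beacon interval reported in the methods


def cumulative_contacts(records):
    """Count received beacons for each (receiver, sender) pair."""
    return Counter((rx, tx) for rx, tx in records if rx != tx)


def unique_contacts(pair_counts, min_contacts=1):
    """Distinct partners per receiver with at least `min_contacts` beacons.

    Whether the threshold is inclusive is an assumption of this sketch.
    """
    degree = Counter()
    for (rx, _tx), n in pair_counts.items():
        if n >= min_contacts:
            degree[rx] += 1
    return dict(degree)


def max_duration_minutes(n_contacts):
    """Upper bound on contact time: each beacon stands for 0 to 20 seconds."""
    return n_contacts * BEACON_INTERVAL_S / 60


# Hypothetical records: sensor "a" hears "b" 12 times and "c" twice.
records = [("a", "b")] * 12 + [("a", "c")] * 2 + [("b", "a")] * 12
counts = cumulative_contacts(records)
print(unique_contacts(counts, min_contacts=1))
print(unique_contacts(counts, min_contacts=10))
print(max_duration_minutes(10), max_duration_minutes(100))
```

Raising `min_contacts` drops the brief encounter with "c", which mirrors how higher thresholds isolate longer interactions (10 beacons is at most about 3 minutes, 100 beacons at most about 33 minutes).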
In multivariate regression analysis adjusted for participant age, sex, and survey design, sensor-recorded and survey-reported contacts rarely served as significant predictors of one another (Fig. 2, Supp. Fig. 1). Increasing the cumulative contact threshold for sensor contacts did not improve these associations. Generally, the number of survey-reported contacts increased with age. Duration of survey-reported contacts increased with age as well, but the effect size was reduced compared to the number of contacts. Survey type or method of administration was not associated with number or duration of recorded contacts. Male students were less likely than female students to report contacts and reported shorter contacts on average in contact surveys.

Results using multiple thresholds of cumulative sensor contact are shown in the supplement (Supp. Fig. 2). We found significant associations between sensor outcomes and number of survey-recorded contacts; however, the effect size was small relative to other factors (e.g., age).

Age-specific mixing patterns

Age-specific contact patterns derived from both data collection methods showed highly proportionate mixing assumptions (Fig. 3A). There was also a striking consistency between pairwise survey- and sensor-recorded contact ratios as a function of the difference in grade. The average departure from proportionate mixing expectations for participants in the same grade was 4.07, compared to just 0.72 for participants one grade apart and 0.15 for participants two or more grades apart (Fig. 3E).

Assortativity of age-specific matrices based on contact surveys and sensor data ranged from q = 0.68 to q = 0.95 (Fig. 3). The range was partially due to the structure of the participating schools; in this study, there were no schools with both high school and non-high school students.
However, even within each school, mixing patterns showed high degrees of assortative mixing (e.g., in-school contact survey-based matrices range from q = 0.62 to q = 0.99, Supp. Fig. 2).

The effect of school structure on mixing patterns was most apparent in matrices based on unique sensor contacts, which revealed three elementary grade clusters (K-2, 3-4, 5-8) within which there was strong assortative mixing (Fig. 3B). High school students (grades 9 to 12) represented a well-mixed, modular cluster (q = 0.05 and 0.12 for HS1 and HS2, Supp. Fig. 3).

Transmission models

When used in age-specific simulation, sensor- and survey-based mixing matrices produced similar attack rates when adjusted by proportionate mixing expectations (Fig. 4). Increasing the contact threshold resulted in more heterogeneity relative to the proportionate mixing baseline. There was discordance between the sensor- and survey-based predicted attack rates in particular schools, which increased with cumulative sensor contact threshold and disjuncture in contact matrices. However, in other schools, there was a marked degree of similarity between attack rates regardless of contact matrix employed. In simulations based on unadjusted contact rate matrices, predicted attack rates were lower in younger children when using survey-based matrices, a reflection of the different reporting rates by age and specific demography of each school (Supp. Fig. 6). We explored multiple parameters in our transmission model, assuming reproductive numbers of 1.5, 2 and 3. We found little qualitative difference between these simulations (Supp. Fig. 7).

The utility of social contact data to the study of infectious diseases has been limited in part by questions of how to best measure social interactions relevant to transmission.
In this project, we found that, while the two commonly used methods captured different information at the individual level, they gave similar results in several aggregate patterns of contact that are thought to be relevant to pathogen transmission, namely, patterns of age-specific mixing and probability distributions of the total number of contacts. As in other work, we found evidence for strong assortativity of contacts by grade (32). This work has important implications for the empirical parameterization of mathematical models of transmission, particularly of respiratory pathogens. This work suggests that either empirical approach could be used to characterize

The copyright holder for this preprint (which was not certified by peer review), this version posted July 14, 2020, is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. medRxiv preprint doi: https://doi.org/10.1101/2020.07.12.20151696

found poor individual-level concordance between the two methods: anywhere from 15% to 96%

patterns across age ranges using two different methods (30, 57).

We also observed substantial absolute differences in the number and type of contacts recorded by self-reported contact surveys and proximity sensors. We found either metric was a poor predictor of the other, even when adjusting for age, sex, and study factors. However, we found stronger individual-level correspondence between the measures when we restricted sensor data to contacts with longer cumulative duration (true for 3-minute and 30-minute minimum thresholds), consistent with earlier work which found longer contacts were more likely to be reported in surveys (11, 22, 23). In practice, the two methods are designed to capture different social interactions.
Per the study protocol, survey-recorded contacts should only have included those with interactions that involved talking, playing, or touching, while sensors recorded all other sensors within proximity regardless of whether participants were socially interacting. That the correspondence increased when limiting sensor information to proximal contacts with longer duration suggests that these were more likely to be contacts which include social interactions. It is unclear which type of contact (proximal or social interaction) is most relevant for the spread of respiratory pathogens.

To determine whether contact patterns measured using different empirical approaches lead to different transmission dynamics, we simulated transmission using models parameterized with data from the two empirical techniques. In simulations using mixing matrices adjusted by proportionate mixing expectations, similar age-specific infection patterns were found using sensor and survey data. Previous work has similarly found that, while simulations using unadjusted contact data from surveys and proximity sensors differ, appropriate adjustment to survey data which capture key structural elements of the contact network (e.g., age assortativity) leads to consistent simulation results using both kinds of contact data (57). Here, differences in attack rates appear to be driven by increasing disjuncture between grades and age assortativity in certain mixing matrices.
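To illustrate how a mixing matrix shapes grade-specific attack rates, the sketch below solves the deterministic final-size relation for a structured population. This is a stand-in technique and not the stochastic simulation used in the study, whose β estimation is described only in the Supplementary Methods. It assumes that `contact_matrix[i][j]` is the per-capita rate at which a participant in grade i contacts members of grade j, and that the matrix is rescaled so its dominant eigenvalue equals the chosen reproductive number; the function name is hypothetical.

```python
import numpy as np


def final_attack_rates(contact_matrix, r0, tol=1e-12, max_iter=100_000):
    """Grade-specific final attack rates from a deterministic final-size relation.

    Solves z_i = 1 - exp(-sum_j K_ij z_j), where K is the contact matrix
    rescaled so that its dominant eigenvalue equals r0 (an assumption of
    this sketch, not the study's fitting procedure).
    """
    a = np.asarray(contact_matrix, dtype=float)
    dominant = np.max(np.abs(np.linalg.eigvals(a)))
    scaled = a * (r0 / dominant)
    z = np.ones(a.shape[0])  # start from 1 and iterate down to the fixed point
    for _ in range(max_iter):
        z_new = 1.0 - np.exp(-scaled @ z)
        if np.max(np.abs(z_new - z)) < tol:
            return z_new
        z = z_new
    return z


# Hypothetical two-grade matrices with identical totals but different structure.
assortative = [[3.0, 1.0], [1.0, 3.0]]
uneven = [[3.0, 1.0], [0.5, 1.5]]
for r0 in (1.5, 2.0, 3.0):  # reproductive numbers explored in the study
    print(r0, final_attack_rates(assortative, r0), final_attack_rates(uneven, r0))
```

In the symmetric example every grade has the same attack rate, while the uneven matrix yields a lower attack rate in the grade with fewer contacts, which is the kind of heterogeneity the comparison of matrices is meant to detect.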
Importantly, the metric we used to compare age-specific contact patterns from survey- and sensor-recorded data did not account for absolute differences in the overall contact rates of children in each grade. In simulation, the β estimation procedure (see Supplementary Methods) scaled the overall rate of contact between age-specific contact matrices, but did not account for

Our study has some important limitations. Though we adjusted for the demographics of the specific schools and deployments that we conducted, our results may not be generalizable to other settings. The physical and architectural environment of our schools, the density of sensors that we were able to deploy in our schools, and the specific days that we deployed our study may all have affected our results. Technical issues, though not common, did occur with the sensors, resulting in lost data for some sensors. Similarly, recall bias and misclassification by participants when completing contact surveys may have obscured the relationship between our two methodological measurements. We found that the design and administration of contact surveys led to some censoring in the number of contacts reported (Fig. 1). Nonetheless, we believe that the relationships we found were robust to the misclassifications and biases that may be generated by these sources.

Previous work has indicated that risk of infection with influenza is more closely linked to the average mixing patterns of an individual's age group, rather than the individual's contact behaviour (7).
We found that two common methods of collecting social contact data, self-reported surveys and proximity sensors, recorded qualitatively and quantitatively different individual social mixing behaviour but could still generate similar aggregate age-specific social contact patterns. The collection of high-quality social contact data through either method has important implications for surveillance, prediction, and prevention of respiratory virus transmission. The commonality between the two methods in aggregate age-specific social contact patterns suggests that these patterns are not an artefact of either specific empirical method but attributes of these study populations.

Study description

Enrolment in the Social Mixing and Respiratory Transmission (SMART) study operated on an opt-out basis, and all students registered in a participating school before the start of the study were eligible to participate. Students in kindergarten (typically aged 5 years) to 12th grade (typically aged 18 years) from two elementary (K to 4th grade, K to 5th grade), two middle (5th to 6th grade, 7th to 8th grade), two elementary-middle (K to 8th grade), and two high (both 9th to 12th grade) schools were eligible to participate in SMART. Participation rates were high in all schools (82 to 99%). Each school provided aggregate demographic information about the school population, and individual grade and sex of participating students.

Proximity sensor deployments

Proximity sensor deployments have been described in detail elsewhere (60).
In brief, participating students were given proximity sensors in plastic pouches and instructed to wear the pouch around their neck for the duration of the school day without removing or otherwise tampering with the sensor. In six of the eight schools, all participating students were given a sensor; in two schools, the large student population limited the deployment to randomly selected classrooms in each grade. Deployments typically lasted from the first class period (08:00-09:00) to the last class period (14:00-15:00). Deployment days in each school were chosen to be representative of a typical school day, without any special schoolwide or grade-specific activities that could modify normal contact patterns.

We used TelosB wireless sensors (61) programmed in the nesC language to send beacons every 20 seconds (beacon frequency 3 per minute). The receiving sensor recorded the contacting sensor's identity, an internal time stamp, and a received signal strength indicator (RSSI). Signal strength provided an estimate of physical proximity, but was highly dependent on the orientation of the two sensors and any obstructions between them and therefore could not be used to define an exact distance between contacts. Based on pilot studies and previous work on effective distances of respiratory virus transmission (29, 62), we chose a signal threshold (-80 dBm) that should correspond to contacts of relevance to respiratory disease transmission.

The number of unique proximity sensor contacts recorded for a participant was defined as the total number of other participants with whom their proximity sensor recorded at least one interaction during each deployment. To explore patterns of contacts of varying length, we

Participants were asked to report information about any individual they talked with, played with, or touched the previous day, including the contact's age and sex, whether they attended the same school as the participant, the context in which the contact was made, whether the contact

We defined total survey contacts as the total number of individuals a student reported having interacted with on the day before the survey was completed. Detailed contacts were the subset of total contacts for which the student reported contact age, sex, duration, and context. We considered further subsets of detailed survey contacts, including those occurring within school, those reported to have lasted more than 10 minutes over the course of the day, and those occurring on the same day as a sensor deployment.

Briefly, each sensor interaction was assumed to represent an independent contact of between 0 and 20 seconds; the total interactions between a pair of participants were summed to compute the total duration of contact in one deployment. Participants were asked to record the approximate durations of survey-reported contacts.

We used negative binomial regression to investigate which factors were associated with the number of reported contacts for each student who participated in a sensor deployment and completed at least one contact survey. Each model included participant grade, gender, and a random intercept term for day of survey completion or sensor deployment. Survey administration and sensor deployment days were unique to each school.
Terms for the type and method of survey administration were added to models of survey-recorded outcomes.

Age-specific mixing matrices

We estimated two metrics of age-specific contact patterns: an average per-capita mixing rate, and the age-specific mixing ratio of observed contact rates to those expected under the assumption of proportionate mixing.

where x_j is the number of individuals in grade j with whom participants in grade i could record a contact, and all other terms are as defined above. Values greater than 1 indicate more contacts were recorded by participants in grade i with individuals of grade j than would be expected under proportionate mixing. Proportionate mixing assumes that an individual in grade i mixing at random will contact individuals in grade j with a probability equal to the proportion of the population in grade j, but no assumption is made on the probability of individuals in grade i making any contact relative to other groups.

By design, r_j, the participant population, is equal to x_j, the contact population, in sensor deployments. For within-school contacts, we used the demographic information of all registered students in each school to define the potential contact population. Combined K-12 matrices were generated by averaging age-specific matrices from all participating schools, weighted by the number of participants in each school.
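The mixing ratio described above, and the eigenvalue-based assortativity index q reported with the matrices (Fig. 3), can be sketched as follows. The formulas in this section did not survive extraction, so the code is an assumption built from the verbal description: each grade keeps its own observed contact total, and only the split across contacted grades follows the population shares x_j. Ranking eigenvalues by their real part is likewise an assumption, and the function names are illustrative.

```python
import numpy as np


def mixing_ratio(contacts, contact_pop):
    """Observed contacts divided by those expected under proportionate mixing.

    contacts[i, j]: contacts recorded by grade-i participants with grade j.
    contact_pop[j]: x_j, the number of potential contacts in grade j.
    """
    c = np.asarray(contacts, dtype=float)
    x = np.asarray(contact_pop, dtype=float)
    # Row totals are kept as observed; only their split across grades
    # follows the share of the contact population in each grade.
    expected = c.sum(axis=1, keepdims=True) * (x / x.sum())
    ratio = np.full_like(c, np.nan)
    np.divide(c, expected, out=ratio, where=expected > 0)
    return ratio


def assortativity_q(matrix):
    """Ratio of the second-largest eigenvalue to the dominant eigenvalue."""
    eig = np.linalg.eigvals(np.asarray(matrix, dtype=float))
    eig = np.sort(eig.real)[::-1]  # assumption: rank eigenvalues by real part
    return eig[1] / eig[0]


# Hypothetical counts for two grades of equal size.
contacts = np.array([[30, 10], [10, 30]])
ratio = mixing_ratio(contacts, contact_pop=[50, 50])
print(ratio)
print(assortativity_q(ratio))
```

In this toy example the diagonal ratios exceed 1 and the off-diagonal ratios fall below 1, the signature of assortative mixing; a matrix with identical rows (pure proportionate mixing) gives q close to 0, and a diagonal matrix gives q equal to 1.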
Confidence intervals were calculated using 1,000 resampled bootstrap replicates of contact events. Mantel correlation coefficients were used to compare mixing matrices. The degree of assortative mixing, q, was calculated as the ratio of the first minor eigenvalue to the dominant eigenvalue (63), where q ranges from -1, representing completely disassortative mixing, to 1, completely assortative mixing.

proximity sensors and self-reported surveys were likely to record contacts with different transmission potential, we fitted β for each set of parameters, including the age-specific mixing

(EP/N014499/1). The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of CDC.

Additional Information

We declare no competing interests.

modes of transmission: A critical review. J. Infect. 57, 361-373 (2008).
PLoS One 9, e95978 (2014).

Tables

Table 1.

Figure 1.
Distribution of the number of contact events recorded in a US school setting by self-reported contact surveys and proximity-detecting sensors: (A) total survey-reported contacts; (B) detailed survey-reported contacts; (C) survey-reported in-school contacts; (E) all unique contacts recorded by sensors; (F) all unique contacts with more than 10 cumulative contacts (roughly 3 minutes of interaction); and (G) all unique contacts with more than 100 cumulative contacts (roughly 30 minutes of interaction). Insets in (E)-(G) show the plot of in-school survey contacts versus each metric of sensor-recorded contacts with a cubic smoothing spline. (D) shows the population distribution by grade of participants who completed at least one contact survey or participated in a sensor deployment, compared to the population distribution of the Pittsburgh standard metropolitan statistical area (PSMSA) for 2012.

Figure 2. Factors associated with the number and duration of survey-reported in-school contacts in a US school setting. All models include a random intercept for day of survey completion.

Figure 3. Age-specific mixing matrices generated from in-school survey contacts and unique sensor-recorded contacts in a US school setting at various cumulative contact thresholds. Matrices are presented as log-10 ratio of observed contacts relative to expectation under proportionate mixing assumptions for survey-reported in-school contacts (A) and sensor-recorded unique contacts with thresholds of 0 (B), 10 (C), and 100 (D) cumulative contacts. Blue colours indicate more contacts than expected under proportionate mixing assumptions, and red colours indicate less mixing than expected. Bolded ratio values deviate significantly from the null expectation, α = 0.05, and q equals the degree of assortative mixing. Scatterplots (F-H) show the corresponding i,j values of the survey- and sensor-based mixing matrices at each threshold (0, 10, 100).
(E) shows the average departure from proportionate mixing as a function of difference between grade for each matrix.

Figure 4. Grade-specific final predicted attack rates of a respiratory virus in a US school setting, based on stochastic simulation using mixing matrices of in-school survey contacts and unique sensor-recorded contacts at various contact thresholds, adjusted by proportionate mixing expectations, within each school (ELEM, elementary; MS, middle school; HS, high school).

Inactivation of influenza A viruses in the environment and
4. Using data on social contacts to estimate age-
46. School Opening Dates Predict Pandemic Influenza A(H1N1) Outbreaks in the United States
Spatial Transmission of 2009 Pandemic Influenza in the US
Estimating the impact of school closure on social mixing behaviour and
61. Telos: Enabling ultra-low power wireless research in 2005 4th International Symposium on Information Processing in Sensor Networks
How far droplets can move in indoor
65. Transmissibility of swine flu at Fort Dix, 1976