key: cord-0952877-hj4c1ndl
authors: Choi, Yoon-Jung; Park, Mi-jeong; Park, Soo Jin; Hong, Dongui; Lee, Sohyae; Lee, Kyung-Shin; Moon, Sungji; Cho, Jinwoo; Jang, Yoonyoung; Lee, Dongwook; Shin, Aesun; Hong, Yun-Chul; Lee, Jong-Koo
title: Types of COVID-19 clusters and their relationship with social distancing in Seoul Metropolitan area in South Korea
date: 2021-02-17
journal: Int J Infect Dis
DOI: 10.1016/j.ijid.2021.02.058
sha: 49f5a1628419f6205588f1b767d4211a8430e27a
doc_id: 952877
cord_uid: hj4c1ndl
Background: Complete contact tracing of COVID-19 patients in Korea allows a unique opportunity to investigate cluster characteristics. This study aimed to investigate all the reported COVID-19 clusters in the Seoul Metropolitan area from January 23 to September 24, 2020.
Methods: Publicly available COVID-19 data were collected from the Seoul Metropolitan city and Gyeonggi Province. Community clusters with ≥ 5 cases were characterized by size and duration and then categorized using K-means clustering, and the correlation between the types of clusters and the level of social distancing was investigated.
Results: A total of 134 clusters including 4,033 cases were identified. The clusters were categorized into small (type I, II), medium (type III), and large (type IV) clusters. For the same number of daily confirmed cases, the mix of cluster types contributing those cases differed between periods. Raising the level of social distancing was associated with a shift from large to small clusters.
Conclusions: Classification of clusters may provide opportunities to better portray the pattern of COVID-19 outbreaks and implement more effective strategies. Social distancing administered by the government may be effective in suppressing large clusters but may not be effective in controlling small and sporadic clusters.
During the week before October 27, an exponential increase in the number of new weekly coronavirus disease 2019 (COVID-19) cases was reported worldwide, with over 2.8 million cases and nearly 40,000 deaths (WHO, 2020). As of October 27, a total of over 42 million cases and 1.1 million deaths had been reported worldwide (WHO, 2020), and among these a total of 22,364 confirmed cases and 460 deaths were reported in South Korea (Korea Disease Control and Prevention Agency, 2020). During the initial COVID-19 outbreak in Korea, a massive outbreak in Daegu accounted for the majority of cases from February to March, whereas cases in the Seoul metropolitan area were not prominent despite outbreaks at a customer service call center in Guro-gu and at Itaewon clubs (Jung et al., 2020; Park et al., 2020). Since mid-August, however, daily new cases in the Seoul metropolitan area increased dramatically, reaching 200-300 daily cases from large outbreaks in churches and the National Liberation Day rally on August 15. From mid-August until late September, the majority of new COVID-19 cases occurred in the Seoul metropolitan area.
To contain the spread of COVID-19, it is crucial to understand the circumstances under which SARS-CoV-2 is transmitted (i.e., settings and activities) and the measures that reduce the spread of the disease. During the early phase of the COVID-19 pandemic, Pung et al. (2020) reported three COVID-19 clusters in Singapore and suggested the need to expand the scope of the surveillance system for imported cases to cover local cases. Liu et al. (2020) systematically reviewed 65 studies on 108 COVID-19 clusters (790 cases) from 13 countries and revealed that cluster infections may play an important role in modifying transmission patterns of COVID-19.
However, the scope of the individual studies included in the systematic review was limited to specific types of clusters, including family clusters and cluster outbreaks at specific gatherings or places such as conferences, religious events or shopping malls. Contact tracing of every individual COVID-19 patient in South Korea allows a unique opportunity to investigate cluster characteristics. To better understand the patterns of SARS-CoV-2 transmission, we conducted a more comprehensive cluster analysis of all the community clusters reported in the Seoul metropolitan area. We categorized the clusters according to size and duration, and then investigated the relationship between cluster types and the level of social distancing administered by the government.
Data on COVID-19 cases from January 23 to September 24, 2020 in Seoul and Gyeonggi Province were provided by the Seoul Metropolitan Government and Gyeonggi Provincial Office. Because we accessed publicly available de-identified data on COVID-19 cases collected as part of the public health response and released for risk communication purposes, written informed consent was not required. The data included the patient number in the order of diagnosis, the laboratory-confirmed diagnosis date of COVID-19, residential area, and transmission route. The data used in this study were produced from telephone or person-to-person interviews with COVID-19 patients as part of the epidemiological investigation; thus, the transmission route was verified directly with each patient by a contact tracer or epidemiological investigator. Time-series data were obtained from daily new cases and transmission routes of the individual cases from Seoul and Gyeonggi. The diagnosis of COVID-19 was confirmed via real-time reverse transcription PCR testing of nasopharyngeal or oropharyngeal swab specimens.
Since the number of new daily cases dramatically increased after August 11, we divided the study period into the period prior to August 11 (phase 1) and the period after August 11 (phase 2). Nearly 42% of the South Korean population resides in the Seoul metropolitan area, which includes Seoul, Incheon, and Gyeonggi Province. Since all cases in the region are compiled by the municipal offices, the data cover a population of 22.90 million, including 9.66 million from Seoul and 13.24 million from Gyeonggi Province. Data from Incheon were excluded as information on transmission routes was only partially available.
The response to COVID-19 in South Korea is implemented under the Infectious Diseases Control and Prevention Act (Ministry of Government Legislation, 2020; Son et al., 2020). When a new case is diagnosed, the institution that confirms the initial diagnosis is obligated to report the new case to the municipal office and the Korea Disease Control and Prevention Agency (KDCA). A case study is then conducted by the local public health center under the supervision of the Immediate Response Team. The study includes contact tracing from two days prior to symptom onset up to the SARS-CoV-2 test date. For asymptomatic cases, contact tracing starts two days prior to the day of SARS-CoV-2 testing. Within 24 hours after identifying a new case, contacts are identified and isolated upon recognition, and prior visits to medical institutions or commercial facilities are reviewed (Seoul Metropolitan Government, 2020). Transmission routes and contacts are identified via patient interviews, supplemented by data such as GPS records, CCTV footage, and credit card use information, under the Infectious Diseases Control and Prevention Act (Article 76-2) (Ministry of Government Legislation, 2020; Ministry of Health and Welfare, 2020a).
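The contact-tracing period described above can be illustrated with a minimal sketch. The function name `tracing_window` and its parameters are hypothetical, and treating the test date as the end of the window for asymptomatic cases is an assumption, since the text only states where that window starts.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

LOOKBACK = timedelta(days=2)  # "two days prior", as stated in the text


def tracing_window(test_date: date,
                   symptom_onset: Optional[date] = None) -> Tuple[date, date]:
    """Return (start, end) of the contact-tracing period for one case.

    Hypothetical helper for illustration only.
    """
    # Symptomatic case: the window starts two days before symptom onset.
    # Asymptomatic case (no onset date): it starts two days before the test.
    anchor = symptom_onset if symptom_onset is not None else test_date
    start = anchor - LOOKBACK
    # Assumption: the window ends on the test date in both situations.
    return start, test_date
```

For instance, a symptomatic case with onset on August 12 and a test on August 15 would be traced from August 10 to August 15 under these assumptions.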
We defined a cluster as a group of ≥ 5 cases that share a common transmission route such as a place or event, excluding cases with secondary epidemiological links such as within-household transmission (Furuse et al., 2020). The selection of cases for analysis was performed as shown in Figure 1. Briefly, the analysis was limited to community outbreaks only, and cases transmitted at nursing homes and medical facilities, including hospitals and clinics, were excluded, since transmission patterns differ between communities and hospitals in terms of population susceptibility and infection opportunities (Duque et al., 2020). Imported cases were not included in the analysis since those entering Korea are strictly controlled by COVID-19 testing and quarantine at temporary living facilities (Ministry of Health and Welfare, 2020b), resulting in a limited number of local contacts (Korea Disease Control and Prevention Agency, 2020). Cases with undefined transmission routes were excluded from the analysis.
The South Korean government introduced social distancing measures on February 29 and strengthened the measures to a high level on March 22. As the number of daily new cases dropped to less than 20, social distancing was eased to a moderate level on April 20. As new daily cases were maintained at < 5 for a few weeks from mid-April to early May, social distancing was further eased on May 6, adopting the concept of "social distancing in life" (Jung et al., 2020). "Social distancing in life" aimed to maintain a balance between economic and social activities on the one hand and infection prevention and containment on the other, in order to prepare society for a long-term fight against COVID-19 (Ministry of Health and Welfare, 2020c). On June 28, the government adopted a new 3-level social distancing scheme: level 1 (applied when new daily cases <50), level 2 (daily cases from 50 to <100), and level 3 (daily cases exceeding 100 or doubling twice a week).
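The case-count thresholds of the 3-level scheme can be written as a small lookup. This is only a sketch: the function `distancing_level` is hypothetical, the treatment of exactly 100 daily cases as level 3 is an assumption (the text says "exceeding 100" but level 2 ends below 100), and the doubling criterion is passed in as a flag rather than computed, because the text does not define it precisely. In practice the level was set by government decision, not by an automatic rule.

```python
def distancing_level(daily_cases: float,
                     doubling_twice_in_week: bool = False) -> int:
    """Map a daily case count to level 1, 2, or 3 of the June 28 scheme.

    Hypothetical helper; thresholds are taken from the text above.
    """
    if daily_cases < 0:
        raise ValueError("daily_cases must be non-negative")
    # Level 3 also applies when cases double twice within a week; the
    # caller decides whether that condition holds (not modeled here).
    if doubling_twice_in_week:
        return 3
    if daily_cases < 50:
        return 1
    if daily_cases < 100:
        return 2
    # Assumption: exactly 100 daily cases is treated as level 3.
    return 3
```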
The size of social gatherings, including sports events, and the operation of public facilities, education/nursery facilities, and workplaces are restricted according to the level of social distancing. Thus, the level of social distancing is a multi-dimensional measure in the context of South Korea, and changing the level of social distancing leads to complex changes in social circumstances. The details of the countermeasures for 3-level social distancing are presented in the Supplementary Materials. For purposes of comparison, we categorized the levels of social distancing prior to June 28 as low, moderate, and high, and the three levels of social distancing adopted on June 28 as low for level 1, moderate for level 2, and high for level 3. In other words, the level of social distancing adopted in this study is relative (low, moderate, and high) rather than absolute (1, 2, or 3), because the criteria for applying social distancing were not uniform from January to September.
Cluster characterization and the subsequent analyses were conducted separately for phases 1 and 2. Clusters were characterized by two variables: size (the total number of cases in a cluster) and duration (the time period between the first and the last confirmed cases in a cluster). The duration of a cluster was estimated based on the date of diagnosis rather than the date of onset, since about 33.3% of confirmed cases were reported to be asymptomatic at the time of diagnosis, in part due to proactive testing (Workman, 2020). Using these two variables, clusters were categorized into 4 groups by K-means clustering.
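To make the categorization step concrete, the following is a minimal, self-contained sketch of K-means on the two variables, size and duration. It is not the authors' procedure, which was run in R and is detailed in the Supplementary Materials. The z-score standardization, the deterministic farthest-point initialization, and the toy data in `toy` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(column):
    """Z-score a list of numbers (an assumed preprocessing step)."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma for x in column]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from its nearest chosen center.
        centers.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(mean(axis) for axis in zip(*members))
    return labels


# Hypothetical (size, duration in days) pairs, NOT data from the study.
toy = [(6, 3), (8, 5), (10, 4),        # small, short
       (8, 24), (10, 28), (12, 26),    # small, long
       (50, 30), (60, 34),             # medium, long
       (150, 50), (170, 56)]           # large, long
sizes, durations = zip(*toy)
points = list(zip(standardize(sizes), standardize(durations)))
labels = kmeans(points, k=4)
```

The label numbers themselves are arbitrary; only the grouping matters. With real data, the choice of scaling and of the number of groups would need to be justified, for example as the authors do in their supplementary analysis.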
The process of selecting the variables and categorizing the clusters is elaborated in the Supplementary Materials, including Supplementary Table 1. The individual epidemic curves of clusters were investigated by cluster type as defined by K-means clustering, and the time period of each cluster was mapped against the level of social distancing administered by the government. If more than two levels of social distancing were implemented over the time period of a cluster, the level of social distancing during the peak of the epidemic curve was selected. The correlation between the cluster types and the concurrent level of social distancing was analyzed by Spearman's rank correlation coefficient. We used R software (v4.0.2) (R Development Core Team, https://cran.r-project.org/) for statistical analyses. A P-value <0.05 was used as the significance level.
A total of 3,281 and 6,174 cases were reported in the Seoul metropolitan area during phases 1 and 2, respectively. Community clusters with ≥ 5 cases were identified after excluding imported cases, cases associated with nursing homes or medical facilities, clusters with <5 cases, and unclassified cases or cases with unknown transmission routes (Figure 1). As a result, a total of 43 clusters including 1,154 cases (35.2% of the total cases) were identified during phase 1, and 91 clusters including 2,879 cases (46.6% of the total cases) were identified during phase 2. The cluster distribution showed a J-shape, with large clusters having long durations and small clusters spanning both short and long durations (Supplementary Figure 2). All large clusters showed long duration; thus, clusters with large size and short duration were not observed. Clusters were effectively categorized into 4 and 5 groups in phases 1 and 2, respectively. Types I to IV showed similar patterns of distribution in both phases. The type V cluster in phase 2 corresponded to the exceptionally large cluster from the S church with 1,010 cases.
We combined types IV and V as the large cluster type (type IV) in phase 2 (Figure 2 and Supplementary Figure 2). We plotted clusters against size and duration, and indicated the types of clusters designated by K-means clustering. As a result, each type of cluster was distributed as in Figure 2: type I with small size (<30 cases) and short duration (<2 weeks), type II with small size (<30 cases) and long duration (≥ 2 weeks), type III with medium size (30-99 cases) and long duration (≥ 2 weeks), and type IV with large size (≥ 100 cases) and long duration (≥ 2 weeks) (Table 1). The characteristics and distribution of the 4 types of clusters are shown in Table 2. Across both phases, type I clusters included 21 workplaces, 15 church activities, 13 family/friends' gatherings, and several other circumstances (Table 3). Type II included 14 church activities, 10 workplaces, and 4 family/friends' gatherings. Club activities such as a trekking club, a book club, and a volunteering club were unique to type II. Type III clusters included 7 church activities, 4 workplaces, a family/friends' gathering, a restaurant, and a shop. Type IV clusters included outbreaks at a customer service call center, Itaewon night clubs, a network marketing service company, a church (involving at least 1,010 cases), and the National Liberation Day rally.
Figure 3 shows new daily cases belonging to clusters with ≥ 5 cases according to the types designated by K-means clustering, along with the level of social distancing. Surges in daily cases were followed by a raising of the social distancing level in late March and late August. Peaks in daily cases were preceded by an easing of social distancing in early May (Itaewon clubs) and early August (National Liberation Day rally and churches). From mid-June to early August, small clusters such as types I and II predominated; thus, the level of social distancing was kept low.
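The size and duration boundaries that describe the four types (Table 1) can be expressed as a simple rule, sketched below. This is a post hoc description of the K-means groups, not the classification method itself. The function `cluster_type` is hypothetical, "2 weeks" is taken as 14 days by assumption, and combinations the study did not observe (30 or more cases with a short duration) are returned as `None` rather than forced into a type.

```python
from typing import Optional


def cluster_type(size: int, duration_days: int) -> Optional[str]:
    """Return "I", "II", "III", or "IV" using the Table 1 boundaries."""
    if size < 5:
        return None  # below the study's cluster definition (>= 5 cases)
    long_duration = duration_days >= 14  # assumption: 2 weeks = 14 days
    if size < 30:
        return "II" if long_duration else "I"
    if not long_duration:
        # Medium or large clusters with short duration were not observed,
        # so the text gives no type for them.
        return None
    return "III" if size < 100 else "IV"
```

A rule of this kind could be applied to a new cluster only once its final size and duration are known, which is why the paper later discusses predicting the type earlier from event characteristics.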
During phase 2, large clusters (type IV) decreased dramatically as a high level of social distancing was administered; however, type I and II clusters continued to occur. The number of cases from small clusters (types I and II) gradually increased over the entire study period. The distribution of the types of clusters was mapped against the levels of social distancing administered by the government. Spearman's rank correlation coefficients between the type of cluster and the level of social distancing showed a positive correlation in phase 1 and a negative correlation in phase 2, and the correlations were statistically significant (Figure 4 and Supplementary Table 3). In other words, smaller clusters were correlated with low levels of social distancing in phase 1, while smaller clusters were correlated with high levels of social distancing in phase 2.
We categorized 43 clusters from phase 1 and 91 clusters from phase 2 into 4 types depending on cluster size and duration. We then investigated the correlation between the types of clusters and the levels of social distancing. Large clusters (type IV) mostly occurred on occasions where several hundred or more people were concentrated in a single area, such as call centers, dance clubs, network marketing service companies, and protests. The details of the characteristics of clusters are presented in the Supplementary Material. Type III clusters tended to occur during gatherings of several dozen people, and occurred while eating at a cafeteria, playing table tennis, or worshiping and singing in close proximity without wearing masks. Type I and II clusters occurred at small workplaces, churches, and family and neighborhood gatherings, which often involved dining together.
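For readers who want to see how the rank correlation between cluster type and distancing level is computed, here is a minimal sketch of Spearman's coefficient with average ranks for ties, which matter here because both variables are ordinal with few distinct values. The study used R; this Python version, the numeric coding of types (1 to 4) and levels (1 to 3), and the example values are assumptions for illustration, and no P-value is computed.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical coding: cluster type I..IV as 1..4, distancing level
# low/moderate/high as 1..3. These values are NOT from the study.
types = [1, 1, 2, 3, 4, 4]
levels = [1, 1, 2, 2, 3, 3]
rho = spearman_rho(types, levels)
```

A positive `rho` means larger cluster types coincide with higher distancing levels, and a negative `rho` means the opposite. The function is undefined (division by zero) if either variable is constant.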
The reasons why type II clusters showed longer durations than type I clusters were not clear, but may be associated with the nature of the activities that are distinct to type II, such as club activities (including book clubs and trekking clubs), schools, public/private academies, and meetings (including business meetings and auctions). These circumstances could involve many people, but the level of intimacy is lower than in type I. For example, 13 family/friends' gatherings were identified in type I, but only 4 were found in type II. It is possible that type II clusters could have grown to larger sizes with longer durations but were effectively limited to a lower number of cases, since social distancing was more easily applied due to the limited intimacy among the people in those circumstances.
During phase 1, the correlation between the size of clusters and the level of social distancing was positive, but it was negative during phase 2. The government's administration of social distancing measures follows the trend of daily confirmed cases, and the level of social distancing implemented also affects the spread of infection. Starting in March, large outbreaks (type IV) occurred and were followed by the implementation of higher levels of social distancing, producing a positive correlation between cluster size and the level of social distancing during phase 1. High-level social distancing suppressed the spread of COVID-19 so that only small clusters (type I) occurred in September, resulting in a negative correlation between the two variables in phase 2 (Supplementary Figure 3). The relationship between social distancing and the number of COVID-19 cases is supported by previous studies, which showed that an early mitigation strategy is associated with a smaller number of cases, while relaxation of social distancing is linked with a greater number of cases (Duque et al., 2020; Kaur et al., 2020).
Thus, precautions should be taken when making decisions to ease social distancing, and the duration of the measures should be considered.
Considering types of clusters helps discern the pattern of disease propagation and its relationship with social distancing. For example, raising the level of social distancing succeeded in suppressing the daily number of cases, but strict social distancing also changed the types of clusters, shifting from large to small clusters, as seen from August to September in Figure 3. Also, even though the number of daily reported cases was approximately 25 in both early May and mid-September, in early May the cases were all from large clusters (type IV), while in mid-September the cases were mostly from small clusters (type I). Different strategies should be implemented for different types of clusters. For example, the durations of type IV clusters are almost always longer than 4 weeks (Figure 2); thus, prolonged investigation is anticipated once type IV clusters are identified, and resources should be allocated accordingly. An increase in the proportion of small clusters (types I and II) is also observed as time progresses, which may imply that citizens become exhausted from social distancing over time. At the beginning of the pandemic, citizens might have been more alert regarding social distancing at the individual level, which prevented small clusters, but as time passed, the cases from small clusters increased, probably because of fatigue from social distancing at the personal level. This is also observed when we view all cases, including non-cluster cases, from January to September (Supplementary Figure 3); a substantial proportion of all cases were "other community infections" that are not otherwise grouped after the level of social distancing was increased during phase 2.
This pattern suggests that different strategies should be adopted to control small and sporadic clusters in addition to targeting large-scale events.
During phase 2, the diagnosis-based epidemic curve of type IV clusters was already sloping downward before the high level of social distancing was initiated in mid-August. This can be interpreted in the following ways, although further investigation is needed. Firstly, the downward slope after the peak of type IV clusters may have been achieved by epidemiological investigation and mass screening, which contributed to early detection and early isolation of cases. Secondly, people may have changed their behavior ahead of the actual implementation of the raised level of social distancing, because the media reported that the government was considering raising the level of social distancing about a week before the actual implementation. Lastly, the timing of raising the level of social distancing may have been relatively late, because at the time the level was raised, the curve was already past its peak and sloping downward.
We inspected all the reported cases during the study period to identify clusters; thus, the cluster distribution could reflect the overall cluster situation of COVID-19 in the Seoul metropolitan area. However, this study has a few limitations. Firstly, since detailed information on epidemiological factors was not publicly available, further analyses of each cluster in terms of patients' characteristics, environment, and circumstances were not possible; these should be pursued in further studies.
For example, if information on specific conditions is gathered for each cluster event, such as the number of people gathered in one place, the size of the area in which the disease propagated, whether the place was properly ventilated, whether people were wearing masks, and whether people had meals or sang together, a prediction model can be developed to predict the type of a cluster at its early phase. If the type of a cluster can be predicted, resources for non-pharmaceutical interventions may be allocated preemptively according to the type of cluster. Secondly, there could have been misclassification of transmission routes during contact tracing, since capacities and the level of training may differ between municipal offices in different districts. More systematic information gathering is required during contact tracing to improve the quality of contact tracing data. Thirdly, since the data are based on the patients' residential addresses, patients who share a common transmission route but reside outside Seoul or Gyeonggi were excluded from a cluster. To include all the patients in a cluster, further resources are needed from municipal offices nationwide. Lastly, the duration of a cluster may be affected by the generation time from infectors to infectees; however, information on the relationship between infectors and infectees was unavailable. Further studies are warranted to understand the relationship between generation time and the duration of clusters.
We categorized all the clusters in the Seoul metropolitan area and showed the temporal distribution of clusters by type. Cluster categorization provides an opportunity to better understand the pattern of disease propagation. The temporal distribution of different types of clusters and its relationship with the level of social distancing imply the need for different containment strategies for specific periods of time.
For this article, ethical approval was not required.
Table 3. Number of COVID-19 clusters (number of cases) by types from January to September, 2020 in the Seoul metropolitan area. The 4 types of clusters are the following: type I (small size (<30 cases) and short duration (<2 weeks)), type II (small size (<30 cases) and long duration (≥ 2 weeks)), type III (medium size (30-99 cases) and long duration), and type IV (large size (≥ 100 cases) and long duration).
Timing social distancing to avert unmanageable COVID-19 hospital surges
Coronavirus Disease 2019 Outbreak at Nightclubs and Distribution Centers after Easing Social Distancing: Vulnerable Points of Infection
Understanding COVID-19 transmission, health impacts and mitigation: timely social distancing is the key
Korea Disease Control and Prevention Agency
Cluster infections play important roles in the rapid evolution of COVID-19 transmission: A systematic review
Infectious Disease Control and Prevention Act
COVID-19 response guideline for municipal officials (9th-2nd Edition). Central Disease Control Headquarters. Central Disaster Management Headquarters. Available
Ministry of Health and Welfare. 2020c. Basic guidelines for distancing in daily life: Central Disaster and Safety Countermeasure Headquarters. Available
Coronavirus Disease Outbreak in Call Center
Investigation of three clusters of COVID-19 in Singapore: implications for surveillance and response measures. The Lancet
Seoul Metropolitan Government. 2020. (Seoul & Gyeonggi) Raising the social distancing and disinfection responding system to level 2. Available
Epidemiological characteristics of and containment measures for COVID-19 in Busan
WHO. 2020. COVID-19 Weekly Epidemiological Update
We thank the epidemiological investigators and contact tracers in the Seoul Metropolitan Government and Gyeonggi-do Provincial Office who devoted themselves to containing the outbreaks. We also acknowledge Prof. Sung-il Cho for his valuable advice on this paper.