key: cord-314466-6j4vuqer
authors: Kim, A. S.
title: Transformed time series analysis of first-wave COVID-19: universal similarities found in the Group of Twenty (G20) Countries
date: 2020-06-14
journal: nan
DOI: 10.1101/2020.06.11.20128991
sha:
doc_id: 314466
cord_uid: 6j4vuqer

As of April 30, 2020, the number of cumulative confirmed coronavirus disease 2019 (COVID-19) cases exceeded 3 million worldwide and 1 million in the US with an estimated fatality rate of more than 7 percent. Because the patterns of the occurrence of new confirmed cases and deaths over time are complex and seemingly country-specific, estimating the long-term pandemic spread is challenging. I developed a simple transformation algorithm to investigate the characteristics of the case and death time series per nation, and described the universal similarities observed in the transformed time series of 19 nations in the Group of Twenty (G20). To investigate the universal similarities among the cumulative profiles of confirmed cases and deaths of 19 individual nations in the G20, a transformation algorithm of the time series data sets was developed with open-source software programs. The algorithm was used to extract and analyze statistical information from daily updated COVID-19 pandemic data sets from the European Centre for Disease Prevention and Control (ECDC). Two new parameters for each nation were suggested as factors for time-shifting and time-scaling to define reduced time, which was used to quantify the degree of universal similarities among nations. After the cumulative confirmed case and death profiles of a nation were transformed by using reduced time, most of the 19 nations, with few exceptions, had transformed profiles that closely converged to those of Italy after the onset of cases and deaths. The initial profiles of the cumulative confirmed cases per nation universally showed 3-4 week latency periods, during which the total number of cases remained at approximately ten.
The latency period of the cumulative number of deaths was approximately half the latency period of cumulative cases, and subsequent uncontrollable increases in human deaths seemed unavoidable because the coronavirus had already spread widely. Immediate governmental actions, including responsive public-health policy-making and enforcement, are observed to be critical to minimize (and possibly stop) further infections and subsequent deaths. In the pandemic spread of infectious viral diseases, such as COVID-19 studied in this work, different nations show dissimilar and seemingly uncorrelated time series profiles of infected cases and deaths. After these statistical phenomena were viewed as identical events occurring at a distinct rate in each country, the reported algorithm of the data transformation using the reduced time revealed a nation-independent, universal profile (especially in the initial periods of the pandemic spread) from which a nation-specific, predictive estimation could be made and used to assist in immediate public-health policy-making.

A brief history of the first deaths

On December 31, 2019, Chinese health authorities treated a patient cluster of pneumonia caused by the newly recognized coronavirus, i.e., severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) [1]; they had closely monitored the cluster from the beginning of December 2019. The first domestic death in China was reported on January 11, 2020, and afterward the city of Wuhan, with a population of 11·08 million, was locked down. Outside China, the first death was reported by the Philippines on February 2, 2020, followed by France on February 14, Italy on February 22, and the U.S. on February 29. This rapid transmissibility of the coronavirus was estimated by using the instantaneous reproduction number ($R_t$) and confirmed case-fatality risk for four megacities and multiple provinces reporting the highest number of confirmed cases in China.
[2] The aggressive non-pharmaceutical interventions reduced only the first wave of COVID-19 outside of Hubei, and this effort might have been more successful if foreign importation had been limited to prevent viral reintroduction. Various non-pharmaceutical interventions, such as border restriction, quarantine and isolation, and social distancing, were reported to be effective in reducing the transmission of COVID-19 (as well as influenza) in Hong Kong. [3] Further nation-specific situations of G20 nations can be found elsewhere: Argentina [4], Australia [5, 6], Brazil [7], Canada [8], China [9, 10, 11], Germany [12], France [13], Indonesia [14], India [15], Italy [16, 17, 18], Japan [19], Korea [20], Mexico [21, 22], Russia [23], Saudi Arabia [24], South Africa [25, 26], Turkey [27], UK [28], US [29], and multiple European nations [30]. Because the failure of non-pharmaceutical intervention was ascribed to overseas travel, a global metapopulation disease transmission model was used to project how travel limitations contributed to the mitigation of the global COVID-19 spread. [31] Within 2 weeks after the first death in China on January 11, the coronavirus appeared to have already been transmitted to other major cities within mainland China. [32] The reported data suggest that non-pharmaceutical interventions were effective only within cities in China but did not significantly affect the transport of COVID-19 overseas. As of April 30, 2020, the global number of confirmed cases and deaths exceeded 3·3 million and 233 thousand, respectively. Nevertheless, the initial routes of COVID-19 transmission from China to other countries are not well identified, and the correlations among patterns of the occurrence of confirmed cases and deaths in each nation are still ambiguous.

This preprint (which was not certified by peer review) is made available under a CC-BY-NC-ND 4.0 International license. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version was posted June 14, 2020. https://doi.org/10.1101/2020.06.11.20128991 (medRxiv preprint)

The objective of this study, therefore, was set to analyze the cumulative confirmed cases (CCC) and cumulative confirmed deaths (CCD) of a number of selected countries and investigate any possible universal similarities in the patterns of increases in CCC and CCD over time from December 31, 2019 until April 30, 2020. The observed universal similarities were used to predict a few cases in mid-May 2020. For this purpose, a simple mathematical data transformation model was conceptually developed and implemented by using open-source software packages and utilities.

All data sets used for this work were downloaded from the ECDC website [39] as a comma-separated value (csv) format file, named "download" without a file extension. The data file contained the daily numbers of confirmed cases and deaths for 206 locations, according to country or territory codes. Open-source utilities such as bash [40], sed [41], and awk [42] were used as needed to extract the daily CCC and CCD data for individual countries or territories. A total of 206 files were generated, with file names identical to specific countries or territories. Each extracted csv-format file had rows of six selected items (i.e., columns): date, day, month, year, cases, and deaths, where the date format was dd/mm/yyyy. In Octave (an open-source alternative to MATLAB) [43], the CCC and CCD were calculated and plotted against the number of days elapsed after December 31, 2019. For each country or territory, eight graphs (new (daily) vs. cumulative, cases vs. deaths, and linear vs. logarithmic (base 10) profiles) were plotted against the number of days. The script-generated graphs were automatically saved as image files for visual investigation after the data were plotted.
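The extraction and accumulation steps described above can be sketched in Python. This is only an illustrative sketch, not the bash/sed/awk and Octave scripts used in the study: the function name `cumulative_profiles`, the absence of a header row, and the sample rows are all assumptions. Days absent from the file are counted as zero new occurrences, which matches how the study treats missing days.

```python
import csv
import io
from datetime import date, datetime
from itertools import accumulate

EPOCH = date(2019, 12, 31)  # day 0, so that January 1, 2020 is day 1


def cumulative_profiles(csv_text):
    """Return (days, ccc, ccd) from rows of date,day,month,year,cases,deaths.

    Hypothetical helper: assumes no header row and dd/mm/yyyy dates.
    """
    daily = {}
    for row in csv.reader(io.StringIO(csv_text)):
        d = datetime.strptime(row[0], "%d/%m/%Y").date()
        daily[(d - EPOCH).days] = (int(row[4]), int(row[5]))
    days = list(range(1, max(daily) + 1))
    # Assumption: a day missing from the file contributes no new cases or deaths.
    cases = [daily.get(n, (0, 0))[0] for n in days]
    deaths = [daily.get(n, (0, 0))[1] for n in days]
    return days, list(accumulate(cases)), list(accumulate(deaths))


# Made-up rows in arbitrary order; January 2 is deliberately missing.
sample = (
    "03/01/2020,3,1,2020,2,0\n"
    "01/01/2020,1,1,2020,1,0\n"
    "04/01/2020,4,1,2020,5,1\n"
)
days, ccc, ccd = cumulative_profiles(sample)
```

Keying the daily counts by day number makes the result independent of the row order in the file.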
On a desktop computer (Linux OS, Ubuntu 18.04.4 LTS (Bionic Beaver), Intel(R) Xeon(R) CPU E5-2697 v4 @ 2·30 GHz, 64 GB RAM), this task required only 3-4 minutes for all 206 countries and territories. Close visual inspection of these graphs per nation provided an initial understanding of each nation's CCC and CCD time series in 2020, where January 1 was set as day 1. Although the basic data extraction and mining were completed for all nations by using the daily updated data file, analysis of all 206 countries or territories was challenging. To potentially include nations representing all continents (excluding Antarctica), the 19 individual nations in the G20 were selected for the current analysis. The European Union (EU) was excluded because it contains multiple member countries, and ECDC reports COVID-19 data only for individual countries and territories.

... functions embedded in Calc. The date information in "dd/mm/yyyy" format was converted into "yyyy-mm-dd" format to calculate the number of days between two specific dates by using the "DAYS" function embedded in Calc software. This time conversion from lexical date format to number of days (as a countable integer) was an important basic step for further data investigation and analysis.

In this work, the cumulative data of cases and deaths (i.e., CCC and CCD) were primarily used instead of data of new daily occurrences with respect to the number of days after December 31, 2019. In principle, using the cumulative information in statistical analysis is equivalent to using a cumulative distribution function as an integral of a probability density function with respect to control variables. The fundamental advantages of using the cumulative data are as follows. First, the daily data often fluctuate too much to capture specific variations and trends in target variables, so that statistically meaningful characteristics not only are subject to the data observer's viewpoint but also are often challenging to extract.
Second, the cumulative profiles never decrease, so that either rapid/gradual or local/global variations can be systematically captured by using semi-log plots, i.e., common-logarithmic CCC and CCD on the y-axis vs. the linear day number on the x-axis. Third, the ever-increasing trend in logarithmic cumulative data is often considerably smoother than the original daily time series. Therefore, variation trends can be captured without significant statistical noise. In addition, with a fixed time interval, i.e., 1 day in the current case, the original time series data can be easily retrieved by calculating the difference of the cumulative data between two consecutive days.

For a consistent analysis, we define new variables, $C(n)$ and $D(n)$ of day $n$, indicating the CCC and CCD, respectively, where $n$ increases from $n = 1$ for January 1, 2020 to the day for which the latest data are available. The time series analysis in this study primarily used the cumulative data from January 1 to April 30, 2020, and the developed algorithm was tested using the data from May 1 to 15, 2020. Because the time series of cases and deaths are updated daily, a unit time interval is set as 1 day, i.e., $\delta n = 1$. Then, CCC and CCD are represented as functions of day number $n$:

$$C(n) = \sum_{k=1}^{n} \delta C(k) \quad \text{and} \quad D(n) = \sum_{k=1}^{n} \delta D(k),$$

respectively, where $\delta C(n)$ and $\delta D(n)$ are the increased numbers of cases and deaths, respectively, on day $n$. Some nations do not have complete data sets: the missing days either are in early January, when no pandemic effects were found, or are intermittent gaps of 2-3 days after the pandemic report was started. Because cumulative data are processed and analyzed in this study, the missing days are treated as days with no new occurrences, which avoids arbitrarily altering the statistical results by interpolation processes.

Data similarity was found unexpectedly between the infection and death time series available online and the phase transition patterns of matter in thermodynamics.
In statistical physics, the Clausius-Clapeyron equation describes a discontinuous transition between two phases of single-constituent matter. Various materials have their unique material properties, often represented by using material constants, for example, evaporation enthalpy, called latent heat, at a specific temperature. Although many organic or inorganic solutions have various trends in their phase diagrams, e.g., pressure vs. temperature ($P$-$T$) curves, the overall pattern of how $P$ increases with temperature $T$ is universal in physics. Along the liquid-gas equilibrium, $P$ and $T$ are related as

$$\frac{dP}{dT} = \frac{\Delta s}{\Delta v} = \frac{L_h(T)}{T\,\Delta v},$$

where $\Delta s = L_h(T)/T$ is the molar entropy change from liquid to gas phases, $L_h$ [J/mol] is the molar latent heat as a function of absolute temperature $T$ [K], and $\Delta v$ [liter/mol] is the molar volume difference between the two phases, which is often approximated as the gas-phase molar volume. Recent analysis of the Clausius-Clapeyron equation for water has been described elsewhere in detail. [45] Theoretical ideas for the current work stem from fundamental thermodynamics and statistical physics, in that the Clausius-Clapeyron equation provides a universal functional form that covers a number of materials undergoing transitions between two or more phases.

Among the 19 nations belonging to G20, the cases and deaths reported in Italy were selected as principal data. Furthermore, we hypothesized that the CCC and CCD profiles of other nations would have certain degrees of similarity to those of Italy. Because minimizing deaths is a more immediate task than reducing the number of confirmed cases, we investigated the CCD data first.
To directly compare the CCD time series of Italy (IT) and that of nation X, a characteristic day is defined as the first day when Italy's CCD exceeded a certain threshold number, selected here as 10 and denoted $\nu_d = 10$. The day number of the date when Italy's total deaths exceeded the threshold is denoted $m_d^{\mathrm{IT}} = 54$ (indicating February 26, 2020). Then, the x-axis of Italy's CCD vs. time graph moves from $n$ to $n - m_d^{\mathrm{IT}}$, which is equivalent to moving Italy's profile by $m_d^{\mathrm{IT}} = 54$ days to the left on the time axis. The reduced time $\tau_d$ (for the total deaths), specifically for Italy, is defined as

$$\tau_d^{\mathrm{IT}} = \frac{n - m_d^{\mathrm{IT}}}{\beta_d^{\mathrm{IT}}},$$

where $\beta_d^{\mathrm{IT}} = 1$ by definition because Italy's CCD data form an international baseline of CCD and therefore do not need to be scaled. A general definition of the reduced time for CCD is

$$\tau_d^{X} = \frac{n - m_d^{X}}{\beta_d^{X}}$$

for nation X, where X will be replaced by the two-character abbreviations of the nations investigated. In this case, $n - m_d^{\mathrm{IT}}$ indicates the number of days after the sudden increase in Italy's CCD. The universal value $\nu_d = 10$ is chosen, as is frequently done for order-parameter estimations in statistical physics, because Italy's CCD drastically increases after it exceeds 10. A negative value of $n - m_d^{\mathrm{IT}}$ indicates the number of days before the explosive CCD onset. For nation X, the reduced time $\tau_d^{X}$ can be obtained by identifying the nation's day number of CCD onset $m_d^{X}$, interpreted as a time-shifting parameter, and calculating $\beta_d^{X}$, defined as a time-scaling parameter. The physical meaning of $\beta_d^{X}$ can be explained by an example: if $\beta_d^{X} = 1.5$, then the CCD rate of nation X is 1·5 times slower than that of Italy. In this study, it was found that most nations (except a few outliers) have $\beta$ values greater than 1·0. After $m_d^{X}$ is identified, the CCD profile of nation X is plotted on the same graph with respect to the nation's reduced time $\tau_d^{X} = (n - m_d^{X})/\beta_d^{X}$ by using an initial guess of $\beta_d^{X} = 1$.
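The onset-day identification and the reduced-time transformation defined above can be sketched as follows. This is a minimal illustration, not the author's Calc or Octave implementation; the function names `onset_day` and `reduced_time` and the sample profile are hypothetical.

```python
def onset_day(cumulative, threshold):
    """Return the first day number n (January 1 = day 1) on which the
    cumulative count exceeds the threshold, or None if it never does."""
    for n, value in enumerate(cumulative, start=1):
        if value > threshold:
            return n
    return None


def reduced_time(n, m, beta=1.0):
    """Reduced time tau = (n - m) / beta for day number n, time-shifting
    parameter m, and time-scaling parameter beta (beta = 1 for the baseline)."""
    return (n - m) / beta


# Made-up cumulative death counts for days 1..6, with threshold nu_d = 10.
ccd = [0, 0, 3, 7, 12, 29]
m_d = onset_day(ccd, 10)

# Baseline (beta = 1): day 121 with m = 54 maps to a reduced time of 67.
tau_baseline = reduced_time(121, 54)
# A nation with beta = 1.5: six real days after onset count as four reduced days.
tau_scaled = reduced_time(60, 54, beta=1.5)
```

A negative return value of `reduced_time` corresponds to days before the onset, as in the text.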
Italy's variation in CCD has the steepest slope among all countries after its first COVID-19-related death was reported on February 22, following France's first reported death on February 14. In this regard, the CCD profiles of other countries, especially the five European nations of Germany (DE), France (FR), Russia (RU), Turkey (TR), and the United Kingdom (UK), lie to the right of Italy's CCD on the time axis of $n$, with lower CCD values. While $m_d^{X}$ moves the CCD profile of nation X to the left to match its onset to that of Italy, the $\beta_d^{X}$ value proportionally shortens (or lengthens) the CCD profile of nation X on the shifted axis of $n - m_d^{X}$. The CCD profiles of Italy and nation X are numerically integrated with respect to $n$ and $\tau_d^{X}$, respectively, from 0 to $\min[\max(n - m_d^{\mathrm{IT}}), \ldots]$. Values of the integrals are the areas under the CCD curves for Italy and nation X, whose absolute difference is minimized by iterative adjustment of $\beta_d^{X}$ in Calc software. That is, the optimal value of $\beta_d^{X}$ maximizes the degree of overlap between Italy's linear CCD profile (without use of the time-scaling parameter $\beta_d^{\mathrm{IT}}$) and nation X's transformed CCD profile. Because $\beta_d^{X}$ is determined by comparing two finite integrals, $\beta_d^{X}$ is independent of the pre-selected values of $m_d^{X}$.

Moreover, CCC profiles are analyzed with the same method used for CCD profiles, i.e., for nation X, identification of the CCC onset day, calculation of the time-shifting parameter $m_c^{X}$, and determination of the time-scaling parameter $\beta_c^{X}$. The reduced time for nation X's CCC is similarly defined as

$$\tau_c^{X} = \frac{n - m_c^{X}}{\beta_c^{X}},$$

where $m_c$ and $\beta_c$ are different for each nation. The threshold of CCC is preset as $\nu_c = 100$ using the same criteria used to preset $\nu_d$. Specific values of $\tau$ and $\beta$ for CCD and CCC are listed in Table 1 for all G20 nations excluding the EU (denoted G19).
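The area-matching idea behind the time-scaling parameter can be sketched as below. This is a hedged illustration only: the study used iterative adjustment in Calc, whereas this sketch swaps in a plain grid search over candidate values. The function names, the linear interpolation between days, the trapezoid rule, and the choice of the upper limit as the largest reduced time covered by both profiles are all assumptions, and the profiles are synthetic. Both input lists are taken to start at the onset day (index 0).

```python
def interp(profile, x):
    """Linearly interpolate a post-onset profile (index = days after onset) at real x."""
    i = int(x)
    if i >= len(profile) - 1:
        return float(profile[-1])
    frac = x - i
    return profile[i] * (1.0 - frac) + profile[i + 1] * frac


def area(profile, stretch, upper, steps=400):
    """Trapezoid-rule integral of profile(stretch * tau) for tau in [0, upper]."""
    h = upper / steps
    total = 0.5 * (interp(profile, 0.0) + interp(profile, stretch * upper))
    for k in range(1, steps):
        total += interp(profile, stretch * k * h)
    return total * h


def fit_beta(ref, other, betas):
    """Return the candidate beta that minimizes the absolute area difference
    between the reference profile and the time-scaled profile of another nation."""
    best_gap, best_beta = None, None
    for beta in betas:
        # Assumed upper limit: reduced-time range covered by both profiles.
        upper = min(len(ref) - 1, (len(other) - 1) / beta)
        gap = abs(area(ref, 1.0, upper) - area(other, beta, upper))
        if best_gap is None or gap < best_gap:
            best_gap, best_beta = gap, beta
    return best_beta


# Synthetic check: "other" is the reference shape stretched 1.5 times in time.
ref = [t ** 2 for t in range(41)]
other = [(s / 1.5) ** 2 for s in range(61)]
betas = [0.5 + 0.01 * k for k in range(251)]
beta_fit = fit_beta(ref, other, betas)
```

On this synthetic pair the search recovers a value close to 1·5, i.e., the second profile evolves about 1·5 times slower than the reference.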
This work was motivated by interesting similarities among the CCD data for six nations on the continent of Europe included in G19: DE, FR, IT, RU, TR, and the UK. These six nations are denoted E6 throughout the manuscript, and E5 denotes all those nations except Italy, i.e., the nation with the highest number of CCC and CCD at most times to date. Figure 1 shows the pandemic time series of the E6 nations: CCC profiles on (a) linear and (b) logarithmic scales and CCD series on (c) linear and (d) logarithmic scales. Notably, the y-axis maxima of (a) and (b) for CCC are much higher (approximately tenfold or more) than those of (c) and (d) for CCD. The current fatality rate of COVID-19 is estimated to be on the order of $O(10^{-2})$, i.e., a few percent. Herein, CCC and CCD data are plotted on y-axes of linear and logarithmic (base 10) scales along (linear) x-axes of time, i.e., the number of days after December 31, 2019; these are referred to as linear and logarithmic plots, respectively. In general, two data lines in a linear plot can be compared by observing the apparent dominance of one line over the other in terms of magnitude. The logarithmic plots allow for comparison of two data lines at various orders of magnitude. For example, the onset time and latency period of each nation are better visualized in the logarithmic plots, whereas their one-to-one comparison is more straightforward in the linear plots.

Several unique aspects identified by simple visual investigation in figure 1 became the major motivation for this study. First, the linear CCC profiles shown in figure 1(a) appear to follow an ordered, previously unknown pattern. Except for the CCC case of the UK, no intersections between two nations are seen.
This trend holds without exception in the CCD profiles shown in figure 1(c), thus strongly implying that, if the ordered pattern continues for all E6 nations, none of the E5 nations will have more severe situations than Italy, which has the highest CCC and CCD numbers. This argument was found to be reasonably valid until mid-April 2020 (i.e., day 100 or later). Individual nations' immediate enforcement of public health policies may be able to alter increasing CCC rates, but non-pharmaceutical interventions are known to be inefficient after COVID-19 spread becomes prevalent. Second, figure 1(b) shows a sudden onset followed by a latency period of approximately 1 month. During this latency period, the CCC remains at approximately 10, specifically, between 2 and 20. One exception is Turkey, which shows only a few days of CCC latency; Turkey's CCC also intersects with that of Russia before it exceeds the threshold of approximately 100.

Determination of the onset time and latency

More importantly, the latency periods of CCD profiles are often too short to recognize, and the subsequent death bursts are very abrupt, showing steep slopes. The onset profiles of the E6's CCD suggest that after a nation identifies the first death related to and/or caused by the coronavirus, the death burst will inevitably occur within no more than a week. Updated plots of E6's CCC and CCD profiles until May 15, 2020, are included as figure A.1 in the Appendix.

The pandemic profiles of the E5 nations are investigated by calculation of each nation's $m$ and $\beta$ values for CCC and CCD, which are compared with those of Italy as baseline values. Figures 2(a) and (b) show the apparent similarities between the E5's transformed profiles and Italy's linear profiles of CCC and CCD, respectively. This universality of the pandemic time series data is present in both confirmed cases and deaths, because an infection is a necessary condition for death unless complications of the COVID-19 pandemic develop unexpectedly.
Since the key process in the current study is transforming the cumulative profiles of nation X onto those of the reference country (Italy), the threshold values for CCC and CCD must be preset to calculate $m$ and $\beta$ values. These threshold values are subject to intuitive data observation, allowing qualitative human perceptions to be input into numerical calculations for more meaningful data interpretations.

CCC of the E6

Figure 2(a) shows that not all E6 nations have a similar onset trend after a certain latency period, but most appear to follow Italy's profile after their CCC numbers exceed approximately 100, denoted here as the default CCC threshold $\nu_c = 100$. During the transformation, the actual threshold value of a nation's CCC is flexibly and manually adjusted to identify the best profile-matching values of $m_c$ and $\beta_c$ per nation. In general, the best matching is obtained by concurrently searching $\nu_c$, $m_c$, and $\beta_c$ by using the "Solver" function in OpenOffice Calc. The target matching zone of the reduced time is where Italy's CCC varies from 100 to 10,000. Germany, France, and the UK clearly have longer latency periods of approximately 3-4 weeks, and three other nations have latency periods of less than 1 week. Italy's latency period appears to be almost four weeks, with a small CCC number of 3. Much in-depth research is required to understand the heterogeneous onset trends in neighboring nations, but the similarity of their transformed CCC profiles after $\tau_c > 0$ is, to the best of my knowledge, a unique finding in this work. All E5 nations' ... Table 1.
This onset time difference is graphically represented as the distance between the highest CCC values of the UK and Italy. In addition, $\beta_c^{\mathrm{UK}} = 1.021$ indicates that the propagation speed of the confirmed cases in the UK is 1·021 times (or 2·1 percent) slower than that of Italy, so that the UK and Italy have nearly the same rate of increase in CCC over time. The difference in the reduced time between the latest CCC numbers of Italy and the UK is 67-58·83=12·17 days in figure 2(a), and if this value is multiplied by $\beta_c^{\mathrm{UK}} = 1.021$, then 12·45 days is obtained, which is the time distance in real days for the UK to reach the highest level seen in Italy. That is, the UK's CCC will increase from 165 thousand (as of April 30) and reach that of current Italy (203,591 as of April 30) by May 12 or 13 (i.e., 12 or 13 days after April 30). This time distance, denoted $T_c(\mathrm{UK/IT})$, is conceptually visualized by a double arrow at the end of Italy's CCC profile in figure 2(a). In addition, Germany and France have CCC onsets 5 days after that of Italy, and France's CCC is approximately 23·3% slower than that of Italy, but Germany's CCC propagation is as fast as that of Italy because $\beta_c^{\mathrm{DE}} = 1.025 \approx 1.0$. Turkey and Russia both started their CCC onset 21 days after Italy's: in terms of CCC propagation rates, Turkey ($\beta_c^{\mathrm{TR}} \approx 1.0$) has an equal pace to that of Italy, and Russia ($\beta_c^{\mathrm{RU}} = 1.353$) is the slowest among the E6 nations and is close to the US ($\beta_c^{\mathrm{US}} = 1.332$). These nation-specific, complex trends of CCC onset and burst are explained using the simple transformation method, which can predict the time distance to future situations using the reduced time concept.

CCD analysis methods are the same as those of CCC, described above. ... higher $\beta_d$ values than $\beta_c$: $\beta_d^{\mathrm{FR}} = 0.898$, thus indicating that the death rate of France is $1/\beta_d^{\mathrm{FR}} = 1.114$ times (or 11·4 percent) faster than that of Italy.
Because France is the only nation with $\beta_d < \beta_c$, its slower CCC and faster CCD rates than those of Italy may be ascribed to the first death in France occurring 8 days earlier than Italy's first death on February 22. Similarly to $T_c(\mathrm{UK/IT})$, the CCD time distance of the UK, denoted $T_d(\mathrm{UK/IT})$, is calculated as

$$T_d(\mathrm{UK/IT}) = \beta_d^{\mathrm{UK}} \left( \tau_d^{\mathrm{IT}} - \tau_d^{\mathrm{UK}} \right),$$

where $\tau^{\mathrm{IT}}$ ... any future scenario. After April 30, 2020, the UK's CCC and CCD numbers exceeded those of Italy, as shown in figures A.2(a) and (b), which add the May data to figures 2(a) and (b), respectively. Russia appears to start the secondary CCC burst after April 30; however, its CCD record is still one order of magnitude lower than that of Italy because Russia's first death is 31 days after that of Italy. More one-to-one comparisons between nations and Italy could be used to provide specific information for decision-making but are beyond the scope of this work, which focuses on developing and implementing the transformation algorithm with the newly defined time-shifting and time-scaling parameters of $m$ and $\beta$, respectively. Nevertheless, the transformation methods used above for E6 nations are extended to additional nations.

We expand the list of nations to test whether the CCC and CCD similarities among E6 nations are also present in more nations on six different continents: Africa, North and South America, Asia, Australia/Oceania, and Europe. In this regard, we select G19 as described above. Because the EU is not an individual nation but a group of European member countries, it is not included in the pandemic analysis in this study. Data used for the G19 analysis implicitly include those of the E6 used above.
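The time-distance estimate used above for the UK, i.e., a reduced-time difference multiplied by the nation's time-scaling parameter, can be sketched as follows. The function name `time_distance` and all numerical inputs are hypothetical and chosen only to keep the arithmetic easy to follow; they are not values from Table 1.

```python
from datetime import date, timedelta


def time_distance(tau_ref, tau_x, beta_x):
    """Real days nation X needs to reach the reference nation's latest level:
    the reduced-time gap (tau_ref - tau_x) scaled back by beta_x."""
    return beta_x * (tau_ref - tau_x)


# Hypothetical inputs: the reference is at reduced time 68, nation X at 60,
# and nation X evolves 1.25 times slower than the reference.
gap_days = time_distance(tau_ref=68.0, tau_x=60.0, beta_x=1.25)

# Convert the gap into a calendar date counted from the last data day.
last_data_day = date(2020, 4, 30)
projected = last_data_day + timedelta(days=round(gap_days))
```

The estimate implicitly assumes that nation X keeps following the reference profile in reduced time over the projected interval.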
Figure 3 shows the G19 nations' linear profiles of CCC and CCD, denoted $C(n)$ and $D(n)$, respectively, and their scaled profiles of $C(\tau)$ and $D(\tau)$, respectively. The overall trend in the $C(n)$ profiles in figure 3(a) is similar to that of $D(n)$ in figure 3(c), slightly moved to the left on the time axis, because a nation's CCC profile precedes its CCD profile in time. As previously discussed for the E6 cases, the threshold cut and latency periods of CCC onsets are generally higher and longer than those of CCD onsets, respectively. Most G19 nations show a two-step CCC burst: a first burst from $O(1)$ to $O(10\text{-}10^{2})$, followed by a certain latency period before a second burst from $O(10\text{-}10^{2})$ to at least $O(10^{4})$ or higher. As graphically shown in figures 3(b) and (d), the universality of the cumulative profiles is not limited to E6 nations in the European ... figure 3(b), i.e., in the most recent days, shows a much steeper slope than those of China, South Korea, and even Italy. It is of concern that $C^{\mathrm{US}}(\tau_c^{\mathrm{US}})$ will increase more rapidly than those of any other nation in G19. Except for a few abnormal or outlying CCC profiles, figure 3(b) shows a strong universality among G19 nations in converging to the baseline CCC of Italy, especially during the initial reduced time $\tau_d$ of approximately 15 days, despite the large population differences and the different continents.

... of deaths is manageable. The universal linearity of all G19 nations shown in figure 3(d) in the reduced time $\tau$ from 0 to 30 mathematically implies a simple exponential growth law for a nation $i$ such as

$$\log_{10} D^{(i)} = \alpha \tau + \gamma^{(i)},$$

where $\alpha$ is the universal slope and $\gamma^{(i)}$ is a y-intercept.
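A straight line on the semi-log plot can be fitted by ordinary least squares on $\log_{10} D$ against reduced time. The sketch below is an assumption about how such a slope and intercept could be estimated, not the procedure used in the study; the function name `fit_log_linear` and the synthetic data (generated with a slope of 0·1 and an intercept of 1·0) are hypothetical.

```python
import math


def fit_log_linear(taus, deaths):
    """Least-squares fit of log10(D) = alpha * tau + gamma.

    Returns (alpha, gamma). All death counts must be positive.
    """
    ys = [math.log10(d) for d in deaths]
    n = len(taus)
    mean_t = sum(taus) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in taus)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(taus, ys))
    alpha = sxy / sxx
    gamma = mean_y - alpha * mean_t
    return alpha, gamma


# Synthetic profile that is exactly linear on a semi-log plot.
taus = list(range(0, 31))
deaths = [10 ** (0.1 * t + 1.0) for t in taus]
alpha, gamma = fit_log_linear(taus, deaths)
```

Restricting `taus` to the early reduced-time window keeps the fit within the range where the profiles appear linear on the logarithmic scale.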
On the basis of Italy's CCD data from the first 21 days after onset, $\alpha = 0.121$ and $\gamma^{\mathrm{IT}} = 1.02$ are calculated. In figures 3(a) and (c), as of April 30, $C^{\mathrm{US}}(n = 121)$ = 1,040K is approximately 5·10 times $C^{\mathrm{IT}}(n)$ = 204K, and this ratio is in good agreement with the population ratio (in 2019) of the US (328·2 M) and Italy (60·36 M), i.e., 328·2 ÷ 60·36 = 5·44. However, the ratio of $D^{\mathrm{US}}(n)$ = 60·97K and $D^{\mathrm{IT}}(n)$ = 27·8K, i.e., 2·19, is below half the population ratio as of April 30. The other, more rigorous way to estimate the death rate is as follows. The ... which is closer to $C^{\mathrm{US}}(n = 121)/C^{\mathrm{IT}}(n = 121) = 5.10$ and the population ratio of 5·44. Although the ratios calculated above are not a common academic standard but instead are empirical, they provide a quantitative method to estimate the future CCC and CCD of the US, as derived from the baseline values for Italy. The coincidence of the total case and population ratios is limited to the US, and further in-depth research is crucial for more fundamental evaluations.

For the US, $\beta_c^{\mathrm{US}} = 1.332$ and $\beta_d^{\mathrm{US}} = 1.221$ indicate that the case and death rates are 33·2 percent and 22·1 percent slower than those of Italy, respectively. The CCC and CCD onsets of the US are only 8 and 9 days after those of Italy, respectively. However, after the US's CCD onset, only 38 days (from March 5 to April 12) are required to exceed Italy's linear CCD profile, $D^{\mathrm{IT}}(n)$, in figure 3(c). The same trend in the US is found in figure 3(d), such that $\Delta\tau^{\mathrm{US}}$ ... figure 3(b) should be further analyzed in depth, because the US currently has the largest number of CCD in the world, and, more severely, the greatest potential for further deaths. Closely monitoring the UK, Russia, and Brazil, in addition to the US, indicates that Italy's role as the reference country may end for this first wave of onset and burst of global cases and deaths.
The second wave is not yet predictable using the current knowledge obtainable from the data released by ECDC.

Figure 4 shows a scatter plot of $\beta_c$ vs. $\beta_d$ for all G19 nations. Italy is, in principle, positioned at (1,1) on the diagonal line. A position above the diagonal line indicates that the nation's death rate is slower than the infection case rate, i.e., $\beta_d > \beta_c$. Most nations are located between the diagonal and upper lines, with the exceptions of Australia and South Korea. In Australia, even if the number of infection cases were to increase as rapidly as those of most nations, such as India, South Africa, and Canada, the death rate is approximately three times slower than the infection rate. South Korea is located at (0.914, 2.907), far above the Italy position (1,1) but below that of Australia. Although South Korea initially had a faster increase in the infection number ($\beta_c^{\mathrm{KR}} < 1$) than that of Italy, its death rate significantly decelerates after day 90 in figure 3(c). Three nations, France, Mexico, and the US, are below the diagonal line. Although the positions of these three nations are still near the diagonal line, their death rates are of concern because $\beta_d < \beta_c$. For more meaningful analyses, various domestic conditions in the nations should be considered systematically in addition to the actual number of CCC and CCD; however, such analysis is beyond the scope of this research.

The present study used a mathematical transformation to identify universalities among many nations (on multiple continents) lacking apparent similarities in population, land size, and socioeconomic conditions. Two time-related parameters were newly introduced in this study for the CCC and CCD: the time-shifting parameter $m$ and the time-scaling parameter $\beta$.
The parameters m and β move a nation's cumulative profile to a new time-origin and allow the nation's profile to be matched to Italy's baseline by stretching or shrinking the profile (anchored at the new origin) along the time coordinate (i.e., the x-axis). The m and β values were obtained individually for the CCC and CCD of each nation and used to define the reduced time τ. By transforming a nation's data relative to Italy's baseline CCC and CCD, the short-term estimation of cumulative profiles becomes possible, and the results can be used for broad types of decision-making. Because of the large number of individual nations and territories where the COVID-19 pandemic caused severe public health problems, the current study is restricted to the time series of the CCC and CCD of 19 independent nations within the G20 (excluding the EU). The primary research idea originated from the sequentially ordered patterns of CCD time series found in six nations on the European continent during the early stage of the pandemic spread, i.e., within 90 days. With the transformation methods of reduced time, both the CCC and CCD profiles of the other five European nations converged to those of Italy, which were used as baselines for the rest of the present study. Exceptions observed were China, South Korea, and the US, owing to the noticeable deviations of their CCC and CCD profiles from those of Italy. When the transformation of the cumulative data was extended to all G19 nations, the universality of the profile convergence was found to be valid for as many as 15 nations within G19, excluding the three exceptions above and the reference country, Italy. The common plateau profiles of China's CCC and CCD, reached in the middle of February 2020, showed early deviation from Italy's baseline profile. South Korea's CCC profile appeared to be a down-sized version of China's, representing only a small number of new recent cases per day.
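To make the roles of the time-shifting and time-scaling parameters concrete, the following is a minimal sketch of the transformation step. The functional form τ = (n − m)/β is an assumption made here for illustration: it is consistent with β > 1 meaning a slower progression than Italy's, but the exact definition should be taken from the methods section. The function names and the sample profile are hypothetical and are not real data.

```python
def reduced_time(n, m, beta):
    """Map day index n to reduced time.

    ASSUMED form: tau = (n - m) / beta, i.e. shift the origin by m days,
    then rescale the time axis by beta.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    return (n - m) / beta


def transform_profile(days, counts, m, beta):
    """Return (tau, count) pairs: counts are unchanged, only time is mapped."""
    return [(reduced_time(n, m, beta), c) for n, c in zip(days, counts)]


# Hypothetical cumulative profile (NOT real data): a curve that starts
# 8 days late and evolves 1.25 times more slowly than the baseline.
days = [8, 13, 18, 23]
counts = [10, 100, 1000, 5000]

transformed = transform_profile(days, counts, m=8, beta=1.25)
for tau, count in transformed:
    print(f"tau = {tau:5.1f}  cumulative count = {count}")
```

With m = 0 and β = 1 the mapping is the identity, which corresponds to the reference nation; under this assumed form, fitting m and β for another nation amounts to choosing the shift and stretch that bring its transformed curve closest to the baseline.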
If the CCC profile is assumed to be a good precursor of incoming CCD, South Korea's CCD profile already appears to have deviated from that of Italy (see figure 3). On a linear time-scale, the CCC and CCD profiles of the US already exceeded those of Italy in late March and early April, respectively, but the US profiles in the reduced time-scale became the world's largest values much earlier than those observed on the linear time-scale. Visual investigation of the transformed CCC and CCD profiles implied that, unlike those of other nations, the US profiles intrinsically did not follow Italy's profiles but increased much faster, beyond the most recent CCD and CCC values of Italy. A rough but conservative prediction of the US's CCC is approximately five fold higher than that of Italy, a result similar to the US-to-Italy population ratio.

More importantly, for future responses to a similar pandemic spread, I emphasize the new fundamental insights and findings obtained from the transformation model developed herein. The CCD and CCC profiles of different nations show subtle but distinct characteristics in their onset behaviors. The six nations in Europe had drastic increases in CCD immediately after their onsets. Only France and Russia showed graphically recognizable CCD latency periods. However, five nations in Europe (except Turkey) showed a longer latency period of CCC, close to four weeks. For the six nations in Europe, the threshold values of CCC and CCD were on the order of $O(10)$ and $O(10^{2} - 10^{3})$, respectively, which correlated with the fatality rate of COVID-19, close to 7.0% as of April and May, 2020. Similar characteristics of the latency periods and threshold values of CCC and CCD were found for all G19 nations, implicitly including the E6 nations. First, in general, the CCC profile of a nation usually has a longer latency period and a higher threshold value than those of CCD of the nation, respectively.
Second, there exist baseline CCC and CCD profiles of a nation whose status is more severe than that of any other nation in the initial pandemic period of at least 90 days; this nation was Italy in the current study. The three outlier nations whose CCD/CCC profiles did not converge to those of Italy were the US, China, and South Korea. On the basis of visual investigation of the transformed CCC/CCD plots of these nations, the CCC/CCD of the US, which already exceed those of Italy, are expected to increase much faster than those of any other nation, to an unprecedented level. The CCC and especially the CCD of South Korea have almost certainly already stabilized to seemingly constant values, thus indicating that the number of new confirmed cases per day should decrease to one- or two-digit numbers. China's profiles have the longest time series, with an abrupt daily jump of 1,290 in CCD but no similar variation in CCC.

In recent engineering disciplines, big data research and applications have become an essential component of future development. In this regard, two representative approaches are data-driven and data-oriented: in the former, progress is compelled by data, excluding human inputs, whereas the latter is originally intended to optimize software programs against the poor data locality of object-oriented programming. In decision-making processes during the current global health crisis, a novel paradigm should include the principal advantages of the robust data structures of data-driven approaches, relaxing the excessive dependence on data and ensuring efficient data-utilization in data-oriented approaches for prompt, accurate decision-making. In addition, data-informed approaches can also be considered to create proper balances in decision-making by fully utilizing demonstrated knowledge, experience, skills, and predictive ability.
In conclusion, the current study of the mathematical transformation (i) combined three data-utilization approaches by using the pandemic data from ECDC on human infection and death caused by COVID-19 without any mining processes, (ii) developed program scripts by using open-source software, and (iii) intuitively applied basic principles of statistical physics to pandemic time series analyses. New outcomes of the current work are the developed transformation model of the reduced time τ with the two parameters m and β, tabulated for the CCC and CCD time series of a total of 19 nations. Specifically, this study suggests that the universality found by using the transformation model can be effectively and efficiently applied in public health policy-making for any nation whose CCC and CCD profiles follow those of the leading nation, i.e., Italy. Because neighboring nations have similar trends in CCC and CCD propagation over time, close international collaborations involving sharing of human and medical resources would effectively decelerate the pandemic spread and reduce the risk to human lives.

ASK designed the study, collected the data, developed scripts of open-source software, analyzed and interpreted the data using the scripts, and wrote the manuscript.

The author declares no competing interests.

This paper was reviewed by Thomas C. Hardy, a professional medical editor, for accuracy and correct language use.

medRxiv preprint doi: https://doi.org/10.1101/2020.06.11.20128991; this version posted June 14, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Table 1 for details.) The double arrow toward the end-point of Italy's profile in (a) indicates the estimated time distance for the U.K. to reach the same CCC level as that of Italy.
and CCD have gradually exceeded those of Italy so that the current model with Italy as the reference country might not be as predictable as the present analysis. A population-based pandemic dynamics model is of great necessity.