key: cord-0918274-tpol0yut
authors: Mohammadi, Esmaeil; Azmin, Mehrdad; Fattahi, Nima; Ghasemi, Erfan; Azadnajafabad, Sina; Rezaei, Negar; Rashidi, Mohammad-Mahdi; Keykhaei, Mohammad; Zokaei, Hossein; Rezaei, Nazila; Haghshenas, Rosa; Kaveh, Farzad; Pakatchian, Erfan; Jamshidi, Hamidreza; Farzadfar, Farshad
title: A pilot study using financial transactions’ spatial information to define high-risk neighborhoods and distribution pattern of COVID-19
date: 2022-02-09
journal: Digit Health
DOI: 10.1177/20552076221076252
sha: 52addf6b1197a4a94d2da93a6e0b27a189b1ed6d
doc_id: 918274
cord_uid: tpol0yut
BACKGROUND: Surveillance systems based on big data sources with spatial information are needed more than ever during this pandemic. Here, we present the pilot results of a new technique that combines the spatial information of financial transactions with a vital registry of COVID-19 to evaluate disease spread.
METHODS: We merged two databases, the national registry of laboratory-confirmed COVID-19 cases in Iran and the financial transactions of point-of-sale devices, from February to March 2020 as our training data sources. Spatial information was used to visualize maps and the movements of sick individuals. We used the guild associated with each point-of-sale device to examine the dynamics of financial transactions and the effectiveness of quarantines.
FINDINGS: In the study period, 174,428 confirmed cases in the COVID-19 registry had accompanying transaction information. In total, they performed 13,924,982 financial transactions, with a mean of 1.2 per day for each person. All guilds showed a decreasing pattern of “risky” transactions except for grocery stores and pharmacies. These, too, showed a decreasing pattern once lockdowns were imposed. Several cities were hotspots of disease transmission, as many “high-risk” transactions were performed in them; among these, Tehran (mainly its central neighborhoods) and the cities south of Lake Urmia predominated.
Under lockdown, the disease appeared to become gradually less transmissible.
INTERPRETATION: Financial transactions can be readily used for epidemic surveillance. Semi real-time results of such iterations can inform policy makers, guild owners, and the general population in preparing safer commuting and merchandise spaces.
Evidence before this study
Big data and vital registries have the potential to improve the quality and speed of surveillance in the context of epidemics. We searched PubMed and Google Scholar from February 2019 (as when COVID-19 emerged) to July 2021 using search terms related to the aim of the study ["big data", "machine learning", "artificial intelligence", "data science" in Boolean combination with "COVID-19" and "surveillance"] and retrospectively expanded the search to other conditions. The search yielded mostly experimental studies on the use of big data sources such as public transport and surveillance cameras to monitor the virus, and none that used financial transactions for this purpose. Although there were small experiments on the use of credit card transactions to survey infected cases, these data had not been used over time to follow the distribution behavior of the virus or to assess the efficacy of implemented public health interventions. We, therefore, evaluated the effectiveness of this data source for integration with vital registries such as the COVID-19 registry, aiming at epidemiologic surveillance of the disease.
Our study evaluated the effectiveness of a big data source (financial transactions) with both spatial and temporal information to be integrated and used in epidemiologic surveillance of a fast-spreading condition. This method and source have not previously been used at large scale and can be successfully integrated and automated for epidemiologic surveillance purposes.
Acute respiratory syndrome related to the new coronavirus was declared a pandemic by the World Health Organization (WHO) in March 2020.
1, 2 It has led to the death of millions of people, economic crises, and market shutdowns. Surveillance approaches have been implemented around the world for better control and prevention of disease spread. 3 As has been well emphasized, "Amidst this crisis, one institutionalized response promises a modicum of certainty: surveillance"; the importance of surveillance measures is clear. 4 After a while, the Iranian government faced financial pressure and passed acts to restart economic activities in a controlled fashion, an approach known as "controlled social distancing". 5, 6 Efforts were made to make this approach "intelligent" and as closely monitored as possible, accepting some horizontal spread of the disease as a trade-off to support the economy. Monitoring was carried out using health system load indicators and statistics. 7, 8 On the other hand, because Iran has a rapidly aging population undergoing a rapid epidemiologic transition, the burden of risk factors predisposing to COVID-19 mortality is high, and surveillance measures should be used meticulously to assess the efficacy of lockdowns and of the acts passed. [9] [10] [11] [12] One attempt was the use of spatial information from banking system transactions to assess the safety of city locations and of working and marketing areas, and to detect hotspots of disease transmission. Infographics and reports were made available to the public and to governors to reveal high-risk places for disease transmission in a semi real-time fashion. 2
In this study, we report our experience of combining data sources on this communicable disease for surveillance purposes. Two data sources were used: (1) the national COVID-19 registry and (2) records of the Central Bank of Iran. Data in the current report are from February to March 2020 (i.e. phase one of the epidemic in Iran). This method has been run biweekly to date, and reports are provided to health system authorities.
The COVID-19 registry aggregates all screening tests performed around the country and includes everyone diagnosed by a laboratory test. It is important to note that this pilot study enrolled positive cases detected in the first phase of the COVID-19 epidemic and that, because diagnostic kits were unavailable in many areas, many cases remained undiagnosed. Spatial information on transactions performed on point-of-sale (POS) terminals was retrieved for the study period. Because the risk of contagion increases in crowded spaces, only POS devices were chosen. Another advantage of these devices is that their exact location is registered. Besides the location of each financial transaction, the guild related to each device (e.g. grocery stores or gas stations) and the national IDs were fetched. Because of the pandemic and the risk of transmission, guilds were made to refuse paper cash, and since cash is untraceable, we chose only payments on POS devices. A person can hold many cards registered with any given service provider bank, but all transaction information is automatically and simultaneously transferred to the Central Bank's database and summed up under the national ID. The Central Bank is the upper-level organization of all other service provider banks in Iran. The national ID is a unique code given to every Iranian national after birth. To protect the confidentiality of identified data, only one member of the data analysis team (EG) was assigned to preprocess the datasets, on a single unlinked computer that we use to keep restricted information. Afterward, the preprocessed data were recoded and de-identified for further analysis. Data were cleaned for missing values, erroneous entries, and duplicates. The two data sources were merged on the recoded IDs.
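The preprocessing steps above (cleaning, recoding of IDs, and merging on the recoded IDs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the paper does not say how IDs were recoded, so the keyed hash (HMAC-SHA256) is our own choice, and all field names, the key, and the toy records are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would stay on the restricted computer.
SECRET_KEY = b"replace-with-a-secret-kept-offline"


def recode_id(national_id, key=SECRET_KEY):
    """Return a keyed hash of the national ID (assumed recoding scheme)."""
    digest = hmac.new(key, national_id.strip().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def clean(records, required):
    """Drop rows with missing required fields and exact duplicates, keeping order."""
    seen = set()
    kept = []
    for row in records:
        if any(row.get(field) in (None, "") for field in required):
            continue
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        kept.append(row)
    return kept


def recode(records):
    """Replace the 'national_id' field by a recoded 'pid' in every row."""
    return [
        {"pid": recode_id(r["national_id"]),
         **{k: v for k, v in r.items() if k != "national_id"}}
        for r in records
    ]


def merge(registry, transactions):
    """Inner join: keep only transactions whose pid appears in the registry."""
    cases = {r["pid"]: r for r in registry}
    return [{**cases[t["pid"]], **t} for t in transactions if t["pid"] in cases]


# Toy data, made up for illustration only.
registry_raw = [
    {"national_id": "0012345678", "test_date": "2020-03-01"},
    {"national_id": "0012345678", "test_date": "2020-03-01"},  # duplicate row
    {"national_id": "", "test_date": "2020-03-02"},            # missing ID
]
transactions_raw = [
    {"national_id": "0012345678", "city": "Tehran", "guild": "grocery", "date": "2020-03-03"},
    {"national_id": "0099999999", "city": "Karaj", "guild": "pharmacy", "date": "2020-03-03"},
]

registry = recode(clean(registry_raw, ["national_id", "test_date"]))
transactions = recode(clean(transactions_raw, ["national_id", "city", "guild", "date"]))
linked = merge(registry, transactions)
```

In this sketch, only the transaction of the person who appears in the registry survives the merge, and the linked rows carry the recoded `pid` instead of the national ID. A keyed hash is used rather than a plain hash because national IDs are short and could otherwise be recovered by enumeration.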
The density of transactions performed by test-positive individuals in different cities was used to find higher-risk locations. Movement of "high-risk" individuals between cities was also investigated to assess the between-city distribution of disease. As the Persian New Year holidays (Nowruz, from March 21 for 2 weeks) fell within the study period, the dynamics of transactions could be used to evaluate people's commuting, noncompliance with the universal lockdowns (from early March), and how effectively preventive measures were implemented. Several subsequent steps were carried out for the visualization and creation of infographics. Details of the methodology for each infographic and graphic are provided in the related figure caption. To evaluate the effectiveness of lockdowns and of limitations on between-city travel, we used a network graph in which cities were nodes and edges represented transactions performed by a test-positive person in a city other than the location where their debit card was issued. All analyses were performed in Python (Python Language Reference, version 3.6, available at: www.python.org).
In total, information on 174,428 laboratory-confirmed COVID-19 patients was included; these patients accounted for 13,924,982 transactions on POS devices. Of them, 97,932 (56.2%) were male, and the mean age of the sample was 52.9 (SD: 19.3) years. Also, 145,639 (83.5%) patients were first diagnosed in an inpatient setting, while for 190 (0.1%) patients the setting was missing. Cough being a highly potent means of spreading the virus, it is notable that 84,923 patients had an active cough at the time of first diagnosis. On average, 116,041 "high-risk" transactions were performed each day; each sick individual performed a mean of 1.2 "high-risk" transactions per day during the study period. Over time, the total number of transactions gradually decreased, and the decline intensified in the second week of March as the New Year holidays approached (Figure 1A and 1B).
The same decreasing pattern was detected across commercial guilds, with varying intensity. Although transactions increased at first for grocery stores and pharmacies, they turned downward just before the holidays. Among grocery stores, those selling fruit and vegetables were found to be the most dangerous "super spreader" locations. Contrary to our expectation, the number of transactions for purchases related to Internet providers and mobile devices was stagnant, reflecting the unpreparedness of Iranian infrastructure during the early stages of the epidemic. During the first week of March, the peak of the first phase of the epidemic, most northern and western areas and cities were densely affected by the virus, based on the number of "high-risk" transactions. In the next 2 weeks, limitations and lockdowns were implemented and enforced, leading to a "dilution" of dense areas, while the populous cities of Tehran and Karaj and the region south of Lake Urmia remained hotspots (Figure 2A, B, and C). For the neighborhoods of large cities, word plots were created as infographics to represent the high-risk locations. In Tehran, the central populated areas hosted more high-risk transactions (Figure 3). Efforts were made to restrain the spread of disease between cities, but the movement of sick individuals, based on the location of their transactions, indicated active commuting not only within their own provinces but also over long distances to the capital (Tehran) or other crowded cities (Figure 4). In other words, sick individuals performed many debit card transactions in distant cities other than the city of issuance, indicating nonadherence to lockdowns during the first phase of the epidemic. Many movements were even detected to small tourist cities and islands.
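The aggregations behind these results (daily transaction counts per guild, transaction density per city, and between-city movement) can be sketched with plain counters. This is a hedged illustration, not the authors' code: the record layout, the field names, and the toy data are hypothetical, and directing each edge from the card-issuing city to the transaction city is our reading of the network graph described in the methods.

```python
from collections import Counter

# Toy linked records; every field name and value is made up for illustration.
transactions = [
    {"pid": "a", "date": "2020-03-01", "guild": "grocery", "city": "Tehran", "issuing_city": "Tehran"},
    {"pid": "a", "date": "2020-03-02", "guild": "pharmacy", "city": "Karaj", "issuing_city": "Tehran"},
    {"pid": "b", "date": "2020-03-02", "guild": "grocery", "city": "Tehran", "issuing_city": "Urmia"},
    {"pid": "b", "date": "2020-03-02", "guild": "grocery", "city": "Tehran", "issuing_city": "Urmia"},
]


def daily_counts_by_guild(rows):
    """Number of transactions per (date, guild), for trend plots over time."""
    return Counter((r["date"], r["guild"]) for r in rows)


def counts_by_city(rows):
    """Number of transactions per city, a simple proxy for hotspot density."""
    return Counter(r["city"] for r in rows)


def movement_edges(rows):
    """Weighted directed edges (issuing city -> transaction city).

    Only transactions made outside the card-issuing city are counted, so
    each edge weight is the number of out-of-town transactions on that route.
    """
    return Counter(
        (r["issuing_city"], r["city"])
        for r in rows
        if r["issuing_city"] != r["city"]
    )


guild_trend = daily_counts_by_guild(transactions)
city_density = counts_by_city(transactions)
edges = movement_edges(transactions)
```

The resulting edge weights could be passed to any graph or mapping library for drawing. Note that the city of card issuance is only a proxy for a person's home city, so some edges may not reflect actual travel.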
The main point of this study is that spatial and geographical information from non-health sectors, such as financial transactions, can be successfully integrated into health datasets for surveillance purposes, especially in the context of fast-spreading epidemics. Enforced lockdowns were associated with a decrease in the density and crowdedness of shops and markets. In addition, the unpreparedness of Internet-based platforms kept the lockdowns from reaching their full effect. Surveillance systems are the cornerstone of any control and prevention strategy and are a crucial part of every nation's public health organizations. 13 It has been clearly emphasized that real-time measures of disease burden and dispersion are required to put feasible yet effective interventions in place and to increase the general awareness and knowledge of the community. 14,15 One major drawback of health care systems in developing countries is the lack of an efficient surveillance system. 13 Many electronic and digital surveillance systems are in operation in different countries, in which bioinformatics platforms are used to interactively process real-time multi-measure data sources. Such platforms do require large investments, although the return is much greater. 16 Other countries' experience with financial transactions and payment measures indicates that financial data are real-time, readily available, high-resolution information that can be relied on for surveillance purposes, especially during epidemics and lockdowns. [17] [18] [19] A Spanish experiment on 2.1 billion transactions indicates that the mobility of individuals changed during the COVID-19 pandemic. The authors describe a divergence in the amount of mobility between low-wage and wealthier populations and noncompliance of such groups with lockdown laws on weekdays.
17 A report from French banks shows a sudden decrease in spending from the accounts they hold, followed after a while by a rebound, reflecting adherence to lockdowns in the early stages of the epidemic and gradually growing noncompliance with the restrictive rules. Another interesting finding was that wealthy deciles were more likely to save money and lower-wage groups were more likely to face debt. 20 Several other studies have described the dynamics of transactions and assets during the COVID-19 pandemic with similar observations, 21-26 but none have reported the possible role of transactions and banking system data in disease transmission surveillance, whose importance the WHO has emphasized. 27 Big data and artificial intelligence can be used effectively to detect the mobility of sick individuals and to implement preventive strategies. 27 Chinese authorities have extensively used big data sources for disease surveillance and prevention; the integrated sources include transportation system databases, mobile phones, and social media. 27 South Korea has also introduced the COVID-19 Smart Management System (COVID-19 SMS), which integrates credit card transactions, smartphone locations, and security camera records to trace sick individuals' movements. Singapore's experience with mobile apps and Bluetooth technology and Taiwan's use of cellular data to restrict and control sick people were also successes. 28, 29 Similar experiments tracking the movements of sick individuals were also found in the literature from the United States, United Kingdom, and Japan. [30] [31] [32] To date, these countries have successfully contained the disease, although privacy concerns may restrain the use of such measures in other countries. We believe that the current approach can be a feasible technique for surveillance of disease spread and of the rate of noncompliance among infected persons.
However, this approach has several shortcomings that need to be mentioned in order to find better solutions. First of all, this system is semi real time. Other surveillance systems, such as identity-detecting surveillance cameras or smartphone applications, can deliver timelier output. In addition, this approach cannot notify individuals that they are passing through a risky location or that an actively contagious person is in close proximity, nor can it prevent sick individuals from moving about. One main limitation of the current work is that, due to constraints, real-time data retrieval was not possible, although the lag between registration of data and emergence of a sick individual was 3 days at most. Moreover, the COVID-19 registry of the Ministry of Health is not restricted to laboratory-confirmed cases and includes many cases diagnosed by radiography and history taking, whereas, for homogeneity of findings, we included only genetically confirmed samples. On the other hand, as far as we are aware, this study is the first to integrate a spatial source of information from the banking system with a disease repository. The unavailability of data on healthy and unaffected individuals was another limitation, as it prevented comparison and assessment of the efficacy of lockdowns and other preventive measures, a matter that should be addressed in future work. Using the guild and categorization data of bank accounts enabled us to identify high-risk marketing places and to inform policy makers and owners so that they could employ safety measures. Implementing innovative iteration strategies on real-time platforms would make fuller use of such efforts. Additionally, a combination of multiple sources of spatial data (e.g. roadside cameras and surveillance systems reading automobile plates, transactions, cell phones, etc.) could lead to more accurate results. Nowadays, we have access to an increasing amount of temporal and spatial data.
Best practice is to use these data sources to mitigate disease spread by keeping individuals from coming into close contact with each other. 33
Innovative approaches and combinations of today's vast sources of data can be used in the context of newly emerged crises and epidemics. Inter-sectoral collaboration and cooperation among organizations are vital for the fruitful implementation of preventive strategies. Trial and error is an inseparable component of innovation and invention. Support for these small-scale efforts, rather than de novo investment in unbacked commodities, is crucial for achieving larger end results. We believe geographical and spatial information from the banking system can be successfully used and integrated with other vital data sources for disease control and other purposes.
Informed consent: Not applicable, because this article does not contain any studies with human or animal subjects.
ORCID iD: Farshad Farzadfar https://orcid.org/0000-0001-8288-4046
1. Telehealth for global emergencies: implications for coronavirus disease 2019 (COVID-19)
2. A report on statistics of an online self-screening platform for COVID-19 and its effectiveness in Iran
3. A new system for surveillance and digital contact tracing for COVID-19: spatiotemporal reporting over network and GPS
4. Dis-ease surveillance: how might surveillance studies address COVID-19?
5. The cross-impact between financial markets, COVID-19 pandemic, and economic sanctions: the case of Iran
6. The impact of the social distancing policy on COVID-19 incidence cases and deaths in Iran from
7. Mapping the incidence of the COVID-19 hotspot in Iran - implications for travellers
8. Predicting COVID-19 incidence through analysis of Google Trends data in Iran: data mining and deep learning pilot study
9. Burden of non-communicable diseases in Iran: past, present, and future
10. Non-communicable diseases' risk factors in Iran; a review of the present status and action plans
11. Geographical, gender and age inequalities in non-communicable diseases both at national and provincial levels in Iran
12. Epidemiologic pattern of cancers in Iran; current knowledge and future perspective
13. Epidemiologic surveillance for controlling COVID-19 pandemic: types, challenges and implications
14. Surveillance of childhood vaccine-preventable diseases at health facilities in Jeddah, Saudi Arabia
15. Prediction of epidemic spread of the 2019 novel coronavirus driven by spring festival transportation in China: a population-based study
16. Global preparedness against COVID-19: we must leverage the power of digital health
17. Tracking the COVID-19 crisis with high-resolution transaction data
18. How does household spending respond to an epidemic? Consumption during the 2020 COVID-19 pandemic
19. Consumption and geographic mobility in pandemic times. Evidence from Mexico
20. Consumption dynamics in the COVID crisis: real time insights from French transaction bank data
21. The economic impacts of COVID-19: evidence from a new public database built using private sector data
22. Consumption in the time of Covid-19: Evidence from UK transaction data
23. The impact of the COVID-19 pandemic on consumption: Learning from high frequency transaction data
24. Consumer responses to the COVID-19 crisis: Evidence from bank account transaction data
25. Consumers' Mobility, Expenditure and Online-Offline Substitution Response to COVID-19: Evidence from French Transaction Data
26. Measuring the effects of the COVID-19 pandemic on consumer spending using card transaction data
27. Surveillance strategies for COVID-19 human infection: interim guidance
28. Singapore Wants All Its Citizens to Download Contact Tracing Apps to Fight the Coronavirus
29. How Taiwan is Tracking 55,000 People Under Home Quarantine in Real Time
30. JUE insight: measuring movement and social contact with smartphone data: a real-time application to COVID-19
31. Subsidising the spread of COVID-19: Evidence from the UK's Eat-Out-to-Help-Out Scheme
32. How Much Does COVID-19 Increase with Mobility? Evidence from New York and Four Other US Cities
33. Predicting and controlling infectious disease epidemics using temporal networks
Trial registration: Not applicable, because this article does not contain any clinical trials.