key: cord-0476527-42q1ckz1 authors: Hamoui, Btool; Alashaikh, Abdulaziz; Alanazi, Eisa title: Google Searches and COVID-19 Cases in Saudi Arabia: A Correlation Study date: 2020-11-29 journal: nan DOI: nan sha: bc5ba75842a9174a64dd378c4a91abf8615023a4 doc_id: 476527 cord_uid: 42q1ckz1

Background: The outbreak of the new coronavirus disease (COVID-19) has affected human life to a great extent on a worldwide scale. Early in the pandemic, public health professionals faced an extraordinary challenge in tracking and quantifying the spread of the disease.

Objective: To investigate whether a digital surveillance model using Google Trends (GT) is feasible for monitoring the coronavirus outbreak in the Kingdom of Saudi Arabia.

Methods: We retrieved GT data for keywords related to ten common COVID-19 symptoms from March 2, 2020, to October 31, 2020. Spearman correlation was used to determine the correlation between COVID-19 cases and the Google search terms.

Results: Cough and Sore Throat were the symptoms most searched for by Internet users in Saudi Arabia. The highest daily correlation was found with Loss of Smell, followed by Loss of Taste and Diarrhea. A strong correlation was also found between the weekly confirmed cases and the same symptoms: Loss of Smell, Loss of Taste, and Diarrhea.

Conclusions: We conducted an investigative study that used Internet searches related to COVID-19 symptoms for surveillance of the pandemic's spread. This study documents that Google searches can be used as a supplementary surveillance tool for COVID-19 monitoring in Saudi Arabia.

The coronavirus was first reported in December 2019 in China. It then spread steadily to many other countries and became a global pandemic in March 2020. In response, countries applied different governance mechanisms to combat the pandemic. The pandemic has affected human life and triggered alerts around the globe.
However, the responses to it have depended mainly on local governance. In Saudi Arabia, strict control and preventive measures were undertaken to prevent the spread of the outbreak. Although the spread of the virus across the Kingdom is considered limited compared to other countries [1], there is a critical need to activate digital health surveillance to make the response more effective and reduce the risk of further spread of the disease.

In the past few years, the Internet has become a very popular medium for people seeking health-related knowledge and information for self-diagnosis. As the coronavirus emerged, people turned to the Internet to search for signs of the disease, such as the appearance of specific symptoms, their treatment, and recovery from them. Consequently, Internet searches are an important form of user-generated content that may carry information about users' health. Social media posts and Internet search data can play a prominent role in promoting health awareness during an emerging outbreak [2]. Systems or applications that use Internet-based data to nowcast or forecast disease infections are known as digital disease surveillance [3].

Numerous works have proposed disease surveillance methodologies that take advantage of user-generated web content, either in the form of social media posts or search engine query logs. In 2008, Google developed an early warning system that predicts influenza activity by aggregating Google search query volumes related to flu symptoms [4, 5]. During the Zika outbreak in 2016 [6], a predictive model was developed using Twitter posts, Google Trends, and HealthMap reports. The model successfully predicted Zika case counts in Latin America when compared with traditional surveillance systems.
Recently, several studies have exploited the information in Google Trends to better monitor the COVID-19 outbreak [7, 8, 9]. The study by Walker et al. [7] demonstrated a strong correlation between Google searches related to smell and COVID-19 cases in Italy, Spain, the UK, the USA, Germany, France, Iran, and the Netherlands. A correlation analysis between the Google search keywords "coronavirus", "COVID", "COVID 19", "corona", and "virus" and the daily confirmed cases in India is presented in [8]. Another study examined the correlation between Google searches for the keywords "wash hands" and "face mask" and the number of confirmed cases in 21 countries [9].

Saudi Arabia has the largest Internet user population in the Arab world [10]. An investigation by Alduraywish et al. [11] found that the Internet is one of the most common sources of health information for Saudis. The present study aims to validate the use of Google Trends data as a complementary data source for digital surveillance in Saudi Arabia. In this work, we investigate the correlation between Google Trends data on common coronavirus symptoms, using Arabic keywords, and the confirmed COVID-19 cases in Saudi Arabia.

The study period was from March 2, 2020, to October 31, 2020. March 2, 2020, is the symptom onset day of the first confirmed case with a positive coronavirus result. Data on daily COVID-19 cases in Saudi Arabia were collected from the Ministry of Health (MOH) daily reports. We manually crafted a list of 26 Arabic n-gram keywords related to coronavirus symptoms, as shown in Table 1. We obtained the search data for these symptoms from Google Trends, an open-access platform that provides the relative search volume (RSV), that is, search frequency data scaled from 0 to 100. The daily trend data for the list of Arabic keywords were obtained from Google Trends by setting the location parameter to "Saudi Arabia" and the time parameter to March to October 2020.
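The keyword-level series described above have to be combined before per-symptom totals can be reported. The following is a minimal sketch of one way to do that, assuming that each keyword yields a daily RSV list of equal length and that a symptom's overall RSV is the element-wise sum of its keywords' series. The keyword names, the mapping, and the numbers are hypothetical placeholders, not the study's data, and summation is an assumption about the aggregation rather than a confirmed detail of the authors' method.

```python
from collections import defaultdict


def daily_rsv_per_symptom(keyword_series, keyword_to_symptom):
    """Sum, day by day, the RSV series of all keywords mapped to each symptom."""
    grouped = defaultdict(list)
    for keyword, series in keyword_series.items():
        grouped[keyword_to_symptom[keyword]].append(series)
    # zip(*group) walks the grouped series one day at a time.
    return {
        symptom: [sum(day_values) for day_values in zip(*group)]
        for symptom, group in grouped.items()
    }


def total_rsv_per_symptom(daily_by_symptom):
    """Collapse each symptom's daily series into a single total over the period."""
    return {symptom: sum(series) for symptom, series in daily_by_symptom.items()}


# Hypothetical toy data: three keywords over four days (not real Google Trends output).
keyword_series = {
    "cough_kw_a": [10, 40, 100, 60],
    "cough_kw_b": [0, 20, 50, 30],
    "smell_kw_a": [0, 5, 25, 100],
}
keyword_to_symptom = {
    "cough_kw_a": "Cough",
    "cough_kw_b": "Cough",
    "smell_kw_a": "Loss of Smell",
}

daily = daily_rsv_per_symptom(keyword_series, keyword_to_symptom)
totals = total_rsv_per_symptom(daily)
print(daily)
print(totals)
```

Because RSV values are relative rather than absolute counts, totals built this way are only meaningful for comparing symptoms retrieved under the same location and time settings.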
The total RSV for each symptom is shown in Figure 1. The strength of the association between Google Trends search queries and the daily and weekly numbers of newly confirmed cases was assessed using the Spearman rank correlation. A correlation coefficient of r > 0.5 was considered a high correlation, and a p-value < 0.05 was considered statistically significant. We tested the correlation for each of the 29 symptom keywords and for the overall RSV of each symptom. The daily and weekly correlations between the overall RSV of each symptom and COVID-19 cases are presented in Table 2.

Figure 1 shows that, between March 2 and October 31, the most frequent Google search queries were about Cough, Sore Throat, and Fever, with total RSVs of 21,428, 20,376, and 19,934, respectively.

Regarding the daily Spearman correlation, Arabic searches about "Loss of Smell" were strongly and significantly correlated with COVID-19 cases, as shown in Table 2. Moderate correlations were observed for "Loss of Taste", "Diarrhea", and "Shortness of Breath" (p-value < 0.05). Although only weak correlations were found with the RSVs for "Fever", "Headache", "Sore Throat", and "Fatigue", these associations were statistically significant (p-value < 0.05).

Regarding the weekly Spearman correlation, the RSVs for "Loss of Smell", "Loss of Taste", and "Diarrhea" showed strong correlations ranging from 0.578 to 0.83. All three correlations were statistically significant (p-value < 0.05). In addition, moderate and statistically significant correlations (p-value < 0.05) were found for "Shortness of Breath", "Headache", "Fatigue", and "Fever". The overall symptom RSVs were strongly and significantly correlated with both daily and weekly COVID-19 cases.

Figure 2 shows the daily RSVs of "Loss of Smell", "Loss of Taste", and the overall symptoms together with the confirmed COVID-19 cases.
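The correlation analysis described above can be illustrated with a short sketch. It computes the Spearman coefficient as the Pearson correlation of average ranks, aggregates daily values into weekly sums, and applies the r > 0.5 threshold used in this study. The function names and toy numbers are hypothetical, the use of non-overlapping 7-day sums for the weekly series is an assumption, and the sketch does not compute p-values; a statistics library such as SciPy (`scipy.stats.spearmanr`) returns both the coefficient and the p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def weekly_sums(daily, days_per_week=7):
    """Sum non-overlapping 7-day blocks; an incomplete trailing week is dropped."""
    last_start = len(daily) - days_per_week + 1
    return [sum(daily[i:i + days_per_week]) for i in range(0, last_start, days_per_week)]


# Hypothetical 14-day toy series (not the study's data).
rsv = [3, 5, 4, 8, 12, 15, 14, 20, 25, 22, 30, 28, 35, 40]
cases = [10, 12, 15, 14, 30, 45, 50, 70, 85, 90, 120, 110, 150, 170]

r_daily = spearman(rsv, cases)
is_high = r_daily > 0.5  # threshold for a high correlation used in this study
print(f"daily r = {r_daily:.3f}, high correlation: {is_high}")
print("weekly RSV:", weekly_sums(rsv), "weekly cases:", weekly_sums(cases))
```

With only a few weekly points, as in this toy example, a rank correlation is not informative; the weekly analysis in the study covers the full March to October period.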
We observed that "Loss of Smell" searches rose and fell simultaneously with the daily COVID-19 cases. In Saudi Arabia, the number of daily confirmed cases reached its peak (4,919 cases) on June 16, 2020, and then decreased for six days until June 23. From that date, the number of daily confirmed cases began to increase again, reaching another peak on June 29 (see Figure 2). A decrease in the number of confirmed cases started on July 7 and continued through the remaining three months: August, September, and October. Consequently, the total number of confirmed cases was highest in June (107,083 cases), while the totals for July, August, September, and October were 87,783, 40,602, 19,366, and 12,693, respectively. Similarly, over the eight months, the highest total GT RSVs for "Loss of Smell", "Shortness of Breath", "Fever", and the overall symptoms ("All symptoms") were found in June 2020, as shown in Figure 2(G), (A), and (E).

In late March 2020, the international medical community began circulating press releases about loss of the sense of smell as a sign of COVID-19 and a possible marker of infection [12]. By the end of April 2020, the Centers for Disease Control and Prevention (CDC) had added "Loss of Smell" to its list of common coronavirus symptoms [13]. In addition, multiple studies have identified "Loss of Smell", or anosmia, as a prominent symptom of COVID-19 infection [14, 15, 16].

Our findings are consistent with recent studies that demonstrated an association between Google searches related to "Loss of Smell" and COVID-19 cases [7, 17, 18]. Between February and May 2020, strong correlations (r > 0.65) were found between Google searches for the term "Loss of sense of smell" and COVID-19 cases in Brazil, Italy, the USA, France, and Spain [17]. The term "Loss of Smell" was also one of the ten keywords used to investigate the association between Google searches and COVID-19 cases in the United States [18].
In the period between January 22 and April 6, 2020, the correlation was r = 0.61 for the United States as a whole, while strong correlations were found in New York (r = 0.70, with lag -8) and Arizona (r = 0.66, with lag -3).

Overall, the correlations found in our study demonstrate that Google searches can provide helpful data for syndromic surveillance during an emerging pandemic such as COVID-19. This type of data is easily accessible and free of charge, and it can augment and complement traditional public health disease surveillance. Because the data are made available to the public in real time, they can help health authorities take the right action at the right time.

However, the study presented here has limitations. For Internet-based data, changes over time might be a response to changes in media coverage during the outbreak. This is known as media-driven bias, which affects Internet-based surveillance systems, as reported by a previous study [19]. At the early stage of the pandemic, the media and the medical community paid significant attention to certain symptoms such as Fever, Cough, and Sore Throat. We therefore attribute the weak correlations with symptoms such as "Cough" and "Sore Throat" to changes in individuals' awareness that followed media and news coverage of the COVID-19 pandemic. As illustrated in Figure 2(B) and Figure 2(D), most Google searches about cough and sore throat occurred in March.

Another limitation of the study is coverage: Google searches are not representative of the entire population of Saudi Arabia. We focused only on Arabic, the most widely spoken language in Saudi Arabia, although residents also speak other languages such as English, Urdu, and Indonesian.
Furthermore, residents of villages and remote areas with slow Internet connections or limited access do not have the same opportunity to use the Internet as those in urban areas. Despite these limitations, as presented in previous studies [20, 21, 22], we observed in our study that the COVID-19 symptoms with less media coverage, namely Loss of Smell, Loss of Taste, and Diarrhea, can provide indications of virus spread. Hence, we still find that Internet searches are a viable data source that can be used as a digital surveillance technique to assist public health efforts.

In this paper, we investigated the feasibility of using Google searches to track the COVID-19 outbreak in Saudi Arabia. Our study showed that the overall Google Trends data about symptoms were strongly correlated with confirmed COVID-19 cases, and that searches for "Loss of Smell" in particular were highly correlated with them. Our study demonstrated the potential role of Internet searches in the fight against the current pandemic. The study also highlights the advantages and limitations of Internet-based data for digital health surveillance in Saudi Arabia.

[1] Preparedness and response to COVID-19 in Saudi Arabia: Lessons learned from MERS-CoV
[2] Social networks and health: New developments in diffusion, online and offline
[3] Social media- and Internet-based disease surveillance for public health
[4] Detecting influenza epidemics using search engine query data
[5] Early detection of disease outbreaks using the Internet
[6] Forecasting Zika incidence in the 2016 Latin America outbreak combining traditional disease surveillance with search, social media, and news report data
[7] The use of Google Trends to investigate the loss of smell related searches during COVID-19 outbreak
[8] Prediction of COVID-19 outbreaks using Google Trends in India: A retrospective analysis
[9] Google searches for the keywords of "wash hands" predict the speed of national spread of COVID-19 outbreak among 21 countries
[10] Internet usage and user preferences in Saudi Arabia
[11] Sources of health information and their impacts on medical knowledge perception among the Saudi Arabian population: Cross-sectional study
[12] Loss of sense of smell as marker of COVID-19 infection
[13] Coronavirus (COVID-19) information for employees and patients
[14] Loss of smell in COVID-19 patients: A critical review with emphasis on the use of olfactory tests
[15] COVID-19 and anosmia: A review based on up-to-date knowledge
[16] Identifying and ranking common COVID-19 symptoms from tweets in Arabic: Content analysis
[17] Loss of smell and taste: A new marker of COVID-19? Tracking reduced sense of smell during the coronavirus pandemic using search trends
[18] Correlations between COVID-19 cases and Google Trends data in the United States: A state-by-state analysis
[19] Prediction of dengue incidence using search query surveillance
[20] Is Google Trends a reliable tool for digital epidemiology? Insights from different clinical settings
[21] The application of Internet-based sources for public health surveillance (infoveillance): Systematic review
[22] Correlations of online search engine trends with coronavirus disease (COVID-19) incidence: Infodemiology study