There's an App for That
Tree of Science with Scopus: A Shiny Application
Sebastian Robledo
Universidad Católica Luis Amigó
Martha Zuluaga
Universidad Nacional Abierta y a Distancia
Luis Alexander Valencia
Core of Science
Oscar Arbelaez-Echeverri
Core of Science
Pedro Duque
Universidad Católica Luis Amigó
Juan David Alzate-Cardona
Software Engineer
Hourly, Inc.
Tree of Science (ToS) is a scientific literature search tool that produces a small, selected list of citations from a larger pool of citations. ToS was initially developed for searches in the Web of Science; this paper shows how to use it with bibliographic data from Scopus through a new Shiny web application. The application processes a dataset from a Scopus search and creates three reports: the first shows a descriptive analysis, the second presents the Tree of Science of the search, and the third presents a clustering analysis of the three main subtopics. The application is accessible from this link:
Keywords: Tree of Science, Scientometrics, Scopus
Recommended citation:
Robledo, S., Zuluaga, M., Valencia, L.A., Arbelaez-Echeverri, O., Duque, P., & Alzate-Cardona, J.D. (2022). Tree of Science with Scopus: A shiny application. Issues in Science and Technology Librarianship, 100.
Researchers and librarians can access millions of research papers. However, processing, selecting, and understanding the content of this data is a difficult and time-consuming task. Therefore, it is essential to use technology to identify the most relevant academic literature. Several tools exist, and most of them offer either a point-and-click interface or a code interface. Examples of point-and-click software are CiteSpace (Chen, 2006), VOSviewer (van Eck & Waltman, 2010), and SciMAT (Cobo et al., 2012). For code-based scientometric analysis, the most popular programming languages are R and Python, and both have specialized packages. For example, R has bibliometrix (Aria & Cuccurullo, 2017) and litsearchr (Grames et al., 2019), while Python has ScientoPy (Ruiz-Rosero et al., 2019) and metaknowledge (Evans & Foster, 2011).
The ToS algorithm creates a citation network and applies graph metrics to identify papers located in the roots, trunk, and leaves; for a detailed explanation, see Valencia-Hernandez et al. (2020). ToS has been widely applied in research topics such as entrepreneurship (Robledo et al., 2021), chemistry (Durán-Aranguren et al., 2021), management (Duque et al., 2021), and medicine (Gonzalez-Correa et al., 2022).
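The idea of sorting papers into roots, trunk, and leaves can be illustrated with a minimal Python sketch. It is not the published ToS algorithm (see Valencia-Hernandez et al., 2020, for that). It assumes a simplified rule: papers that are cited but cite nothing in the network are roots, papers that cite but are not cited are leaves, and the rest form the trunk, ranked by total degree as a crude stand-in for a graph metric. The function name and the paper labels are hypothetical.

```python
from collections import Counter

def classify_papers(citations):
    """Split papers into roots, trunk, and leaves.

    `citations` is a list of (citing, cited) pairs. This is an
    illustrative degree-based rule, NOT the published ToS algorithm.
    """
    cited_by = Counter(cited for _, cited in citations)   # in-degree
    cites = Counter(citing for citing, _ in citations)    # out-degree

    roots, trunk, leaves = [], [], []
    for paper in set(cited_by) | set(cites):
        n_in, n_out = cited_by[paper], cites[paper]
        if n_out == 0:
            roots.append((n_in, paper))           # cited, cites nothing
        elif n_in == 0:
            leaves.append((n_out, paper))         # cites, never cited
        else:
            trunk.append((n_in + n_out, paper))   # connects both sides

    def ranked(group):
        # Highest score first; ties broken alphabetically.
        return [name for _, name in sorted(group, key=lambda g: (-g[0], g[1]))]

    return {"roots": ranked(roots), "trunk": ranked(trunk), "leaves": ranked(leaves)}

# Hypothetical network: B and C cite A, C and D cite B, D cites C.
example = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "B"), ("D", "C")]
print(classify_papers(example))
```

In this toy network, A lands in the roots, B and C in the trunk, and D in the leaves.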
Scopus Search
The first step in creating the ToS of a research topic is searching the Scopus database. Figure 1a presents an example with the word scientometrics. In this case, the search returns 589 results (Figure 1b). This number is vital because ToS works best with between 100 and 600 records. A minimum of 100 records is needed to create a citation network; a lower number generates dispersed networks (Pornprasit et al., 2022). The maximum of about 600 records is due to the limited memory of Shiny apps (1024 MB); a less specific search that returns more records will hinder the performance of the algorithm. In the last step, the user must select the BibTeX export format and all the parameters shown in Figure 1c. The “include references” item is key for creating the citation network.
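Before uploading, the export can be checked against the record range described above. The following Python sketch counts BibTeX entries and looks for reference lists. It assumes that each record starts with a line such as `@ARTICLE{` and that exported references appear in a field named `references`; both are assumptions about the export layout, and the function name is hypothetical.

```python
import re

MIN_RECORDS = 100  # below this, the article reports dispersed networks
MAX_RECORDS = 600  # approximate ceiling attributed to Shiny memory limits

# Assumed layout: one "@TYPE{" per record, at the start of a line.
ENTRY_START = re.compile(r"^[ \t]*@\w+[ \t]*\{", re.MULTILINE)
# Assumed field name for the exported reference list.
REFERENCES_FIELD = re.compile(r"^[ \t]*references[ \t]*=", re.MULTILINE | re.IGNORECASE)

def check_export(bibtex_text):
    """Report record count and whether reference lists are present."""
    n_records = len(ENTRY_START.findall(bibtex_text))
    n_with_refs = len(REFERENCES_FIELD.findall(bibtex_text))
    return {
        "records": n_records,
        "with_references": n_with_refs,
        "size_ok": MIN_RECORDS <= n_records <= MAX_RECORDS,
        "has_references": n_with_refs > 0,
    }

# Two hypothetical records, only one of which carries references.
sample = """@ARTICLE{Doe2020,
  author = {Doe, J.},
  year = {2020},
  references = {Roe, R., A placeholder title (2010) Placeholder Journal}
}
@ARTICLE{Roe2021,
  author = {Roe, R.},
  year = {2021}
}
"""
print(check_export(sample))
```

A result with `has_references` set to false suggests the “include references” item was not selected during export.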

ToS in a Shiny App
Shiny is an open-source framework for creating web apps directly from R (Chang et al., 2017), and these apps can be hosted online and accessed through a link. Shiny developers do not need previous knowledge of JavaScript or HTML to create useful and user-friendly apps. Shiny is used by academics to visualize their research, by professors to teach statistical concepts, and by big companies in the tech and pharma industries (Wickham, 2021). Some examples of Shiny apps are PeptCreatR (Arumugaperumal et al., 2022) and DiaThor (Nicolosi et al., 2022).
Figures 2a-e show the steps for creating the ToS from a Scopus search. Once the user has the BibTeX file from Scopus (the seed of ToS), the user can move on to the ToS Shiny app by following the application link. The Browse button in Figure 2a opens a new window to upload the BibTeX file. Once the blue progress bar is complete (Figure 2b), the user can view a descriptive analysis under the Importance button (Figure 2c). This descriptive analysis shows the scientific production published each year and the most productive authors and journals. This report is created with the bibliometrix package (Aria & Cuccurullo, 2017).
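The kind of counts behind such a descriptive report can be sketched in a few lines of Python. This is only an illustration of the idea, not the bibliometrix implementation; the record structure (dictionaries with `year`, `authors`, and `journal` keys), the function name, and the sample data are all hypothetical.

```python
from collections import Counter

def describe(records, top_n=3):
    """Count production per year and the most frequent authors and journals."""
    per_year = Counter(r["year"] for r in records)
    per_author = Counter(a for r in records for a in r["authors"])
    per_journal = Counter(r["journal"] for r in records)
    return {
        "production_per_year": sorted(per_year.items()),  # chronological order
        "top_authors": per_author.most_common(top_n),
        "top_journals": per_journal.most_common(top_n),
    }

# Hypothetical records standing in for a parsed Scopus export.
records = [
    {"year": 2020, "authors": ["Doe, J."], "journal": "Journal A"},
    {"year": 2021, "authors": ["Doe, J.", "Roe, R."], "journal": "Journal A"},
    {"year": 2021, "authors": ["Poe, P."], "journal": "Journal B"},
]
print(describe(records))
```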
The Evolution - ToS button presents the papers located in the roots, trunk, and leaves (Figure 2d). Papers in the roots are seminal, papers in the trunk give structure to the research topic, and papers in the leaves are the current literature. The link buttons take the user to a Google search based on the paper's preliminary information. For example, the seminal papers in scientometrics are Egghe (2006), Garfield (1955), and Hirsch (2005). Egghe (2006) proposed a new index, the g-index, to improve on the well-known h-index proposed by Hirsch (2005). Garfield (1955) was the creator of the Institute for Scientific Information (ISI), whose citation indexes are nowadays part of Web of Science.
Finally, Figure 2e shows a clustering analysis of the main subtopics. This cluster analysis applies the Blondel et al. (2008) algorithm to the citation network. The Shiny app presents the three largest clusters (or subtopics) of the seed (research topic), each with a word cloud that helps the user understand the topic of the cluster. The user can change the features of the word cloud, for example, by setting the number of words and their minimum frequency and by removing unnecessary words.
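The word cloud controls mentioned above (number of words, frequency, and removal of unnecessary words) amount to filtering a word-frequency table. The Python sketch below shows one way this could work; it is not the app's code, and the stopword list, parameter names, and sample titles are assumptions made for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not the app's list).
STOPWORDS = {"the", "of", "and", "a", "in", "for", "to", "on"}

def cloud_terms(titles, max_words=10, min_freq=1, extra_stopwords=()):
    """Return (word, count) pairs that a word cloud could display."""
    blocked = STOPWORDS | {w.lower() for w in extra_stopwords}
    words = (w for t in titles for w in re.findall(r"[a-z]+", t.lower()))
    counts = Counter(w for w in words if w not in blocked)
    kept = [(w, c) for w, c in counts.items() if c >= min_freq]
    kept.sort(key=lambda wc: (-wc[1], wc[0]))  # most frequent first, then A-Z
    return kept[:max_words]

# Hypothetical titles from one cluster.
titles = [
    "Citation analysis of science",
    "Citation networks in science mapping",
    "Mapping research fronts",
]
print(cloud_terms(titles, min_freq=2))
print(cloud_terms(titles, min_freq=2, extra_stopwords=["science"]))
```

Raising `min_freq` or adding words to `extra_stopwords` trims the cloud in the same spirit as the controls in the app.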

Discussion and Conclusions
ToS was developed as part of a doctoral thesis, and later the creators decided to start a non-profit organization called Core of Science. The web tool was initially developed with WoS data; however, Scopus is also an important database often available in academic libraries. ToS uses the metaphor of the tree to present the most significant papers from the results, in this case obtained from Scopus. Creating a web-based tool is expensive, and most of the time, users must pay this cost. The purpose of Core of Science is “connecting people through sharing knowledge”; thus, one of its activities is to create free web-based tools that help librarians and researchers automate some processes. In this vein, this paper presents a new Shiny app that creates a scientometric analysis to give an overall view of a research topic.
One of the big challenges in creating a citation network with Scopus data is creating a unique identifier for each article and for each of its references, because both must match other papers in the same search. WoS data has a standard identifier for references, which makes this matching easier. WoS references also include their DOIs, which facilitates the match between the references and the primary papers.
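One possible way to build such an identifier is sketched below in Python. It is a hypothetical scheme, not the one used by the ToS app: it prefers a DOI when the reference string contains one and otherwise falls back to the first author's surname plus the year. It assumes reference strings that begin with the surname followed by a comma and give the year in parentheses; the sample references are invented placeholders.

```python
import re

YEAR = re.compile(r"\((\d{4})\)")                      # year in parentheses
DOI = re.compile(r"10\.\d{4,9}/\S+", re.IGNORECASE)    # loose DOI pattern

def reference_key(reference):
    """Build a matching key for a reference string (hypothetical scheme)."""
    doi = DOI.search(reference)
    if doi:
        # A DOI, when present, is the most reliable identifier.
        return "doi:" + doi.group(0).rstrip(".,;").lower()
    surname = reference.split(",", 1)[0].strip().lower()
    year = YEAR.search(reference)
    if not surname or year is None:
        return None  # not enough information to build a key
    return f"{surname}_{year.group(1)}"

# Invented placeholder references with inconsistent formatting.
print(reference_key("Doe, J., A placeholder title (2005) Placeholder Journal, 1, pp. 1-9"))
print(reference_key("DOE, J. , A placeholder title (2005) Placeholder J."))
print(reference_key("Roe, R., Another title (2020) Journal, doi: 10.1234/ABC.5"))
```

A surname-and-year key will collide when one author publishes several papers in the same year, so a real implementation would likely need more fields, such as the source title or the first page.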
A limitation of this study is that the ToS algorithm was designed for WoS data, whereas Scopus data is spread across a broader range of time. This implies that some old papers will appear in the trunk because of their publication year. A further improvement of the ToS algorithm could take this feature of Scopus into consideration.
More information about Core of Science is found at:
This work is licensed under a Creative Commons Attribution 4.0 International License.
Issues in Science and Technology Librarianship No. 100, Spring 2022. DOI: 10.29173/istl2698