Web of Science's "Citation Mapping" Tool [Review]

Issues in Science and Technology Librarianship
Summer 2008
DOI:10.5062/F4NZ85MT

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

Electronic Resources Reviews

Brian D. Simboli
Science Librarian
Library and Technology Services
Lehigh University
Bethlehem, Pennsylvania
brs4@lehigh.edu

Copyright 2008, Brian D. Simboli. Used with permission.

In July 2008 Thomson Reuters added a new "citation mapping" tool to its {Web of Science} product. This tool, which is touted on the Web of Science (hereafter, WOS) search interface as a beta version, enables users to visualize the relationship between citing and cited references.

The citation mapping tool is a welcome addition to WOS. Below I discuss how the tool works, offer some comments and suggestions about it, and conclude with some notes about future directions.

Persons interested in bibliographic visualization software may also be interested in HistCite, developed by Eugene Garfield, pioneer of cited/citing searching and analysis. For an overview of HistCite, see Herther (2007). This review will not compare the new citation mapping tool in WOS to the features of HistCite, but will reference the HistCite web page in a few places.

How WOS’s Citation Mapping Works

Before examining how the citation mapping tool works, some context is in order. WOS enables two types of searching. The first involves searching on a topic, author or publication name to bring up "source records." Source records are bibliographic records for the journals and monographic series that WOS indexes. In executing this first type of search, WOS behaves the same way as any bibliographic database.
The second search type starts with a known citation and either identifies documents it cites or identifies source records for documents that cite it. This type of searching is the traditional hallmark of the Science, Social Sciences, and Arts and Humanities Citation Indexes now integrated into WOS. (Incidentally, given its coverage of social sciences and arts and humanities, "Web of Science" could be named differently.)

After searching using one of these methods, one retrieves a WOS source record such as the one in Figure 1. Note the links for citing records (labeled "Times Cited"), cited records ("References"), and finally, the citation mapping tool ("Citation Map beta").

Figure 1.

Clicking on the "Times Cited" link brings up WOS source records for documents whose reference lists include the paper in the source record above. Clicking on "References" brings up references that the paper in the record has cited. Using these links enables users to discern a chain of citing-cited relationships: a list of records cited by the source record, the source record itself, and the items that cite the source record.

The new citation mapping tool helps the user visualize citing-cited relationships. Clicking on the "Citation Map" link brings up a screen that provides three options. One can choose to map the source ("target") record and the items published later in time that cite it (the path forward option), the target record and the items that it cites (the path backward option), or both forward and backward.

Also, one can select whether to bring up first- or second-generation maps. The concept of a "generation" is documented as follows: "the records that directly cite or are directly cited by the target record are the first generation, records citing records that cite the target record and records cited by records cited by the target record are the second generation."
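
The definition of generations quoted above can be read as a breadth-first walk over citing-cited pairs. The following is only a minimal sketch of that idea, not a description of how WOS implements it: the record labels, the `build_index` and `generations` helpers, and the choice to place each record in the earliest generation where it appears are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(citations):
    """Build lookup tables from (citing, cited) pairs."""
    cites = defaultdict(set)     # record -> records it cites (path backward)
    cited_by = defaultdict(set)  # record -> records that cite it (path forward)
    for citing, cited in citations:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    return cites, cited_by

def generations(target, neighbors, depth=2):
    """Return {generation number: set of records}, walking outward from target.

    Assumption: a record is listed only in the earliest generation
    in which it is reached.
    """
    seen = {target}
    frontier = {target}
    result = {}
    for gen in range(1, depth + 1):
        reached = set()
        for record in frontier:
            reached |= neighbors.get(record, set())
        reached -= seen
        result[gen] = reached
        seen |= reached
        frontier = reached
    return result

# Hypothetical data: T is the target record.
citations = [
    ("T", "A"), ("T", "B"),  # T cites A and B
    ("A", "C"),              # A cites C
    ("X", "T"), ("Z", "T"),  # X and Z cite T
    ("Y", "X"),              # Y cites X
]
cites, cited_by = build_index(citations)
backward = generations("T", cites)     # path backward
forward = generations("T", cited_by)   # path forward
```

Here `backward` holds A and B in the first generation and C in the second, while `forward` holds X and Z in the first generation and Y in the second. Choosing "both" in the tool would correspond to showing the two results together.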
The screen indicates that second generation maps can create a "time out," so in creating them one should select just the forward or the backward option, not both. Finally, to create a citation map, the user clicks on a button (that is a bit too far down on the page).

For the source record that appears above, selecting "both" (forward and backward citations) and "one generation" creates the citation map that appears in Figure 2.

Figure 2.

Here are some highlights of the screen in Figure 2; for further details about aspects of the search screen, see the "Citation Mapping Help" available on the upper right of the page displaying the citation map.

Double-clicking a node displays in the lower right the details for the relevant item as well as a link to the source record when one is available.

In the lower left there is a list of the cited and citing documents. A right arrow indicates that the document cites the paper by Pazzaglia; a left arrow indicates a document that the Pazzaglia paper itself cites.

Drop-down menus enable the user to change the appearance of the citation map. One can specify the node text (for example, author or titles) and the order and color of the nodes. Also, color coding enables one to group together those nodes that share certain similarities.

Given that data associated with various nodes can overlap, the text on the labels associated with the nodes can be difficult to read. However, clicking and holding the left mouse button allows one to drag the map around the screen to make sections of it more legible. Also, clicking on the double arrows on the far left hides the two lower panels, increasing the space available to view the citation map.

Possible Uses

For librarians, researchers, and students, what advantages does using the citation map tool afford that just using the "Times Cited" and "References" links does not?
When teaching about the concept of a citation database, the author of this review often puts on the board a diagram that has the same structure as the citation map in Figure 2. Now the software itself generates maps, which can be used by librarians in WOS training sessions to illustrate the concept of searching for cited/citing relationships.

Persons who want a visual impression of the citing-cited relationships related to a given document can use the citation mapping feature. Spatially or graphically oriented persons may especially appreciate the map display. Also, to get a sense of second generation relationships, it is easier for anyone to consult the citation map than to consult the "Times Cited" and "References" links.

Given that the maps are not necessarily exhaustive (as discussed below), and that citing data has increasingly become available from databases other than WOS (Simboli (2008) gives a non-exhaustive list of such resources), users who download citation maps and need to add data to them may want to try the following suggestion from the documentation:

Save Citation Map As Image

"This option allows you to save your Citation Map as an independent image. The default format for saving the image file is PNG (Portable Network Graphics). After you select this option, you can open the image or save the image to a folder on your computer. During this process, you can change the file format to another format such as GIF or BMP by replacing the PNG file extension after the file name. If you have a graphics tool (for example, Macromedia Fireworks), you can modify your graphics image. For example, you can add text or resize your image. Before using this option, check to ensure that Pop-up Blocker is turned off with your browser."

Researchers referencing previously published work as part of grant proposals, or faculty members up for tenure and promotion, may want to submit the resulting citation maps as part of their documentation.
(The ability to produce citation maps for purposes of publication or presentation is one of the {features} touted for HistCite.)

Finally, a vendor representative suggested these other uses: "Display Subject categories to see how diverse the impact of a given paper has been (the multi-disciplinary nature of science). Display Countries to see how widespread the paper has been distributed and read. Display and group by Year to see relative changes in citations patterns over time."

Comments and Suggestions

The following are some comments about using the citation mapping feature, as well as some possible areas in which to develop the software further (again, it is currently in beta).

Two groups of persons use WOS. The first group uses WOS to locate relevant literature but is not concerned about exhaustive searching. The second group is more concerned about exhaustiveness. This group could include faculty members documenting the full number of times their work has been cited, or researchers or patent attorneys needing to identify every piece of literature related to a given topic. (My impression, without being able to offer any statistics, is that the first group may very well outnumber the second.)

Librarians serving these two populations should understand this distinction in needs as they help users understand and use WOS. For the first group, it should often (if not usually) suffice to use the citation mapping feature and the "Times Cited" and "References" links, all available from a WOS source record. For the second group, however, it can be important to use the "Cited Reference Search" engine available in WOS. (For the benefit of the second group, the vendor should, in my view, make this point clear in a bit of text in the source record.)

To understand why it is important for the second group to use the "Cited Reference Search" feature, consider the similar point for the "Times Cited" link present in a source record (cf. Figure 1).
Clicking that link does not necessarily retrieve all the source records that cite the target record. There may be more citing papers. This is because the "Times Cited" link brings up only those citing source records whose reference lists contain an item that matches the key elements of the bibliographic details in the original source record. Using the "Cited Reference Search" engine in WOS rather than a source record's "Times Cited" link, one can look for variations (including errors) in the way the cited document's bibliographic detail was rendered in a citing paper. This enables one to identify additional citing references not captured by using "Times Cited."

Similarly, the path forward citation mapping tool maps those citing records that closely match the bibliographic data in the original ("target") source record. There may be other citing records that do not appear in the map because their data varies in ways otherwise discernible by someone using WOS's "Cited Reference Search" engine.

In sum, using the "Cited Reference Search" feature may help the exhaustive searcher locate some citing items that neither the "Times Cited" link nor the citation map would. Correspondence received from Thomson Reuters mentioned that most cited items do not face this problem of variation and indicated that "we have extensive and complex algorithms that analyze and unify citations to create the Times Cited count and list (and ultimately the map). The variants are the items that have so little information that we cannot with any degree of accuracy attribute." (Persons interested in issues about variations may want to see Buchanan (2006).)

Librarians can help exhaustive searchers by explaining how to use WOS's "Cited Reference Search" engine to bring up the "Cited Reference Index," and then how to rummage through the latter to find variations in the way citing authors rendered bibliographic details.
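
To make the point about variations concrete, here is a small sketch of why matching on key bibliographic elements can miss a citing reference, and how a looser comparison can surface candidates for a person to review. It does not reproduce Thomson Reuters' algorithms, which are not described in this review. The choice of key fields, the `strict_match` and `match_score` helpers, the 0.75 threshold, and all of the sample references are hypothetical.

```python
# Assumed key elements; the elements WOS actually uses are not specified here.
KEY_FIELDS = ("author", "year", "volume", "page")

def strict_match(ref, target):
    """True only if every key element agrees exactly."""
    return all(ref.get(f) == target.get(f) for f in KEY_FIELDS)

def match_score(ref, target):
    """Fraction of key elements that agree (a crude confidence value)."""
    hits = sum(
        1 for f in KEY_FIELDS
        if ref.get(f) is not None and ref.get(f) == target.get(f)
    )
    return hits / len(KEY_FIELDS)

# Hypothetical target record and cited references taken from citing papers.
target = {"author": "SMITH J", "year": 2001, "volume": 12, "page": 345}
cited_refs = [
    {"author": "SMITH J", "year": 2001, "volume": 12, "page": 345},   # exact
    {"author": "SMITH J", "year": 2001, "volume": 12, "page": 354},   # page transposed
    {"author": "SMITH JA", "year": 2001, "volume": 12, "page": 345},  # initials vary
    {"author": "JONES K", "year": 1999, "volume": 3, "page": 10},     # unrelated
]

# References that a strict comparison would link to the target.
linked = [r for r in cited_refs if strict_match(r, target)]

# Near misses that a searcher might inspect by hand.
candidates = [
    r for r in cited_refs
    if not strict_match(r, target) and match_score(r, target) >= 0.75
]
```

In this toy data only the first reference is linked, while the second and third show up as candidates. That is roughly the kind of item an exhaustive searcher would look for when browsing variations by hand.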
Note that when creating second generation maps, one may come across nodes that "stop" at the first generation. Citation searching aficionados may find the reason for this to be of interest.

In the path backward case, the items that the target record cites include either source records or non-source records (where the latter can include variations that are actually identical to source records). When they are non-source records, the branch ends at the first generation; when the first generation item is a source record, the branch continues to the second generation.

In the path forward case, source records that cite the target appear in the first generation. If those citing records themselves are cited by other source records, the branch continues to the second generation. Unlike the path backward case, the path forward items in a citation map, whether first or second generation, are all source records.

The documentation indicates, concerning one of the features of the citation mapping tool, that: "This option allows you to assign colors to nodes according to their record properties. That is, you can select a field to which to apply colors. This process takes the top 20 occurring data values and assigns them one of the 20 node colors (see Set Node Color). Any node that is not within the top 20 values remains black."

It would be worth noting in the documentation that when one creates a map and attempts to color code items by a specific data element, a node will be color coded only if its record contains that data element. For example, when the user color codes by institution name, if the WOS data for a paper produced at a given institution happens not to contain the institution’s name, that paper will not be color coded with other papers from that institution.
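
The color-coding behavior described in the documentation, together with the missing-data case just noted, can be sketched as follows. This is an illustration under stated assumptions rather than the tool's actual code: the `assign_colors` function, the placeholder color names, and the sample records are hypothetical, and only the "top 20 values, otherwise black" rule comes from the quoted documentation.

```python
from collections import Counter

DEFAULT_COLOR = "black"
# Placeholder names standing in for the 20 node colors the documentation mentions.
PALETTE = [f"color{i}" for i in range(1, 21)]

def assign_colors(nodes, field, palette=PALETTE):
    """Map node id -> color based on the most frequent values of `field`.

    Nodes whose record lacks the field, or whose value falls outside the
    most frequent len(palette) values, keep the default color.
    """
    values = [rec[field] for rec in nodes.values() if rec.get(field)]
    top_values = [v for v, _ in Counter(values).most_common(len(palette))]
    color_of = dict(zip(top_values, palette))
    return {
        node_id: color_of.get(rec.get(field), DEFAULT_COLOR)
        for node_id, rec in nodes.items()
    }

# Hypothetical records; p4 was produced at Institution A, but its record
# happens not to contain the institution name.
nodes = {
    "p1": {"institution": "Institution A"},
    "p2": {"institution": "Institution A"},
    "p3": {"institution": "Institution B"},
    "p4": {},
}
colors = assign_colors(nodes, "institution")
```

In this sketch p1 and p2 share a color, p3 gets a different one, and p4 stays black even though it belongs with p1 and p2. That is the situation in which a visible marker for "no data in this field" would help the user.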
(Thomson Reuters noted in correspondence that "you can custom color nodes so if you want to make things match colorwise you can.") It is, of course, understandable that color coding cannot always occur in such a case if an author does not supply data for the specified field. But to alert the user, perhaps a way should be found to mark items that lack data for the field over which one is color coding.

It would be useful to have the ability to mark records so that one can "harvest" them for downloading, printing out, or exporting to citation management software such as EndNote or RefWorks.

From the "Manage" menu, the user can select a "Save Citation Image" option. The resulting download consists of a static screen shot, comparable to a computer’s "print screen" function. An e-mail feature would enable e-mailing a citation map without having to download the map and make it an e-mail attachment.

Another improvement would be to ensure that, in a citation map downloaded from WOS, all the text for a given node appears. Perhaps a plug-in made available from the WOS database could distribute the nodes of a downloaded map such that the node text does not overlap in the downloaded file. Perhaps use of paper larger than 8.5 by 11 inches might help make legible the titles (or authors) of papers in those cases when there are large quantities of data.

A software interface that enables the user to manipulate a citation map, add or tweak nodes and their data, and hyperlink to source records, whether or not the source records are from WOS, would be helpful. (In this context, see the description of HistCite at {http://thomsonreuters.com/products_services/science/science_products/a-z/histcite/#tab2}.)

In the tool's current form, after one selects the option that labels the nodes with titles, only partial titles display. The same is true of the titles in the record that pops up when one mouses over the node. Thomson Reuters plans to rectify these drawbacks in the next release in October 2008.
Thomson Reuters also produces software called {RefViz} that, like the citation mapping feature, enables its user to visualize relationships between bibliographic records. However, in the case of RefViz, the relationships between the records are not citing-cited ones, but consist rather of conceptual proximity. Having incorporated citation mapping into WOS, Thomson Reuters may want to expand this product’s "visualization" capabilities by incorporating some of RefViz’s capabilities into WOS. (For a review of the first version of this software, see Simboli and Zhang (2004).)

Future Directions

A few final comments about possible future directions.

Right now, one can at most create second-generation maps using the citation mapping tool in WOS. Given server capabilities and Internet speeds, it may be too much to ask at this time, but one could imagine a future ability to follow seamlessly, beyond the second generation, a branch of the "tree" of citing and cited relationships within a map. One would go either forward or backward for as many generations as the data would allow, giving the citation mapping feature a truly dynamic quality. Perhaps it would save memory or speed up the display if data were loaded and discarded as one proceeded along a specific branch.

Algorithms could be used to make educated guesses about how to normalize citation variations, where those guesses could have a confidence level significantly below 100%, with confidence levels marked at nodes. Comprehensiveness of coverage would likely be more important than a high level of precision in normalizing citations. Given that most people will likely use citation maps to help locate literature in a non-exhaustive way, the issue of normalizing variations (in bibliographic detail for the same document) is not a critical problem in creating citation maps, if high precision in normalizing variant data would require compromising comprehensiveness.
Creating such a tool would require a great deal of cooperation among publishers and disparate database vendors in supplying a huge quantity of citing-cited data going well beyond the practical scope of any single database’s coverage. (As mentioned, databases other than WOS have increasingly been including the ability to do citing-cited searching.) Such a tool, if ever developed, would further the ability to represent, and provide access to, the historical progress of human inquiry, including its interdisciplinary aspects.

In the meantime, taking a cue from WOS, other database producers may want to include citation mapping tools. At some point, perhaps a web-based platform or client software will emerge that automatically integrates into one map the citing and cited relationships discovered by means of disparate databases and then enables the user to edit the map and add additional nodes. In this context, see Herther (2007), according to which HistCite Software LLC "plans to incorporate tools for automatically downloading records from other bibliographic databases in future versions of HistCite." Also along these lines, for a discussion of the role that citation management software can play with respect to collecting and displaying citing-cited relationships, see Simboli and Zhang (2002).

I wish to thank John Adams and Don Sechler of Thomson Reuters for many helpful suggestions, including the reference to the article by R. Buchanan.

References

Buchanan, Robert A. 2006. Accuracy of Cited References: The Role of Citation Databases. College and Research Libraries 67 (4): 292-303.

Herther, N. 2007. Eugene Garfield Launches HistCite. NewsBreaks & The Weekly News Digest. [Online]. Available: http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=40024 [August 8, 2008].

Simboli, B. 2008. Cited/Citing Resources: A Special Way to Build Bibliographies. [Online].
Available: {http://library.lehigh.edu/page.php?id=Cited/Citing_Resources%3A_A_Special_Way_to_Build_Bibliographies} [August 2, 2008].

Simboli, B., and Zhang, M. 2004. Clustering Concepts. Science 303 (5659): 768. [Online]. Available: http://www.sciencemag.org/cgi/content/summary/303/5659/768 [August 8, 2008].

Simboli, B., and Zhang, M. 2002. Citation Managers and Citing-Cited Data. Issues in Science and Technology Librarianship 35 Summer 2002. [Online]. Available: http://www.istl.org/02-summer/article4.html [August 8, 2008].