Evidence Summary
Web-Scale Discovery Services Retrieve Relevant Results in Health
Sciences Topics Including MEDLINE Content
A Review of:
Hanneke, R., & O’Brien, K. K. (2016). Comparison of three web-scale
discovery services for health sciences research. Journal of the Medical Library Association, 104(2), 109-117. http://dx.doi.org/10.3163/1536-5050.104.2.004
Reviewed by:
Elizabeth Stovold
Information Specialist, Cochrane Airways Group
St George’s, University of London
Tooting, London, United Kingdom
Email: estovold@sgul.ac.uk
Received: 3 Mar. 2017  Accepted: 21 Apr. 2017
© 2017 Stovold.
This is an Open Access article distributed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike License 4.0 International (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly attributed, not used for commercial purposes, and, if transformed, the resulting work is redistributed under the same or similar license to this one.
Abstract
Objective – To
compare the results of health sciences search queries in three web-scale
discovery (WSD) services for relevance, duplicate detection, and retrieval of
MEDLINE content.
Design – Comparative evaluation and
bibliometric study.
Setting – Six university libraries
in the United States of America.
Subjects – Three commercial WSD
services: Primo, Summon, and EBSCO Discovery Service (EDS).
Methods – The authors collected data
at six universities, including their own. They tested each of the three WSDs at
two data collection sites. However, since one of the sites was using a legacy
version of Summon that was due to be upgraded, data collected for Summon at
this site were considered obsolete and excluded from the analysis.
The authors generated three questions for each of six major health disciplines, then designed simple keyword searches to mimic typical student search behaviours. They captured the first 20 results from each query run at each test site, to represent the first “page” of results, giving a total of 2,086 search results. These were independently assessed for relevance to the topic. The authors resolved disagreements by discussion, and calculated a kappa score for inter-observer agreement. They retained duplicate records within the results so that duplicate detection by the WSDs could be compared.
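The agreement between raters is reported as a kappa statistic. As a minimal illustrative sketch (not the authors' code, and using hypothetical judgements rather than the study data), Cohen's kappa for two raters making binary relevance judgements can be computed as follows:

# Illustrative only: Cohen's kappa for two raters judging each result as
# relevant (1) or not relevant (0). The judgement lists below are hypothetical.
def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    # Observed agreement: proportion of results both raters judged the same way.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal proportions.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    p_e = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (p_o - p_e) / (1 - p_e)

print(cohens_kappa([1, 1, 0, 1, 0, 1, 1, 0, 1, 0],
                   [1, 1, 0, 1, 1, 1, 1, 0, 1, 0]))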
They assessed MEDLINE
coverage by the WSDs in several ways. Using precise strategies to generate a
relevant set of articles, they conducted one search from each of the six
disciplines in PubMed so that they could compare retrieval of MEDLINE content.
These results were cross-checked against the first 20 results from the
corresponding query in the WSDs. To aid investigation of overall MEDLINE coverage, they recorded the first 50 results from each of the six PubMed searches in a spreadsheet. During data collection at the WSD sites, they searched for these references to determine whether the WSD tool at each site indexed these known items.
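A rough sketch of this known-item check (an assumed workflow, not the authors' actual procedure or tooling) might loop over each recorded PubMed reference, query the discovery tool, and record whether the item is indexed:

# Hypothetical known-item check: for each PubMed reference recorded in the
# spreadsheet, query the discovery tool and note whether it is indexed.
# `search_wsd` stands in for whatever search interface is available.
def known_item_coverage(references, search_wsd):
    found = 0
    for ref in references:
        results = search_wsd(ref["title"])
        # Count the item as indexed if any result matches its PubMed ID.
        if any(r.get("pmid") == ref["pmid"] for r in results):
            found += 1
    return found, len(references)

# Example with dummy data and a toy search function.
refs = [{"title": "Example article on asthma care", "pmid": "12345678"}]
hits, total = known_item_coverage(refs, lambda q: [{"pmid": "12345678"}])
print(f"{hits}/{total} known items retrieved")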
The authors adopted measures to
control for any customisation of the product setup at each data collection
site. In particular, they excluded local holdings from the results by limiting
the searches to scholarly, peer-reviewed articles.
Main results – The authors reported results for five of the six sites. All of the WSD tools retrieved between 50% and 60% relevant results. EDS retrieved the highest number of relevant records (195/360 and 216/360), while Primo retrieved the lowest (167/328 and 169/325). There was good observer agreement (κ = 0.725) for the relevance assessment. The duplicate detection rate was similar in EDS and Summon (96% to 97% unique articles), while the Primo searches returned 82.9% to 84.9% unique articles.
All three tools retrieved
relevant results that were not indexed in MEDLINE, and retrieved relevant
material indexed in MEDLINE that was not retrieved in the PubMed searches. EDS
and Summon retrieved more non-MEDLINE material than Primo. EDS performed best
in the known-item searches, with 300/300 and 299/300 items retrieved, while
Primo performed worst with 230/300 and 267/300 items retrieved.
The Summon platform
features an “automated query expansion” search function, where user-entered
keywords are matched to related search terms and these are automatically
searched along with the original keyword. The authors observed that this
function resulted in a wholly relevant first page of results for one of the
search questions tested in Summon.
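The general idea can be illustrated with a small sketch; the synonym table and query builder below are hypothetical and do not represent Summon's actual algorithm or vocabulary:

# Hypothetical illustration of automated query expansion: a user-entered
# keyword is matched to related terms, and both are searched together.
RELATED_TERMS = {
    "heart attack": ["myocardial infarction"],
    "high blood pressure": ["hypertension"],
}

def expand_query(user_query):
    terms = [user_query] + RELATED_TERMS.get(user_query.lower(), [])
    # Combine the original keyword and its related terms with OR.
    return " OR ".join(f'"{t}"' for t in terms)

print(expand_query("heart attack"))  # "heart attack" OR "myocardial infarction"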
Conclusion – While EDS performed
slightly better overall, the difference was not great enough in this small
sample of test sites to recommend EDS over the other tools being tested. The
automated query expansion found in Summon is a useful function that is worthy of
further investigation by the WSD vendors. The ability of the WSDs to retrieve
MEDLINE content through simple keyword searches demonstrates the potential
value of using a WSD tool in health sciences research, particularly for
inexpert searchers.
Commentary
Previous studies such
as Ketterman and Inman (2014) have sought to compare WSDs directly with
traditional bibliographic databases. However, the authors of this study
highlight research into typical library user behaviour that shows a preference
for Google-style searching over traditional methods due to ease, efficiency,
and relevance ranking. An assessment of WSD system performance using relevance
of the results as an indicator is therefore warranted.
This study was
evaluated using Perryman’s (2009) critical appraisal tool for bibliometric
studies. The objectives are clearly stated and the methodology is described in
detail for each aspect of the study. The chosen search questions are based on real-life examples, and the retrieval methods are designed to reflect common user behaviours; both are therefore appropriate for the stated aims of the study. All of the search strategies are included in the online appendices, and
the processes for data collection and handling are well documented. Overall, the
methods section of this paper is strong and the authors provide an equally
robust discussion of the limitations of their study, together with the controls
they put in place to help mitigate these, such as duplicate screening of the
results when assessing for relevance.
Results from each strand of the study are clearly presented; however, it would be helpful to see the tabulated results in percentages as well as absolute numbers so that the reader is able to compare the performance of each WSD more easily. The authors
collected a large amount of data and it would be interesting to see more
reporting of this information, particularly the relevance assessments per
search query, as the authors noted in their discussion section that relevance
was often a function of the topic.
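For instance, the relevance counts reported in the abstract above convert straightforwardly to percentages (a small sketch using only the figures given in the paper; the site labels are illustrative):

# Relevance counts reported for the two test sites per tool, as percentages.
relevance_counts = {
    "EDS (site A)": (195, 360),
    "EDS (site B)": (216, 360),
    "Primo (site A)": (167, 328),
    "Primo (site B)": (169, 325),
}
for tool, (relevant, total) in relevance_counts.items():
    print(f"{tool}: {relevant}/{total} = {100 * relevant / total:.1f}% relevant")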
Although the authors
were not able to recommend one WSD tool over the others, this study is a good
starting point for library professionals considering promoting one of these
tools to their library users or implementing one of these products in their
library. There are many other issues to consider when evaluating a WSD, such as
usability and compatibility with other library tools, and these are recognised
by the authors. Deodato’s (2015) comprehensive guide to conducting a full
evaluation of WSDs is a useful resource.
The key finding of
this study is the ability of WSD products to retrieve MEDLINE content with
simple searches representative of typical student search behaviours. This has
implications for health sciences librarians who are involved in the training
and education of library users and the selection of library resources. There
are opportunities for further research to see if the findings of this study are
consistent across other test sites and in different health science disciplines,
and more studies designed to directly compare the performance of WSDs with
MEDLINE are needed.
References
Deodato, J. (2015). Evaluating web-scale discovery services: A step-by-step guide. Information Technology and Libraries, 34(2), 19-75. http://dx.doi.org/10.6017/ital.v34i2.5745

Ketterman, E., & Inman, M. E. (2014). Discovery tool vs. PubMed: A health sciences literature comparison analysis. Journal of Electronic Resources in Medical Libraries, 11(3), 115-123. http://dx.doi.org/10.1080/15424065.2014.938999

Perryman, C. (2009). Evaluation tool for bibliometric studies. Retrieved from Carol Perryman website: https://www.dropbox.com/l/scl/AAAL7LUZpLE90FxFnBv5HcnOZ0CtLh6RQrs