science.gov -- A Physical Sciences Information Infrastructure

Issues in Science and Technology Librarianship, Winter 2001
DOI:10.5062/F4833Q0F

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

Walter L. Warnick, Ph.D.
Director, Office of Scientific and Technical Information
Office of Science
U. S. Department of Energy
walter.warnick@science.doe.gov

Abstract

In today's environment of increased expectations, access to comprehensive information in the physical sciences has been a long-standing need of researchers and librarians which can now be met using information technology. The Department of Energy (DOE) envisions the creation of an integrated information infrastructure for the physical sciences where content, technology and service converge to make resources readily accessible, openly available, useful, and usable. Various collections and resources already form a virtual library, the foundation of this endeavor. The proposed information infrastructure is called science.gov and will be pursued as a collaborative effort of DOE and other information providers.

America Needs Basic Research

Some time ago, a friend, a young father of two, lay in a hospital bed seriously ill. The physician said there was no treatment. The pancreas was secreting substances that were digesting itself and destroying surrounding tissue. Some patients recover on their own; others do not. Natural laws may allow some remedy which will assist the body's own defenses and cause the pancreas to heal. If there was such a remedy, why did the physician not use it? The answer is the lack of knowledge -- not of just one physician, but of the medical profession as a whole.
Unless the physician is a researcher, he waits for others to discover the remedy. He waits because no predecessor mastered the natural laws which govern the pancreas.

The young father in my story is a real person. His name is Vince Dattoria. As it happened, fate was kind to him. He recovered and he is now back working with me at DOE. Vince almost died because of ignorance, or shall we call it a lack of knowledge of natural laws?

Knowledge of natural laws requires research. Daily, we see the results of research that have so improved people's lives. Almost every advance in the medical sciences has been made possible by a previous advance in the physical sciences. The human body is a chemical and physical problem, and these sciences must advance before we can conquer disease.

Exploratory surgery, done when the physician lacks any other way to learn the nature of a patient's problem, is less common now than in years past. This decline is attributable in large measure to imaging technology like computer-aided tomography (CT) and magnetic resonance imaging (MRI) scans. CT and MRI scans came from research in physics. For developing the mathematical algorithm needed to reconstruct CT images, DOE physicist Allan Cormack won the Nobel Prize in medicine, not in physics, in 1979. What separates ignorance from medicine is little more than physics and chemistry. Physics and chemistry are what we do in DOE.

The 20th century has been called the Century of Physics. Advances in the physical sciences produced nuclear power, space travel, computers, and numerous other advances. Now in the 21st century, life sciences are offering immense opportunities. Much of the progress in the life sciences is dependent upon prior advances in the physical sciences. Benefits of research in the physical sciences are evident in much of modern technology, from medical applications to cleaner power plants to the Internet.

Changing Expectations for Information

To fuel basic research, and the scientific progress that follows, access to scientific information must be easy and efficient. Today, information technology has raised the expectations of researchers for accessing the benefits of research in the physical sciences. Even the very concept of an information collection is being revised. No longer need information be in one physical location; rather, it can reside at multiple sites, making it a virtual collection. Similarly, the concept of a library is changing from one physical location to being accessible from almost any place. A virtual collection has all the advantages of the Internet to which we have now become accustomed -- almost instantaneous access, no cost to patrons, and full-text information. The new technology has revolutionized the role of libraries, which once guided patrons to resources confined to one building and now guide them to resources located throughout the world. At times, the librarian who often works with digital and networked information is called a cybrarian.

The DOE Office of Scientific and Technical Information (OSTI) has addressed the needs and expectations of researchers and librarians by providing a suite of innovative digital information resources by which scientists and researchers disseminate their findings and by which libraries can lead their patrons to information, regardless of where it resides. These digital resources in effect constitute a digital library: an organized and managed collection of materials in digital form, often geographically dispersed, designed for the benefit of a particular user population, and with provision made to facilitate access to its contents. An example is EnergyFiles, an umbrella under which access is provided to over 500 widely diverse collections of both DOE and worldwide energy-related scientific and technical information (STI).
At last, free and from the desktop, the user may have easier, faster, cheaper, more complete and more convenient means of accessing and using STI.

Need for Physical Sciences Information Resource

For decades, researchers have expressed a need for a national information resource in the physical sciences. In May 2000, the DOE Office of Science sponsored a workshop, chaired by Dr. Alvin Trivelpiece and held at the National Academy of Sciences, where experts in the physical sciences and in science communication shared their views in assessing the advantages of and need for a Future Information Infrastructure for the Physical Sciences. The Workshop Report, issued in July 2000, stated the need for a comprehensive collection of science information easily available to researchers and students. We are now referring to this as science.gov, which is envisioned as a convergence of content, technology, and service (tools, applications) to provide a vast resource for researchers, scientists, and engineers in government, academia, industry, and the interested public.

Information technology has raised the expectations of researchers for immediate, online access to information in the physical sciences. Such expectations in the past were only visions, but today, they are possible. Workshop participants agreed that development of a strategy to achieve the vision was the next step, and a major multi-organizational endeavor is now under way to accomplish that step. The need for an interagency strategy is clear, and each agency will have the opportunity to contribute content and other resources consistent with its strengths and mission.

The need for active support and participation by diverse stakeholders, each with an interest in discrete but complementary facets of research and development (R&D) in the physical sciences, was strongly voiced at the Workshop, which also called for early successes.
Within a few months, DOE and other agencies partnered to develop and launch two interagency products, Federal R&D Project Summaries and GrayLIT Network, which will be described later. And now the combination of conceptual agreement, early successes, and planning for implementation is emerging in science.gov.

The current science.gov activity is building upon a foundation that DOE has been laying for the past several years. This foundation resulted from DOE's long tradition of ensuring access to STI. OSTI, identified by the Workshop Report as an organization within DOE that could well serve as the needed point of convergence, has been leading a collaborative effort to encompass government, academia, professional organizations, and industry. Consider the resources or foundation which are building blocks for this future gateway to the physical sciences.

Foundation for a Future science.gov

It has long been recognized that progress cannot be made unless knowledge is first shared. Since 1947, the mission of OSTI has been to collect, preserve, disseminate, and manage STI resulting from the agency's enormous investments in R&D. Although paper and microfiche were the means of sharing R&D results in the past, digital technologies have enabled us to use the Internet to provide STI, free of charge and accessible to scientists, researchers, industry, academia, and the public.

Researchers communicate their findings in three main ways: technical reports or gray literature; journal literature; and preprints. OSTI has developed products in these areas in order to provide patrons with state-of-the-art access to R&D results in physical sciences, no matter where the information resides.

Report or Gray Literature

The DOE Information Bridge, made available in April 1998 in collaboration with the U. S. Government Printing Office (GPO), contains DOE report literature from 1995 forward. It incorporates over 60,000 full-text reports comprising almost five million pages.
It provides free, convenient, and quick access to full-text DOE R&D reports in physics, chemistry, materials, biology, environmental sciences, energy technologies, engineering, computer and information science, renewable energy, and other topics. Users remotely access and download the reports free of charge and in significant volume.

The DOE Information Bridge focuses on providing access to scientific and technical reports produced by DOE, DOE national laboratories, and DOE contractors. New reports received by OSTI are added routinely and legacy reports are added as resources permit. Through the use of unique identifiers known as Persistent URLs (PURLs), DOE Information Bridge makes it possible for educators, students, scientists, and engineers to directly access individual documents and to easily direct others to them. The DOE Information Bridge is made available to the public through a partnership between OSTI and GPO on {GPO Access}.

Published Literature

{PubSCIENCE}, made available in collaboration with GPO in October 1999, provides for quick, easy, and free searching of a compendium of peer-reviewed journal citations and abstracts about the physical sciences and other energy-related disciplines. Hyperlinks provide access to publisher servers to obtain full-text articles if the user or organization has a subscription to the journal. Without a subscription, access to the full text can be obtained by pay per view, by special arrangement with the publisher, by library access, or through commercial providers. More than 40 publisher agreements provide PubSCIENCE patrons the capability to search and access almost two million records in more than 1,300 journal titles of peer-reviewed scientific and technical information.
PubSCIENCE is an outstanding example of converging interests: the user's desire to access current scientific and technical literature, the Department's desire to facilitate the flow of peer-reviewed STI, and publishers' interest in obtaining the widest possible visibility for their published materials.

Preprint Literature

The {PrePRINT Network}, unveiled in January 2000, is a searchable gateway to preprint sites that contain information about scientific and technical disciplines of concern to DOE. Collections and resources included on the PrePRINT Network are provided by academic institutions, government research laboratories, scientific societies, private research organizations, and individual scientists and researchers. The PrePRINT Network expedites the dissemination of scientists' research results. It is web based and provides access to energy-related papers, draft journal articles, and other electronic materials produced by researchers. It provides links to more than 2,000 preprint sites housing over 340,000 documents, and more than twenty heterogeneous preprint databases are available for distributed cross-searching via a single query. In addition, it provides links to over 700 related scientific societies and associations. In most cases, access to the full-text information on the target sites is open, accessible, and free of charge.

OSTI recently launched PrePRINT Alerts, a component of the PrePRINT Network and the first alert service that harvests information from the Deep Web. This new capability allows users to register, create their personalized profiles, and automatically receive weekly notification via e-mail of new preprint information fitting the profile of interest.

Additional DOE and Interagency Digital Collections

{DOE R&D Project Summaries} was unveiled in June 1997 to provide the public with access to key corporate information on over 20,000 R&D projects performed since 1995 by the Department's laboratories and other research facilities.
It includes DOE research activities in a wide variety of energy-related scientific disciplines. R&D Project Summaries enables DOE to educate and inform the general public of its current research and development activities and provides a mechanism for public access to information about Departmental research capabilities and activities.

The DOE R&D Accomplishments web site showcases the proud heritage of the Department's R&D and highlights benefits that are being realized now. It was unveiled in March 1999 as a central forum for providing the public with information about outcomes of past DOE-sponsored or generated R&D. The outcomes featured have had significant economic impact, have improved people's lives, or have been widely recognized as a remarkable advance in science. The core of the web site is the DOE R&D Accomplishments Database, consisting of searchable full-text and bibliographic citations of documents reporting accomplishments from DOE and DOE contractor facilities. Complementing the Database is a page of "Snapshots" with links to items or articles that contain information about or identify at least one R&D accomplishment.

Subject Portals are a collection of bibliographic citations, broken out by subject area, from the Energy Science and Technology Database (EDB). For DOE reports, links are provided to full text. These long-standing paper publications were recently transitioned to a searchable web product and are now being re-directed into specific Subject Portals with additional features. This migration is scheduled for completion in Summer 2001. Additional products may also be developed using this technology, but presently, two electronic publications have been migrated -- {Photovoltaic Energy: Electricity from the Sunlight} (PHV) and {Superconductivity for Electric Energy Systems} (SUP).

{Federal R&D Project Summaries} was released in 2000 and provides a unique window to the Federal research community, allowing Agencies to better understand the R&D efforts of their counterparts in government. It was developed in partnership with other agencies and provides insight to the public into how its investment in R&D is being used. It supports full-text single-query searching across more than 240,000 research summaries and awards in databases residing at DOE, the National Institutes of Health (NIH), and the National Science Foundation (NSF). The Federal databases available via this tool are the DOE R&D Project Summaries; the NIH CRISP (Computer Retrieval of Information on Scientific Projects) Current Awards; and the NSF Award Data.

{GrayLIT Network} is an interagency tool that provides a portal for over 100,000 full-text technical reports located at DOE, the Department of Defense, Environmental Protection Agency (EPA), and National Aeronautics and Space Administration (NASA). Collections in the GrayLIT collaboration include the DOE Information Bridge; the Defense Technical Information Center (DTIC) Report Collection; the EPA National Environmental Publications Internet Site (NEPIS); the NASA Jet Propulsion Lab Reports; and the NASA Langley Technical Reports.

EnergyFiles -- DOE's Virtual Library

The umbrella for the suite of resources just described is {EnergyFiles}, the Virtual Library Collections for Energy Science and Technology, a web-based virtual library that provides easy access to over 500 widely diverse collections of both DOE and worldwide energy-related STI. EnergyFiles is a dynamic information system that offers users, participants and contributors the opportunity to leverage collections and capabilities and to maximize use of energy-related STI. The EnergyFiles search mechanism, EnergyPortal Search, provides for increased site efficiency and ease of knowledge discovery.
EnergyPortal has conquered a major obstacle confronting multi-source virtual libraries. In April 1999, OSTI, in partnership with Innovative Web Applications (IWA), released {EnergyPortal Search}, which addresses the search and retrieval of Deep Web content. The first product of its kind in government, EnergyPortal Search enables patrons to simultaneously search across distributed, Deep Web database content with a single search query. The user no longer needs to select individual links to sift through available information in pursuit of what is relevant.

IWA's novel Directed Query Engine, Distributed Explorer, has since served as the cornerstone for additional OSTI products and services requiring Deep Web searches. Each of these web products has applied Directed Query Engines in slightly different ways based on the unique attributes of site content, but in each case the Deep Web content is tapped and displayed. By nesting Directed Query Engines so that one query launches several other search engines at host sites in a cascading fashion, it becomes possible to assemble a comprehensive information infrastructure for the physical sciences. Simple algorithms plus enormous computing power can be a tremendous aid to human thought, thus benefitting the R&D process.

Conclusion

For librarians or cybrarians to better serve the research and educational communities in the physical sciences, there must be a comprehensive information infrastructure that provides information to serve all users, from students to scientists to concerned citizens. This proposed infrastructure truly has merit and great possibility. The wheels are already turning as agencies and others partner to give birth to information tools in the physical sciences which researchers only two decades ago would have considered as futuristic as "Star Wars."
Resource and policy issues are challenges that will continue to be addressed, but the Internet must provide comprehensive access to worldwide information in the physical sciences. The Department of Energy and its partners will be diligently pursuing the vision of a gateway to the physical sciences - a gateway that will enable new ways of doing science.