Issues in Science and Technology Librarianship
Winter 1999
DOI:10.5062/F4D50JZQ

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

GIS in Libraries: An Overview of Concepts and Concerns

David Deckelbaum
Cartographic Information Librarian
Henry J. Bruman Library
Maps and Government Information
University of California, Los Angeles
ddeckelb@library.ucla.edu

Abstract: This article describes what a GIS is and what incorporating one into a library environment implies. It raises questions about staffing needs, the choice of a GIS product or array of products, and the level of service to be provided, all of which need to be answered before deciding to make GIS available in your library.

In the early 1990s, geographic information systems (GIS) moved from a mainframe computer environment to a PC desktop setting. As a result of this technological change, libraries were given an opportunity to begin using a powerful new research tool. GIS offers the promise of a potent technology that can provide solutions to a variety of research problems across a broad array of academic disciplines. The same datasets can be used by researchers and scholars from a variety of disciplines, with the hope of fostering a cross-fertilization of ideas among a diverse user group. Whereas traditional paper maps and atlases are created to answer specific, predetermined questions, a GIS can be employed to find answers to questions defined by users working with datasets of their own choosing. As you will come to understand, this cutting-edge tool, while promising to enhance the ability of researchers to accomplish their tasks, also presents a whole series of challenges to libraries that are interested in becoming involved with GIS.
These challenges include funding for computer hardware and software; personnel costs, including the hiring of new staff or the training of existing staff; and the acquisition of appropriate datasets to be used in a GIS. Decisions will also need to be made about what kind of GIS should be offered and what level of service will be provided to the public.

So what is a GIS, or geographic information system? A straightforward definition would include the idea that it is a combination of computer hardware, GIS applications software, geospatial data, attribute data to be associated with the geospatial data, and finally, but perhaps most important of all, a thinking human operator. This technology permits the use of a computer to input, store, manipulate, display, analyze, and output data that has a geographic orientation. The statement has been made numerous times that approximately 80% of all data created and distributed by the United States government has a geographic component to it.

If you asked most people what a GIS is, they would respond, if they knew anything at all about the subject, that it is a method for making maps. While this is a true statement, one needs to be quickly disabused of the notion that mapmaking is the heart and soul of a GIS. Yes, you can, and frequently will, make maps with a GIS, but if maps are all you intend to produce, there are automatic mapping programs as well as a variety of CD-ROM titles that allow you to display a variety of ready-made maps. There are WWW sites that will create maps of various kinds. These avenues for mapping, while useful, are not what a true GIS enterprise is all about.

The true strength of any GIS is its ability to perform two functions: spatial query and spatial analysis. Any relational database allows for queries to determine which records in a dataset meet criteria decided upon by a user.
That is, an individual can run a query whose parameters must be met for a record to be included in a subset of the original dataset. Spatial queries allow an operator to select one or more features (individual records in a dataset) in a given theme or layer of information by using the features of another theme. This process resolves questions of proximity, containment, and adjacency. Questions like the following might be answered by a spatial query: What communities does a particular freeway pass through? How many nesting sites are within a breeding area? Which privately owned parcels are adjacent to county parklands? Where are the instances of the outbreak of a particular disease caused by airborne agents that are within a mile of manufacturing plants storing toxic materials?

Spatial analysis is a process that builds upon information gathered from spatial queries to gain additional knowledge about feature relationships in order to support decision making or problem resolution. Some examples of spatial analysis might include the following: Where should a new library be placed in a community? What trees should be harvested in order to do minimum damage to the landscape and be least visible from the highway? Why is a particular business successful at a particular location?

The computer hardware for running a GIS needs to be as high-end as a library can afford because geospatial datasets tend to be large, and the faster the computer, the less time it takes to do the computing and display the results. A computer operating at 450 megahertz with 128 megabytes of memory will be suitable. Although one can get by with a smaller monitor, a monitor that is twenty-one inches or larger will more readily facilitate working with a GIS. The purchase of printers or plotters must also be placed in the mix when deciding what equipment should be acquired.
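To make the spatial queries described above more concrete, here is a minimal sketch in Python of two of the relationships mentioned: containment (which nesting sites fall within a breeding area?) and proximity (which disease cases lie within a mile of a plant?). It is an illustration only, not how any particular GIS product works. The function names, the sample coordinates, and the assumption of flat planar coordinates measured in miles are all hypothetical; a real GIS would work with themes or layers, map projections, and far larger datasets.

```python
from math import hypot


def point_in_polygon(point, polygon):
    """Containment test (ray casting): is the point inside the polygon?

    `polygon` is a list of (x, y) vertices in order; planar coordinates assumed.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within_distance(point, others, limit):
    """Proximity test: is the point within `limit` of any point in `others`?"""
    px, py = point
    return any(hypot(px - ox, py - oy) <= limit for ox, oy in others)


# Hypothetical sample data (planar coordinates in miles).
breeding_area = [(0, 0), (4, 0), (4, 3), (0, 3)]
nesting_sites = {"A": (1, 1), "B": (5, 1), "C": (3.5, 2.5)}

# Containment query: select features of one theme using a feature of another.
sites_inside = sorted(
    name for name, pt in nesting_sites.items()
    if point_in_polygon(pt, breeding_area)
)

plants = [(10, 10)]
cases = {"case1": (10.5, 10.5), "case2": (12, 12)}

# Proximity query: disease cases within one mile of any plant.
cases_near_plant = sorted(
    name for name, pt in cases.items()
    if within_distance(pt, plants, 1.0)
)

print(sites_inside)
print(cases_near_plant)
```

In both queries, the features of one layer (sites, cases) are selected by their spatial relationship to the features of another (the breeding area, the plants), which is what distinguishes a spatial query from an ordinary attribute query in a relational database.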
Thoughtful consideration needs to be given to whether you want to provide the ability for users to enter data into a GIS through a scanner or a digitizing device. Each library will need to decide what options are appropriate for its own operation. If GIS is to be provided on more than one computer, there will probably be a need to create a networked environment so that users will have access to shared data directories. If a network is called for, you will need either in-house staff or assistance from your library or campus computing facility to support and maintain it. The presence of a network and multiple users will require loading the GIS applications software on more than one computer to ensure no degradation in performance. This will increase the licensing costs. Purchasing a CD-ROM tower or server should be considered to minimize the amount of CD-ROM swapping that will inevitably occur. The ability to store large quantities of data will become increasingly important, so buy computers with a large storage capacity. At some point additional storage may need to be purchased.

When making decisions about hardware, be sure to include provisions for new computers every few years, or you will soon find that your existing computers are not up to the task of providing adequate performance as new software upgrades inevitably appear. If possible, include these computer upgrades as a line item in the budget. If this is not possible, hope and pray there is money available when new computers are warranted to maintain a satisfactory level of service.

GIS applications software runs the gamut from full-scale systems like Arc/Info, to desktop systems like ArcView or MapInfo, to a variety of CD-ROM products that, while not truly dynamic, do allow users to work with geospatial data. The full-scale systems like Arc/Info are command driven and can be mounted on PCs or UNIX boxes. They are difficult to learn and are not for the faint of heart.
The desktop systems have a graphical user interface that makes them more "user friendly", but they are not as powerful as the full-scale systems. There are many packaged CD-ROMs that allow users, without much assistance from librarians, to create a limited number of geopolitical maps, road maps, and some thematic maps. A few examples would include Map Expert and Global Explorer.

The WWW has many sites that will create maps on the fly. Several of these sites will allow users to do some or all of the following: find a specific street address and map it; display various attractions, restaurants, and banking institutions near these addresses; or create route maps between two points. A few of these web sites are MapQuest (http://www.mapquest.com), Microsoft Expedia Maps ({http://maps.expedia.com/pub/agent.dll?qscr=mmfn}), Maps On Us ({http://www.mapsonus.com}), and Yahoo Maps ({http://maps.yahoo.com/}). Some web sites will allow users to create thematic maps displaying variables taken from the 1990 US Census of Population and Housing. These sites include TIGER Map Server ({http://www.census.gov/geo/www/tiger/}) and DDViewer ({http://plue.sedac.ciesin.org/plue/ddviewer/}).

It is very likely that true GIS functionality will come to the web in the future, but at present the true strength of GIS on the web is the web's ability to act as a distribution network for data suitable for incorporation into a GIS. Data is being made available by commercial sites, organizational sites dealing with research, governmental sites, library sites, and academic departments.
A sampling of these sites would include ArcData Online (www.esri.com/data/online/index.html), DDCarto ({http://plue.sedac.ciesin.org/plue/ddcarto/}), USGS: Geo Data ({http://cida.usgs.gov/gdp/}), The Idaho Geospatial Data Center ({http://geolibrary.uidaho.edu}), MAGIC, the Map and Geographic Information Center at the University of Connecticut (http://magic.lib.uconn.edu), Digital Chart of the World Server at Penn State University Libraries (www.maproom.psu.edu/dcw), and the California Geographical Survey (http://geogdata.csun.edu).

The kind of geographic information system a library decides upon will be determined by the expertise of its staff and the level of service it wants to provide. Any GIS other than a product that simply packages a set of predetermined maps and data will require the assistance of staff trained in using a GIS. Unless a library is prepared to hire a librarian or technician who is highly qualified to work with GIS, I would advise against placing a full-scale GIS, like Arc/Info, into a library setting. We made the decision at UCLA not to offer a full-scale system in the library. It was thought that this level of service was more appropriate to campus computing facilities or computer labs within academic departments. Since UCLA has a campus-wide site license for an array of GIS products produced by Environmental Systems Research Institute Inc., students, faculty, and staff are not prevented from having access to Arc/Info if they find it is necessary to complete their work.

The learning curve for mastering a desktop GIS is steep, but anyone who is highly motivated and willing to put in the time can be successful. This includes both staff and patrons. Successfully employing GIS technology requires that an individual possess a grab bag of abilities that go beyond just learning a particular software program.
A user must have a facility with principles of file management and some proficiency in manipulating and massaging data in order to get it to work in a given GIS. GIS packages use proprietary formats for their geographic data, and these often require the use of translation programs for the data to be imported successfully from one system to another. This frequently ends up being a multistep process. A basic knowledge of statistics is highly desirable so that one can have some confidence that appropriate statistical measures are being used to calculate and display information. Since most individuals will be using GIS technology to output maps, there is a need for an intelligent use of symbols and color, and an aesthetic appreciation for the placement of cartographic elements like legends, scale bars, and north arrows in final presentations. When working with patrons wanting to use GIS, it quickly becomes apparent that few individuals actually possess most or all of these abilities.

A crucial question for any GIS service is what kinds of assistance will be given to the public. It is ill advised to simply provide access to a GIS and place potential users in a state of free fall with no parachute in sight (i.e., offer no assistance whatsoever). When UCLA's Henry J. Bruman Library opened its GIS Resource Center in April of 1996, we offered various kinds of assistance to the public.
In order to obtain a login ID and password, users were required to take a two-hour instructional session in which they were given an orientation to our facility that included the following elements: a description of the equipment in the facility; some principles of drive and file management (crucial when saving your work in a GIS); instructions on how to access our twenty-eight-drive CD-ROM server; an introduction to our shared data directory; instruction on how to use two value-added products based on the 1990 US Census, Pro/Filer and TIGER92 US Boundaries and Streets, produced by Wessex Inc.; and instructions on how to begin to use ArcView. We also offered initial consultations to help individuals determine whether a GIS was an appropriate technology for achieving their anticipated goals. These consultations frequently addressed whether we had, or knew where to obtain, the data necessary to begin a proposed project. We also found we were frequently being asked to provide a good deal of point-of-use assistance that, over time, prevented members of the GIS team from tending to some of their other non-GIS-related responsibilities.

Beginning in 1999, because of increasingly unrealistic demands on our time, we modified the kinds of assistance we are prepared to offer the public. We no longer require users to take a two-hour instructional session before being eligible for an account. Most of the information that was conveyed in our introductory session is now provided in a series of handouts. We no longer instruct patrons in how to use ArcView. Instead, we direct them to an online tutorial, Getting to Know ArcView GIS: The Geographic Information System (GIS) for Everyone. We inform them that it will probably take at least twenty hours to complete the tutorial, but at the end of this period they will have the requisite familiarity with ArcView to begin working on complicated research problems.
If patrons are having difficulties, but have made a good faith effort to solve their problems by first consulting the tutorial and other guides we make available, we will then step in and try to assist them. If users are utilizing our facility in conjunction with a class being offered in the current term, we reserve the right to refer them back to their TA or instructor.

I suppose I have always believed that a library's primary responsibility with regard to GIS is, and must be, to acquire the data necessary to allow its clientele to use GIS technology. Any other GIS-related capabilities offered to the public must be considered added services, to be provided in accordance with the needs and demands of your constituents but governed by whatever financial or policy constraints may be in place at a particular institution.

As a member of the United States Federal Depository Library Program, UCLA receives a great quantity of data from the United States government that are candidates for importing into a GIS operation. Unfortunately, most of this data, like the TIGER files, cannot be put to use directly in desktop systems. Large quantities of data suitable for use in a GIS are produced by virtually all levels of government. Getting data from these various governmental agencies can be quite an adventure and can often prove extremely frustrating. Some agencies have an interest in their data being widely distributed, either for free or for a nominal processing fee. Others will want you to pay dearly for data that has been generated with tax revenues, citing cost recovery or other reasons for charging outlandish prices. Still others will not give or sell their data at all.

There are many commercial firms that sell data of all kinds. Some of this data may be repackaged government data that is made more useful or accessible by adding software that is not part of the government's original publication.
The previously mentioned TIGER92 US Boundaries and Streets, produced by Wessex Inc., is one such example. These kinds of products that enhance government-issued data are well worth the cost. There are many private firms that, for a handsome fee, will assemble datasets for you.

It is important for a library that collects any kind of data to be used in a GIS to be invested in providing the metadata for the datasets that it owns. Providing metadata, often described as data about the data, is crucial to maximizing the use of your collection. One form of metadata is the catalog record created for your online catalog. Another form of metadata is a readme.txt file or other written documentation that gives the following kinds of information about the data: when and by whom it was created, comments on its quality and accuracy, details on its file structure, and a data dictionary if one is required to make sense of the data. There is nothing more frustrating than having a dataset with no clear idea of what data it contains or exactly what the data represents. This makes it next to impossible to do any spatial query and analysis or to meaningfully label features on a map.

The Office of Management and Budget charged the Federal Geographic Data Committee with developing content standards for geospatial metadata. In 1994 President Clinton signed an executive order mandating that Federal agencies use the standards formulated by the committee by 1995. A revised 1998 edition of the content standard is available from the Federal Geographic Data Committee's web site at www.fgdc.gov/metadata/contstan.html.

If a library chooses to provide a GIS service beyond collecting GIS data, the library staff must acknowledge, and their library administration must also acknowledge, that a new kind of high-end service is being offered.
Being able to assist users with the functionality of a GIS and with digital datasets is very different from providing bibliographic instruction on how to use a bibliographic database or providing reference assistance to determine whether the library owns a particular title. The rewards can be great, but every step along the way can result in skinned elbows and knees. GIS is an acquired taste, so to speak, and is constantly changing. Geographic information system technology allows libraries to utilize data that many already collect in new and dramatic ways.

References

Franklin, Carl. 1992. An introduction to geographic information systems: linking maps to databases. Database (April):12-15, 17-21.

Journal of Academic Librarianship. 1995. 21 (4). (10 articles devoted to GIS)

McGlamery, P. & Lamont, M. 1994. Geographic information systems in libraries. Database (December):35-44.

SPECIAL ISSUE: Making GIS a Part of Library Service. 1995. Information Technology and Libraries. 14 (2). (8 articles devoted to GIS)