College and Research Libraries, September 1964

Estimating Data Processing Costs in Libraries

BY HILLIS L. GRIFFIN

Mr. Griffin is Information Systems Librarian at the Argonne National Laboratories, Argonne, Illinois.

ELECTRONIC DATA PROCESSING (EDP) is assuming an important and effective role in the operation of many libraries. Librarians who wish to explore the possibilities of this technology in their own libraries are surprised to find that those who are using EDP systems are reluctant to disclose their operating costs. Yet those who have mechanized operations within their libraries know that their own costs are not a reliable indication of what similar operations would cost another library. Many factors influence data processing costs, and an operation which might seem costly to one library may be paying its way very nicely in another, simply because of the way it fits into the operational pattern.

In other words, costs for EDP operations cannot be extended without qualification from one library to another, because time and equipment charges vary so widely from one institution to another. If the university library tries to compare costs with a nonuniversity operation, it will find that its internal costs, as well as its equipment costs, may be considerably less than those encountered in the nonuniversity situation. If input is keypunched in the library, for example, it may cost much less than if it were sent out to a local data center for preparation. If a library adopts another library's system, it may find that the system does things that are important for the originating library but unnecessary for itself. If a library does not need certain features of a system, it is senseless to pay for them.

Regardless of location or environment, one factor of the EDP operation remains constant for two libraries using the same computer system and the same program. This is operating time.
Each library would take approximately the same time to punch the cards to order one hundred books and, given the same computer program, to process these through the same system. Costs would then be based upon the amount paid for labor, supplies, and the use of equipment at each location. One library, for example, might be billed for equipment rental, operator labor, and data center overhead. Another library, using a similar system, might pay nothing for the use of the equipment if the institution writes off the data center as an operating expense. The cost of supplies might differ between the institutions. It should be plain that the costs of doing the job in one library are not necessarily the costs of doing it in another. The only constant factor is the amount of time required to do the job on the computer. If the amount of time required to process a job is known, and the number of items processed is given, then it is easy to determine the cost of doing the job in any library from that library's own time charges.

TYPES OF LIBRARY APPLICATIONS

Library data processing applications can be grouped in three classes. The first is the housekeeping function, consisting of applications such as journal renewal, circulation control, book ordering, and other related tasks, generally in the technical processing area. The second class is that of information dissemination. Examples of this are library announcement bulletins of new acquisitions and printed catalogs of the library or branch library holdings. The third class is that of retrospective searching, in which the computer is used to select references or produce bibliographies of items in the collection that relate to a particular subject or subjects.

In the first two classes operations are essentially input/output. By this is meant that, in general, a card is read into the machine and a line is written out on the printer (as in writing a list of all materials in circulation).
As part of the operation, cards might also be punched with expanded, updated, or rearranged data, or such records might be transferred to magnetic tape or random-access files. These are generally one-to-one operations (i.e., we read, write, read, write . . .) in contrast to those encountered in retrospective searching. In retrospective searching, many records are read in and operated upon, but only a few lines are printed out at the end, such as a list of references (or a bibliography) which appears to satisfy the parameters of the search.

Because of the cost of organizing material for retrospective searching, and because of the cost of machine time for accomplishing such tasks, operations such as housekeeping and dissemination can, at the present time, make the greatest impact upon a library. Preparation of book orders on the computer may require little more effort than is presently being expended to do this manually. Yet the byproducts of a mechanized operation, obtained by using the same punched cards over again with another program, can be useful in other ways without necessitating further input preparation. Materials that have not been received may be expedited, want lists prepared, and fiscal records provided quickly and automatically as the result of such operations. Other files may also be obtained, such as a listing, by author or title or both, of all materials on order. Such a listing might also be arranged by cost code, requestor, vendor, etc. These listings would obviate the need to maintain certain files on a manual basis. All of these jobs may be done using the cards that were used initially to order the books, involving no further clerical effort.

The essential advantages of the computer-based system are: (1) the ability to use the same information in many ways for a variety of purposes, and (2) the ability to perform tasks quickly, easily, and accurately.
This flexibility is obtained from input (generally punched cards) which may cost no more to prepare than the documents now being produced in a manual system. Punched cards lend themselves to easy manipulation for the production of other reports containing the same information in a different format, or arranged in a different sequence. Typewritten documents cannot be handled in this way; furthermore, they cannot be revised automatically.

DESIGN OF THE SYSTEM

The initial step in the design of any system is to define its purpose. An important consideration in the design is the equipment available to do the job. EAM (electric accounting machine) or unit record equipment¹ imposes restrictions not met in a computer-based system. This is not to denigrate unit record equipment, but it is, nevertheless, an important consideration. Is the type face on the output printer adequate for the job, or does it lack parentheses or other special characters that may be essential? Note the word "essential"! Diacritical marks, semicolons, and question marks dress up the output, but they may not be vital to the job at hand. A decision to dispense with certain special characters or upper and lower case type can make it possible to mechanize the procedure under consideration. The output may not be aesthetically satisfying and it may not appear in exemplary bibliographic form, but it will work, and it may, in fact, work rather well.

¹ Noncomputer equipment. Examples of EAM printers are the IBM 407 and IBM 402 accounting machines.

Design of the format for the printed output is extremely important if the application is to be fully effective. If the application is a new one not previously used by the library, there may be considerable freedom in design of the output format. In this case there need be no rigidly preconceived idea of a "right" output format. A fresh, new format can be used, and mistakes of the past can be eliminated.
If the system is to produce an output product similar to one presently in use, it is important to give critical consideration to the content of the present form and to justify the retention of each item. Some information in the form may be obsolete or may be supplied by some other part of the system.

It is also important to consider the output format in relation to the input format (i.e., the format of the punched cards) for the system. The array of printed output items, from top to bottom, should generally bear a definite relationship to the sequence in which these items will enter the system. The printing operation cannot begin until the information which must appear on the first line of the output has entered the system. If, for example, the name of the author appears on the last punched card of each item in a book ordering system, preceded by a number of title cards and cost and account information, we should not design an output form which requires the author's name as the first item of output information. This would require that we read and store the contents of several cards in the computer memory without doing any printing, and that we then print several lines without reading any cards. In doing this, we take no advantage of the ability of the computer to do more than one thing at a time, and output speed (i.e., over-all speed of the printing operation) may be slower than necessary.

It is important to "think big" in order to take maximum advantage of the flexibility of modern data processing systems. It is important to consider all operations which may be served by the data entering the system, and to tailor the input data to serve these needs. Equipment limitations will also have an important part in these considerations. Unit record equipment may require special forms and be less flexible than a modern stored-program computer which may be able, in effect, to emit certain constant information (e.g., page headings, date, etc.)
that has been stored by the program, and to print this information as part of the output program at the appropriate time. In this way blank paper may take the place of specially preprinted forms in some operations. EAM equipment must generally print information sequentially as the cards are read by the machine and does not have the extended emitting capability of the computer.

COST ESTIMATING

With the design of the operating input and output in mind, and the available equipment known, the librarian is at last in a position to secure some idea of operating costs. Housekeeping and dissemination systems are essentially output operations. This means that we read a card and print a line (or several lines) in response to the input and the program. The limiting condition for a system is generally the speed of the output devices. By knowing the number of lines of output that we expect to generate and the speed at which our equipment will generate this output, it is possible to obtain a good estimate of what the costs of the system will be. It is only necessary to: (1) determine the number of printed lines (or punched cards) required as output for a given number of items; (2) determine the time required for the printer and/or card punch to produce this number of items; and (3) relate this to time charges for the computer or EAM equipment used.

A simple example might be the preparation on the IBM 1401 of a list of the items in a punched card circulation file. We will write one line for each item in the file, i.e., one line per card. Although the rated writing speed of the 1403 printer (which is used by the IBM 1401) is six hundred lines per minute, a more realistic figure for estimating might be an output rate of five hundred lines per minute. A file of ten thousand items would take approximately twenty minutes to process. If computer time costs $60 per hour, the cost of the job is 20/60 of $60, or $20.00.
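The arithmetic of this simple example can be sketched in a few lines of Python. This is only an illustration of the three steps just listed, not part of the original article; the function name `print_job_cost` and its parameter names are assumptions introduced here, and the default figures are the ones used in the text (five hundred lines per minute, $60 per hour).

```python
def print_job_cost(lines, lines_per_minute=500, rate_per_hour=60.0):
    """Estimate (minutes, dollars) for a job limited by printer speed.

    lines            -- number of printed lines the job will produce
    lines_per_minute -- realistic printer output rate (assumed default: 500)
    rate_per_hour    -- hourly charge for machine time (assumed default: $60)
    """
    minutes = lines / lines_per_minute      # step 2: time to produce the output
    cost = minutes * rate_per_hour / 60     # step 3: apply the hourly time charge
    return minutes, cost


# Step 1: one line per card for a circulation file of ten thousand items.
minutes, cost = print_job_cost(10_000)
print(f"{minutes:.0f} minutes, ${cost:.2f}")  # 20 minutes, $20.00
```

A library with a different hourly charge would change only `rate_per_hour`; the time figure stays the same, which is the point of estimating from operating time.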
Consider next a job in which we wish to produce punched cards and write simultaneously. Let us assume that there are one hundred overdue items in the circulation file, and that while we are writing the complete circulation list we wish to punch a card for each of these overdue items for later processing. The 1401 will write a line and punch a card simultaneously, but it will do this only at the speed of the punch, which is 250 cards per minute. We will thus write nine thousand nine hundred lines at five hundred lines per minute (19.8 minutes) and one hundred lines will be written at the rate of two hundred fifty lines per minute while cards are punched (0.4 minutes) for a total job time of 20.2 minutes. If one thousand cards were punched the job time would increase to twenty-two minutes (eighteen minutes for writing without punching, and four minutes of simultaneous writing and punching).

The cost per item processed is the total cost of the operation divided by the number of items processed. The additional expense of preparation of the input must be added to this figure. Depending upon the format of the punched card, a keypunch operator might prepare eight hundred to one thousand cards per day. The cost of preparing each input card is the sum of operator salary, overhead, and machine rental for a given period, divided by the number of cards produced in that period.

Another type of operation which can be used to advantage is one in which the file is searched for certain items, and action is taken upon these items as they are found. A file of journal subscriptions is an example. The file might be passed through the computer at monthly intervals, with renewal orders being written automatically only for those items expiring during a particular month. It is easy to see that if the term of any subscription is one year, and if the various subscriptions in the file expire at various times throughout the year, each will receive renewal action once during the year.
The renewal action will require that several lines are written and that an updated subscription record card is punched. The subscription cards will pass through the system with a minimum of action (probably a comparing step to test the expiration date) for the eleven monthly passes through the computer on which no action is taken. The cost of the operation is the sum of the eleven monthly passes at a nominal reading speed of six hundred cards per minute, and the twelfth pass during which the item is renewed, several lines are written, and a card is punched. It will take about two seconds over the course of a year to renew each subscription and to produce the updated subscription record.

Valuable byproducts can be obtained from this file, such as a list of the current status of all library subscriptions, bid lists, lists by branch library receiving the subscriptions, etc. These supplemental records, obtained from the subscription file, may have value equal to or greater than the value of the original operation.

Because of the wide variation in charges for computer time, it is easy to see that EDP costs in one library may be meaningless to another. This may be due to the difference in time and equipment charges between libraries, or differences in the equipment used. Another factor may well be that the library that developed the system designed a deluxe system, while the investigating library requires only parts of the system. Internal requirements of one library may require more input cards per item, or the input format may not be optimal for the job to be done. Many factors may influence costs, but the major factor, given good system design, is that of time charges for equipment usage.

Cost estimation for systems involving magnetic tape and random access files is also possible but will not be discussed here. The output speed (which is usually the speed at which lines are printed) is the limiting factor for most housekeeping operations. No matter how fast the tapes, discs, or card readers, most housekeeping systems will run no faster than the printer can handle the output.
It is for this reason that cost estimating is rather easy, whether it be to relate someone else's job to your operation, or to find out what a proposed application will cost in your library. What is required is a logical analysis of the input and output of the proposed system. Then, given the output speeds of the equipment used, the volume of output, and the time charges for equipment usage, the costs follow logically and easily.
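The procedure summarized above can be sketched in Python for a job that both prints and punches, as in the overdue-items example. This is a minimal sketch, not part of the original article: the function names (`job_minutes`, `machine_cost`, `keypunch_cost_per_card`) and their parameters are hypothetical, and the default speeds are the estimating figures used in the text (five hundred lines per minute for printing, 250 cards per minute when a card is punched with the line).

```python
PRINT_LPM = 500  # realistic printing rate used in the text (lines per minute)
PUNCH_CPM = 250  # punch rate used in the text (cards per minute)


def job_minutes(total_lines, punched_cards=0,
                print_lpm=PRINT_LPM, punch_cpm=PUNCH_CPM):
    """Minutes for a job that writes total_lines, punching a card with some of them.

    Lines written without punching run at printer speed; lines written
    while a card is punched run at the slower punch speed.
    """
    print_only_lines = total_lines - punched_cards
    return print_only_lines / print_lpm + punched_cards / punch_cpm


def machine_cost(minutes, rate_per_hour):
    """Machine charge for a job, given the local hourly rate."""
    return minutes * rate_per_hour / 60


def keypunch_cost_per_card(salary_and_overhead, machine_rental, cards_produced):
    """Input preparation cost per card over one period (for example, one day)."""
    return (salary_and_overhead + machine_rental) / cards_produced


# The two circulation-list jobs described in the text (ten thousand lines each).
print(f"{job_minutes(10_000, 100):.1f} minutes")    # 20.2 minutes
print(f"{job_minutes(10_000, 1_000):.1f} minutes")  # 22.0 minutes
```

Cost per item is then `machine_cost(...)` divided by the number of items, plus the per-card input cost for each card the item requires. Only the rates passed to `machine_cost` and `keypunch_cost_per_card` would differ from one library to another; the times do not.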