With Special Reference to the District Focus, District Information Documentation Centers (DIDC)

One is most grateful for being given the opportunity to address the distinguished representatives of the library, data processing, and educational communities of Kenya.  One is humbled, indeed, to attempt to present views and perhaps contribute to a shared interest in which professional goals are mutually held, but in which the library conditions and technological and other resources available vary so greatly from those of one's own experience.

Given that one's knowledge of local conditions is limited, but one's experience and work in the area of library automation is extensive, one trusts that this presentation will be of some value in two respects.

The first is a presentation of the kinds of technology alternatives that might serve as models for possible implementation in Kenya, with some discussion of each. The second is a consideration of the applicability of these alternatives to a regional document center program and regional information network.

As part of the technology discussion, this presentation will offer some brief comments on the possible role and function of automation in the district document centers about which one has been given some information.


The alternatives will be explored by first reviewing data processing and related technologies in their technological setting (at least in the United States), and then examining them in more detail in terms of library applications.  An effort will be made to consider the viability of these options for regional information networks in Kenya. Note well, one's knowledge of the DIDCs is limited to what I learned yesterday and to the DIDC document, so that one's thoughts probably will be more of a speculative and conjectural nature than of strong conviction.

1.1. Computers: Mainframe, Mini, and Micro computers are high speed data processing devices capable of doing an incredible variety of tasks at extraordinarily high speeds. Computers store, access, compute (i.e. process), and display information, and have been used with increasing success by libraries in the U.S. since the 1960's.

But it might be noted that the late John W. Cronin, as Chief of the Library of Congress Card Service, installed International Business Machines (IBM) tabulating equipment as long ago as the late 1930's for fund accounting purposes.

And it was also John W. Cronin, incidentally, who recognized the tremendous research value of the publications of Kenya by establishing the Library of Congress Regional Acquisitions Center in Nairobi, in 1967. As a further aside, and kindly forgive the personal reference, I was Mr. Cronin's Administrative Officer at the Library of Congress who assisted in the actual establishment of the Office. Never in my wildest dreams as a young librarian did I ever anticipate that I would be presenting a paper in Kenya.

Maintaining reasonable definitions and distinctions between these categories of computers is virtually impossible. The most powerful microcomputer of today dwarfs the minicomputer of the early 1970's and at least the smaller mainframes of the 1960’s. Mainframes, minis and microcomputers will be distinguished by use rather than by strict data processing definition.

1.1.1. Mainframe Computers are used primarily in large scale applications such as the major North American catalog networks, e.g. OCLC (Online Computer Library Center), UTLAS (formerly, University of Toronto Library Automation Systems), RLIN (Research Libraries Information Network) and others; and in some of the very largest libraries, such as The New York Public Library, University of Chicago, Library of Congress and others. They have been used primarily as cataloging resource utilities. OCLC is used by over 5000 libraries to search for cataloging copy, to enter individual library catalog holdings, to obtain catalog cards and other catalog products, for interlibrary loan and acquisitions, and to maintain union lists of serial holdings (including full serial cataloging). All of the OCLC functions are performed online through interactive telecommunication between the local library and the OCLC computer facility in Ohio. (It should be noted that OCLC actually employs a variety of computer devices. The database of over 20 million records is stored on the original Sigma computers upon which the network began, and the telecommunications traffic is handled by Tandem computers, which, depending on their size, can be viewed as mainframes or minis.)

Mainframe computers are used also for maintenance of, access to, and searching of such online bibliographic database services as DIALOG, ORBIT, and BRS. Such databases are searched by users from all over the world, typically gaining access to the desired database via some form of telecommunication device and local terminal or computer.

The OCLC network conceivably could serve as a model for Kenya at a time in the nation's development when a large mainframe computer had the resources to store all of the nation's holdings, and a telecommunications network with dedicated lines and terminals hooked up to the central computer from all over the country could be a reality.

In this way all of the libraries could share the cataloging done at a national center and not have to redundantly catalog the same books, as well as do all of the other things possible with OCLC. This seems like a most unrealistic option at this time for Kenya, based on one's brief visit.

1.1.2. Minicomputers are smaller, but still quite powerful computer devices that have been quite effectively used in local library or regional automation efforts. They have been especially effective in such applications as online circulation control systems for individual libraries (as with many college and university libraries, plus some public libraries), and libraries in combination (e.g. public libraries with many service outlets; regional public library systems consisting of many individual public libraries; and academic consortia, i.e. several college libraries in combination, sharing a single minicomputer). The minicomputer is successfully used as a circulation control device to maintain control of all of the books (and other library materials) held by all of the libraries participating in the system. It indicates where they are located (i.e. on the shelf, in circulation, or any other status) and who has them, and it reserves or holds them for use by library patrons when they are not immediately accessible; it maintains a complete record of all of the registered library users which includes their name, address and other appropriate information so that the user can be notified if he or she has overdue books, a reserved book is available, or for other information pertaining to library service; it also keeps track of the amount of money owed by individuals who returned their books late.
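The record keeping just described can be illustrated with a minimal sketch. It is not taken from any vendor's system; the class names, the status labels, and the fine of 10 cents per day are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

DAILY_FINE_CENTS = 10  # hypothetical overdue charge per day


@dataclass
class Patron:
    patron_id: str
    name: str
    address: str
    fines_owed_cents: int = 0


@dataclass
class Item:
    barcode: str
    title: str
    branch: str
    status: str = "on shelf"  # "on shelf", "in circulation" or "on hold"
    borrower_id: Optional[str] = None
    due: Optional[date] = None
    holds: List[str] = field(default_factory=list)  # waiting patron ids, in order


class CirculationSystem:
    def __init__(self) -> None:
        self.patrons: Dict[str, Patron] = {}
        self.items: Dict[str, Item] = {}

    def register(self, patron: Patron) -> None:
        self.patrons[patron.patron_id] = patron

    def add_item(self, item: Item) -> None:
        self.items[item.barcode] = item

    def check_out(self, barcode: str, patron_id: str, due: date) -> None:
        item = self.items[barcode]
        if item.status != "on shelf":
            raise ValueError(f"{barcode} is {item.status}")
        item.status, item.borrower_id, item.due = "in circulation", patron_id, due

    def place_hold(self, barcode: str, patron_id: str) -> None:
        self.items[barcode].holds.append(patron_id)

    def check_in(self, barcode: str, returned: date) -> Optional[str]:
        """Return an item, charge any fine, and report whom to notify (if anyone)."""
        item = self.items[barcode]
        if item.status != "in circulation":
            raise ValueError(f"{barcode} is not in circulation")
        borrower = self.patrons[item.borrower_id]
        days_late = max(0, (returned - item.due).days)
        borrower.fines_owed_cents += days_late * DAILY_FINE_CENTS
        item.borrower_id, item.due = None, None
        if item.holds:
            item.status = "on hold"
            return item.holds.pop(0)  # next patron waiting for this item
        item.status = "on shelf"
        return None
```

Fines are kept in whole cents so that the arithmetic stays exact. A real system would also record loan history, multiple copies, and branch transfers, none of which is attempted here.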

Individual suppliers of such minicomputer based systems are in the process of adding further applications such as online acquisitions; online catalogs (currently referred to by U.S. librarians as PACs, or Public Access Catalogs); serials acquisitions, cataloging and check‑in; film control and booking; and special reserve room control (an important application for colleges which have special collections from which books may circulate for one hour (or more, up to one day) at a time).

Many libraries are using these minicomputer-based systems today for full public access catalog use and complete circulation control. Note well that the precondition for such applications to run on the minicomputer is the creation of a machine-readable database of the library's holdings. The conversion of library holdings to machine readable form will be discussed later.

Based on a visit to the University of Nairobi on Tuesday, it seems within the realm of reason for the University of Nairobi to acquire in the next five years a minicomputer based integrated system, one which will control the circulation of materials, contain all of the University's cataloging data, do all of the acquisitions, and generally provide for the control of all of the University's materials.

Considering yesterday's discussion, the one in which Professor Kimani indicated that librarians and clerks alike spent valuable hours arranging and filing issue slips, the minicomputer circulation system will completely eliminate that particular odious function.

The librarians will do work much more appropriate to their training and the needs of the users, and the clerks will be able to concentrate on other work that is going undone while they are shuffling issue slips.

It will not be easy for a variety of reasons that can be taken up in the question and answer session, but it is a viable course upon which the University may well embark in the not‑too‑distant future.

1.1.3. MICROCOMPUTERS are used primarily in single library and single function applications. There is a variety of microcomputers available and in use by U.S. libraries. Microcomputers are being used in such administrative applications as fund accounting, word processing, and database management. The kinds of microcomputers used to perform these functions typically are based on the Intel chips, that is, the 8088, 80286, and 80386 chips which are in the IBM PC, XT, AT, and compatibles; and the Macintosh, which is based on the Motorola 68000 family of chips. From smart toys and hobbyists' enthusiasms, micros have evolved to become such devices as the COMPAQ 386/20 microcomputer with a 300MB disk drive, a device of monstrous computing capacity that can support a sophisticated local area network of PCs and/or terminals. Microcomputers are being used to perform such library functions as acquisitions, cataloging, serials control and circulation control. In the last couple of years they have been used, too, as public access catalogs and sophisticated database storage and searching devices.

Typically, most micro based systems do not accept, process or output cataloging data in conformity with library standards, that is, the MARC record. However, there are exceptions, and definitely more are coming along.

What has made the big difference is the storage medium, CD‑ROM, which will be discussed at greater length below. CD‑ROM stands for Compact Disk ‑ Read Only Memory. The CD‑ROM platter is capable of storing over 16,000 pages of information, and is approximately 5” wide. The combination of the PC, a CD‑ROM reading device, and a CD‑ROM disk with a database on it has made possible unprecedented extensions of automation, and more specifically has made possible the distribution onto PCs of databases which heretofore could only be stored on mainframe computers with massive disk storage.

With all of the equipment required to use a CD‑ROM now costing less than $5,000 U.S., a library now has the hardware for storing and accessing the entire U.S. MARC format database. The compaction of data storage and the relatively low cost of the CD‑ROM storage medium, and the reduction in unit cost for data processing as represented by the PC, have made it possible for the most sophisticated applications to operate wholly self‑sufficiently in the most remote locations. The only requirement is a stable power source compatible with the equipment.

The USIS library has several databases that are on CD‑ROM at its disposal, Books In Print Plus and Ulrich’s from R.R. Bowker, Dissertation Abstracts, and WilsOnline.

In addition to such bibliographic databases, companies such as the Library Corporation of America make the MARC database available on CD‑ROM as well as software to search it and prepare catalog cards and labels from the MARC records. This particular product is called Bibliofile and it has been very successful in the U.S. as a source or medium for inexpensively capturing Library of Congress cataloging data.

Bibliofile also provides for the printing of catalog cards and labels from the Library of Congress catalog record. No more searching the NUCs and all of the supplements, and no more typing catalog cards and labels, should those be decisions that are politically expedient. (One would suggest that there are ample other jobs clerical staff (and probably professionals) could be doing if they did not have to do those NUC searches and type catalog cards and labels.)

It would be extremely useful here for any library cataloging a significant number of monographs in western languages. Through Bibliofile a database of MARC records could be created for the books cataloged (as well as original cataloging for which the system has provision), and catalog cards and labels of Library of Congress cataloging standard could be created for the local library. The company was good enough to send along some sample brochures, floppy disks and CD‑ROM platters, which I will leave with you.

Perhaps this is as good a place as any to make a few remarks about the MARC format, and conversion to machine-readable data of cataloging information. None of the computer applications discussed in this paper are possible without there being data for the computer to manipulate. U.S. libraries learned the hard way, and I urged the University of Nairobi staff on Tuesday afternoon to not repeat the U.S. errors, namely, that when a library is going to automate it should adopt the MARC format as the standard for converting its cataloging data. MARC stands for MAchine Readable Cataloging, and is a format specifically designed for putting cataloging information into a form the computer can use.
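To give a feel for what "a form the computer can use" means, here is a much simplified sketch of a MARC-style record held in memory. It is not the actual MARC exchange format, and the bibliographic data are invented; only the tags shown (100 for the personal name main entry, 245 for the title statement, 260 for the imprint) follow the real MARC bibliographic conventions. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Field:
    tag: str                   # three-digit MARC tag, e.g. "245"
    subfields: Dict[str, str]  # subfield code -> content, e.g. {"a": "Title"}


@dataclass
class Record:
    fields: List[Field]

    def first(self, tag: str, code: str) -> Optional[str]:
        """Return the first occurrence of a given subfield, or None if absent."""
        for f in self.fields:
            if f.tag == tag and code in f.subfields:
                return f.subfields[code]
        return None


def card_heading(record: Record) -> str:
    """Build a one-line author/title heading, as might head a catalog card."""
    author = record.first("100", "a") or ""
    title = record.first("245", "a") or ""
    return f"{author} {title}".strip()


# Invented example data, for illustration only.
example = Record(fields=[
    Field("100", {"a": "Author, Example."}),
    Field("245", {"a": "An example title."}),
    Field("260", {"a": "Nairobi :", "b": "Example Press,", "c": "1988."}),
])
```

Because every element of the description carries its own tag and subfield code, the same record can drive a printed card, a label, or an online display without being rekeyed.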

The tremendous advantage of Bibliofile for an institution such as the University of Nairobi is that the University could use Bibliofile to convert its monograph holdings to the MARC format and then have these holdings (the Union Catalog) published on CD‑ROM and placed, not just in the college and departmental libraries, but distributed nationally to be used by any library or information agency with a PC and a CD‑ROM reader. Further, the University would then have its MARC database ready for loading into the minicomputer circulation control system it will be getting in five years.

An approach, not quite as technologically intensive, which might be feasible in the District Focus program would be for every DIDC to have its own PC, all use the same PC database management program, and all enter their District's documents into the database program. Disks could be exchanged between DIDCs and each district center could be aware of the documents held by each of the others.

In addition the disks all could be sent to a central point at which they could be merged and a central database of DIDC documents could be created. There are a number of outstanding database management programs available which would readily support such an effort. And if one can safely assume that a PC for each DIDC is within the realm of possibility in the next several years, then such a program would be feasible.

A number of things would be required for such a program to work. And this assumes all the technological and electrical problems can be solved. There must be agreement on the use of a single database management program for all of the DIDCs. There must be a standard format adopted and understood by everyone concerned. Since I have no familiarity with the range, character and size of the documents to be acquired and maintained at the DIDCs, no specific database management program is suggested. But standardization is the key to success of all networking arrangements, and it will be vital here.
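The exchange and merging of disks described above might be sketched as follows. This is only an illustration under stated assumptions: the four field names stand in for whatever standard format is eventually agreed, and each disk is represented simply as a list of records rather than as the file format of any particular database management program.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical agreed standard: every record must carry these fields.
REQUIRED_FIELDS = ("district", "doc_id", "title", "year")


def validate(record: dict) -> None:
    """Reject any record that does not follow the agreed format."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")


def merge_district_files(files: Iterable[List[dict]]) -> Dict[Tuple[str, str], dict]:
    """Merge the records from each DIDC disk into one central database.

    Records are keyed by (district, doc_id), so a document sent in twice
    is stored once, with the later copy replacing the earlier one.
    """
    central: Dict[Tuple[str, str], dict] = {}
    for records in files:
        for record in records:
            validate(record)
            central[(record["district"], record["doc_id"])] = record
    return central
```

The validation step is the point of the sketch: a merge of this kind only works if every district enters its data the same way, which is why standardization matters so much.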

The questions I have concerning the success of the DIDCs, based on my reading of the DIDC document, have to do with their management; their staffing by clerks, with professional participation coming from a host of agencies which have no control over the DIDCs' operations; and the seemingly unlimited collection, organization and dissemination responsibility of the DIDC, with only local funds available to support it. My guess is that there are 41 districts and 41 different ways that the DIDCs will be implemented. The PC alternative suggested above can work, but the assumptions underlying its chances for success should all be made explicit.

Noted already was the absolute requirement that there be standardization in the use of a database management program. This however assumed that the bulk of the documents in the DIDC would be locally generated documents, as opposed to research monographs.

It also assumed that the clerks running the DIDCs would be trained to use the PCs, and that the establishment of the PC database program and the standards for entering data into it would be taken care of by competent professionals at the national level. This further assumes that at least some of the documents collected at each DIDC will be of value and interest to one or more other DIDCs, and should be centrally collected by the Ministry responsible for establishing the DIDCs.

One regrets any perceived negativity in these remarks. I believe that the goal of the DIDC is a noble one, that is, making available for the people of each district all of the documents that have a bearing on their existence and the future of their lives, their homes and their work. Organizing this effort into some coherent enterprise that has a good chance of success will be a monumental effort. As a fellow professional librarian speaking to this assembly of my Kenyan colleagues, it is unfortunate that librarians were not designated to establish and manage these DIDCs.

1.2. Micrographics: The two major areas of micrographics employed in U.S.A. libraries are the traditional photographic microreduction of hard copy documents, and computer based microreduction. With photoreduction, the hard copy document is photo‑reduced to microfilm at reductions of 16x and 24x, all the way down to 75x and 150x in some ultrafiche reductions. Computer microreduction is usually 42x or 48x. Both microfilm and microfiche have been used for storage and/or preservation of deteriorating materials, infrequently used materials, and those items which suffer from vandalism, misuse, or for some other reason need this protection. The other reason is simply to save space. In addition microform is an alternate and cheaper means of subscription to some materials, i.e. many libraries with limited budgets in an inflationary economy find it acceptable to acquire given serials in microform rather than hard copy. The disadvantage is that the serials are never received on a timely basis.

1.2.2. COMPUTER‑OUTPUT‑MICROFORM (COM) is a medium used by libraries since the 1960’s that successfully utilizes both the data processing and microform technologies. Instead of photoreducing hard copy documents, the information displayed on COM is converted into machine-readable copy from hard copy format or directly keyed into machine-readable form, and is thus manipulable by the computer to suit the given library application. The machine-readable data, instead of being displayed (i.e. printed) in hard copy form by the printer, is input to the COM device. The COM device, also called a "camera" because of its camera‑like function, converts the machine-readable data into its visual analog format, i.e. it displays it on a cathode‑ray‑tube (CRT) screen in a human readable (not digital) form. The CRT display is then projected through a lens onto reel microform at extremely high speeds. In this manner either 16mm microfilm or microfiche are created. (Note that the microfiche are created as a single roll of film and cut into the individual sheets later.)

There are a large number of libraries in the U.S.A. which have COM catalogs. It is an extremely compact medium: a given sheet of microfiche can contain the equivalent of 224 pages of 11 ½” x 14” computer printout. It is relatively inexpensive, as the largest costs are associated with the data processing effort required for organizing the machine-readable data into the desirable manner of display, i.e. the way the catalog is to look; the costs for the COM master and the copies made from it tend to be modest, assuming the catalog is not too large. Many libraries in the U.S.A. have found that COM catalogs are a viable intermediate step between the card catalogs that still predominate, and the online catalogs that require far more expensive equipment to use, namely computers. Generally COM has functioned as an inexpensive storage medium and it has eliminated bulky and comparatively expensive paper listings.

It is used best as a disposable product, and is not thought of as an archival medium. For example, many libraries use COM microfiche for current on‑order information. Since libraries tend to order, receive and cancel (orders for) books on a frequent basis, a COM listing of the status of the orders must be issued fairly frequently if it is to be of value. Hence the disposability of the COM on‑order file and its regularly being supplanted by a more current edition. It is also the case that libraries use COM as a substitute for any necessary computer generated report that involves a large amount of paper.

Finally, COM does not enjoy a great future; as the costs of computer storage and processing continue to drop, the viability of wholly online and interactive systems increases, thereby eliminating a major reason for utilizing COM.

The other major reason COM will further erode as a remote display medium is the aforementioned CD‑ROM. Major COM catalog manufacturers now offer CD‑ROM catalogs to customers with increasing success in selling the newer medium. All of the costs associated with CD‑ROM, however, are far greater. As noted, the CD‑ROM requires a PC and the CD‑ROM device. With COM all that is required for display is a $150 ‑ $200 reader. The preparation of the COM masters and duplicates is appreciably cheaper as well. Typically the master costs from $10 ‑ $25, and each duplicate fiche is about $0.25 depending on the supplier.

Again there may not be much applicability of COM or CD‑ROM to a regional network, because both presuppose the existence of a machine-readable database from which a vendor can create the COM or CD‑ROM.

This leads to a discussion of the CD‑ROM medium. The more comprehensive term is laser recording and display media. Machine-readable data is taken as it stands in digital format, and/or analog information is first converted into digital format.

By the use of the laser technology the information is encoded on to a silver platter of varying size and varying storage capacity. There are Optical Disks and CD‑ROMs, the two laser media prevalent today.

1.3. Optical Disks are relatively new as regards library usage. Experimentation is proceeding with this medium as a storage device capable of holding much more information than is found in microform; two gigabytes (i.e. 2 billion bytes or more than 65,000 pages) stored on a single optical disk; they are capable of producing appreciably clearer images on playback or display; and lastly, they are capable of combining on a single optical disk both digital and analog (i.e. pictorial) information. The Library of Congress and other major research libraries are studying this closely as another means of preserving deteriorating books.

Since optical disks, as noted, include digital information much work is being done to find applications which will combine the computer's indexing, searching and control capability with the laser disk's capacity for accessing pictorial information through digital data encoded on the disk as well. Note that the optical disk is read or examined by a laser device and displayed on a television monitor or some other form of cathode‑ray‑tube display device such as a computer monitor.

One company has taken huge medical databases and placed them on optical disks, and reduced the storage device costs of the databases by many factors. Some optical disks are known as WORM disks, i.e. WO for Write Once, RM for Read Many times. These applications are usually tied to devices significantly more powerful than microcomputers.

CD‑ROM: This disk, to which reference has been made several times, especially in the microcomputer discussion, has become a hot item in United States library discussions, and its impact is just beginning to be felt. CD stands for Compact Disk, and ROM means Read Only Memory. Each disk can hold about 500 million bytes of information, which means that over 16,000 pages of data can be stored on a single platter.
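The capacity figures quoted in this paper can be cross-checked with a little arithmetic. The sketch below uses only the numbers given in the text (500 million bytes and 16,000 pages for a CD‑ROM; 2 billion bytes and 65,000 pages for an optical disk); the function names are merely illustrative.

```python
CD_ROM_BYTES = 500_000_000
CD_ROM_PAGES = 16_000
OPTICAL_DISK_BYTES = 2_000_000_000
OPTICAL_DISK_PAGES = 65_000


def bytes_per_page(total_bytes: int, pages: int) -> float:
    """Average storage implied for one page of data."""
    return total_bytes / pages


def pages_that_fit(total_bytes: int, page_bytes: float) -> int:
    """Whole pages of a given size that fit in the available storage."""
    return int(total_bytes // page_bytes)


cd_page = bytes_per_page(CD_ROM_BYTES, CD_ROM_PAGES)                    # 31250.0
optical_page = bytes_per_page(OPTICAL_DISK_BYTES, OPTICAL_DISK_PAGES)   # about 30769
```

Both sets of figures imply roughly 31,000 bytes per page, so the two capacity claims are consistent with each other to within rounding. The page count for any particular database would of course depend on how much text its pages actually hold.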

The extraordinary impact of CD‑ROM is that it provides the distribution of massive databases, including sophisticated retrieval capability, while at the same time requiring none of the prohibitive telecommunications costs normally associated with the distribution of databases to remote sites. All that is required for access to a database, such as the Library of Congress MARC database, is a PC compatible microcomputer with a CD‑ROM reader and control card, and the CD‑ROM platters with LC MARC, which is 3 million records large at this point.

In the past libraries would incur major phone line charges to call the computer upon which resided the database to be searched. Plus the library had to pay toward the cost of storing that database on the computer online. As noted here and earlier, tremendous cost savings are thus realized, and the CD‑ROM is sitting there waiting to be used whenever one chooses.

Several libraries have chosen the Bibliofile product over membership in OCLC because it is so much cheaper for them to catalog with Bibliofile.

Many U.S. library systems or consortia, rather than face tremendous telecommunications charges, are putting their public catalogs on CD‑ROM, rather than connecting terminals via dedicated phone lines to the circulation system computer.

This application strikes one as particularly valuable when one thinks of regional or distributed networks where resources may be scarce and access to sophisticated telecommunications technology may be either limited or prohibitively expensive. If CD‑ROM were the primary distribution medium, there would always be a built‑in delay for the time required to create and duplicate the platters from the machine-readable data (four to eight weeks).

For example, if all of the four universities in Kenya were to convert their holdings to machine-readable form, it is probable that they could all fit on one CD‑ROM disk. Copies of these disks, that is, the holdings of the four university libraries, could be made available to all of the libraries in the nation, assuming these libraries had the PC and CD‑ROM devices on which to read the platter. The distribution of such data in the past was only possible with COM, but the COM medium as noted earlier, and for additional reasons not pursued, is far less satisfactory.

Perhaps one last point should be made about CD‑ROM and its advantages over COM. COM is a much cheaper medium, but one cannot utilize the indexing and searching capability that the computer and CD‑ROM provide in combination. Boolean searches on keywords, subject terms, names, parts of names, etc. all are possible with CD‑ROM based systems, examples of which John Lilech would be glad to demonstrate at some reasonable time. So the extra cost of CD‑ROM versus COM pays for appreciably greater search and retrieval capability, a capacity far exceeding that of the traditional catalog.
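The kind of Boolean keyword searching mentioned above rests on an index from words to the records containing them. The following is a minimal sketch of the idea, not a description of how any particular CD‑ROM product works; the function names are hypothetical, and the short index built here stands in for a catalog of any size.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def build_index(records: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each lower-cased word to the set of record numbers containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for record_id, text in records.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(record_id)
    return index


def search_and(index: Dict[str, Set[int]], *terms: str) -> Set[int]:
    """Records containing every one of the terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index: Dict[str, Set[int]], *terms: str) -> Set[int]:
    """Records containing at least one of the terms."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


def search_not(index: Dict[str, Set[int]], include: str, exclude: str) -> Set[int]:
    """Records containing one term but not the other."""
    return index.get(include.lower(), set()) - index.get(exclude.lower(), set())
```

A card or COM catalog offers only the fixed access points under which entries were filed; an index of this kind lets any word in the record serve as an access point, and lets words be combined.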

Without further belaboring the CD‑ROM technology's virtues, one will now discuss telecommunications, the big giant to which the little CD‑ROM seems to be providing an alternative.

1.5. Telecommunications is an area of great specialization and, as a major technology, has its own discipline to be mastered. No attempt will be made to indicate just how complex this area of technology is, nor even to explain the various telecommunications processes. It will suffice to indicate the kinds of roles telecommunications may play in a distributed information network.

The use of telecommunications in all of the contexts discussed here means the use of phone lines as the means by which a user at a remote site communicates with the central site computer or anyone at another remote site.

(Much less frequently, other means for transmitting messages such as satellite or radio frequency transmitters may be used, but will not be considered further.)

If there is a phone line specifically established for such communication, and if it is used solely and exclusively for that purpose, then it is called a dedicated line. If a phone line can be used for a variety of calling purposes, one of which is calling the computer from a remote site, then the remote terminal has dial‑up capability.

For both dedicated and dial‑up lines to be used, special equipment is required on the terminal and on the computer that will permit the data to be converted into a form that can travel between the computer and the terminal, and then converted back so that it can be displayed on the terminal or processed by the computer—depending on the direction the data is headed.

The telecommunications equipment is expensive, but even worse, the telephone charges themselves, at least in the U.S., are extremely dear.

For example, the library of which I am the director will have a telecommunications bill of over $10,000 per month. It would have been double that price if we had been unable to take advantage of a special federal tariff. The telecommunications equipment was approximately 1/4 of the price of the total system, that is, $350,000 of a $1.4 million purchase price. The system will support 180 terminals at 45 sites.
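The figures just quoted can be put on an annual and per-terminal basis with a few lines of arithmetic. The sketch below uses only the numbers given above, treating the "over $10,000" monthly bill as exactly $10,000, so the results are lower bounds; the function names are illustrative.

```python
MONTHLY_BILL = 10_000          # dollars; the text says "over" this amount
TERMINALS = 180
TELECOM_EQUIPMENT = 350_000    # dollars
SYSTEM_PRICE = 1_400_000       # dollars


def annual_bill(monthly: float) -> float:
    """Twelve months of line charges."""
    return monthly * 12


def monthly_cost_per_terminal(monthly: float, terminals: int) -> float:
    """Line charges spread evenly over the terminals served."""
    return monthly / terminals


def equipment_share(equipment: float, total: float) -> float:
    """Fraction of the purchase price spent on telecommunications equipment."""
    return equipment / total
```

On these figures the line charges come to at least $120,000 a year, or about $56 per terminal per month, and the telecommunications equipment accounts for one quarter of the purchase price.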

This is precisely why U.S. libraries are turning to CD‑ROM for distributing data.

The interesting trend that seems to be emerging is that online terminals are used and will continue to be used where the data must be accessible in real time and must be kept current to the moment. Online circulation systems, which keep track of who has which books and which books are in or out, are examples of the justification of online communication with the computer.

Online catalogs are harder to justify. For example, if a library's database consists of 500,000 volumes, if it adds twelve thousand volumes per year, and its CD‑ROM catalog is updated quarterly, then the most someone will be denied access to will be roughly 1% of the library's holdings. And there are alternate ways of making that three to six months of information available so that it is not entirely inaccessible to the library user. In this way the library is spared the substantial telecommunications bill of online public access catalogs, yet the users will still have access to most of the titles held by the library.
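The estimate of about 1% can be checked directly. The sketch below assumes, for simplicity, that additions arrive at an even rate through the year; the function name is illustrative.

```python
def share_not_yet_in_catalog(holdings: int, added_per_year: int, months_stale: float) -> float:
    """Fraction of holdings acquired since the catalog data was last captured."""
    recently_added = added_per_year * months_stale / 12
    return recently_added / holdings


three_months = share_not_yet_in_catalog(500_000, 12_000, 3)   # 0.006, i.e. 0.6%
six_months = share_not_yet_in_catalog(500_000, 12_000, 6)     # 0.012, i.e. 1.2%
```

With a quarterly update the gap is 0.6% of the holdings just before a new disk arrives, and it rises to 1.2% if production delays stretch the gap to six months, which brackets the figure of about 1%.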

The discussion of technologies would not be complete without briefly touching upon Video and Cable Communications.

1.4. Video and Cable communication are technologies being used by libraries in a variety of ways.

1.4.1. Video and cable communication are being used simply as entertainment and educational media within libraries much the same way they are used in U.S.A. households. Libraries have extended this function by videotaping, broadcasting or otherwise making available information of value to the local community not otherwise accessible to that community.

1.4.2. Public libraries have been especially exploiting the home videocassette market by buying and circulating to their clientele entertainment and educational videocassettes.

1.4.3. The potential for this medium as an information dissemination tool is great. It is not yet being used very effectively in this manner.

Video and cable communication has the potential to be an extremely successful application in the DIDC program. It presumes the availability of TVs and video playback devices in all community meeting places. In this way important information can be disseminated in each district, and it would not require people to be literate to gain the benefits from that information. As the saying goes, a picture is worth a thousand words.

This concludes this technology overview and alternatives.

Dr. Maurice J. Freedman, Director

Westchester Library System

Nairobi, Kenya

25 February 1988