Presentation at the University of Maryland College Park
CHINESE AMERICAN NETWORKING SYMPOSIUM
JANUARY 14, 1999
Charles B. Lowry, PH.D.
Dean of Libraries
Introduction
It is no longer a point of contention that the information underlying academic library services--scholarly information--will be mediated increasingly by electronic means. What is in doubt is the speed with which this transition to digital formats will take place. There are many economic, social and technological inhibitors to this process which will have indeterminate effects. The response of all libraries will have to be opportunistic, taking advantage of information technology (IT) innovations as they appear to create new classroom and research synergies. Accordingly, it is essential in the design of library IT systems to ensure maximum flexibility so that adaptations to currently unknowable developments are possible.
Almost no one questions that there is a paradigm shift going on, and there is no shortage of people with "crystal balls" to predict the future, myself included. In the next twenty minutes I want to paint a fairly broad picture of the challenges presented by the advance of information technology and the opportunities that "digital library" development presents for the University of Maryland, leaving sufficient time for Q&A. I want to propose a series of questions to which I will give my own answers.
WHAT ARE THE PARADIGM SHIFT STRATEGIES?
In 1988 I wrote in an article that "Today, the Library is being transformed into a capital-intensive, high-technology light industry. Until very recently, libraries have been fundamentally nineteenth-century institutions and may be characterized as labor-intensive craft workshops. As such, their organization has centered around specialized skills and knowledge applied to complex manual filing systems." Libraries are organizations which may be characterized by a significant distribution of the "expert" information needed to make decisions and a consequent increase in the responsibilities and expertise of staff throughout the organization. The main point is that the transformational effect of technology is unavoidable. It is upon us and the tide cannot be turned back, but we may be able to "manage" it.
The information technology landscape is characterized by a set of very complex issues such as networking, distributed architectures, connectivity, remote access, full text and increasingly multi-media--in short a shift in the library marketplace. This shift will place significant new pressures on libraries as the providers of scholarly information, since that information is taking multiple forms. What is needed is a shift in thinking from traditional automation planning which is built around the notion of major computing system implementation and vendor abandonment over fairly long cycles. The landscape is changing so that we must develop an "information strategy" which allows us to respond to more or less constant change in IT. This must obviously be incorporated into budgetary planning.
WHAT ARE THE KEY TECHNOLOGIES?
The new software and hardware information technology, which is the prerequisite for the virtual library, rests on the foundation of client/server distributed computing; large-scale local and wide area networking; open architectures and standards; authentication, authorization, and encryption; and billing and royalty tracking. Moreover, there are fundamental problems for information retrieval (IR) that must be solved to create the technology needed to make IR viable in the distributed full-text future. Call this the virtual library tool kit: it will include far less dependence on word indexing and keyword/Boolean retrieval; the development and broad application of natural language processing (NLP); and effective tools for navigation of the network, which are based on good human factors research with human-computer interaction (HCI). Finally, academic libraries will quickly broaden their offerings to include what has been loosely termed multi-media. While the printed word will remain critical to the scholarly enterprise, information in its broadest sense has become a significant piece of the puzzle. The research and development in which the University of Maryland Libraries have participated focuses on the problems of images, digital music, voice digitization and the attendant problems--from metadata (what we used to call cataloging) to retrieval and browsing. It also is illustrative of an important niche research libraries can fill as content providers or publishers in partnership with research faculty and technology companies.
WHAT ABOUT NETWORKED RESOURCES AND THE INTERNET?
As we begin to exploit the Internet, librarians must keep in mind two salient facts: it is really just a large distributed computing system with a decentralized administration, which makes it enormously complex and difficult to make "user friendly"; and it will become a preeminent feature of the virtual library. The use of the Internet within the new library paradigm will result from two sets of efforts: the work of individual librarians as experts who will guide library patrons towards resources that are useful for their particular information needs, and the integration of Internet resources within the library information technology environment. Currently, the Internet presents many problems to the common user. Resources available are diverse, difficult to find and their quality is extremely uneven. The question of the volume and quality of information on the Internet is a critical one for publishers as well as libraries.
However, the plain fact is that the technology is also problematic. In spite of the serious and sustained effort to provide better interface tools for the user, we are a long way from a good common user interface. Perhaps that is why metaphors like tunneling, surfing and navigating are typically used to describe the current state of the art for retrieval on the Internet.
WHAT ABOUT PUBLISHING IN THE NEW PARADIGM?
There are of course numerous technical problems, but I believe they can be solved with cooperation between libraries, publishers and library automation companies. Too often though, work on digital libraries, not to mention much theoretical discussion, proceeds without a thorough grounding in the fiscal realities. There are certain assumptions which precede this state of affairs, among them the notion that digital libraries somehow will be cheaper than print libraries, perhaps even free. I suspect this arises from the misplaced hope that digital libraries will liberate us from the difficult cost dynamics of print libraries. There is also a presumption that electronic access will mean added value to library patrons, but it begs the question if the access is at a cost patrons or academic libraries are unwilling or unable to pay.
It seems clear that libraries will not have large amounts of new funding with which to purchase electronic materials, although it is not a zero-sum game. It is not only serials costs which are spiraling but also those for books, which are far and away out of scale when compared to measures like the US government's Consumer Price Index. In addition, electronic information resources are taking a significant share of the budget. A large part of the problem has nothing whatever to do with technology, but with the structure of modern scholarly publishing, particularly in technical disciplines. This is all very well understood and has to do with what I call the iron syllogism--universities (i.e., our faculty) produce scholarship; scholarship is used for career advancement (i.e., promotion and tenure) by transferring it to publishers; and universities (our libraries) are responsible for acquiring the information to support classroom teaching and research (i.e., buy back the scholarship which faculty created). Publishers are in the enviable position of controlling the "inputs" and "outputs" of scholarship.
It is also the case that publishers may not expect to have large sources of new profits from the sales of electronic products that represent the information published today in books and journals. The economics of print book and journal publishing (that is, how to make a profit from them) are well understood. Electronic publishing is another matter. On the other hand, the magnitude of the IT undertaking before publishers is no less daunting than that facing libraries. I do think that publishers who adapt will survive. There will also for a time be significant new opportunities for alternative publishing enterprise. None of this will be free.
IT offers an opportunity to restructure publishing, publisher/library relationships, and technological applications. Any serious economic analysis draws us to one final telling conclusion. The traditional model of local ownership, which has dominated the vision of library organization and collection development for a century, must change. The access model which is emerging will mean that libraries may subscribe or license access to information formerly packaged as a book or a journal, but it is not likely that they will store it on the local campus network. It only makes sense to share information technology resources among libraries and the cost of shared access to databases. However, these are all new relationships. They mean that the nature of ownership must be carefully redefined, and this will take time and may not be very easy to accomplish.
Publishers will want to know that their materials are being used appropriately. They should expect that access is for the campus community, that Interlibrary Loan and reserve reading conform to the "fair use" principle of US copyright law and that authentication and authorization prevent the significant downstreaming of information to those who have not paid for it. Libraries will want to know that a subscription to a title gives them permanent access to the contents over time, that the server on which it is found will be consistently available on the network, that the technology will be robust and stable, and that if the supplier (e.g., consortium, publisher, network) ever withdraws the service, then there is a plan for giving them the data they paid for. If such relationships are properly worked out, we may expect opportunities for new types of subscribed access.
HOW WILL LIBRARIES SURVIVE IN THIS FUTURE?
I don't see us moving toward a "National Electronic Library" any more than a National Electronic Computing Center. It is just not in the nature of distributed architectures, which are inherently decentralized, to have such an effect. However, I do not mean that libraries as places won't change and change drastically. During the first 25 years of library automation, the consistent result of the introduction of information technology has been to increase the actual use of library facilities and print resources. It may be that the second wave of automation will validate this experience. However, there is a good possibility that this will be the first instance in which technology applications will change the way in which users relate to traditional library services and collections. One of the University of Maryland Libraries' research goals in all full text projects must be to assess the effects of these projects on traditional library operations. We do know that remote access independent of place and time is a near-term reality. We have all heard the clichés about the exponential growth of information during our century. This phenomenon is the cause of the emergence of the academic library as an instructional agent, which has been developing strongly during the last thirty years. This role is essential because bibliographic systems of access to print are complex and their efficient use by students and faculty has to be supported by effective instruction and reference services. The networked world of the virtual library will make current systems of print access in libraries look like the model of simplicity and rationality.
The role of the academic library as the central agency for acquiring and organizing access to scholarly information and the concomitant role of the librarians and library staff as information "mediators" will not disappear in the networked world, but will be integrated more closely into the teaching process as librarians and departmental faculty cooperate to ensure that the "information" which is the building block of knowledge and classroom learning has currency. So, libraries as places will continue and will be the same so long as we have significant print collections, but at the same time different because of the access they provide to distributed information.
IS THERE A PLACE FOR DIGITAL LIBRARY RESEARCH IN THE PRODUCTION-ORIENTED LIBRARY CONTEXT?
I want to close by saying a few words about the tension in IT work for libraries between research and production. We have learned a hard lesson about the burden that IT development brings to a small computing staff. Maintaining systems in production requires constant attention, and we do not have the resources to develop new applications while we keep old systems running without effective strategies of partnering. Many computing groups on campus are doing work which offers opportunities. Such relationships require time and effort to manage if they are to be successful and flourish. A corollary is that relationships carry an overhead and we must make decisions about which ones are worth the effort. Each new effort, each new partnership carries with it new burdens. Expanding efforts without due caution will, I believe, result in failure. I like to measure new projects by five basic rules: 1) seek partners on campus and in the library vendor community who can bring significant know-how and special resources to the enterprise; 2) emphasize efforts which will leverage advantages to our central purpose of supporting the teaching and research efforts at the University of Maryland with information resources; 3) transfer any technology developments to the larger academic library community; 4) secure new funding for any information technology initiatives we undertake; and 5) start new initiatives that have a strong propensity to build on past successes and current work. We must in short consider the cost/benefit basis for undertakings, and this means asking hard questions all the time. For instance, what previously innovative homegrown systems are becoming available from vendors, and what will we lose if we adopt them? How can we use relationships with vendors to create mutual advantage? So our research proposals are based on viable opportunities and on local human resources, technical experience and information resources which are unique.
They must also be based on partnerships. I want to describe four current proposals in which University of Maryland Libraries are participating that I think live up to the rules that I have just outlined.
NSF Digital Library Initiative -- Creating the Digital Music Library
Total UM Libraries Request: $1,329,606 (over five years)
The University of Maryland Libraries submitted a Digital Library Initiative proposal as a partnership with Indiana University and the University of California at Berkeley for the 1998 competition. The research proposed involves testing and creating a Digital Music Library.
NSF Digital Library Initiative -- Searching Sound Recordings Using Speech Technology
Total Request: $599,813 (over three years)
Expected Decision Date: January 1999
This joint proposal with Professor Oard in the College of Library and Information Services will use our radio broadcasting archives to advance research in voice recognition and conversion to ASCII. Based on this work we can create indexed retrieval to radio archives. Here, too, we see opportunities to strengthen our basic services and add value to the retrieval of audio information, particularly by converting voice recordings from our huge archives of radio materials. In addition, this gives us an opportunity to push the envelope on reformatting analog materials such as wire recordings and at the same time to experiment with digital preservation of these materials.
Ameritech/Library of Congress Digital Library Initiative -- Town and Country, Life in the Plantation Chesapeake 1740-1920: Text, Images, and Archaeological Collections from Historic Annapolis Foundation and the University of Maryland Libraries
Total Project Request: $149,234
In partnership with the UM Department of Anthropology, its associated Geographic Information System Mapping Project, and the Historic Annapolis Foundation, the UM Libraries propose to create a digital archive that will describe and illustrate the daily life and the built environment of the Chesapeake region. Artifacts and manuscripts from all partners will be digitized by an outside contractor and then loaded onto a local system. The technical architecture for the project will then be created by the Information Technology Division of the UM Libraries. The archive will be accessible in GIS format as well as through the database searching tools of the library.
NSF Knowledge and Distributed Intelligence -- Universal Access for Newsprint Media: Microfilmed Newspaper Scanning Project to Research and Develop Effectiveness of Optical Character Recognition Process
We are currently developing this proposal. The University of Maryland Libraries will submit a proposal in cooperation with CLIS, UMIACS, and the Department of Journalism. The project will involve digitizing from microfilm a portion of the University of Maryland Libraries' newspaper collection. Tapas Kanungo from UMIACS will perform the OCR process on the digitized microfilm in an attempt to make this process more "smart" and effective. CLIS will participate in developing the searching mechanism for the database that is created. The UM Libraries will be the lead evaluator on the project, with some participation by Journalism Department students.
Let me close now and offer a chance for Q&A.