The Center for Research on Information Access (CRIA):
Multidisciplinary Coordination
In order to ensure coordination of digital library research, testbed
and evaluation components, and to ensure that activities are carried
out in accordance with the vision of the Columbia Digital Library,
Columbia University has established the Center for Research on
Information Access (CRIA). The Center, housed in and associated with
the Libraries, is also closely associated both with Academic
Information Systems (AcIS) and with the Department of Computer
Science. The director of CRIA identifies opportunities for new
projects, initiates and develops projects, coordinates relations among
project partners, oversees financial and project-development
management, and, guided by the Center's advisory committees, ensures
that the research vision is pursued. CRIA has several advisory
committees, including the research advisory committee, which oversees
research aspects of projects by carrying out regular reviews of the
research and identifying new directions that should be explored, and
the intellectual property committee, which coordinates legal experts,
publishers, and researchers in exploring new approaches to online
distribution of material.
The Columbia Digital Library Infrastructure
Over the past several years, Columbia University has been creating and
assembling an increasingly rich and diverse set of networked
information resources that serves and empowers its community,
including high-speed network connections, electronic classrooms,
campus-wide public terminals, and wiring of all dormitory rooms.
Current investments build upon other technology initiatives that led
to installation of the fiber-optic network, authentication systems,
and network management systems. Enabled by timely investments in the
campus network, enhanced by rapid developments in new information
delivery technologies, and joined together by the ColumbiaNet
infrastructure, these interconnecting services have become the
components of a larger, and increasingly integrated, program of
electronic information delivery. Viewed as an integrated program,
these services form the nucleus of a digital library at Columbia.
The Current Digital Library
Columbia's digital library is a collection of information, tools, and
resources made available to the Columbia community over the network in
an organized, well-managed and well-supported manner. The contents of
the digital library include research, instructional, administrative,
and student information in a variety of forms, including full text,
images, indexes, catalogs, databases, multi-media resources,
geographic and numeric datasets, and links to selected services on the
Internet. The digital library organizes and delivers this content
through search and retrieval mechanisms, and it provides a wide array
of tools and applications for users to manipulate and employ the
content they retrieve. The digital library in its broadest sense is a
key component of the digital university, permitting access to material
ranging from Dante to grades to weather measurements, using tools
tailored to the task, with proper intellectual property protection and
security.
Research on Digital Library Technologies
The development of a digital library is a highly complex process and
requires simultaneous advances in discrete domains of technical
inquiry including user interfaces, search and retrieval techniques,
representation and presentation of information, and management of
intellectual property. It requires combining very large-scale
networks with very large-scale file storage and creating digital
collections of sufficient depth and breadth to be of compelling
interest to working user groups. The Digital Library research program
brings together an interdisciplinary team of experts to address these
issues, resulting in a research prototype digital library.
Research and evaluation teams are working toward this goal by
addressing several domains of technical inquiry in a highly
interactive manner. In developing a prototype advanced digital
library, our focus is on creating a user interface that can meet the
needs of a wide range of end users, on developing integrated search
and retrieval that efficiently finds documents of interest, whether
textual or visual, and on enhancing the representation of information
to facilitate both search and presentation. An important feature of
our research is the integrated search of textual and image documents,
allowing a user to retrieve text, images, video, or any combination of
the three in response to a single query. Our research addresses
questions such as how to integrate information from both text and
image in order to find a desired document more effectively. For
example, someone wishing to find out about a news topic such as the
World Trade Center bombing could get summarized newspaper articles,
relevant images, and maps, all presented in a concise, organized
fashion. Our goal is to present large amounts of available information
to users in a compact way, using both natural language summaries and
graphical visualization, to alleviate the growing problem of
information overload.
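The idea of answering a single query with results of several media
types can be illustrated with a minimal sketch. It is not the
prototype's actual method: the Item record, the score and search
functions, and the sample entries are all hypothetical, and plain
keyword overlap on titles and captions stands in for real text and
image analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One entry in a hypothetical mixed-media index."""
    media: str        # "text", "image", or "video"
    title: str
    description: str  # article text, or a caption standing in for visual content


def score(query: str, item: Item) -> int:
    """Count the distinct query words found in the item's title or description."""
    words = set(query.lower().split())
    haystack = set((item.title + " " + item.description).lower().split())
    return len(words & haystack)


def search(query: str, index: list[Item]) -> dict[str, list[Item]]:
    """Run one query over every media type and group the matches by type."""
    ranked = sorted(index, key=lambda it: score(query, it), reverse=True)
    results: dict[str, list[Item]] = {}
    for item in ranked:
        if score(query, item) > 0:
            results.setdefault(item.media, []).append(item)
    return results


# Made-up sample entries, for illustration only.
SAMPLE_INDEX = [
    Item("text", "City council votes on bridge repair",
         "The council approved funds for bridge repair"),
    Item("image", "Bridge photograph", "Photograph of the damaged bridge span"),
    Item("video", "Harbor tour", "Video tour of the harbor"),
    Item("text", "Weather summary", "Rain expected this week"),
]

if __name__ == "__main__":
    for media, items in search("bridge repair", SAMPLE_INDEX).items():
        print(media, [item.title for item in items])
```

Grouping the matches by media type is one simple way to present mixed
results together; a fuller system would rank on visual features as
well as on text.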
Content Development
Content development is of particular importance to the research
program because relatively few scholarly resources are currently
available in digital form. Creation of a content testbed provides the
means to validate research carried out on digital libraries, by
developing a very large multimedia test data set that will also be
made available to other researchers. Through the strength
of Columbia University Libraries and partnership with a variety of
publishers, the Columbia Digital Library provides access to
collections in a growing range of areas. Collections are continuing
to grow both through the efforts of Columbia University Libraries and
through individual departments. For example, the Libraries and
Academic Information Systems have embarked on a project to put online
all of the texts and art history images required for the core
humanities courses in Columbia College. In addition, Columbia's
Libraries are undertaking a study of the use of online books, with
support from the Mellon Foundation, and the participation of Oxford
University Press, Columbia University Press, Garland Publishing, and
Simon & Schuster Higher Education Division.
This study will answer questions such as which books are most useful
in digital format (reference, high use, low use), how usage patterns
differ between digital and paper formats, and how to apply copyright
and fair use restrictions to digitally available data. In the
instructional area, individual departments such as geology, chemistry,
and medicine are creating online curricula, which also provide access
to large collections of relevant data and documents. In addition to
drawing on collections developed and in use here at Columbia, the
Digital Library draws on the large and growing amount of material
available on the Internet. Both the Columbia Digital Library
production system and the Columbia Digital Library advanced research
prototype aim to provide access to a broad set of topic areas.
Access to Content
The broadly defined scope of the digital library creates new
challenges for organizing its contents, beyond those for managing and
providing access to traditional information media. The nature of
digital information presents both new opportunities and new
complexities for those devising schemes of access. Although reliable
access to this data is provided to all members of the University as
well as to outside researchers, the richness and diversity of
resources on ColumbiaWeb already push the limits of menuing systems
and keyword searching as access mechanisms. New technologies for
retrieval to improve on existing systems are a focus of the research
program for the research prototype, but these need to be effectively
integrated with efforts to structure and catalog digital information.
As components of the digital library prototype become available to the
user community, they will be incorporated into the collection testbed
and used to facilitate access. Members of Academic Information
Systems (AcIS) are already working on the production of the testbed,
assuring reliability, access, and connectivity across campus.
Collaborations to Increase Impact
Our collaboration with public and university libraries, along with
primary and secondary school settings, ensures that our work remains
at the forefront of participatory design. As part of its membership in
the National Digital Library Federation (which includes the New York
Public Library and the Library of Congress as well as several large
university research libraries), Columbia collaborates with other
institutions to reach a varied audience of library patrons who can
evaluate its systems. In addition, we plan to make the Columbia
Digital Library advanced prototype available to various intermediate
school settings with which we have established connections. In
particular, Columbia is involved in establishing Internet connections
to a variety of schools within Harlem and in providing expertise to
educators within these settings so that they can make full use of the
online connection. Through this infrastructure, which is being funded
at Teachers College, we will be able to easily provide access to the
Digital Library advanced prototype for controlled, educational
problem-solving tasks. These collaborations allow us to test our ideas
with actual users, using initial prototypes of the final system.
The Potential of the Digital Library
The Digital Library provides the framework in which to design and
implement the continued expansion and enrichment of online information
services for the Columbia community. Developments in digital library
technology -- locally, nationally, and internationally -- have
provided a focus for realizing the potential of digital information to
be used in service of the University's mission. Columbia University is
in the lead both in providing digital text, image, and sound to the
community, and in the development of forward-thinking technologies for
the advancement of the Digital Library.