Digital libraries
1. Libraries, digital libraries and
digital library research
Lorcan Dempsey
OCLC
Keynote presentation at
European Conference on Digital
Libraries 2004
University of Bath
September 12 – 17 2004
3. Holes
‘There was once a man who aspired to be the author of the
general theory of holes.
When asked “What kind of hole – holes dug by children in the
sand for amusement, holes dug by gardeners to plant lettuce
seedlings, tank traps, holes made by roadmakers?” he would
reply indignantly that he wished for a general theory that
would explain all of these.
This man’s achievement has
passed totally unnoticed except by me.’
4. Digital libraries and holes …
‘Digital library’ has no precise or agreed referent
Different communities of practice
Different incentives
• Serve
• Build
• Research
Compare ‘archive’
• Archival institution
• Archival materials
• OAI
• A promise of preservation?
5. [Diagram labels: Digital library research; Digital library; Digital libraries; Library]
6. [Diagram labels]
Digital library research: Anthropology/ethnography/social science; Computer science; Library and information science; Economics; Industrial R&D; HCI; Semantic web; Grid; W3C
Digital libraries: Artstor; Jorum; Amazon; Inst Rep; arXiv; Internet archive; BBC archive; Banks; ‘Business’; E-research; E-learning; Entertainment; Cultural heritage
Library: Libraries …
7. Emphasis: Library
[Diagram labels: Digital library research; Digital library; Digital libraries; Library]
9. Libraries
‘So why have I written
this? I can’t show it if it’s
going to contradict or
undermine my case.
There are a number of
reasons. First and
foremost, I am a
librarian. I live for
records and documents.’
10. A library as institution
Because the purpose and result of absorbing information
is always finally to produce further information, i.e., to
continue the conversation,
the function of the library must be understood as one
that assists members of the community both in taking
particular positions and in recognizing and assessing the
positions taken by others.
Ross Atkinson. Contingency and contradiction: The place(s) of the library at the dawn of the new millennium.
Journal of the American Society for Information Science and Technology, Volume 52, Issue 1, Pages 3-11. Published Online: 2001.
11. A library as institution
We often hear it said that libraries (and librarians) select,
organize, retrieve, and transmit information or knowledge. That
is true.
But those are the activities, not the mission, of the library.
… the important question is: "To what purpose?" We do not do
those things by and for themselves.
We do them in order to address an important and continuing
need of the society we seek to serve. In short, we do it to
support learning.
Robert Martin. Libraries and Learners in the Twenty-first Century.
http://www.imls.gov/scripts/text.cgi?/whatsnew/current/sp040503.htm
12. Libraries and digital libraries
Support research and learning.
Discover position of others and form one’s own position.
In order to uphold their mission and values…
… they must renovate their practices.
13. “Search engine mindshare”
John Regazzi
“In a survey for this lecture, librarians and scientists were asked to name the top scientific and medical search resources that they use or are aware of. The difference is startling.”
Scientists:
• Google
• Yahoo
• PubMed
Librarians:
• Science Direct
• ISI Web of Science
• MedLine
Source: John Regazzi,
The Battle for Mindshare: A battle beyond access and retrieval
http://www.nfais.org/publications/mc_lecture_2004.htm
14. Pattern recognition – libraries now
The ‘Amazoogle’ effect
Value opaque
User behavior
Uncertainty about digital directions
‘The future is here. It's just not evenly distributed yet’
William Gibson
15. The difficulty in creating a digital management strategy stems
in part from the bewildering convergence of technological
developments.
Developing a digital management strategy is further
complicated by the fact that there are no recognized patterns or
models for managing digital assets.
Some managers seek to develop fully distributed institutional
repositories but still must choose between open-source
solutions or commercial providers. Others prefer to place their
material in one of a limited number of dedicated storage
institutions. While best practices may exist for given technical
processes, library managers do not have a single paradigm to
use as the basis for developing operational plans and policies to
capture, store, index, preserve, and redistribute the intellectual
output in digital formats.
Managing Digital Assets, CLIR primer program, 2005
16. Impact of digital library research?
User studies
• How much do we know about changing patterns of research, learning and engagement?
Federation and metasearch
• FDI, IndexData, Cheshire, iPort, …
• OAI/OpenURL
• NISO metasearch – issues still to be addressed
Repositories/digital library systems
• Multiple communities
• DSpace, Fedora, CONTENTdm, DLXS, …
Metadata
• Growing acronymic density
• Collections, rights, policies, services, …
• Complex objects, relations
Identifiers/citation
Preservation
Local successes … but we have many open questions.
17. Collections grid
Axes: stewardship (high, low) and uniqueness (low, high)
Low uniqueness, high stewardship
• Books
• Journals
• Newspapers
• Gov. docs
• CD, DVD
• Maps
• Scores
Low uniqueness, low stewardship
• Freely-accessible web resources
High uniqueness, high stewardship: Special collections
• Archives
• Rare books
• Local history materials
• Archives & Manuscripts
• Theses & dissertations
High uniqueness, low stewardship: Research and learning materials
• ePrints/tech reports
• Learning objects
• Courseware
• E-portfolios
• Research data
• Untransferred records
18. Collections grid
[Grid; axis markers: high, low. Labels: Publishing; Amazoogle; disclosure; D2D; Reformatting; E-learning; E-research; Cultural heritage; Digital asset management]
19. [Diagram labels]
user environments: lab books; PDAs; campus portal; learning management systems; course material; exhibitions; text book; personal collections; reading lists
library / resource environment: Virtual reference; Institutional repository; Aggregations; Digital collections; Licensed collections; Catalog; E-reserve; Cataloging; ILL
21. Scope, scale, diversity
Systemic issues
• No single system is the sole focus of a user’s attention
• How do systems and services work across the four quadrants of the collections grid?
• How do they fit into wider enterprise systems?
Structure of costs does not reflect users’ value perception
• Reallocation of resources difficult
• Little substitution – ‘and’ not ‘or’
22. A new world
Co-evolution with research and learning behaviors which
are themselves changing
Unsure about appropriate “economy of presence”
• Place, network hub, channel, …
• Web services, portlets, channels, …
• Ambience, diffusion, ubiquity, recombinance, …
E.g. Trajectory of search
• Search system
• Search system, machine interface, metasearch
• Provide data, externalize search
• Google, OAI
23. Webulation …
Monolithic applications resistant to
• Webulation
• Service oriented architectures
Massive legacy investment in knowledge structure
unconnected to the web
• How to release its value in a network environment
Content does not easily flow into user space for
manipulation, packaging, aggregation
24. Vendor environment
Many libraries have outsourced development effort
Library vendors do not have large R&D budgets
Poor out-of-the-box support for ‘below-the-line’ materials in
digital form
Interesting tension between commodity (standards) and
added value
OSS environment very unsophisticated
Limited support for logistics/supply chain/integration
services
25. Limited application platforms
Consider
• Google
• Amazon
• E-bay
• MapQuest
Massively central applications platforms working in loosely coupled webby world
Software as a service
• APIs
• GMAIL
• Paypal
• search
Library world
• Fragmented systems and development effort
• Does not benefit from scale
• Unsustainable local development agendas
Organizational rearticulation difficult.
Application platforms?
• CDL
• JISC
• DEF
• OCLC/RLG
26. Architecture? Theory?
Do we need a big picture?
Allows the articulation of technical and business discussion?
An unnecessary constraint?
27. Without it we are susceptible to ….
Marchitecture
Techeology
Portal envy
Gratuitous acronym requests in RFPs
Beauty contests
• DSpace, Fedora, …
28. A history of consumption means that we are
unprepared for contribution
Standards
Open source software
Common services
Limited structures to capture contribution and support.
29. And finally ..
Libraries need to think about libraries not digital libraries
And they need help from wherever they can get it!
Editor's notes
Digital libraries – a wide range of services are digital-library-like. They involve selection, curation and disclosure of digital materials for particular audiences. Depending on your definition, some of these are in, some out. Wherever you draw the line there is significant activity. ‘Business’ – many organizations do digital-library-like activities to support their business needs. For example, think what will happen with historical collections of media materials in the ‘media’ business; collections of business documents (insurance, cheques) and so on in financial services companies; e-learning repositories; developing research collections. Digital library research reaches into many disciplines. Although there is a somewhat diffuse community of ‘digital library researchers’ in computer science, library and information science, and related fields, those who are building digital libraries are potentially interested in a wider research hinterland. This means that ‘digital library’ relates to a very diffuse set of interests.
I will focus on libraries!
Not clear how extensive the survey was or what the population was.
Amazoogle – from a policy and funding point of view libraries are increasingly working in an environment shaped by expectations created by Google and Amazon. The library has to create the value case in such an environment. User behaviors are changing in a network environment. Research and learning behaviors are co-evolving with general network activity. People create and consume information in new ways. There are no patterns for digital directions.
This may slightly overstate the case, but it is clear that we are some way from being able to routinely create viable digital information environments.
It is difficult to measure the impact of digital library research. It is clear that there have been local successes and one can point to certain outcomes which benefited from programmatic research funding. The ROI on user studies seems low. Many are tied to particular systems or services. Some commercial metasearch products have been assisted by being part of the EU technical research and development investment. OCLC Pica’s iPort grew out of the EU project Decomate. Fretwell Downing participated in EU and JISC funded activities which contributed to their current suite. IndexData also did nice work in several EU projects. Cheshire was assisted by NSF funding. OpenURL and OAI – Herbert Van de Sompel.
Increasingly the library needs to provide services into the user environment – it needs to be visible in course management systems, in university portals, and so on. Not everybody will come to the library or to the library portal.
“economy of presence” – a phrase of Bill Mitchell’s. Users have heterogeneous requirements. What is an effective network presence?
‘Below-the-line’ – i.e. below the line in the collections grid. These tend to be unique materials – special collections and research and learning materials (e-prints, data sets, courseware, …)
JISC has its Information Architecture and now the E-Learning Framework. These help us have conversations, create shared understanding, help us partition problems, and so on. The library community seems resistant to such shared architectures, which may be a good or a bad thing depending on your point of view.
Marchitecture – an architecture produced by a vendor for marketing purposes. May not be the best guide to the applications space. (Do a search on Google for more.)
Techeology – a mixture of technology and ideology. Discussion where ideological beliefs cloud technical discussion. I find this a useful word to describe quite a bit of the conversation one comes across.
Portal envy – we must have a portal, everybody else does.
Beauty contests – discussion starts with which of the commonly known repository frameworks one wants rather than with requirements etc.
Continued health of standards and OSS depends on intellectual and other contributions and sustaining frameworks. Mackenzie Smith spoke about difficulties with OSS at this conference. Neil McLean spoke about common services and the need for such infrastructural services. Again we are not sure how to secure and sustain these.