We Have Interesting Problems: Some Applied Grand Challenges from Digital Libraries, Archives and Museums

WE HAVE INTERESTING PROBLEMS: SOME APPLIED GRAND CHALLENGES FROM DIGITAL LIBRARIES, ARCHIVES AND MUSEUMS

TALK ROADMAP
- Context on where I’m coming from
- The New ABCs of Research as Framework
- Examples from IMLS National Digital Platform Projects
- Examples of initiatives from LC Labs
- Some Jumping off Points and Applied Grand Challenges

CONTEXT ON WHERE I’M COMING FROM

https://lj.libraryjournal.com/2017/12/people/qa-with-trevor-owens-lc-head-of-digital-content-management/#_

The Theory and Craft of Digital Preservation
https://osf.io/preprints/lissa/5cpjt/

THE NEW ABCS OF RESEARCH AS A FRAMEWORK

EXAMPLES FROM IMLS NATIONAL DIGITAL PLATFORM PROJECTS

NDP@3 REPORT DETAILS RESULTS AND TRENDS

Extending Intelligent Computational Image Analysis for Archival Discovery (LG-71-16-0152-16), Board of Regents of the University of Nebraska, $462,317
The Image Analysis for Archival Discovery (Aida) research team at the
University of Nebraska-Lincoln will investigate the use of
image analysis as a methodology for content identification,
description, and information retrieval in digital libraries and
other digitized collections. The project will focus on identifying
poetic and advertising content in digitized historic
newspapers. Using a machine learning approach, the project
will result in an intelligent computational system that can
process digital images and identify these specific types of
content.
https://www.imls.gov/grants/awarded/LG-71-16-0152-16

Improving Access to Time-Based Media through Crowdsourcing and Machine Learning (LG-71-15-0208-15), WGBH Educational Foundation, $898,474
WGBH, in partnership with Pop-Up Archive, will address the challenges
faced by many libraries and archives trying to provide better
online access to their media collections. This 30-month
research project will explore and test technological and social
approaches for metadata creation by leveraging scalable
computation and engaging the public to improve access
through crowdsourcing games for time-based media.
https://www.imls.gov/grants/awarded/LG-71-15-0208-15

Systems Interoperability and Collaborative Development for Web Archiving: $353,221 and $98,460 in cost share
The Internet Archive, with the University of North Texas,
Rutgers University, and Stanford University Library, will build
a foundation for collaborative technology development,
improved systems interoperability, and an Application
Programming Interface (API) based model for enhanced
access to, and research use of, web archives. In working with
the Archive-It platform, used by more than 350 partner
institutions, results of this research will be directly applicable
to libraries, archives, and museums around the country and
the world.
https://www.imls.gov/grants/awarded/LG-71-15-0174-15

Transforming Libraries and Archives through Crowdsourcing (LG-71-16-0028-16), Adler Planetarium, $1,214,780
This research partnership between Adler
Planetarium’s Library and researchers at Oxford University
will expand the capacity for libraries and archives across the
country to use crowdsourcing techniques to engage with
audiences and improve access to digital collections. Through
this effort, the team will develop a series of library/archive
Zooniverse projects that explore improvements to full text
and audio transcription and image annotation crowdsourcing
tools and research differences between transcribing in
isolation versus with knowledge of others’ transcription.
Lessons learned from these projects will be incorporated into
the Project Builder.
https://www.imls.gov/grants/awarded/LG-71-16-0028-16

Programmatic Extraction of “Documents” from Web Archives (LG-71-17-0202-17), University of North Texas, $318,988
The University of North Texas Libraries and the
Computer Science and Engineering Department will research
the efficacy of using machine-learning algorithms to identify
and extract publications contained in web archives. The
overarching goal of this project is to understand if machine-learning
models can successfully identify content-rich PDF
and Word documents from web archives that align with
library and archives collecting plans.
https://www.imls.gov/grants/awarded/LG-71-17-0202-17

EXAMPLES OF INITIATIVES FROM LC LABS

LC LABS RESOURCES
Jer Thorp, Innovator in Residence
• Overview https://labs.loc.gov/experiments/innovator-in-residence-jer-thorp/
• Research materials https://osf.io/b7e6w/
• Code https://github.com/blprnt/loc
• Podcast https://artistinthearchive.podbean.com/
Laura Wrubel’s Library of Congress Colors
• Application https://loc-colors.glitch.me/
• Code https://github.com/lwrubel/loc-colors
• Blog post https://blogs.loc.gov/thesignal/2018/01/from-code-to-colors-working-with-the-loc-gov-json-api/
Tahir Hemphill, Papamarkou Chair in Education at the John W. Kluge Center
• About Hip Hop Word Count https://www.newyorker.com/magazine/2013/04/01/rap-sheet-2
• Studio https://www.tahirhemphill.com/
• Past chairs https://www.loc.gov/loc/kluge/fellowships/hpeducation.html
Reports
• Gallinger, M. & Chudnov, D. Recommendations for a Digital Scholarship Lab at the Library of Congress
• Herron, S. Digital Scholarship Resource Guide.
• Access the reports https://labs.loc.gov/meta/reports/

SOME JUMPING OFF POINTS AND APPLIED GRAND CHALLENGES

SOME ESSENTIAL APPLIED RESEARCH AREAS
- How can various new technologies be implemented to
scale the ability to acquire, describe, organize and make
available digital collections?
- How can we best integrate various automated methods for
working with digital collections with the work of subject
catalogers/subject matter experts?
- In what ways can we best connect and build relationships
with various user communities through crowdsourcing
initiatives?
- What do all of these technologies look like in ongoing
production workflows?

SOME MORE SPECIFIC EXAMPLES
- Working models for content addressable storage in digital
repository storage architectures
- Reconciling data warehousing approaches with library
approaches to content and metadata management
- Weaving together structured cataloging workflows with
metadata generating mechanisms (crowdsourcing, NLP,
Computer Vision, etc.)
- Virtual machines as general-purpose, policy-based
restricted-access infrastructure
- Enabling data mining and computational scholarship on
arbitrary restricted access collections
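
To make the first example above concrete, here is a minimal sketch of content-addressable storage in Python. It is an illustration only, not a model proposed in the talk: the class name ContentAddressableStore, the in-memory dictionary, and the choice of SHA-256 are all assumptions made for the example. The idea it shows is that each object is stored under the hash of its own bytes, so identical content is stored once and every read can double as a fixity check.

```python
import hashlib


class ContentAddressableStore:
    """Hypothetical toy store: each object is keyed by the SHA-256 of its bytes."""

    def __init__(self):
        # In-memory stand-in for repository storage (an assumption for this sketch).
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store data and return its content address (hex SHA-256 digest)."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical bytes always hash to the same key, so duplicates are stored once.
        self._objects.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Return the bytes stored at digest, re-hashing them as a fixity check."""
        data = self._objects[digest]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored bytes no longer match their content address")
        return data

    def __len__(self) -> int:
        return len(self._objects)


# Usage: two deposits of the same bytes resolve to one stored object.
store = ContentAddressableStore()
key_a = store.put(b"page scan 001")
key_b = store.put(b"page scan 001")
key_c = store.put(b"page scan 002")
```

A production repository would presumably persist objects to disk or object storage and keep descriptive metadata in a separate layer that points at these addresses; how to do that well is the open question the slide raises.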
