#LAWDI Open Context, publishing linked data in archaeology
1. A Publication Approach to
Linked Data in Archaeology
Eric C. Kansa
UC Berkeley / OpenContext.org
Unless otherwise indicated, this work is licensed under a Creative Commons Attribution
3.0 License <http://creativecommons.org/licenses/by/3.0/>
2. • Started in 2007
• Open access / open data
publishing for archaeology
• Archiving by California
Digital Library
• Referenced by NSF and
NEH for grant data
management
5. Data Sharing as Publication
• Several projects studying
editorial + publishing
workflows
• Current Funding: ACLS,
NEH, Sloan, EOL
8. Web of Data
Cross-discipline Connections
Open Context links with
humanities data (CIDOC,
Pleiades, British Museum), and
natural sciences (EOL, UBERON)
11. EOL Computable Data
Challenge
1. 15 different sites
2. 34 zooarchaeologists
3. Publishing: decoding, cleanup,
metadata documentation
4. Linked Data annotation (EOL,
UBERON, biometrics)
5. Collaborative analysis
6. Reuse itself studied by
DIPIR.org (U. Michigan
ISchool)
12. Data Publishing
Google / Open Refine
1. Check consistency
2. Edit functions
3. All changes logged, can be
rolled back
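The workflow in this list (check consistency, edit, keep a reversible log) can be sketched in a few lines. This is not OpenRefine's API: the `EditLog` class, its method names, and the sample taxon values are all hypothetical and only illustrate the idea of logged edits that can be rolled back.

```python
from collections import Counter


class EditLog:
    """Hypothetical illustration: one column of values plus a reversible change log."""

    def __init__(self, values):
        self.values = list(values)
        self.history = []  # each entry: (row index, old value, new value)

    def value_counts(self):
        # Consistency check: count distinct spellings so variants stand out.
        return Counter(self.values)

    def replace_all(self, old, new):
        # Edit function: rewrite every matching cell and log each change.
        changed = 0
        for i, value in enumerate(self.values):
            if value == old:
                self.history.append((i, old, new))
                self.values[i] = new
                changed += 1
        return changed

    def rollback(self, steps=1):
        # Undo the most recent logged changes, newest first.
        for _ in range(min(steps, len(self.history))):
            i, old, _new = self.history.pop()
            self.values[i] = old


# Invented sample data, for illustration only.
log = EditLog(["Ovis aries", "ovis aries", "Ovis aries", "Bos taurus"])
print(log.value_counts())                    # the lowercase variant shows up separately
log.replace_all("ovis aries", "Ovis aries")  # normalize the variant
log.rollback()                               # restore the original spelling
```

Storing the old value with every change is what makes the rollback possible; a cleanup tool that overwrote cells without a log could not restore the contributor's original data.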
14. Bibliography
• Bibliographic references
expressed as Linked Data
(modeled after S. Heath)
• Associates publication
citation with Open Access
variants
22. Why UBERON?
1. Expresses relevant expert knowledge,
tremendous effort. Why ignore or
duplicate this effort?
2. Anatomic entities related to
embryology, genetic networks. New
research opportunities for zooarch?
3. Zooarchaeology gains stakeholders
(biometric data of wide interest)
35. DIPIR: Data Documentation Practices
“I use an Excel spreadsheet…which I … inherited from my research
advisers. …my dissertation advisor was still recording data for each
specimen on paper when I was in graduate school so that's what I
started …then quickly, I was like, ‘This is ridiculous.’ … I just started
using an Excel spreadsheet that has sort of slowly gotten bigger and
bigger over time with more variables or columns…I've added …color
coding…I also use…a very sort of primitive numerical coding system,
again, that I inherited from my research advisers…So, this little book
that goes with me of codes which is sort of odd, but …we all know
that a 14 is a sheep.” (CCU13)
A long way to go before we
get usable, intelligible data
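The private coding system in the quote shows why decoding is a publishing step in its own right. A minimal sketch follows, assuming a codebook is available. Only the "14 is a sheep" mapping comes from the quote; the field names, the `decode` function, and the `example.org` URI are hypothetical placeholders, not real EOL identifiers.

```python
# Hypothetical codebook: only "14 = sheep" is taken from the quote above.
CODEBOOK = {
    14: {"label": "sheep", "uri": "http://example.org/taxon/sheep"},  # placeholder URI
}


def decode(records, codebook):
    """Split records into decoded rows and rows whose code is undocumented."""
    decoded, unknown = [], []
    for rec in records:
        entry = codebook.get(rec["taxon_code"])
        if entry is None:
            # An undocumented code needs a question to the data author, not a guess.
            unknown.append(rec)
        else:
            decoded.append(
                {**rec, "taxon_label": entry["label"], "taxon_uri": entry["uri"]}
            )
    return decoded, unknown


# Invented sample rows, for illustration only.
rows = [
    {"specimen": "A1", "taxon_code": 14},
    {"specimen": "A2", "taxon_code": 99},
]
decoded, unknown = decode(rows, CODEBOOK)
print(decoded)
print(unknown)
```

Keeping the undocumented codes in a separate list, rather than dropping or guessing them, mirrors the editorial step of going back to the contributor for documentation.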
37. SPARQL endpoint easy to break (too big of a graph
to query).
Needed a work-around, so I also use the normal
(“plain web”) index to query the British Museum.
38. (1) Keyword search for relevant term.
(2) Scrape results (blech!) for item identifiers
(“objectid” parameter in URLs).
(3) Use ObjectIDs in SPARQL queries (limits size of
graph queried, so server doesn’t die).
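The three-step workaround above can be sketched as follows. This is an illustration under stated assumptions, not the code used for Open Context: the sample HTML, the function names, and the `example.org` URI template are hypothetical, and the sketch assumes an endpoint that supports the SPARQL 1.1 `VALUES` clause. Only the “objectid” URL parameter comes from the slide.

```python
import re
from html import unescape
from urllib.parse import parse_qs, urlparse


def extract_object_ids(html):
    """Step 2: collect unique 'objectid' query parameters from result links."""
    ids = []
    for href in re.findall(r'href="([^"]+)"', html):
        query = urlparse(unescape(href)).query
        for oid in parse_qs(query).get("objectid", []):
            if oid not in ids:
                ids.append(oid)
    return ids


def build_query(object_ids, uri_template="http://example.org/object/{}"):
    """Step 3: restrict the query to known objects so the graph searched stays small.

    The URI template is a placeholder; the real object URI pattern is not given
    in the slides.
    """
    values = " ".join("<" + uri_template.format(oid) + ">" for oid in object_ids)
    return (
        "SELECT ?obj ?p ?o WHERE {\n"
        f"  VALUES ?obj {{ {values} }}\n"
        "  ?obj ?p ?o .\n"
        "}"
    )


# Invented search-result markup standing in for step 1's keyword search output.
sample_html = (
    '<a href="/item?objectid=101&amp;page=1">First</a>'
    '<a href="/item?objectid=202">Second</a>'
    '<a href="/item?objectid=101">First again</a>'
)
ids = extract_object_ids(sample_html)
print(build_query(ids))
```

Binding `?obj` to a short list of identifiers is what keeps the query bounded. The open-ended pattern `?obj ?p ?o` on its own would ask the server to walk the whole graph.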
40. Why is linked
data important?
1. Improve data quality, expert
curation of concepts +
vocabularies
2. Develop ties with other
research communities (can
feedback to collect new /
different data)
3. Increasingly sophisticated
open source tools, support
services
4. Part of the Web, not just on
the Web
41. … but
participating
in Linked Data
requires
effort!
44. Data are challenging
1. “Raw data” often problematic,
even with documentation (10X
effort needed with decoded data)
2. Tension between modeling needs
and familiarity with tools (Excel)
3. More work needed modeling
research methods (esp. sampling,
see DIPIR.org outcomes)
4. You’re never going to be done!