This document discusses the potential for digital data in archaeology beyond just preserving existing practices. It outlines some visions, including optimizing current practices versus using data as an opportunity for new research methods and communication. Open Context is presented as taking the latter approach through open data sharing, linked data practices, and collaboration across projects. However, challenges to wider data sharing are also discussed, such as a lack of professional incentives and constraints of current academic evaluation systems. The document argues for treating data as an object of continued intellectual investment and innovation in order to fully realize its potential.
Beyond Preservation: Situating Archaeological Data in Professional Practice
1. Beyond Preservation:
Situating Archaeological Data in
Professional Practice
Eric C. Kansa (@ekansa)
UC Berkeley D-Lab
& Open Context
2014-2015 Harvard Center for
Hellenic Studies & German
Archaeological Institute Research
Fellow
Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0
License <http://creativecommons.org/licenses/by/3.0/>
2. Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital
Library
• Part of a broader reform
movement in scholarly
communications
3. Introduction
Visions for Digital
Data in Archaeology
1. “Optimizing the status quo”
2. Opportunity for fundamentally better
ways to conduct and communicate
research
4. Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual
investment
6. Data source: Arif Jinha (2010). Article 50 million: an estimate of the number of scholarly articles in
existence. Learned Publishing, 23 (3), 258-263. DOI: 10.1087/20100308.
Image Source: http://www.cs.cmu.edu/~comar/open-science/
7. Paper and paper-like
digital files (PDFs) do
not scale well:
● Discovery
● Reuse
Image Credit: Wikimedia Commons (Public Domain)
http://commons.wikimedia.org/wiki/File:Archives_entreprises.jpg
9. Lots of investment in
“Big Data”
● Corporate
● Government
● 'STEM' academia
12. Image Credit: 'gin soak' (CC-BY-NC-ND)
https://www.flickr.com/photos/gin_soak/2215398726
Structured Data – Creativity
1. New forms of communication
2. New forms of collaboration
3. New research opportunities
15. Text-mining literature to identify
references to ancient places
2010 (renewed 2012) Google Digital Humanities Awards: with
Elton Barker, Leif Isaksen, Kate Byrne, Nick Rabinowitz
17. Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual
investment
18. Commercial interests and
public policy
Conditions of
academic labor
Neoliberalism:
(Loosely associated ideologies /
assumptions / interests)
19. Source: The Occasional Pamphlet - Harvard University
(http://blogs.law.harvard.edu/pamphlet/2013/01/29/why-open-access-is-better-for-scholarly-societies/)
30. Need more carrots!
1. Citation, credit, intellectually
valued
2. Research outcomes (new
insights from data reuse!)
32. Adapt Academic Taylorism:
● DataCite (metadata, citation
for datasets)
● Altmetrics (social media,
view counts, download
counts, etc.)
Make data count!
34. Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual
investment
35. Data Sharing as Publication
• Started in 2007
• Open data (mainly CC-By)
• Archiving by California Digital
Library
• Part of a broader reform
movement in scholarly
communications
46. Digital Index of North American
Archaeology (DINAA)
1. Rich metadata (cultures,
chronology, site-types)
2. Reduced-precision location data
(site security, legal constraints)
3. Data modeling challenges (using
GeoJSON-LD, CIDOC-CRM,
event models)
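The reduced-precision idea in point 2 can be pictured with a small sketch. This is only an illustration and not DINAA's actual procedure: the grid size, the function and property names, and the sample coordinates are all assumptions made up for the example. It snaps a point to the center of a coarse grid cell and wraps the result in a plain GeoJSON Feature, so only the generalized location is ever published.

```python
import json
import math


def snap_to_grid(lon, lat, cell_deg=0.25):
    """Return the center of the grid cell that contains (lon, lat).

    cell_deg is a hypothetical cell size in degrees; coarser cells hide
    more of the true location.
    """
    snapped_lon = (math.floor(lon / cell_deg) + 0.5) * cell_deg
    snapped_lat = (math.floor(lat / cell_deg) + 0.5) * cell_deg
    return snapped_lon, snapped_lat


def site_feature(site_id, lon, lat, site_type, cell_deg=0.25):
    """Build a GeoJSON Feature that exposes only the generalized location."""
    g_lon, g_lat = snap_to_grid(lon, lat, cell_deg)
    return {
        "type": "Feature",
        "id": site_id,
        "geometry": {"type": "Point", "coordinates": [g_lon, g_lat]},
        "properties": {
            "site_type": site_type,
            "location_note": f"generalized to a {cell_deg} degree grid cell",
        },
    }


# Made-up record for illustration only.
feature = site_feature("example-site-1", -84.3871, 35.9312, "village")
print(json.dumps(feature, indent=2))
```

Every site inside the same cell gets the same published point, which is the intent: the record stays useful for regional analysis while the exact find spot stays with the curating office.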
48. Using site file
data to
examine the
impacts of sea
level rise
In 100 years, 19,676
sites will be covered!
49. Digital Index of North American
Archaeology (DINAA)
1. ~ 500,000 site records curated by
state officials
2. Key (Linked Data!) reference for N.
American archaeology
3. PIs/Co-PIs: David G. Anderson,
Joshua Wells, Eric Kansa, Sarah
Kansa, Stephen Yerka
50. Stable Web URI:
Reference this to disambiguate between
“Alexandria” (Egypt) and other places
called “Alexandria” (many of which are
also ancient)
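A minimal sketch of why the URI, rather than the name, does the disambiguating. The gazetteer entries and the example.org URIs below are hypothetical placeholders, not identifiers from Pleiades or any real gazetteer.

```python
# Hypothetical gazetteer: the example.org URIs and the region labels are
# placeholders invented for this illustration.
GAZETTEER = {
    "https://example.org/places/1": {"name": "Alexandria", "region": "Egypt"},
    "https://example.org/places/2": {"name": "Alexandria", "region": "another region"},
    "https://example.org/places/3": {"name": "Memphis", "region": "Egypt"},
}


def uris_for_name(name):
    """Return every URI whose record carries the given place name."""
    return sorted(uri for uri, rec in GAZETTEER.items() if rec["name"] == name)


def is_ambiguous(name):
    """A name is ambiguous when more than one place record shares it."""
    return len(uris_for_name(name)) > 1


print(is_ambiguous("Alexandria"))  # the name alone matches two records
print(GAZETTEER["https://example.org/places/1"]["region"])  # the URI matches one
```

A dataset that stores the URI alongside the place name can be linked to other collections without anyone having to guess which "Alexandria" was meant.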
51. Pelagios:
Heat map of museum collections,
archives, databases referencing places
in Pleiades
(PIs Leif Isaksen, Elton Barker)
52. Web of Data (2011)
Need Archaeology on the Map
Contributions should not be isolated
from other communities
54. Introduction
Digital Data in Archaeology
1. Why discuss data?
2. Data in (bad) institutional contexts
3. Open Context's approach
4. Need for more & wider intellectual
investment
55. “I just started using an Excel spreadsheet that
has sort of slowly gotten bigger and bigger
over time with more variables or columns…I've
added …color coding…I also use…a very sort of
primitive numerical coding system, again, that I
inherited from my research advisers…So, this
little book that goes with me of codes which is
sort of odd, but …we all know that a 14 is a
sheep.” (CCU13)
Need to do more than
“Optimize the Status Quo”
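The private codebook in the quote can be made explicit so that others can read the data. The sketch below is an assumption-laden illustration: only the pairing of 14 with sheep comes from the quote, while the second code, the field names, and the example.org vocabulary URIs are placeholders.

```python
# Hypothetical codebook: only 14 -> sheep comes from the quoted interview;
# the other entry and the vocabulary URIs are invented placeholders.
CODEBOOK = {
    14: {"label": "sheep", "uri": "https://example.org/taxa/sheep"},
    15: {"label": "goat", "uri": "https://example.org/taxa/goat"},
}


def decode(records, codebook=CODEBOOK):
    """Split records into decoded rows and rows whose code is undocumented."""
    decoded, unknown = [], []
    for rec in records:
        entry = codebook.get(rec["taxon_code"])
        if entry is None:
            # Keep undocumented codes visible instead of guessing a meaning.
            unknown.append(rec)
        else:
            decoded.append(
                {**rec, "taxon": entry["label"], "taxon_uri": entry["uri"]}
            )
    return decoded, unknown


rows = [
    {"bone_id": "b1", "taxon_code": 14},
    {"bone_id": "b2", "taxon_code": 99},
]
decoded, unknown = decode(rows)
print(decoded)
print(unknown)
```

Returning the undocumented rows separately is a deliberate choice: a code that lives only in someone's notebook is flagged for follow-up rather than silently dropped.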
58. Large scale data sharing &
integration for exploring the
origins of farming.
Funded by EOL / NEH
59. 1. 300,000 bone specimens
2. Complex: dozens, up to 110
descriptive fields
3. 34 contributors from 15
archaeological sites
4. More than 4 person-years of
effort to create the data!
60. 6500 BC (few pigs, mixing with wild animals?)
7500 BC (sheep + goat dominate, few pigs, few cattle)
7000 BC (many pigs, cattle)
8000 BC (cattle, pigs,
sheep + goats)
• Not a neat model of steady progress toward a more productive economy.
Adoption differed widely between regions and was sometimes piecemeal.
Arbuckle BS, Kansa SW, Kansa E, Orton D, Çakırlar C, et al. (2014) Data Sharing Reveals Complexity in
the Westward Spread of Domestic Animals across Neolithic Turkey. PLoS ONE 9(6): e99845.
doi:10.1371/journal.pone.0099845
61. Easy to Align
1. Animal taxonomy
2. Skeletal elements
3. Sex determinations
4. Side of the animal
5. Fusion (bone growth, up to a
point)
62. Hard to Align (poor modeling, recording)
1. Tooth wear (age)
2. Fusion data
3. Measurements
Despite common research methods!!
63. “Under the hood” exposure
and reuse attempts critical!
Fundamental method & theory
issues in data modeling!
64. Investing in Data is a Continual Need
1. Data and code co-evolve. New
visualizations, analysis may reveal unseen
problems in data.
2. Data and metadata change routinely
(revised stratigraphy requires ongoing
updates to data in this analysis)
3. Problems, interpretive issues in data (and
annotations) keep cropping up.
4. Is publishing a bad metaphor implying a
static product?
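One way to picture data that keeps changing is to give each release a fingerprint, much as software releases are tagged. The following is a sketch under stated assumptions, not a description of how Open Context versions data: the function name, the field names, and the sample records are hypothetical.

```python
import hashlib
import json


def dataset_fingerprint(records):
    """Return a SHA-256 hex digest that changes when any record changes.

    Each record is serialized with sorted keys, and the serialized rows are
    sorted, so neither key order nor row order affects the digest.
    """
    canonical = sorted(json.dumps(r, sort_keys=True) for r in records)
    digest = hashlib.sha256("\n".join(canonical).encode("utf-8"))
    return digest.hexdigest()


release_1 = [
    {"bone_id": "b1", "phase": "Level II"},
    {"bone_id": "b2", "phase": "Level III"},
]
# The same specimens after a made-up stratigraphic revision of one record.
release_2 = [
    {"bone_id": "b2", "phase": "Level III"},
    {"bone_id": "b1", "phase": "Level III"},
]

same = dataset_fingerprint(release_1) == dataset_fingerprint(release_2)
print(same)  # False
```

An analysis that records the fingerprint of the data it used can later tell whether a revised stratigraphy or a corrected identification has changed the ground under its results.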
66. Data sharing as publication
Data sharing as open source
release cycles?
68. Data sharing as publication
AND
Data sharing as open source
release cycles
69. Go beyond Optimization
of the Status Quo
More to data than 'compliance'
Data require intellectual investment,
methodological and theoretical
innovation.
New professional roles needed, but
who will pay for it?
70. Thank you!
Special Thanks!
Harvard Center for Hellenic
Studies & the German
Archaeological Institute (DAI)