16. In 2009, 116 articles cited ORNL DAAC data.
Finding these articles took 70-80 hours
across at least 12 resources
all chosen from a deep understanding
of this specific research domain
then the full text of all the hits was
manually reviewed
Valerie Enriquez interview with James Kidder
http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data
17. How to identify Dataset Reuse in the published literature

Dataset has an identifier? (DOI, url, accession #)

With dataset unique ID, search in reference sections of all papers
This citation pattern (dataset DOI/ID in references section) is used almost exclusively for dataset reuse.
This citation pattern is currently rare.
This citation pattern is difficult to track with existing tool limitations.
Manual disambiguation not required: can be automated pending API support.
Does not require access to full text.
DOI/ID reference search possible in full-text portals like PubMed Central and HighWire Press, however portal coverage is limited and search is not restricted to references section.
DOI/ID search works in Google Scholar, but scope is poorly defined, results are messy.
DOI/ID search not supported by ISI Web of Science or Scopus.

With dataset unique ID, search in full text of all papers, sort hits to disambiguate reuse from submission
This citation pattern (accession numbers in full text) is very common in some subdisciplines, so probably finds most reuses.
IDs are difficult to unambiguously identify in full text unless they have a unique pattern (DOI) or unusual prefix or suffix.
Requires ability to query identifiers in full text across all literature that may contain reuse.
Disambiguation is time consuming.
Requires access to full text of search hits for sorting.

Publicly archived dataset submission record has submitter name or dataset title?

With (submitter surname AND repository name), and also (dataset title AND repository name), search in full text of all papers, sort hits to disambiguate reuse from submission
Names and titles are messy.

Data collection article publication?

With (first author surname AND repository name), sort hits to disambiguate dataset submission record mentions
Link to data collection paper often missing from dataset submission record, especially when dataset submission predates article publication.
Requires access to full text of search hits for sorting.

With data collection article's journal, volume, page, etc., gather papers that cite the data collection paper, sort hits to disambiguate data reuse from other citation contexts
This citation pattern (citation to data creation paper) is very common in some subdisciplines, so probably finds most reuses.
Disambiguation is time consuming: most citations are not in the context of reuse.
Citation history export is time consuming: automation not supported.
Only finds citations indexed by citation databases.

This flow still misses attributions embedded in supplementary information, reuses attributed through a query description, etc.
Heather Piwowar, v1.0, CC-BY
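
The two cheapest searches in the flow above (a dataset DOI in the references section, and the name-plus-repository full-text queries) can be sketched in a few lines of Python. This is a minimal illustration, not the method used for the slides: the function names, the DOI regular expression, the quoting style of the boolean queries, and the example DOI and reference entries are all assumptions made up for this sketch, and real portals each have their own query syntax.

```python
import re

# Assumption: a widely used approximation of the modern DOI shape
# ("10." + registrant code + "/" + suffix). It is not an official grammar.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def normalize_doi(doi):
    """Lowercase a DOI and strip resolver prefixes and trailing punctuation."""
    doi = doi.strip().lower()
    doi = re.sub(r'^(https?://(dx\.)?doi\.org/|doi:)', '', doi)
    return doi.rstrip('.,;)')


def references_citing_dataset(reference_entries, dataset_doi):
    """Return the reference-list entries that contain the dataset DOI.

    Because a DOI has a unique pattern, no manual disambiguation is needed.
    """
    target = normalize_doi(dataset_doi)
    hits = []
    for entry in reference_entries:
        found = {normalize_doi(match) for match in DOI_PATTERN.findall(entry)}
        if target in found:
            hits.append(entry)
    return hits


def fallback_queries(submitter_surname, dataset_title, repository_name):
    """Build the two boolean full-text queries used when no ID is cited.

    Hypothetical query syntax: quoted phrases joined by AND.
    Hits from these queries still need manual sorting.
    """
    return [
        f'"{submitter_surname}" AND "{repository_name}"',
        f'"{dataset_title}" AND "{repository_name}"',
    ]


if __name__ == "__main__":
    # Made-up reference entries and a made-up dataset DOI, for illustration only.
    references = [
        "Smith J (2010) Data from: Example study. Dryad. doi:10.5061/dryad.1234.",
        "Jones A (2009) Unrelated article. J Ex 1:1-10. doi:10.1000/xyz123",
        "Lee K (2011) Reanalysis. http://dx.doi.org/10.5061/DRYAD.1234",
    ]
    for entry in references_citing_dataset(references, "doi:10.5061/dryad.1234"):
        print(entry)
    for query in fallback_queries("Smith", "Example study", "Dryad"):
        print(query)
```

The DOI match is exact after normalization, which is why this branch can be automated, while the name and title queries only produce candidates that still have to be read and sorted by hand.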
31. I post my data, code, and statistical scripts:
http://researchremix.org
Share yours too!
-> Open Notebook Science
http://www.flickr.com/photos/myklroventine/892446624/
33. thank you
Todd Vision,
Estephanie Sta Maria
Jonathan Carlson
Dryad and DataONE teams
The open science online community and those who
release their articles, datasets and photos openly