Linking data to publications: Towards the execution of papers
1. Linking data to publications: Towards the execution of papers
Anita de Waard
Elsevier Labs/UUtrecht
http://elsatglabs.com/labs/anita
2. Cycle of Scientific Investigation
[Diagram: a cycle connecting the Experimental Design Model, Observations, Interpretations, and the Domain-specific Reasoning Model. Steps on the cycle: design experiments, perform experiments, gather data, make observational assertions, make interpretational assertions, aggregate assertions, make predictions, formulate hypotheses.]
CoSI model by Gully Burns, ISI/USC
3. Cycle of Scientific Investigation
[Diagram: the same cycle, annotated with the research artifacts produced along it: Experimental Design, Experimental Objects, Observed Results, Processed Data/Statistics, Conclusions, Background, Hypotheses.]
CoSI model by Gully Burns, ISI/USC
4. Cycle of Scientific Investigation
[Diagram: the same cycle and research artifacts, with a Publication overlaid. Publication sections shown: Background, Hypotheses, Methods, Results, Figures, Conclusions.]
CoSI model by Gully Burns, ISI/USC
5. Cycle of Scientific Investigation
[Diagram: the same cycle, with the Publication sections (Background, Hypotheses, Methods, Results, Figures, Conclusions) repositioned around it next to the research artifacts: Experimental Design, Experimental Objects, Observed Results, Processed Data/Statistics.]
CoSI model by Gully Burns, ISI/USC
6. 1. Current practice: store data in repository, link from document, and vice versa
[Diagram: a Publication (Background, Hypotheses, Methods, Results, Figures, Conclusions) linked to a Workflow Repository (Experimental Design), a Data Repository (Observed Results), and a Statistics storage system (Processed Data/Statistics).]
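The two-way linking on this slide can be sketched as a small link table. This is only an illustration under stated assumptions: the class name `LinkIndex` and the identifier strings are hypothetical, and no real publisher or repository API is implied.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical in-memory index of document <-> dataset links."""

    def __init__(self):
        self._doc_to_data = defaultdict(set)
        self._data_to_doc = defaultdict(set)

    def link(self, doc_id, dataset_id):
        # Record the link in both directions so either side can resolve it.
        self._doc_to_data[doc_id].add(dataset_id)
        self._data_to_doc[dataset_id].add(doc_id)

    def datasets_for(self, doc_id):
        """Datasets linked from a document (document -> repository)."""
        return sorted(self._doc_to_data.get(doc_id, ()))

    def documents_for(self, dataset_id):
        """Documents that link to a dataset (repository -> document)."""
        return sorted(self._data_to_doc.get(dataset_id, ()))
```

Storing both directions at write time means the repository can show which papers use a dataset without asking the publisher, and the other way round.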
7. Current Practice: linking to documents
Least favorite: raw research data delivered as supplementary data
Much better: linking into/from data centres, e.g. Pangea:
11. Linking data and papers: ‘the publisher’s’ position:
STM’s “Brussels Declaration”, June 2006: “... believe that, as a general principle, data sets, raw data outputs of research, and sets or subsets of that data should wherever possible be made freely accessible ...”
• Publishers are (in general) not interested in owning or charging for research data repositories
• Publishers are very interested in linking to and from data, and want to work with data repositories to do this effectively
• Publishers believe in (and know) the concept of Digital Object Identifiers:
– Where possible: one repository for identifiers
– Persistent and unique (don’t keep same ID if content changes)
– Where possible, link back to the publication
Complete agreement with MacKenzie Smith’s “Requirements for Data Citation”!
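The three identifier requirements above (one registry, persistent and unique IDs, a link back to the publication) can be sketched as follows. This is a minimal illustration, not how DOIs are actually minted: the `IdentifierRegistry` class, the `example:` prefix, and the use of a SHA-256 digest to detect changed content are all assumptions made for the example.

```python
import hashlib


class IdentifierRegistry:
    """Hypothetical single registry for dataset identifiers."""

    def __init__(self, prefix="example"):
        self._prefix = prefix
        self._records = {}  # identifier -> (content digest, publication id)
        self._counter = 0

    def register(self, content: bytes, publication_id=None):
        digest = hashlib.sha256(content).hexdigest()
        # Unique: identical content maps to the identifier it already has.
        for identifier, (known_digest, _) in self._records.items():
            if known_digest == digest:
                return identifier
        # Persistent: changed content never reuses an existing identifier.
        self._counter += 1
        identifier = f"{self._prefix}:{self._counter:06d}"
        self._records[identifier] = (digest, publication_id)
        return identifier

    def publication_for(self, identifier):
        """Follow the link back from a dataset to its publication."""
        return self._records[identifier][1]
```

Registering a revised dataset returns a new identifier, so a citation of the old one keeps pointing at exactly the content that was cited.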
12. 2. Store data in repository, link within document.
[Diagram: a Publication (Background, Hypotheses, Methods, Results, Figures, Conclusions) linked to a Workflow Repository (Experimental Design), a Data Repository (Observed Results), and a Software Repository (Code/Statistics).]
13. Enabler at Elsevier - Linked Data: access any level of granularity of content
15. Enabler at Elsevier - Linked Data: access any level of granularity of content
Dublin Core and SKOS
16. Enabler at Elsevier - Linked Data: access any level of granularity of content
Dublin Core and SKOS
SWAN’s PAV (Provenance, Authoring and Versioning) ontology
17. Enabler at Elsevier - Linked Data: access any level of granularity of content
Dublin Core and SKOS
SWAN’s PAV (Provenance, Authoring and Versioning) ontology
1. Where the document region is completely described by an existing ID, use that ID to define the region.
Example: http://api.elsevier.com/content/article/DOI:10.1016/S0030-3992(02)00069-5#p0100 specifies a document region as the element with ID "p0100".
2. Where the document region can be completely described by an element within an ID'd element, navigate outwards to an ID that encloses the region, and use a relative XPath.
Example: #xpath-e(id('s0050')/ce:para[4]) specifies a document region as the fourth ce:para element within an element with ID "s0050".
3. Where the document region cannot be completely described by an element within the content, use the above locators combined with substrings.
Example: #xpath-e(substring(id('p0100'),10,20)) specifies a document region as being characters 10–20 in the element with ID "p0100".
4. Where the source content does not contain IDs, use absolute XPaths to navigate to the appropriate element, and use substrings as required.
Example: #xpath-e(article/body/ce:sections/ce:section[4]/ce:para[4]) points to a particular ce:para as defined by the given XPath. An example of an absolute XPath with substrings is left as an exercise for the reader.
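The four locator forms can be sketched as small string builders that reproduce the slide's own examples. The function names are hypothetical, and the sketch only assembles the fragment text; it does not resolve anything against a document. In standard XPath 1.0 the third argument of substring() is a length rather than an end position, so the two numbers are passed through unchanged and their interpretation is left as the slide states it.

```python
def id_locator(element_id):
    """Case 1: the region is exactly the element with this ID."""
    return f"#{element_id}"


def relative_xpath_locator(enclosing_id, relative_path):
    """Case 2: a relative XPath from the nearest enclosing ID'd element."""
    return f"#xpath-e(id('{enclosing_id}')/{relative_path})"


def substring_locator(element_id, first, second):
    """Case 3: a substring of the text of an ID'd element."""
    return f"#xpath-e(substring(id('{element_id}'),{first},{second}))"


def absolute_xpath_locator(path):
    """Case 4: no IDs in the source, so use an absolute XPath."""
    return f"#xpath-e({path})"


if __name__ == "__main__":
    print(id_locator("p0100"))
    print(relative_xpath_locator("s0050", "ce:para[4]"))
    print(substring_locator("p0100", 10, 20))
    print(absolute_xpath_locator("article/body/ce:sections/ce:section[4]/ce:para[4]"))
```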
18. Few (modest) examples of linking within document
Authors manually identify (and tag) entities for which associated data is in databases, like GenBank, Uniprot, PDB, etc.
Or: automatic entity identification and linking to relevant databases.
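Automatic entity identification and linking can be sketched, in its simplest form, as a dictionary lookup over the text. Everything specific here is an assumption for illustration: the lexicon entries, the accession strings, and the `example.org` resolver are placeholders, not real database records or URL schemes.

```python
import re

# Hypothetical lexicon: surface form -> (database, accession).
LEXICON = {
    "protein X": ("UniProt", "ACC0001"),
    "gene Y": ("GenBank", "ACC0002"),
}

# Placeholder resolver; a real system would use each database's own URL scheme.
RESOLVER = "https://example.org/{database}/{accession}"


def link_entities(text, lexicon=LEXICON):
    """Return (start, end, surface form, url) for each lexicon term found."""
    links = []
    for term, (database, accession) in lexicon.items():
        # Match whole words only, so "gene Y" does not match inside "gene Yz".
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text):
            url = RESOLVER.format(database=database, accession=accession)
            links.append((match.start(), match.end(), match.group(), url))
    return sorted(links)
```

The returned character offsets let a renderer wrap each mention in a link without altering the surrounding text.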
20. 3. The future being made today: let’s execute the paper!
[Diagram: a Publication (Background, Hypotheses, Methods, Results, Figures, Conclusions) linked to a Workflow Repository (Experimental Design), a Data Repository (Observed Results), and a Software Repository (Code/Statistics).]
21. 3. The future being made today: let’s execute the paper!
[Diagram: the Publication is removed, leaving the Workflow Repository (Experimental Design), the Data Repository (Observed Results), and the Software Repository (Code/Statistics).]
22. 3. The future being made today: let’s execute the paper!
[Diagram: the Research Process alongside the Workflow Repository, the Data Repository, and the Software Repository.]
31. 3. The future being made today: let’s execute the paper!
[Diagram: a Research Report (Background, Hypotheses, Experimental Design, Observed Results, Code/Statistics, Conclusions) alongside the Research Process and the Workflow Repository (Experimental Design), Data Repository (Observed Results), and Software Repository (Code/Statistics).]
Maintain context:
- Experimental
- Narrative
- Domain
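One way to read "execute the paper" is that each reported result carries links to the data and code behind it, so the result can be recomputed on demand. The sketch below illustrates that reading only: the two dictionaries stand in for a data repository and a software repository, and all identifiers and values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical stand-ins for a data repository and a software repository.
DATA_REPOSITORY: Dict[str, List[float]] = {"data:001": [2.0, 4.0, 6.0]}
SOFTWARE_REPOSITORY: Dict[str, Callable[[List[float]], float]] = {"code:mean": mean}


@dataclass
class ReportedResult:
    """A result in the paper, linked to the data and code that produced it."""
    claim: str
    dataset_id: str
    code_id: str
    reported_value: float


def execute(result: ReportedResult, tolerance: float = 1e-9) -> bool:
    """Re-run the linked code on the linked data and compare with the paper."""
    data = DATA_REPOSITORY[result.dataset_id]
    analysis = SOFTWARE_REPOSITORY[result.code_id]
    return abs(analysis(data) - result.reported_value) <= tolerance
```

The report holds only identifiers; the data and code stay in their repositories and are fetched when the result is executed.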
32. 3. Even better: why move anything anywhere??
[Diagram: a Research Report (Background, Hypotheses, Experimental Design, Observed Results, Code/Statistics, Conclusions) alongside the Research Process, the Workflow Repository, the Data Repository, and the Software Repository.]
33. 3. Even better: why move anything anywhere??
[Diagram: the same Research Report, with Experimental Design, Observed Results, and Code/Statistics also shown in the Research Process and in the Workflow Repository, Data Repository, and Software Repository respectively.]
35. 3. Science in the cloud
Proposal: Store research plan, results, thoughts, observations, etc. locally/in the cloud in a system that adds metadata.
Advantages to the scientist: Always keep track of your own data! Maintain copyright and access privileges.
Proposal: Allow access to the data, workflow etc. to the data repository, who
1. validates quality (content and form)
2. assigns a UID
3. advertises its existence
Advantages to the scientist: Data is vetted, identified, and advertised. If scientist/funding body wants: data repository controls access rights, data repository maintains archive.
Proposal: Allow access to the collected thoughts (with links to data) to the publisher, who
1. validates quality (content and form)
2. assigns a UID
3. advertises its existence
Advantages to the scientist: Content vetted, identified, and advertised. If scientist/funding body wants: publisher/library controls access rights, publisher/library maintains archive.
Proposal: Others - perhaps publishers, perhaps data repositories, perhaps (egad!) software developers - build tools, to place thoughts and data into context.
Advantages to the scientist: Better software! Better links to everything else we do.
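The three repository steps in the proposal (validate, assign a UID, advertise) can be sketched as a single deposit function. This is an assumption-laden illustration: the `Repository` class, its field names, and the very simple validation rule are hypothetical, and the UID is just a random UUID with the repository name in front.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Repository:
    """Hypothetical repository (data repository or publisher) from the proposal."""
    name: str
    holdings: Dict[str, dict] = field(default_factory=dict)
    catalogue: List[str] = field(default_factory=list)

    def deposit(self, item: dict, owner: str, restricted: bool = False) -> Optional[str]:
        # 1. Validate quality (content and form); here only that both parts exist.
        if not item.get("content") or "metadata" not in item:
            return None
        # 2. Assign a UID.
        uid = f"{self.name}:{uuid.uuid4()}"
        # The scientist stays the owner; access is restricted only if requested.
        self.holdings[uid] = {**item, "owner": owner, "restricted": restricted}
        # 3. Advertise its existence.
        self.catalogue.append(uid)
        return uid
```

The same function serves both the data repository and the publisher in the proposal, since both perform the same three steps on different kinds of content.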
52. In summary:
• Publishers are in general not interested in owning or charging for research data repositories (Brussels Declaration)
• Publishers are very interested in linking to and from data, and want to work with data repositories to do this effectively
• Publishers believe in Digital Object Identifiers
• Publishers embrace open standards and interoperability, and are adapting their infrastructure to be future-compliant:
– In particular, we think scientists should keep (track of) their work
– We also think novel information architectures work for science, including Linked Data, the concept of app servers, and the cloud
• Publishers believe in a future that stores and shares science in a better and more productive way, and in inventing it together: FoRCE11: The Future of Research Communications and eScience