The end of the scientific paper as we know it (or not...)

The end of
the scientific paper
as we know it
(in 4 easy steps)
Frank van Harmelen
(+ Paul Groth)
VU Amsterdam

Reports on
the death of
the scientific paper
have been greatly
exaggerated
Frank van Harmelen
(+ Paul Groth)
VU Amsterdam
And how the Semantic Web
makes it possible

Semsci 2017 workshop
• 7/10 papers about data
• 3/10 are about papers
and they are about papers written by and for people
Thanks (in order of appearance) to:
• Paul Groth
• Tobias Kuhn
• Jan Velterop
• Barend Mons
• Anita de Waard
• Carole Goble

Scientific publishing hasn’t changed
in 350 years
• Letter from Christian Huygens (1652)
• Writing to his prof in Mathematics
• Citing (and complaining about)
work of Descartes
• One of 3000 letters by Huygens

2017: Only superficial changes
• Different format & style
• Different medium
(Web, PDF)
• Different speed
(PubMed = 2 papers/min)

Section 1: Related work
Section 2: Research question
Section 3: Experimental design
Section 4: Experimental findings
Section 5: Interpretation, conclusions
And our papers still follow
this storyline:
Step 1: Study & interpret literature
Step 2: Formulate hypothesis
Step 3: Design experiment
Step 4: Execute experiment
Step 5: Publish results
This storyline is important,
but readable only by people,
not by machines

How to make our papers more usable?
“We only need information extraction
because we first did information burial” (Barend Mons)
“A journal paper
is a state-funeral
for your results”
(Hans Akkermans)

Step 1: explicit rhetorical structure
Capture the roles of blocks of text &
make these roles explicit
1 paper = 1 Network of blocks
N papers = 1 Network of blocks
[Figure: two papers side by side ("One paper", "Another paper"), each drawn as a network of blocks: Problem, Method, Results, Interpretations, Conclusions]

Step 2: explicit fine-grained
rhetorical structure
Locate individual knowledge items
and their relationships
Example: ScholOnto, ClaiMaker [Buckingham Shum]
Paper = set of claims
Claim = text – relation – text
Relation = causes, predicts, prevents; addresses, solves
equals, is-similar-to; proves, supports, challenges
1 paper = 1 fine-grained network of relations
N papers = 1 fine-grained network of relations
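
As a rough illustration of "Claim = text – relation – text", the sketch below stores claims from several papers and merges them into one network. It is not the ScholOnto or ClaiMaker data model: the class name `Claim`, the function `build_network`, the `paper` field, and all example texts are hypothetical placeholders.

```python
from dataclasses import dataclass

# Relation vocabulary taken from the list on this slide.
RELATIONS = {
    "causes", "predicts", "prevents", "addresses", "solves",
    "equals", "is-similar-to", "proves", "supports", "challenges",
}


@dataclass(frozen=True)
class Claim:
    """One claim: text - relation - text (plus a hypothetical paper id)."""
    source: str
    relation: str
    target: str
    paper: str

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")


def build_network(claims):
    """Merge claims into one adjacency map: text -> [(relation, text, paper)]."""
    network = {}
    for claim in claims:
        edge = (claim.relation, claim.target, claim.paper)
        network.setdefault(claim.source, []).append(edge)
        network.setdefault(claim.target, [])  # make sure the target is a node too
    return network


# Placeholder claims from two hypothetical papers about the same hypothesis.
claims = [
    Claim("result R1", "supports", "hypothesis H", "paper-1"),
    Claim("result R2", "challenges", "hypothesis H", "paper-2"),
]
network = build_network(claims)
```

Because both papers refer to the same text node ("hypothesis H"), the two per-paper networks collapse into a single network, which is the point of "N papers = 1 fine-grained network of relations".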

Step 3: do away with the paper altogether.
• Any fact is a relation between two things (“triple”)
• Count each fact as a nano-publication
• Together, these nano-publications form a
huge very fine-grained network of relations,
a web of knowledge,
a “semantic web”
• Computers as colleagues,
not (only) tools
Just publish the facts

What is a Nanopublication
“A nanopublication is the smallest unit of
publishable information: an assertion about
anything that can be uniquely identified and
attributed to its author”
http://nanopub.org
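
A minimal sketch of the definition above: one assertion, an author it is attributed to, and a unique identifier. Deriving the identifier from a content hash is an assumption made only for this illustration, not the identifier scheme prescribed by nanopub.org. The class name and all `example.org` URIs are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Nanopublication:
    """One assertion (a triple) attributed to one author."""
    subject: str
    predicate: str
    obj: str
    author: str

    def identifier(self) -> str:
        # Assumption for this sketch: identify the nanopublication by a
        # hash of its content, so equal content yields an equal identifier.
        payload = "\n".join((self.subject, self.predicate, self.obj, self.author))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Hypothetical URIs, used only to show the shape of the data.
np1 = Nanopublication(
    subject="http://example.org/thing/A",
    predicate="http://example.org/vocab/relatedTo",
    obj="http://example.org/thing/B",
    author="http://example.org/person/alice",
)
np2 = Nanopublication(np1.subject, np1.predicate, np1.obj,
                      author="http://example.org/person/bob")
```

The same assertion made by two different authors yields two distinct nanopublications, which keeps attribution part of the unit of publication.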

Step 4: turning context into a
1st class citizen
• Link to all the stuff that goes on before publication:
– Datasets, workflows
– Open Lab books
– Open peer reviewing
• Link to all the stuff that goes on after publication:
– Websites
– Blogs
– Emails
– Tweets

– Give web-addresses to objects (URIs)
– Use the web to link between the objects
– Provide meaning in a form that computers can handle (RDF)
These principles embodied
in already deployed technology
We can build this using
semantic web technology
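
The three principles can be sketched in a few lines: objects get web addresses, a link between two objects is a triple, and the triple is written in a form computers can handle. The sketch below writes triples in N-Triples, one of the standard RDF serialisations, using only the Python standard library. The helper names (`iri`, `literal`, `ntriple`) and all `example.org` URIs are hypothetical; a real system would more likely use an RDF library.

```python
def iri(value: str) -> str:
    """Wrap a web address in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str) -> str:
    """Quote a plain string literal, escaping the characters N-Triples reserves."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Serialise one subject-predicate-object statement as an N-Triples line."""
    return f"{subject} {predicate} {obj} ."


# Hypothetical link from a dataset to the paper that used it.
line = ntriple(
    iri("http://example.org/dataset/1"),
    iri("http://example.org/vocab/usedBy"),
    iri("http://example.org/paper/1"),
)
print(line)
```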

So now we have…
No longer a set of
disconnected monolithic PDFs
A network of facts, reviews,
evidence, opinions, data

The story so far…
• Publishing hasn't changed for 300+ years
• The structure and format of our papers
is still based on this
• Deconstruct the scientific paper
– from monolithic block of text
– to a network of computer readable facts & context
• All of this made possible by the semantic web

Pragmatic infeasibility
Previous experiments in formalising (social) science
turned out to be very hard:
• Hannan and Freeman's theory of organizational inertia
in first-order logic
American Sociological Review 59(4):571-593, August 1994
• Carroll & Hannan’s resource partitioning theory
in first-order logic
Computational & Mathematical Organization Theory 7, 87–111, 2001.

Pragmatic (in)feasibility
Many sciences are quantitative,
but I guess this is still possible in RDF + MathML:

Pragmatic infeasibility
Science is a social activity, which includes persuasion,
rhetoric, deliberate ambiguity, etc.

CACM, Vol. 22, No. 5, May 1979
“A proof doesn't settle a mathematical argument.
Contrary to what its name suggests,
a proof is only one step in the direction of confidence.
We believe that, in the end,
it is a social process that determines whether
mathematicians feel confident about a theorem.”

Thomas, J., The Axiom of Choice, North-Holland, Amsterdam, 1973
(a historical review of independence results in set theory)

Technical infeasibility: Scalability
#statements/year =
#statements/nanopub × #nanopubs/paper × #papers/year
= 30 × N × 1.5M = N × 45M/yr
Let’s hope N ≈ O(10)…
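
The estimate above can be checked with a few lines of arithmetic. The two constants are the figures from the slide (30 statements per nanopublication, 1.5M papers per year); the function name and the sample values of N are illustrative choices.

```python
STATEMENTS_PER_NANOPUB = 30      # figure from the slide
PAPERS_PER_YEAR = 1_500_000      # figure from the slide (1.5M)


def statements_per_year(nanopubs_per_paper: int) -> int:
    """#statements/nanopub x #nanopubs/paper x #papers/year."""
    return STATEMENTS_PER_NANOPUB * nanopubs_per_paper * PAPERS_PER_YEAR


# N is the unknown number of nanopublications per paper.
for n in (1, 10, 100):
    print(f"N = {n:>3}: {statements_per_year(n):,} statements/year")
```

With N = 10 this gives 450,000,000 statements per year, the 450M figure used later in the deck.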

Technical infeasibility: expressivity
• RDF hopelessly simple
• Needs at least DL:
“Mosquitoes transmit malaria”
All? no.
Some? yes.
Only? probably.
∃transmit.Malaria ⊑ Mosquito
Many? Most?
• Beyond DL:
Probabilities, fuzziness, inconsistencies
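
To see why a bare triple underdetermines the sentence, the sketch below evaluates the "all", "some" and "only" readings against a toy set of individuals. The data are made up purely to mirror the answers on the slide (all: no, some: yes, only: probably) and say nothing about real epidemiology. All names are hypothetical.

```python
# Toy, made-up individuals: (name, kind, transmits_malaria)
INDIVIDUALS = [
    ("m1", "mosquito", True),
    ("m2", "mosquito", False),
    ("t1", "tick", False),
]


def reading_all(data) -> bool:
    """Every mosquito transmits malaria."""
    return all(transmits for _, kind, transmits in data if kind == "mosquito")


def reading_some(data) -> bool:
    """At least one mosquito transmits malaria."""
    return any(transmits for _, kind, transmits in data if kind == "mosquito")


def reading_only(data) -> bool:
    """Everything that transmits malaria is a mosquito."""
    return all(kind == "mosquito" for _, kind, transmits in data if transmits)


def share_of_mosquitoes(data) -> float:
    """Fraction of mosquitoes that transmit: what 'many' or 'most' would need."""
    flags = [transmits for _, kind, transmits in data if kind == "mosquito"]
    return sum(flags) / len(flags)


print(reading_all(INDIVIDUALS), reading_some(INDIVIDUALS),
      reading_only(INDIVIDUALS), share_of_mosquitoes(INDIVIDUALS))
```

The single triple (mosquito, transmits, malaria) is compatible with all three readings. Telling them apart needs quantifiers, and "many" or "most" need proportions, which is where the slide moves beyond DL.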

Technical (in)feasibility:
Argumentation graphs
“Escitalopram does not inhibit CYP2D6”
Micropublications, Clark, Ciccarese, Goble, 2013

Technical (in)feasibility:
Argumentation graphs
Argumentation graphs require:
• Defeasible logic
• Modal logic
• Higher-order logic
• ….
at a scale of 450M statements/yr

Should we give up on computers
as scientific colleagues?
• A more modest role for nano-publications?
– Annotations of datasets?
– Very approximate annotations of papers?
• Make them speak our language
instead of us speaking theirs?
