Machine Support for Interacting with Scientific Publications, Improving Information Retrieval, and Assessing Quality of Scientific Output
1. Introduction Vision Technology Solutions Conclusion
Machine Support for
Interacting with Scientific Publications,
Improving Information Retrieval, and
Assessing Quality of Scientific Output
4th German-Russian Young Researchers Forum 2014
Christoph Lange (1, 2)
(1) Enterprise Information Systems, Institute for Applied Computer Science, University of Bonn
(2) Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS), Sankt Augustin
http://langec.wordpress.com/about
Lange (Bonn) Interacting with Scientific Publications; Assessing Quality of Scientific Output 2014-07-07 1
3. Introduction Vision Technology Solutions Conclusion
Hello, World!
2011 PhD at Jacobs Univ. Bremen, Germany: software for collaborating on mathematical documents [Lan11]
2011/12 Univ. Bremen, Germany: making knowledge of different complexity manageable for computers [OntoIOp13]
2012/13 Univ. Birmingham, UK: enabling domain experts to make mathematical models machine-verifiable [KLR]
2013– Enterprise Information Systems @ Univ. Bonn, Germany / Organized Knowledge @ Fraunhofer IAIS: enterprise information integration [AL14], data quality assessment, ...
4. Introduction Vision Technology Solutions Conclusion
Assess Quality of Scientific Output (I)
Vision: answer the following questions about the quality of scientific output:
Author: “What is a good workshop to discuss my latest idea?”
Senior Researcher: “Should I accept an invitation to the programme committee of this conference?”
PhD Student: “What are the best publications I should read to get started?”
Reviewer: “Is this paper based on high-quality data?”
5. Introduction Vision Technology Solutions Conclusion
Assess Quality of Scientific Output (II)
How? – Semantic Web / Linked Open Data technology
weak artificial intelligence – does not aim at replacing, but at supporting humans
practically applicable, and scalable to the size of the Web (→ search engine example)
suitable for connecting data from heterogeneous sources:
  scientific publications (bibliographic metadata, citations and full text)
  social networks (in science? – ResearchGate, Mendeley, etc.)
  research data
6. Introduction Vision Technology Solutions Conclusion
Linked Open Data: schema.org
initiative of search engines (Google, Yandex, ...)
structuring web page content (creative works,
events, organisations, persons, places, products)
Example (Movie description)
Avatar
Director: James Cameron (born August 16, 1954)
Science fiction
Trailer
7. Introduction Vision Technology Solutions Conclusion
Linked Open Data: schema.org
Example (Movie description)
<div class="movie">
  <h1>Avatar</h1>
  <div class="director">
    Director: James Cameron
    (born August 16, 1954)
  </div>
  <span class="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html">Trailer</a>
</div>
8. Introduction Vision Technology Solutions Conclusion
Linked Open Data: schema.org
Example (Movie description)
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
9. Introduction Vision Technology Solutions Conclusion
Linked Open Data: schema.org
Example (Movie description)
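[Figure: the movie description as a graph. A node of type Movie has name "Avatar", genre "Science fiction", trailer "../movies/..." and a director edge to a node of type Person, which has name "James Cameron" and birthDate "August 16, 1954".]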
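The annotated markup can be read mechanically into the property graph shown on this slide. The following is a minimal Python sketch of that idea, not the full microdata processing algorithm: the class name `MicrodataSketch`, the helper `extract_triples` and the blank-node labels `_:item1`, `_:item2` are hypothetical names introduced for illustration, and the reader only handles text and `href` values.

```python
from html.parser import HTMLParser

# The annotated movie description from the slide (whitespace adjusted).
HTML = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
"""

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements without end tag


class MicrodataSketch(HTMLParser):
    """Hypothetical, simplified reader collecting (subject, property, value) triples."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.items = []       # stack of (item id, depth at which the item was opened)
        self.open_props = []  # stack of (property, depth, collected text chunks)
        self.depth = 0
        self.item_count = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag not in VOID_TAGS:
            self.depth += 1
        prop = attributes.get("itemprop")
        if "itemscope" in attributes:
            # A new item; link it to the enclosing item if it is a property value.
            self.item_count += 1
            item = f"_:item{self.item_count}"
            if prop and self.items:
                self.triples.append((self.items[-1][0], prop, item))
            self.items.append((item, self.depth))
            if "itemtype" in attributes:
                self.triples.append((item, "type", attributes["itemtype"]))
        elif prop and self.items:
            if "href" in attributes:
                # Link-valued property: the value is the URL.
                self.triples.append((self.items[-1][0], prop, attributes["href"]))
            elif tag not in VOID_TAGS:
                # Text-valued property: collect text until the element closes.
                self.open_props.append((prop, self.depth, []))

    def handle_data(self, data):
        for _, _, chunks in self.open_props:
            chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_props and self.open_props[-1][1] == self.depth:
            prop, _, chunks = self.open_props.pop()
            self.triples.append((self.items[-1][0], prop, "".join(chunks).strip()))
        if self.items and self.items[-1][1] == self.depth:
            self.items.pop()
        self.depth -= 1


def extract_triples(html):
    parser = MicrodataSketch()
    parser.feed(html)
    return parser.triples


for triple in extract_triples(HTML):
    print(triple)
```

Each printed triple corresponds to one edge of the graph: the movie item carries type, name, director, genre and trailer, and the nested person item carries type, name and birthDate.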
10. Introduction Vision Technology Solutions Conclusion
Social Data with schema.org
review or rating of a creative work, organization or
product (written by a person)
social network of a person: “knows”, “works for”, “is
colleague of”, “has parent/sibling/spouse/child/relative”
Example (Reviews of a movie)
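[Figure: review graph. A node of type Movie with name "Avatar" has two reviews edges. Each review has an author and a reviewRating; the two ratings have ratingValue 6 and 8.5. The authors are two nodes of type Person, named "Pünktchen" and "Anton", connected by a knows edge.]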
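Once reviews and the reviewers' social network are in the same graph, simple questions can be answered by following edges. The sketch below is an illustration in Python under stated assumptions: the node identifiers (`movie`, `review1`, `person1`, ...) are hypothetical, and both the pairing of ratings with reviewers and the direction of the `knows` edge are assumed, since the slide text does not settle them.

```python
# Triples mirroring the review graph on the slide.
# ASSUMPTION: which rating belongs to which reviewer, and the direction of
# "knows", are chosen arbitrarily for illustration.
TRIPLES = [
    ("movie", "type", "Movie"),
    ("movie", "name", "Avatar"),
    ("movie", "reviews", "review1"),
    ("movie", "reviews", "review2"),
    ("review1", "author", "person1"),
    ("review1", "reviewRating", "rating1"),
    ("rating1", "ratingValue", 6),
    ("review2", "author", "person2"),
    ("review2", "reviewRating", "rating2"),
    ("rating2", "ratingValue", 8.5),
    ("person1", "type", "Person"),
    ("person1", "name", "Pünktchen"),
    ("person2", "type", "Person"),
    ("person2", "name", "Anton"),
    ("person1", "knows", "person2"),
]


def objects(subject, prop):
    """All values reachable from `subject` via the edge `prop`."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def average_rating(work):
    """Mean ratingValue over all reviews of a work."""
    values = [
        value
        for review in objects(work, "reviews")
        for rating in objects(review, "reviewRating")
        for value in objects(rating, "ratingValue")
    ]
    return sum(values) / len(values)


def acquainted_reviewers(work):
    """Pairs of reviewers of the same work that are linked by a knows edge."""
    authors = {a for review in objects(work, "reviews") for a in objects(review, "author")}
    return [(a, b) for a in sorted(authors) for b in objects(a, "knows") if b in authors]


print(average_rating("movie"))        # 7.25
print(acquainted_reviewers("movie"))  # [('person1', 'person2')]
```

The second query is the kind of cross-source check the talk aims at: a rating becomes easier to interpret when the graph also shows that the reviewers know each other.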
11. Introduction Vision Technology Solutions Conclusion
schema.org in a Search Engine
12. Introduction Vision Technology Solutions Conclusion
Workshop Quality
Author: “What is a good workshop to discuss my latest
idea?”
13. Introduction Vision Technology Solutions Conclusion
Workshop Quality: Examples
Low-quality workshop
1st International Workshop on Applied Networking
(but all non-invited submissions are from authors from
the same institution as the chairs)
High-quality workshop
focused topic, 10 editions so far, balanced continuity and
renewal in organising committee, number of
submissions not decreasing, international participation,
part of a high-profile conference
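One of the warning signs named above, submissions coming mostly from the chairs' own institution, can be turned into a number once the proceedings metadata is available. This is a sketch only: the function name, the record layout (`invited`, `institutions`) and the sample data are hypothetical and not taken from any real workshop.

```python
def chair_institution_share(submissions, chair_institutions):
    """Fraction of non-invited submissions with an author from a chair's institution.

    `submissions` is a list of dicts with the hypothetical keys
    "invited" (bool) and "institutions" (list of author affiliations).
    """
    chairs = set(chair_institutions)
    regular = [s for s in submissions if not s["invited"]]
    if not regular:
        return 0.0
    hits = sum(1 for s in regular if chairs & set(s["institutions"]))
    return hits / len(regular)


# Made-up data resembling the low-quality example on the slide.
submissions = [
    {"invited": True, "institutions": ["Univ. A"]},
    {"invited": False, "institutions": ["Univ. X"]},
    {"invited": False, "institutions": ["Univ. X", "Univ. B"]},
]
print(chair_institution_share(submissions, ["Univ. X"]))  # 1.0
```

A value close to 1.0 would flag the first example; the positive indicators (number of editions, committee renewal, submission trend) could be computed from the same kind of records.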
14. Introduction Vision Technology Solutions Conclusion
Workshop Quality: Data
Semantic Publishing Challenge [DL14] @ Extended Semantic Web Conference 2014
One task focused on extracting Linked Data from CEUR-WS.org workshop proceedings volumes:
  1,200 workshops since 1995
  open access
  most important publisher for computer science workshops
  semi-structured HTML tables of contents
  unstructured PDF full text
A team from Saint Petersburg (ITMO University) won the award for the best-performing tool [KK14]
15. Introduction Vision Technology Solutions Conclusion
Conference Quality
Senior Researcher: “Should I accept an invitation to the
programme committee of this conference?”
16. Introduction Vision Technology Solutions Conclusion
Conference Quality in the Past: Ranking
CORE (Computing Research and Education Association of Australasia) and ERA (Excellence in Research for Australia) rankings of 2008, 2010 and 2013: infrequent and not transparent
17. Introduction Vision Technology Solutions Conclusion
Paper Quality in the Past: Impact Factor
PhD Student: “What are the best publications I should
read to get started?”
Impact Factor: average number of citations of recent articles
journals only
not comparable across disciplines
can be influenced by journal editors
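As a worked example of "average number of citations of recent articles", the sketch below assumes the common two-year window: citations received in a given year by items published in the two preceding years, divided by the number of citable items from those two years. The function name and the sample numbers are hypothetical.

```python
def impact_factor(citations_to_previous_two_years, citable_items_previous_two_years):
    """Two-year impact factor for one journal and one year (assumed definition).

    citations_to_previous_two_years: citations received this year by items
        the journal published in the two preceding years.
    citable_items_previous_two_years: number of citable items the journal
        published in those two years.
    """
    if citable_items_previous_two_years <= 0:
        raise ValueError("the journal must have published at least one citable item")
    return citations_to_previous_two_years / citable_items_previous_two_years


# Made-up numbers: 300 citations to 120 articles.
print(impact_factor(300, 120))  # 2.5
```

The formula makes the listed weaknesses visible: it is defined per journal, it says nothing about an individual paper, and both numerator and denominator depend on editorial choices about what is published and what counts as citable.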
18. Introduction Vision Technology Solutions Conclusion
Paper Quality in the Future
Multidimensional, context-sensitive analysis:
trend detection, topic analysis, expert search, community dynamics, research performance at different levels (e.g. [OM14])
context-sensitive citation analysis, e.g. 2014 Semantic Publishing Challenge task 2 (using PubMedCentral XML metadata) [DL14]
  “good citation”: B’s contribution is based on A’s methodology
  “bad citation”: A cited in a footnote in the “related work” section
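The distinction between "good" and "bad" citations can be sketched as a weighted citation count. Everything specific here is an assumption for illustration: the context labels `uses_method` and `related_work_footnote`, the weights, and the sample citations are hypothetical and do not come from the challenge task.

```python
# ASSUMPTION: hypothetical context labels and weights, chosen only to
# illustrate that citation contexts need not count equally.
WEIGHTS = {
    "uses_method": 1.0,             # "good citation": B builds on A's methodology
    "related_work_footnote": 0.25,  # "bad citation": A mentioned in passing
}


def weighted_citation_score(paper, citations, weights=WEIGHTS, default=0.5):
    """Sum of context weights over all citations pointing at `paper`.

    `citations` is a list of (citing, context, cited) tuples; contexts
    missing from `weights` fall back to `default`.
    """
    return sum(weights.get(context, default)
               for _, context, cited in citations if cited == paper)


citations = [
    ("B", "uses_method", "A"),
    ("C", "related_work_footnote", "A"),
    ("D", "related_work_footnote", "A"),
]
print(len([c for c in citations if c[2] == "A"]))  # plain count: 3
print(weighted_citation_score("A", citations))     # weighted: 1.5
```

A plain count treats all three citations alike, whereas the weighted score reflects that only one of them builds on the cited work.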
19. Introduction Vision Technology Solutions Conclusion
Data Quality
Reviewer: “Is this paper based on high-quality data?”
Quality metrics of an evolving dataset [DLA14]
20. Introduction Vision Technology Solutions Conclusion
Data Quality Assessment
Quality := “fitness for use” – categories [Zav+13]:
[Figure: quality dimensions grouped into the categories Accessibility, Contextual, Intrinsic and Representation. Dimensions shown: Availability, Completeness, Conciseness, Consistency, Interlinking*, Interoperability, Interpretability, Licensing*, Performance*, Relevancy, Rep.-Conciseness, Security*, Semantic Accuracy, Syntactic Validity, Timeliness, Trustworthiness, Understandability, Versatility*. Legend: a line between Dim1 and Dim2 means the two dimensions are related.]
Enable authors to upload data with their papers!
Give peer reviewers access to data quality metrics
Starting collaboration with GESIS (social science)
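To give reviewers a data quality metric, each dimension needs a concrete measure. As one illustration, the sketch below scores completeness as the share of required fields that are actually filled in. This definition, the function name and the sample records are assumptions made for the example; they are not the metrics defined in [Zav+13] or [DLA14].

```python
def completeness(records, required_fields):
    """Share of required field values that are present (assumed, simplified metric).

    A value counts as missing if the key is absent, None or an empty string.
    """
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


# Made-up survey records uploaded alongside a paper.
records = [
    {"respondent": 1, "age": 34},
    {"respondent": 2, "age": None},
]
print(completeness(records, ["respondent", "age"]))  # 0.75
```

A score like this could be shown to a reviewer next to the paper, together with measures for the other dimensions.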
21. Introduction Vision Technology Solutions Conclusion
Directions: Jailbreaking the PDF
“exploring ways to access scholarly data in modern ways”
free peer-reviewed scientific knowledge from being locked up in PDF documents
22. Introduction Vision Technology Solutions Conclusion
Directions: Pact with the Devil
Openness vs. impact
Springer: conference linked data
Elsevier: executable paper challenge
ResearchGate: open reviews
23. Introduction Vision Technology Solutions Conclusion
Conclusion
Scientists need help with assessing the quality of
scientific output.
Having PDF documents peer-reviewed by human
experts is not sufficient.
We need better quality metrics than the impact
factor.
Not just paper quality matters, but also data quality.
Semantic Web/Linked Data technology helps to
provide complementary machine support...
... and is a gate into openness.
24. References
References I
[AL14] S. Auer and C. Lange. “Interlinking Data and Knowledge in Enterprises, Research and Society with Linked Data”. In: Proceedings of the 11th International Baltic Conference on Databases and Information Systems (Baltic DB&IS). (Tallinn, Estonia, June 8–11, 2014). Ed. by H.-M. Haav, A. Kalja, and T. Robal. Invited paper. Tallinn, Estonia: Tallinn University of Technology Press, 2014, pp. 3–12.
[DL14] A. Di Iorio and C. Lange, eds. Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
25. References
References II
[DLA14] J. Debattista, C. Lange, and S. Auer. “Representing Dataset Quality Metadata using Multi-Dimensional Views”. 2014. Submitted.
[KK14] M. Kolchin and F. Kozlov. “Unstable markup: A template-based information extraction from web sites with unstable markup”. In: Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). Ed. by A. Di Iorio and C. Lange. 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
26. References
References III
[KLR] M. Kerber, C. Lange, and C. Rowat. ForMaRE. Formal Mathematical Reasoning in Economics. URL: http://cs.bham.ac.uk/research/projects/formare/ (visited on 2013-02-10).
[Lan11] C. Lange. “Enabling Collaboration on Semiformal Mathematical Knowledge by Semantic Web Integration”. PhD thesis. Jacobs University Bremen, 2011.
27. References
References IV
[OM14] F. Osborne and E. Motta. “Understanding Research Dynamics”. In: Semantic Publishing Challenge (Extended Semantic Web Conference, Semantic Web Evaluation Track). (Anissaras, Greece, May 25, 2014). Ed. by A. Di Iorio and C. Lange. 2014. URL: http://2014.eswc-conferences.org/program/semwebeval.
[OntoIOp13] OntoIOp (Ontology, Model and Specification Integration and Interoperability), an OMG Standard Development Initiative. 2013. URL: http://ontoiop.org (visited on 2013-10-09).
28. References
References V
[Zav+13] A. Zaveri, A. Rula, A. Maurino, R. Pietrobon, J. Lehmann, and S. Auer. “Quality Assessment Methodologies for Linked Open Data”. In: Semantic Web Journal (2013). This article is still under review. URL: http://www.semantic-web-journal.net/content/quality-assessment-linked-open-data-survey.