Ontologies of research areas are important tools for characterising, exploring, and analysing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 15K topics and 70K semantic relationships. It was created by applying the Klink-2 algorithm on a very large dataset of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO we have developed the CSO Portal, a web application that enables users to download, explore, and provide granular feedback on CSO at different levels. Users can use the portal to rate topics and relationships, suggest missing relationships, and visualise sections of the ontology. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various communities engaged with scholarly data.
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
1. The Computer Science Ontology:
A Large-Scale Taxonomy of Research Areas
Angelo A. Salatino, Thiviyan Thanapalasingam, Andrea Mannocci, Francesco Osborne,
Enrico Motta
@angelosalatino
Knowledge Media Institute
The Open University
United Kingdom
2. Ontologies of Research Areas
I. making sense of the research dynamics
II. classifying publications
III. identifying research communities
IV. forecasting research trends
3. Ontologies and Taxonomies of Research Areas
Mathematics Subject Classification – MSC2010
Physics and Astronomy Classification Scheme (PACS)
JEL Classification System
Library of Congress Classification (LCC)
Computing Classification System (CCS)
4. The Computer Science Ontology
• Ontology of research areas, automatically generated using
Klink-2* algorithm, on a dataset of 16 million publications
mainly in Computer Science
• Current version of CSO includes 14K topics and 143K
relationships
• Main roots include Computer Science, Linguistics,
Mathematics, Geometry, Semantics and so on.
*Francesco Osborne, and Enrico Motta. "Klink-2: integrating multiple web sources to generate
semantic topic networks." In ISWC 2015, Bethlehem, PA (USA).
5. Data Model
The CSO data model includes seven semantic relations:
• skos:broaderGeneric, which indicates that a topic is a sub-area of another one (e.g.,
Linked Data is a sub-area of Semantic Web).
• relatedEquivalent, which indicates that two topics can be treated as equivalent for the
purpose of exploring research data (e.g., Ontology Matching, Ontology Alignment).
• contributesTo, which indicates that the research outputs of one topic contribute to
another. For instance, research in Ontology Engineering contributes to the Semantic
Web, but arguably Ontology Engineering is not a sub-area of the Semantic Web; that is,
there is plenty of research in Ontology Engineering outside the Semantic Web area.
• owl:sameAs, this relation indicates that a research concept is identical to an external
resource. We used DBpedia Spotlight to connect research concepts to DBpedia.
• primaryLabel, this relation is used to state the main label for topics belonging to a
cluster of relatedEquivalent. For instance, the topics Ontology Matching and Ontology
Alignment will both have their primaryLabel set to Ontology Matching.
• rdf:type, this relation is used to state that a resource is an instance of a class. For
example, a resource in our ontology is an instance of topic.
• rdfs:label, this relation is used to provide a human-readable version of a resource’s
name.
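The seven relations can be illustrated with a small in-memory sketch. It is not the CSO implementation: topics are plain strings rather than URIs, the triples only repeat examples from these slides, and the helper names (`build_index`, `sub_areas`, `primary_label`) are hypothetical. The direction of skos:broaderGeneric follows the "Predicates shown" slide, where the broader topic is the subject.

```python
from collections import defaultdict

# Example triples taken from the slides; plain strings stand in for URIs
# (an assumption made to keep the sketch short).
TRIPLES = [
    ("world wide web", "skos:broaderGeneric", "semantic web"),
    ("semantic web", "skos:broaderGeneric", "linked data"),
    ("ontology matching", "relatedEquivalent", "ontology alignment"),
    ("ontology engineering", "contributesTo", "semantic web"),
    ("semantic web", "owl:sameAs", "dbpedia:Semantic_Web"),
    ("ontology matching", "primaryLabel", "ontology matching"),
    ("ontology alignment", "primaryLabel", "ontology matching"),
    ("linked data", "rdf:type", "topic"),
    ("linked data", "rdfs:label", "linked data"),
]


def build_index(triples):
    """Index triples as predicate -> subject -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[predicate][subject].add(obj)
    return index


def sub_areas(index, topic):
    """Return all direct and indirect sub-areas of a topic."""
    found = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        for child in index["skos:broaderGeneric"].get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def primary_label(index, topic):
    """Return the main label of the topic's relatedEquivalent cluster."""
    labels = index["primaryLabel"].get(topic)
    return next(iter(labels)) if labels else topic


INDEX = build_index(TRIPLES)
print(sorted(sub_areas(INDEX, "world wide web")))
print(primary_label(INDEX, "ontology alignment"))
```

Indexing by predicate first keeps each relation separate, so a hierarchy traversal never follows contributesTo or relatedEquivalent links by accident.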
6. CSO Generation
Klink-2 is an approach for learning large-scale ontologies of research topics from
corpora of scientific articles and knowledge sources on the web.
Given a pair of keywords, it infers their semantic relationship:
• skos:broaderGeneric
• contributesTo
• relatedEquivalent
Francesco Osborne, and Enrico Motta. "Klink-2: integrating multiple web sources to generate semantic
topic networks." In ISWC 2015, Bethlehem, PA (USA).
8. ISWC 2018 - Call for Papers
Topics identified in the call for papers: database, internet, reasoning, knowledge base,
artificial intelligence, access control, social networks, data mining, ontology,
machine learning, semantics, privacy, knowledge representation, natural language processing,
semantic web, data stream, information retrieval, ontology-based data access, web data mining,
cloud environments, information visualization, mobile platform, ontology merging,
ontology matching, geo-spatial data, data cleaning, semantic data, blockchain,
ontology mapping, ontology engineering, question answering, linked data,
data mining techniques, knowledge discovery, information extraction.
About 50% of these topics are not in the ACM Computing Classification System (CCS).
[Figure: topic cloud; the legend marks each topic as available or not available in ACM CCS.]
9. Smart Topic Miner
The Smart Topic Miner (STM) is a semantic application that supports the
Springer Nature editorial team in classifying scholarly publications in the
field of Computer Science.
Francesco Osborne, Angelo Salatino, Aliaksandr Birukou, and Enrico Motta. "Automatic
classification of springer nature proceedings with smart topic miner." In ISWC 2016. Kobe, Japan.
http://rexplore.kmi.open.ac.uk/STM_demo
10. Smart Book Recommender
Smart Book Recommender (SBR) is a web application that takes as input a
conference and suggests books, proceedings and journals which address
similar topics. It helps the Springer Nature editorial team market books.
Thiviyan Thanapalasingam, Francesco Osborne, Aliaksandr Birukou, and Enrico Motta. "Ontology-
Based Recommendation of Editorial Products." ISWC 2018. Monterey, CA (USA).
http://rexplore.kmi.open.ac.uk/SBR_demo
11. Augur – Early Detection of Research Topics
Augur is a method for detecting the emergence of research areas at an
embryonic stage, i.e., before the topic has been consistently labelled by
researchers and associated with several publications.
Angelo Salatino, Francesco Osborne, and Enrico Motta. "AUGUR: Forecasting the Emergence of New
Research Topics." In JCDL’18. Fort Worth, Texas, USA.
12. CSO through CSO Portal
I. Browse
II. Download
• https://cso.kmi.open.ac.uk/downloads
• or https://w3id.org/cso/downloads
• It is available in OWL, Turtle and CSV formats.
III. Provide granular feedback
This work is licensed under a Creative Commons Attribution 4.0 International License.
13. CSO Ecosystem – Let’s keep humans in the loop
[Diagram, labels only: Community of researchers; CSO Portal (Explore / Download, Feedback);
Computer Science Ontology (Update); New Systems (Use CSO)]
14. CSO Portal Architecture
Visit CSO Portal: https://cso.kmi.open.ac.uk
[Architecture diagram, labels only]
Users: Registered Users, Editorial Board
Ontology Generation: Rexplore Dataset, DBpedia, Klink, Computer Science Ontology
Feedback: Ontology Feedback, Topic Feedback, Relationship Feedback, Suggest New Relationship
Revision and Update Framework: Version x.y, Snapshot of Feedback, Revision and Analysis of Feedback,
Minor Revision (Create version x.(y+1)), Major Revision (Create version (x+1).0)
Other labels: Annotation, Ontology, Browsing Ontology, Download Ontology, Check, Dashboard/Contributions
19. Predicates shown
Shown predicate | Ontology predicate | Example (as shown) | Example (in the ontology)
parent of | skos:broaderGeneric | semantic web parent of linked data | semantic web skos:broaderGeneric linked data
alternative label | relatedEquivalent | computer network alternative label computer networks | computer network relatedEquivalent computer networks
child of | inverseOf(skos:broaderGeneric) | semantic web child of world wide web | world wide web skos:broaderGeneric semantic web
same as | owl:sameAs | semantic web same as dbpedia:Semantic_Web | semantic web owl:sameAs dbpedia:Semantic_Web
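The renaming can be sketched as a simple lookup from ontology predicates to the labels shown on the portal. This is only an illustration of the mapping on this slide, not the portal's code; `SHOWN_PREDICATES` and `shown_statements` are hypothetical names.

```python
# Ontology predicate -> label shown to users, as listed on this slide.
SHOWN_PREDICATES = {
    "skos:broaderGeneric": "parent of",
    "relatedEquivalent": "alternative label",
    "owl:sameAs": "same as",
}


def shown_statements(subject, predicate, obj):
    """Render one ontology triple as the user-facing statement(s)."""
    if predicate not in SHOWN_PREDICATES:
        # Predicates without a friendly name are displayed unchanged.
        return [f"{subject} {predicate} {obj}"]
    statements = [f"{subject} {SHOWN_PREDICATES[predicate]} {obj}"]
    if predicate == "skos:broaderGeneric":
        # "child of" is the inverse reading of the same triple.
        statements.append(f"{obj} child of {subject}")
    return statements


for line in shown_statements("world wide web", "skos:broaderGeneric", "semantic web"):
    print(line)
```

A single skos:broaderGeneric triple yields both the "parent of" and the "child of" reading, so the inverse never needs to be stored separately.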
20. Browsing research concepts: content negotiation
Format | Header | Resource
HTML | text/html | https://cso.kmi.open.ac.uk/topics/semantic web
RDF/XML | application/rdf+xml | https://cso.kmi.open.ac.uk/topics/semantic web.rdf or https://cso.kmi.open.ac.uk/topics/semantic web.xml
Turtle | text/turtle | https://cso.kmi.open.ac.uk/topics/semantic web.ttl
JSON-LD | application/json or application/ld+json | https://cso.kmi.open.ac.uk/topics/semantic web.json or https://cso.kmi.open.ac.uk/topics/semantic web.jsonld
N-Triples | application/n-triples | https://cso.kmi.open.ac.uk/topics/semantic web.nt
The CSO Portal supports content negotiation to serve different
representations of the same resource (URI).
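A client-side sketch of how the table could be used, with the Python standard library only. The media types and file extensions come from the table; the function names are hypothetical, the space in the topic name is percent-encoded as an assumption about how a client would send it, q-values in the Accept header are ignored for brevity, and no request is actually sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE = "https://cso.kmi.open.ac.uk/topics/"

# Media type -> file extension of the equivalent direct resource.
FORMATS = {
    "text/html": "",
    "application/rdf+xml": ".rdf",
    "text/turtle": ".ttl",
    "application/json": ".json",
    "application/ld+json": ".jsonld",
    "application/n-triples": ".nt",
}


def negotiate(accept_header, default="text/html"):
    """Return the first supported media type in an Accept header."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in FORMATS:
            return media_type
    return default


def resource_url(topic, media_type):
    """Direct URL of one representation, e.g. the .ttl file for Turtle."""
    return BASE + quote(topic) + FORMATS[media_type]


def build_request(topic, media_type):
    """Request for the topic URI that asks for a representation via Accept."""
    return Request(BASE + quote(topic), headers={"Accept": media_type})


chosen = negotiate("text/turtle, text/html;q=0.8")
print(chosen, resource_url("semantic web", chosen))
```

Passing the object returned by `build_request` to `urllib.request.urlopen` would perform the actual retrieval; that step is left out so the sketch runs offline.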
21. Providing Feedback
Users can offer four kinds of feedback:
• Topic
• Relationship
• Suggest new relationship
• Entire ontology
22. Editorial Panel
Some functionalities are already available:
• Add/Remove topic
• Add/Remove relationship
• Change cluster’s primary label
• Check Ontology Consistency
• Check Ontology state
• Check History operations
• Deploy Ontology
23. Release cycle
• Minor revisions
  • Correcting specific errors
  • Add/Remove relationships
  • Add/Remove topic
• Major revisions
  • Expanding ontology by re-running Klink-2
  • New recent corpus of publications
  • Considering user feedback
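The version numbering implied by this cycle (a minor revision turns x.y into x.(y+1), a major revision into (x+1).0, as labelled in the architecture diagram) can be sketched as follows. The function name `next_version` and the string format of the version are assumptions for illustration.

```python
def next_version(version, revision):
    """Return the version number that follows a minor or major revision."""
    major, minor = (int(part) for part in version.split("."))
    if revision == "minor":
        # Corrections and single additions/removals: x.y -> x.(y+1)
        return f"{major}.{minor + 1}"
    if revision == "major":
        # Re-running Klink-2 on a new corpus: x.y -> (x+1).0
        return f"{major + 1}.0"
    raise ValueError(f"unknown revision type: {revision!r}")


print(next_version("2.3", "minor"))
print(next_version("2.3", "major"))
```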
25. Future Work
• Currently we are working on Klink v3.0
  • Extract further information from abstracts
  • Can take into account the feedback gathered through the portal
• We plan to release ontologies in other fields of Science
  • Engineering
  • Medicine
• Producing external links to other resources
  • E.g., mapping to other available taxonomies
• Developing new features for the CSO Portal
  • Relevant papers and authors associated to each research topic
To enhance the user experience and make the portal accessible to users who are not Semantic Web experts, we renamed the main predicates in a more user-friendly way.