Enabling Data Analytics from Knowledge Graphs @ ISWC 2017 Doctoral Consortium
1. Enabling Data Analytics from Knowledge Graphs
Henrique Santos
Universidade de Fortaleza, Fortaleza, CE, Brazil
The 16th International Semantic Web Conference (ISWC 2017) – Doctoral Consortium
Vienna, Austria – 22 October 2017
2. Problem statement
● Datasets are the most common source of scientific data for data analysis
● They lack metadata, are not clean, and can't be directly combined or compared
● Knowledge Graphs for scientific data are on the rise
● Many approaches, multiple uses, but data scientists are still using datasets
● Consequence: data preparation takes around 80% of the time of the whole analytical process (PATIL, 2012)
● How to maintain enough metadata related to scientific data?
● How to exploit that knowledge to foster data analytics activities?
● How to integrate data from scientific KGs with regular data tools like R, Python or BI software?
3. Related work
● W3C's CSV on the Web
● Scientific ontologies
  ● SSN – Semantic Sensor Network
  ● VSTO – Virtual Solar-Terrestrial Observatory
  ● HAScO – Human-Aware Science Ontology
● Indicators
  ● GCI Ontology
● Scientific Knowledge Graphs
  ● Gene Ontology, Bio2RDF, The Graph of Things
4. Research questions & Hypotheses
Q1 Can ontologies be used to successfully bridge the knowledge gap between acquired scientific data and data users? If so, how?
Q2 Will data users and applications benefit from the use of the knowledge behind each scientific data point?
Q3 How to provide data access for scientific KGs in a way that can be consumed by routine data tools while making use of the attached data knowledge to facilitate analytics?
H1 The reuse of scientific data ontologies with proper extensions and their alignments to domain ontologies can mitigate the current loss of knowledge during data acquisition.
H2 Providing data points together with their knowledge (e.g. provenance, contextual knowledge) to data users and applications can facilitate data analytics compared to current dataset usage.
H3 A hybrid RDF serialization format that suits the needs of existing data tools but is also able to convey knowledge can be used to serialize data from KGs together with its associated metadata.
H4 A query API for scientific KGs can output data together with its associated metadata in a form better suited to data tools than current tools for querying RDF data.
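To make the idea behind H3 concrete, here is a minimal sketch of shipping data and metadata together. The slides do not define the proposed hybrid format, so this is not the author's format and it is not RDF: it is a stand-in loosely inspired by the CSV on the Web approach listed under related work, with a plain CSV table plus a JSON metadata sidecar. The field names (`value`, `unit`, `instrument`, `time`), the sample readings, and the `serialize` function are all hypothetical.

```python
import csv
import io
import json

# Hypothetical data points, as they might be retrieved from a scientific KG.
# Each point carries contextual knowledge (unit, instrument) next to its value.
data_points = [
    {"time": "2017-10-22T09:00:00Z", "value": 21.4, "unit": "degC", "instrument": "sensor-A"},
    {"time": "2017-10-22T09:05:00Z", "value": 22.1, "unit": "degC", "instrument": "sensor-B"},
]


def serialize(points):
    """Return (csv_text, metadata_json).

    The CSV part is a plain table that R, Python or BI tools can load directly.
    The JSON part describes the table so the knowledge is not lost on export.
    """
    columns = ["time", "value", "instrument"]
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(points)

    metadata = {
        "columns": columns,
        # Units found across all points; more than one would signal a mix
        # that needs conversion before the values can be compared.
        "value_units": sorted({p["unit"] for p in points}),
        "row_count": len(points),
    }
    return buffer.getvalue(), json.dumps(metadata, sort_keys=True)


csv_text, metadata_text = serialize(data_points)
print(csv_text)
print(metadata_text)
```

A data tool that ignores the sidecar still gets a usable table, while a metadata-aware tool can check units or provenance before combining datasets.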
5. Approach
Pipeline: Data annotation → KG building → KG browsing → KG serialization → Intelligent applications
Ontologies: HAScO, VSTO-I, HACitO
[Figure: ontology fragment. Classes: prov:Activity, hasco:Study, hasco:DataAcquisition, vstoi:Deployment, vstoi:Instrument, vstoi:Platform, vstoi:Detector. Datatype: xsd:dateTime. Properties: isDataAcquisitionOf, hasDeployment, hasInstrument, hasPlatform, hasDetector, prov:startedAtTime, prov:endedAtTime.]
● Automatic data visualization
● Data cleansing
● Infer semantic difference between data points
● ...
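As a rough illustration of how the classes and properties named on this slide could attach context to acquired data, the sketch below stores a few triples as plain Python tuples and collects the context of one data acquisition. Only the class and property names come from the slide. The `ex:` identifiers, the direction of each property, and the choice to attach `prov:startedAtTime` to the deployment are assumptions, since the diagram's arrows are not recoverable from the text.

```python
# Triples as (subject, predicate, object). All ex: resources are hypothetical,
# and the subject/object direction of each property is an assumption.
triples = [
    ("ex:acq1", "rdf:type", "hasco:DataAcquisition"),
    ("ex:acq1", "isDataAcquisitionOf", "ex:study1"),
    ("ex:acq1", "hasDeployment", "ex:dep1"),
    ("ex:dep1", "rdf:type", "vstoi:Deployment"),
    ("ex:dep1", "prov:startedAtTime", "2017-01-01T00:00:00Z"),
    ("ex:dep1", "hasInstrument", "ex:thermometer1"),
    ("ex:dep1", "hasPlatform", "ex:tower1"),
    ("ex:dep1", "hasDetector", "ex:probe1"),
]


def objects(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def context_of(acquisition):
    """Collect the study and deployment context of one data acquisition."""
    context = {"study": objects(acquisition, "isDataAcquisitionOf")}
    for deployment in objects(acquisition, "hasDeployment"):
        for prop in ("hasInstrument", "hasPlatform", "hasDetector", "prov:startedAtTime"):
            context.setdefault(prop, []).extend(objects(deployment, prop))
    return context


print(context_of("ex:acq1"))
```

With this kind of context available per acquisition, an application could, for example, decide whether two data points are comparable before plotting them together.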
6. Preliminary results
SANTOS, H. et al. Contextual Data Collection for Smart Cities. In: Proceedings of the Sixth Workshop on Semantics for Smarter Cities. Bethlehem, PA, USA. 2015.
SANTOS, H. et al. From Data to City Indicators: A Knowledge Graph for Supporting Automatic Generation of Dashboards. In: The Semantic Web - Proceedings of the 14th Extended Semantic Web Conference (ESWC 2017). Portorož, Slovenia. 2017.
Pipeline: Data annotation → KG building → KG browsing → KG serialization → Intelligent applications
7. Evaluation plan
KG evaluation (H1): state-of-the-art KG evaluation approaches discussed in (PAULHEIM, 2017).
Metadata evaluation (H2): gathering data analytics use cases and assessing how the associated metadata facilitates the use of the data.
KG querying & serialization (H3, H4): tests with data scientists and field specialists acting as users of our proposed KG and processes. Using their data (preferably from different studies and sources), we intend to build a scientific KG with the relevant metadata and then provide them with tools for querying the data and preparing datasets for their routine data analytics. Questionnaires will then be administered to measure how much our approach has eased their tasks compared with their regular processes.
8. Relevance
● We expect this research to bring direct benefits to data scientists and field specialists by providing specifications and tools that we claim will ease their data preparation tasks
● The KG serialization technique will promote interoperability between scientific data in KGs and existing non-semantic data tools, which we believe will broaden the use of KGs to even more knowledge areas
9. Reflections
● Promoting data analytics from scientific data in KGs is still in its early stages
  ● Difficult to query the needed data
  ● Lack of methods and tools to easily connect data tools with data from KGs
  ● Knowledge exploitation to foster data analysis is minimal
● Our contributions
  ● KG specification aligned with data analytics requirements
  ● Data file format able to convey both data and metadata
  ● Method for data access and retrieval in scientific KGs based on user queries
● Our resources
  ● Indicator and domain ontologies for developed use cases
  ● Implementations of the proposed method for data access
10. Thank you for your attention
Enabling Data Analytics from Knowledge Graphs
Henrique Santos
hos@edu.unifor.br
@hansidm
http://henriquesantos.org
Advisor: Prof. João José Vasco Peixoto Furtado, Docteur