SlideShare una empresa de Scribd logo
1 de 25
Distributed Knowledge Graph
Processing in SANSA and
Ontario
Prof. Dr. Jens Lehmann
Big Data Europe Platform ReleaseMay 3rd 2017
SANSA: Motivation
◎ Abundant machine readable structured information is
available (e.g. in RDF)
o Across BDE Societal Challenges, e.g. for Life Science Data
o General: DBpedia, Google knowledge graph
o Social graphs: Facebook, Twitter
◎ Need for scalable querying, inference and machine
learning
o Link prediction
o Knowledge base completion
o Predictive analytics
o Forward chaining inference
2
SANSA Stack
4
◎ SANSA includes several libraries:
o Read / Write RDF / OWL library
o Querying library
o Inference library
o ML- Machine Learning core library
http://sansa-stack.net/
◎ Framework for distributed RDF dataset processing
aiming at high scalability and fault tolerance
SANSA: Read Write Layer
5
SANSA: Read Write Layer
◎ Ingest RDF and OWL data in different formats using
Jena / OWL API style interfaces
◎ Goal: Represent/distribute data in multiple formats
(e.g. RDD, Data Frames, GraphX, Tensors)
◎ Goal: Allow transformation among these formats
◎ Compute dataset statistics and apply functions to URIs,
literals, subjects, objects → Distributed LODStats
6
SANSA: Query Layer
7
SANSA: Query Layer
◎ Improve performance of generic queries via:
o Intelligent indexing
o Splitting strategies
o Distributed Storage
◎ SPARQL-to-SQL approaches, Virtual Views
◎ Next Step: SPARQL query engine evaluation
◎ SANSA provides a W3C SPARQL compliant endpoint
8
SANSA: Inference Layer
9
SANSA: Inference Layer
◎ W3C Standards for Modelling: RDFS and OWL
◎ Parallel in-memory inference via rule-based forward
chaining
◎ Beyond state of the art: dynamically build a rule
dependency graph for a rule set
◎ → Adjustable performance levels
10
SANSA: ML Layer
11
SANSA: ML Layer
◎ Distributed Machine Learning (ML) algorithms that work
on RDF data and make use of its structure / semantics
◎ Work in Progress:
o Tensor Factorization for e.g. KB completion (testing stage)
o Graph Clustering (testing stage)
o Association rule mining (evaluation stage)
o Semantic Decision trees (idea stage)
o Inference in Knowledge Graph Embeddings (idea stage)
12
SANSA Status and Releases
◎ A generic stack for (big) Linked Data
o Build on top of a state-of-the-art distributed frameworks (Spark, Flink)
o Easy to use: just add the dependencies to your existing Spark or Flink
project
◎ Out-of-the-box framework for (1) querying, (2) inference
and (3) machine learning over RDF datasets
◎ 0.1 Release in December
◎ Subsequent releases every 6 months
13
Semantic Data Lake (Ontario)
◎ Data Lake or Swamp?
o Repository of data in its original formats
o Structured, semi-structured, unstructured
o Without unified schema
◎ Semantic Data Lake
o Add a Semantic Layer on top of the source datasets
❖ The data is semantically lifted using ontology
terms
❖ Provide a uniform view over nonuniform data
14
Semantic Data Lake (Ontario)
◎ A SPARQL query is decomposed into sub-queries
o Keeping links between sub-queries to collect data
returned later from each sub-query
◎ Data Lake is checked for candidate data sources that
can answer each sub-query (consulting metadata)
◎ Each SPARQL sub-query is translated to a query on the
query language of the selected candidate data source
o Examples: SPARQL to SQL, or SPARQL to CQL
◎ Sub-resultes are joined together to form the end result
15
Metadata
property -> data source (type)
Semantic Data Lake (Ontario)
16
Decomposing
User QuerySPARQL query
Database
XML
File
?item gho:Country ?country
.
?item gho:Disease ?disease
.
...
SELECT country, disease,
... FROM Observations
Finding Relevant Data Sources
+ Converting Queries
SQL XPathSQL
MongoDB
JSON
Path
SQL
XML
MongoDB
Semantic Data Lake (Ontario)
17
Database
XML
File
Results Reconciliation
Execution Plan
Links between sub-queries
MongoDB
Final Results
Big Data Europe Integrator Platform Launch
Wednesday 3 May @ 15:00 CEST
Please type your questions at any time. Q&A will follow the presentations
Backup Slides
SANSA: Motivation
20
◎ Over the last years, the size of the Semantic Web has
increased and several large-scale datasets were published.
Source: LOD-Cloud (http://lod-cloud.net/ )
◎ Now days hadoop ecosystem has become a standard for
BigData applications.
◎ We use this infrastructure for Semantic Web as well.
Thank You
Thank You
@Gezim_Sejdiu
http://sda.cs.uni-bonn.de/gezim-sejdiu/
SANSA
Semantic Analytics Stack
https://github.com/SANSA-Stack
● SANSA-RDF
● SANSA-OWL
● SANSA-Query
● SANSA-Inference
● SANSA-ML
● SANSA-Examples
SANSA Planning
◎ Current state:
o SANSA 0.1 released in December
o Can read RDF/OWL files, compute statistics, simple
queries, lightweight inference, graph clustering, rule
mining
◎ Subsequent releases every 6 months
o SANSA 0.2 release planned in June 2017.
23
Conclusions and Next steps
◎ A generic stack for (big) Linked Data
o Build on top of a state-of-the-art distributed frameworks (Spark, Flink)
◎ Out-of-the-box framework for scalable and distributed
semantic data analysis combining semantic web and
distributed machine learning for (1) querying, (2)
inference and (3) analytics of RDF datasets.
◎ Next steps
o Refinement of data structures (RDF/OWL Layer)
o Add support for SPARQL 1.1 and other backend strategies (Query
Layer)
o Define a ML pipelines for Structured ML (ML Layer).
24
BDE & General Integration
◎ SANSA = Scala / Maven Repositories based on Spark /
Flink
◎ Easy to include both in BDE platform and any Spark /
Flink environment
25

Más contenido relacionado

La actualidad más candente

Seige arndt-lightning talk swib13
Seige arndt-lightning talk swib13Seige arndt-lightning talk swib13
Seige arndt-lightning talk swib13
Leander Seige
 

La actualidad más candente (20)

Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals ...
Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals ...Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals ...
Big Data Europe SC6 WS #3: Big Data Europe Platform: Apps, challenges, goals ...
 
Big Data Europe Transport Pilot case, Luigi Selmi
Big Data Europe Transport Pilot case, Luigi SelmiBig Data Europe Transport Pilot case, Luigi Selmi
Big Data Europe Transport Pilot case, Luigi Selmi
 
SC1 Workshop 2 General Introduction to BDE
SC1 Workshop 2 General Introduction to BDESC1 Workshop 2 General Introduction to BDE
SC1 Workshop 2 General Introduction to BDE
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
 
Release webinar end users
Release webinar end usersRelease webinar end users
Release webinar end users
 
BigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal PilotsBigDataEurope @BDVA Summit2016 2: Societal Pilots
BigDataEurope @BDVA Summit2016 2: Societal Pilots
 
SC4 Hangout - Luigi Selmi, Transport pilot architecture
SC4 Hangout - Luigi Selmi, Transport pilot architectureSC4 Hangout - Luigi Selmi, Transport pilot architecture
SC4 Hangout - Luigi Selmi, Transport pilot architecture
 
Lightweight Collection and Storage of Software Repository Data with DataRover
Lightweight Collection and Storage of  Software Repository Data with DataRoverLightweight Collection and Storage of  Software Repository Data with DataRover
Lightweight Collection and Storage of Software Repository Data with DataRover
 
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
 
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data ApplicationsBig Data Europe: Simplifying Development and Deployment of Big Data Applications
Big Data Europe: Simplifying Development and Deployment of Big Data Applications
 
Red hat infrastructure for analytics
Red hat infrastructure for analyticsRed hat infrastructure for analytics
Red hat infrastructure for analytics
 
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVABDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
BDE-BDVA Webinar: BigDataEurope Overview & Synergies with BDVA
 
Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)Updates from Hungary (Jozsef Kovacs)
Updates from Hungary (Jozsef Kovacs)
 
BDE_SC4_WS3_6_Luigi Selmi - Pilot SC4
BDE_SC4_WS3_6_Luigi Selmi - Pilot SC4BDE_SC4_WS3_6_Luigi Selmi - Pilot SC4
BDE_SC4_WS3_6_Luigi Selmi - Pilot SC4
 
Producing Linked Open Data with a Content Management System
Producing Linked Open Data with a Content Management SystemProducing Linked Open Data with a Content Management System
Producing Linked Open Data with a Content Management System
 
BDE SC3.3 Workshop - BDE Platform: Technical overview
 BDE SC3.3 Workshop -  BDE Platform: Technical overview BDE SC3.3 Workshop -  BDE Platform: Technical overview
BDE SC3.3 Workshop - BDE Platform: Technical overview
 
Seige arndt-lightning talk swib13
Seige arndt-lightning talk swib13Seige arndt-lightning talk swib13
Seige arndt-lightning talk swib13
 
Open DMPs: Machine Actionable open data management planning (Presentation at ...
Open DMPs: Machine Actionable open data management planning (Presentation at ...Open DMPs: Machine Actionable open data management planning (Presentation at ...
Open DMPs: Machine Actionable open data management planning (Presentation at ...
 
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
 
Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.
 

Similar a Release webinar: Sansa and Ontario

Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Hajira Jabeen
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
Gezim Sejdiu
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
eswcsummerschool
 

Similar a Release webinar: Sansa and Ontario (20)

Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
Apache Big_Data Europe event: "Integrators at work! Real-life applications of...
 
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
 
Wed roman tut_open_datapub
Wed roman tut_open_datapubWed roman tut_open_datapub
Wed roman tut_open_datapub
 
BigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE PlatformBigDataEurope @BDVA Summit2016 1: The BDE Platform
BigDataEurope @BDVA Summit2016 1: The BDE Platform
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
 
Graph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDFGraph databases & data integration - the case of RDF
Graph databases & data integration - the case of RDF
 
State of the Semantic Web
State of the Semantic WebState of the Semantic Web
State of the Semantic Web
 
Java One 2017: Open Source Big Data in the Cloud: Hadoop, M/R, Hive, Spark an...
Java One 2017: Open Source Big Data in the Cloud: Hadoop, M/R, Hive, Spark an...Java One 2017: Open Source Big Data in the Cloud: Hadoop, M/R, Hive, Spark an...
Java One 2017: Open Source Big Data in the Cloud: Hadoop, M/R, Hive, Spark an...
 
Modèles de données et langages de description ouverts 6 - 2021-2022
Modèles de données et langages de description ouverts   6 - 2021-2022Modèles de données et langages de description ouverts   6 - 2021-2022
Modèles de données et langages de description ouverts 6 - 2021-2022
 
Apache spark - History and market overview
Apache spark - History and market overviewApache spark - History and market overview
Apache spark - History and market overview
 
Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
20110728 datalift-rpi-troy
20110728 datalift-rpi-troy20110728 datalift-rpi-troy
20110728 datalift-rpi-troy
 
ACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data FramesACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data Frames
 
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان دادهمعرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
معرفی کاربردهای یادگیری عمیق و چالش های آن در کلان داده
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...Present and future of unified, portable and efficient data processing with Ap...
Present and future of unified, portable and efficient data processing with Ap...
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - Panorays
 
ICWE2017 BigDataEurope
ICWE2017 BigDataEuropeICWE2017 BigDataEurope
ICWE2017 BigDataEurope
 
Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64Data Analytics and Machine Learning: From Node to Cluster on ARM64
Data Analytics and Machine Learning: From Node to Cluster on ARM64
 

Más de BigData_Europe

Más de BigData_Europe (20)

Luigi Selmi - The Big Data Integrator Platform
Luigi Selmi - The Big Data Integrator PlatformLuigi Selmi - The Big Data Integrator Platform
Luigi Selmi - The Big Data Integrator Platform
 
Josep Maria Salanova - Introduction to BDE+SC4
Josep Maria Salanova - Introduction to BDE+SC4Josep Maria Salanova - Introduction to BDE+SC4
Josep Maria Salanova - Introduction to BDE+SC4
 
Rajendra Akerkar - LeMO Project
Rajendra Akerkar - LeMO ProjectRajendra Akerkar - LeMO Project
Rajendra Akerkar - LeMO Project
 
Big Data Europe SC6 WS #3: PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL, Mart...
Big Data Europe SC6 WS #3: PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL, Mart...Big Data Europe SC6 WS #3: PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL, Mart...
Big Data Europe SC6 WS #3: PILOT SC6: CITIZEN BUDGET ON MUNICIPAL LEVEL, Mart...
 
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
 
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science A...
 
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
Big Data Europe: SC6 Workshop 3: The European Research Data Landscape: Opport...
 
BDE SC3.3 Workshop - BDE review: Scope and Opportunities
 BDE SC3.3 Workshop -  BDE review: Scope and Opportunities BDE SC3.3 Workshop -  BDE review: Scope and Opportunities
BDE SC3.3 Workshop - BDE review: Scope and Opportunities
 
BDE SC3.3 Workshop - Agenda
 BDE SC3.3 Workshop - Agenda BDE SC3.3 Workshop - Agenda
BDE SC3.3 Workshop - Agenda
 
BDE SC3.3 Workshop - BDE Pilot case for Wind Turbine condition monitoring re...
 BDE SC3.3 Workshop - BDE Pilot case for Wind Turbine condition monitoring re... BDE SC3.3 Workshop - BDE Pilot case for Wind Turbine condition monitoring re...
BDE SC3.3 Workshop - BDE Pilot case for Wind Turbine condition monitoring re...
 
BDE SC3.3 Workshop - Data management in WT testing and monitoring
 BDE SC3.3 Workshop - Data management in WT testing and monitoring  BDE SC3.3 Workshop - Data management in WT testing and monitoring
BDE SC3.3 Workshop - Data management in WT testing and monitoring
 
BDE SC3.3 Workshop - Big Data in Wind Turbine Condition Monitoring
 BDE SC3.3 Workshop -  Big Data in Wind Turbine Condition Monitoring BDE SC3.3 Workshop -  Big Data in Wind Turbine Condition Monitoring
BDE SC3.3 Workshop - Big Data in Wind Turbine Condition Monitoring
 
BDE SC3.3 Workshop - Options for Wind Farm performance assessment and Power f...
BDE SC3.3 Workshop - Options for Wind Farm performance assessment and Power f...BDE SC3.3 Workshop - Options for Wind Farm performance assessment and Power f...
BDE SC3.3 Workshop - Options for Wind Farm performance assessment and Power f...
 
BDE SC3.3 Workshop - Wind Farm Monitoring and advanced analytics
 BDE SC3.3 Workshop - Wind Farm Monitoring and advanced analytics  BDE SC3.3 Workshop - Wind Farm Monitoring and advanced analytics
BDE SC3.3 Workshop - Wind Farm Monitoring and advanced analytics
 
Big Data Europe: Workshop 3 SC6 Social Science: THE IMPORTANCE OF METADATA & ...
Big Data Europe: Workshop 3 SC6 Social Science: THE IMPORTANCE OF METADATA & ...Big Data Europe: Workshop 3 SC6 Social Science: THE IMPORTANCE OF METADATA & ...
Big Data Europe: Workshop 3 SC6 Social Science: THE IMPORTANCE OF METADATA & ...
 
BDE SC1 Workshop 3 - BigMedilytics Overview (Supriyo Chatterjea)
BDE SC1 Workshop 3 - BigMedilytics Overview (Supriyo Chatterjea)BDE SC1 Workshop 3 - BigMedilytics Overview (Supriyo Chatterjea)
BDE SC1 Workshop 3 - BigMedilytics Overview (Supriyo Chatterjea)
 
BDE SC1 Workshop 3 - iASiS (Guillermo Palma)
BDE SC1 Workshop 3 - iASiS (Guillermo Palma)BDE SC1 Workshop 3 - iASiS (Guillermo Palma)
BDE SC1 Workshop 3 - iASiS (Guillermo Palma)
 
BDE SC1 Workshop 3 - MIDAS (Michaela Black)
BDE SC1 Workshop 3 - MIDAS (Michaela Black)BDE SC1 Workshop 3 - MIDAS (Michaela Black)
BDE SC1 Workshop 3 - MIDAS (Michaela Black)
 
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
 
BDE SC1 Workshop 3 - Big Data Europe (Simon Scerri)
BDE SC1 Workshop 3 - Big Data Europe (Simon Scerri)BDE SC1 Workshop 3 - Big Data Europe (Simon Scerri)
BDE SC1 Workshop 3 - Big Data Europe (Simon Scerri)
 

Último

怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
gajnagarg
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
gajnagarg
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 

Último (20)

怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptxRESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
RESEARCH-FINAL-DEFENSE-PPT-TEMPLATE.pptx
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 

Release webinar: Sansa and Ontario

  • 1. Distributed Knowledge Graph Processing in SANSA and Ontario Prof. Dr. Jens Lehmann Big Data Europe Platform ReleaseMay 3rd 2017
  • 2. SANSA: Motivation ◎ Abundant machine readable structured information is available (e.g. in RDF) o Across BDE Societal Challenges, e.g. for Life Science Data o General: DBpedia, Google knowledge graph o Social graphs: Facebook, Twitter ◎ Need for scalable querying, inference and machine learning o Link prediction o Knowledge base completion o Predictive analytics o Forward chaining inference 2
  • 3.
  • 4. SANSA Stack 4 ◎ SANSA includes several libraries: o Read / Write RDF / OWL library o Querying library o Inference library o ML- Machine Learning core library http://sansa-stack.net/ ◎ Framework for distributed RDF dataset processing aiming at high scalability and fault tolerance
  • 6. SANSA: Read Write Layer ◎ Ingest RDF and OWL data in different formats using Jena / OWL API style interfaces ◎ Goal: Represent/distribute data in multiple formats (e.g. RDD, Data Frames, GraphX, Tensors) ◎ Goal: Allow transformation among these formats ◎ Compute dataset statistics and apply functions to URIs, literals, subjects, objects → Distributed LODStats 6
  • 8. SANSA: Query Layer ◎ Improve performance of generic queries via: o Intelligent indexing o Splitting strategies o Distributed Storage ◎ SPARQL-to-SQL approaches, Virtual Views ◎ Next Step: SPARQL query engine evaluation ◎ SANSA provides a W3C SPARQL compliant endpoint 8
  • 10. SANSA: Inference Layer ◎ W3C Standards for Modelling: RDFS and OWL ◎ Parallel in-memory inference via rule-based forward chaining ◎ Beyond state of the art: dynamically build a rule dependency graph for a rule set ◎ → Adjustable performance levels 10
  • 12. SANSA: ML Layer ◎ Distributed Machine Learning (ML) algorithms that work on RDF data and make use of its structure / semantics ◎ Work in Progress: o Tensor Factorization for e.g. KB completion (testing stage) o Graph Clustering (testing stage) o Association rule mining (evaluation stage) o Semantic Decision trees (idea stage) o Inference in Knowledge Graph Embeddings (idea stage) 12
  • 13. SANSA Status and Releases ◎ A generic stack for (big) Linked Data o Build on top of a state-of-the-art distributed frameworks (Spark, Flink) o Easy to use: just add the dependencies to your existing Spark or Flink project ◎ Out-of-the-box framework for (1) querying, (2) inference and (3) machine learning over RDF datasets ◎ 0.1 Release in December ◎ Subsequent releases every 6 months 13
  • 14. Semantic Data Lake (Ontario) ◎ Data Lake or Swamp? o Repository of data in its original formats o Structured, semi-structured, unstructured o Without unified schema ◎ Semantic Data Lake o Add a Semantic Layer on top of the source datasets ❖ The data is semantically lifted using ontology terms ❖ Provide a uniform view over nonuniform data 14
  • 15. Semantic Data Lake (Ontario) ◎ A SPARQL query is decomposed into sub-queries o Keeping links between sub-queries to collect data returned later from each sub-query ◎ Data Lake is checked for candidate data sources that can answer each sub-query (consulting metadata) ◎ Each SPARQL sub-query is translated to a query on the query language of the selected candidate data source o Examples: SPARQL to SQL, or SPARQL to CQL ◎ Sub-resultes are joined together to form the end result 15
  • 16. Metadata property -> data source (type) Semantic Data Lake (Ontario) 16 Decomposing User QuerySPARQL query Database XML File ?item gho:Country ?country . ?item gho:Disease ?disease . ... SELECT country, disease, ... FROM Observations Finding Relevant Data Sources + Converting Queries SQL XPathSQL MongoDB JSON Path SQL XML MongoDB
  • 17. Semantic Data Lake (Ontario) 17 Database XML File Results Reconciliation Execution Plan Links between sub-queries MongoDB Final Results
  • 18. Big Data Europe Integrator Platform Launch Wednesday 3 May @ 15:00 CEST Please type your questions at any time. Q&A will follow the presentations
  • 20. SANSA: Motivation 20 ◎ Over the last years, the size of the Semantic Web has increased and several large-scale datasets were published. Source: LOD-Cloud (http://lod-cloud.net/ ) ◎ Now days hadoop ecosystem has become a standard for BigData applications. ◎ We use this infrastructure for Semantic Web as well.
  • 22. Thank You @Gezim_Sejdiu http://sda.cs.uni-bonn.de/gezim-sejdiu/ SANSA Semantic Analytics Stack https://github.com/SANSA-Stack ● SANSA-RDF ● SANSA-OWL ● SANSA-Query ● SANSA-Inference ● SANSA-ML ● SANSA-Examples
  • 23. SANSA Planning ◎ Current state: o SANSA 0.1 released in December o Can read RDF/OWL files, compute statistics, simple queries, lightweight inference, graph clustering, rule mining ◎ Subsequent releases every 6 months o SANSA 0.2 release planned in June 2017. 23
  • 24. Conclusions and Next steps ◎ A generic stack for (big) Linked Data o Build on top of a state-of-the-art distributed frameworks (Spark, Flink) ◎ Out-of-the-box framework for scalable and distributed semantic data analysis combining semantic web and distributed machine learning for (1) querying, (2) inference and (3) analytics of RDF datasets. ◎ Next steps o Refinement of data structures (RDF/OWL Layer) o Add support for SPARQL 1.1 and other backend strategies (Query Layer) o Define a ML pipelines for Structured ML (ML Layer). 24
  • 25. BDE & General Integration ◎ SANSA = Scala / Maven Repositories based on Spark / Flink ◎ Easy to include both in BDE platform and any Spark / Flink environment 25

Notas del editor

  1. The traditional Machine learning methods operate on simple feature vector based input and lack expressive and understandable outcomes. The methods like statistical relational learning and Inductive logic are expressive but lack scalability
  2. Goal: Build a semantic analytics stack which allows you to perform distributed: Querying Inference Analytics on RDF datasets Existing Open Source libraries either: Lack Community Lack Documentation and Examples Lack Scalability Or are research-oriented Working: A Library built on top of Spark and Flink Consists of several APIs Read / Write: RDF / OWL API for RDF/OWL operations, Querying API: supports a query language on top of distributed RDF library, Inference API : Implementation of a rule based reasoner using the querying API ML- Machine Learning Core Library