SlideShare una empresa de Scribd logo
1 de 23
Creating and Utilizing Linked Open Statistical 
Data for the Development of Advanced 
Analytics Services 
E. Kalampokis, A. Karamanou, A. Nikolov, P. Haase, R. Cyganiak, B. Roberts, P. 
Hermans, E. Tambouris, K. Tarabanis
 A major part of Open Data concerns statistics that can be formulated as 
data cubes. 
 The objective of this paper is to present the OpenCube approach for 
working with linked data cubes. 
 The ultimate goal of OpenCube is to facilitate 
 Publishing of high-quality linked statistical data 
 Reusing linked statistical datasets in visualizations and analytics 
2 
Objective 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
Linked Statistical Data Lifecycle 
 OpenCube develops 
components to support the 
whole lifecycle of linked 
statistical data. 
 The lifecycle describes steps that 
raw data cubes should go 
through in order to create value. 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014 
3
 Different steps of the lifecycle are realized by separate components. 
 Two different implementation approaches are considered based on 
the underlying platform. 
 fluidOps’ Information Workbench 
 Swirrl’s PublishMyData 
 Extensions for the commercial platforms and an Open-Source toolkit. 
4 
Implementation 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
 Publishing components 
 TARQL extension 
 D2RQ /R2RML-QB extension 
 JSON-stat 
 Grafter 
 Consuming components 
 Data catalogue 
 OpenCube Browser 
 OpenCube MapView 
 R Analysis Chart 
 Aggregation component 
5 
Components 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
 TARQL is a command-line tool for 
converting CSV files to RDF using 
SPARQL 1.1 syntax 
 https://github.com/cygri/tarql 
 TARQL is a SPARQL based data 
mapping language. 
 The OpenCube TARQL extension 
enables RDF data cubes 
construction from CSV files. 
 Redesigned TARQL API 
 Added streaming evaluation mode 
 It will be integrated to the IWB 
platform very soon. 
6 
TARQL OpenCube Extension 
Components description
 The D2RQ OpenCube component 
enables the generation of RDF data 
cubes from relational tables. 
 It builds upon the D2RQ open source 
platform and it leverages R2RML 
language. 
 The component will be integrated 
into the IWB platform and it will 
provide an easy to use interface to 
adjust output mapping. 
7 
D2RQ/R2RML-QB Extension 
Components description
 The JSON-stat format is a simple lightweight JSON format for 
multidimensional data. 
 http://json-stat.org/format/ 
 A JSON-stat file can contain one or more datasets. 
 Multiple datasets responses allow a provider to disseminate 
information with few common dimensions in a single response. 
8 
JSON-stat 
Components description
 Open source software framework for transforming tabular data (CSV 
or XLS) to RDF 
 http://grafter.org 
 Automatable/works with API 
 Designed to support a graphical user interface (work in progress) 
 Performs well with large datasets 
9 
Grafter 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
WORLD BANK 
Managing metadata over data cubes 
• Data collection 
• Integration of major open data catalogs 
• UI for search and exploration of data sets 
• Rich metadata based on open standards 
• Both descriptive and structural metadata 
• Self-service UI 
• Custom queries and visualizations 
• Widgets, dashboarding, etc.
Data catalogue management 
 Managing catalogues of datasets 
 Search & discovery of relevant data 
 Goal: on-demand provisioning 
19-20 May 2014 OpenCube plenary meeting 11
 It enables the exploration of an RDF data cube by presenting a two-dimensional 
slice of the cube as a table. 
 The slice is created by setting a fixed values for each dimension that 
is not presented in the table. 
 The browser is integrated in both IWB and PublishMyData platform. 
12 
OpenCube browser 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
13 
OpenCube browser (IWB extension) 
Summarize 
observations across a 
dimension 
(dimension 
reduction) 
Change the 
axes of the 
table 
18-19 November 2013 OpenCube kick-off meeting 
Change the 
language 
Change the 
fixed values
Data cube grid view (PublishMyData extension) 
 See http://opendatacommunities.org for live examples
Data cube grid view 
 Shows two dimensional slice of data 
 Controls to set values of other dimensions 
 Download chosen slice as CSV 
 Performs well with large datasets by loading data asynchronously as 
users scrolls through 
 See http://opendatacommunities.org for live examples
 It enables the visualization of RDF data cubes on a map based on their 
geospatial dimension. 
 It supports: 
 Markers 
 Bubble 
 Choropleth maps (need for polygons) 
 It is integrated in both 
 IWB and 
 PublishMyData 
16 
OpenCube MapView 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
Choropleth map in PublishMyData
Support for advanced analytic tasks 
 Reuse of existing established 
tools to support advanced 
analytic tasks 
 Loose coupling integration with 
R 
 R is accessed as a web service 
 Rich analysis capabilities (all 
packages developed by the R 
community) 
18-19 November 2013 OpenCube kick-off meeting 18
Integration with R 
 Visualisation of analysis results (charts & tables) 
 Reuse of analysis results: preserving R output as linked data 
 Managing a catalogue of the analytics experiments („recipes“) 
19-20 May 2014 OpenCube plenary meeting 19
 Widget-based visualization of data 
 Pre-existing: Configuration using 
explicit SPARQL queries 
 More appropriate for engineers building 
custom solutions than for end users 
 Goal: Intuitive configuration of 
visualization views exploiting the 
Data Cube structure 
Data Cube Visualization 
Visualization and Exploration 
Analytics and Reporting 
18-19 November 2013 OpenCube kick-off meeting 20
Stock chart visualization 
 Adaptation of the stock chart view to the RDF data cube datasets 
 Improved configuration UI 
 specifying dimension restrictions instead of the complete SPARQL query 
 Additional features (e.g., comparison between slices) 
19-20 May 2014 OpenCube plenary meeting 21
We currently perform evaluations of the components in four pilots 
 Department for Communities and Local Government (UK) 
 Central Statistics Office (Ireland) 
 Flemish Government (Belgium) 
 Swiss Banks 
 Some interesting findings 
 Why to use linked data 
 Performance issues with large data sets 
 Noisy data 
22 
Initial Evaluation Results 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
 For more information 
 http://opencube-project.eu 
 http://opencube-toolkit.eu 
23 
OpenCube toolkit 
19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014

Más contenido relacionado

La actualidad más candente

Spatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use CasesSpatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use Cases
mathieuraj
 
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Monika Solanki
 

La actualidad más candente (14)

Spatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use CasesSpatial Data Integrator - Software Presentation and Use Cases
Spatial Data Integrator - Software Presentation and Use Cases
 
Yet another population cartogram: Creating gridded cartograms using ArcGIS an...
Yet another population cartogram: Creating gridded cartograms using ArcGIS an...Yet another population cartogram: Creating gridded cartograms using ArcGIS an...
Yet another population cartogram: Creating gridded cartograms using ArcGIS an...
 
Dr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GISDr Richard Fry - Using R as a GIS
Dr Richard Fry - Using R as a GIS
 
Team 5: Open Land Use Metadata Harvesting on NextGEOSS
Team 5: Open Land Use Metadata Harvesting on NextGEOSSTeam 5: Open Land Use Metadata Harvesting on NextGEOSS
Team 5: Open Land Use Metadata Harvesting on NextGEOSS
 
Maps with leafletR
Maps with leafletRMaps with leafletR
Maps with leafletR
 
MOCHA 2018 Challenge @ ESWC2018
MOCHA 2018 Challenge @ ESWC2018MOCHA 2018 Challenge @ ESWC2018
MOCHA 2018 Challenge @ ESWC2018
 
What's new in Spark 2.0?
What's new in Spark 2.0?What's new in Spark 2.0?
What's new in Spark 2.0?
 
European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016
 
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
From Simple Features to Moving Features and Beyond? at OGC Member Meeting, Se...
 
Graph database
Graph databaseGraph database
Graph database
 
Supervised Papers Classification on Large-Scale High-Dimensional Data with Ap...
Supervised Papers Classification on Large-Scale High-Dimensional Data with Ap...Supervised Papers Classification on Large-Scale High-Dimensional Data with Ap...
Supervised Papers Classification on Large-Scale High-Dimensional Data with Ap...
 
General Presentation European Data Portal
General Presentation European Data PortalGeneral Presentation European Data Portal
General Presentation European Data Portal
 
Flink Forward Berlin 2018: Piotr Wawrzyniak & Jarosław Legierski - "Using Apa...
Flink Forward Berlin 2018: Piotr Wawrzyniak & Jarosław Legierski - "Using Apa...Flink Forward Berlin 2018: Piotr Wawrzyniak & Jarosław Legierski - "Using Apa...
Flink Forward Berlin 2018: Piotr Wawrzyniak & Jarosław Legierski - "Using Apa...
 
Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...Towards maintainable constraint validation and repair for taxonomies: The Poo...
Towards maintainable constraint validation and repair for taxonomies: The Poo...
 

Similar a Creating and Utilizing Linked Open Statistical Data for the Development of Advanced Analytics Services

Database Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wiDatabase Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wi
OllieShoresna
 
Dublinked tech workshop_15_dec2011
Dublinked tech workshop_15_dec2011Dublinked tech workshop_15_dec2011
Dublinked tech workshop_15_dec2011
Dublinked .
 
The habitats approach to build the inspire infrastructure
The habitats approach to build the inspire infrastructureThe habitats approach to build the inspire infrastructure
The habitats approach to build the inspire infrastructure
Karel Charvat
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
Sanjay Padhi, Ph.D
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Willy Marroquin (WillyDevNET)
 

Similar a Creating and Utilizing Linked Open Statistical Data for the Development of Advanced Analytics Services (20)

Database Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wiDatabase Integrated Analytics using R InitialExperiences wi
Database Integrated Analytics using R InitialExperiences wi
 
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
Apache Big_Data Europe event: "Demonstrating the Societal Value of Big & Smar...
 
Ss eb29
Ss eb29Ss eb29
Ss eb29
 
SC1 Workshop 2 General Introduction to BDE
SC1 Workshop 2 General Introduction to BDESC1 Workshop 2 General Introduction to BDE
SC1 Workshop 2 General Introduction to BDE
 
BDE SC3.3 Workshop - BDE Platform: Technical overview
 BDE SC3.3 Workshop -  BDE Platform: Technical overview BDE SC3.3 Workshop -  BDE Platform: Technical overview
BDE SC3.3 Workshop - BDE Platform: Technical overview
 
UnifiedViews: Towards ETL Tool for Simple yet Powerful RDF Data Management.
UnifiedViews: Towards ETL Tool for Simple yet Powerful RDF Data Management.UnifiedViews: Towards ETL Tool for Simple yet Powerful RDF Data Management.
UnifiedViews: Towards ETL Tool for Simple yet Powerful RDF Data Management.
 
BDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWCBDE SC6-ws-05/12/2016 technology part - SWC
BDE SC6-ws-05/12/2016 technology part - SWC
 
BDE SC6-hang out - technology part-SWC - Martin
BDE SC6-hang out - technology part-SWC - MartinBDE SC6-hang out - technology part-SWC - Martin
BDE SC6-hang out - technology part-SWC - Martin
 
Dublinked tech workshop_15_dec2011
Dublinked tech workshop_15_dec2011Dublinked tech workshop_15_dec2011
Dublinked tech workshop_15_dec2011
 
BDE SC6 workshop - introduction 2016
BDE SC6 workshop - introduction 2016BDE SC6 workshop - introduction 2016
BDE SC6 workshop - introduction 2016
 
The habitats approach to build the inspire infrastructure
The habitats approach to build the inspire infrastructureThe habitats approach to build the inspire infrastructure
The habitats approach to build the inspire infrastructure
 
RDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival dataRDF-Gen: Generating RDF from streaming and archival data
RDF-Gen: Generating RDF from streaming and archival data
 
Benchmarking of distributed linked data streaming systems
Benchmarking of distributed linked data streaming systemsBenchmarking of distributed linked data streaming systems
Benchmarking of distributed linked data streaming systems
 
Geohosting
GeohostingGeohosting
Geohosting
 
Planetdata simpda
Planetdata simpdaPlanetdata simpda
Planetdata simpda
 
PlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web ScalePlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web Scale
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
 
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
Accelerating R analytics with Spark and  Microsoft R Server  for HadoopAccelerating R analytics with Spark and  Microsoft R Server  for Hadoop
Accelerating R analytics with Spark and Microsoft R Server for Hadoop
 
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S...
 
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
Lodstats: The Data Web Census Dataset. Kobe, Japan, 2016
 

Último

怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
vexqp
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
cnajjemba
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
ptikerjasaptiker
 

Último (20)

SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 

Creating and Utilizing Linked Open Statistical Data for the Development of Advanced Analytics Services

  • 1. Creating and Utilizing Linked Open Statistical Data for the Development of Advanced Analytics Services E. Kalampokis, A. Karamanou, A. Nikolov, P. Haase, R. Cyganiak, B. Roberts, P. Hermans, E. Tambouris, K. Tarabanis
  • 2.  A major part of Open Data concerns statistics that can be formulated as data cubes.  The objective of this paper is to present the OpenCube approach for working with linked data cubes.  The ultimate goal of OpenCube is to facilitate  Publishing of high-quality linked statistical data  Reusing linked statistical datasets in visualizations and analytics 2 Objective 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 3. Linked Statistical Data Lifecycle  OpenCube develops components to support the whole lifecycle of linked statistical data.  The lifecycle describes steps that raw data cubes should go through in order to create value. 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014 3
  • 4.  Different steps of the lifecycle are realized by separate components.  Two different implementation approaches are considered based on the underlying platform.  fluidOps’ Information Workbench  Swirrl’s PublishMyData  Extensions for the commercial platforms and an Open-Source toolkit. 4 Implementation 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 5.  Publishing components  TARQL extension  D2RQ /R2RML-QB extension  JSON-stat  Grafter  Consuming components  Data catalogue  OpenCube Browser  OpenCube MapView  R Analysis Chart  Aggregation component 5 Components 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 6.  TARQL is a command-line tool for converting CSV files to RDF using SPARQL 1.1 syntax  https://github.com/cygri/tarql  TARQL is a SPARQL based data mapping language.  The OpenCube TARQL extension enables RDF data cubes construction from CSV files.  Redesigned TARQL API  Added streaming evaluation mode  It will be integrated to the IWB platform very soon. 6 TARQL OpenCube Extension Components description
  • 7.  The D2RQ OpenCube component enables the generation of RDF data cubes from relational tables.  It builds upon the D2RQ open source platform and it leverages R2RML language.  The component will be integrated into the IWB platform and it will provide an easy to use interface to adjust output mapping. 7 D2RQ/R2RML-QB Extension Components description
  • 8.  The JSON-stat format is a simple lightweight JSON format for multidimensional data.  http://json-stat.org/format/  A JSON-stat file can contain one or more datasets.  Multiple datasets responses allow a provider to disseminate information with few common dimensions in a single response. 8 JSON-stat Components description
  • 9.  Open source software framework for transforming tabular data (CSV or XLS) to RDF  http://grafter.org  Automatable/works with API  Designed to support a graphical user interface (work in progress)  Performs well with large datasets 9 Grafter 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 10. WORLD BANK Managing metadata over data cubes • Data collection • Integration of major open data catalogs • UI for search and exploration of data sets • Rich metadata based on open standards • Both descriptive and structural metadata • Self-service UI • Custom queries and visualizations • Widgets, dashboarding, etc.
  • 11. Data catalogue management  Managing catalogues of datasets  Search & discovery of relevant data  Goal: on-demand provisioning 19-20 May 2014 OpenCube plenary meeting 11
  • 12.  It enables the exploration of an RDF data cube by presenting a two-dimensional slice of the cube as a table.  The slice is created by setting a fixed values for each dimension that is not presented in the table.  The browser is integrated in both IWB and PublishMyData platform. 12 OpenCube browser 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 13. 13 OpenCube browser (IWB extension) Summarize observations across a dimension (dimension reduction) Change the axes of the table 18-19 November 2013 OpenCube kick-off meeting Change the language Change the fixed values
  • 14. Data cube grid view (PublishMyData extension)  See http://opendatacommunities.org for live examples
  • 15. Data cube grid view  Shows two dimensional slice of data  Controls to set values of other dimensions  Download chosen slice as CSV  Performs well with large datasets by loading data asynchronously as users scrolls through  See http://opendatacommunities.org for live examples
  • 16.  It enables the visualization of RDF data cubes on a map based on their geospatial dimension.  It supports:  Markers  Bubble  Choropleth maps (need for polygons)  It is integrated in both  IWB and  PublishMyData 16 OpenCube MapView 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 17. Choropleth map in PublishMyData
  • 18. Support for advanced analytic tasks  Reuse of existing established tools to support advanced analytic tasks  Loose coupling integration with R  R is accessed as a web service  Rich analysis capabilities (all packages developed by the R community) 18-19 November 2013 OpenCube kick-off meeting 18
  • 19. Integration with R  Visualisation of analysis results (charts & tables)  Reuse of analysis results: preserving R output as linked data  Managing a catalogue of the analytics experiments („recipes“) 19-20 May 2014 OpenCube plenary meeting 19
  • 20.  Widget-based visualization of data  Pre-existing: Configuration using explicit SPARQL queries  More appropriate for engineers building custom solutions than for end users  Goal: Intuitive configuration of visualization views exploiting the Data Cube structure Data Cube Visualization Visualization and Exploration Analytics and Reporting 18-19 November 2013 OpenCube kick-off meeting 20
  • 21. Stock chart visualization  Adaptation of the stock chart view to the RDF data cube datasets  Improved configuration UI  specifying dimension restrictions instead of the complete SPARQL query  Additional features (e.g., comparison between slices) 19-20 May 2014 OpenCube plenary meeting 21
  • 22. We currently perform evaluations of the components in four pilots  Department for Communities and Local Government (UK)  Central Statistics Office (Ireland)  Flemish Government (Belgium)  Swiss Banks  Some interesting findings  Why to use linked data  Performance issues with large data sets  Noisy data 22 Initial Evaluation Results 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014
  • 23.  For more information  http://opencube-project.eu  http://opencube-toolkit.eu 23 OpenCube toolkit 19 October 2014, Riva del Garda, Italy ISWC 2014 – SemStats 2014