From Data Platforms to Dataspaces:
Enabling Data Ecosystems for Intelligent Systems
Edward Curry
Insight @ NUI Galway
edward.curry@nuigalway.ie
Open Access Book
Contents
Part I: Fundamentals and Concepts
Part II: Data Support Services
Part III: Stream and Event Processing Services
Part IV: Intelligent Systems and Applications
Part V: Future Directions
Team
http://dataspaces.info
Part I: Fundamentals and Concepts
Data Driven Innovations
Digital Twins: A digital replica of physical
assets (car), processes (value-chain), systems,
or physical environments (building). The
digital representation (i.e. simulation
modelling or data-driven models) provided by
the digital twin can be analysed to optimise
the operation of the “physical twin”.
Physical-Cyber-Social (PCS): A computing
paradigm that supports a richer human
experience with a holistic data-rich view of
the smart environment that integrates,
correlates, interprets, and provides
contextually relevant abstractions to humans.
Mass Personalisation: More human-centric
thinking in the design of systems where users
have growing expectations for highly
personalised digital services for the “Market
of One”.
Data Network Effects: As more systems/users
join and contribute data to the smart
environment, a “network effect” can take
place, resulting in the overall data available
becoming more valuable.
Digital Twins
[Figure: the Physical Twin (asset-centric) in the real world and the Digital Twin (system-centric) in the digital world. Figure labels: Sensors, Actuators, Observe, Orient, Decide, Act.]
Connected Intelligent Systems
Value Chains in Data Ecosystems
Data Management Challenges
• Pay-as-you-go Data Integration, Accessibility, and Sharing
– Standard data syntax, semantics, and linkage: Facilitate integration and sharing, ideally with open standards
and non-proprietary approaches.
– Single-point data discoverability and accessibility: Allow the organisation and access to datasets and
metadata through a single location.
– Incremental data management: Enable a low barrier to entry and a pay-as-you-go paradigm to minimise
costs.
• Secure Access Control: Support data access rights to preserve the security of data and privacy of
users in the smart environment.
• Real-time Data Processing and Historical Querying
– Real-time data processing: Including ingestion, aggregation, and pattern detection within event streams
originating from sensors and things in the smart environment.
– Unified querying of real-time data and historical data: Provide applications and end-users with a holistic
queryable state of the smart environment at a latency suitable for user interaction.
• Entity-centric Data Views
– Entity management: The storage, linkage, curation, and retrieval of entity data, such as users, zones, and
locations.
– Event enrichment: Enhancement of sensor/things streams with contextual data (e.g. entities) to make the
stream data more encapsulated and useful in downstream processing.
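The event enrichment challenge above can be illustrated with a minimal sketch. The `Entity` shape, the `SENSOR_TO_ENTITY` lookup, and the event fields are all hypothetical names chosen for illustration; the slides do not specify a data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Contextual entity data (hypothetical shape, not taken from the slides)."""
    entity_id: str
    kind: str  # e.g. "zone", "user", "location"
    attributes: dict = field(default_factory=dict)


# Hypothetical lookup: which managed entity a given sensor observes.
SENSOR_TO_ENTITY = {
    "water-meter-17": Entity("zone-kitchen-2", "zone", {"building": "office", "floor": 2}),
}


def enrich(event: dict, sensor_index: dict) -> dict:
    """Return a copy of the event with entity context attached, if known."""
    enriched = dict(event)  # never mutate the raw event
    entity = sensor_index.get(event.get("sensor_id"))
    if entity is not None:
        enriched["entity"] = {
            "id": entity.entity_id,
            "kind": entity.kind,
            **entity.attributes,
        }
    return enriched


raw = {"sensor_id": "water-meter-17", "litres": 3.2, "ts": "2020-02-26T10:00:00Z"}
enriched = enrich(raw, SENSOR_TO_ENTITY)
```

Downstream consumers can then filter or aggregate on `enriched["entity"]` without a second lookup; events from unknown sensors pass through unchanged.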
The “gold mining” metaphor applied to data processing
Traditional Approaches to Data Integration
[Figure: plot of frequency of use (popularity/use) and of the cost of administration & semantic integration using traditional approaches, each on a low-to-high scale, against the number of data sources, entities, attributes.]
Data is Key to AI…Data Platforms will Fuel AI Decisions
Data Generation
and Analysis
(including IoT)
Data Platforms
(Access and Portability)
AI and Decision Platforms
IoT-Enablement: A Data Sharing Layer is Needed
• Layer 1 - Communication and Sensing: IPv6, Wi-Fi, RFID, CoAP, AVB, etc.
• Layer 2 - Middleware: Peer-to-Peer, Events, Pub/Sub, SOA, SDN, etc.
• Layer 3 - Data: Schema, Entities, Catalog, Sharing, Access/Control, etc.
• Layer 4 - Intelligent Apps, Analytics, and Users: Predictive Analytics, Situation Awareness, Decision Support, Digital Twin, Machine Learning, Users
Figure labels (data sources): Datasets, Things / Sensors, Contextual Data Sources (including legacy systems)
Adapted from: L. Atzori, A. Iera, and G. Morabito, “The
Internet of Things: A survey,” Comput. Networks, vol. 54,
no. 15, pp. 2787–2805, Oct. 2010.
Cost of Data Management Solutions
Administrative Proximity:
– With close control many assumptions
can hold concerning guarantees such
as data quality and consistency.
– Far control refers to a loosely coupled
environment and a lack of
coordination on the data sources.
Semantic Integration
– Degree to which data schemas are
matched up (types, attributes, and
names).
– All data conform to an agreed-upon
schema vs. no schema information.
This dimension is relevant to how
much semantically rich querying can
be done.
Halevy, A., Franklin, M. and Maier, D. 2006. Principles of dataspace
systems. 25th ACM SIGMOD-SIGACT-SIGART symposium on Principles of
database systems - PODS ’06 (New York, New York, USA, 2006), 1–9.
(Real-time Linked) Dataspace
Principles (adapted from Halevy et al.):
• Must deal with many different formats of
streams and events.
• Does not subsume the stream and event
processing engines; they still provide
individual access via their native interfaces.
• Queries are provided on a best-effort
and approximate basis.
• Must provide pathways to improve the
integration among the data sources,
including streams and events, in a
pay-as-you-go fashion.
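A minimal sketch of the best-effort principle, under stated assumptions: each source is modelled as a plain callable standing in for its native interface, and the source names and the `best_effort_query` helper are hypothetical. The point is only that one unavailable source does not fail the whole query.

```python
def catalog_source(keyword):
    """Stand-in for a dataset catalog with its own native lookup."""
    rows = [
        {"name": "Water meter A", "zone": "kitchen"},
        {"name": "Energy meter B", "zone": "lab"},
    ]
    return [r for r in rows if keyword.lower() in r["name"].lower()]


def failing_stream_source(keyword):
    """Stand-in for a stream engine that is currently unreachable."""
    raise TimeoutError("stream engine unavailable")


def best_effort_query(keyword, sources):
    """Query every source; collect what answers and record what does not."""
    results, skipped = [], []
    for name, source in sources.items():
        try:
            for row in source(keyword):
                results.append({"source": name, **row})
        except Exception as exc:
            # Best effort: a failing source is reported, not fatal.
            skipped.append((name, type(exc).__name__))
    return results, skipped


results, skipped = best_effort_query(
    "water",
    {"catalog": catalog_source, "stream": failing_stream_source},
)
```

Returning `skipped` alongside the results lets a caller see that the answer is partial, which is the trade-off a best-effort, approximate query makes.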
Dataspace
“Dataspaces are not a data integration
approach; rather, they are more of a data
co-existence approach. The goal of dataspace
support is to provide base functionality over
all data sources, regardless of how integrated
they are.” (Halevy, A., Franklin, M. and Maier, D. 2006.)
Real-time Linked Dataspace (RLD)
Enabling platform for data management for
intelligent systems within smart environments
that combines the pay-as-you-go paradigm of
dataspaces, linked data, and knowledge
graphs with entity-centric real-time query
capabilities.
Approximate and Best Effort Approaches
[Figure: the same plot of frequency of use (popularity/use) and cost of administration & semantic integration using traditional approaches, each on a low-to-high scale, against the number of data sources, entities, attributes, with a region labelled "Approximate & best-effort approaches".]
Architecture of Real-time Linked Dataspace
• Support Platform: Responsible for providing
the functionalities and services essential for
managing the dataspace.
• Things / Sensors: Produce real-time data
streams that need to be processed & managed.
• Data Sources: Available in a wide variety of
formats and accessible through different
systems interfaces.
• Managed Entities: Actively managed entities
including their relationship to participating
things, data sources, and other entities.
• Intelligent Applications, Analytics, & Users:
Leverage the RLD's data and services to provide
data analytics, decision support tools, user
interfaces, and data visualisations.
Pay-as-you-Go Tiered Data Model
• Provides flexibility by reducing
the initial cost and barriers to
joining the dataspace.
• Specialisation of the 5 star
scheme defined by
Tim Berners-Lee.
• Over time the level of integration
with the support services can be
improved in an incremental
manner on an as-needed basis.
• The more investment made to
integrate with the support
services, the better the integration
achievable in the dataspace.
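As a rough illustration of an incremental, tiered rating, the sketch below scores a data source against the general 5 star open data scheme (on the web under an open licence, structured, open format, URIs, linked to other data). It does not reproduce the dataspace-specific tiers, which the slides do not define; `SourceProfile` and `star_rating` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SourceProfile:
    """Self-declared properties of a data source (hypothetical shape)."""
    on_web_open_licence: bool = False   # 1 star
    structured: bool = False            # 2 stars
    open_format: bool = False           # 3 stars
    uses_uris: bool = False             # 4 stars
    linked_to_other_data: bool = False  # 5 stars


def star_rating(profile: SourceProfile) -> int:
    """Count stars cumulatively: a level counts only if all lower levels hold."""
    levels = [
        profile.on_web_open_licence,
        profile.structured,
        profile.open_format,
        profile.uses_uris,
        profile.linked_to_other_data,
    ]
    stars = 0
    for satisfied in levels:
        if not satisfied:
            break
        stars += 1
    return stars


csv_export = SourceProfile(on_web_open_licence=True, structured=True, open_format=True)
rating = star_rating(csv_export)
```

A source can join at a low rating and move up one level at a time, which mirrors the as-needed improvement described above.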
Service Tiers for Support Services
Part II: Data Support Services
Part III: Stream and Event Processing Services
Data Self-Management
Techniques for:
• Self-Configuration
• Self-Healing
• Self-Optimizing
Automatic Source
Selection
• Source Selection
• Source Replacement
• Model Selection
• Model Training
• Parameterization
Entity Data Management and Humans in the Loop
Enables Users in the Smart
Environment to participate in
data management tasks
• Collection & Enrichment
• Mapping & Matching
• Operator Evaluation
• Feedback & Refinement
• Citizen Actuation
Key HIL Challenges
• Task Specification (simplicity)
• Interaction Mechanism
• Task Assignment (Geospatial,
expertise)
Semantic Approximation Matching of Streams
Challenges
• Heterogeneity in Event
Semantics (thousands of schemas)
• Heterogeneity in processing
Rules (thousands of rules tied to
schemas)
Approx. Semantic Event Matcher
• Sub-symbolic Distributional
Event Semantics
• Enables pay-as-you-go event
matching for data streams
• Replaced 48,000 exact rules with
100 approximate rules with
around 85% accuracy
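The idea of approximate matching can be sketched with cosine similarity over term vectors. This is an illustration only, not the matcher from the slides: the three-dimensional vectors are hand-made toy values, and the `approx_match` helper and its threshold are assumptions. A real system would derive its vectors from a text corpus.

```python
import math

# Toy term vectors, hand-made for illustration only.
VECTORS = {
    "energy": (0.9, 0.1, 0.0),
    "power": (0.8, 0.2, 0.0),
    "electricity": (0.85, 0.15, 0.05),
    "water": (0.1, 0.9, 0.0),
    "temperature": (0.0, 0.1, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def approx_match(rule_term, event_term, threshold=0.95):
    """True if the two terms are identical or close enough in vector space."""
    if rule_term == event_term:
        return True
    va, vb = VECTORS.get(rule_term), VECTORS.get(event_term)
    if va is None or vb is None:
        return False  # unknown terms can only match exactly
    return cosine(va, vb) >= threshold


matches_power = approx_match("energy", "power")
matches_water = approx_match("energy", "water")
```

One rule written against "energy" can then also fire for events that say "power" or "electricity", which is how a small set of approximate rules can stand in for many exact ones, at the cost of occasional wrong matches.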
Part IV: Intelligent Systems and Applications
Airport: Linate Airport, Milan, Italy
Target users:
• Corporate users
• ~9.5 million passengers
• Utilities management
• Maintenance staff
• Environmental managers
Infrastructure:
• Safety critical
• 10 km water network
• Multiple buildings
• Water meters
• Energy meters
• Legacy systems

Office: Insight, Galway, Ireland
Target users:
• 130 staff
• Office consumers
• Operations managers
• Utility providers
• Building managers
Infrastructure:
• 2190 m² space
• 22 offices + 160 open plan spaces
• Conference room
• 4 meeting rooms
• 3 kitchens
• Data centre
• 30 person café
• Energy meters

Home: Houses, Thermi, Greece
Target users:
• Domestic consumers (adults, young adults and children)
• Utility providers
Infrastructure:
• 10 households
• Typical variety of domestic settings including kitchen, showers, baths, living room, bedrooms, and garden
• Water meters

Mixed Use: Engineering, NUI Galway
Target users:
• Mixed/Public consumers
• Building managers
• 100 staff
• 1000 students (ages 18 to 24)
Infrastructure:
• Water meters
• Energy meters
• Rainwater harvesting
• Café
• Weather station
• Wet labs
• Showers

School: Coláiste na Coiribe, Ireland
Target users:
• Mixed/Public consumers
• School management
• Maintenance staff
• 500 students (ages 12 to 18)
• 40 teachers
Infrastructure:
• Water meters
• Energy meters
• Rainwater harvesting
Smart Water and Energy Management Pilots
• Smart Airport: Milan Linate, Italy
• Smart Office: Galway, Ireland
• Smart Homes: Municipality of Thermi, Greece
• Mixed Use: Galway, Ireland
• Smart School: CnaC School in Galway, Ireland
Need to Target Different Users
• Building Manager, University Students, Corporate Staff, Passengers, Families, Operational Staff, Researchers, Application Developers, Teaching Staff, School Students, Data Scientist
IoT-enabled Digital Twins and Intelligent Applications
[Figure: applications built on the Real-time Linked Dataspace, annotated with Observe, Orient, Decide, and Act.
• Inputs: Datasets, Things / Sensors
• Support services: Entity Management Service, Catalog & Access Control Service, Search & Query Service, Entity-Centric Real-Time Query Service, Complex Event Processing (CEP) Service, Human Task Service
• Digital Twin
• Applications: Personal Dashboard, Public Dashboards, Decision Analytics and Machine Learning, Notifications, Apps, Alerts]
“OODA” Loop
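The Observe, Orient, Decide, Act loop can be sketched as four small functions chained into one step. The water-usage scenario, the baseline comparison, and the alert threshold are assumptions made for illustration; the slides name the loop stages but not their logic.

```python
def observe(sensor_readings):
    """Observe: take the latest raw readings (here, litres per minute)."""
    readings = list(sensor_readings)
    if not readings:
        raise ValueError("at least one reading is required")
    return readings


def orient(readings, baseline):
    """Orient: put the readings in context by comparing with a baseline."""
    average = sum(readings) / len(readings)
    return {"average": average, "ratio": average / baseline}


def decide(situation, alert_ratio=1.5):
    """Decide: choose an action from the oriented situation."""
    return "raise_alert" if situation["ratio"] >= alert_ratio else "no_action"


def act(decision, outbox):
    """Act: carry out the decision (here, queue a notification)."""
    if decision == "raise_alert":
        outbox.append("High water usage detected")
    return outbox


def ooda_step(sensor_readings, baseline, outbox):
    """Run one pass of the loop."""
    return act(decide(orient(observe(sensor_readings), baseline)), outbox)


alerts = ooda_step([12.0, 15.0, 18.0], baseline=8.0, outbox=[])
```

Keeping the stages separate makes it possible to swap one of them, for example replacing the threshold in `decide` with a learned model, without touching the others.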
Example Applications
• Interactive Public Displays
• Alerts and Notifications
• Personalised Dashboards
Experiences and Lessons Learnt from Dataspaces
• Developers need education on stream processing and approximate results
• Incremental data management can support agile software development
• Build the business case for data-driven innovation
• Integration with legacy data is a significant cost in smart environments
• The 5 star pay-as-you-go model simplified communication with non-technical
users
• A secure canonical source for entity data simplifies application development
• Data quality with things and sensors is challenging in an operational
environment
• Working with three pipelines adds overhead (Lambda architecture + Entity Layer)
Part V: Future Directions
Large-scale Decentralised Support Services
• Enhanced Support Services
• Scaling Entity Management
• Maintenance and Operation Cost
Multimedia/Knowledge-Intensive Event
Processing
• Support Services for Multimedia Data
• Placement of Multimedia Data and
Workloads
• Adaptive Training of Classifiers
• Complex Multimedia Event Processing
Trusted Data Sharing
• Trusted Platforms
• Usage Control
• Personal/Industrial Dataspaces
Ecosystem Governance and Economic
Models
• Decentralised Data Governance
• Economic Models
Incremental Intelligent Systems Engineering: Cognitive Adaptability
• Pay-as-you-go Systems
• Cognitive Adaptability
Towards Human-centric Systems
• Explainable Artificial Intelligence
and Data Provenance
• Human-in-the-loop
Some final thoughts on
Impacts, Influence, and Future Funding
Data Sharing Spaces – Position Paper
Key Recommendations
Create the conditions for the
development of a trusted European
data sharing framework
Incorporate data sharing at the core
of the data lifecycle to enable greater
access to data.
Provide supportive measures for
European businesses to safely
embrace new technologies, practices
and policies.
Assemble a European-wide digital
skills strategy to equip the workforce
for the new data economy.
A European Strategy for Data
BDVA Meeting
26 February 2020
Yvo Volman
Head of Unit G1 - Data Policy and Innovation
DG CNECT, European Commission
European Strategy for Data
Data can flow within the
EU and across sectors
European rules and values
are fully respected
Rules for access and use of data are
fair, practical and clear & clear data
governance mechanisms are in place
A common European data space, a single market for data
Availability of high quality data
to create and innovate
Common European data spaces
• Rich pool of data (varying degree of accessibility)
• Free flow of data across sectors and countries
• Full respect of GDPR
• Horizontal framework for data governance and data access
• Sectoral data spaces: Health, Industrial & Manufacturing, Agriculture, Finance, Mobility, Green Deal, Energy, Public Administration, Skills
• Technical tools for data pooling and sharing
• Standards & interoperability (technical, semantic)
• Sectoral Data Governance (contracts, licenses, access rights, usage rights)
• IT capacity, including cloud storage, processing and services

From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent Systems

  • 1. From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent Systems Edward Curry Insight @ NUI Galway edward.curry@nuigalway.ie
  • 2. Open Access Book Contents Part I: Fundamentals and Concepts Part II: Data Support Services Part III: Stream and Event Processing Services Part IV: Intelligent Systems and Applications Part V: Future Directions Team http://dataspaces.info Web:dataspaces.info
  • 3. Part I: Fundamentals and Concepts 3 http://dataspaces.info
  • 4. Data Driven Innovations Digital Twins: A digital replica of physical assets (car), processes (value-chain), systems, or physical environments (building). The digital representation (i.e. simulation modelling or data-driven models) provided by the digital twin can be analysed to optimise the operation of the “physical twin”. Physical-Cyber-Social (PCS): A computing paradigm that supports a richer human experience with a holistic data-rich view of the smart environment that integrates, correlates, interprets, and provides contextually relevant abstractions to humans. Mass Personalisation: More human-centric thinking in the design of systems where users have growing expectations for highly personalised digital services for the “Market of One”. Data Network Effects: As more systems/users join and contribute data to the smart environment, a “network effect” can take place, resulting in the overall data available becoming more valuable. http://dataspaces.info
  • 5. Real World Digital World Sensors Orient DecideActuators Act Observe Physical Twin (Asset-centric) Digital Twin (System-centric) Digital Twins http://dataspaces.info 5
  • 8. Data Management Challenges • Pay-as-you-go Data Integration, Accessibility, and Sharing – Standard data syntax, semantics, and linkage: Facilitate integration and sharing, ideally with open standards and non-proprietary approaches. – Single-point data discoverability and accessibility: Allow the organisation and access to datasets and metadata through a single location. – Incremental data management: Enable a low barrier to entry and a pay-as-you-go paradigm to minimise costs. • Secure Access Control: Support data access rights to preserve the security of data and privacy of users in the smart environment. • Real-time Data Processing and Historical Querying – Real-time data processing: Including ingestion, aggregation, and pattern detection within event streams originating from sensors and things in the smart environment. – Unified querying of real-time data and historical data: Provide applications and end-users with a holistic queryable state of the smart environment at a latency suitable for user interaction. • Entity-centric Data Views – Entity management: The storage, linkage, curation, and retrieval of entity data, such as users, zones, and locations. – Event enrichment: Enhancement of sensor/things streams with contextual data (e.g. entities) to make the stream data more encapsulated and useful in downstream processing. http://dataspaces.info
  • 9. The “gold mining” metaphor applied to data processing
  • 10. Traditional Approaches to Data Integration [Figure labels: Popularity/Use (frequency of use, low to high); Number of data sources, entities, attributes; Cost of administration & semantic integration using traditional approaches.]
  • 11. Data is Key to AI… Data Platforms will Fuel AI Decisions: Data Generation and Analysis (including IoT); Data Platforms (Access and Portability); AI and Decision Platforms
  • 12. IoT-Enablement: A Data Sharing Layer is needed
      – Layer 1 (Communication and Sensing): IPv6, Wi-Fi, RFID, CoAP, AVB, etc.
      – Layer 2 (Middleware): Peer-to-Peer, Events, Pub/Sub, SOA, SDN, etc.
      – Layer 3 (Data): Schema, Entities, Catalog, Sharing, Access/Control, etc.
      – Layer 4 (Intelligent Apps, Analytics, and Users)
      – Other figure labels: Datasets; Things / Sensors; Contextual Data Sources (including legacy systems); Predictive Analytics; Situation Awareness; Decision Support; Digital Twin; Machine Learning; Users
      Adapted from: L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A survey,” Comput. Networks, vol. 54, no. 15, pp. 2787–2805, Oct. 2010.
  • 13. Cost of Data Management Solutions
      • Administrative Proximity
          – With close control, many assumptions can hold concerning guarantees such as data quality and consistency.
          – Far control refers to a loosely coupled environment and a lack of coordination on the data sources.
      • Semantic Integration
          – Degree to which data schemas are matched up (types, attributes, and names).
          – All data conform to an agreed-upon schema vs. no schema information. This dimension is relevant to how much semantically rich querying can be done.
      Halevy, A., Franklin, M. and Maier, D. 2006. Principles of dataspace systems. 25th ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS ’06 (New York, New York, USA, 2006), 1–9.
  • 14. (Real-time Linked) Dataspace Principles (adapted from Halevy et al.)
      • Must deal with many different formats of streams and events.
      • Does not subsume the stream and event processing engines; they still provide individual access via their native interfaces.
      • Queries are provided on a best-effort and approximate basis.
      • Must provide pathways to improve the integration among the data sources, including streams and events, in a pay-as-you-go fashion.
      – Dataspace: “Dataspaces are not a data integration approach; rather, they are more of a data co-existence approach. The goal of dataspace support is to provide base functionality over all data sources, regardless of how integrated they are.” (Halevy, A., Franklin, M. and Maier, D. 2006.)
      – Real-time Linked Dataspace (RLD): An enabling platform for data management for intelligent systems within smart environments that combines the pay-as-you-go paradigm of dataspaces, linked data, and knowledge graphs with entity-centric real-time query capabilities.
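As a rough illustration of the best-effort principle on this slide, the sketch below queries several sources and returns whatever answers arrive, together with the names of the sources that could not respond. It assumes each source is exposed as a plain Python callable; `best_effort_query`, `csv_source`, and `stream_source` are hypothetical names, not part of any RLD interface.

```python
def best_effort_query(sources: dict, keyword: str):
    """Ask every source; keep whatever answers arrive and note who failed."""
    results, failed = [], []
    for name, search in sources.items():
        try:
            results.extend({"source": name, **row} for row in search(keyword))
        except Exception:
            # An unavailable or incompatible source is skipped, not fatal.
            failed.append(name)
    return results, failed


def csv_source(keyword):
    # Stand-in for a static dataset that supports keyword search.
    rows = [{"entity": "kitchen tap"}, {"entity": "shower"}]
    return [r for r in rows if keyword in r["entity"]]


def stream_source(keyword):
    # Stand-in for a stream engine that is currently unreachable.
    raise ConnectionError("stream engine offline")


results, failed = best_effort_query(
    {"csv": csv_source, "stream": stream_source}, "tap"
)
```

The caller receives a partial answer plus the list of sources that did not contribute, which makes the approximate nature of the result explicit instead of failing the whole query.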
  • 15. Approximate and Best Effort Approaches [Figure labels: Popularity/Use (frequency of use, low to high); Number of data sources, entities, attributes; Cost of administration & semantic integration using traditional approaches; Approximate & best-effort approaches.]
  • 16. Architecture of Real-time Linked Dataspace
      • Support Platform: Responsible for providing the functionalities and services essential for managing the dataspace.
      • Things / Sensors: Produce real-time data streams that need to be processed and managed.
      • Data Sources: Available in a wide variety of formats and accessible through different system interfaces.
      • Managed Entities: Actively managed entities, including their relationships to participating things, data sources, and other entities.
      • Intelligent Applications, Analytics, & Users: Leverage the RLD's data and services to provide data analytics, decision support tools, user interfaces, and data visualisations.
  • 17. Pay-as-you-Go Tiered Data Model
      • Provides flexibility by reducing the initial cost and barriers to joining the dataspace.
      • Specialisation of the 5 star scheme defined by Tim Berners-Lee.
      • Over time, the level of integration with the support services can be improved in an incremental manner on an as-needed basis.
      • The more investment made to integrate with the support services, the better the integration achievable in the dataspace.
  • 19. Part II: Data Support Services
  • 20. Part III: Stream and Event Processing Services
  • 21. Data Self-Management
      – Techniques for: • Self-Configuration • Self-Healing • Self-Optimizing
      – Automatic Source Selection: • Source Selection • Source Replacement • Model Selection • Model Training • Parameterization
  • 22. Entity Data Management and Humans in the Loop
      – Enables users in the smart environment to participate in data management tasks: • Collection & Enrichment • Mapping & Matching • Operator Evaluation • Feedback & Refinement • Citizen Actuation
      – Key HIL Challenges: • Task Specification (simplicity) • Interaction Mechanism • Task Assignment (geospatial, expertise)
  • 23. Semantic Approximation Matching of Streams
      – Challenges: • Heterogeneity in event semantics (thousands of schemas) • Heterogeneity in processing rules (thousands of rules tied to schemas)
      – Approximate Semantic Event Matcher: • Sub-symbolic distributional event semantics • Enables pay-as-you-go event matching for data streams • Replaced 48,000 exact rules with 100 approximate rules with around 85% accuracy
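The idea of replacing many exact rules with a few approximate ones can be sketched as follows. This is a toy illustration only, not the matcher described on the slide: the term vectors in `VECTORS` are hand-made stand-ins for distributional semantics learned from a corpus, and the `matches` function and its `threshold` value are assumptions chosen for the example.

```python
import math

# Toy, hand-made term vectors; a real system would learn these from a corpus.
VECTORS = {
    "energy": (0.9, 0.1, 0.0),
    "power": (0.8, 0.2, 0.1),
    "water": (0.1, 0.9, 0.1),
    "consumption": (0.5, 0.5, 0.2),
    "usage": (0.5, 0.4, 0.3),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def matches(rule_terms, event_terms, threshold=0.9):
    """A rule matches when every rule term has a close-enough event term."""
    return all(
        max(cosine(VECTORS[r], VECTORS[e]) for e in event_terms) >= threshold
        for r in rule_terms
    )


rule = ["energy", "consumption"]
same_meaning = matches(rule, ["power", "usage"])
different_meaning = matches(rule, ["water", "usage"])
```

A single rule written with the terms "energy consumption" matches an event described as "power usage" without an exact rule for that schema, while an event about water usage is rejected. The threshold trades recall against precision, which is where the approximate accuracy figure in such systems comes from.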
  • 24. Part IV: Intelligent Systems and Applications: Smart Water and Energy Management Pilots
      – Location: Airport (Linate Airport, Milan, Italy); Office (Insight, Galway, Ireland); Home (Houses, Thermi, Greece); Mixed Use (Engineering, NUI Galway); School (Coláiste na Coiribe, Ireland)
      – Target users: • Corporate users • ~9.5 million passengers • Utilities management • Maintenance staff • Environmental managers • 130 staff • Office consumers • Operations managers • Utility providers • Building managers • Domestic consumers (adults, young adults and children) • Utility providers • Mixed/Public consumers • Building managers • 100 staff • 1000 students (ages 18 to 24) • Mixed/Public consumers • School management • Maintenance staff • 500 students (ages 12 to 18) • 40 teachers
      – Infrastructure: • Safety critical • 10 km water network • Multiple buildings • Water meters • Energy meters • Legacy systems • 2190 m² space • 22 offices + 160 open plan spaces • Conference room • 4 meeting rooms • 3 kitchens • Data centre • 30 person café • Energy meters • 10 households • Typical variety of domestic settings including kitchen, showers, baths, living room, bedrooms, and garden • Water meters • Water meters • Energy meters • Rainwater harvesting • Café • Weather station • Wet labs • Showers • Water meters • Energy meters • Rainwater harvesting
  • 25. Need to target different Target Users
      – Pilots: Smart School (CnaC School in Galway, Ireland); Mixed Use (Galway, Ireland); Smart Airport (Milan Linate, Italy); Smart Homes (Municipality of Thermi, Greece); Smart Office (Galway, Ireland)
      – User types: Building Manager; University Students; Corporate Staff; Passengers; Families; Operational Staff; Researchers; Application Developers; Teaching Staff; School Students; Data Scientist
  • 26. IoT-enabled Digital Twins and Intelligent Applications [Figure: “OODA” loop (Observe, Orient, Decide, Act) over the Real-time Linked Dataspace. Inputs: Datasets; Things / Sensors. Services: Entity Management Service; Catalog & Access Control Service; Search & Query Service; Entity-Centric Real-Time Query Service; Complex Event Processing Service; Human Task Service. Applications: Digital Twin; Personal Dashboard; Public Dashboards; Decision Analytics and Machine Learning; Notifications; Apps; Alerts.]
  • 27. Example Applications: Interactive Public Displays; Alerts and Notifications; Personalised Dashboards
  • 28. Experiences and Lessons Learnt from Dataspaces
      • Developers need education on stream processing and approximate results
      • Incremental data management can support agile software development
      • Build the business case for data-driven innovation
      • Integration with legacy data is a significant cost in smart environments
      • The 5 star pay-as-you-go model simplified communication with non-technical users
      • A secure canonical source for entity data simplifies application development
      • Data quality with things and sensors is challenging in an operational environment
      • Working with three pipelines adds overhead (Lambda + Entity Layer)
  • 29. Part V: Future Directions
      – Large-scale Decentralised Support Services: • Enhanced Supported Services • Scaling Entity Management • Maintenance and Operation Cost
      – Multimedia/Knowledge-Intensive Event Processing: • Support Services for Multimedia Data • Placement of Multimedia Data and Workloads • Adaptive Training of Classifiers • Complex Multimedia Event Processing
      – Trusted Data Sharing: • Trusted Platforms • Usage Control • Personal/Industrial Dataspaces
      – Ecosystem Governance and Economic Models: • Decentralised Data Governance • Economic Models
      – Incremental Intelligent Systems Engineering, Cognitive Adaptability: • Pay-as-you-go Systems • Cognitive Adaptability
      – Towards Human-centric Systems: • Explainable Artificial Intelligence and Data Provenance • Human-in-the-loop
  • 30. Some final thoughts on Impacts, Influence, and Future Funding
  • 31. Data Sharing Spaces – Position Paper: Key Recommendations
      • Create the conditions for the development of a trusted European data sharing framework.
      • Incorporate data sharing at the core of the data lifecycle to enable greater access to data.
      • Provide supportive measures for European businesses to safely embrace new technologies, practices and policies.
      • Assemble a European-wide digital skills strategy to equip the workforce for the new data economy.
  • 32. A European Strategy for Data BDVA Meeting 26 February 2020 Yvo Volman Head of Unit G1 - Data Policy and Innovation DG CNECT, European Commission
  • 33. European Strategy for Data
      • Data can flow within the EU and across sectors
      • European rules and values are fully respected
      • Rules for access and use of data are fair, practical and clear, and clear data governance mechanisms are in place
      • A common European data space, a single market for data
      • Availability of high quality data to create and innovate
  • 34. Common European data spaces [Figure: a rich pool of data (varying degree of accessibility); free flow of data across sectors and countries; full respect of GDPR. Sectors: Health; Industrial & Manufacturing; Agriculture; Finance; Mobility; Green Deal; Energy; Public Administration; Skills. Enablers: technical tools for data pooling and sharing; standards & interoperability (technical, semantic); sectoral data governance (contracts, licenses, access rights, usage rights); IT capacity, including cloud storage, processing and services. Foundation: horizontal framework for data governance and data access.]