3/31/22 1
Demetris Trihinas
trihinas.d@unic.ac.cy
1
Workshop: Processing Data in the Fog – Aristotle University, GR – Apr. 4, 2022
Department of
Computer Science
StreamSight
A Query-Driven Framework Extending
Streaming IoT Analytics to the Fog Continuum
Dr. Demetris Trihinas
Department of Computer Science
ailab @ University of Nicosia
trihinas.d@unic.ac.cy
“Designing and developing scalable and self-adaptive tools for data
management, exploration and visualization”
trihinas.d@unic.ac.cy http://dtrihinas.info dtrihinas
Dr. Demetris Trihinas
Lecturer at University of Nicosia
Artificial Intelligence Laboratory (AILab)
• Open and trusted fog computing platform that facilitates the deployment of scalable and heterogeneous IoT services
• Enabling power-efficient Machine Learning and its applications to drone technology for handling time-critical missions
• Bridging the early diagnosis and treatment gap of brain diseases via smart, connected, proactive and evidence-based technology
Distributed Data Processing Engines
• Big data processing engines are contributing to the democratization of analytics by hiding the complexity of:
• M2M communication and syncing.
• Resource management.
• Task scheduling and supervision for analytic jobs.
• Fault tolerance for both the infrastructure and execution state.
• Monitoring and logging.
• ...
Challenge: Steep Learning Curve
• Spark SQL and Structured Streaming are leveling the game… but…
[Figure: engine code listing, "Compute the mean of a metric using a 60s sliding window"]
• Unlike SQL, it is difficult for IoT operators and data scientists to issue ad-hoc queries, because doing so requires knowledge of the underlying engine's programming model.
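To make the task named on the slide concrete, here is a minimal, engine-free Python sketch of "mean of a metric over a 60s sliding window". The class name `SlidingMean` and the (timestamp, value) input shape are assumptions made for illustration; the engine code shown on the slide is not reproduced here, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class SlidingMean:
    """Mean of a metric over the last `window_s` seconds (hypothetical helper)."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.samples = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, ts, value):
        """Insert one reading and return the mean of the current window."""
        self.samples.append((ts, value))
        self.total += value
        # Evict readings that are `window_s` seconds old or older.
        while self.samples and self.samples[0][0] <= ts - self.window_s:
            _, old = self.samples.popleft()
            self.total -= old
        return self.total / len(self.samples)
```

Even this small version has to manage state and eviction by hand, which is the kind of detail a declarative query is meant to hide.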
Challenge: Analytics Governance Lock-in
• The analytics landscape is still fairly open, with no dominant framework
• Switching big data frameworks requires massive re-coding
• Apache Beam (formerly Google Dataflow) and Summingbird are steps in the right direction…
• But…
https://sigmodrecord.org/2020/02/12/the-seattle-report-on-database-research/
Challenge: Fog-Aware Queries?
[Figure: the “Edge” and the “Fog”, separated by physical and/or network distance; links: LAN/WAN (one hop away), Internet]
• Edge metric streams: SpO2, HR, motion, temp, pollutants, air quality, …
• Less powerful nodes
• Network bandwidth far from uniform… many uncertainties
• Reduce unnecessary computations and data movement
• Less load on centralized services
The StreamSight Analytics Framework for IoT
• SQL-like query model for streaming analytics with fog optimization hints
• Big data engine agnostic query plan
• Compilers for multiple big data engines
StreamSight: A Query-Driven Framework for Streaming Analytics in Edge Computing. Z. Georgiou, M. Symeonides, D. Trihinas, G. Pallis and M. Dikaiakos, IEEE/ACM UCC, 2018.
Query-Driven Descriptive Analytics for IoT and Edge Computing. M. Symeonides, D. Trihinas, Z. Georgiou, G. Pallis and M. Dikaiakos, IEEE IC2E, 2019.
StreamSight Query Model
• Queries are applied on metric streams with the intent to derive insights
• Insights can be reused, transformed, and composed with other metric streams to create new insights
Query Model
• Descriptive statistics
• Filtering
• Transformations
• Windowing
• Grouping
• Sampling
• Query Prioritization
• Outlier Detection
• Operator Placement
• Job Scheduling Hints
• …
Example Queries
• Window Operations: several aggregations (sum, count, sdev, median, percentile, etc.)
COMPUTE
ARITHMETIC_MEAN(bus_delay, 10 MINUTES)
BY city_segment EVERY 5 SECONDS
Query annotations: aggregate (ARITHMETIC_MEAN), metric of interest (bus_delay), window length (10 MINUTES), group by key for multivariate data (city_segment), updating interval (EVERY 5 SECONDS)
• Filter Composition:
COMPUTE bus_delay
WHEN > ( RUNNING_MEAN(bus_delay) + 3 * RUNNING_SDEV(bus_delay) )
BY city_segment EVERY 5 SECONDS;
Query annotation: filter predicate (WHEN …)
Apache Spark labels on the slide: 15 Ops, 41 Ops
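The filter-composition query can be read as a per-segment outlier test. The sketch below is one way to express that logic in plain Python, assuming Welford's online algorithm for the running statistics and assuming each reading is compared against the statistics of earlier readings before being folded in. `RunningStats` and `flag_delays` are hypothetical names, and the EVERY 5 SECONDS emission interval is not modelled.

```python
import math
from collections import defaultdict


class RunningStats:
    """Welford's online algorithm for a running mean and standard deviation."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def sdev(self):
        # Sample standard deviation; 0.0 until two readings are seen.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_delays(records):
    """Yield (segment, delay) when delay > running mean + 3 * running sdev.

    `records` is an iterable of (city_segment, bus_delay) pairs. Each reading
    is compared against earlier readings of its own segment, then added to
    that segment's statistics (an assumption about evaluation order).
    """
    stats = defaultdict(RunningStats)
    for segment, delay in records:
        s = stats[segment]
        if s.n >= 2 and delay > s.mean + 3 * s.sdev():
            yield segment, delay
        s.update(delay)
```

Keeping one `RunningStats` per key mirrors the BY city_segment clause: every segment gets its own threshold.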
Query Parser and Validation
• Query syntax mapped to an Abstract Syntax Tree (AST).
• Syntactic correctness validation.
• Independent of underlying engine.
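As a rough illustration of mapping query text to an engine-independent AST, the sketch below parses one query shape only: COMPUTE AGG(metric, N UNIT) [BY key] EVERY N UNIT. The regular expression, the `Aggregate` and `Query` node types, and the `parse` function are all hypothetical; the actual StreamSight grammar and parser are not shown in the slides.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical grammar: COMPUTE AGG(metric, N UNIT) [BY key] EVERY N UNIT
QUERY_RE = re.compile(
    r"^\s*COMPUTE\s+(?P<agg>[A-Z_]+)\(\s*(?P<metric>\w+)\s*,\s*"
    r"(?P<win>\d+)\s+(?P<win_unit>SECONDS|MINUTES)\s*\)"
    r"(?:\s+BY\s+(?P<key>\w+))?"
    r"\s+EVERY\s+(?P<every>\d+)\s+(?P<every_unit>SECONDS|MINUTES)\s*;?\s*$"
)
UNIT_SECONDS = {"SECONDS": 1, "MINUTES": 60}


@dataclass(frozen=True)
class Aggregate:
    func: str
    metric: str
    window_s: int


@dataclass(frozen=True)
class Query:
    select: Aggregate
    group_by: Optional[str]
    every_s: int


def parse(text):
    """Map query text to a small engine-independent AST, or raise ValueError."""
    m = QUERY_RE.match(text)
    if m is None:
        # Syntactic validation happens here, before any engine is involved.
        raise ValueError(f"syntax error in query: {text!r}")
    window_s = int(m["win"]) * UNIT_SECONDS[m["win_unit"]]
    every_s = int(m["every"]) * UNIT_SECONDS[m["every_unit"]]
    return Query(Aggregate(m["agg"], m["metric"], window_s), m["key"], every_s)
```

Nothing in the resulting `Query` object refers to a specific engine, so the same tree could be handed to different back-end compilers.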
AST Optimization
• A naive AST is extremely inefficient and ignores the geo-distributed nature of IoT:
• Unnecessary intermediate re-computations
• Increased data movement
• Cache and broadcast expressions, composites and results across worker nodes to reduce unnecessary re-computations
• Intermediate results can be shared among queries
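The idea of sharing intermediate results can be sketched with a small per-batch cache: a sub-expression such as MEAN(bus_delay) is evaluated once and reused by every query that needs it. `SharedEvaluator` is a hypothetical, single-process illustration; broadcasting cached values across worker nodes is not modelled here.

```python
from statistics import mean, stdev


class SharedEvaluator:
    """Evaluates each sub-expression once per batch and shares the result."""

    def __init__(self, functions):
        self.functions = functions  # name -> callable over a list of values
        self.cache = {}
        self.evaluations = 0  # how many times a function was actually run

    def evaluate(self, func, metric, batch):
        key = (func, metric)
        if key not in self.cache:
            self.evaluations += 1
            self.cache[key] = self.functions[func](batch[metric])
        return self.cache[key]

    def next_batch(self):
        """Drop cached results when a new batch of readings arrives."""
        self.cache.clear()


ev = SharedEvaluator({"MEAN": mean, "SDEV": stdev})
batch = {"bus_delay": [4, 6, 8]}
# Two queries over the same batch that both need MEAN(bus_delay).
threshold = ev.evaluate("MEAN", "bus_delay", batch) + 3 * ev.evaluate("SDEV", "bus_delay", batch)
average = ev.evaluate("MEAN", "bus_delay", batch)
```

Three lookups trigger only two computations, which is the saving the slide describes, here on a very small scale.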
Other (User-Annotated) Optimizations…
Query Prioritization
COMPUTE MAX(taxis_fare_amount, 60 MINUTES)
BY city_segment EVERY 1 MINUTES
WITH SALIENCE 1
• Under a high-load influx, critical queries are not delayed
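One simple way to picture query prioritization is a priority queue that serves the most salient queries first when only part of the workload fits in a scheduling round. The `QueryScheduler` class is hypothetical, and the sketch assumes, purely for illustration, that a larger SALIENCE value marks a more critical query; the slides do not state how StreamSight orders salience values.

```python
import heapq
import itertools


class QueryScheduler:
    """Runs pending queries in order of salience (hypothetical scheduler)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: first submitted, first run

    def submit(self, name, salience=0):
        # heapq is a min-heap, so negate salience to pop the largest first.
        # ASSUMPTION: a larger SALIENCE value means a more critical query.
        heapq.heappush(self._heap, (-salience, next(self._order), name))

    def drain(self, budget):
        """Return up to `budget` query names to run in this scheduling round."""
        count = min(budget, len(self._heap))
        return [heapq.heappop(self._heap)[2] for _ in range(count)]
```

With a budget of two and three pending queries, the query submitted with salience 1 runs first and one default-salience query waits for the next round.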
Sampling
COMPUTE
ARITHMETIC_MEAN(taxi_passengers, 10 MINUTES)
EVERY 30 SECONDS
WITH MAX_ERROR 0.05 AND CONFIDENCE 0.95
Query annotations: error upper bound (MAX_ERROR), confidence interval (CONFIDENCE)
• Query execution with bounded error guarantees for sampling
Low-Cost Adaptive Monitoring Techniques for the Internet of Things. D. Trihinas, G. Pallis and M. Dikaiakos, IEEE Trans. On Services Computing, 2018.
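To give a feel for what an error bound plus a confidence level imply, the sketch below uses the textbook normal-approximation formula for the sample size needed to estimate a mean. It is not the technique from the cited paper or StreamSight's implementation. It assumes MAX_ERROR is a relative error and that a small pilot sample is available; `required_sample_size` is a hypothetical helper.

```python
import math
from statistics import NormalDist, mean, stdev


def required_sample_size(pilot, max_error=0.05, confidence=0.95):
    """Readings needed so the sample mean is within `max_error` (relative)
    of the true mean at the given confidence, via the normal approximation.

    ASSUMPTION: MAX_ERROR is read as a relative error; `pilot` is a small
    initial sample used to estimate the mean and standard deviation.
    """
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # two-sided quantile
    m, s = mean(pilot), stdev(pilot)
    if m == 0:
        raise ValueError("relative error is undefined for a zero mean")
    return max(1, math.ceil((z * s / (max_error * abs(m))) ** 2))
```

Tightening the error bound or raising the confidence level increases the number of readings that must be kept, which is the trade-off the WITH MAX_ERROR ... AND CONFIDENCE ... clause exposes to the user.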
Other Optimizations…
• Dedicated execution on specific nodes
• Job optimization strategies
1 AST to Rule them ALL…
Performance Evaluation
• Dublin Smart City Bus Network
• 968 Buses (Jan 2014), 16 metrics/record incl. bus_id, bus_delay, city_segment
• Used 7 insights introduced from the examples
16 Edge servers
● 1 vCPU, 1GB MEM, 2↑ 16↓ Mbps
Evaluation Metric
● Batch Processing Time
[Figure: batch processing time, with "Unstable System" and "Stable System" regions]
● x1.4 speedup over baseline Spark
● StreamSight+Sampling: x4.3 over baseline
Performance Evaluation (reusing results)
• Dublin Bus Workload
• Average Processing Time ( Fixed Input rate 700 req/s )
• StreamSight DOES NOT incur a performance overhead
• Baseline failed
StreamSight
A Query-Driven Framework Extending
Streaming IoT Analytics to the Fog Continuum
Thank You!

SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 

StreamSight: A Query-Driven Framework Extending Streaming IoT Analytics to the Fog Continuum

  • 1. 3/31/22 1 Demetris Trihinas trihinas.d@unic.ac.cy 1 Workshop: Processing Data in the Fog – Aristotle University, GR – Apr. 4, 2022 Department of Computer Science StreamSight A Query-Driven Framework Extending Streaming IoT Analytics to the Fog Continuum Dr. Demetris Trihinas Department of Computer Science ailab @ University of Nicosia trihinas.d@unic.ac.cy
  • 2. Dr. Demetris Trihinas, Lecturer at University of Nicosia, Artificial Intelligence Laboratory (AILab). “Designing and developing scalable and self-adaptive tools for data management, exploration and visualization”. Contact: trihinas.d@unic.ac.cy, http://dtrihinas.info, dtrihinas
      • Open and trusted fog computing platform that facilitates the deployment of scalable and heterogeneous IoT services
      • Enabling power-efficient Machine Learning and its applications to drone technology for handling time-critical missions
      • Bridging the early diagnosis and treatment gap of brain diseases via smart, connected, proactive and evidence-based technology
  • 3. Distributed Data Processing Engines
      • Big data processing engines are contributing to the democratization of analytics by hiding the complexity of:
        • M2M communication and syncing.
        • Resource management.
        • Task scheduling and supervision for analytic jobs.
        • Fault tolerance for both the infrastructure and execution state.
        • Monitoring and logging.
        • ...
  • 5. Challenge: Steep Learning Curve
      • Spark SQL and Structured Streaming are leveling the game… but…
      • Example task: compute the mean of a metric using a 60s sliding window.
      • Unlike with SQL, it is difficult for IoT operators and data scientists to issue ad-hoc queries, because doing so requires knowledge of the underlying engine's programming model.
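To make the example task on this slide concrete, here is a minimal sketch in plain Python (not Spark code) of the computation it names: the mean of a metric over a 60s sliding window. The class name `SlidingMean` and the `(timestamp, value)` input shape are assumptions made for illustration, and readings are assumed to arrive in non-decreasing timestamp order.

```python
from collections import deque


class SlidingMean:
    """Mean of the values seen in the last `window_s` seconds (hypothetical helper)."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.items = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0      # running sum of the values currently in the window

    def add(self, ts, value):
        """Insert a reading and return the mean over the window (ts - window_s, ts]."""
        self.items.append((ts, value))
        self.total += value
        # Evict readings that have fallen out of the window.
        while self.items[0][0] <= ts - self.window_s:
            _, old = self.items.popleft()
            self.total -= old
        return self.total / len(self.items)
```

Even this single-node version has to manage eviction and running state by hand; a distributed engine adds partitioning, late data and fault tolerance on top, which is the learning curve the slide refers to.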
  • 6. Challenge: Analytics Governance Lock-in
      • The analytics landscape is still fairly open, with no dominant framework.
      • Switching big data frameworks requires massive re-coding.
      • Apache Beam (formerly Google Dataflow) and Summingbird are steps in the right direction…
      • But… https://sigmodrecord.org/2020/02/12/the-seattle-report-on-database-research/
  • 7. Challenge: Fog-Aware Queries?
      • The “Edge”: SpO2, HR, ..., motion…, temp, pollutants…, air quality
      • The “Fog”: physical and/or network distance; LAN/WAN (one hop away); Internet
      • Less powerful nodes
      • Network bandwidth far from uniform… many uncertainties
      • Reduce unnecessary computations and data movement
      • Less load on centralized services
  • 8. The StreamSight Analytics Framework for IoT
      • SQL-like query model for streaming analytics with fog optimization hints
      • Big data engine agnostic query plan
      • Compilers for multiple big data engines
      • StreamSight: A Query-Driven Framework for Streaming Analytics in Edge Computing. Z. Georgiou, M. Symeonides, D. Trihinas, G. Pallis and M. Dikaiakos, IEEE/ACM UCC, 2018.
      • Query-Driven Descriptive Analytics for IoT and Edge Computing. M. Symeonides, D. Trihinas, Z. Georgiou, G. Pallis and M. Dikaiakos, IEEE IC2E, 2019.
  • 9. StreamSight Query Model
      • Queries are applied on metric streams with the intent to derive insights.
      • Insights can be reused, transformed and composed with other metric streams to create new insights.
      • Query Model: descriptive statistics, filtering, transformations, windowing, grouping, sampling, query prioritization, outlier detection, operator placement, job scheduling hints, …
  • 10. Example Queries
      • Window Operations: several aggregations (sum, count, sdev, median, percentile, etc.). Annotated parts: aggregate, metric of interest, window length, group-by key for multivariate data, updating interval. Example: COMPUTE ARITHMETIC_MEAN(bus_delay, 10 MINUTES) BY city_segment EVERY 5 SECONDS
      • Filter Composition: annotated part is the filter predicate. Example: COMPUTE bus_delay WHEN > ( RUNNING_MEAN(bus_delay) + 3 * RUNNING_SDEV(bus_delay) ) BY city_segment EVERY 5 SECONDS;
      • Slide labels: Apache Spark 15 Ops; Apache Spark 41 Ops
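As an illustration of what the filter query on this slide asks for, the sketch below flags a reading when it exceeds the running mean plus three running standard deviations for its group. It is plain Python under stated assumptions, not StreamSight's generated code: the names `RunningStats` and `outliers` are hypothetical, the statistics are taken over earlier readings only, the sample standard deviation is used, and the `EVERY 5 SECONDS` update interval is not modelled.

```python
import math
from collections import defaultdict


class RunningStats:
    """Welford's online algorithm for the mean and sample standard deviation."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def sdev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def outliers(records, k=3.0):
    """Yield (segment, delay) when delay > running mean + k * running sdev."""
    stats = defaultdict(RunningStats)  # one accumulator per group key
    for segment, delay in records:
        s = stats[segment]
        # Compare against the statistics of earlier readings, then fold this one in.
        if s.n > 1 and delay > s.mean + k * s.sdev():
            yield segment, delay
        s.update(delay)
```

Usage: `list(outliers([("a", 10), ("a", 12), ("a", 11), ("a", 100)]))` keeps only the reading that breaches the threshold for segment `"a"`.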
  • 11. Query Parser and Validation
      • Query syntax mapped to an Abstract Syntax Tree (AST).
      • Syntactic correctness validation.
      • Independent of underlying engine.
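A rough sketch of the parse-and-validate step, assuming a tiny hypothetical grammar that covers only the windowed form `COMPUTE AGG(metric, N UNIT) [BY key] EVERY N UNIT` seen in the example queries. The `Aggregate` and `Compute` node types and the `parse` function are illustrative names, not StreamSight's actual parser or AST classes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical grammar for one query shape only; WHEN filters and WITH hints are not covered.
_QUERY = re.compile(
    r"^\s*COMPUTE\s+(?P<agg>[A-Z_]+)\(\s*(?P<metric>\w+)\s*,\s*"
    r"(?P<wlen>\d+)\s+(?P<wunit>SECONDS|MINUTES)\s*\)"
    r"(?:\s+BY\s+(?P<key>\w+))?"
    r"\s+EVERY\s+(?P<elen>\d+)\s+(?P<eunit>SECONDS|MINUTES)\s*;?\s*$"
)
_SECONDS = {"SECONDS": 1, "MINUTES": 60}


@dataclass(frozen=True)
class Aggregate:
    func: str       # e.g. "ARITHMETIC_MEAN"
    metric: str     # metric of interest
    window_s: int   # window length in seconds


@dataclass(frozen=True)
class Compute:
    expr: Aggregate
    group_by: Optional[str]  # group-by key, if any
    every_s: int             # updating interval in seconds


def parse(query: str) -> Compute:
    """Validate the syntax and return an engine-independent AST, or raise ValueError."""
    m = _QUERY.match(query)
    if m is None:
        raise ValueError("syntax error: " + query)
    expr = Aggregate(m["agg"], m["metric"], int(m["wlen"]) * _SECONDS[m["wunit"]])
    return Compute(expr, m["key"], int(m["elen"]) * _SECONDS[m["eunit"]])
```

Nothing in the resulting tree mentions a specific engine, which is the property that lets separate compilers target different back ends from the same AST.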
  • 12. AST Optimization
      • A naive AST is extremely inefficient and ignores the geo-distributed nature of IoT:
        • Unnecessary intermediate re-computations
        • Increased data movement
      • Cache and broadcast expressions, composites and results across worker nodes to reduce unnecessary re-computations.
      • Intermediate results can be shared among queries.
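The idea of sharing intermediate results among queries can be sketched as a cache keyed by sub-expression. This is a single-process illustration only: it does not model broadcasting across worker nodes, and the tuple-based AST encoding, the `SharedEvaluator` class and the `OPS` table are assumptions rather than StreamSight internals.

```python
from statistics import mean, pstdev

# Hypothetical leaf operators; tuples such as ("mean", "bus_delay") act as AST nodes.
OPS = {"mean": mean, "sdev": pstdev}


class SharedEvaluator:
    """Evaluates expression trees, computing each distinct sub-expression once."""

    def __init__(self, streams):
        self.streams = streams   # metric name -> list of values
        self.cache = {}          # AST node -> cached result
        self.evaluations = 0     # number of leaf aggregations actually computed

    def evaluate(self, node):
        if isinstance(node, (int, float)):
            return node          # numeric constant
        if node in self.cache:
            return self.cache[node]
        op = node[0]
        if op in OPS:
            self.evaluations += 1
            result = OPS[op](self.streams[node[1]])
        elif op == "+":
            result = self.evaluate(node[1]) + self.evaluate(node[2])
        elif op == "*":
            result = self.evaluate(node[1]) * self.evaluate(node[2])
        else:
            raise ValueError(f"unknown operator: {op}")
        self.cache[node] = result
        return result
```

With one evaluator shared by several queries, a threshold expression such as mean plus three standard deviations and a separate query on the same mean trigger two aggregations in total instead of three.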
  • 13. Other (User-Annotated) Optimizations…
      • Query Prioritization: on high-load influx, critical queries are not delayed. Example: COMPUTE MAX(taxis_fare_amount, 60 MINUTES) BY city_segment EVERY 1 MINUTES WITH SALIENCE 1
      • Sampling: query execution with bounded error guarantees for sampling; MAX_ERROR is the error upper bound and CONFIDENCE the confidence interval. Example: COMPUTE ARITHMETIC_MEAN(taxi_passengers, 10 MINUTES) EVERY 30 SECONDS WITH MAX_ERROR 0.05 AND CONFIDENCE 0.95
      • Low-Cost Adaptive Monitoring Techniques for the Internet of Things. D. Trihinas, G. Pallis and M. Dikaiakos, IEEE Trans. on Services Computing, 2018.
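One way to read the `MAX_ERROR 0.05 AND CONFIDENCE 0.95` annotation is as a check on the confidence interval of a sampled mean. The sketch below uses a textbook normal approximation and treats `MAX_ERROR` as a relative error; both are assumptions, and this is not the adaptive sampling technique of the cited paper. The function name `within_error_bound` is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def within_error_bound(sample, max_error=0.05, confidence=0.95):
    """True if the CI half-width of the sample mean is <= max_error * |mean|."""
    n = len(sample)
    if n < 2:
        return False  # too few readings to estimate the spread
    # Two-sided critical value of the standard normal for the given confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * stdev(sample) / math.sqrt(n)
    return half_width <= max_error * abs(mean(sample))
```

A sampler could keep drawing readings from the window until this check passes, trading a small, bounded loss of accuracy for less computation and data movement.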
  • 14. Other Optimizations…
      • Dedicated execution on specific nodes
      • Job optimization strategies
  • 15. 1 AST to Rule them ALL…
  • 16. Performance Evaluation
      • Dublin Smart City Bus Network: 968 buses (Jan 2014), 16 metrics/record incl. bus_id, bus_delay, city_segment
      • Used 7 insights introduced from the examples
      • 16 Edge servers: 1 vCPU, 1GB MEM, 2↑ 16↓ Mbps
      • Evaluation metric: batch processing time
      • [Plot] Regions: Unstable System, Stable System. Annotations: x1.4 speedup over baseline Spark; StreamSight+Sampling x4.3 over baseline
  • 17. Performance Evaluation (reusing results)
      • Dublin Bus Workload
      • Average Processing Time (fixed input rate 700 req/s)
      • StreamSight DOES NOT incur a performance overhead
      • [Plot] Annotation: Baseline failed
  • 18. StreamSight: A Query-Driven Framework Extending Streaming IoT Analytics to the Fog Continuum. Thank You!