Big data offers tremendous opportunities for transport process innovation and will have a profound economic and societal impact on mobility and logistics. For example, with annual growth rates of 3.2% for passenger transport and 4.5% for freight transport in the EU, making current mobility and logistics processes significantly more efficient will have a major impact. Improvements in operational efficiency enabled by big data are expected to save as much as EUR 440 billion globally in fuel and time within the mobility and logistics sector, and to cut CO2 emissions by 380 megatons. The mobility and logistics sector is ideally placed to benefit from big data technologies, as it already manages massive flows of goods and people while generating vast amounts of data. This talk reports on the main technical findings and lessons learned from applying big data in the transport domain.
Big Data Technology Insights
1. Big Data Technology Insights
Andreas Metzger
(paluno, TT Technical Coordinator)
2. TT Methodology
Rationale
• “No free lunch”
– Each data set, domain, use case is different
– Using a single data analytics solution will
most probably not work
For each of the 13 Pilots
– Data analytics solutions best suited for requirements and datasets
– Dedicated infrastructures best linked to data sources
Reuse of best practices, common requirements,
lessons learned, …
– Within pilots, across pilots, beyond project
TT, A. Metzger, Riga 2019 2
3. TT Methodology
3-Stage validation and scale-up
Stage: Technology Validation
– Embedding: Problem understanding and validation of key solution ideas
– Scale of data: (Historic) data pinpointing problems and opportunities
Stage: Large-scale Experiments
– Embedding: Controlled environment (not productive environment)
– Scale of data: Large historic and real-time data, possibly anonymized / simulated
Stage: In-situ (on site) trials
– Embedding: Trials in the field, involving actual end-users
– Scale of data: Real-time, live production data complementing historic data
4. TT Technical Results
5 main technical priorities
with individual sub-topics
Coverage of BDVA SRIA
[S. Zillner, E. Curry, A. Metzger, R. Seidl (Eds.), “European big
data value strategic research and innovation agenda (SRIA),”
Version 4.0, October, 2017]
5. Coverage of BDVA SRIA
Rating scale:
1 = Main focus
2 = Topic addressed (but not main focus)
3 = Topic marginally addressed
4 = Topic not addressed
Each topic has 13 ratings; deliverable labels on the slide: D4.4, D5.4, D6.4, D7.4, D8.4, D9.4, D10.4
Data Management
– Semantic annotation of unstructured and semi-structured data: 2 2 3 3 3 3 3 3 3 3 4 4 3
– Semantic interoperability: 3 3 3 4 3 3 3 3 3 3 4 4 3
– Data quality: 3 3 4 4 2 2 4 4 4 4 4 4 4
– Data lifecycle management and data governance: 4 4 4 4 4 4 4 4 3 3 3 3 3
– Integration of data and business processes: 3 2 3 4 4 4 4 4 4 4 4 4 4
– Data-as-a-service: 4 4 4 4 4 4 4 4 4 4 4 4 3
– Distributed trust infrastructures for data management: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Other (specify)
Data Processing Architectures
– Heterogeneity: 4 4 4 4 4 4 4 4 4 4 3 3 4
– Scalability: 3 3 3 3 3 3 3 3 3 3 3 3 3
– Processing of data-in-motion and data-at-rest: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Decentralization: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Performance: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Novel architectures for enabling new types of big data workloads: 3 3 4 4 4 4 4 4 4 4 4 4 4
– Introduction of new hardware capabilities: 4 4 4 3 4 4 4 4 4 4 4 4 3
– Other (specify)
Data Analytics
– Semantic and knowledge-based analysis: 3 2 3 3 2 2 3 3 2 2 2 2 2
– Content validation: 4 4 4 4 3 3 4 4 3 3 4 4 4
– Analytics frameworks & processing: 2 3 3 3 3 3 3 3 3 3 3 3 3
– Advanced business analytics and intelligence: 3 2 2 1 1 1 2 2 3 3 2 2 2
– Predictive and prescriptive analytics: 1 1 1 2 1 1 1 1 2 2 1 1 1
– High Performance Data Analytics (HPDA): 2 2 2 2 1 1 2 2 2 2 3 3 2
– Data analytics and Artificial Intelligence: 4 4 4 3 4 4 4 4 4 4 4 4 3
– Other (specify)
Data Protection
– Generic and easy to use data protection approaches: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Robust data privacy (incl. multi-party computation): 4 4 4 4 4 4 4 4 4 4 4 4 4
– Risk based approaches: 4 4 4 4 4 4 4 4 4 4 4 4 4
– Other (specify)
Data Visualisation and User Interaction
– Visual data discovery: 3 3 3 2 2 2 3 3 3 3 3 3 3
– Interactive visual analytics of multiple scale data: 2 2 3 2 2 2 2 2 3 3 2 2 2
– Collaborative, intuitive and interactive visual interfaces: 2 2 2 2 2 2 2 2 3 3 2 2 2
– Interactive visual data exploration and querying in a multi-device context: 2 2 2 2 2 2 2 2 3 3 2 2 2
– Other (specify)
6. Technical Lessons Learned
Analytics
“Garbage-in – garbage-out”
• Check and cope with missing data,
data accuracy, data timeliness,
different time-zones (clocks), …
Deep learning works very well
“out of the box”
• Use deep learning to make the development and
engineering of big data applications more efficient
(no need for extensive hyperparameter tuning)
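The “garbage-in, garbage-out” checks listed above could look roughly like the following minimal sketch. It covers missing data, timeliness and time zones only; accuracy checks (for example plausible value ranges) depend on the data set and are left out. The record schema (`sensor_id`, `timestamp`, `value`) and the function name `check_record` are hypothetical and not taken from the talk.

```python
from datetime import datetime, timedelta, timezone
from typing import List

# Hypothetical schema: fields every incoming record is assumed to need.
REQUIRED_FIELDS = ("sensor_id", "timestamp", "value")


def check_record(record: dict, now: datetime, max_age: timedelta) -> List[str]:
    """Return the data-quality issues found in one record (empty list = clean)."""
    issues = []

    # Missing data: a required field is absent or None.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            issues.append(f"missing:{field}")

    ts_raw = record.get("timestamp")
    if ts_raw is None:
        return issues

    try:
        ts = datetime.fromisoformat(ts_raw)  # expects ISO 8601, e.g. 2019-05-01T13:30:00+02:00
    except (TypeError, ValueError):
        issues.append("bad_timestamp")
        return issues

    if ts.tzinfo is None:
        # Different clocks / time zones: a timestamp without an offset is ambiguous.
        issues.append("no_timezone")
        return issues

    # Normalise to UTC before comparing timestamps from different sources.
    ts_utc = ts.astimezone(timezone.utc)
    if ts_utc > now:
        issues.append("future_timestamp")
    elif now - ts_utc > max_age:
        # Timeliness: the record is older than the analytics can tolerate.
        issues.append("stale")
    return issues
```

Records that return a non-empty list can be dropped, repaired or flagged before they reach the analytics, depending on the use case.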
[A. Metzger & A. Neubauer, “Considering non-sequential
control flows for process prediction with recurrent
neural networks,” in SEAA 2018, Prague, Czech Republic,
IEEE Computer Society]
[Figure: Accuracy [MCC] (y-axis, 0.0 to 0.7) per checkpoint 1 to 10 (x-axis); series: MLP, RNN]
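The accuracy axis is labelled MCC, which is assumed here to stand for the Matthews correlation coefficient (the slide does not expand the abbreviation). As a reminder of what that metric measures, a minimal sketch computed from binary confusion-matrix counts; the function name and the zero-denominator convention are choices made for this illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) over 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention used in this sketch: return 0 when a row or column
        # of the confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike plain accuracy, MCC stays informative when one class is rare, because it uses all four cells of the confusion matrix.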
7. Technical Lessons Learned
Analytics
Operators benefit from
knowing data accuracy
• Augment data (actual or
predicted) with
confidence intervals,
error ranges, reliability
estimates, …
[A. Metzger et al., “Proactive process adaptation using deep
learning ensembles,” in CAiSE 2019, Rome, Italy, Springer;
Open Access: https://doi.org/10.1007/978-3-030-21290-2_34]
[Figure: Terminal Productivity Cockpit; labelled elements: Alarm, Reliability Estimate]
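One simple way to attach a reliability estimate to a predicted alarm is to let an ensemble of models vote and report how strongly they agree. The sketch below assumes binary per-model predictions and a hypothetical reliability threshold of 0.8; it illustrates the general idea only and is not claimed to be the exact procedure of the cited paper.

```python
from typing import Sequence, Tuple


def ensemble_alarm(predictions: Sequence[bool], threshold: float = 0.8) -> Tuple[bool, float]:
    """Combine binary ensemble predictions into an alarm decision plus reliability.

    predictions: one value per ensemble member, True = member predicts a problem.
    Returns (raise_alarm, reliability), where reliability is the share of
    members that agree with the majority vote.
    """
    if not predictions:
        raise ValueError("need at least one prediction")

    share_positive = sum(1 for p in predictions if p) / len(predictions)
    majority_positive = share_positive >= 0.5
    reliability = share_positive if majority_positive else 1.0 - share_positive

    # Only raise the alarm when the majority predicts a problem AND
    # the ensemble agrees strongly enough (threshold is an assumption).
    raise_alarm = majority_positive and reliability >= threshold
    return raise_alarm, reliability
```

Showing the reliability value next to the alarm lets an operator decide how much weight to give it, and the threshold trades off missed problems against false alarms.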
8. Technical Lessons Learned
Visualization
Do not show too much information
• Show information hierarchically
(top-down: summary → details)
• Use intuitive widgets that are quick to grasp
• Only show critical and validated events
Static UIs may be limiting
• Allow easy, ad hoc customization of
visualizations
9. Technical Lessons Learned
Data Management
Data availability does not mean fitness for purpose
• Define the data analytics / visualization goal first, then
determine which data are needed and how to access them
(or combine both: bottom-up + top-down)
Data quality and integration take around 80% of
effort/time
• Plan sufficient time at project start for data refinement
and fine-tuning of data collection
10. Finally…
What’s next?
AI in Transport
Autonomic
Enactment
• Reinforcement
learning for solving
complex planning and
decision problems
• Actuation driven by AI
decisions
• Safety and
trustworthiness as key
requirements for
adoption
Improved
Decision Making
• Deep learning for high
accuracy descriptive and
predictive analytics
[S. Zillner, J.A. Gomez, A. Garcia, E. Curry (Eds.), “Data for
Artificial Intelligence for European economic competitiveness
and societal progress – BDVA position statement,” 2018]
11. Thanks!
Research leading to these results has received
funding from the EU’s Horizon 2020 research and
innovation programme under grant agreements no.
731932 – http://www.transformingtransport.eu
732630 – http://www.big-data-value.eu