Miguel Pérez Colino // @mmmmmmpc
CLOUD OPERATIONS WITH STREAMING
ANALYTICS USING BIG DATA TOOLS
DataWorks Summit Sydney 2017
Miguel Pérez Colino
Senior Design Product Manager, ISBU - Red Hat
miguel@redhat.com / @mmmmmmpc
Suneel Marthi
Senior Principal Software Engineer - Red Hat
smarthi@redhat.com / @suneelmarthi
THE PROBLEM
Cloud Deployments
Act as one single thing …
… and need to be managed and operated as one
Source: https://commons.wikimedia.org/wiki/File:Auklet_flock_Shumagins_1986.jpg
Cloud Deployments
They do really scale ...
https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/
● Higher scalability
● More workloads per physical
machine (multi-tenant)
● Network and Storage also
Software Defined
● Containers and
Microservices providing
more granularity
THE CHALLENGE
Questions to solve
● Who is the user?
● What is their problem?
● How do other people solve this problem?
● How can we better solve the problem?
● What would the end result look/feel like?
[DESIGN THINKING]
THE BEST WAY TO HAVE A GOOD
IDEA IS TO HAVE LOTS OF IDEAS.
Who is the user? (Personas)
● Cloud Ops
● Developer
● Security Ops
● Monitoring
● Service Designer
● Marketing
● IT Manager
● Infrastructure Architect?
Customers’ issues are mostly
“Day 2” → Operations
● Operate OpenStack
● Operate OpenShift
○ Platform Ops
○ Developer logs
Logs → root cause analysis + forensics
Logs
Config
Telemetry
App debug info
Events
Monitoring
Provides Events,
Consumes Logs
Cloud Ops
Root Cause Analysis
Developer
App Analysis & Debug
Security Engineer
Sec Analysis, Audits
Marketing
Access to stats
Service Designer / IT Manager
Access to aggregated
data, i.e. SLA, usage
Personae
What are their problems?
● Data aggregation
○ Ingestion
○ Transport
● Data Model → Common Data Model
● Correlation
○ With external sources (Events / Metrics / Config …)
○ Add more Information types to the solution
● Coherency (Data format and Enrichment)
Data (What)
Data + Information flow in Log Aggregation
Generate → Ingest → Collect → Process → Store → Query → View
Derived from: http://www.dataintensive.info/
Personae (Who)
That can use Log Aggregation
Log Aggregation
Monitoring
Provides Events,
Consumes Logs
Cloud Ops
Root Cause
Analysis
Developer
App Analysis &
Debug
Security Engineer
Sec Analysis, Audits
User /
Marketing
Access to stats
Service Designer / IT Manager
Access to
aggregated data,
i.e. SLA, usage
Personae (Motivation)
That need Log Aggregation
Cloud Ops (Apps)
“I want to proactively know
about active or potential
degradation of service”
Cloud Ops (OpenStack)
“User reports that their VM
request failed and returned
error”
Developer (OpenShift)
“My recent commit resulted in
Jenkins test failure”
“Application (multi-tiered)
launched from CloudForms
returns error”
Cloud Suite User
Situational Awareness (Why)
Or the need for it!
Source: https://en.wikipedia.org/wiki/Situation_awareness
THE SOLUTION
Focus on One Persona and Use Case
“Oscar the OpenStack Operator”
Log Aggregation
Monitoring
Provides Events,
Consumes Logs
Cloud Ops
Root Cause
Analysis
Developer
App Analysis &
Debug
Security Engineer
Sec Analysis, Audits
User /
Marketing
Access to stats
Service Designer / IT Manager
Access to
aggregated data,
i.e. SLA, usage
Prototyped User Experience
Creating User Interface Mockups
Implementation
Red Hat’s containerized solution with EFK stack
Elasticsearch, Fluentd, Kibana
Create → Ingest → Collect → Process → Store → Query → View
Implementation
KEEDIO’s containerized solution with a Big Data toolset
Create → Ingest → Collect → Process → Store → Query → View
● Create: Rsyslog
● Ingest: Flume / NiFi
● Collect: Kafka
● Process: Spark / Flink
● Store + Query: SOLR / Cassandra, HDFS (tier 2)
● View: PatternFly
Implementation: Generation
Rsyslog
What?
● Open-source software used for
forwarding log messages in a network.
● Implements the syslog protocol
Why?
● Fast system for log processing.
● High performance, Low footprint,
included in the OS
● Inputs from wide variety of sources
Implementation: Ingestion
Apache NiFi
What?
● Reliable system to process and
distribute data
● Language: Java
Why?
● Graphical management
● Clusterizable
● Data Provenance
● Many sources and destinations
Use Case: Ingestion
Apache NiFi
Easily customize “tagging” and processing
rules via Graphical User Interface
Review steps with data provenance
“Like having an IDE and a Debugger for
data processing rules.”
Implementation: Collect
Apache Kafka
What?
● Open-source distributed messaging
system
● Languages: Java & Scala
Why?
● High throughput and low-latency
● Clusterable, load balancing and async
send.
● Allows handling real-time data feeds
● Customizable data retention on disk
● Enables multiple consumers on the
same data
● “Rewind and Replay”
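The retention, multiple-consumer and "Rewind and Replay" points above can be illustrated with a toy model. This is a minimal sketch in plain Python, not the Kafka client API: `ToyLog`, `poll` and `seek` are hypothetical names that only mimic the idea of an append-only log where each consumer keeps its own offset.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLog:
    """Append-only in-memory log; a stand-in for one topic partition."""
    records: list = field(default_factory=list)
    offsets: dict = field(default_factory=dict)  # consumer name -> next offset

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def poll(self, consumer, max_records=10):
        """Return unread records for this consumer and advance its offset."""
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:start + max_records]
        self.offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        """Move a consumer to an arbitrary offset, clamped to the log bounds."""
        self.offsets[consumer] = max(0, min(offset, len(self.records)))


log = ToyLog()
for line in ["nova: boot requested", "neutron: port requested", "nova: boot failed"]:
    log.append(line)

first_pass = log.poll("indexer")   # one consumer reads everything
alert_pass = log.poll("alerter")   # a second consumer sees the same data
log.seek("indexer", 0)             # rewind ...
replayed = log.poll("indexer")     # ... and replay from the start
```

Because records are retained rather than deleted on read, a consumer added later (or one whose processing rules changed) can go back over earlier events.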
Implementation: Process
Apache Flink
What?
● Open-source stream processing
framework for distributed, high-
performing, always-available, and
accurate data streaming apps.
● Language: Java, Scala
Why?
● Streaming-first, continuous processing
● Fault-tolerant, stateful computations
● Scalable and performant: high
throughput, low latency
● Advanced filtering capabilities (CEP)
Use Case: Collect + Process
Apache Kafka + Flink
● Long retention periods in the queue
let new post-processing targets
consume previous events
● Only the right info sent to the right
target
● Detect anomalies and trigger alerts
Use Case: Collect + Process
Apache Kafka + Flink
● Different storage targets with filtered post
processed output
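As a rough illustration of sending filtered output to different storage targets, the sketch below routes processed events by severity and service. The routing rules are hypothetical and chosen only for the example; the target names follow the stores named in the implementation slide (HDFS, Cassandra, SOLR), and no real client library is called.

```python
def route(event):
    """Pick the storage targets for one processed event (hypothetical rules)."""
    targets = ["hdfs"]                      # assumption: everything is archived
    if int(event.get("severity", 7)) <= 4:  # syslog: lower number = more severe
        targets.append("cassandra")
    if event.get("service") in {"Nova", "Neutron"}:
        targets.append("solr")
    return targets


events = [
    {"service": "Nova", "severity": "3", "body": "Instance failed network setup"},
    {"service": "Neutron", "severity": "6", "body": "POST /v2.0/ports.json"},
    {"service": "Cron", "severity": "6", "body": "job finished"},
]
routed = {event["service"]: route(event) for event in events}
```

In a real deployment the same decision would live in the stream processing job, with one sink per target.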
Use Case: Collect + Process
Apache Kafka + Flink
● Alerts sent to Kafka. A listener can enable
all kinds of alerts
Alert Listener → Telegram, E-Mail
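The listener idea can be sketched as a small fan-out: one component reads each alert and hands it to every registered channel. This is an assumption-laden sketch, not the authors' implementation. `AlertListener` is a hypothetical class, the Kafka consumer loop is left out, and the Telegram and e-mail senders are replaced by stub callables that record what they would send.

```python
import json


class AlertListener:
    """Fans each alert out to every registered notifier callable."""

    def __init__(self):
        self.notifiers = {}

    def register(self, name, notifier):
        self.notifiers[name] = notifier

    def handle(self, raw_message):
        """Decode one JSON alert and pass a short summary to each notifier."""
        alert = json.loads(raw_message)
        text = f"[{alert['service']}] severity {alert['severity']} on {alert['hostname']}"
        delivered = []
        for name, notifier in self.notifiers.items():
            notifier(text)
            delivered.append(name)
        return delivered


sent = []  # stands in for the real Telegram and e-mail clients
listener = AlertListener()
listener.register("telegram", lambda text: sent.append(("telegram", text)))
listener.register("email", lambda text: sent.append(("email", text)))

delivered = listener.handle(
    '{"severity": "3", "hostname": "overcloud-compute-1", "service": "Nova"}'
)
```

Adding a new alert channel then only needs one more `register` call; the processing job that produces the alerts does not change.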
Implementation: Store + Query
Apache Cassandra
What?
● Open source NoSQL database, <key,
value> based
● Language: Java
Why?
● Fault tolerant
● Decentralized & scalable
● Proven and highly performant
● Flexible data model
Implementation: View
PatternFly
What?
● Open Source responsive framework for
frontends
● Language: JavaScript, Bootstrap,
AngularJS 1
Why?
● Easy to implement new interfaces
● Includes capabilities for graphs
(D3.js + C3.js)
● Natively responsive (mobile / tablet)
● Well supported and extended (Used in
most Red Hat products)
Implementation
Infrastructure
Deployment
Deployment: View
PatternFly
USE CASE EXAMPLE (CEP)
Use Case: OpenStack Timeouts
Network timeout is 30 seconds by default
1. Request of VM
2. Request of vPort (Virtual NIC)
3. vPort generated in more than 30 secs → Timeout!
4. Error generating VM
5. No error generating vPort
Need correlation to detect
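The correlation described in the steps above can be sketched as follows. This is plain Python rather than the Flink CEP API, and the event fields (`service`, `kind`, `ts`, `duration`) and the 10-second matching window are assumptions made for the example. The rule pairs a Nova port timeout with a later Neutron port creation that took longer than the 30-second default.

```python
from datetime import datetime, timedelta

TIMEOUT_SECS = 30.0             # default network timeout from the slide
WINDOW = timedelta(seconds=10)  # assumed correlation window


def correlate(events):
    """Pair each Nova port timeout with a later slow Neutron port creation."""
    alerts = []
    pending = []  # Nova timeouts still waiting for a matching Neutron event
    for ev in sorted(events, key=lambda e: e["ts"]):
        if ev["service"] == "nova" and ev["kind"] == "port_timeout":
            pending.append(ev)
        elif ev["service"] == "neutron" and ev["kind"] == "port_created":
            if ev["duration"] <= TIMEOUT_SECS:
                continue  # port was created in time, nothing to correlate
            # Forget Nova timeouts that are too old to belong to this event.
            pending = [p for p in pending if ev["ts"] - p["ts"] <= WINDOW]
            if pending:
                alerts.append((pending.pop(0), ev))
    return alerts


events = [
    {"service": "nova", "kind": "port_timeout",
     "ts": datetime(2016, 12, 5, 10, 28, 14)},
    {"service": "neutron", "kind": "port_created",
     "ts": datetime(2016, 12, 5, 10, 28, 16), "duration": 32.6},
    {"service": "neutron", "kind": "port_created",
     "ts": datetime(2016, 12, 5, 10, 29, 0), "duration": 1.2},
]
alerts = correlate(events)
```

Neither event is alarming on its own terms: Neutron logs a success and Nova logs a generic timeout. Only the pair points at the root cause.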
Use Case: OpenStack Timeouts
What we see ...
Error in Nova
2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager [req-190de497-d90f-48e0-91ea-f1f1c0877704 688ae4039aad471fbab98da1b1e1fcb6 e21be8c7ab34490386508bbd0c58f511 - - -] Instance failed network setup after 1 attempt(s)
2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager ConnectTimeout: Request to https://[::1]:9696/v2.0/ports.json timed out
Info in Neutron
2016-12-05 10:28:16.878 13187 INFO neutron.wsgi [req-827495e1-2ae2-41c1-b51b-2eda57f4ba1d 688ae4039aad471fbab98da1b1e1fcb6 e21be8c7ab34490386508bbd0c58f511 - - -] ::1 - - [05/Dec/2016 10:28:16] "POST /v2.0/ports.json HTTP/1.1" 201 900 32.589028
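To make the two log lines comparable, they first have to be turned into structured records. The sketch below parses the Nova timeout line and the Neutron access line shown above with regular expressions. The patterns are assumptions fitted to these two samples, not a general OpenStack log grammar, and the field names are hypothetical.

```python
import re
from datetime import datetime

# Common prefix: timestamp, PID, level, module, then the free-form message.
HEADER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>[\w.]+) (?P<message>.*)$"
)
# Tail of a WSGI access line: request, status, body length, duration in seconds.
WSGI = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r"(?P<status>\d{3}) (?P<length>\d+) (?P<seconds>\d+\.\d+)$"
)


def parse(line):
    """Return a dict for one log line, or None if the prefix does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S.%f")
    wsgi = WSGI.search(record["message"])
    if wsgi:
        record["method"] = wsgi.group("method")
        record["path"] = wsgi.group("path")
        record["status"] = int(wsgi.group("status"))
        record["seconds"] = float(wsgi.group("seconds"))
    return record


nova = parse(
    "2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager ConnectTimeout: "
    "Request to https://[::1]:9696/v2.0/ports.json timed out"
)
neutron = parse(
    "2016-12-05 10:28:16.878 13187 INFO neutron.wsgi "
    "[req-827495e1-2ae2-41c1-b51b-2eda57f4ba1d 688ae4039aad471fbab98da1b1e1fcb6 "
    "e21be8c7ab34490386508bbd0c58f511 - - -] ::1 - - [05/Dec/2016 10:28:16] "
    '"POST /v2.0/ports.json HTTP/1.1" 201 900 32.589028'
)
gap = (neutron["ts"] - nova["ts"]).total_seconds()
```

With these samples, the Neutron request reports a duration of 32.589028 seconds and finishes about 2.6 seconds after the Nova error. Subtracting the duration from the Neutron timestamp puts the start of the request roughly 30 seconds before the Nova error, which is consistent with the default timeout.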
Use Case: OpenStack Timeouts
Both lines detected and correlated, and an alert generated. → Alert sent to Kafka
ErrorAlert:
Nova-3-2017-04-28 12:48:20.321
Neutron-6-2017-04-28 12:48:23.123
{"severity":"3","body":"[ Generating synthetic log CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","spriority":"191","hostname":"overcloud-compute-1","protocol":"TCP","port":"7790","sender":"/192.168.1.16","service":"Nova","id":"c1318482-11a1-41cd-949e-5195c54767e5","facility":"23","timestamp":"2017-04-28 12:48:20.321"}
{"severity":"6","body":"[ Generating synthetic log CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","spriority":"191","hostname":"overcloud-controller-1","protocol":"TCP","port":"7793","sender":"/192.168.1.13","service":"Neutron","id":"e617d049-7e40-4114-8727-c6c41140567e","facility":"23","timestamp":"2017-04-28 12:48:23.123"}
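Both alert records carry the same `CEP_ID` in their `body`, which is what ties them together downstream. A consumer could regroup them as in the sketch below. The helper name `group_by_cep_id` is hypothetical, and the sample messages are trimmed copies of the two records above that keep only the fields the code reads.

```python
import json
import re
from collections import defaultdict

CEP_ID = re.compile(r"CEP_ID=([0-9a-f]+)")


def group_by_cep_id(raw_messages):
    """Group (service, timestamp) pairs by the CEP_ID found in each body."""
    groups = defaultdict(list)
    for raw in raw_messages:
        event = json.loads(raw)
        match = CEP_ID.search(event.get("body", ""))
        if match:
            groups[match.group(1)].append((event["service"], event["timestamp"]))
    return dict(groups)


messages = [
    '{"severity":"3","body":"[ Generating synthetic log '
    'CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","service":"Nova",'
    '"timestamp":"2017-04-28 12:48:20.321"}',
    '{"severity":"6","body":"[ Generating synthetic log '
    'CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","service":"Neutron",'
    '"timestamp":"2017-04-28 12:48:23.123"}',
]
groups = group_by_cep_id(messages)
```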
Use Case: OpenStack Timeouts
Both lines detected and correlated, and an alert generated. → Alert routed to Telegram
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
BACKUP SLIDES
Deployment

Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Último

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Último (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Cloud Operations with Streaming Analytics using Apache NiFi and Apache Flink

  • 1. CLOUD OPERATIONS WITH STREAMING ANALYTICS USING BIG DATA TOOLS. DataWorks Summit Sydney 2017.
      Miguel Pérez Colino, Senior Design Product Manager, ISBU - Red Hat, miguel@redhat.com / @mmmmmmpc
      Suneel Marthi, Senior Principal Software Engineer - Red Hat, smarthi@redhat.com / @suneelmarthi
  • 2. THE PROBLEM
  • 3. Cloud Deployments: Act as one single thing … and need to be managed and operated as one.
      Source: https://commons.wikimedia.org/wiki/File:Auklet_flock_Shumagins_1986.jpg
  • 4. Cloud Deployments: They do really scale ...
      https://www.cncf.io/blog/2016/08/23/deploying-1000-nodes-of-openshift-on-the-cncf-cluster-part-1/
      ● Higher scalability
      ● More workloads per physical machine (multi-tenant)
      ● Network and Storage also Software Defined
      ● Containers and Microservices providing more granularity
  • 5. THE CHALLENGE
  • 6. Questions to solve
      ● Who is the user?
      ● What is their problem?
      ● How do other people solve this problem?
      ● How can we better solve the problem?
      ● What would the end result look/feel like?
  • 7. [DESIGN THINKING] THE BEST WAY TO HAVE A GOOD IDEA IS TO HAVE LOTS OF IDEAS.
  • 8. Who is the user? (Personas)
      ● Cloud Ops
      ● Developer
      ● Security Ops
      ● Monitoring
      ● Service Designer
      ● Marketing
      ● IT Manager
      ● Infrastructure Architect?
      Customer’s issues are mostly “Day 2” → Operations
      ● Operate OpenStack
      ● Operate OpenShift
        ○ Platform Ops
        ○ Developer logs
      Logs → root cause analysis + forensics
  • 9. Personae
      Logs, Config, Telemetry, App debug info, Events
      ● Monitoring: Provides Events, Consumes Logs
      ● Cloud Ops: Root Cause Analysis
      ● Developer: App Analysis & Debug
      ● Security Engineer: Sec Analysis, Audits
      ● Marketing: Access to stats
      ● Service Designer, IT Manager: Access to aggregated data, e.g. SLA, usage
  • 10. What are their problems?
      ● Data aggregation
        ○ Ingestion
        ○ Transport
      ● Data Model → Common Data Model
      ● Correlation
        ○ With external sources (Events / Metrics / Config …)
        ○ Add more Information types to the solution
      ● Coherency (Data format and Enrichment)
  • 11. Data (What): Data + Information flow in Log Aggregation
      Stages shown: Generate, Ingest, Collect, Process, Store, Query, View
      Derived from: http://www.dataintensive.info/
  • 12. Personae (Who): That can use Log Aggregation
      ● Monitoring: Provides Events, Consumes Logs
      ● Cloud Ops: Root Cause Analysis
      ● Developer: App Analysis & Debug
      ● Security Engineer: Sec Analysis, Audits
      ● User / Marketing: Access to stats
      ● Service Designer, IT Manager: Access to aggregated data, e.g. SLA, usage
  • 13. Personae (Motivation): That need Log Aggregation
      ● Cloud Ops (Apps): “I want to proactively know about active or potential degradation of service”
      ● Cloud Ops (OpenStack): “User reports that their VM request failed and returned error”
      ● Developer (OpenShift): “My recent commit resulted in Jenkins test failure”
      ● Cloud Suite User: “Application (multi-tiered) launched from CloudForms returns error”
  • 14. Situational Awareness (Why): Or the need of it!
      Source: https://en.wikipedia.org/wiki/Situation_awareness
  • 15. THE SOLUTION
  • 16. Focus on One Persona and Use Case: “Oscar the OpenStack Operator”
      Log Aggregation personae:
      ● Monitoring: Provides Events, Consumes Logs
      ● Cloud Ops: Root Cause Analysis
      ● Developer: App Analysis & Debug
      ● Security Engineer: Sec Analysis, Audits
      ● User / Marketing: Access to stats
      ● Service Designer, IT Manager: Access to aggregated data, e.g. SLA, usage
  • 17. Prototyped User Experience: Creating User Interface Mockups
  • 18. Implementation: Red Hat’s containerized solution with EFK stack (Elastic, Fluent, Kibana)
      Stages shown: Create, Ingest, Collect, Process, Store, Query, View
  • 19. Implementation: KEEDIO’s containerized solution with a Big Data toolset
      Tools shown: Rsyslog, Flume / NiFi, Kafka, Spark / Flink, SOLR / Cassandra, HDFS (tier 2), PatternFly
      Stages shown: Create, Ingest, Collect, Process, Store, Query, View
  • 20. Implementation: Generation. Rsyslog
      What?
      ● Open-source software used for forwarding log messages in a network
      ● Implements the syslog protocol
      Why?
      ● Fast system for log processing
      ● High performance, low footprint, included in the OS
      ● Inputs from a wide variety of sources
  • 21. Implementation: Ingestion. Apache NiFi
      What?
      ● Reliable system to process and distribute data
      ● Language: Java
      Why?
      ● Graphical management
      ● Can be clustered
      ● Data Provenance
      ● Many sources and destinations
  • 22. Use Case: Ingestion. Apache NiFi
      Easily customize “tagging” and processing rules via the Graphical User Interface.
      Review steps with data provenance.
      “Like having an IDE and a Debugger for data processing rules.”
  • 23. Implementation: Collect. Apache Kafka
      What?
      ● Open-source distributed messaging system
      ● Languages: Java & Scala
      Why?
      ● High throughput and low latency
      ● Can be clustered, with load balancing and async send
      ● Allows handling real-time data feeds
      ● Customizable data retention on disk
      ● Enables multiple consumers on the same data
      ● “Rewind and Replay”
  • 24. Implementation: Process. Apache Flink
      What?
      ● Open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming apps
      ● Languages: Java, Scala
      Why?
      ● Streaming-first, continuous processing
      ● Fault-tolerant, stateful computations
      ● Scalable and performant: high throughput, low latency
      ● Advanced filtering capabilities (CEP)
  • 25. Use Case: Collect + Process. Apache Kafka + Flink
      ● Long retention periods in the queue let new post-processing targets consume previous events
      ● Only the right info is sent to the right target
      ● Detect anomalies and trigger alerts
  • 26. Use Case: Collect + Process. Apache Kafka + Flink
      ● Different storage targets with filtered, post-processed output
  • 27. Use Case: Collect + Process. Apache Kafka + Flink
      ● Alerts are sent to Kafka. A listener can enable all kinds of alerts
      Alert Listener: Telegram, E-Mail
  • 28. Implementation: Store + Query. Apache Cassandra
      What?
      ● Open-source NoSQL database, <key, value> based
      ● Language: Java
      Why?
      ● Fault tolerant
      ● Decentralized & scalable
      ● Fully proven & high-performing
      ● Flexible data model
  • 29. Implementation: View. PatternFly
      What?
      ● Open-source responsive framework for frontends
      ● Language: JavaScript, Bootstrap, AngularJS 1
      Why?
      ● Easy to implement new interfaces
      ● Includes capabilities for graphs (d3 JS + c3 JS)
      ● Natively responsive (mobile / tablet)
      ● Well supported and extended (used in most Red Hat products)
  • 30. Implementation: Infrastructure
  • 31. Deployment
  • 32. Deployment: View. PatternFly
  • 33. Deployment: View. PatternFly
  • 34. Deployment: View. PatternFly
  • 35. USE CASE EXAMPLE (CEP)
  • 36. Use Case: OpenStack Timeouts. Network timeout by default: 30 secs
      1. Request of VM
      2. Request of vPort (Virtual NIC)
      3. vPort generated in more than 30 secs → Timeout!
      4. Error generating VM
      5. No error generating vPort
      Need correlation to detect
  • 37. Use Case: OpenStack Timeouts. What we see ...
      Error in Nova:
      2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager [req-190de497-d90f-48e0-91ea-f1f1c0877704 688ae4039aad471fbab98da1b1e1fcb6 e21be8c7ab34490386508bbd0c58f511 - - -] Instance failed network setup after 1 attempt(s)
      2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager ConnectTimeout: Request to https://[::1]:9696/v2.0/ports.json timed out
      Info in Neutron:
      2016-12-05 10:28:16.878 13187 INFO neutron.wsgi [req-827495e1-2ae2-41c1-b51b-2eda57f4ba1d 688ae4039aad471fbab98da1b1e1fcb6 e21be8c7ab34490386508bbd0c58f511 - - -] ::1 - - [05/Dec/2016 10:28:16] "POST /v2.0/ports.json HTTP/1.1" 201 900 32.589028
  • 38. Use Case: OpenStack Timeouts. Both lines detected, correlated, and alert generated → Alert sent to Kafka
      ErrorAlert: Nova-3-2017-04-28 12:48:20.321 Neutron-6-2017-04-28 12:48:23.123
      {"severity":"3","body":"[ Generating synthetic log CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","spriority":"191","hostname":"overcloud-compute-1","protocol":"TCP","port":"7790","sender":"/192.168.1.16","service":"Nova","id":"c1318482-11a1-41cd-949e-5195c54767e5","facility":"23","timestamp":"2017-04-28 12:48:20.321"}
      {"severity":"6","body":"[ Generating synthetic log CEP_ID=67c8c1cc3d48c3987aee13dce5cf35a1]","spriority":"191","hostname":"overcloud-controller-1","protocol":"TCP","port":"7793","sender":"/192.168.1.13","service":"Neutron","id":"e617d049-7e40-4114-8727-c6c41140567e","facility":"23","timestamp":"2017-04-28 12:48:23.123"}
  • 39. Use Case: OpenStack Timeouts. Both lines detected, correlated, and alert generated → Alert routed to Telegram
  • 40. THANK YOU
      plus.google.com/+RedHat
      linkedin.com/company/red-hat
      youtube.com/user/RedHatVideos
      facebook.com/redhatinc
      twitter.com/RedHatNews
  • 41. BACKUP SLIDES
  • 42. Deployment
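
The correlation described in slides 36 to 38 can be illustrated with a small sketch. It is not the Flink CEP job used in the talk, which the slides do not show. It is plain Python that applies the same idea to two of the sample log lines from slide 37 (copied with line-wrap artifacts removed). The function names, the alert dictionary layout, and the 10-second correlation window are assumptions made for illustration. Only the 30-second timeout comes from the slides.

```python
import re
from datetime import datetime, timedelta

TIMEOUT_SECS = 30.0             # default network timeout quoted on slide 36
WINDOW = timedelta(seconds=10)  # assumed correlation window, not from the slides

# Layout of the sample lines: timestamp, pid, level, module, free-text message.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) (?P<module>[\w.]+) (?P<msg>.*)$"
)

def parse(line):
    """Split one log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    event = m.groupdict()
    event["ts"] = datetime.strptime(event["ts"], "%Y-%m-%d %H:%M:%S.%f")
    return event

def port_create_duration(event):
    """Duration in seconds of a Neutron port-creation request, else None."""
    if event["module"] != "neutron.wsgi":
        return None
    if "POST /v2.0/ports.json" not in event["msg"]:
        return None
    try:
        # In the sample line the last field is the request duration.
        return float(event["msg"].rsplit(" ", 1)[-1])
    except ValueError:
        return None

def correlate(events):
    """Pair Nova timeout errors with slow Neutron port creations close in time."""
    nova_errors = [
        e for e in events
        if e["level"] == "ERROR"
        and e["module"].startswith("nova.")
        and "timed out" in e["msg"]
    ]
    alerts = []
    for e in events:
        duration = port_create_duration(e)
        if duration is None or duration <= TIMEOUT_SECS:
            continue
        for n in nova_errors:
            if abs(e["ts"] - n["ts"]) <= WINDOW:
                alerts.append({"nova_ts": n["ts"], "neutron_ts": e["ts"],
                               "duration": duration})
    return alerts

SAMPLE_LINES = [
    "2016-12-05 10:28:14.292 10253 ERROR nova.compute.manager ConnectTimeout: "
    "Request to https://[::1]:9696/v2.0/ports.json timed out",
    '2016-12-05 10:28:16.878 13187 INFO neutron.wsgi '
    '[req-827495e1-2ae2-41c1-b51b-2eda57f4ba1d 688ae4039aad471fbab98da1b1e1fcb6 '
    'e21be8c7ab34490386508bbd0c58f511 - - -] ::1 - - [05/Dec/2016 10:28:16] '
    '"POST /v2.0/ports.json HTTP/1.1" 201 900 32.589028',
]

if __name__ == "__main__":
    for alert in correlate([parse(line) for line in SAMPLE_LINES]):
        print(alert)
```

Neither line is alarming on its own: Neutron reports a successful port creation (HTTP 201) and Nova only sees a timeout. The rule fires because the Neutron request took longer than the 30-second timeout and a Nova timeout error falls close to it in time, which is the correlation slide 36 says is needed.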

Editor's notes

  1. Flink
      ● Session windows → enable CEP
      ● Checkpointing → takes periodic snapshots of state so that processing can recover if the system goes down
      ● Exactly-once semantics → the same thing is not processed twice
      ● Sub-second latency (Spark doesn’t provide)
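
To make the session-window idea from the note above concrete, here is a toy sketch in plain Python. It is not Flink code and does not use the Flink API. The function name `session_windows` and the sample timestamps are hypothetical, and the handling of events that are exactly one gap apart is an assumption of this sketch.

```python
def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by periods of inactivity.

    A new session starts whenever the time since the previous event is
    at least `gap` (boundary choice is an assumption of this toy model).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)   # still inside the current session
        else:
            sessions.append([ts])     # inactivity gap reached: open a new one
    return sessions


if __name__ == "__main__":
    # Event times in seconds; gap of 5 seconds of inactivity closes a session.
    print(session_windows([0, 2, 3, 20, 21, 50], gap=5))
```

Unlike fixed-size windows, the window boundaries here follow the data: bursts of related log events end up in the same group regardless of when the burst starts.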