SlideShare una empresa de Scribd logo
1 de 22
Descargar para leer sin conexión
Apache NiFi
Better Analytics Demand Better Dataflow
Presented by: Joe Witt
Apache NiFi PPMC Member
2	
  
#nifidemo
History
3	
  
•  Developed at NSA for over eight years
•  Donated to the Apache Software Foundation Nov 2014
•  Undergoing incubation
•  Three ASF releases to date
•  0.1.0 out last night!
	
  
The problem space: Enterprise Dataflow
4	
  
Automate the flow of
data from any source
…to systems which
extract meaning and
insight
…and to those that
store and make it
available for users
The challenges we faced
5	
  
•  Transport / Messaging was not enough
•  Needed to understand the big picture
•  Needed the ability to make *immediate* changes
•  Must maintain chain of custody for data
•  Rigorous security and compliance requirements
	
  
Why transport and messaging was not enough?
6	
  
•  Data access exceeded resources to transport
•  Decoupling systems is about more than the connectivity
•  Message sizes ranged from B to GB
•  Not all data is created equal
•  Needed precise security controls
•  SSL and topic level authorization insufficient
 
The basic building blocks
Real-time Command and Control
The Power of Provenance
7	
  
Apache NiFi Foundational Concepts
2
3
1
HEADER	
  
-­‐  UUID	
  
-­‐  Name	
  
-­‐  Size	
  
-­‐  Entry	
  Time	
  
	
  	
  	
  	
  	
  	
  A3ributes	
  Map	
  
	
  	
  	
  	
  	
  	
  	
  [[Key	
  |	
  Value]]	
  
CONTENT	
  
Flow File
8	
  
•  Types
•  Events
•  Objects
•  Files
•  Messages
•  Media
•  Formats
•  JSON
•  Avro
•  Text
•  Mp4
•  Proprietary
•  Sizes
•  Bytes to GBs
Flow File Processor
9	
  
•  Routing
•  Context
•  Content
•  Transformation
•  Enrich
•  Obfuscate
•  Filter
•  Convert
•  Analyze
•  Split
•  Aggregate
•  Mediation
•  Push / Pull
•  …
Connections
10	
  
•  Queuing
•  Back Pressure
•  Expiration
•  Prioritize
•  Swapping
Flow Controller
11	
  
NiFi Architecture
12	
  
NiFi Clustering Model
13	
  
Tighten the feedback loop
•  Changes have consequences (good or bad)
•  And you see them as they occur
Continuous Improvement
•  Compare real-time vs. historical statistics
•  View data provenance
•  View Content at any stage
Intuitive user experience
•  Visual programming
•  Logical flow graph
14	
  
Real-time command and control2
Latency Optimization
•  Intra process
•  Inter process
•  End-to-end
Compliance
•  Prove handling
•  Assess impact
Understanding
•  Step through time
•  View content
•  View Context
15	
  
The Power of Provenance – Chain of custody for data3
16	
  
Demo
Flow File Repo – Write Ahead Log
Content Repo
Add more partitions
Input/Output Streams
Copy on Write
Pass by Reference
Allow tradeoffs of latency vs throughput
17	
  
How fast is it and why?
- User to System and System to System
-  Authentication (2-Way SSL)
-  Authorization (pluggable)
-  Authorize a specific piece of data to a specific system
-  Data provenance
-  Prove you have done the right thing
-  Recover when you have not
18	
  
How does it deal with security?
Web UI
Push API
Reporting Tasks (ganglia, graphite, etc…)
Pull API
REST API
19	
  
How can I monitor this at runtime?
Flow File Processors
Advanced UI
Flow File Prioritizer
Reporting Tasks
Controller Services
Build Clients against our REST API
20	
  
What are the points of extension?
Status and direction for NiFi
21	
  
Efficient use of each node
-  100s of MB/s per node
-  100Ks transactions/s per node
Simple / Effective scaling model
Runtime Command and Control
Data Provenance
	
  
Distributed durability of data
- Maybe Kafka backed queues
High Availability Cluster Manager
Live / Rolling Upgrades
Provenance Query Language /
Reporting
A complete user experience enabled by
provenance
Existing Strengths Roadmap Highlights
Apache NiFi (incubating) site
http://nifi.incubator.apache.org
Subscribe to and collaborate at
dev@nifi.incubator.apache.org
users@nifi.incubator.apache.org
Submit Ideas or Issues
https://issues.apache.org/jira/browse/NIFI
@apachenifi
	
  
22	
  
Learn more about Apache NiFi

Más contenido relacionado

La actualidad más candente

Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiManish Gupta
 
Insight into Hyperconverged Infrastructure
Insight into Hyperconverged Infrastructure Insight into Hyperconverged Infrastructure
Insight into Hyperconverged Infrastructure HTS Hosting
 
Database ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFiDatabase ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFiLucian Neghina
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
 
Apache NiFi: Ingesting Enterprise Data At Scale
Apache NiFi:   Ingesting Enterprise Data At Scale Apache NiFi:   Ingesting Enterprise Data At Scale
Apache NiFi: Ingesting Enterprise Data At Scale Timothy Spann
 
Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User GuideDeon Huang
 
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiBeyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiIsheeta Sanghi
 
Apache Nifi - Custom Processor
Apache Nifi - Custom Processor Apache Nifi - Custom Processor
Apache Nifi - Custom Processor thotasrinath
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Timothy Spann
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkJoe Percivall
 
Difference between apache spark and apache nifi
Difference between apache spark and apache nifiDifference between apache spark and apache nifi
Difference between apache spark and apache nifiGaneshJoshi47
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiDataWorks Summit
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiDataWorks Summit
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and FlinkBryan Bende
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiDataWorks Summit
 

La actualidad más candente (20)

Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Insight into Hyperconverged Infrastructure
Insight into Hyperconverged Infrastructure Insight into Hyperconverged Infrastructure
Insight into Hyperconverged Infrastructure
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 
Database ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFiDatabase ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFi
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
Apache NiFi: Ingesting Enterprise Data At Scale
Apache NiFi:   Ingesting Enterprise Data At Scale Apache NiFi:   Ingesting Enterprise Data At Scale
Apache NiFi: Ingesting Enterprise Data At Scale
 
Apache NiFi User Guide
Apache NiFi User GuideApache NiFi User Guide
Apache NiFi User Guide
 
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFiBeyond Messaging Enterprise Dataflow powered by Apache NiFi
Beyond Messaging Enterprise Dataflow powered by Apache NiFi
 
Apache Nifi - Custom Processor
Apache Nifi - Custom Processor Apache Nifi - Custom Processor
Apache Nifi - Custom Processor
 
Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016Apache NiFi Meetup - Princeton NJ 2016
Apache NiFi Meetup - Princeton NJ 2016
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
MiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talkMiNiFi 0.0.1 MeetUp talk
MiNiFi 0.0.1 MeetUp talk
 
Difference between apache spark and apache nifi
Difference between apache spark and apache nifiDifference between apache spark and apache nifi
Difference between apache spark and apache nifi
 
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFiIntelligently Collecting Data at the Edge – Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge – Intro to Apache MiNiFi
 
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFiThe First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
The First Mile – Edge and IoT Data Collection with Apache NiFi and MiNiFi
 
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Integrating NiFi and Flink
Integrating NiFi and FlinkIntegrating NiFi and Flink
Integrating NiFi and Flink
 
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFiIntelligently Collecting Data at the Edge - Intro to Apache MiNiFi
Intelligently Collecting Data at the Edge - Intro to Apache MiNiFi
 

Similar a Joe witt may2015_kafka_nyc_apachenifi-overview

Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxMarco Garcia
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidDataWorks Summit
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupJoseph Witt
 
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk
 
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk
 
Splunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk
 
What’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTWhat’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTSplunk
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiAldrin Piri
 
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates Uncovered
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates UncoveredRuslan Belkin And Sean Dawson on LinkedIn's Network Updates Uncovered
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates UncoveredLinkedIn
 
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...InfluxData
 
Using Perforce Data in Development at Tableau
Using Perforce Data in Development at TableauUsing Perforce Data in Development at Tableau
Using Perforce Data in Development at TableauPerforce
 
Security Delivery Platform: Best practices
Security Delivery Platform: Best practicesSecurity Delivery Platform: Best practices
Security Delivery Platform: Best practicesMihajlo Prerad
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto MeetupHortonworks
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisAlex Henthorn-Iwane
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHortonworks
 
Splunk for Security: Background & Customer Case Study
Splunk for Security: Background & Customer Case StudySplunk for Security: Background & Customer Case Study
Splunk for Security: Background & Customer Case StudyAndrew Gerber
 
GWAVACon 2015: Micro Focus - Novell File Reporter
GWAVACon 2015: Micro Focus - Novell File ReporterGWAVACon 2015: Micro Focus - Novell File Reporter
GWAVACon 2015: Micro Focus - Novell File ReporterGWAVA
 

Similar a Joe witt may2015_kafka_nyc_apachenifi-overview (20)

Integração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia CetaxIntegração de Dados com Apache NIFI - Marco Garcia Cetax
Integração de Dados com Apache NIFI - Marco Garcia Cetax
 
Lightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and DruidLightning Fast Analytics with Hive LLAP and Druid
Lightning Fast Analytics with Hive LLAP and Druid
 
Apache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming MeetupApache NiFi - Flow Based Programming Meetup
Apache NiFi - Flow Based Programming Meetup
 
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
Splunk MINT for Mobile Intelligence and Splunk App for Stream for Enhanced Op...
 
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire DataSplunk App for Stream for Enhanced Operational Intelligence from Wire Data
Splunk App for Stream for Enhanced Operational Intelligence from Wire Data
 
Splunk MINT and Stream Breakout
Splunk MINT and Stream BreakoutSplunk MINT and Stream Breakout
Splunk MINT and Stream Breakout
 
What’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINTWhat’s New: Splunk App for Stream and Splunk MINT
What’s New: Splunk App for Stream and Splunk MINT
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFi
 
Streaming Analytics
Streaming AnalyticsStreaming Analytics
Streaming Analytics
 
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates Uncovered
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates UncoveredRuslan Belkin And Sean Dawson on LinkedIn's Network Updates Uncovered
Ruslan Belkin And Sean Dawson on LinkedIn's Network Updates Uncovered
 
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
David Henthorn [Rose-Hulman Institute of Technology] | Illuminating the Dark ...
 
Using Perforce Data in Development at Tableau
Using Perforce Data in Development at TableauUsing Perforce Data in Development at Tableau
Using Perforce Data in Development at Tableau
 
Becoming a better pen tester overview
Becoming a better pen tester overviewBecoming a better pen tester overview
Becoming a better pen tester overview
 
Security Delivery Platform: Best practices
Security Delivery Platform: Best practicesSecurity Delivery Platform: Best practices
Security Delivery Platform: Best practices
 
Apache NiFi Toronto Meetup
Apache NiFi Toronto MeetupApache NiFi Toronto Meetup
Apache NiFi Toronto Meetup
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow Analysis
 
HDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical WorkshopHDF: Hortonworks DataFlow: Technical Workshop
HDF: Hortonworks DataFlow: Technical Workshop
 
Splunk for Security: Background & Customer Case Study
Splunk for Security: Background & Customer Case StudySplunk for Security: Background & Customer Case Study
Splunk for Security: Background & Customer Case Study
 
IBM Aspera overview
IBM Aspera overview IBM Aspera overview
IBM Aspera overview
 
GWAVACon 2015: Micro Focus - Novell File Reporter
GWAVACon 2015: Micro Focus - Novell File ReporterGWAVACon 2015: Micro Focus - Novell File Reporter
GWAVACon 2015: Micro Focus - Novell File Reporter
 

Último

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 

Último (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 

Joe witt may2015_kafka_nyc_apachenifi-overview

  • 1. Apache NiFi Better Analytics Demand Better Dataflow Presented by: Joe Witt Apache NiFi PPMC Member
  • 3. History 3   •  Developed at NSA for over eight years •  Donated to the Apache Software Foundation Nov 2014 •  Undergoing incubation •  Three ASF releases to date •  0.1.0 out last night!  
  • 4. The problem space: Enterprise Dataflow 4   Automate the flow of data from any source …to systems which extract meaning and insight …and to those that store and make it available for users
  • 5. The challenges we faced 5   •  Transport / Messaging was not enough •  Needed to understand the big picture •  Needed the ability to make *immediate* changes •  Must maintain chain of custody for data •  Rigorous security and compliance requirements  
  • 6. Why transport and messaging was not enough? 6   •  Data access exceeded resources to transport •  Decoupling systems is about more than the connectivity •  Message sizes ranged from B to GB •  Not all data is created equal •  Needed precise security controls •  SSL and topic level authorization insufficient
  • 7.   The basic building blocks Real-time Command and Control The Power of Provenance 7   Apache NiFi Foundational Concepts 2 3 1
  • 8. HEADER   -­‐  UUID   -­‐  Name   -­‐  Size   -­‐  Entry  Time              A3ributes  Map                [[Key  |  Value]]   CONTENT   Flow File 8   •  Types •  Events •  Objects •  Files •  Messages •  Media •  Formats •  JSON •  Avro •  Text •  Mp4 •  Proprietary •  Sizes •  Bytes to GBs
  • 9. Flow File Processor 9   •  Routing •  Context •  Content •  Transformation •  Enrich •  Obfuscate •  Filter •  Convert •  Analyze •  Split •  Aggregate •  Mediation •  Push / Pull •  …
  • 10. Connections 10   •  Queuing •  Back Pressure •  Expiration •  Prioritize •  Swapping
  • 14. Tighten the feedback loop •  Changes have consequences (good or bad) •  And you see them as they occur Continuous Improvement •  Compare real-time vs. historical statistics •  View data provenance •  View Content at any stage Intuitive user experience •  Visual programming •  Logical flow graph 14   Real-time command and control2
  • 15. Latency Optimization •  Intra process •  Inter process •  End-to-end Compliance •  Prove handling •  Assess impact Understanding •  Step through time •  View content •  View Context 15   The Power of Provenance – Chain of custody for data3
  • 17. Flow File Repo – Write Ahead Log Content Repo Add more partitions Input/Output Streams Copy on Write Pass by Reference Allow tradeoffs of latency vs throughput 17   How fast is it and why?
  • 18. - User to System and System to System -  Authentication (2-Way SSL) -  Authorization (pluggable) -  Authorize a specific piece of data to a specific system -  Data provenance -  Prove you have done the right thing -  Recover when you have not 18   How does it deal with security?
  • 19. Web UI Push API Reporting Tasks (ganglia, graphite, etc…) Pull API REST API 19   How can I monitor this at runtime?
  • 20. Flow File Processors Advanced UI Flow File Prioritizer Reporting Tasks Controller Services Build Clients against our REST API 20   What are the points of extension?
  • 21. Status and direction for NiFi 21   Efficient use of each node -  100s of MB/s per node -  100Ks transactions/s per node Simple / Effective scaling model Runtime Command and Control Data Provenance   Distributed durability of data - Maybe Kafka backed queues High Availability Cluster Manager Live / Rolling Upgrades Provenance Query Language / Reporting A complete user experience enabled by provenance Existing Strengths Roadmap Highlights
  • 22. Apache NiFi (incubating) site http://nifi.incubator.apache.org Subscribe to and collaborate at dev@nifi.incubator.apache.org users@nifi.incubator.apache.org Submit Ideas or Issues https://issues.apache.org/jira/browse/NIFI @apachenifi   22   Learn more about Apache NiFi