SlideShare una empresa de Scribd logo
1 de 18
1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Mission to NARs with
Apache NiFi
Aldrin Piri - @aldrinpiri
ApacheCon Big Data 2016
12 May 2016
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Tutorial Resources
https://github.com/apiri/nifi-mission-to-nars-workshop
3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
• Start with a dataflow… but we can do better!
• Do better with the NiFi Framework and custom processor
• Extension Points: Processors, Controller Services, Reporting Tasks
• Process Session & Process Context
• How the API ties to the NiFi repositories
• Testing isn’t that bad!
• Share with templates!
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Adding new functionality and development approach
 Extending the platform is about leveraging expansive Java ecosystem and existing code
– Make use of open source projects and provided libraries for targeted systems and services
– Reuse existing, proprietary or closed source libraries and wrap their functionality in the framework
 Test framework provides powerful means of testing extensions in isolation as they
would work in a live instance
 Deployment is as simple as copying the created NAR to your instance(s) lib directory
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Minimal Dependencies Needed
 Java Development Kit, version 1.7 or later
 Maven, version 3.1.0+
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Boilerplate Code is provided via Maven Archetype
 Support for creating bundles of major extension points of Processors and Controller
Services
– Processor Bundle
– Controller Service Bundle
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
What is a NAR?
– Bundles the developed code to provide
extensions and their dependencies
– Allows extension classloader isolation,
aiding in versioning issues that can be
pervasive in interacting with a wide variety
of systems, services, and formats
NAR == NiFi ARchive
Consider it to be an OSGi-lite package
NAR Bundle Structure
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
How long does it take to create an extension?
 Incorporating functionality from an existing library
– Create a bundle
– Include a dependency to the library
– Design User Experience
• Properties – How can this extension be configured? What are valid values for user input?
• Relationships – How will data move to the next stage of its processing?
– Wrap the core classes of the library in the framework and implement onTrigger
• ProcessSession abstracts interactions with backing repositories and handles unit-of-work sessions
• ProcessContext allows accessing defined properties which the framework has validated
– Test
– Deploy
For the majority of cases, development time is measured in hours*
9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
How long does it really take to create an extension?
 Increased development effort may be needed for handling specific protocols
– Driven through manual management of sessions, when there are resources with their own
lifecycles beyond the sole onTrigger method
– Common for protocol “Listeners”
For the majority of cases, development time is still measured in hours
10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Behind the Scenes
11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Architecture
12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture – Repositories - Pass by reference
FlowFile Content Provenance
F1 C1 C1 P1 F1
BEFORE
AFTER
F2 C1 C1 P3 F2 – Clone (F1)
F1 C1 P2 F1 – Route
P1 F1 – Create
13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
NiFi Architecture – Repositories – Copy on Write
FlowFile Content Provenance
F1 C1 C1 P1 F1 - CREATE
BEFORE
AFTER
F1 C1
F1.1 C2 C2 (encrypted)
C1 (plaintext)
P2 F1.1 - MODIFY
P1 F1 - CREATE
14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Quick (and dirty?) Prototyping
15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Prototype Dataflows Using Existing Binaries/Applications
 ExecuteProcess – Acts as a source
processor, creating FlowFiles containing
data written to STDOUT by the target
application
 ExecuteStreamCommand – Provides
content of FlowFiles to an external
application via STDIN and creates
FlowFiles containing data written STDOUT
Processors allow making external calls to applications and programs outside of the JVM
16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Increased Flexibility of Prototyping via Scripting Languages
 ExecuteScript– Acts as a source processor,
creating FlowFiles containing data from a
referenced Script
 InvokeScriptedProcessor – Provides access
to the core framework API for interacting
with NiFi like a native Java processor
Processors allow using JVM friendly interpreted languages
17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Resources
Developer Guide
– http://nifi.apache.org/developer-guide.html
Apache NiFi Maven Archetypes
– https://cwiki.apache.org/confluence/display/NIFI/Maven+Proj
ects+for+Extensions
Mission to NARs with Apache NiFi sample bundle
– https://github.com/apiri/nifi-mission-to-nars-workshop
18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Thanks for hanging out!

Más contenido relacionado

Destacado

Rethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For ScaleRethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For ScaleHelena Edelson
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBryan Bende
 
Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Evan Chan
 
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)Robert "Chip" Senkbeil
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache sparkRahul Kumar
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and CassandraNatalino Busa
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkRahul Kumar
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksLegacy Typesafe (now Lightbend)
 
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Anton Kirillov
 
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeData Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeSpark Summit
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Helena Edelson
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Helena Edelson
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaHelena Edelson
 
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationUsing Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationPatrick Di Loreto
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaHelena Edelson
 
Business Rule Engine - Jare
Business Rule Engine - JareBusiness Rule Engine - Jare
Business Rule Engine - Jareuwe geercken
 
Qonnections2015 From raw data to analysis
Qonnections2015 From raw data to analysisQonnections2015 From raw data to analysis
Qonnections2015 From raw data to analysisJohn Park
 

Destacado (19)

Rethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For ScaleRethinking Streaming Analytics For Scale
Rethinking Streaming Analytics For Scale
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 
How to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOSHow to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOS
 
Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015Akka in Production - ScalaDays 2015
Akka in Production - ScalaDays 2015
 
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
Spark Kernel Talk - Apache Spark Meetup San Francisco (July 2015)
 
Integrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data LakesIntegrating Apache Spark and NiFi for Data Lakes
Integrating Apache Spark and NiFi for Data Lakes
 
Reactive app using actor model & apache spark
Reactive app using actor model & apache sparkReactive app using actor model & apache spark
Reactive app using actor model & apache spark
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
 
Reactive dashboard’s using apache spark
Reactive dashboard’s using apache sparkReactive dashboard’s using apache spark
Reactive dashboard’s using apache spark
 
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and DatabricksFour Things to Know About Reliable Spark Streaming with Typesafe and Databricks
Four Things to Know About Reliable Spark Streaming with Typesafe and Databricks
 
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
Data processing platforms architectures with Spark, Mesos, Akka, Cassandra an...
 
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeData Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
 
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationUsing Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and Akka
 
Business Rule Engine - Jare
Business Rule Engine - JareBusiness Rule Engine - Jare
Business Rule Engine - Jare
 
Qonnections2015 From raw data to analysis
Qonnections2015 From raw data to analysisQonnections2015 From raw data to analysis
Qonnections2015 From raw data to analysis
 

Más de Aldrin Piri

Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveAldrin Piri
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiAldrin Piri
 
Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitAldrin Piri
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseAldrin Piri
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiAldrin Piri
 
Upping your NiFi Game with Docker
Upping your NiFi Game with DockerUpping your NiFi Game with Docker
Upping your NiFi Game with DockerAldrin Piri
 

Más de Aldrin Piri (6)

Future of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep DiveFuture of Data New Jersey - HDF 3.0 Deep Dive
Future of Data New Jersey - HDF 3.0 Deep Dive
 
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFiData at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
Data at Scales and the Values of Starting Small with Apache NiFi & MiNiFi
 
Apache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop SummitApache NiFi Crash Course - San Jose Hadoop Summit
Apache NiFi Crash Course - San Jose Hadoop Summit
 
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San JoseDataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
Dataflow with Apache NiFi - Apache NiFi Meetup - 2016 Hadoop Summit - San Jose
 
BigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFiBigData Techcon - Beyond Messaging with Apache NiFi
BigData Techcon - Beyond Messaging with Apache NiFi
 
Upping your NiFi Game with Docker
Upping your NiFi Game with DockerUpping your NiFi Game with Docker
Upping your NiFi Game with Docker
 

Último

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Intelisync
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 

Último (20)

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)Introduction to Decentralized Applications (dApps)
Introduction to Decentralized Applications (dApps)
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 

ApacheCon Big Data 2016: Mission to NARs with Apache NiFi

  • 1. 1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Mission to NARs with Apache NiFi Aldrin Piri - @aldrinpiri ApacheCon Big Data 2016 12 May 2016
  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Tutorial Resources https://github.com/apiri/nifi-mission-to-nars-workshop
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Agenda • Start with a dataflow… but we can do better! • Do better with the NiFi Framework and custom processor • Extension Points: Processors, Controller Services, Reporting Tasks • Process Session & Process Context • How the API ties to the NiFi repositories • Testing isn’t that bad! • Share with templates!
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Adding new functionality and development approach  Extending the platform is about leveraging expansive Java ecosystem and existing code – Make use of open source projects and provided libraries for targeted systems and services – Reuse existing, proprietary or closed source libraries and wrap their functionality in the framework  Test framework provides powerful means of testing extensions in isolation as they would work in a live instance  Deployment is as simple as copying the created NAR to your instance(s) lib directory
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Minimal Dependencies Needed  Java Development Kit, version 1.7 or later  Maven, version 3.1.0+
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Boilerplate Code is provided via Maven Archetype  Support for creating bundles of major extension points of Processors and Controller Services – Processor Bundle – Controller Service Bundle
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved What is a NAR? – Bundles the developed code to provide extensions and their dependencies – Allows extension classloader isolation, aiding in versioning issues that can be pervasive in interacting with a wide variety of systems, services, and formats NAR == NiFi ARchive Consider it to be an OSGi-lite package NAR Bundle Structure
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved How long does it take to create an extension?  Incorporating functionality from an existing library – Create a bundle – Include a dependency to the library – Design User Experience • Properties – How can this extension be configured? What are valid values for user input? • Relationships – How will data move to the next stage of its processing? – Wrap the core classes of the library in the framework and implement onTrigger • ProcessSession abstracts interactions with backing repositories and handles unit-of-work sessions • ProcessContext allows accessing defined properties which the framework has validated – Test – Deploy For the majority of cases, development time is measured in hours*
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved How long does it really take to create an extension?  Increased development effort may be needed for handling specific protocols – Driven through manual management of sessions, when there are resources with their own lifecycles beyond the sole onTrigger method – Common for protocol “Listeners” For the majority of cases, development time is still measured in hours
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Behind the Scenes
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Architecture
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture – Repositories - Pass by reference FlowFile Content Provenance F1 C1 C1 P1 F1 BEFORE AFTER F2 C1 C1 P3 F2 – Clone (F1) F1 C1 P2 F1 – Route P1 F1 – Create
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved NiFi Architecture – Repositories – Copy on Write FlowFile Content Provenance F1 C1 C1 P1 F1 - CREATE BEFORE AFTER F1 C1 F1.1 C2 C2 (encrypted) C1 (plaintext) P2 F1.1 - MODIFY P1 F1 - CREATE
  • 14. 14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Quick (and dirty?) Prototyping
  • 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Prototype Dataflows Using Existing Binaries/Applications  ExecuteProcess – Acts as a source processor, creating FlowFiles containing data written to STDOUT by the target application  ExecuteStreamCommand – Provides content of FlowFiles to an external application via STDIN and creates FlowFiles containing data written STDOUT Processors allow making external calls to applications and programs outside of the JVM
  • 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Increased Flexibility of Prototyping via Scripting Languages  ExecuteScript– Acts as a source processor, creating FlowFiles containing data from a referenced Script  InvokeScriptedProcessor – Provides access to the core framework API for interacting with NiFi like a native Java processor Processors allow using JVM friendly interpreted languages
  • 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Resources Developer Guide – http://nifi.apache.org/developer-guide.html Apache NiFi Maven Archetypes – https://cwiki.apache.org/confluence/display/NIFI/Maven+Proj ects+for+Extensions Mission to NARs with Apache NiFi sample bundle – https://github.com/apiri/nifi-mission-to-nars-workshop
  • 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Thanks for hanging out!