Enviar búsqueda
Cargar
JAZOON'13 - Kai Waehner - Hadoop Integration
•
0 recomendaciones
•
550 vistas
J
jazoon13
Seguir
Tecnología
Denunciar
Compartir
Denunciar
Compartir
1 de 78
Descargar ahora
Descargar para leer sin conexión
Recomendados
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
Best Practices for Streaming IoT Data with MQTT and Apache Kafka
Kai Wähner
WJAX 2013 Slides online: Big Data beyond Apache Hadoop - How to integrate ALL...
WJAX 2013 Slides online: Big Data beyond Apache Hadoop - How to integrate ALL...
Kai Wähner
Apache Kafka for Automotive Industry, Mobility Services & Smart City
Apache Kafka for Automotive Industry, Mobility Services & Smart City
Kai Wähner
Apache Kafka for Real-time Supply Chainin the Food and Retail Industry
Apache Kafka for Real-time Supply Chainin the Food and Retail Industry
Kai Wähner
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
IoT Architectures for Apache Kafka and Event Streaming - Industry 4.0, Digita...
Kai Wähner
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Kai Wähner
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...
Kai Wähner
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Real-Life Use Cases & Architectures for Event Streaming with Apache Kafka
Kai Wähner
Más contenido relacionado
La actualidad más candente
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
Kai Wähner
DataOps on Streaming Data: From Kafka to InfluxDB via Kubernetes Native Flows...
DataOps on Streaming Data: From Kafka to InfluxDB via Kubernetes Native Flows...
InfluxData
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Kai Wähner
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
Kai Wähner
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
Kai Wähner
Kafka and Machine Learning in Banking and Insurance Industry
Kafka and Machine Learning in Banking and Insurance Industry
Kai Wähner
Industry-ready NLP Service Framework Based on Kafka (Bernhard Waltl and Georg...
Industry-ready NLP Service Framework Based on Kafka (Bernhard Waltl and Georg...
confluent
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
TheInevitableCloud
The Streaming Assessment – An Introduction
The Streaming Assessment – An Introduction
confluent
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
Kai Wähner
Machine Learning with Apache Kafka in Pharma and Life Sciences
Machine Learning with Apache Kafka in Pharma and Life Sciences
Kai Wähner
Kafka Summit SF 2017 - Real time Streaming Platform
Kafka Summit SF 2017 - Real time Streaming Platform
confluent
[INFOGRAPHIC] Event-driven Business: How to Handle the Flow of Event Data
[INFOGRAPHIC] Event-driven Business: How to Handle the Flow of Event Data
confluent
Hybrid & Global Kafka Architecture
Hybrid & Global Kafka Architecture
confluent
Apache Kafka and MQTT - Overview, Comparison, Use Cases, Architectures
Apache Kafka and MQTT - Overview, Comparison, Use Cases, Architectures
Kai Wähner
Apache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and Architectures
Kai Wähner
Event-Streaming verstehen in unter 10 Min
Event-Streaming verstehen in unter 10 Min
confluent
Event Mesh Presentation at Gartner AADI Mumbai
Event Mesh Presentation at Gartner AADI Mumbai
Solace
Financial Event Sourcing at Enterprise Scale
Financial Event Sourcing at Enterprise Scale
confluent
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
confluent
La actualidad más candente
(20)
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
DataOps on Streaming Data: From Kafka to InfluxDB via Kubernetes Native Flows...
DataOps on Streaming Data: From Kafka to InfluxDB via Kubernetes Native Flows...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Apache Kafka in the Public Sector (Government, National Security, Citizen Ser...
Kappa vs Lambda Architectures and Technology Comparison
Kappa vs Lambda Architectures and Technology Comparison
Apache Kafka in the Healthcare Industry
Apache Kafka in the Healthcare Industry
Kafka and Machine Learning in Banking and Insurance Industry
Kafka and Machine Learning in Banking and Insurance Industry
Industry-ready NLP Service Framework Based on Kafka (Bernhard Waltl and Georg...
Industry-ready NLP Service Framework Based on Kafka (Bernhard Waltl and Georg...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
The Streaming Assessment – An Introduction
The Streaming Assessment – An Introduction
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
Machine Learning with Apache Kafka in Pharma and Life Sciences
Machine Learning with Apache Kafka in Pharma and Life Sciences
Kafka Summit SF 2017 - Real time Streaming Platform
Kafka Summit SF 2017 - Real time Streaming Platform
[INFOGRAPHIC] Event-driven Business: How to Handle the Flow of Event Data
[INFOGRAPHIC] Event-driven Business: How to Handle the Flow of Event Data
Hybrid & Global Kafka Architecture
Hybrid & Global Kafka Architecture
Apache Kafka and MQTT - Overview, Comparison, Use Cases, Architectures
Apache Kafka and MQTT - Overview, Comparison, Use Cases, Architectures
Apache Kafka in Financial Services - Use Cases and Architectures
Apache Kafka in Financial Services - Use Cases and Architectures
Event-Streaming verstehen in unter 10 Min
Event-Streaming verstehen in unter 10 Min
Event Mesh Presentation at Gartner AADI Mumbai
Event Mesh Presentation at Gartner AADI Mumbai
Financial Event Sourcing at Enterprise Scale
Financial Event Sourcing at Enterprise Scale
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Driving Business Transformation with Real-Time Analytics Using Apache Kafka a...
Destacado
TALEND ETL Introducción
TALEND ETL Introducción
Software
Talend
Talend
templedf
Simplifying Big Data ETL with Talend
Simplifying Big Data ETL with Talend
Edureka!
Présentation Talend Open Studio
Présentation Talend Open Studio
horacio lassey
Open Source ETL using Talend Open Studio
Open Source ETL using Talend Open Studio
santosluis87
Talend, Leading Open Source DataIntegration plateform. Cedric Carbone
Talend, Leading Open Source DataIntegration plateform. Cedric Carbone
Cedric CARBONE
Structured Approach to Solution Architecture
Structured Approach to Solution Architecture
Alan McSweeney
Destacado
(7)
TALEND ETL Introducción
TALEND ETL Introducción
Talend
Talend
Simplifying Big Data ETL with Talend
Simplifying Big Data ETL with Talend
Présentation Talend Open Studio
Présentation Talend Open Studio
Open Source ETL using Talend Open Studio
Open Source ETL using Talend Open Studio
Talend, Leading Open Source DataIntegration plateform. Cedric Carbone
Talend, Leading Open Source DataIntegration plateform. Cedric Carbone
Structured Approach to Solution Architecture
Structured Approach to Solution Architecture
Similar a JAZOON'13 - Kai Waehner - Hadoop Integration
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
Kai Wähner
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Kai Wähner
Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869
Edgar Alejandro Villegas
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Kai Wähner
Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?
Edureka!
Manipulating Data with Talend.
Manipulating Data with Talend.
Edureka!
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
DataWorks Summit
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
Hortonworks
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.ppt
almaraniabwmalk
Big data introduction, Hadoop in details
Big data introduction, Hadoop in details
Mahmoud Yassin
John Glendenning - Real time data driven services in the Cloud
John Glendenning - Real time data driven services in the Cloud
WeAreEsynergy
Big data4businessusers
Big data4businessusers
Bob Hardaway
Cloud as a Data Platform
Cloud as a Data Platform
Andrei Savu
Big Data
Big Data
Mehmet Burak Akgün
Exploring the Wider World of Big Data
Exploring the Wider World of Big Data
NetApp
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
Hortonworks
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
Jane Roberts
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
Skillspeed
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
Kai Wähner
Big data an elephant business opportunities
Big data an elephant business opportunities
Bigdata Meetup Kochi
Similar a JAZOON'13 - Kai Waehner - Hadoop Integration
(20)
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data beyond Apache Hadoop - How to integrate ALL your Data
Big Data and Enterprise Data - Oracle -1663869
Big Data and Enterprise Data - Oracle -1663869
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Next-Generation BPM - How to create intelligent Business Processes thanks to ...
Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?
Manipulating Data with Talend.
Manipulating Data with Talend.
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
Lecture 5 - Big Data and Hadoop Intro.ppt
Lecture 5 - Big Data and Hadoop Intro.ppt
Big data introduction, Hadoop in details
Big data introduction, Hadoop in details
John Glendenning - Real time data driven services in the Cloud
John Glendenning - Real time data driven services in the Cloud
Big data4businessusers
Big data4businessusers
Cloud as a Data Platform
Cloud as a Data Platform
Big Data
Big Data
Exploring the Wider World of Big Data
Exploring the Wider World of Big Data
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
BIG Data & Hadoop Applications in Social Media
BIG Data & Hadoop Applications in Social Media
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
Big data an elephant business opportunities
Big data an elephant business opportunities
Más de jazoon13
JAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The World
jazoon13
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
jazoon13
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
jazoon13
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
jazoon13
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
jazoon13
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
jazoon13
JAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled Java
jazoon13
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
jazoon13
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
jazoon13
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
jazoon13
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
jazoon13
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
jazoon13
JAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronized
jazoon13
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
jazoon13
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
jazoon13
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
jazoon13
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
jazoon13
JAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git Workflows
jazoon13
Más de jazoon13
(18)
JAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git Workflows
Último
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
D Cloud Solutions
201610817 - edge part1
201610817 - edge part1
Jamie (Taka) Wang
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
Jamie (Taka) Wang
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UbiTrack UK
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
DianaGray10
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
DianaGray10
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
DianaGray10
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
Udaiappa Ramachandran
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IES VE
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
David Newbury
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
Matt Ray
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Will Schroeder
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Aijun Zhang
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
UiPathCommunity
Designing A Time bound resource download URL
Designing A Time bound resource download URL
Runcy Oommen
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
Aggregage
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
DianaGray10
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
Asko Soukka
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
Liveplex
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
Adtran
Último
(20)
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
201610817 - edge part1
201610817 - edge part1
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Studio Web workshop series - Day 8
UiPath Studio Web workshop series - Day 8
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
IESVE Software for Florida Code Compliance Using ASHRAE 90.1-2019
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
Designing A Time bound resource download URL
Designing A Time bound resource download URL
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
ADOPTING WEB 3 FOR YOUR BUSINESS: A STEP-BY-STEP GUIDE
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
JAZOON'13 - Kai Waehner - Hadoop Integration
1.
Big Data beyond
Hadoop How to integrate ALL your Data Kai Wähner Principal Consultant at Talend @KaiWaehner www.kai-waehner.de
2.
Kai Wähner Main Tasks Requirements
Engineering Enterprise Architecture Management Business Process Management Architecture and Development of Applications Service-oriented Architecture Integration of Legacy Applications Cloud Computing Big Data Consulting Developing Coaching Speaking Writing © Talend 2013 Contact Email: kontakt@kai-waehner.de Blog: www.kai-waehner.de/blog Twitter: @KaiWaehner Social Networks: Xing, LinkedIn “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
3.
Key messages You have
to care about big data to be competitive in the future! You have to integrate different sources to get most value out of it! Big data integration is no (longer) rocket science! © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
4.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
5.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
6.
Why should you
care about big data? “If you can't measure it, you can't manage it.” William Edwards Deming (1900 –1993) American statistician, professor, author, lecturer and consultant © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
7.
Why should you
care about big data? „Silence the HiPPOs“ (highest-paid person‘s opinion) Being able to interpret unimaginable large data stream, the gut feeling is no longer justified! © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
8.
What is big
data? The Vs of big data Volume (terabytes, petabytes) Velocity (realtime or nearrealtime) Variety (social networks, blog posts, logs, sensors, etc.) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Value
9.
Big data tasks
to solve - before analysis Big Data Integration – Land data in a Big Data cluster – Implement or generate parallel processes Big Data Manipulation – Simplify manipulation, such as sort and filter – Computational expensive functions Big Data Quality & Governance – Identify linkages and duplicates, validate big data – Match connector, execute basic quality features Big Data Project Management – Place frameworks around big data projects – Common Repository, scheduling, monitoring © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
10.
Use case: Clickstream
Analysis “The advantage of their new system is that they can now look at their data [from their log processing system] in anyway they want: ➜ Nightly MapReduce jobs collect statistics about their mail system such as spam counts by domain, bytes transferred and number of logins. ➜ When they wanted to find out which part of the world their customers logged in from, a quick [ad hoc] MapReduce job was created and they had the answer within a few hours. Not really possible in your typical ETL system.” http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
11.
Use case: Risk
management Deduce Customer Defections http://hkotadia.com/archives/5021 © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
12.
Use case: Flexible
pricing ➜ ➜ ➜ ➜ With revenue of almost USD 30 billion and a network of 800 locations, Macy's is considered the largest store operator in the USA Daily price check analysis of its 10,000 articles in less than two hours Whenever a neighboring competitor anywhere between New York and Los Angeles goes for aggressive price reductions, Macy's follows its example If there is no market competitor, the prices remain unchanged http://www.t-systems.com/about-t-systems/examples-of-successes-companies-analyze-big-data-in-record-time-l-t-systems/1029702 © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
13.
Storage: Compliance Global Parcel
Service A lot of data must be stored „forever“ ➜ Numbers increase exponentially ➜ Goal: As cheap as possible ➜ Problem: (Fast) queries must still be possible ➜ Solution: Commodity servers and „Hadoop querying“ ➜ http://archive.org/stream/BigDataImPraxiseinsatz-SzenarienBeispieleEffekte/Big_Data_BITKOM-Leitfaden_Sept.2012#page/n0/mode/2up © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
14.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
15.
Limited big data
experts This is your company Big Data Geek © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
16.
Data quality Big Data
+ Poor Data Quality = Big Problems © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
17.
Big data tool
selection (business perspective) Wanna buy a big data solution for your industry? ➜ Maybe a competitor has a big data solution which adds business value? ➜ The competitor will never publish it (rat-race)! ➜ © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
18.
Big data tool
selection (technical perspective) Looking for ‚your‘ required big data product? Support your data from scratch? Good luck! © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
19.
How to solve
these big data challenges? © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
20.
What is your
Big Data process? Step 1 Step 2 Step 3 1) Do not begin with the data, think about business opportunities 2) Choose the right data (combine different data sources) 3) Use easy tooling http://hbr.org/2012/10/making-advanced-analytics-work-for-you © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
21.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
22.
Technology perspective How to
process big data? © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
23.
How to process
big data? The critical flaw in parallel ETL tools is the fact that the data is almost never local to the processing nodes. This means that every time a large job is run, the data has to first be read from the source, split N ways and then delivered to the individual nodes. Worse, if the partition key of the source doesn’t match the partition key of the target, data has to be constantly exchanged among the nodes. In essence, parallel ETL treats the network as if it were a physical I/O subsystem. The network, which is always the slowest part of the process, becomes the weakest link in the performance chain. http://blog.syncsort.com/2012/08/parallel-etl-tools-are-dead © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
24.
How to process
big data? Slides: http://www.slideshare.net/pavlobaron/100-big-data-0-hadoop-0-java Video: http://www.infoq.com/presentations/Big-Data-Hadoop-Java © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
25.
How to process
big data? The defacto standard for big data processing © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
26.
How to process
big data? “A big part of [the company’s strategy] includes wiring SQL Server 2012 (formerly known by the codename “Denali”) to the Hadoop distributed computing platform, and bringing Hadoop to Windows Server and Azure” Even Microsoft (the .NET house) relies on Hadoop since 2011 © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
27.
What is Hadoop? Apache
Hadoop, an open-source software library, is a framework that allows for the distributed processing of large data sets across clusters of commodity hardware using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
28.
Map (Shuffle) Reduce Simple
example • Input: (very large) text files with lists of strings, such as: „318, 0043012650999991949032412004...0500001N9+01111+99999999999...“ • We are interested just in some content: year and temperate (marked in red) • The Map Reduce function has to compute the maximum temperature for every year Example from the book “Hadoop: The Definitive Guide, 3rd Edition” © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
29.
How to process
big data? © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
30.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
31.
Alternatives for systems
integration Integration Suite Enterprise Service Bus Integration Framework Low Connectivity Routing Transformation © Talend 2013 High + INTEGRATION Tooling Monitoring Support + BUSINESS PROCESS MGT. BIG DATA / MDM REGISTRY / REPOSITORY RULES ENGINE „YOU NAME IT“ “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Complexity of Integration
32.
Alternatives for systems
integration Integration Framework Enterprise Service Bus Low © Talend 2013 Integration Suite High “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Complexity of Integration
33.
More details about
integration frameworks... http://www.kai-waehner.de/blog/2012/12/20/showdown-integration-frameworkspring-integration-apache-camel-vs-enterprise-service-bus-esb/ © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
34.
Enterprise Integration Patterns
(EIP) Apache Camel Implements the EIPs © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
35.
Enterprise Integration Patterns
(EIP) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
36.
Enterprise Integration Patterns
(EIP) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
37.
Architecture http://java.dzone.com/articles/apache-camel-integration © Talend 2013 “Big
Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
38.
Choose your required
connectors TCP SQL SMTP Netty RMI FTP Lucene JMX MQ Atom LDAP Quartz XSLT Akka Many many more © Talend 2013 IRC Log HTTP File EJB AMQP RSS AWS-S3 Jetty JDBC Bean-Validation JMS CXF “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Custom Components
39.
Choose your favorite
DSL XML (not production-ready yet) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
40.
Deploy it wherever
you need Standalone Application Server Web Container Spring Container OSGi Cloud © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
41.
Enterprise-ready • Open Source •
Scalability • Error Handling • Transaction • Monitoring • Tooling • Commercial Support © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
42.
Example: Camel integration
route © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
43.
Hadoop Integration with
Apache Camel © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
44.
camel-hdfs connector // Producer from("ftp://user@myServer?password=secret") .to(“hdfs:///myDirectory/myFile.txt?append=true"); //
Consumer from(“hdfs:///myDirectory/myBigDataAnalysis.csv") .to(“file:target/reports/report.csv"); © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
45.
camel-hbase connector © Talend
2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
46.
Real World Use
Case: Storage: Compliance Global Parcel Service A lot of data must be stored „forever“ ➜ Numbers increase exponentially ➜ Goal: As cheap as possible ➜ Problem: (Fast) queries must still be possible ➜ Solution: Commodity servers and „Hadoop querying“ ➜ http://archive.org/stream/BigDataImPraxiseinsatz-SzenarienBeispieleEffekte/Big_Data_BITKOM-Leitfaden_Sept.2012#page/n0/mode/2up © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
47.
Real world use
case: Storage: Compliance Orders (Server 1) Payments (Server 2) ETL Log Files (Server 3) Log Files (Server 100) © Talend 2013 Storage “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Query
48.
Live demo Apache Camel
in action... © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
49.
camel-pig? camel-hive? camel-hcatalog? Not
available yet (current Camel version: 2.12) Workarounds: • • • Use Pig / Hive-Query scripts (via camel-exec connector or any scripting language) Build your own connector (more details later ...) Use Hive-Hbase-Integration and store data in HBase ( „ugly“) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
50.
camel-avro connector Camel is
not just about connectors ... Camel supports a pluggable DataFormat to allow messages to be marshalled to and unmarshalled from binary or text formats, e.g. CSV, JSON, SOAP, EDI, ZIP, or ... ... Avro, a data serialization system used in Apache Hadoop. camel-avro connector provides a dataformat for Avro, which allows serialization and deserialization of messages using Apache Avro's binary data format. Moreover, it provides support for Apache Avro's RPC, by providing producers and consumers endpoint for using Avro over Netty or HTTP. © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
51.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
52.
Alternatives for systems
integration Integration Suite Enterprise Service Bus Integration Framework Low Connectivity Routing Transformation © Talend 2013 High + INTEGRATION Tooling Monitoring Support + BUSINESS PROCESS MGT. BIG DATA / MDM REGISTRY / REPOSITORY RULES ENGINE „YOU NAME IT“ “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Complexity of Integration
53.
Alternatives for systems
integration Integration Framework Enterprise Service Bus Low © Talend 2013 Integration Suite High “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner Complexity of Integration
54.
More details about
ESBs and suites... http://www.kai-waehner.de/blog/2013/01/23/spoilt-for-choicehow-to-choose-the-right-enterprise-service-bus-esb/ © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
55.
Hadoop Integration with
Talend Open Studio © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
56.
Vision: Democratize big
data Talend Open Studio for Big Data • • …an open source ecosystem © Talend 2013 Generates Hadoop code and run transforms inside Hadoop • Native support for HDFS, Pig, Hbase, Hcatalog, Sqoop and Hive • Pig Improves efficiency of big data job design with graphic interface 100% open source under an Apache License • Standards based “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
57.
Vision: Democratize big
data Talend Platform for Big Data • Builds on Talend Open Studio for Big Data • Adds data quality, advanced scalability and management functions • • © Talend 2013 Data cleansing • • Data quality and profiling • …an open source ecosystem Shared Repository and remote deployment • Pig MapReduce massively parallel data processing Reporting and dashboards Commercial support, warranty/IP indemnity under a subscription license “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
58.
Talend Open Studio
for Big Data © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
59.
Real world Use
case: Clickstream Analysis “The advantage of their new system is that they can now look at their data [from their log processing system] in anyway they want: ➜ Nightly MapReduce jobs collect statistics about their mail system such as spam counts by domain, bytes transferred and number of logins. ➜ When they wanted to find out which part of the world their customers logged in from, a quick [ad hoc] MapReduce job was created and they had the answer within a few hours. Not really possible in your typical ETL system.” http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
60.
Real world use
case: Clickstream Analysis Log Files (Server 1) ETL Log Files (Server 2) Log Files (Server 100) © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
61.
Potential Uses of
Clickstream Data One of the original uses of Hadoop at Yahoo was to store and process their massive volume of clickstream data. Now enterprises of all types can use Hadoop to refine and analyze clickstream data. They can then answer business questions such as: • • • What is the most efficient path for a site visitor to research a product, and then buy it? What products do visitors tend to buy together, and what are they most likely to buy in the future? Where should I spend resources on fixing or enhancing the user experience on my website? Goal: Data visualization can help you optimize your website and convert more visits into sales and revenue. Source: for Clickstream Example: „Hortonworks Hadoop Tutorials - Real Life Use Cases” http://hortonworks.com/blog/hadoop-tutorials-real-life-use-cases-in-the-sandbox © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
62.
Example: A semi-structured
log file © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
63.
Example: ETL Job „...
using Talend’s HDFS and Hive Components” © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
64.
Example: ETL Job „...
using Talend’s Map Reduce Components*” * Not available in open source version of Talend Studio © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
65.
Live demo „Talend Open
Studio for Big Data“ in action... © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
66.
Example: Analysis with
Microsoft Excel We can see that the largest number of page hits in Florida were for clothing, followed by shoes. Source: for Clickstream Example: „Hortonworks Hadoop Tutorials - Real Life Use Cases” http://hortonworks.com/blog/hadoop-tutorials-real-life-use-cases-in-the-sandbox © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
67.
Example: Analysis with
Microsoft Excel The chart shows that the majority of men shopping for clothing on our website are between the ages of 22 and 30. With this information, we can optimize our content for this market segment. Source: for Clickstream Example: „Hortonworks Hadoop Tutorials - Real Life Use Cases” http://hortonworks.com/blog/hadoop-tutorials-real-life-use-cases-in-the-sandbox © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
68.
Example: Analysis with
Tableau Spoilt for Choice Use your preferred BI or Analysis tool! © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
69.
Agenda • Big data
paradigm shift • Challenges for integrating big data • Choosing Hadoop for big data integration • Integration with an open source framework • Integration with an open source suite • Custom big data connectors © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
70.
Custom connectors Easy to
realize for all integration alternatives * • Integration Framework • Enterprise Service Bus • Integration Suite * At least for open source solutions © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
71.
Custom connectors You might
need a ... • ... Hive connector for Camel • ... Impala connector for Talend • ... custom connector for your internal data format © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
72.
Live demo (Example:
Apache Camel) Custom connectors in action... © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
73.
Alternative for custom
connectors • SOAP • REST © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
74.
Code example: REST
API for Salesforce object store // Salesforce Query (SOQL) via REST API from("direct:salesforceViaHttpLIST") .setHeader("X-PrettyPrint", 1) .setHeader("Authorization", accessToken) .setHeader(Exchange.CONTENT_TYPE, "application/json") .to("https://na14.salesforce.com/services/data/v20.0/query?q=SELECT+name+from +Article__c") // Salesforce CREATE via REST API from("direct:salesforceViaHttpCREATE") .setHeader("X-PrettyPrint", 1) .setHeader("Authorization", accessToken) .setHeader(Exchange.CONTENT_TYPE, "application/json“) .to("https://na14.salesforce.com/services/data/v20.0/sobjects/Article__c") © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
75.
Did you get
the key message? © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
76.
Key messages You have
to care about big data to be competitive in the future! You have to integrate different sources to get most value out of it! Big data integration is no (longer) rocket science! © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
77.
Did you get
the key message? © Talend 2013 “Big Data beyond Hadoop – How to integrate ALL your Data” by Kai Wähner
78.
Thanks for the
attention! Follow @KaiWaehner Contact Kai Wähner at Xing / LinkedIn Download Talend‘s open source software Mail kwaehner@talend.com Visit www.kai-waehner.de
Descargar ahora