2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0
Adam Muise
Our Hadoop 2.2.0 Overview for the Toronto Hadoop User Group. Go THUG life.
1.
Hadoop 2.2.0: Hadoop grows up
Adam Muise
© Hortonworks Inc. 2013. Confidential and Proprietary.
2.
Rob Ford says… turn off your #*@!#%!!! mobile phones!
3.
YARN: Yet Another Resource Negotiator
4.
A new abstraction layer
• HADOOP 1.0: Single Use System (Batch Apps)
– MapReduce (cluster resource management & data processing)
– HDFS (redundant, reliable storage)
• HADOOP 2.0: Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
– MapReduce (data processing) and Others (data processing)
– YARN (cluster resource management)
– HDFS2 (redundant, reliable storage)
5.
Concepts
• Application
– An application is a job submitted to the framework
– Example: a MapReduce job
• Container
– Basic unit of allocation
– Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU etc.)
– container_0 = 2GB, 1 CPU
– container_1 = 1GB, 6 CPU
– Replaces the fixed map/reduce slots
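To make the container idea concrete, here is a minimal sketch of containers as resource vectors granted against a node's free capacity. It models only memory and CPU, the `Resources` class and `allocate` function are hypothetical names, and the 8GB / 8-core node is an assumed size chosen for illustration; none of this is YARN's actual API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Resources:
    """A resource vector; only memory and CPU are modelled in this sketch."""
    memory_gb: int
    vcores: int

    def fits_in(self, other: "Resources") -> bool:
        return self.memory_gb <= other.memory_gb and self.vcores <= other.vcores

    def minus(self, other: "Resources") -> "Resources":
        return Resources(self.memory_gb - other.memory_gb, self.vcores - other.vcores)


def allocate(node_free: Resources, requests: List[Resources]) -> Tuple[List[Resources], Resources]:
    """Grant requests in order while they fit; return (granted, remaining capacity)."""
    granted = []
    for req in requests:
        if req.fits_in(node_free):
            granted.append(req)
            node_free = node_free.minus(req)
    return granted, node_free


# The two container shapes named on the slide.
container_0 = Resources(memory_gb=2, vcores=1)
container_1 = Resources(memory_gb=1, vcores=6)

# Hypothetical node size (an assumption, not from the slide).
node = Resources(memory_gb=8, vcores=8)
granted, remaining = allocate(node, [container_0, container_1, container_1])
print(granted, remaining)
```

Unlike fixed map/reduce slots, the second `container_1` request is refused here because of CPU, not because a slot type ran out, while the leftover memory stays available to any request small enough to use it.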
6.
YARN Architecture
• Resource Manager
– Global resource scheduler
– Hierarchical queues
• Node Manager
– Per-machine agent
– Manages the life-cycle of containers
– Container resource monitoring
• Application Master
– Per-application
– Manages application scheduling and task execution
– E.g. MapReduce Application Master
7.
YARN Architecture - Walkthrough
[Diagram: a ResourceManager (Scheduler), Client2, and a grid of NodeManagers hosting AM 1 with Containers 1.1 to 1.3 and AM2 with Containers 2.1 to 2.4.]
8.
YARN as OS for Data Lake
[Diagram: a ResourceManager (Scheduler) and a grid of NodeManagers running three kinds of workload side by side: Batch (map 1.1, map1.2, reduce1.1), Interactive SQL (vertex1.1.1, vertex1.1.2, vertex1.2.1, vertex1.2.2) and Real-Time (nimbus0, nimbus1, nimbus2).]
9.
Multi-Tenant YARN
[Diagram: ResourceManager Scheduler with a hierarchical queue tree. Labels as extracted: root; Mrkting 30%; Dev 20%; Adhoc 10%; Prod 80%; DW 60%; Dev, Reserved, Prod at 10%, 20%, 70%; P0 70%; P1 30%.]
10.
Multi-Tenancy with New Capacity Scheduler
• Queues
• Economics as queue-capacity
– Hierarchical Queues
• SLAs
– Preemption
• Resource Isolation
– Linux: cgroups
– MS Windows: Job Control
– Roadmap: Virtualization (Xen, KVM)
• Administration
– Queue ACLs
– Run-time re-configuration for queues
– Charge-back
[Diagram: ResourceManager with a Capacity Scheduler and hierarchical queues. Labels as extracted: root; Mrkting 20%; Dev 20%; Adhoc 10%; Prod 80%; DW 70%; Dev, Reserved, Prod at 10%, 20%, 70%; P0 70%; P1 30%.]
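Hierarchical queue capacities compose multiplicatively: a child queue's share of the cluster is its percentage of its parent's share. The sketch below works that arithmetic through for the percentages on the slide. The parent/child grouping in `QUEUES` is an assumed reading of the flattened diagram (in particular, which Prod queue owns P0 and P1 is a guess), and `absolute_capacities` is a hypothetical helper, not part of the Capacity Scheduler.

```python
from fractions import Fraction

# Queue tree as name -> (percent of parent, children).
# ASSUMPTION: this grouping is one plausible reading of the slide's diagram.
QUEUES = {
    "Mrkting": (20, {"Dev": (20, {}), "Prod": (80, {})}),
    "Adhoc": (10, {}),
    "DW": (70, {
        "Dev": (10, {}),
        "Reserved": (20, {}),
        "Prod": (70, {"P0": (70, {}), "P1": (30, {})}),
    }),
}


def absolute_capacities(tree, parent_share=Fraction(1), prefix="root"):
    """Return {queue path: share of the whole cluster} for every leaf queue."""
    shares = {}
    for name, (percent, children) in tree.items():
        path = f"{prefix}.{name}"
        share = parent_share * Fraction(percent, 100)
        if children:
            shares.update(absolute_capacities(children, share, path))
        else:
            shares[path] = share
    return shares


capacities = absolute_capacities(QUEUES)
for path, share in sorted(capacities.items()):
    print(f"{path}: {float(share):.1%}")
```

Under this reading, root.DW.Prod.P0 is guaranteed 70% of 70% of 70% of the cluster, and the leaf shares sum to the whole cluster.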
11.
MapReduce v2: Changes to MapReduce on YARN
12.
MapReduce V2 is a library now…
• MapReduce runs on YARN like all other Hadoop 2.x applications
– Gone are the map and reduce slots; that’s up to containers in YARN now
– Gone is the JobTracker, replaced by the YARN AppMaster library
• Multiple versions of MapReduce
– The older mapred APIs work without modification or recompilation
– The newer mapreduce APIs may need to be recompiled
• Still has one master server component: the Job History Server
– The Job History Server stores the execution history of jobs
– Used to audit prior execution of jobs
– Will also be used by the YARN framework to store charge-backs at that level
13.
Shuffle in MapReduce v2
• Faster Shuffle
– Better embedded server: Netty
• Encrypted Shuffle
– Secure the shuffle phase as data moves across the cluster
– Requires 2-way HTTPS, certificates on both sides
– Incurs significant CPU overhead, reserve 1 core for this work
– Certs stored on each node (provision with the cluster), refreshed every 10 seconds
• Pluggable Shuffle Sort
– Shuffle is the first phase in MapReduce that is guaranteed to not be data-local
– Pluggable Shuffle/Sort allows for intrepid application developers or hardware developers to intercept the network-heavy workload and optimize it
– Typical implementations have hardware components like fast networks and software components like sorting algorithms
– API will change with future versions of Hadoop
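As a rough illustration of what "pluggable" means for shuffle and sort, the toy below routes map output to reducers and sorts each reducer's input, with the partitioner and sort key passed in as replaceable functions. All names here (`shuffle`, `hash_partitioner`) are hypothetical; this is an in-memory model and does not reflect Hadoop's actual plugin interfaces.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, int]


def hash_partitioner(key: str, num_reducers: int) -> int:
    """A stable toy hash: sum of the key's bytes modulo the reducer count."""
    return sum(key.encode("utf-8")) % num_reducers


def shuffle(
    map_outputs: Iterable[List[Pair]],
    num_reducers: int,
    partitioner: Callable[[str, int], int] = hash_partitioner,
    sort_key: Callable[[Pair], object] = lambda pair: pair[0],
) -> Dict[int, List[Pair]]:
    """Route every mapper's pairs to a reducer, then sort each reducer's input."""
    reducer_inputs: Dict[int, List[Pair]] = defaultdict(list)
    for output in map_outputs:
        for key, value in output:
            reducer_inputs[partitioner(key, num_reducers)].append((key, value))
    return {r: sorted(pairs, key=sort_key) for r, pairs in reducer_inputs.items()}


mapper_a = [("b", 1), ("a", 1)]
mapper_b = [("a", 1), ("c", 1)]

# Default behaviour: keys are spread over two reducers.
by_reducer = shuffle([mapper_a, mapper_b], num_reducers=2)

# Plugging in a different partitioner sends everything to reducer 0.
single = shuffle([mapper_a, mapper_b], num_reducers=2, partitioner=lambda key, n: 0)
print(by_reducer, single)
```

The point of the design is that the data movement step is the only part that changes between the two calls; the mappers and whatever consumes the reducer inputs are untouched.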
14.
Efficiency Gains of MRv2
• Key Optimizations
– No hard segmentation of resources into map and reduce slots
– The YARN scheduler is more efficient
– The MRv2 framework has become more efficient than MRv1; the shuffle phase in MRv2 performs better thanks to Netty
• Yahoo has over 30,000 nodes running YARN across over 365PB of data.
• They calculate that they run about 400,000 jobs per day, for about 10 million hours of compute time.
• They have also estimated a 60% – 150% improvement in node usage per day.
• Yahoo got rid of a whole colo (10,000 node datacenter) because of their increased utilization.
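A quick back-of-envelope check shows what the Yahoo figures imply per job and per node. It assumes the 10 million compute hours are a daily total summed over all running tasks, and it rounds "over 30,000 nodes" down to 30,000; both are readings of the slide rather than stated facts.

```python
# Figures quoted on the slide (rounded; treated as daily totals by assumption).
nodes = 30_000
jobs_per_day = 400_000
compute_hours_per_day = 10_000_000

# Average compute consumed by one job.
hours_per_job = compute_hours_per_day / jobs_per_day

# Compute delivered by one node in one day, and the average number of
# tasks that must be running concurrently on a node to deliver it.
hours_per_node_per_day = compute_hours_per_day / nodes
avg_busy_tasks_per_node = hours_per_node_per_day / 24

print(f"{hours_per_job:.0f} compute hours per job")
print(f"{hours_per_node_per_day:.0f} compute hours per node per day")
print(f"{avg_busy_tasks_per_node:.1f} tasks busy per node on average")
```

Under those assumptions each job averages 25 compute hours, and each node keeps roughly 14 tasks busy around the clock.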
15.
HDFS v2 in a Nutshell
16.
HA
17.
HDFS Snapshots: Feature Overview
• Admin can create point-in-time snapshots of HDFS
– Of the entire file system (/root)
– Of a specific data-set (sub-tree directory of file system)
• Restore state of entire file system or data-set to a snapshot (like Apple Time Machine)
– Protect against user errors
• Snapshot diffs identify changes made to data set
– Keep track of how raw or derived/analytical data changes over time
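The snapshot, diff and restore cycle can be sketched with a toy in-memory model. Here a snapshot is simply a copied dictionary from path to a content marker; real HDFS snapshots record metadata rather than copying data, and the function names and paths below are hypothetical.

```python
from typing import Dict, List, Tuple

Snapshot = Dict[str, str]  # path -> content marker


def take_snapshot(files: Dict[str, str]) -> Snapshot:
    """Freeze the current state (a plain copy in this toy model)."""
    return dict(files)


def snapshot_diff(old: Snapshot, new: Snapshot) -> List[Tuple[str, str]]:
    """List created (+), deleted (-) and modified (M) paths between two snapshots."""
    changes = []
    for path in sorted(old.keys() | new.keys()):
        if path not in old:
            changes.append(("+", path))
        elif path not in new:
            changes.append(("-", path))
        elif old[path] != new[path]:
            changes.append(("M", path))
    return changes


files = {"/data/raw/a.csv": "v1", "/data/raw/b.csv": "v1"}
s0 = take_snapshot(files)

files["/data/raw/a.csv"] = "v2"        # a file is modified
del files["/data/raw/b.csv"]           # user error: accidental delete
files["/data/derived/out.csv"] = "v1"  # derived data is created
s1 = take_snapshot(files)

diff = snapshot_diff(s0, s1)
print(diff)

# Restoring to s0 undoes the accidental delete.
files = dict(s0)
```

The diff is what lets an admin see how raw or derived data changed between two points in time before deciding whether to restore.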
18.
NFS Gateway: Feature Overview
• NFS v3 standard
• Supports all HDFS commands
– List files
– Copy, move files
– Create and delete directories
• Ingest for large scale analytical workloads
– Load immutable files as source for analytical processing
– No random writes
• Stream files into HDFS
– Log ingest by applications writing directly to HDFS client mount
19.
Federation
20.
Managing Namespaces
21.
Performance
22.
Other Features
23.
Apache Tez: A New Hadoop Data Processing Framework
24.
Moving Hadoop Beyond MapReduce
• Low level data-processing execution engine
• Built on YARN
• Enables pipelining of jobs
• Removes task and job launch times
• Does not write intermediate output to HDFS
– Much lighter disk and network usage
• New base of MapReduce, Hive, Pig, Cascading etc.
• Hive and Pig jobs no longer need to move to the end of the queue between steps in the pipeline
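The saving from not writing intermediate output to HDFS can be shown by counting writes in a toy pipeline. In the sketch below, `FakeHdfs` is a stand-in that only counts writes, and the two runner functions are hypothetical simplifications: one persists every stage's output as a chain of separate jobs would, the other hands results straight to the next stage as a single DAG would.

```python
from typing import Callable, List

Stage = Callable[[List[int]], List[int]]


class FakeHdfs:
    """Counts writes; stands in for a distributed file system."""

    def __init__(self) -> None:
        self.writes = 0

    def write(self, data: List[int]) -> List[int]:
        self.writes += 1
        return list(data)


def run_as_chained_jobs(stages: List[Stage], data: List[int], hdfs: FakeHdfs) -> List[int]:
    """One job per stage: each stage's output is persisted before the next starts."""
    for stage in stages:
        data = hdfs.write(stage(data))
    return data


def run_as_single_dag(stages: List[Stage], data: List[int], hdfs: FakeHdfs) -> List[int]:
    """One DAG: results pass directly between stages; only the final output is persisted."""
    for stage in stages:
        data = stage(data)
    return hdfs.write(data)


stages: List[Stage] = [
    lambda xs: [x * 2 for x in xs],     # transform
    lambda xs: [x for x in xs if x > 2],  # filter
    lambda xs: [sum(xs)],               # aggregate
]

chained_fs, dag_fs = FakeHdfs(), FakeHdfs()
chained_result = run_as_chained_jobs(stages, [1, 2, 3], chained_fs)
dag_result = run_as_single_dag(stages, [1, 2, 3], dag_fs)
print(chained_result, chained_fs.writes, dag_result, dag_fs.writes)
```

Both runs produce the same answer; the chained version performs one write per stage, the DAG version a single write regardless of pipeline length.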
25.
Apache Tez as the new Primitive
• MapReduce as Base (HADOOP 1.0)
– Pig (data flow), Hive (sql), Others (cascading)
– MapReduce (cluster resource management & data processing)
– HDFS (redundant, reliable storage)
• Apache Tez as Base (HADOOP 2.0)
– Batch: MapReduce
– Data Flow (Pig), SQL (Hive), Others (cascading): Tez (execution engine)
– Real Time Stream Processing: Storm
– Online Data Processing: HBase, Accumulo
– YARN (cluster resource management)
– HDFS2 (redundant, reliable storage)
26.
Hive-on-MR vs. Hive-on-Tez
Tez avoids unneeded writes to HDFS
    SELECT a.x, AVG(b.y) AS avg
    FROM a JOIN b ON (a.id = b.id)
    GROUP BY a.x
    UNION
    SELECT x, AVG(y) AS avg
    FROM c
    GROUP BY x
    ORDER BY avg;
[Diagram: a query plan drawn twice. Under Hive – MR it is a chain of map (M) and reduce (R) jobs with an HDFS write between each job; under Hive – Tez it is a single graph of M and R tasks with no intermediate HDFS writes. Stage labels: SELECT a.state, c.itemId; SELECT b.id; JOIN (a, c) SELECT c.price; JOIN(a, b) GROUP BY a.state COUNT(*) AVG(c.price).]
27.
Apache Tez (“Speed”)
• Replaces MapReduce as primitive for Pig, Hive, Cascading etc.
– Lower latency for interactive queries
– Higher throughput for batch queries
– 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
• Tez Task - <Input, Processor, Output>
– Task with pluggable Input, Processor and Output
• YARN ApplicationMaster to run DAG of Tez Tasks
28.
Tez: Building blocks for scalable data processing
• Classical ‘Map’: HDFS Input → Map Processor → Sorted Output
• Classical ‘Reduce’: Shuffle Input → Reduce Processor → HDFS Output
• Intermediate ‘Reduce’ for Map-Reduce-Reduce: Shuffle Input → Reduce Processor → Sorted Output
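The idea of a task as a pluggable <Input, Processor, Output> triple can be sketched as below, with a word count assembled from a 'Map' task and a 'Reduce' task. The `Task` class and its field names are hypothetical, data is held in memory rather than in HDFS, and this is not the Tez API (which is Java); it only shows how the same task shape covers both roles.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Any, Callable, Iterable


@dataclass
class Task:
    """A task is an input, a processor and an output wired together."""
    read: Callable[[], Iterable[Any]]
    process: Callable[[Iterable[Any]], Iterable[Any]]
    write: Callable[[Iterable[Any]], Any]

    def run(self) -> Any:
        return self.write(self.process(self.read()))


lines = ["a b", "b c"]  # stands in for an HDFS input

# Classical 'Map': input -> map processor -> sorted output.
map_task = Task(
    read=lambda: lines,
    process=lambda ls: ((word, 1) for line in ls for word in line.split()),
    write=sorted,
)
shuffled = map_task.run()

# Classical 'Reduce': shuffle input -> reduce processor -> final output.
reduce_task = Task(
    read=lambda: shuffled,
    process=lambda pairs: (
        (key, sum(count for _, count in group))
        for key, group in groupby(pairs, key=lambda pair: pair[0])
    ),
    write=list,
)
counts = reduce_task.run()
print(counts)
```

An intermediate 'Reduce' would use the same reduce processor but swap `write=list` for `write=sorted`, so that its output can feed another reduce stage.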
29.
Hive
30.
SQL: Enhancing SQL Semantics
Hive 12 provides a wide array of SQL datatypes and semantics so your existing tools integrate more seamlessly with Hadoop
• Hive SQL Datatypes
– INT
– TINYINT/SMALLINT/BIGINT
– BOOLEAN
– FLOAT
– DOUBLE
– STRING
– TIMESTAMP
– BINARY
– DECIMAL
– ARRAY, MAP, STRUCT, UNION
– DATE
– VARCHAR
– CHAR
• Hive SQL Semantics
– SELECT, INSERT
– GROUP BY, ORDER BY, SORT BY
– JOIN on explicit join key
– Inner, outer, cross and semi joins
– Sub-queries in FROM clause
– ROLLUP and CUBE
– UNION
– Windowing Functions (OVER, RANK, etc.)
– Custom Java UDFs
– Standard Aggregation (SUM, AVG, etc.)
– Advanced UDFs (ngram, Xpath, URL)
– Sub-queries in WHERE, HAVING
– Expanded JOIN Syntax
– SQL Compliant Security (GRANT, etc.)
– INSERT/UPDATE/DELETE (ACID)
[Legend, color-coded on the slide under the heading SQL Compliance: Available, Hive 0.12, Roadmap.]
31.
SPEED: Increasing Hive Performance
Interactive Query Times across ALL use cases
• Simple and advanced queries in seconds
• Integrates seamlessly with existing tools
• Currently a >100x improvement in just nine months
Performance Improvements included in Hive 12
– Base & advanced query optimization
– Startup time improvement
– Join optimizations
32.
Apache Tez as the new Primitive
HADOOP 1.0 (MapReduce as Base)
– Pig (data flow), Hive (sql), Others (cascading)
– MapReduce (cluster resource management & data processing)
– HDFS (redundant, reliable storage)
HADOOP 2.0 (Apache Tez as Base)
– Batch: MapReduce
– Data Flow: Pig; SQL: Hive; Others (cascading), all on Tez (execution engine)
– Online Data Processing: HBase, Accumulo
– Real Time Stream Processing: Storm
– YARN (cluster resource management)
– HDFS2 (redundant, reliable storage)
33.
Hive-on-MR vs. Hive-on-Tez
Tez avoids unneeded writes to HDFS
SELECT a.x, AVG(b.y) AS avg
FROM a JOIN b ON (a.id = b.id)
GROUP BY a.x
UNION
SELECT x, AVG(y) AS avg
FROM c
GROUP BY x
ORDER BY avg;
[Diagram: Hive – MR vs. Hive – Tez execution plans built from map (M) and reduce (R) stages. Stage labels: SELECT a.state, c.itemId; SELECT b.id; SELECT c.price; JOIN (a, c); JOIN(a, b) GROUP BY a.state COUNT(*) AVERAGE(c.price). In the MR plan, HDFS sits between successive jobs.]
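A rough way to see where the savings come from is to count HDFS writes for the same plan under both models. The sketch below is a deliberately simplified model, not how Hive or Tez plan queries: the stage names are hypothetical, it assumes one MapReduce job per shuffle stage with each job writing its output to HDFS, and it assumes a single DAG writes only its final result.

```python
from typing import Dict, List, Set

# Hypothetical query plan: stage name -> stages that consume its output.
PLAN: Dict[str, List[str]] = {
    "scan_a": ["join_ab"],
    "scan_b": ["join_ab"],
    "join_ab": ["group_by_state"],
    "group_by_state": ["order_by_avg"],
    "order_by_avg": [],
}

# Stages that need a shuffle, i.e. would each end a separate MapReduce job.
REDUCE_STAGES: Set[str] = {"join_ab", "group_by_state", "order_by_avg"}


def hdfs_writes_as_mr_jobs(reduce_stages: Set[str]) -> int:
    # Assumption: one MapReduce job per reduce stage, each writing its output to HDFS.
    return len(reduce_stages)


def hdfs_writes_as_single_dag(plan: Dict[str, List[str]]) -> int:
    # Assumption: only stages with no consumer (the sinks) write to HDFS.
    return sum(1 for consumers in plan.values() if not consumers)


mr_writes = hdfs_writes_as_mr_jobs(REDUCE_STAGES)
dag_writes = hdfs_writes_as_single_dag(PLAN)
```

Under these assumptions the chain of jobs writes to HDFS three times and the single DAG once; the two intermediate writes are the ones a DAG engine can skip.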
34.
Tez on YARN
[Diagram: a ResourceManager (Scheduler) and a grid of NodeManagers running containers for three workloads side by side: Batch (map1.1, map1.2, reduce1.1), Hive/Tez (SQL) (vertex1.1.1, vertex1.1.2, vertex1.2.1, vertex1.2.2) and Real-Time (nimbus0, nimbus1, nimbus2).]
35.
Apache Falcon
Data Lifecycle Management for Hadoop
36.
Data Lifecycle on Hadoop is Challenging
Data Management Needs: Data Processing, Replication, Retention, Scheduling, Reprocessing, Multi Cluster Management
Tools: Oozie, Sqoop, Distcp, Flume, Map / Reduce, Hive and Pig Jobs
Problem: A patchwork of tools complicates data lifecycle management.
Result: Long development cycles and quality challenges.
37.
Falcon: One-stop Shop for Data Lifecycle
Apache Falcon provides (Data Management Needs): Data Processing, Replication, Retention, Scheduling, Reprocessing, Multi Cluster Management
Apache Falcon orchestrates (Tools): Oozie, Sqoop, Distcp, Flume, Map / Reduce, Hive and Pig Jobs
Falcon provides a single interface to orchestrate data lifecycle.
Sophisticated DLM easily added to Hadoop applications.
38.
Falcon Core Capabilities
• Core Functionality
– Pipeline processing
– Replication
– Retention
– Late data handling
• Automates
– Scheduling and retry
– Recording audit, lineage and metrics
• Operations and Management
– Monitoring, management, metering
– Alerts and notifications
– Multi Cluster Federation
• CLI and REST API
39.
Falcon At A Glance
[Diagram: Data Processing Applications sit on top of the Falcon Data Management Framework, which covers Data Import and Replication, Scheduling and Coordination, Data Lifecycle Policies, Multi-Cluster Management and SLA Management.]
> Falcon offers a high-level abstraction of key services for Hadoop data management needs.
> Complex data processing logic is handled by Falcon instead of hard-coded in data processing apps.
> Falcon enables faster development of ETL, reporting and other data processing apps on Hadoop.
40.
Falcon Example: Replication
[Diagram: pipeline Staged Data → Cleansed Data → Conformed Data → Access Data on the primary cluster; two Replication arrows lead to Staged Data and Processed Data on a second cluster.]
> Falcon manages workflow and replication.
> Enables business continuity without requiring full data representation.
> Failover clusters can be smaller than primary clusters.
41.
Falcon Example: Retention
– Staged Data: Retain 20 Years
– Cleansed Data: Retain 3 Years
– Conformed Data: Retain 3 Years
– Access Data: Retain Last Copy Only
> Sophisticated retention policies expressed in one place.
> Simplify data retention for audit, compliance, or for data re-processing.
42.
Falcon Example: Late Data Handling
[Diagram: Online Transaction Data (via Sqoop) and Web Log Data (via FTP) land in Staged Data and are merged into a Combined Dataset; the pipeline waits up to 4 hours for FTP data to arrive.]
> Processing waits until all required input data is available.
> Checks for late data arrivals and retriggers processing as necessary.
> Eliminates writing complex data handling rules within applications.
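The decision logic described above might look like the following sketch. It is an illustration only and not Falcon's implementation: `next_action`, its arguments and the returned action names are hypothetical, and the 4-hour cut-off is taken from the slide.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

LATE_CUTOFF = timedelta(hours=4)  # "wait up to 4 hours" from the slide


def next_action(
    nominal: datetime,
    now: datetime,
    last_update: Dict[str, Optional[datetime]],
    last_run: Optional[datetime],
) -> str:
    """Decide what to do for one pipeline instance.

    last_update maps each input feed to the time its data last changed,
    or None if nothing has arrived yet.
    """
    deadline = nominal + LATE_CUTOFF
    if last_run is None:
        # Not processed yet: run only once every input is present.
        if all(t is not None and t <= now for t in last_update.values()):
            return "run"
        return "wait" if now < deadline else "expired"
    # Already processed: rerun if data changed after the run, within the cut-off.
    late = [
        name
        for name, t in last_update.items()
        if t is not None and last_run < t <= deadline
    ]
    return "rerun" if late else "done"


nominal = datetime(2013, 11, 20, 0, 0)
one_am = nominal + timedelta(hours=1)
missing_ftp = {"transactions": nominal + timedelta(minutes=5), "weblogs": None}
action_while_waiting = next_action(nominal, one_am, missing_ftp, None)
```

With the FTP feed still missing one hour after the nominal time, the sketch keeps waiting; once the cut-off passes it reports the instance as expired instead of blocking forever.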
43.
Examples
44.
Example: Cluster Specification
<?xml version="1.0"?>
<!-- My Local Cluster specification -->
<cluster colo="my-local-cluster" description="" name="cluster-alpha">
  <interfaces>
    <!-- readonly and write: NameNode -->
    <interface type="readonly" endpoint="hftp://nn:50070" version="2.2.0" />
    <interface type="write" endpoint="hdfs://nn:8020" version="2.2.0" />
    <!-- execute: Resource Manager -->
    <interface type="execute" endpoint="rm:8050" version="2.2.0" />
    <!-- workflow: Oozie Server -->
    <interface type="workflow" endpoint="http://os:11000/oozie/" version="4.0.0" />
    <interface type="messaging" endpoint="tcp://mq:61616?daemon=true" version="5.1.6" />
  </interfaces>
  <locations>
    <location name="staging" path="/apps/falcon/cluster-alpha/staging" />
    <location name="temp" path="/tmp" />
    <location name="working" path="/apps/falcon/cluster-alpha/working" />
  </locations>
</cluster>
45.
Example: Weblogs Replication and Retention
46.
Example 1: Weblogs
• Weblogs land hourly in my primary cluster
• HDFS location is /weblogs/{date}
• I want to:
– Evict weblogs from primary cluster after 1 day
47.
Feed Specification 1: Weblogs
<feed description="" name="feed-weblogs1" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <!-- Cluster where data is located; retention policy 1 day -->
    <cluster name="cluster-primary" type="source">
      <validity start="2013-10-24T00:00Z" end="2014-12-31T00:00Z"/>
      <retention limit="days(1)" action="delete"/>
    </cluster>
  </clusters>
  <!-- Location of the data -->
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}" />
  </locations>
  <ACL owner="hdfs" group="users" permission="0755" />
  <schema location="/none" provider="none"/>
</feed>
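To make the retention policy concrete, the sketch below expands the feed's path template for each hourly instance and lists the ones that a 1-day limit would evict at a given time. It is a minimal illustration, not Falcon code: the function names are hypothetical, and treating an instance as expired strictly after the limit has elapsed is an assumption.

```python
from datetime import datetime, timedelta
from typing import List

TEMPLATE = "/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"  # path from the feed spec


def instance_path(template: str, t: datetime) -> str:
    # Fill the date variables of the path template for one feed instance.
    return (
        template.replace("${YEAR}", f"{t:%Y}")
        .replace("${MONTH}", f"{t:%m}")
        .replace("${DAY}", f"{t:%d}")
        .replace("${HOUR}", f"{t:%H}")
    )


def expired_instances(
    template: str,
    start: datetime,
    now: datetime,
    frequency: timedelta,
    retention: timedelta,
) -> List[str]:
    # Assumption: an instance is evicted once it is older than the retention limit.
    cutoff = now - retention
    paths = []
    t = start
    while t < cutoff:
        paths.append(instance_path(template, t))
        t += frequency
    return paths


to_delete = expired_instances(
    TEMPLATE,
    start=datetime(2013, 10, 24, 0, 0),  # validity start from the feed spec
    now=datetime(2013, 10, 25, 3, 0),
    frequency=timedelta(hours=1),        # hours(1)
    retention=timedelta(days=1),         # days(1)
)
```

At 03:00 on 2013-10-25 the three instances from 00:00 to 02:00 of the previous day fall outside the 1-day window.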
48.
Example 2: Weblogs
• Weblogs land hourly in my primary cluster
• HDFS location is /weblogs/{date}
• I want to:
– Replicate weblogs to my secondary cluster
– Evict weblogs from primary cluster after 2 days
– Evict weblogs from secondary cluster after 1 week
49.
Feed Specification 2: Weblogs
<feed description="" name="feed-weblogs2" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <!-- Cluster where data is located; retention policy 2 days -->
    <cluster name="cluster-primary" type="source">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(2)" action="delete"/>
    </cluster>
    <!-- Cluster where data will be replicated; retention policy 1 week -->
    <cluster name="cluster-secondary" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
  </clusters>
  <!-- Location of the data -->
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="hdfs" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>
50.
Example 3: Weblogs
• Weblogs land hourly in my primary cluster
• HDFS location is /weblogs/{date}
• I want to:
– Replicate weblogs to a discovery cluster
– Replicate weblogs to a BCP cluster
– Evict weblogs from primary cluster after 2 days
– Evict weblogs from discovery cluster after 1 week
– Evict weblogs from BCP cluster after 3 months
51.
Feed Specification 3: Weblogs
<feed description="" name="feed-weblogs" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <cluster name="cluster-primary" type="source">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(2)" action="delete"/>
    </cluster>
    <cluster name="cluster-discovery" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
      <!-- Cluster specific location -->
      <locations>
        <location type="data" path="/projects/recommendations/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="cluster-bcp" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="months(3)" action="delete"/>
      <!-- Cluster specific location -->
      <locations>
        <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="hdfs" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>
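One way to check what a feed like this asks for is to parse it and print each cluster's role, retention limit and effective data path. The sketch below uses only Python's standard `xml.etree.ElementTree`; `summarize` is a hypothetical helper, the XML is a trimmed copy of the feed above (validity windows and the BCP location omitted), and falling back to the feed-level location when a cluster has none is an assumption about the intended semantics.

```python
import xml.etree.ElementTree as ET
from typing import Dict

NS = {"f": "uri:falcon:feed:0.1"}  # namespace declared on the <feed> element

FEED_XML = """
<feed description="" name="feed-weblogs" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <cluster name="cluster-primary" type="source">
      <retention limit="days(2)" action="delete"/>
    </cluster>
    <cluster name="cluster-discovery" type="target">
      <retention limit="days(7)" action="delete"/>
      <locations>
        <location type="data" path="/projects/recommendations/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="cluster-bcp" type="target">
      <retention limit="months(3)" action="delete"/>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
</feed>
"""


def summarize(feed_xml: str) -> Dict[str, Dict[str, str]]:
    root = ET.fromstring(feed_xml.strip())
    # Feed-level location, used when a cluster does not override it (assumption).
    default_path = root.find("f:locations/f:location[@type='data']", NS).get("path")
    summary = {}
    for cluster in root.findall("f:clusters/f:cluster", NS):
        override = cluster.find("f:locations/f:location[@type='data']", NS)
        summary[cluster.get("name")] = {
            "role": cluster.get("type"),
            "retention": cluster.find("f:retention", NS).get("limit"),
            "path": override.get("path") if override is not None else default_path,
        }
    return summary


summary = summarize(FEED_XML)
```

Here `summary["cluster-discovery"]["path"]` points under /projects/recommendations, while the primary and BCP clusters resolve to the feed-level /weblogs path.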
52.
Apache Knox
Secure Access to Hadoop
53.
Connecting to the Cluster: Edge Nodes
• What is an Edge Node?
– A node in a DMZ that has access to the cluster; it is the only way to access the cluster
– Hadoop client APIs and MR/Pig/Hive jobs are executed from these edge nodes
– Users SSH to the Edge Node, upload all job artifacts, and then execute API commands from the shell
[Diagram: User → (SSH) → Edge Node → (API/Commands) → Hadoop]
• Challenges
– SSH, Edge Node, and job maintenance nightmare
– Difficult to integrate with applications
54.
Connecting to the Cluster: REST API
Service | API
WebHDFS | Supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming.
WebHCat | Job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands.
Oozie | Job submission and management, and Oozie administration.
• Useful for connecting to Hadoop from outside the cluster
• When more client language flexibility is required
– i.e. Java binding not an option
• Challenges
– Client must have knowledge of cluster topology
– Ports must be opened to clients outside the cluster (in some cases, on every host)
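As an illustration of the WebHDFS row, the sketch below builds request URLs for a few of the operations listed (listing, reading, making directories, renaming, changing permissions). It only constructs URLs and sends nothing over the network. `webhdfs_url` is a hypothetical helper, and the host `nn` and port 50070 are borrowed from the cluster specification example earlier in the deck.

```python
from urllib.parse import quote, urlencode

# HTTP method used by each WebHDFS operation shown here.
METHODS = {
    "LISTSTATUS": "GET",
    "OPEN": "GET",
    "MKDIRS": "PUT",
    "RENAME": "PUT",
    "SETPERMISSION": "PUT",
}


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    # WebHDFS URLs have the form http://<host>:<port>/webhdfs/v1/<path>?op=<OP>&...
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


list_url = webhdfs_url("nn", 50070, "/weblogs", "LISTSTATUS")
chmod_url = webhdfs_url("nn", 50070, "/weblogs", "SETPERMISSION", permission="755")
```

Note that the client has to know the NameNode's host and port to build even this first URL, which is the topology-knowledge challenge the slide points out.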
55.
Apache Knox Gateway – Perimeter Security
Simplified Access and Centralized Security:
• Single Hadoop access point
• Rationalized REST API hierarchy
• Eliminate SSH “edge node”
• LDAP and ActiveDirectory auth
• Consolidated API calls
• Multi-cluster support
• Central API management + audit
• Client DSL
56.
Knox Gateway Network Architecture
[Diagram: clients (Browser, REST Client, JDBC Client, Ambari Client, Ambari Server/Hue Server) pass a firewall into the DMZ, where the Knox Gateway cluster (GW, GW, GW) runs. Behind a second firewall sit Secure Hadoop Cluster 1 and Secure Hadoop Cluster 2, each with Masters (NN, JT), DN and TT nodes, and WebHCat, Oozie, YARN, HBase and Hive. Identity Providers: Kerberos/Enterprise Identity Provider and Enterprise/Cloud SSO Provider.]
– A stateless cluster of reverse proxy instances deployed in the DMZ
– Requests are streamed through the GW to Hadoop services after auth
– URLs are rewritten to refer to the gateway
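The URL rewriting mentioned above can be pictured with a small sketch that maps a cluster-internal URL onto a gateway address, so clients never see internal hosts or ports. This is not Knox's rewrite engine: the gateway address, the `gateway/<cluster>` path layout and the function name are assumptions for illustration, and the real layout depends on the Knox version and topology configuration.

```python
from urllib.parse import urlsplit, urlunsplit

GATEWAY = "https://knox.example.com:8443/gateway"  # hypothetical gateway address


def to_gateway_url(internal_url: str, cluster: str, gateway: str = GATEWAY) -> str:
    # Keep the service path and query, but replace scheme, host and port
    # with the gateway's and prefix the cluster name.
    internal = urlsplit(internal_url)
    gw = urlsplit(gateway)
    path = f"{gw.path}/{cluster}{internal.path}"
    return urlunsplit((gw.scheme, gw.netloc, path, internal.query, ""))


external = to_gateway_url(
    "http://nn:50070/webhdfs/v1/weblogs?op=LISTSTATUS", "cluster-alpha"
)
```

The same function applied to URLs of a second cluster only changes the cluster segment, which is how a single access point can front several clusters.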
57.
Wot no 2.2.0?
Where can I get the Hadoop 2.2.0 fix?
58.
Like the Truth, Hadoop 2.2.0 is out there…
Component       | HDP2.0 | CDH4       | CDH5 Beta | Intel IDH3.0 | MapR 3 | IBM Big Insights 2.1
Hadoop Common   | 2.2.0  | 2.0.0      | 2.2.0     | 2.0.4        | N/A    | 1.1.1
Hive + HCatalog | 0.12   | 0.10 + 0.5 | 0.11      | 0.10 + 0.5   | 0.11   | 0.9 + 0.4
Pig             | 0.12   | 0.11       | 0.11      | 0.10         | 0.11   | 0.10
Mahout          | 0.8    | 0.7        | 0.8       | 0.8          | 0.8    | N/A
Flume           | 1.4.0  | 1.4.0      | 1.4.0     | 1.3.0        | 1.4.0  | 1.3.0
Oozie           | 4.0.0  | 3.3.2      | 4.0.0     | 3.3.0        | 3.3.2  | 3.2.0
Sqoop           | 1.4.4  | 1.4.3      | 1.4.4     | 1.4.3        | 1.4.4  | 1.4.2
HBase           | 0.96.0 | 0.94.6     | 0.95.2    | 0.94.7       | 0.94.9 | 0.94.3
59.
Thank You
THUG Life