SlideShare una empresa de Scribd logo
1 de 22
by Ran Silberman
DevOps for Big Data
Cluster management tools
20.4.2015
Hosted by:
FullStack Developers Israel
Ran Silberman,
Big Data Architect
...and amateur birder
● Explain Cluster Management
tools by example
● Demo Cloudera Management
● Pros and Cons
Agenda
Birds of Brazil Wiki application
● Input photos and locations
● Batch: Display statistics on bird,
location & photographer.
● Real-time: Count how many birds
were seen in the last minute from
each species
Application
requirements
● Volume growth
● Velocity of Streaming and Batch
● Same env from DEV to PROD
● Data from PROD to test on DEV
● Manage Deployment of many
applications on many nodes
Big Data lifecycle
considerations
● HDFS for storing the data
● Hive for batch processing
● Solr/elasticsearch for search
● Spark for streaming
● ...Home-grown applications
Choosing the
Infrastructures
Many Infrastructures
How can we manage all those
infrastructures?
● Hortonworks Ambari
or
● Cloudera Manager
Choosing the
Management
tool
● All platforms & infrastructures are
installed by the tool
● Monitoring, Audits & logs are
built-in
● Easy installation and upgrade
● Save scripting work
What are the
news for DevOps
pipeline?
● Manage cluster with GUI or API
● Hadoop installation and setup
● System monitoring & alerts
● Built-in systems: Zookeeper,
Spark, Hive Impala and more
● Ability to add parcels
CM features
● Monolithic packages
● Relocatable
● sudo-less installs
● Rolling upgrade
Parcels
Custom Service Descriptors
● CSD is a descriptor for a service
used by CM
● Defines how to install start/stop
a service and the logic used by
CM
CSD
Demo
● Archive data in Hadoop
● Growing data affects DWH
performance & capabilities
● Creating realistic testing data
● Dev and Prod env. may differ in
cluster size (dev may be 1 node)
More DevOps
considerations
Tools Comparison
CM Ambari
Licence Paid Ent edition Free Apache Open Source
Technology Cloudera puppet, ganglia, nagios
Dependency CDH HDP
Manage cluster Parcels Yum
REST API + +
Extra Features Rolling Upgrade, 3rd-
parties Mngt,
Extendable by REST API
CM features
Express Enterprise
Subscription Free Annual
Deployment &
Configuration
+ +
Management + +
Monitoring + +
Diagnostic + +
Extra Features Reports, Rollbacks, Rolling
Upgrade, AD Kerberos, Kerberos
wizard, Backup & DR
● Fast Deploy
● Easy management by GUI
● Built in monitoring and alerts
● Simple upgrades
● Same management and deploy
in Dev and Prod
Pros. of Hadoop
Management
tools
● Tied to specific vendor
proprietary system
● Tied to system version by
Parcels
● Less flexibility to low-level
management
Cons. of Hadoop
Management
tools
THANK YOU
Ran Silberman
Email: ran@tikalk.com

Más contenido relacionado

La actualidad más candente

Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on DockerRakesh Saha
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayBuilding A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayVMware Tanzu
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftChester Chen
 
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Roberto Hashioka
 
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimHDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimDatabricks
 
Episode 3: Kubernetes and Big Data Services
Episode 3: Kubernetes and Big Data ServicesEpisode 3: Kubernetes and Big Data Services
Episode 3: Kubernetes and Big Data ServicesMesosphere Inc.
 
DevNexus 2015: Kubernetes & Container Engine
DevNexus 2015: Kubernetes & Container EngineDevNexus 2015: Kubernetes & Container Engine
DevNexus 2015: Kubernetes & Container EngineKit Merker
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentLeandro Totino Pereira
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginerYousun Jeong
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Philip Fisher-Ogden
 
Spark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit
 
RedisConf18 - Redis Enterprise on Cloud Native Platforms
RedisConf18 - Redis Enterprise on Cloud  Native  Platforms RedisConf18 - Redis Enterprise on Cloud  Native  Platforms
RedisConf18 - Redis Enterprise on Cloud Native Platforms Redis Labs
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructuremattlieber
 
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaBridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaDataWorks Summit/Hadoop Summit
 
Scalable Spark deployment using Kubernetes
Scalable Spark deployment using KubernetesScalable Spark deployment using Kubernetes
Scalable Spark deployment using Kubernetesdatamantra
 
CI/CD with Azure DevOps and Azure Databricks
CI/CD with Azure DevOps and Azure DatabricksCI/CD with Azure DevOps and Azure Databricks
CI/CD with Azure DevOps and Azure DatabricksGoDataDriven
 
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...HostedbyConfluent
 

La actualidad más candente (20)

Big data and Kubernetes
Big data and KubernetesBig data and Kubernetes
Big data and Kubernetes
 
Hadoop on Docker
Hadoop on DockerHadoop on Docker
Hadoop on Docker
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One DayBuilding A Diverse Geo-Architecture For Cloud Native Applications In One Day
Building A Diverse Geo-Architecture For Cloud Native Applications In One Day
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
 
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon KimHDFS on Kubernetes—Lessons Learned with Kimoon Kim
HDFS on Kubernetes—Lessons Learned with Kimoon Kim
 
Episode 3: Kubernetes and Big Data Services
Episode 3: Kubernetes and Big Data ServicesEpisode 3: Kubernetes and Big Data Services
Episode 3: Kubernetes and Big Data Services
 
DevNexus 2015: Kubernetes & Container Engine
DevNexus 2015: Kubernetes & Container EngineDevNexus 2015: Kubernetes & Container Engine
DevNexus 2015: Kubernetes & Container Engine
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginer
 
Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014Netflix viewing data architecture evolution - QCon 2014
Netflix viewing data architecture evolution - QCon 2014
 
Spark Summit EU talk by William Benton
Spark Summit EU talk by William BentonSpark Summit EU talk by William Benton
Spark Summit EU talk by William Benton
 
RedisConf18 - Redis Enterprise on Cloud Native Platforms
RedisConf18 - Redis Enterprise on Cloud  Native  Platforms RedisConf18 - Redis Enterprise on Cloud  Native  Platforms
RedisConf18 - Redis Enterprise on Cloud Native Platforms
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructure
 
How to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOSHow to deploy Apache Spark 
to Mesos/DCOS
How to deploy Apache Spark 
to Mesos/DCOS
 
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ ExpediaBridging the gap of Relational to Hadoop using Sqoop @ Expedia
Bridging the gap of Relational to Hadoop using Sqoop @ Expedia
 
Scalable Spark deployment using Kubernetes
Scalable Spark deployment using KubernetesScalable Spark deployment using Kubernetes
Scalable Spark deployment using Kubernetes
 
CI/CD with Azure DevOps and Azure Databricks
CI/CD with Azure DevOps and Azure DatabricksCI/CD with Azure DevOps and Azure Databricks
CI/CD with Azure DevOps and Azure Databricks
 
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...
Should you read Kafka as a stream or in batch? Should you even care? | Ido Na...
 

Destacado

PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer Demand
PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer DemandPaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer Demand
PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer DemandCisco IT
 
Sharding with spider solutions 20160721
Sharding with spider solutions 20160721Sharding with spider solutions 20160721
Sharding with spider solutions 20160721Kentoku
 
Diário da Região 02/12/2011 Sexta-Feira
Diário da Região 02/12/2011 Sexta-FeiraDiário da Região 02/12/2011 Sexta-Feira
Diário da Região 02/12/2011 Sexta-Feiraijacomassi
 
Mastering the Social Media Ecosystem (More Advanced Social Media Training)
Mastering the Social Media Ecosystem (More Advanced Social Media Training)Mastering the Social Media Ecosystem (More Advanced Social Media Training)
Mastering the Social Media Ecosystem (More Advanced Social Media Training)Danielle Brigida
 
Workplace health & safety act january 2011
Workplace health & safety act january 2011Workplace health & safety act january 2011
Workplace health & safety act january 2011Optimuminsurance
 
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide Citytravelreview / Curso eG
 
Programa provisional V Foro Joven Institucional
Programa provisional V Foro Joven InstitucionalPrograma provisional V Foro Joven Institucional
Programa provisional V Foro Joven Institucionaljsaragon
 
Clash of clans data structures
Clash of clans   data structuresClash of clans   data structures
Clash of clans data structuresRan Silberman
 
From a kafkaesque story to The Promised Land
From a kafkaesque story to The Promised LandFrom a kafkaesque story to The Promised Land
From a kafkaesque story to The Promised LandRan Silberman
 
Prince2, Características, Benefícios e Diferenciais de Sucesso
Prince2, Características, Benefícios e Diferenciais de SucessoPrince2, Características, Benefícios e Diferenciais de Sucesso
Prince2, Características, Benefícios e Diferenciais de SucessoMaria Angelica Castellani
 
Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13TeamQuest Corporation
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsCA Technologies
 
Best example of Cloud computing is my academic digital library.
Best example of Cloud computing is my academic digital library.Best example of Cloud computing is my academic digital library.
Best example of Cloud computing is my academic digital library.Aman Pandey
 
Continuous Integration and the Data Warehouse - PASS SQL Saturday Slovenia
Continuous Integration and the Data Warehouse - PASS SQL Saturday SloveniaContinuous Integration and the Data Warehouse - PASS SQL Saturday Slovenia
Continuous Integration and the Data Warehouse - PASS SQL Saturday SloveniaDr. John Tunnicliffe
 
Cluster management and automation with cloudera manager
Cluster management and automation with cloudera managerCluster management and automation with cloudera manager
Cluster management and automation with cloudera managerChris Westin
 
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...Alexey Kharlamov
 

Destacado (20)

PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer Demand
PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer DemandPaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer Demand
PaaS Lessons: Cisco IT Deploys OpenShift to Meet Developer Demand
 
Sharding with spider solutions 20160721
Sharding with spider solutions 20160721Sharding with spider solutions 20160721
Sharding with spider solutions 20160721
 
Diário da Região 02/12/2011 Sexta-Feira
Diário da Região 02/12/2011 Sexta-FeiraDiário da Região 02/12/2011 Sexta-Feira
Diário da Região 02/12/2011 Sexta-Feira
 
Mastering the Social Media Ecosystem (More Advanced Social Media Training)
Mastering the Social Media Ecosystem (More Advanced Social Media Training)Mastering the Social Media Ecosystem (More Advanced Social Media Training)
Mastering the Social Media Ecosystem (More Advanced Social Media Training)
 
Workplace health & safety act january 2011
Workplace health & safety act january 2011Workplace health & safety act january 2011
Workplace health & safety act january 2011
 
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide
Curso/CTR Reisejournalismus: INBerlin - A budget traveller’s Guide
 
Runes
RunesRunes
Runes
 
Programa provisional V Foro Joven Institucional
Programa provisional V Foro Joven InstitucionalPrograma provisional V Foro Joven Institucional
Programa provisional V Foro Joven Institucional
 
Clash of clans data structures
Clash of clans   data structuresClash of clans   data structures
Clash of clans data structures
 
Patriot Act & Datenschutz
Patriot Act & Datenschutz Patriot Act & Datenschutz
Patriot Act & Datenschutz
 
From a kafkaesque story to The Promised Land
From a kafkaesque story to The Promised LandFrom a kafkaesque story to The Promised Land
From a kafkaesque story to The Promised Land
 
Prince2, Características, Benefícios e Diferenciais de Sucesso
Prince2, Características, Benefícios e Diferenciais de SucessoPrince2, Características, Benefícios e Diferenciais de Sucesso
Prince2, Características, Benefícios e Diferenciais de Sucesso
 
BI + Big Data
BI + Big DataBI + Big Data
BI + Big Data
 
Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOps
 
Best example of Cloud computing is my academic digital library.
Best example of Cloud computing is my academic digital library.Best example of Cloud computing is my academic digital library.
Best example of Cloud computing is my academic digital library.
 
Continuous Integration and the Data Warehouse - PASS SQL Saturday Slovenia
Continuous Integration and the Data Warehouse - PASS SQL Saturday SloveniaContinuous Integration and the Data Warehouse - PASS SQL Saturday Slovenia
Continuous Integration and the Data Warehouse - PASS SQL Saturday Slovenia
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Cluster management and automation with cloudera manager
Cluster management and automation with cloudera managerCluster management and automation with cloudera manager
Cluster management and automation with cloudera manager
 
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...
Building large-scale analytics platform with Storm, Kafka and Cassandra - NYC...
 

Similar a Dev ops for big data cluster management tools

Introduction to PaaS and Heroku
Introduction to PaaS and HerokuIntroduction to PaaS and Heroku
Introduction to PaaS and HerokuTapio Rautonen
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionDaniel Zivkovic
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryVMware Tanzu
 
Wie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der CloudWie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der CloudAarno Aukia
 
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019VMware Tanzu
 
CI/CD on Google Cloud Platform
CI/CD on Google Cloud PlatformCI/CD on Google Cloud Platform
CI/CD on Google Cloud PlatformDevOps Indonesia
 
introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro servicesSpyros Lambrinidis
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Brian Brazil
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los AngelesVMware Tanzu
 
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12Upgrade your InfoSec, Ops and Dev teams with PCF 1.12
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12VMware Tanzu
 
Controlled Evolution with Puppet and AWS
Controlled Evolution with Puppet and AWSControlled Evolution with Puppet and AWS
Controlled Evolution with Puppet and AWSPuppet
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsFedir RYKHTIK
 
.NET Cloud-Native Bootcamp
.NET Cloud-Native Bootcamp.NET Cloud-Native Bootcamp
.NET Cloud-Native BootcampVMware Tanzu
 
Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...
 Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ... Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...
Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...Weaveworks
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthNicolas Brousse
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsNilanchal
 
Google Cloud Fundamentals
Google Cloud Fundamentals Google Cloud Fundamentals
Google Cloud Fundamentals Omar Fathy
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014Hojoong Kim
 
Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers Ravindra Dastikop
 
Cloud Manthn Software Solutions Pvt Ltd - What we do ?
Cloud Manthn Software Solutions Pvt Ltd - What we do ?Cloud Manthn Software Solutions Pvt Ltd - What we do ?
Cloud Manthn Software Solutions Pvt Ltd - What we do ?amodkadam
 

Similar a Dev ops for big data cluster management tools (20)

Introduction to PaaS and Heroku
Introduction to PaaS and HerokuIntroduction to PaaS and Heroku
Introduction to PaaS and Heroku
 
Google Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data editionGoogle Cloud Next '22 Recap: Serverless & Data edition
Google Cloud Next '22 Recap: Serverless & Data edition
 
The Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud FoundryThe Fastest Way to Redis on Pivotal Cloud Foundry
The Fastest Way to Redis on Pivotal Cloud Foundry
 
Wie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der CloudWie macht man aus Software einen Online-Service in der Cloud
Wie macht man aus Software einen Online-Service in der Cloud
 
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019
Pivotal Greenplum Cloud Marketplaces - Greenplum Summit 2019
 
CI/CD on Google Cloud Platform
CI/CD on Google Cloud PlatformCI/CD on Google Cloud Platform
CI/CD on Google Cloud Platform
 
introduction to micro services
introduction to micro servicesintroduction to micro services
introduction to micro services
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)
 
.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles.NET Cloud-Native Bootcamp- Los Angeles
.NET Cloud-Native Bootcamp- Los Angeles
 
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12Upgrade your InfoSec, Ops and Dev teams with PCF 1.12
Upgrade your InfoSec, Ops and Dev teams with PCF 1.12
 
Controlled Evolution with Puppet and AWS
Controlled Evolution with Puppet and AWSControlled Evolution with Puppet and AWS
Controlled Evolution with Puppet and AWS
 
DevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and ProjectsDevOps for TYPO3 Teams and Projects
DevOps for TYPO3 Teams and Projects
 
.NET Cloud-Native Bootcamp
.NET Cloud-Native Bootcamp.NET Cloud-Native Bootcamp
.NET Cloud-Native Bootcamp
 
Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...
 Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ... Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...
Cloud Native Transformation (Alexis Richardson) - Continuous Lifecycle 2018 ...
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Introduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / PlatformsIntroduction to Google Cloud Services / Platforms
Introduction to Google Cloud Services / Platforms
 
Google Cloud Fundamentals
Google Cloud Fundamentals Google Cloud Fundamentals
Google Cloud Fundamentals
 
Open shift and docker - october,2014
Open shift and docker - october,2014Open shift and docker - october,2014
Open shift and docker - october,2014
 
Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers Platform as a Service (PaaS) - A cloud service for Developers
Platform as a Service (PaaS) - A cloud service for Developers
 
Cloud Manthn Software Solutions Pvt Ltd - What we do ?
Cloud Manthn Software Solutions Pvt Ltd - What we do ?Cloud Manthn Software Solutions Pvt Ltd - What we do ?
Cloud Manthn Software Solutions Pvt Ltd - What we do ?
 

Último

Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 

Último (20)

Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 

Dev ops for big data cluster management tools

  • 1. by Ran Silberman DevOps for Big Data Cluster management tools 20.4.2015 Hosted by: FullStack Developers Israel
  • 2. Ran Silberman, Big Data Architect ...and amateur birder
  • 3. ● Explain Cluster Management tools by example ● Demo Cloudera Management ● Pros and Cons Agenda
  • 4. Birds of Brazil Wiki application
  • 5.
  • 6. ● Input photos and locations ● Batch: Display statistics on bird, location & photographer. ● Real-time: Count how many birds were seen in the last minute from each species Application requirements
  • 7. ● Volume growth ● Velocity of Streaming and Batch ● Same env from DEV to PROD ● Data from PROD to test on DEV ● Manage Deployment of many applications on many nodes Big Data lifecycle considerations
  • 8. ● HDFS for storing the data ● Hive for batch processing ● Solr/elasticsearch for search ● Spark for streaming ● ...Home-grown applications Choosing the Infrastructures
  • 10. How can we manage all those infrastructures?
  • 11. ● Hortonworks Ambari or ● Cloudera Manager Choosing the Management tool
  • 12. ● All platforms & infrastructures are installed by the tool ● Monitoring, Audits & logs are built-in ● Easy installation and upgrade ● Save scripting work What are the news for DevOps pipeline?
  • 13. ● Manage cluster with GUI or API ● Hadoop installation and setup ● System monitoring & alerts ● Built-in systems: Zookeeper, Spark, Hive Impala and more ● Ability to add parcels CM features
  • 14. ● Monolithic packages ● Relocatable ● sudo-less installs ● Rolling upgrade Parcels
  • 15. Custom Service Descriptors ● CSD is a descriptor for a service used by CM ● Defines how to install start/stop a service and the logic used by CM CSD
  • 16. Demo
  • 17. ● Archive data in Hadoop ● Growing data affects DWH performance & capabilities ● Creating realistic testing data ● Dev and Prod env. may differ in cluster size (dev may be 1 node) More DevOps considerations
  • 18. Tools Comparison CM Ambari Licence Paid Ent edition Free Apache Open Source Technology Cloudera puppet, ganglia, nagios Dependency CDH HDP Manage cluster Parcels Yum REST API + + Extra Features Rolling Upgrade, 3rd- parties Mngt, Extendable by REST API
  • 19. CM features Express Enterprise Subscription Free Annual Deployment & Configuration + + Management + + Monitoring + + Diagnostic + + Extra Features Reports, Rollbacks, Rolling Upgrade, AD Kerberos, Kerberos wizard, Backup & DR
  • 20. ● Fast Deploy ● Easy management by GUI ● Built in monitoring and alerts ● Simple upgrades ● Same management and deploy in Dev and Prod Pros. of Hadoop Management tools
  • 21. ● Tied to specific vendor proprietary system ● Tied to system version by Parcels ● Less flexibility to low-level management Cons. of Hadoop Management tools

Notas del editor

  1. Manage services health Show timeline Search box Start or stop services/cluster Enable HDFS high availability Enable Kerberos Changing HDFS block size from Configuration, View configuration history View Host’s status (charts) and processes Obtaining version of CDH > hosts > hosts inspector Upgrade CDH using parcels Install CSD, change port.