Modernized Monitoring for Clusters
and Clouds of All Types
Ian Lumb
Product Marketing Manager
Bright Topics Webinar
April 15, 2015
Key Takeaways
▪ Modernized monitoring
▪ Monitoring HPC and Hadoop clusters
▪ Monitoring public and private clouds
▪ Monitoring with alerts and health checks
▪ Customized monitoring, including how to incorporate your own monitors
The Five Essential Strategies
1. Plan to manage the impact of software complexity
2. Plan for scalable growth
3. Plan to manage heterogeneous hardware/software solutions
4. Be ready for the Cloud
5. Have an answer for the Hadoop question
http://insidehpc.com/2014/05/five-essential-strategies-successful-hpc-clusters/
http://insidehpc.com/2014/11/monitoring-hpc-clusters-modernized/
The problem with “Toolkits”
▪ Toolkits: “A patchwork of disparate tools”
  • Tools typically used: Ganglia, Nagios, Cfengine, System Imager, Puppet, Chef, Cobbler, Hobbit, Big Brother, Zabbix, etc.
  • Scripts
▪ Issues with the “toolkit” approach:
  • Scripts poorly documented and hard to maintain
  • Tools not designed to work together
  • Each tool has its own user interface (CLI/GUI)
  • Each tool has its own agent and database
    - Hidden assumptions and biases re: sampling and more
  • Tools rarely designed for scale & high performance
  • Accelerators and coprocessors often not supported
▪ Making a collection of unrelated tools work together
  • Requires a lot of expertise and scripting
  • Rarely leads to a really easy-to-use and scalable solution
The problem with “Meta-Toolkits”
▪ Meta-Toolkits likely to obfuscate:
  • Assumptions and biases involved in sampling and processing
    - Was interpolation or extrapolation required?
  • Scalability limitations
  • Existing capabilities within a specific toolkit
    - User beware the LCD (lowest-common-denominator) effect!
  • The ongoing burden of management and maintenance
http://insidehpc.com/2014/11/monitoring-hpc-clusters-modernized/
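The question “Was interpolation or extrapolation required?” can be made concrete with a small sketch. Assume two hypothetical tools sample the same quantity on different schedules; any layer that merges them has to estimate values at timestamps where one tool took no reading. The function name `align` and the sample data are illustrative assumptions and are not taken from any of the tools named in these slides.

```python
from bisect import bisect_left

def align(samples, timestamps):
    """Estimate values at `timestamps` from sorted (time, value) samples.

    Returns (time, value, method) tuples, where method records whether the
    value was measured, linearly interpolated, or extrapolated by holding
    the nearest endpoint value.
    """
    times = [t for t, _ in samples]
    out = []
    for t in timestamps:
        i = bisect_left(times, t)
        if i < len(times) and times[i] == t:
            out.append((t, samples[i][1], "measured"))
        elif i == 0:
            out.append((t, samples[0][1], "extrapolated"))
        elif i == len(times):
            out.append((t, samples[-1][1], "extrapolated"))
        else:
            (t0, v0), (t1, v1) = samples[i - 1], samples[i]
            v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
            out.append((t, v, "interpolated"))
    return out

# Hypothetical tool A samples every 30 s; a consumer wants values every 20 s.
tool_a = [(0, 10.0), (30, 40.0), (60, 20.0)]
aligned = align(tool_a, [0, 20, 40, 60, 80])
for row in aligned:
    print(row)
```

Only two of the five returned values were actually measured. A layer that drops the `method` field hides that fact from the user, which is the kind of obfuscation the slide warns about.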
Pressing concerns, real implications
▪ Significant toolkit legacy in HPC
  • Use of meta-toolkits escalating
▪ Hadoop deployments rediscovering toolkit legacy
  • Hadoop monitoring + { Nagios || Ganglia || ??? }
  • Apache Ambari an evolving meta-toolkit
‘Modernized’ monitoring with meta-toolkits?
http://www.hpcwire.com/2014/09/18/modernizing-hpc-cluster-monitoring/
“Those who cannot remember the past are condemned to repeat it”
George Santayana, The Life of Reason, Vol. 1, 1905
Hadoop Users: Stop Settling for the Santayana Effect TODAY!
https://www.linkedin.com/pulse/hadoop-users-stop-settling-santayana-effect-today-ian-lumb
https://lnkd.in/eymE82J
Pressing concerns, real implications (continued)
▪ OpenStack on track to also rediscover the toolkit legacy
http://docs.openstack.org/admin-guide-cloud/content/figures/2/figures/openstack-arch-havana-logical-v1.jpg
OpenStack Architecture (Havana)
A Better Solution
▪ Bright Cluster Manager takes a much more fundamental & integrated approach
  • Designed and written from the ground up
  • Single cluster management agent provides all functionality
  • Single, central database for configuration and monitoring data
  • Single UI for ALL cluster management functionality
▪ Which makes Bright Cluster Manager …
  • Extremely easy to use
  • Extremely scalable
  • Secure & reliable
  • Complete
  • Flexible
  • Maintainable
Bright Cluster Architecture: Monitoring
[Figure: architecture diagram. Labels: CMDaemon; head node; node001, node002, node003; BMC (×3); metrics; raw data; consolidated data; Cluster Management GUI; Cluster Management Shell; Web-Based User Portal; Third-Party Applications.]
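The labels “raw data” and “consolidated data” in the diagram suggest that samples are reduced before longer-term storage. One common way to do that, averaging raw samples into fixed-width time buckets, is sketched below. This is an assumption for illustration only: the slides do not describe how CMDaemon consolidates data, and `consolidate` is a hypothetical name.

```python
from collections import defaultdict

def consolidate(samples, bucket_seconds):
    """Average raw (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) sorted by bucket_start.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        start = (ts // bucket_seconds) * bucket_seconds
        buckets[start].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]

# Hypothetical raw samples: (seconds since start, metric value).
raw = [(0, 1.0), (10, 3.0), (20, 2.0), (60, 4.0), (90, 6.0)]
per_minute = consolidate(raw, 60)
print(per_minute)
```

Averaging is only one possible reduction; keeping the minimum and maximum per bucket as well would preserve short spikes that a mean smooths away.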
Native Metrics for Clusters & Clouds
▪ Over 160 relating to HPC
  • From bare metal to workload managers to apps
    - Includes accelerators and coprocessors
  • On-the-ground and in-the-public-cloud
▪ Over 400 relating to Hadoop
  • From distros, HDFS & YARN to data-platform apps
▪ Almost 90 relating to OpenStack
  • Tenant-specific plus private cloud as-a-whole
▪ Over 60 relating to Ceph
http://www.brightcomputing.com/Linux-Cluster-Monitoring
Monitoring++
▪ Proactive alert-based monitoring
  • Define thresholds for any metric
  • Associate actions with thresholds
    - Actions execute when thresholds are exceeded
▪ Health checks
  • Invasive plus dynamic diagnostics
Cluster monitoring vs. health checking: What’s the difference?
http://info.brightcomputing.com/blog/cluster-monitoring-vs.-health-checking-whats-the-difference
http://www.brightcomputing.com/Linux-Cluster-Health
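The threshold-and-action idea above can be sketched in a few lines. This is a generic illustration, not Bright Cluster Manager's interface: the `Threshold` class, the `evaluate` function, the metric names `cpu_temp_c` and `load_one`, and the `drain_node` action are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Threshold:
    metric: str                           # name of the metric to watch
    upper_bound: float                    # fire when a sample exceeds this
    action: Callable[[str, float], None]  # called with (node, value)

def evaluate(thresholds: List[Threshold], node: str,
             sample: Dict[str, float]) -> int:
    """Run the action of every threshold the sample exceeds; return the count."""
    fired = 0
    for th in thresholds:
        value = sample.get(th.metric)
        if value is not None and value > th.upper_bound:
            th.action(node, value)
            fired += 1
    return fired

log: List[str] = []

def drain_node(node: str, value: float) -> None:
    # Stand-in for a real action, such as removing the node from a job queue.
    log.append(f"drain {node} (value={value})")

rules = [Threshold("cpu_temp_c", 85.0, drain_node),
         Threshold("load_one", 64.0, drain_node)]

fired = evaluate(rules, "node001", {"cpu_temp_c": 91.5, "load_one": 12.0})
print(fired, log)
```

Keeping the action as a plain callable means the same rule can send an alert, run a script, or change node state without any change to the evaluation loop.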
Key Takeaways
▪ Modernized monitoring
▪ Monitoring HPC and Hadoop clusters
▪ Monitoring public and private clouds
▪ Monitoring with alerts and health checks
▪ Customized monitoring, including how to incorporate your own monitors
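For “incorporate your own monitors”, one common pattern is a small script that a monitoring agent runs periodically and that prints a single numeric value. That pattern is assumed here rather than taken from the slides. The sketch reads the 1-minute load average from text in the Linux `/proc/loadavg` format; the function names are hypothetical.

```python
import sys

def one_minute_load(loadavg_text: str) -> float:
    """Return the first field of /proc/loadavg-style text as a float."""
    return float(loadavg_text.split()[0])

def main(path: str = "/proc/loadavg") -> int:
    """Print the metric value; return 0 on success and 1 on failure."""
    try:
        with open(path) as f:
            print(one_minute_load(f.read()))
        return 0
    except (OSError, ValueError, IndexError) as exc:
        # Report the problem on stderr so stdout carries only the value.
        print(f"custom monitor failed: {exc}", file=sys.stderr)
        return 1

# Parsing demonstrated on a literal string so the example runs anywhere.
example = one_minute_load("0.42 0.36 0.30 1/123 4567\n")
print(example)
```

A wrapper script would end with `sys.exit(main())` so that the agent can tell a failed collection from a valid reading by the exit status.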
Q & A
Ian Lumb, ian.lumb@brightcomputing.com
http://www.brightcomputing.com/

 

Bright Topics Webinar April 15, 2015 - Modernized Monitoring for Clusters and Clouds of All Types

  • 1. Modernized Monitoring for Clusters and Clouds of All Types
      Ian Lumb, Product Marketing Manager
      Bright Topics Webinar, April 15, 2015
      RECORDING
  • 2. Key Takeaways
      ▪ Modernized monitoring
      ▪ Monitoring HPC and Hadoop clusters
      ▪ Monitoring public and private clouds
      ▪ Monitoring with alerts and health checks
      ▪ Customized monitoring, including how to incorporate your own monitors
  • 3. The Five Essential Strategies
      1. Plan to manage the impact of software complexity
      2. Plan for scalable growth
      3. Plan to manage heterogeneous hardware/software solutions
      4. Be ready for the Cloud
      5. Have an answer for the Hadoop question
      http://insidehpc.com/2014/05/five-essential-strategies-successful-hpc-clusters/
  • 5. The problem with “Toolkits”
      ▪ Toolkits: “A patchwork of disparate tools”
          • Tools typically used: Ganglia, Nagios, Cfengine, System Imager, Puppet, Chef, Cobbler, Hobbit, Big Brother, Zabbix, etc.
          • Scripts
      ▪ Issues with the “toolkit” approach:
          • Scripts poorly documented and hard to maintain
          • Tools not designed to work together
          • Each tool has its own user interface (CLI/GUI)
          • Each tool has its own agent and database
              Hidden assumptions and biases re: sampling and more
          • Tools rarely designed for scale & high performance
          • Accelerators and coprocessors often not supported
      ▪ Making a collection of unrelated tools work together
          • Requires a lot of expertise and scripting
          • Rarely leads to a really easy-to-use and scalable solution
  • 6. The problem with “Meta-Toolkits”
      ▪ Meta-Toolkits likely to obfuscate
          • Assumptions and biases involved in sampling and processing
              Was interpolation or extrapolation required?
          • Scalability limitations
          • Existing capabilities within a specific toolkit
              User beware the LCD effect!
          • The ongoing burden of management and maintenance
      http://insidehpc.com/2014/11/monitoring-hpc-clusters-modernized/
  • 7. Pressing concerns, real implications
      ▪ Significant toolkit legacy in HPC
          • Use of meta-toolkits escalating
      ▪ Hadoop deployments rediscovering toolkit legacy
          • Hadoop monitoring + { NAGIOS || Ganglia || ??? }
          • Apache Ambari an evolving meta-toolkit
      ‘Modernized’ monitoring with meta-toolkits? http://www.hpcwire.com/2014/09/18/modernizing-hpc-cluster-monitoring/
      “Those who cannot remember the past are condemned to repeat it” (George Santayana, The Life of Reason, Vol. 1, 1905)
      Hadoop Users: Stop Settling for the Santayana Effect TODAY! https://www.linkedin.com/pulse/hadoop-users-stop-settling-santayana-effect-today-ian-lumb
  • 11. Pressing concerns, real implications
      ▪ Significant toolkit legacy in HPC
          • Use of meta-toolkits escalating
      ▪ Hadoop deployments rediscovering toolkit legacy
          • Hadoop monitoring + { NAGIOS || Ganglia || ??? }
          • Apache Ambari an evolving meta-toolkit
      ▪ OpenStack on track to also rediscover the toolkit legacy
      ‘Modernized’ monitoring with meta-toolkits? http://www.hpcwire.com/2014/09/18/modernizing-hpc-cluster-monitoring/
      “Those who cannot remember the past are condemned to repeat it” (George Santayana, The Life of Reason, Vol. 1, 1905)
      Hadoop Users: Stop Settling for the Santayana Effect TODAY! https://www.linkedin.com/pulse/hadoop-users-stop-settling-santayana-effect-today-ian-lumb
  • 14. A Better Solution
      ▪ Bright Cluster Manager takes a much more fundamental & integrated approach
          • Designed and written from the ground up
          • Single cluster management agent provides all functionality
          • Single, central database for configuration and monitoring data
          • Single UI for ALL cluster management functionality
      ▪ Which makes Bright Cluster Manager …
          • Extremely easy to use
          • Extremely scalable
          • Secure & reliable
          • Complete
          • Flexible
          • Maintainable
  • 15. Bright Cluster Architecture: Monitoring
      [Diagram; labels: CMDaemon, head node, node001, node002, node003, BMC (three), metrics, raw data, consolidated data, data, Cluster Management GUI, Cluster Management Shell, Web-Based User Portal, Third-Party Applications]
  • 16. Native Metrics for Clusters & Clouds
      ▪ Over 160 relating to HPC
          • From bare metal to workload managers to apps
              Includes accelerators and coprocessors
          • On-the-ground and in-the-public-cloud
      ▪ Over 400 relating to Hadoop
          • From distros, HDFS & YARN to data-platform apps
      ▪ Almost 90 relating to OpenStack
          • Tenant-specific plus private cloud as-a-whole
      ▪ Over 60 relating to Ceph
      http://www.brightcomputing.com/Linux-Cluster-Monitoring
  • 19. Monitoring++
      ▪ Proactive alert-based monitoring
          • Define thresholds for any metric
          • Associate actions with thresholds
              Actions execute when thresholds exceeded
      ▪ Health checks
          • Invasive plus dynamic diagnostics
      Cluster monitoring vs. health checking: What’s the difference? http://info.brightcomputing.com/blog/cluster-monitoring-vs.-health-checking-whats-the-difference
      http://www.brightcomputing.com/Linux-Cluster-Health
  • 23. Key Takeaways
      ▪ Modernized monitoring
      ▪ Monitoring HPC and Hadoop clusters
      ▪ Monitoring public and private clouds
      ▪ Monitoring with alerts and health checks
      ▪ Customized monitoring, including how to incorporate your own monitors
  • 24. Q & A
      Ian Lumb, ian.lumb@brightcomputing.com
      http://www.brightcomputing.com/
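
The alert-based monitoring described on slide 19 (define a threshold for a metric, associate an action with it, run the action when the threshold is exceeded) can be illustrated with a minimal Python sketch. This is not Bright Cluster Manager's API: the `Threshold` class, the `evaluate` function, the metric names `cpu_temp_c` and `load_one`, and the limits are all hypothetical, chosen only to show the general pattern.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Threshold:
    """A limit on one metric plus the action to run when it is exceeded."""
    metric: str
    limit: float
    action: Callable[[str, str, float], None]  # called with (node, metric, value)


def evaluate(node: str, sample: Dict[str, float], thresholds: List[Threshold]) -> int:
    """Run the action of every threshold whose metric exceeds its limit.

    Returns the number of actions that fired for this sample.
    """
    fired = 0
    for t in thresholds:
        value = sample.get(t.metric)
        # Metrics missing from the sample are skipped rather than treated as zero.
        if value is not None and value > t.limit:
            t.action(node, t.metric, value)
            fired += 1
    return fired


alerts: List[str] = []


def record_alert(node: str, metric: str, value: float) -> None:
    """Hypothetical action: record the alert (a real one might email or drain a node)."""
    alerts.append(f"{node}: {metric}={value}")


# Hypothetical metric names and limits, for illustration only.
thresholds = [
    Threshold("cpu_temp_c", 85.0, record_alert),
    Threshold("load_one", 16.0, record_alert),
]

fired = evaluate("node001", {"cpu_temp_c": 91.5, "load_one": 3.2}, thresholds)
print(fired, alerts)
```

Keeping the action as a plain callable means the same evaluation loop can serve any metric, which loosely mirrors the slide's point that thresholds can be defined for any metric and paired with any action.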