WEBINAR
How CBS Interactive Uses Cloudera Manager to Effectively
Manage their Hadoop Cluster

Wednesday, September 19th, 2012




Manoj Murumkar - Senior Manager, Data Engineering, CBS Interactive
Bala Venkatrao – Director of Products, Cloudera
Agenda

Introductions
CBSi
   • Hadoop Use Case
   • Operational Challenges
   • How Cloudera Manager helps CBSi & Demo
   • Benefits of using Cloudera Manager

Cloudera Manager
   • Overview & Benefits
   • Key Features
   • Roadmap

Q&A


       2                   ©2012 Cloudera, Inc. All Rights Reserved.
Introductions

Manoj Murumkar
Senior Manager, Data Engineering at CBS Interactive
  Manoj has been working with data technologies since 1998. His team is currently responsible for
  providing data infrastructure solutions and operating them for the internet division of CBS Corporation.
  He has been involved with Hadoop for around 3 years, about 2 of them working with
  Cloudera. His team has built big data infrastructure from the ground up that supports clickstream
  analysis using Hadoop Streaming.


Bala Venkatrao
Director, Products at Cloudera
  Bala Venkatrao is part of the product management team at Cloudera and leads the efforts around
  Cloudera Manager. In addition, he is involved in several other initiatives, including customer
  advocacy, partnership development, marketing etc.




Building web analytics for Top 10 global web property
      on Hadoop
       235M worldwide monthly unique users

Challenge
   • Requires advanced analytics on click stream data in near real time
   • Weblog processing time on proprietary platform hit limit while data volumes continuously increased
   • Ability/Cost to store historical data for analyses

Solution
   • Web analytics platform on Hadoop processes >1B global events/day
   • >1PB on Hadoop; 42 nodes
   • Tracking clicks, page views, downloads, streaming video events, ad events, etc.
   • Hadoop Components: HDFS, Hive, MapReduce, Pig, Hadoop Streaming

Results
   • Optimizing what content is placed beside that which user is currently reading
   • Reduced processing time by 6+ hrs. to reach SLA
   • Accommodates 50% data volume increase per year
   • Reduce cost of storing/processing data
   • Greater ad revenues achieved

Source: Hadoop World 2012 presentation. Michael Sun, Lead Software Engineer & Manager of DW Operations, CBS Interactive.
http://www.cloudera.com/resource/hadoop-world-2012-presentation-slides-building-web-analytics-processing-on-hadoop-at-cbs-interactive/
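
To put the slide's figures in perspective, here is a rough back-of-the-envelope sketch. It treats ">1B events/day" as exactly 1e9 (a lower bound) and assumes the 50% yearly increase compounds on the daily event count. The 3-year horizon and the function names are illustrative assumptions, not taken from the presentation.

```python
# Back-of-the-envelope figures derived from the numbers quoted on the slide.
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_second(events_per_day: float) -> float:
    """Average event rate implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


def projected_volume(current: float, annual_growth: float, years: int) -> float:
    """Compound growth: volume after `years` at `annual_growth` per year."""
    return current * (1 + annual_growth) ** years


if __name__ == "__main__":
    daily_events = 1e9  # ">1B global events/day", used here as a lower bound
    print(f"{events_per_second(daily_events):,.0f} events/s on average")
    # 50% growth per year is from the slide; the 3-year horizon is an assumption.
    for year in range(1, 4):
        projected = projected_volume(daily_events, 0.5, year)
        print(f"year {year}: {projected / 1e9:.3f}B events/day")
```

At these rates the daily volume more than triples in three years, which is why the slide lists headroom for growth alongside the processing-time result.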



CBSi Hadoop
 Operational Challenges
Prior to Cloudera Manager
  Lack of…
      Holistic view
      Configuration control
                  No audit trail/history of changes



  Existing solutions were…
      Ganglia, Hadoop web UI pages and custom scripts
                  Difficult to maintain


  No visibility into activity failures
   •   Reactive to user complaints on failed/long running jobs



How Cloudera Manager helps CBSi with
Hadoop Operations
Intuitive visual interface
    Can manage and monitor the whole cluster
          Overall health status/dashboard
          Ability to drill down from services > roles > hosts


Service Monitoring and Alerting
    Makes Hadoop operations pro-active
            Heatmaps provide an easy way to identify outliers

Activity Monitoring
   • Helps identify failed or slow running jobs
            Notify end users of failed jobs and manage SLAs

Workflows
    Simple to add new ‘data nodes’, hosts, clients etc.
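
The idea behind spotting outliers on a heatmap can be sketched in a few lines. The example below flags hosts whose metric sits far from the cluster median, using a modified z-score based on the median absolute deviation. This is a generic technique chosen for illustration, not a description of how Cloudera Manager computes its heatmaps. The host names, the latency values and the 3.5 cutoff are all assumptions.

```python
from statistics import median


def outlier_hosts(metric_by_host: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return hosts whose metric deviates strongly from the cluster median.

    Uses the modified z-score 0.6745 * |x - median| / MAD. If the MAD is zero
    (more than half the hosts report the same value) no host is flagged.
    """
    values = list(metric_by_host.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return sorted(
        host
        for host, value in metric_by_host.items()
        if 0.6745 * abs(value - med) / mad > threshold
    )


if __name__ == "__main__":
    # Hypothetical per-host disk latency in milliseconds.
    latency_ms = {"node01": 10, "node02": 11, "node03": 9, "node04": 10, "node05": 42}
    print(outlier_hosts(latency_ms))
```

A median-based score is used rather than mean and standard deviation so that a single bad host does not mask itself by inflating the spread.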




CLOUDERA MANAGER
CBSi Demo
Key Benefits of
Using Cloudera Manager at CBSi
Lowers the barrier for Hadoop administration
   No need to rely solely on experts

Makes life easier – saves money & time
    Avoid licensing costs associated with managing multiple tools
    Cuts technical and human resource costs
    Reduces time to manage and maintain the cluster

Provides a “one-stop” holistic view
    Easy to understand how the overall cluster is performing

Helps create repeatable processes & workflows for
Hadoop operations
          Improves efficiency of the Operations team


The 6 Characteristics of
Enterprise Grade Hadoop




Why You Need Cloudera Manager


1      COMPLEXITY
       HADOOP IS MORE THAN A DOZEN SERVICES RUNNING
       ACROSS MANY MACHINES
           HUNDREDS OF HARDWARE COMPONENTS
           THOUSANDS OF SETTINGS
           LIMITLESS PERMUTATIONS




2      CONTEXT
       HADOOP IS A SYSTEM, NOT JUST A
       COLLECTION OF PARTS
           EVERYTHING IS INTERRELATED
           RAW DATA ABOUT INDIVIDUAL PIECES IS NOT ENOUGH
           MUST EXTRACT WHAT’S IMPORTANT




3      EFFICIENCY
       MANAGING HADOOP WITH MULTIPLE TOOLS & MANUAL
       PROCESSES TAKES LONGER
           COMPLICATED, ERROR PRONE WORKFLOWS
           LONGER ISSUE RESOLUTION
           LACK OF CONSISTENT AND REPEATABLE PROCESSES




Cloudera Manager Provides
End-to-End CDH Administration



1   DEPLOY
    INSTALL, CONFIGURE AND START YOUR CLUSTER IN 3
    SIMPLE STEPS




2   CONFIGURE & OPTIMIZE
    ENSURE OPTIMAL SETTINGS FOR ALL HOSTS AND
    SERVICES




3   MONITOR, DIAGNOSE & REPORT
    FIND AND FIX PROBLEMS QUICKLY, VIEW CURRENT AND
    HISTORICAL ACTIVITY AND RESOURCE USAGE




Managing Complexity
      One Tool For Everything
[Figure: DO-IT-YOURSELF vs. CLOUDERA ENTERPRISE, compared across: DEPLOYMENT & CONFIGURATION, MONITORING, WORKFLOWS, EVENTS & ALERTS, LOG SEARCH, DIAGNOSTICS, REPORTING, ACTIVITY MONITORING]




“In a recent Cloudera survey, >95% of respondents emphasized the need for a single end-to-end tool to manage their Hadoop Operations”
Providing Context
Raw Data vs. Hadoop Intelligence




1   SMART CONFIGURATION
    AUTO-SETS CONFIGURATIONS & GUARDS AGAINST USER ERROR

2   WORKFLOWS
    ENSURES THAT MULTI-STEP TASKS ARE ACCOMPLISHED COMPLETELY & IN THE CORRECT SEQUENCE

3   DEPENDENCIES
    AWARE OF HOW A PARTICULAR ACTION AFFECTS THE REST OF THE CLUSTER & MANAGES THE IMPACT

4   EVENTS & ALERTS
    MAKES YOU AWARE OF WHAT’S IMPORTANT AT A HADOOP SYSTEM LEVEL

5   HISTORY
    COMPARES CURRENT & PAST ACTIVITIES FOR CONTEXT
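
The workflow and dependency points above come down to ordering: a service should start only after everything it depends on is running, and stop in the reverse order. A minimal sketch of that idea using Python's standard `graphlib` module (Python 3.9+) follows. The dependency map is a simplified illustration made up for this example. It is not taken from the slides and does not describe Cloudera Manager's internals.

```python
from graphlib import TopologicalSorter

# Hypothetical, simplified map: service -> services it depends on.
DEPENDS_ON = {
    "zookeeper": set(),
    "hdfs": set(),
    "mapreduce": {"hdfs"},
    "hbase": {"hdfs", "zookeeper"},
    "hive": {"mapreduce"},
}


def start_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order services so that each one starts after its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Stop in reverse: dependents go down before what they rely on."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", " -> ".join(start_order(DEPENDS_ON)))
    print("stop: ", " -> ".join(stop_order(DEPENDS_ON)))
```

Encoding the dependencies once and deriving both orders from them is what removes the error-prone manual sequencing the slide refers to.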




Cloudera Manager Key Features
            Installs the complete Hadoop stack in minutes via a wizard-based interface

            Gives you complete, end-to-end visibility and control over your Hadoop cluster from a
            single interface

            Allows you to manage multiple clusters from a single instance of Cloudera Manager


            Integrate Cloudera Manager with Active Directory


            Establishes the time context globally for almost all views

            Correlates jobs, activities, logs, system changes, configuration changes and service
            metrics along a single timeline to simplify diagnosis

            Set server roles, configure services and manage security across the cluster

            Gracefully starts, stops and restarts services as needed

            Supports Administrator and Read-Only users

            Maintains a complete record of configuration changes with the ability to roll back to
            previous states

            Monitors dozens of service performance metrics and alerts you when you approach
            critical thresholds
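
To make the "complete record of configuration changes with the ability to roll back" feature concrete, here is a minimal sketch of an append-only configuration history. The class and method names are hypothetical and the design is an assumption for illustration, not Cloudera Manager's implementation. `dfs.replication` is a real HDFS property, but the values and user names are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConfigHistory:
    """Append-only log of configuration revisions (illustrative only)."""

    # Each entry is (timestamp, user, settings snapshot).
    revisions: list = field(default_factory=list)

    def commit(self, user: str, settings: dict) -> int:
        """Record a full snapshot and return its revision number."""
        self.revisions.append((datetime.now(timezone.utc), user, dict(settings)))
        return len(self.revisions) - 1

    def current(self) -> dict:
        """Latest settings, or an empty dict if nothing was committed."""
        return dict(self.revisions[-1][2]) if self.revisions else {}

    def diff(self, old: int, new: int) -> dict:
        """Map each changed key to its (old value, new value) pair."""
        before, after = self.revisions[old][2], self.revisions[new][2]
        return {
            key: (before.get(key), after.get(key))
            for key in before.keys() | after.keys()
            if before.get(key) != after.get(key)
        }

    def rollback(self, user: str, revision: int) -> int:
        """Restore an earlier snapshot by committing it as a new revision."""
        return self.commit(user, self.revisions[revision][2])


if __name__ == "__main__":
    history = ConfigHistory()
    r0 = history.commit("alice", {"dfs.replication": "3"})
    r1 = history.commit("bob", {"dfs.replication": "2"})
    print(history.diff(r0, r1))
    history.rollback("alice", r0)
    print(history.current(), len(history.revisions))
```

Rolling back by adding a new revision, rather than deleting later ones, keeps the audit trail intact: the history still shows who changed what and who reverted it.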

Cloudera Manager Key Features
            Gather, view and search Hadoop logs collected from across the cluster

            Scans Hadoop logs for irregularities and warns you before they impact the cluster

            Creates and aggregates relevant Hadoop events pertaining to system health, log
            messages, user services and activities and makes them available for alerting and searching


            Generates email alerts when certain events occur


            Consolidates all cluster activity into a single, real-time view

            View information pertaining to hosts in your cluster including status, resident memory,
            virtual memory and roles

            Visualize health status and metrics across the cluster to quickly identify problem nodes
            and take action

            Visualize current and historical disk usage by user, group and directory
            Track MapReduce activity on the cluster by job or user

            Takes a snapshot of the cluster state and automatically sends it to Cloudera support to
            assist with resolution

            Easily integrate Cloudera Manager with your existing enterprise-wide management and
            monitoring tools


Cloudera Manager Roadmap
Maintenance mode
Platform Support
   Manage additional services like Flume, Hive etc.

Monitoring
   ZooKeeper monitoring
   Advanced HBase monitoring

Rolling Upgrades
Usability enhancements
   Improved error handling
   Log search enhancements
   Enhanced charting




Why Enterprises are Standardizing on
Cloudera Manager

 1     SIMPLE
       END-TO-END HADOOP ADMINISTRATION IN A SINGLE TOOL




 2     INTELLIGENT
       MANAGES HADOOP AT THE SYSTEM LEVEL - CLOUDERA’S EXPERIENCE
       REALIZED IN SOFTWARE




 3     EFFICIENT
       SIMPLIFIES COMPLEX WORKFLOWS & MAKES
       ADMINISTRATORS MORE EFFICIENT




 4     BEST-IN-CLASS
       THE ONLY ENTERPRISE-GRADE HADOOP MANAGEMENT
       APPLICATION AVAILABLE




Next Steps

• Try out the FREE edition of Cloudera Manager
• Download from:
http://www.cloudera.com/products-services/tools/
• Support available via scm-users@cloudera.org
• For Cloudera Enterprise subscriptions, please
  contact: sales@cloudera.com




Q&A
For more information go to www.cloudera.com
THANK YOU!
We appreciate your time and interest in Cloudera!

For more information: www.cloudera.com
Sales: (888)789-1488

More Related Content

What's hot

Webinar Slides: Geo-Distributed MySQL Clustering Done Right!
Webinar Slides: Geo-Distributed MySQL Clustering Done Right!Webinar Slides: Geo-Distributed MySQL Clustering Done Right!
Webinar Slides: Geo-Distributed MySQL Clustering Done Right!Continuent
 
Webinar slides: Introduction to Database Proxies (for MySQL)
Webinar slides: Introduction to Database Proxies (for MySQL)Webinar slides: Introduction to Database Proxies (for MySQL)
Webinar slides: Introduction to Database Proxies (for MySQL)Continuent
 
SQL Server Disaster Recovery on Azure - SQL Saturday 921
SQL Server Disaster Recovery on Azure - SQL Saturday 921SQL Server Disaster Recovery on Azure - SQL Saturday 921
SQL Server Disaster Recovery on Azure - SQL Saturday 921Marco Obinu
 
MariaDB on Docker
MariaDB on DockerMariaDB on Docker
MariaDB on DockerMariaDB plc
 
Why Distributed Databases?
Why Distributed Databases?Why Distributed Databases?
Why Distributed Databases?Sargun Dhillon
 
AWS re:Invent 2016: Open-Source Resources (DCS201)
AWS re:Invent 2016: Open-Source Resources (DCS201)AWS re:Invent 2016: Open-Source Resources (DCS201)
AWS re:Invent 2016: Open-Source Resources (DCS201)Amazon Web Services
 
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...Continuent
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterContinuent
 
Oozie @ Riot Games
Oozie @ Riot GamesOozie @ Riot Games
Oozie @ Riot GamesMatt Goeke
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Govind Kanshi
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interactionGovind Kanshi
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approachesadunne
 
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012Big Data Spain
 
Cloudify Open PaaS Stack for DevOps
Cloudify Open PaaS Stack for DevOps  Cloudify Open PaaS Stack for DevOps
Cloudify Open PaaS Stack for DevOps Nati Shalom
 
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Continuent
 
DC/OS 1.8 Container Networking
DC/OS 1.8 Container NetworkingDC/OS 1.8 Container Networking
DC/OS 1.8 Container NetworkingSargun Dhillon
 

What's hot (20)

Webinar Slides: Geo-Distributed MySQL Clustering Done Right!
Webinar Slides: Geo-Distributed MySQL Clustering Done Right!Webinar Slides: Geo-Distributed MySQL Clustering Done Right!
Webinar Slides: Geo-Distributed MySQL Clustering Done Right!
 
Webinar slides: Introduction to Database Proxies (for MySQL)
Webinar slides: Introduction to Database Proxies (for MySQL)Webinar slides: Introduction to Database Proxies (for MySQL)
Webinar slides: Introduction to Database Proxies (for MySQL)
 
SQL Server Disaster Recovery on Azure - SQL Saturday 921
SQL Server Disaster Recovery on Azure - SQL Saturday 921SQL Server Disaster Recovery on Azure - SQL Saturday 921
SQL Server Disaster Recovery on Azure - SQL Saturday 921
 
MariaDB on Docker
MariaDB on DockerMariaDB on Docker
MariaDB on Docker
 
Intro to Databases
Intro to DatabasesIntro to Databases
Intro to Databases
 
Why Distributed Databases?
Why Distributed Databases?Why Distributed Databases?
Why Distributed Databases?
 
AWS re:Invent 2016: Open-Source Resources (DCS201)
AWS re:Invent 2016: Open-Source Resources (DCS201)AWS re:Invent 2016: Open-Source Resources (DCS201)
AWS re:Invent 2016: Open-Source Resources (DCS201)
 
Openstack summit 2015
Openstack summit 2015Openstack summit 2015
Openstack summit 2015
 
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...
Webinar Slides: Multi-Region AWS Aurora vs Continuent Tungsten for MySQL & Ma...
 
March 2011 HUG: Scaling Hadoop
March 2011 HUG: Scaling HadoopMarch 2011 HUG: Scaling Hadoop
March 2011 HUG: Scaling Hadoop
 
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera ClusterWebinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
Webinar Slides: MySQL HA/DR/Geo-Scale - High Noon #2: Galera Cluster
 
Oozie @ Riot Games
Oozie @ Riot GamesOozie @ Riot Games
Oozie @ Riot Games
 
Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)Mtc learnings from isv & enterprise (dated - Dec -2014)
Mtc learnings from isv & enterprise (dated - Dec -2014)
 
Mtc learnings from isv & enterprise interaction
Mtc learnings from isv & enterprise  interactionMtc learnings from isv & enterprise  interaction
Mtc learnings from isv & enterprise interaction
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
 
Scalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and ApproachesScalable Web Architectures: Common Patterns and Approaches
Scalable Web Architectures: Common Patterns and Approaches
 
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012
Architecture to Scale. DONN ROCHETTE at Big Data Spain 2012
 
Cloudify Open PaaS Stack for DevOps
Cloudify Open PaaS Stack for DevOps  Cloudify Open PaaS Stack for DevOps
Cloudify Open PaaS Stack for DevOps
 
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
 
DC/OS 1.8 Container Networking
DC/OS 1.8 Container NetworkingDC/OS 1.8 Container Networking
DC/OS 1.8 Container Networking
 

Viewers also liked

Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of ThingsCloudera, Inc.
 
Oracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra PasalapudiOracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra Pasalapudipasalapudi123
 
Oracle database 12c intro
Oracle database 12c introOracle database 12c intro
Oracle database 12c intropasalapudi
 
Introduction to Oracle Clusterware 12c
Introduction to Oracle Clusterware 12cIntroduction to Oracle Clusterware 12c
Introduction to Oracle Clusterware 12cGuatemala User Group
 
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...Cloudera, Inc.
 
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...Cloudera, Inc.
 

Viewers also liked (7)

Data guard
Data guardData guard
Data guard
 
Cloudera for Internet of Things
Cloudera for Internet of ThingsCloudera for Internet of Things
Cloudera for Internet of Things
 
Oracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra PasalapudiOracle database 12c introduction- Satyendra Pasalapudi
Oracle database 12c introduction- Satyendra Pasalapudi
 
Oracle database 12c intro
Oracle database 12c introOracle database 12c intro
Oracle database 12c intro
 
Introduction to Oracle Clusterware 12c
Introduction to Oracle Clusterware 12cIntroduction to Oracle Clusterware 12c
Introduction to Oracle Clusterware 12c
 
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
 
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
Part 2: Apache Kudu: Extending the Capabilities of Operational and Analytic D...
 

Similar to How CBS Interactive uses Cloudera Manager to effectively manage their Hadoop cluster

What the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityWhat the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityCloudera, Inc.
 
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...Cloudera, Inc.
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanJim Kaskade
 
Amr Awadallah, unSEXY Presentation
Amr Awadallah, unSEXY PresentationAmr Awadallah, unSEXY Presentation
Amr Awadallah, unSEXY Presentation500 Startups
 
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...CA Technologies
 
Transforming Your Business Management with Cloud Computing
Transforming Your Business Management with Cloud ComputingTransforming Your Business Management with Cloud Computing
Transforming Your Business Management with Cloud ComputingInnoTech
 
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo SlidesWebinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo Slides
Webinar | From Zero to Big Data Answers in Less Than an Hour – Live Demo SlidesCloudera, Inc.
 
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...
The Shared Elephant - Hadoop as a Shared Service for Multiple Departments – I...Impetus Technologies
 
Realizing the Promise of Big Data with Hadoop - Cloudera Summer Webinar Serie...
Realizing the Promise of Big Data with Hadoop - Cloudera Summer Webinar Serie...Realizing the Promise of Big Data with Hadoop - Cloudera Summer Webinar Serie...
Realizing the Promise of Big Data with Hadoop - Cloudera Summer Webinar Serie...Cloudera, Inc.
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Integrating Hadoop Into the Enterprise – Hadoop Summit 2012
Integrating Hadoop Into the Enterprise – Hadoop Summit 2012Jonathan Seidman
 
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...TheInevitableCloud
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderainevitablecloud
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformCloudera, Inc.
 
Implementing cloud based devops for distributed agile projects
Implementing cloud based devops for distributed agile projectsImplementing cloud based devops for distributed agile projects
Implementing cloud based devops for distributed agile projectsTom Stiehm
 
Postgres Plus Cloud Database
Postgres Plus Cloud DatabasePostgres Plus Cloud Database
Postgres Plus Cloud DatabaseGary Carter
 
Bridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldBridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldCA Technologies
 

Similar to How CBS Interactive uses Cloudera Manager to effectively manage their Hadoop cluster (20)

What the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityWhat the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and Visibility
 
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...
Strata + Hadoop World 2012: Taming the Elephant - Learn how Monsanto manages ...
 
Vmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps IronfanVmware Serengeti - Based on Infochimps Ironfan
Vmware Serengeti - Based on Infochimps Ironfan
 
Amr Awadallah, unSEXY Presentation
Amr Awadallah, unSEXY PresentationAmr Awadallah, unSEXY Presentation
Amr Awadallah, unSEXY Presentation
 
Day 2 p3 - automation
Day 2   p3 - automationDay 2   p3 - automation
Day 2 p3 - automation
 
Day 2 p3 - automation
Day 2   p3 - automationDay 2   p3 - automation
Day 2 p3 - automation
 
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
Sneak Peak into Self-Service, Cross-Enterprise, Job Scheduling with CA Worklo...
 
Transforming Your Business Management with Cloud Computing
How CBS Interactive uses Cloudera Manager to effectively manage their Hadoop cluster

  • 1. WEBINAR: How CBS Interactive Uses Cloudera Manager to Effectively Manage their Hadoop Cluster. Wednesday, September 19th, 2012. Presenters: Manoj Murumkar, Senior Manager, Data Engineering, CBS Interactive; Bala Venkatrao, Director of Products, Cloudera.
  • 2. Agenda: Introductions; CBSi (Hadoop Use Case, Operational Challenges, How Cloudera Manager helps CBSi & Demo, Benefits of using Cloudera Manager); Cloudera Manager (Overview & Benefits, Key Features, Roadmap); Q&A. ©2012 Cloudera, Inc. All Rights Reserved.
  • 3. Introductions. Manoj Murumkar, Senior Manager, Data Engineering at CBS Interactive: Manoj has been working with data technologies since 1998. His team is currently responsible for providing data infrastructure solutions and operating them for the internet division of CBS Corporation. He has been involved with Hadoop for around 3 years, around 2 of them working with Cloudera. His team has built a big data infrastructure from the ground up that supports clickstream analysis using Hadoop Streaming. Bala Venkatrao, Director, Products at Cloudera: Bala is part of the product management team at Cloudera and leads the efforts around Cloudera Manager. In addition, he is involved in several other initiatives, including customer advocacy, partnership development, and marketing.
  • 4. Building web analytics for Top 10 global web property on Hadoop (235M worldwide monthly unique users). Challenge: requires advanced analytics on click stream data in near real time; weblog processing time on proprietary platform hit limit while data volumes continuously increased; ability/cost to store historical data for analyses. Solution: web analytics platform on Hadoop processes >1B global events/day; >1PB on Hadoop, 42 nodes; tracking clicks, page views, downloads, streaming video events, ad events, etc.; Hadoop components: HDFS, Hive, MapReduce, Pig, Hadoop Streaming. Results: optimizing what content is placed beside that which the user is currently reading; reduced processing time by 6+ hrs. to reach SLA; accommodates 50% data volume increase per year; reduced cost of storing/processing data; greater ad revenues achieved. Source: Hadoop World 2012 presentation. Michael Sun, Lead Software Engineer & Manager of DW Operations, CBS Interactive. http://www.cloudera.com/resource/hadoop-world-2012-presentation-slides-building-web-analytics-processing-on-hadoop-at-cbs-interactive/
  • 5. CBSi Hadoop Operational Challenges Prior to Cloudera Manager. Lack of: a holistic view; configuration control; an audit trail/history of changes. Existing solutions were: Ganglia, Hadoop web UI pages and custom scripts; difficult to maintain. No visibility into activity failures: reactive to user complaints on failed/long-running jobs.
  • 6. How Cloudera Manager helps CBSi with Hadoop Operations. Intuitive visual interface: can manage and monitor the whole cluster; overall health status/dashboard; ability to drill down from services > roles > hosts. Service Monitoring and Alerting: makes Hadoop operations proactive; heatmaps provide an easy way to identify outliers. Activity Monitoring: helps identify failed or slow-running jobs; notify end users of failed jobs and manage SLAs. Workflows: simple to add new 'data nodes', hosts, clients, etc.
  • 8. Key Benefits of Using Cloudera Manager at CBSi. Lowers the barrier for Hadoop administration: no need to rely solely on experts. Makes life easier, saves money & time: avoids licensing costs associated with managing multiple tools; cuts technical and human resource costs; reduces time to manage and maintain the cluster. Provides a "one-stop" holistic view: easy to understand how the overall cluster is performing. Helps create repeatable processes & workflows for Hadoop operations: improves efficiency of the Operations team.
  • 9. The 6 Characteristics of Enterprise Grade Hadoop
  • 10. Why You Need Cloudera Manager. 1 COMPLEXITY: HADOOP IS MORE THAN A DOZEN SERVICES RUNNING ACROSS MANY MACHINES (HUNDREDS OF HARDWARE COMPONENTS; THOUSANDS OF SETTINGS; LIMITLESS PERMUTATIONS). 2 CONTEXT: HADOOP IS A SYSTEM, NOT JUST A COLLECTION OF PARTS (EVERYTHING IS INTERRELATED; RAW DATA ABOUT INDIVIDUAL PIECES IS NOT ENOUGH; MUST EXTRACT WHAT'S IMPORTANT). 3 EFFICIENCY: MANAGING HADOOP WITH MULTIPLE TOOLS & MANUAL PROCESSES TAKES LONGER (COMPLICATED, ERROR-PRONE WORKFLOWS; LONGER ISSUE RESOLUTION; LACK OF CONSISTENT AND REPEATABLE PROCESSES).
  • 11. Cloudera Manager Provides End-to-End CDH Administration. 1 DEPLOY: INSTALL, CONFIGURE AND START YOUR CLUSTER IN 3 SIMPLE STEPS. 2 CONFIGURE & OPTIMIZE: ENSURE OPTIMAL SETTINGS FOR ALL HOSTS AND SERVICES. 3 MONITOR, DIAGNOSE & REPORT: FIND AND FIX PROBLEMS QUICKLY, VIEW CURRENT AND HISTORICAL ACTIVITY AND RESOURCE USAGE.
  • 12. Managing Complexity: One Tool For Everything. DEPLOYMENT & ACTIVITY MONITORING WORKFLOWS EVENTS & ALERTS LOG SEARCH DIAGNOSTICS REPORTING CONFIGURATION MONITORING DO-IT-YOURSELF + CLOUDERA ENTERPRISE. "In a recent Cloudera survey, >95% of respondents emphasized the need for a single end-to-end tool to manage their Hadoop Operations"
  • 13. Providing Context: Raw Data vs. Hadoop Intelligence. 1 SMART CONFIGURATION: AUTO-SETS CONFIGURATIONS & GUARDS AGAINST USER ERROR. 2 WORKFLOWS: ENSURES THAT MULTI-STEP TASKS ARE ACCOMPLISHED COMPLETELY & IN THE CORRECT SEQUENCE. 3 DEPENDENCIES: AWARE OF HOW A PARTICULAR ACTION AFFECTS THE REST OF THE CLUSTER & MANAGES THE IMPACT. 4 EVENTS & ALERTS: MAKES YOU AWARE OF WHAT'S IMPORTANT AT A HADOOP SYSTEM LEVEL. 5 HISTORY: COMPARES CURRENT & PAST ACTIVITIES FOR CONTEXT.
  • 14. Cloudera Manager Key Features: Installs the complete Hadoop stack in minutes via a wizard-based interface; Gives you complete, end-to-end visibility and control over your Hadoop cluster from a single interface; Allows you to manage multiple clusters from a single instance of Cloudera Manager; Integrates with Active Directory; Establishes the time context globally for almost all views; Correlates jobs, activities, logs, system changes, configuration changes and service metrics along a single timeline to simplify diagnosis; Lets you set server roles, configure services and manage security across the cluster; Gracefully starts, stops and restarts services as needed; Supports Administrator and Read-Only users; Maintains a complete record of configuration changes with the ability to roll back to previous states; Monitors dozens of service performance metrics and alerts you when you approach critical thresholds.
  • 15. Cloudera Manager Key Features: Gather, view and search Hadoop logs collected from across the cluster; Scans Hadoop logs for irregularities and warns you before they impact the cluster; Creates and aggregates relevant Hadoop events pertaining to system health, log messages, user services and activities and makes them available for alerting and searching; Generates email alerts when certain events occur; Consolidates all cluster activity into a single, real-time view; View information pertaining to hosts in your cluster including status, resident memory, virtual memory and roles; Visualize health status and metrics across the cluster to quickly identify problem nodes and take action; Visualize current and historical disk usage by user, group and directory; Track MapReduce activity on the cluster by job or user; Takes a snapshot of the cluster state and automatically sends it to Cloudera support to assist with resolution; Easily integrate Cloudera Manager with your existing enterprise-wide management and monitoring tools.
  • 16. Cloudera Manager Roadmap. Maintenance mode. Platform Support: manage additional services like Flume, Hive, etc. Monitoring: ZooKeeper monitoring; advanced HBase monitoring. Rolling Upgrades. Usability enhancements: improved error handling; log search enhancements; enhanced charting.
  • 17. Why Enterprises are Standardizing on Cloudera Manager. 1 SIMPLE: END-TO-END HADOOP ADMINISTRATION IN A SINGLE TOOL. 2 INTELLIGENT: MANAGES HADOOP AT THE SYSTEM LEVEL, CLOUDERA'S EXPERIENCE REALIZED IN SOFTWARE. 3 EFFICIENT: SIMPLIFIES COMPLEX WORKFLOWS & MAKES ADMINISTRATORS MORE EFFICIENT. 4 BEST-IN-CLASS: THE ONLY ENTERPRISE-GRADE HADOOP MANAGEMENT APPLICATION AVAILABLE.
  • 18. Next Steps: Try out the FREE edition of Cloudera Manager; download from http://www.cloudera.com/products-services/tools/; support is available via scm-users@cloudera.org; for Cloudera Enterprise subscriptions, please contact sales@cloudera.com.
  • 19. Q&A For more information go to www.cloudera.com
  • 20. THANK YOU! We appreciate your time and interest in Cloudera! For more information: www.cloudera.com Sales: (888)789-1488
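
Slides 6 and 14 mention heatmaps for spotting outlier hosts and alerts when service metrics approach critical thresholds. The deck does not show how Cloudera Manager implements either, so the following is only a minimal Python sketch of the general idea, not Cloudera Manager's API. The function names, host names, disk-usage figures and the median-absolute-deviation rule are all assumptions made for illustration.

```python
from statistics import median


def find_outlier_hosts(metrics, tolerance=3.0):
    """Return hosts whose metric is far from the cluster median.

    Hypothetical helper: "far" means more than `tolerance` times the
    median absolute deviation (MAD), an assumed rule chosen for this sketch.
    """
    values = list(metrics.values())
    mid = median(values)
    mad = median(abs(v - mid) for v in values)
    if mad == 0:
        # All typical hosts agree exactly; anything different is an outlier.
        return sorted(h for h, v in metrics.items() if v != mid)
    return sorted(h for h, v in metrics.items() if abs(v - mid) / mad > tolerance)


def hosts_over_threshold(metrics, threshold):
    """Return hosts whose metric has reached a fixed alert threshold."""
    return sorted(h for h, v in metrics.items() if v >= threshold)


# Made-up disk usage percentages for five hypothetical data nodes.
disk_used_pct = {
    "node01": 61.0,
    "node02": 63.5,
    "node03": 59.8,
    "node04": 94.2,
    "node05": 62.1,
}

print(find_outlier_hosts(disk_used_pct))          # ['node04']
print(hosts_over_threshold(disk_used_pct, 90.0))  # ['node04']
```

Comparing each host against the cluster median rather than the mean keeps a single bad node from shifting the baseline, which is roughly what a heatmap lets an operator do by eye.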

Editor's Notes

  1. CBS Interactive
  2. Cost – just because you can doesn’t mean you should. You could cut your grass with hedge clippers. But why would you?