Opower, a Cloudera customer, discusses how they implemented a scalable energy analysis platform that generates personalized insights for millions of people. To date, Opower’s insights have collectively saved over 5 terawatt-hours of energy and $500 million in energy bills.
2. Agenda
Why and How to Operationalize Analytics
Opower – Personalized Energy Usage Insights
Opower - Before, After, and Lessons Learned
Live Q&A
Speakers
TJ Laher
Product Marketing at Cloudera
Scott Kuehn
Data Architect at Opower
Get Social
#ClouderaWebinars
3. Why Automate Insights?
• Unlock Competitive Advantages
• Decision Point Analytics
• Increase Data Returns
4. The Process of Operationalizing Analytics
[Diagram of the two flows]
• Analyst Discovery Flow: Data Generation → Batch Processing → Data Discovery → Analysis Technique → Batch Processing → Report, Model, or Rules
• Operational Analytics Flow: Data Generation → Stream or Batch Processing → Respond to Data → Feed Data Application → Optimize Report, Model, or Rules
5. Preparing for Operational Analytics
[Diagram of the pipeline]
• Data Sources: Structured, Unstructured
• Data Analysis: Data Processing & Storage (Batch, Stream); Human Data Discovery; Machine Response (Single Analysis); Optimize, Extend, Innovate
• Data Serving: Data Store, Applications
7. Opower Overview
A Software as a Service Customer Engagement Platform
The Company
• Serving 95+ utilities in 9 countries
• Over 5TWh saved to date
• 40% of US household data under management, totaling 300 billion reads
Our DNA
• Behavioral science software
• Data analytics
• Consumer marketing
• User-centric design
11. Insight Creation Environments
[Diagram of the two environments]
• Product Calculation and Delivery: External Feeds → Insight Calculation (MR over HBase) → Insight Delivery
• Offline Analysis and Experimentation (fed via HBase Export): Hive, BI, raw MR, and batch tools over HDFS → Reporting and non-product insights
12. What does this mean to end users?
[Charts: Pre-Hadoop vs. Modern Hadoop]
• Batch Analytic Calculations: days pre-Hadoop vs. hours (charted on a 12/24/48-hour scale) with modern Hadoop
• Individual Insight Query Latency: ~3 seconds pre-Hadoop vs. ~10 ms with modern Hadoop
• Analytic Development Time: months pre-Hadoop vs. weeks with modern Hadoop
13. Key Lessons Learned: External Support
Cloudera support:
1. Issue resolution and escalation
2. Backport critical patches
3. Tuning and configuration guidance
Apache HBase community:
1. Community support channels
2. New features, bug fixes
3. Roadmap planning
14. Key Lessons Learned: Cluster operations
1. Cloudera Manager is useful: alerts, log collection, metrics
2. Upgrade often (safely)
3. Off-cluster data backup/replication
Customized charting via CM UI
Opower Intro: Who is Opower and what does Opower do?
Produce energy insights to help utilities and customers manage energy consumption.
100+ million meter reads are received daily. Millions of individual insight calculations are routinely created, from simple trending analytics to more advanced forecasting/prediction.
Energy saved: 5+ TWh, $500M in energy bill savings, >6 billion lbs of CO2
Product lines:
Consumer engagement
Energy efficiency
Demand Response
Hadoop-based insights are a critical portion of each of these product lines.
Transition: Some example Hadoop-based insights:
Two examples of Opower’s personalized insights that use Hadoop components: neighbor comparisons and unusual usage alerts
Billions of energy usage reads are stored in HBase, along with the insights derived from them. Insights are served directly from HBase.
Unusual usage alerts were the first use case for HBase/Hadoop. We sold a deal that required us to generate “unusual usage alerts” at a scale we had not yet reached.
UUA are email or phone messages we send to let customers know if they are trending toward higher-than-usual energy usage
We also project the bill for them and can let them know if they are going to pay more than expected
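As a sketch of the alert logic described above (purely illustrative: the function names, the linear extrapolation, and the 30% threshold are assumptions, not Opower’s actual rules):

```python
# Illustrative sketch (not Opower's actual code): flag a household as
# "unusual usage" when the projected bill-period total exceeds the
# historical baseline by some threshold.

def project_period_total(usage_so_far, days_elapsed, days_in_period):
    """Linearly extrapolate usage for the rest of the bill period."""
    daily_rate = sum(usage_so_far) / days_elapsed
    return daily_rate * days_in_period

def is_unusual(projected_kwh, baseline_kwh, threshold=1.3):
    """Alert when the projection exceeds baseline by, say, 30%."""
    return projected_kwh > baseline_kwh * threshold

# 10 days into a 30-day period, 120 kWh used so far; baseline is 250 kWh.
projected = project_period_total([12.0] * 10, days_elapsed=10, days_in_period=30)
print(projected)                                  # 360.0 kWh for the full period
print(is_unusual(projected, baseline_kwh=250.0))  # True -> send an alert
```

The same per-household calculation, run as a map step over every row, is what the batch pipeline produces at scale.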
Transition: The initial architecture we built to calculate and deliver this insight
Hadoop has been used in production at Opower since 2012.
Overview of the end-to-end architecture: data is copied from single-tenant MySQL databases into HBase. MySQL is single-tenant (one database per Opower client), and we have more than 100 MySQL databases in production. Batch clients read from HBase. Other workloads also run on this cluster, in an attempt to avoid supporting separate clusters per workload. Sqoop runs as a MapReduce job that reads data from MySQL and writes it to another store, such as Hive/HDFS or, in our case, HBase.
Challenges:
Sqoop ingest introduced a lot of memory pressure on the region servers and traffic on the MySQL read slaves. Care was needed to avoid excessive MySQL load from Sqoop queries, as the databases serve other critical apps
Queries required long multi-row scans and aggregations. Lots of tuning was necessary, such as increasing region file sizes, memstore sizes, and heap sizes, disabling major compactions, and enabling HDFS short-circuit reads
Composite row keys with timestamps in them meant we were thinking about HBase more like a relational table than the big sorted map it is
We had supporting data in single-tenant tables because we were Sqooping it over from the MySQL databases
Because of how we designed the schema, we needed multiple tables to store the data
Single-tenant tables added operational overhead and made it difficult to track bottlenecks in the process
Initial support for ad-hoc MR jobs via Hive was quickly removed due to unmanageable load
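The composite-row-key challenge above comes down to how HBase orders data: rows are sorted lexicographically by byte key, so the key layout determines scan locality. A minimal illustration (the key format here is hypothetical):

```python
# Illustrative sketch: HBase stores rows sorted lexicographically by byte
# key. A composite key of (meter_id, zero-padded timestamp) keeps one
# meter's readings contiguous, so a per-meter history becomes a short
# range scan instead of scattered point reads.

def row_key(meter_id, epoch_seconds):
    # Zero-pad the timestamp so lexicographic order matches numeric order.
    return f"{meter_id}:{epoch_seconds:010d}".encode()

keys = [
    row_key("meter-042", 1_400_000_100),
    row_key("meter-007", 1_400_000_000),
    row_key("meter-042", 1_400_000_000),
    row_key("meter-007", 1_400_000_200),
]
for k in sorted(keys):  # the order HBase would store them in
    print(k.decode())
# meter-007:1400000000
# meter-007:1400000200
# meter-042:1400000000
# meter-042:1400000100
```

Treating such keys like relational index columns works, but every multi-row aggregation still pays the cost of traversing many rows, which is the pain point the entity-centric schema later removed.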
This architecture has been successful, but difficult to scale. The HBase schema was difficult to extend to support new insights, and there was no story for offline analytics and experimentation.
Transition: V2, the modern Opower Hadoop architecture, addresses these issues
Overview/walkthrough of the major components: usage data is collected from the utility and directly ingested into HBase via bulkloading MR jobs. [Explain bulkloading] Data is stored in an entity-centric table, where each entity is a single HBase row containing the energy usage history for a household and any analytics derived from that usage, such as bill forecasts and neighbor comparisons. MapReduce jobs periodically refresh these analytics, but some are also refreshed on demand in a streaming fashion as insights are queried. Data is replicated to the data warehouse cluster via a combination of HBase replication (for direct puts) and an HFile distcp step during the initial bulkload ingest (not pictured).
Full, multi-tenant datasets are now available to be analyzed in the data warehouse, which has enabled new offline analytics such as product eligibility calculations and a general test bed for experimenting with new insights. There is no longer a need to painstakingly collect data from multiple sources or worry about crashing a MySQL slave when running a full table scan.
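A rough sketch of the bulkloading idea mentioned above (this is not the real HFile format, just the sorting invariant that lets a region server adopt files without going through its write path):

```python
# Illustrative sketch of why bulkloading is cheap for the region server:
# the MR job sorts all cells by key up front and writes HFile-like files,
# so the server simply adopts the finished files -- no per-Put RPCs, no
# memstore flushes, no write-ahead-log traffic.

cells = [
    (b"meter-042:1400000000", b"12.8"),
    (b"meter-007:1400000000", b"11.2"),
    (b"meter-007:1400000200", b"13.1"),
]

def build_hfile_like(cells):
    """Sort cells by row key -- the invariant an HFile must satisfy."""
    return sorted(cells, key=lambda kv: kv[0])

hfile = build_hfile_like(cells)
# Every adjacent pair is in order, as a region server would require.
assert all(hfile[i][0] <= hfile[i + 1][0] for i in range(len(hfile) - 1))
print([k.decode() for k, _ in hfile])
# ['meter-007:1400000000', 'meter-007:1400000200', 'meter-042:1400000000']
```

In the real pipeline this sort is the shuffle phase of the MR job, and the same output files can be distcp’d to the warehouse cluster, which is why both clusters stay fresh.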
Improvements:
Write-path performance via bulkloading: less GC pressure in the region server, no memstore flushes, and fewer RPCs/round-trips to the database. Simultaneous bulkloading via distcp into the data warehouse HBase instance means the data warehouse has fresh data.
The entity-centric HBase schema provides the ability to add new analytics/insights in a scalable manner. Data used to derive a personalized insight is stored in a single HBase row, providing data locality for scans and eliminating the HBase overhead of multi-row traversals and aggregations.
Secondary analytics were moved to the data warehouse, reducing the memory pressure and task contention on the service cluster. MR jobs on the service cluster are specific to the generation of personalized insights served at low latency.
The new architecture has worked, but there are still areas we want to improve, such as automation and ETL tooling that will make it easier to load new datasets and create new insights.
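The entity-centric schema can be pictured as follows (the qualifier names and percentage math are hypothetical): the raw usage and the derived insight live in one row, so serving a neighbor comparison is a single-row read.

```python
# Illustrative sketch of an entity-centric row: one HBase row per
# household carries both the raw usage history and the insights derived
# from it, so one row read serves the whole insight.

household_row = {
    "usage:2014-05": 310.0,   # monthly kWh readings
    "usage:2014-06": 295.0,
}

def write_neighbor_comparison(row, neighbor_avg_kwh):
    """Derive the neighbor-comparison insight and store it in the same row."""
    latest = row["usage:2014-06"]
    row["insight:vs_neighbors_pct"] = round(
        100.0 * latest / neighbor_avg_kwh - 100.0, 1
    )
    return row

write_neighbor_comparison(household_row, neighbor_avg_kwh=340.0)
print(household_row["insight:vs_neighbors_pct"])  # -13.2 -> 13.2% below neighbors
```

Adding a new insight then means adding a new qualifier to the row, rather than a new table, which is what makes the schema easy to extend.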
Transition: This new architecture enables two distinct environments for creating new data insights
Product calculations are built as producer-style MapReduce jobs, reading and writing to the same HBase row. For example, a trend in energy usage for the current bill period will be derived from the usage data present in the row and used to forecast the customer’s energy consumption and spending for the current period.
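A producer-style step like the bill-period forecast above might look like this in miniature (the field names and the linear extrapolation are illustrative assumptions):

```python
# Illustrative producer-style calculation: read the current-period usage
# out of an entity row, extrapolate the bill-period total, and write the
# forecast back into the same row.

def forecast_step(row, days_in_period=30):
    usage = row["usage_kwh"]          # readings so far this period
    daily = sum(usage) / len(usage)
    row["insight:forecast_kwh"] = daily * days_in_period
    return row

row = {"usage_kwh": [10.0, 11.0, 12.0]}   # 3 days into the period
forecast_step(row)
print(row["insight:forecast_kwh"])        # 330.0
```

Because each step only touches its own row, the job parallelizes cleanly across regions, which is what makes the producer pattern so amenable to MapReduce.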
Insights are accessed by a service query layer. A template HBase service container can be easily extended to create service APIs for different insight products. Service client applications are used by reporting pipelines and embedded web components.
Offline analysis and experimentation occur in the data warehouse. Hive, BI tools (Platfora, Datameer), and raw MapReduce jobs are used to create aggregate reports and non-product analytics such as customer program eligibility.
These tools are also used for ad-hoc analysis of full energy usage datasets, such as electric car charging trends or the impact of the Super Bowl on energy consumption.
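The kind of ad-hoc question above becomes a simple scan-and-aggregate once the full multi-tenant dataset sits in the warehouse (data shapes here are hypothetical; in practice this would be a Hive query or MR job):

```python
# Illustrative offline aggregation: compare average hourly usage on game
# day to an ordinary Sunday, over the full dataset, with no production
# MySQL slaves at risk.
from collections import defaultdict

readings = [
    # (household, date, hour, kwh)
    ("h1", "2014-02-02", 18, 1.9),  # Super Bowl Sunday
    ("h2", "2014-02-02", 18, 2.4),
    ("h1", "2014-01-26", 18, 1.1),  # ordinary Sunday baseline
    ("h2", "2014-01-26", 18, 1.3),
]

def avg_kwh_by_date(rows, hour):
    """Group readings for one hour of day by date and average them."""
    totals, counts = defaultdict(float), defaultdict(int)
    for _, date, h, kwh in rows:
        if h == hour:
            totals[date] += kwh
            counts[date] += 1
    return {d: round(totals[d] / counts[d], 2) for d in totals}

print(avg_kwh_by_date(readings, hour=18))
# {'2014-02-02': 2.15, '2014-01-26': 1.2}
```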
In the future we look to link the two systems, enabling analytics developed offline to be ‘promoted’ to product calculations.
Transition: What’s been the result of the switch to a Hadoop architecture?
Batch analytics calculated via the producer pattern are much more amenable to MapReduce parallelization and take advantage of HBase row locality. Run times dropped significantly. Some jobs could be made multi-tenant, which makes them easier to operate.
Individual insight query latency dropped from several seconds to ~10 ms. Our performance tests measure at the 99.999th percentile of the latency tail, so the average time is even faster. Query latency has been critical for SOA-model SLAs, since multiple external services access this data in real time.
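Measuring at a high percentile of the latency tail, rather than the average, is the point above; a minimal sketch (nearest-rank method, sample numbers invented):

```python
# Illustrative sketch: an SLA stated at a tail percentile catches the
# slow outliers that an average would hide.

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[rank]

latencies_ms = [8, 9, 10, 10, 11, 12, 9, 10, 95, 10]  # one slow outlier
print(percentile(latencies_ms, 50))   # 10 -- the typical query
print(percentile(latencies_ms, 99))   # 95 -- the tail the SLA cares about
```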
Analytic development time is faster, although it could still be improved. Development speedups came from adding a data warehouse cluster for development and experimentation, with more analyst-friendly tools like Hive and Scalding. The entity-based schema used in production is also more amenable to adding new data.
Transition: We’ve had some success but encountered challenges along the way. Here are some lessons we learned:
There are numerous experts in the HBase community, and chances are someone has already tried what you are trying to do
Cloudera support has been critical in helping with Hadoop challenges:
Cloudera Support
Issue resolution and escalation, e.g., JobTracker memory issues and configuration; escalation to the larger Hadoop community
Backport critical patches. HBase sequence ID and cell overwrite bugs
Tuning and configuration guidance: HBase, Sqoop
Apache HBase community
Community technical guidance: message boards, meetups, HBaseCon
New feature development: the HBase community is always open to new ideas for improvements
Roadmap planning: what relevant features will be released in upcoming versions of the software. For example, how would stripe compaction or new block cache implementations impact your architecture?
Refs:
HBASE-8521 (cell overwrites)
HBASE-6590 (HFile sequence IDs)
HBASE-10958 (blindspot)
Transition: Other lessons learned
With any moderately sized Hadoop cluster you will need infrastructure to collect logs, monitor processes, and analyze metrics. We have effectively used Cloudera Manager for this purpose. CM alerts on service process status changes and reports performance metrics like read latency and clock skew. Post-issue forensics happen via log file analysis. Custom charting lets you create dashboards to analyze your specific bottlenecks or recurring issues.
Upgrade often. Hadoop components are routinely patched, so be sure to upgrade and use Cloudera and the community to understand issues with your current releases. Always test your upgrades.
Back up data for safety. We use HBase snapshot exports, then distcp to a backup cluster. Cloudera Manager has a useful UI for managing distcp jobs.