Big Data Case Study:
Fortune 100 Media / Telco Company
Business Goal
• Big Data analytics to improve customer experience
• Provide daily insights to internal and external teams
• Sandbox environment to support ad-hoc analysis
• Isolated environments for external content providers
Key Challenges
• Limited IT resources and skill sets in Hadoop and Spark
• Administrative overhead managing existing Big Data environments
• Onboarding multiple internal and external user groups
Big Data Case Study Example
Big Data Infrastructure = Complex and Expensive
• Management complexity
• Duplication of data
• Cluster sprawl
• < 30% utilization
[Diagram labels: IT; External Content Provider (×2); Other Internal Teams; Data Scientists and Developers]
Going Forward: Two Options Considered
Expand on-premises Hadoop infrastructure
• Ongoing management of physical servers
• Multi-tenancy required for external providers
• Significant IT overhead
Move to AWS Elastic MapReduce
• Hadoop-as-a-Service offers simplicity and agility
• Internal security policies are a barrier
• Ongoing TCO of AWS cloud services
• Data is on-premises, difficult to copy or move
Option 1: Expand On-Premises Infrastructure
[Diagram labels, grouped by layer]
Big Data Applications & Users
• Users: Marketing, Sales Support, Data Scientists, Developers, External Content Provider (×2)
• Hue Console (Hadoop jobs)
• BI Tool(s)
• Custom Web App ($$): security, access control & onboarding
• Advanced administration: groups/queues/schedulers
• User administration (AD/LDAP)
Big Data Infrastructure
• Hadoop Cluster (~15 nodes), converted to production from pilot
• New physical nodes ($$) to increase performance & capacity
• New physical nodes ($) for BI/ETL tools
• Dev/Test Cluster: new physical nodes ($)
• Utilization < 20%
Existing Data
• NFS, Database, Other
• Physical data copy/duplication
A third option: Hadoop-as-a-Service on-premises
• Infrastructure software platform (BlueData) for Hadoop and Spark
Self-service, on-demand virtual clusters
• Amazon EMR-like experience
• Agility and speed for data scientists
• IT infrastructure efficiency, higher utilization
Secure and multi-tenant architecture
• Eliminate complexities and pitfalls of multiple isolated physical clusters
• Stronger isolation and greater flexibility, no data duplication
Solution and Benefits
Option 3: Deploy BlueData EPIC Software Platform
[Diagram labels, grouped by layer]
Big Data Applications & Users
• Users: Data Scientists and Developers, Other Internal Teams, External Content Provider (×2)
• Web UI – multi-tenant, role-based access control
• User administration (AD/LDAP)
• Tenant 1 (Internal Team): virtual Hadoop cluster, Hue Console + BI tools
• Tenant 2 (Content Provider): virtual Hadoop cluster, Hue Console + BI tools
• Tenant 3 (Content Provider): virtual Hadoop cluster, Hue Console + BI tools
Big Data Infrastructure
• EPIC Platform ($)
• Hadoop Cluster (~15 nodes), converted to production from pilot
• New physical nodes ($), performance optimized (CPU & memory)
Existing Data
• NFS, Gluster, Other
• In-place access
Multi-Tenant, Virtualized Infrastructure
Access to Data in Existing Storage Systems
Out-of-the-box Hadoop and Spark Support
• Significantly lower costs (~70%) – less hardware required for dev/test cluster and BI / analytical tools
• Reduced administrative overhead – simpler user management and administration, eliminated data copying
• Speed and self-service – on-demand provisioning of virtual Hadoop and Spark clusters
• Higher utilization – consolidation ratio of 8:1 between virtual and physical servers
Big Data Case Study – Example Benefits
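The 8:1 consolidation ratio in the benefits above can be read as simple arithmetic: roughly eight virtual nodes share one physical server. The sketch below is only an illustration of that reading. The function names and the 40-node example are hypothetical, not figures from the case study, and it does not attempt to reproduce the ~70% cost figure, which depends on pricing details the slides do not give.

```python
import math


def physical_servers_needed(virtual_nodes: int, consolidation_ratio: float = 8.0) -> int:
    """Physical servers required to host `virtual_nodes` at the given ratio.

    Assumes every physical server can host `consolidation_ratio` virtual
    nodes (8:1 is the ratio quoted on the slide); real sizing would also
    depend on CPU, memory, and I/O per node.
    """
    if virtual_nodes < 0:
        raise ValueError("virtual_nodes must be non-negative")
    if consolidation_ratio <= 0:
        raise ValueError("consolidation_ratio must be positive")
    return math.ceil(virtual_nodes / consolidation_ratio)


def server_reduction(virtual_nodes: int, consolidation_ratio: float = 8.0) -> float:
    """Fraction of physical servers avoided versus one server per node."""
    if virtual_nodes == 0:
        return 0.0
    hosts = physical_servers_needed(virtual_nodes, consolidation_ratio)
    return 1.0 - hosts / virtual_nodes


if __name__ == "__main__":
    # Hypothetical workload: 40 virtual Hadoop/Spark nodes across all tenants.
    nodes = 40
    print(physical_servers_needed(nodes))  # physical hosts at 8:1
    print(server_reduction(nodes))         # share of servers avoided
```

Rounding up with `math.ceil` reflects that a partially filled server still has to be purchased, so small deployments see less benefit than the headline ratio suggests.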
Big Data Infrastructure Made Easy
EPIC Platform
• ElasticPlane™: Self-service, multi-tenant clusters
• DataTap™: In-place access to enterprise data stores
• IOBoost™: Extreme performance and scalability
Data stores: Gluster, HDFS, Swift, NFS
• Utilization > 90%
• Simplified management
• No duplication of data
• No cluster sprawl
[Diagram labels: External Content Provider (×2); Other Internal Teams; Data Scientists and Developers]
Learn more at:
www.bluedata.com
