Big Data Analytics: Profiling the Use of Analytic Platforms in User Organizations. Wayne Eckerson, Director of Research, Business Applications and Architecture Media Group, TechTarget
Sponsors
Why Big Data? Changing data types; technology advances; insourcing & outsourcing; developers discover data
Analytics against Big Data: patterns; real-time; complex calculations; sustainable advantage
Framework for success: Culture; People; Organization; Architecture. Diagram labels: Analytic Platform; Reporting; Event-driven; Data Governance; BI Governance; Performance Measurement; IT professionals; Fact-based Decisions; Casual Users; Analytics; Analytics Center of Excellence; Power Users; Business Executives
Analytic Platforms: An analytic platform is a data management system optimized for query processing and analytics that provides superior price-performance and availability compared with general-purpose database management systems. Have you purchased or implemented an analytic platform as defined in this survey?
Analytical Techniques: MPP; balanced configurations; storage-level processing; columnar storage and compression; memory; query optimizer; plug-in analytics
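To make the "columnar storage and compression" item concrete, here is a minimal Python sketch. It is not taken from the slides or from any particular product: the toy table, the column names `region` and `sales`, and the choice of run-length encoding are all assumptions made for illustration. It pivots row-oriented data into columns, compresses a repetitive column into runs, and then aggregates by walking those runs.

```python
from itertools import groupby

# Hypothetical toy table stored row by row (illustrative data only).
rows = [
    ("east", 10),
    ("east", 20),
    ("east", 5),
    ("west", 7),
    ("west", 3),
]

def to_columns(rows):
    """Pivot row-oriented tuples into one list per column."""
    region, sales = zip(*rows)
    return {"region": list(region), "sales": list(sales)}

def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def sum_by_region(columns):
    """Total sales per region by walking the runs of the region column."""
    totals = {}
    offset = 0
    for value, length in rle_encode(columns["region"]):
        chunk = columns["sales"][offset:offset + length]
        totals[value] = totals.get(value, 0) + sum(chunk)
        offset += length
    return totals

columns = to_columns(rows)
encoded_region = rle_encode(columns["region"])
totals = sum_by_region(columns)
```

In this sketch the five `region` values collapse into two runs, and the aggregate query only reads the two columns it needs. That is the general intuition behind column stores for analytic workloads, although real platforms use far more elaborate encodings and execution engines.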
Types of Analytic Platforms
Which type of analytic platform have you purchased or implemented?
Purchase requirements
Explicitly looking for this option?
BI Delivery Framework 2020 (diagram labels): Business Intelligence; End-User Tools; Dashboard Alerts; Search, NoSQL, Java; Reports and Dashboards; Design Framework; Universal Information Access; Hadoop, MapReduce; Event detection and correlation; MAD Dashboards; Key-value pair indexes; Architecture; CEP, Streams; Data Warehousing; Reporting & Analysis; Content Intelligence; Event-Driven Alerts and Dashboards; Continuous Intelligence; Event-driven; Analytic Sandboxes; Ad hoc query, Spreadsheets, OLAP, Visual Analysis, Analytic Workbenches, Hadoop; Ad hoc exploration; Excel, Access, SAS, Visual Analysis; Analytics Intelligence
TOP DOWN: "Business Intelligence". Corporate Objectives and Strategy; Reporting & Monitoring (Casual Users); Non-volatile data; DW Architecture; Predefined Metrics. Pros: alignment, consistency. Cons: hard to build, politically charged, hard to change, expensive, "schema heavy".
Reports beget analysis; analysis begets reports.
BOTTOM UP: "Analytics Intelligence". Processes and Projects; Analysis and Prediction (Power Users); Volatile data; Ad hoc queries; Analytics Architecture. Pros: quick to build, politically uncharged, easy to change, low cost. Cons: alignment, consistency, "schema light".
BI Architecture - 2020 (diagram labels): Operational Systems (Structured data); Operational System; Extract, Transform, Load (Batch, near real-time, or real-time); Casual User; Streaming/CEP Engine; Alerts; Reports/Dashboards; BI Server; Data Warehouse; Virtual Sandboxes; Machine Data; Dept Data Mart; Hadoop Cluster; Top-down Architecture; Bottom-up Architecture; Web Data; In-memory BI Sandbox; Ad hoc query; Upload & query; Audio/video Data; Free-Standing Sandbox; Query & report; Analytical platform or non-relational database; External Data; Power User; Documents & Text
Recommendations: Harmonize top-down and bottom-up BI; implement a BI architecture that supports multiple intelligences; create multiple types of analytic sandboxes; implement analytic platforms that meet business and technical requirements
Big Data Analytics Webinar

Big Data Analytics Webinar

  • 1. Big Data Analytics: Profiling the Use of Analytic Platforms in User Organizations Wayne Eckerson Director of Research, Business Applications and Architecture Media Group TechTarget
  • 3. Why Big Data? Changing data types Technology advances Insourcing & outsourcing Developers discover data
  • 4. Analytics against Big Data Patterns Real-time Complex calculations Sustainable advantage
  • 5. Framework for success Culture People Organization Architecture Analytic Platform Reporting Event-driven Data Governance BI Governance Performance Measurement IT professionals Fact-based Decisions Casual Users Analytics Analytics Center of Excellence Power Users Business Executives
  • 6. Analytic Platforms An analytic platform is a data management system optimized for query processing and analytics that provides superior price-performance and availability compared with general purpose database management systems. Have you purchased or implemented an analytic platform as defined in this survey?
  • 7. Analytical Techniques MPP Balanced configurations Storage-level processing Columnar storage and compression Memory Query optimizer Plug-in analytics
  • 8. Types of Analytic Platforms
  • 9. Which type of analytic platform have you purchased or implemented?
  • 11. Explicitly looking for this option?
  • 12. BI Delivery Framework 2020 Business Intelligence End-User Tools Dashboard Alerts Search, NoSQL, Java Reports and Dashboards Design Framework Universal Information Access Hadoop, Map Reduce Event detection and correlation MAD Dashboards Key-value pair indexes Architecture CEP, Streams Data Warehousing Data Warehousing Reporting & Analysis Content Intelligence Event-Driven Alerts and Dashboards Continuous Intelligence Event-driven Analytic Sandboxes Analytic Sandboxes Ad hoc query, Spreadsheets, OLAP, Visual Analysis, Analytic Workbenches, Hadoop Ad hoc exploration Excel, Access, SAS, Visual Analysis Analytics Intelligence
  • 13. Pros: -Alignment -Consistency Cons: -Hard to build -Politically charged -Hard to change - Expensive -“Schema Heavy” TOP DOWN- “Business Intelligence” Corporate Objectives and Strategy Reporting & Monitoring (Casual Users) Non-volatile data DW Architecture Predefined Metrics Reports Beget Analysis Analysis Begets Reports Pros: -Quick to build - Politically uncharged - Easy to change - Low cost Cons: -Alignment -Consistency --“Schema Light” Volatile data Ad hoc queries Analytics Architecture Analysis and Prediction (Power Users) Processes and Projects BOTTOM UP – “Analytics Intelligence”
  • 14. BI Architecture - 2020 Operational Systems (Structured data) Operational System Extract, Transform, Load (Batch, near real-time, or real-time) Casual User Streaming/ CEP Engine Alerts Operational System Reports /Dashboards BI Server Data Warehouse Virtual Sandboxes Machine Data Dept Data Mart Hadoop Cluster Top-down Architecture Bottom-up Architecture Web Data In-memory BI Sandbox Ad hoc query Upload & query Audio/video Data Free- Standing Sandbox Query & report Ad hoc query Analytical platform or non-relational database External Data Ad hoc query Power User Documents & Text
  • 15. BI Architecture - 2020 Operational Systems (Structured data) Operational System Extract, Transform, Load (Batch, near real-time, or real-time) Casual User Streaming/ CEP Engine Alerts Operational System Reports Dashboards BI Server Data Warehouse Virtual Sandboxes Machine Data Dept Data Mart Hadoop Cluster Top-down Architecture Bottom-up Architecture Web Data In-memory BI Sandbox Ad hoc query Upload & query Audio/video Data Free- Standing Sandbox Query & report Ad hoc query Analytical platform or non-relational database External Data Ad hoc query Power User Documents & Text
  • 16. Recommendations Harmonize top down and bottom up BI Implement a BI architecture that supports multiple intelligences Create multiple types of analytic sandboxes Implement analytic platforms that meet business and technical requirements

Editor's notes

  1. Welcome to this Webcast on Big Data Analytics. My name is Wayne Eckerson, a long-time industry analyst and thought leader in the business intelligence market. I will be your speaker today. One housekeeping item before we begin. This is a prerecorded Webcast so there will be no Q&A session at the end. If you have questions for me, please don’t hesitate to send me an email at weckerson@gmail.com. I’d be happy to dialogue with you about this important topic! The research and findings that I will present in this Webcast are based on a report that you can download for free from the BeyeNetwork web site or from Bitpipe. It’s a 40-page report so I hope you take the time to peruse its details. This 60-minute webcast will present highlights from that report. First, I’ll talk about the big data analytics movement, what’s behind it, what it is, and best practices for doing it. Second, I’ll talk about big data analytics engines. I’ll explain the technology most of these engines use to turbo-charge analytical queries and then catalog vendors in the space. Third, I’ll lump analytic engines into four categories and present survey results that show what causes customers to buy each category of product. Finally, and perhaps most importantly, I’ll describe a framework for implementing big data analytics and show how to expand your existing business intelligence and data warehousing architecture to handle new requirements. So with that, let’s begin.
  2. I’d like to thank our sponsors who made the research and this webcast possible.
  3. There has been a lot of talk about “big data” in the past year, which I find a bit puzzling. I’ve been in the data warehousing field for more than 15 years, and data warehousing has always been about big data. So what’s new in 2011? Why are we talking about “big data” today? There are several reasons: Changing data types. Organizations are capturing different types of data today. Until about five years ago, most data was transactional in nature, consisting of numeric data that fit easily into rows and columns of relational databases. Today, the growth in data is fueled by largely unstructured data from Websites as well as machine-generated data from an exploding number of sensors. Technology advances. Hardware has finally caught up with software. The exponential gains in price-performance exhibited by computer processors, memory, and disk storage have finally made it possible to store and analyze large volumes of data at an affordable price. Organizations are storing and analyzing more data because they can. Insourcing and outsourcing. Because of the complexity and cost of storing and analyzing Web traffic data, most organizations have outsourced these functions to third-party service bureaus. But as the size and importance of corporate e-commerce channels have increased, many are now eager to insource this data to gain greater insights about customers. At the same time, virtualization technology is making it attractive for organizations to move large-scale data processing to private hosted networks or public clouds. Developers discover data. The biggest reason for the popularity of the term “big data” is that Web and application developers have discovered the value of building new data-intensive applications. To application developers, “big data” is new and exciting.
Of course, for those of us who have made our careers in the data world, the new era of “big data” is simply another step in the evolution of data management systems that support reporting and analysis applications.
  4. Big data by itself, regardless of the type, is worthless unless business users do something with it that delivers value to their organizations. That’s where analytics comes in. Although organizations have always run reports against data warehouses, most haven’t opened these repositories to ad hoc exploration. This is partly because analysis tools are too complex for the average user but also because the repositories often don’t contain all the data needed by the power user. But this is changing. Patterns. A valuable characteristic of “big data” is that it contains more patterns and interesting anomalies than “small” data. Thus, organizations can gain greater value by mining large data volumes than small ones. Fortunately, techniques already exist to mine big data thanks to companies, such as SAS Institute and SPSS (now part of IBM), that ship analytical workbenches. Real-time. Organizations that accumulate big data recognize quickly that they need to change the way they capture, transform, and move data from a nightly batch process to a continuous process using micro batch loads or event-driven updates. This technical constraint pays big business dividends because it makes it possible to deliver critical information to users in near-real-time. Complex analytics. In addition, during the past 15 years, the “analytical IQ” of many organizations has evolved from reporting and dashboarding to lightweight analysis. Many are now on the verge of upping their analytical IQ by implementing predictive analytics against both structured and unstructured data. This type of analytics can be used to do everything from delivering highly tailored cross-sell recommendations to predicting failure rates of aircraft engines. Sustainable advantage.
At the same time, executives have recognized the power of analytics to deliver a competitive advantage, thanks to the pioneering work of thought leaders, such as Tom Davenport, who co-wrote the book “Competing on Analytics.” In fact, forward-thinking executives recognize that analytics may be the only true source of sustainable advantage since it empowers employees at all levels of an organization with information to help them make smarter decisions.
  5. However, the road to big data analytics is not easy and success is not guaranteed. Analytical champions are still rare. That’s because succeeding with big data analytics requires the right culture, people, organization, architecture, and technology. The right culture. Analytical organizations are championed by executives who believe in making fact-based decisions or validating intuition with data. These executives create a culture of performance measurement in which individuals and groups are held accountable for the outcomes of predefined metrics aligned with strategic objectives. The right people. You can’t do big data analytics without power users, or more specifically, business analysts, analytical modelers, and data scientists. These folks possess a rare combination of skills and knowledge: they have a deep understanding of business processes and the data that sits behind those processes and are skillful in the use of various analytical tools, including Excel, SQL, analytical workbenches and coding languages. The right organization. Historically, analysts with the aforementioned skills were pooled in pockets of an organization hired by department heads. But analytical champions create a shared service organization (i.e., an analytical center of excellence) that makes analytics a pervasive competence. Analysts are still assigned to specific departments and processes, but they are also part of a central organization that provides collaboration, camaraderie, and a career path. Analytic platform. At the heart of an analytical infrastructure is an analytic platform, the underlying data management system that consumes, integrates, and provides user access to information for reporting and analysis activities. Today, many vendors, including most sponsors of this Webinar, provide specialized analytic platforms that deliver dramatically better query performance than existing systems. There are many different types of analytic platforms sold by dozens of vendors.
  6. So what is an analytic platform? It’s a data management system optimized for query processing and analytics that provides superior price-performance and availability compared with general purpose database management systems. Given this definition, 72% of our survey respondents said they already have an analytic platform. This is a surprisingly high percentage given that these platforms, except for Teradata and Sybase IQ, have only been generally available for the past five years or so. In looking at the survey responses, I did see a lot of Microsoft customers who think SQL Server fits this definition, which it doesn’t. Nevertheless, I think the results speak volumes for the power of these analytic platforms to optimize the performance of analytical applications.
  7. Analytic platforms offer superior price-performance for many reasons. And while product architectures vary considerably, most support the following characteristics: Massively parallel processing (MPP). Most analytic platforms spread data across multiple nodes, each containing its own CPU, memory, and storage and connected to a high-speed backplane. When a user submits a query or runs an application, the “shared nothing” system divides the work across the nodes, each of which processes the query on its piece of the data and ships the results to a master node that assembles the final result and sends it to the user. MPP systems are highly scalable, since you simply add nodes to increase processing power. Balanced configurations. Analytic platforms optimize the configuration of CPU, memory, and disk for query processing rather than transaction processing. Analytic appliances essentially “hard wire” this configuration into the system and don’t let customers change it, whereas analytic bundles or analytic databases (i.e., software-only solutions) allow customers to configure the underlying hardware to match unique application requirements. Storage-level processing. Netezza’s big innovation was to move some database functions, specifically data filtering functions, into the storage system using field programmable gate arrays. This storage-level filtering reduces the amount of data that the DBMS has to process, which significantly increases query performance. Many vendors have followed suit, moving various database functions into hardware. Columnar storage and compression. Many vendors have followed the lead of Sybase, Sand Technology, Paraccel, and other columnar pioneers by storing data in columns, not rows. Since most queries ask for a subset of the columns in a row rather than entire rows, storing data in columns minimizes the amount of data that needs to be retrieved from disk and processed by the database, accelerating query performance.
In addition, since data elements in many columns are repeated (e.g., “male” and “female” in the gender field), column-store systems can eliminate duplicates and compress data volumes significantly, sometimes as much as 10:1. This enables more data to fit into memory, which speeds processing. Memory. Many analytic platforms make liberal use of memory caches to speed query processing. Some products, such as SAP HANA and QlikTech’s QlikView, store all data in memory, while others store recently queried results in a smart cache so others who need to retrieve the same data can pull it from memory rather than from disk. Given the growing affordability of memory and the widespread deployment of 64-bit operating systems, many analytic platforms are expanding their memory footprints to speed processing. Query optimizer. Analytic platform vendors invest a lot of time and money researching ways to enhance their query optimizers to handle various workloads. A good query optimizer is the biggest contributor to query performance. In this respect, the older vendors with established products have an edge. Plug-in analytics. True to their name, many analytic platforms offer built-in support for complex analytics. This includes complex SQL, such as correlated subqueries, as well as procedural code implemented as plug-ins to the database. Some vendors offer a library of analytical routines, from fuzzy matching algorithms to market-basket calculations. Some, like Aster Data (now owned by Teradata), provide native support for MapReduce programs that are called using SQL.
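The scatter-gather flow that the MPP description outlines can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the function names (`partition`, `local_aggregate`, `mpp_sum_by_region`) and the sample sales rows are hypothetical, and threads merely stand in for independent shared-nothing nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, num_nodes):
    """Spread rows across nodes round-robin (a stand-in for a real distribution key)."""
    nodes = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        nodes[i % num_nodes].append(row)
    return nodes


def local_aggregate(node_rows):
    """Each node sums amounts per region using only its own slice of the data."""
    partial = defaultdict(int)
    for region, amount in node_rows:
        partial[region] += amount
    return dict(partial)


def mpp_sum_by_region(rows, num_nodes=4):
    """Divide the work across nodes, then let a 'master' merge the partial results."""
    nodes = partition(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_aggregate, nodes))
    final = defaultdict(int)
    for partial in partials:  # master node assembles the final answer
        for region, subtotal in partial.items():
            final[region] += subtotal
    return dict(final)


# Hypothetical sample data: (region, sale amount)
sales = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
totals = mpp_sum_by_region(sales, num_nodes=3)
print(totals)
```

Because each node returns only a small partial result, adding nodes mainly shrinks the per-node slice, which is the scalability argument made in the note.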
  8. MPP Analytic Databases - Row-based databases designed to scale out on a cluster of commodity servers and run complex queries in parallel against large volumes of data. Columnar Databases - Database management systems that store data in columns, not rows, and support high data compression ratios. Analytic Appliances - Preconfigured hardware-software systems designed for query processing and analytics that require little tuning. Analytic Bundles - Predefined hardware and software configurations that are certified to meet specific performance criteria, but that customers must purchase and configure themselves. In-memory Databases - Systems that load data into memory to execute complex queries. Distributed file-based systems - Designed for storing, indexing, manipulating and querying large volumes of unstructured and semi-structured data. Analytic Services - Analytic platforms delivered as a hosted or public-cloud-based service. Nonrelational databases - Optimized for querying unstructured data as well as structured data. CEP/Streaming Engines - Ingest, filter, calculate, and correlate large volumes of discrete events and apply rules that trigger alerts when conditions are met.
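To make the columnar-database definition concrete, here is a small Python sketch of storing rows as columns and dictionary-encoding a repetitive column, using the gender example from the notes. The table contents and the helper names (`to_columns`, `dictionary_encode`) are assumptions made for illustration; real column stores use far more elaborate encodings.

```python
# Hypothetical row-oriented table
rows = [
    {"id": 1, "gender": "female", "amount": 120},
    {"id": 2, "gender": "male", "amount": 80},
    {"id": 3, "gender": "female", "amount": 45},
    {"id": 4, "gender": "male", "amount": 60},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def dictionary_encode(values):
    """Replace repeated values with small integer codes plus a lookup dictionary."""
    dictionary, index, codes = [], {}, []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


columns = to_columns(rows)

# A query that needs only "amount" touches one column, not every field of every row.
total_amount = sum(columns["amount"])

# Repeated strings collapse to a two-entry dictionary and a list of small codes.
gender_dict, gender_codes = dictionary_encode(columns["gender"])
print(total_amount, gender_dict, gender_codes)
```

The longer and more repetitive the column, the more the code list saves over the raw strings, which is where the high compression ratios cited in the notes come from.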
  9. Our survey grouped analytic platforms into four major categories to make it easier to compare and contrast various product offerings: Analytic databases: These are software-only analytic platforms that run on a variety of hardware that customers purchase. Customers install, configure and tune software, including the analytic database, before they can use the analytic system. Most MPP analytic databases, columnar databases, and in-memory databases qualify as analytic databases. As a rule of thumb, analytic databases are good for organizations that want to tune database performance for specific workloads or run the RDBMS software on a virtualized private cloud. Analytic appliances: These are hardware-software combinations designed to support ad hoc queries and other types of analytic processing. This category includes both analytic appliances and analytic bundles. As a rule of thumb, analytic appliances are fast to deploy and easy to maintain and make good replacements for Microsoft SQL Server or Oracle data warehouses that have run out of gas. They also make great standalone data marts to offload complex queries from large, maxed-out data warehousing hubs. Analytic services: Rather than deploy an analytic platform in a customer’s data center, an analytic service enables customers to house the system in an off-site hosted environment or public cloud. As a rule of thumb, analytic services are great for development, test and prototyping applications as well as for organizations that don’t have an IT department or want to outsource data center operations or get up and running very quickly. File-based analytic systems: This generally refers to Hadoop, but we also lumped NoSQL or nonrelational systems into this category, although it’s not entirely accurate since nonrelational systems are databases. However, since both are used to store and analyze large volumes of unstructured data and don’t require an up-front schema design, they share more similarities than differences.
As a rule of thumb, this category of products is ideal for processing large volumes of Web traffic and other log-based or machine-generated data.
  10. When examining the business requirements driving purchases of analytic platforms overall, three percolate to the top: “faster queries,” “storing more data” and “reduced costs.” These requirements are followed by “more complex queries,” “higher availability” and “quicker to deploy.” This ranking is based on summing the percentages of all four deployment options for each requirement. More important, this chart shows that customers purchase each deployment option for different reasons. Analytic database customers value “quick to deploy” (46%), “built-in analytics” (43%) and “easier maintenance” (41%) more than other requirements, while analytic service customers favor “storing more data” (67%), “high availability” (67%), “reduced costs” (56%) and “more concurrent users” (56%). Not surprisingly, customers with file-based systems look for the ability to support “more diverse data” (64%) and “more flexible schemas” (64%), two hallmarks of a Hadoop/NoSQL offering. Analytic appliance customers had the most emphatic requirements. Roughly two-thirds or more value faster queries (70%), more complex queries (64%) and faster load times (63%), suggesting that analytic appliance customers seek to offload complex ad hoc queries from data warehouses.
  11. We also asked respondents if they were looking for a specific deployment option when evaluating products (see Figure 14). Except for customers of file-based systems, most customers investigated products across these four categories. For example, Blue Cross Blue Shield of Kansas City looked at three columnar databases (i.e., software-only) and an appliance before making a decision. Interestingly, no analytic service customers intended to subscribe to a service prior to evaluating products. That’s because many analytic-service customers subscribe to such services on a temporary basis, either to test or prototype a system or to wait until the IT department readies the hardware to house the system. Some of these customers continue with the services, recognizing that they provide a more cost-effective test and development environment than an in-house system.
  12. Now that we’ve discussed the engines that drive big data analytics, let’s step back a bit and look at the overall framework in which they operate. I introduced this BI Delivery Framework 2020 in March. It’s basically my vision for what BI environments will look like in about 10 years. Instead of one intelligence and BI architecture to support reporting and analysis applications depicted in the middle, there will be four intelligences. Let me briefly describe each. Business Intelligence represents a classic data warehousing environment that delivers reports and dashboards primarily to casual users via a MAD framework. MAD stands for…. Moving to the right, Continuous Intelligence delivers near real-time information and alerts to operational workers using event-driven architectures that handle simple and complex events. At the bottom, Analytics Intelligence enables power users to submit ad hoc queries against any data source using a variety of tools, ideally supported by analytic sandboxes built into the top-down environment. To the left, Content Intelligence makes unstructured data an equal target for reporting and analysis applications. These systems use a variety of indexing technologies to store both structured and unstructured data and allow users to submit queries against them. This is a fast growing area that encompasses Hadoop, NoSQL, and search-based technologies. If you want more information on this framework, please download my first report, titled Analytic Architectures: BI Delivery Framework 2020, from BeyeNetwork’s Web site. But before leaving the framework, I want to drill down on business intelligence and analytics intelligence, the two most inter-related intelligences in this framework, and the two most problematic to manage synergistically.
  13. This is another depiction of the two intelligences. As I already mentioned, Business Intelligence is a top-down environment that delivers reports and dashboards to casual users. The output is based on predefined metrics aligned with strategic goals and objectives. In other words, in a top-down environment, you know in advance what questions users want to ask and you model the environment accordingly. The benefit of this environment is that it delivers information consistency and alignment, the proverbial single version of truth. The downsides are that it’s hard to build, hard to change, costly, and politically charged. In contrast, Analytics Intelligence is the opposite. It’s a bottom-up environment geared to power users who submit ad hoc queries against a variety of sources to optimize processes and projects. This ad hoc environment is quick to build, easy to change, low cost, and politically uncharged. Yet it creates myriad analytic silos and thus forfeits information alignment and consistency. The problem here is that most companies try to do all BI in either a top-down or bottom-up environment. They may start with a top-down environment and get discouraged that it’s expensive and not geared to ad hoc types of requests. So they abandon it in favor of analytics intelligence, which works fine for a while until they realize they are overwhelmed with analytic silos and don’t have a common understanding of business performance. The first key here is to recognize that you need both top down and bottom up environments. They are synergistic. Analysis begets reports and reports beget analysis. You do some analysis, find something interesting, and turn it into a regularly scheduled report for others to see. But that report should trigger additional questions, which call for additional analysis, and so on. The second key is to apply the right architecture to the right tasks. Typically top down environments address 80% of your information requirements and bottom-up 20%.
Yet, bottom-up may uncover 80% of your most valuable insights. Both are equally important and must be treated equivalently when building your corporate information architecture.
  14. So here’s the architecture behind the BI Delivery Framework 2020. Let me step you through this: What’s pictured below is the classic top-down business intelligence and data warehousing environment that most organizations have already built. … What’s pictured in pink are new components that address the other three intelligences. To the left are new sources of data that aren’t typically loaded into DWs: machine-generated data, Web data, unstructured data and external data. In front of these sources is a Hadoop cluster, which is ideal for processing in batch large volumes of unstructured and semi-structured data, although it also can manage structured data. Atop the DW is the streaming/complex event processing engine for handling continuous intelligence and alerting. Below the DW is a free-standing database or sandbox that offloads bottom-up analytic processing from the DW, if desired. To the right and bottom is the power user, who traditionally has been left out of classic BI/DW architecture. Now, they have access to five types of analytic sandboxes designed to support ad hoc query processing as well as external data, if they have permission.
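As a rough illustration of what a streaming/CEP engine does for continuous intelligence and alerting, the following Python sketch filters incoming events and fires an alert when several out-of-range readings from the same source fall within a short time window. The `ThresholdRule` class, the event layout, and the sample readings are all hypothetical; commercial CEP engines provide much richer rule languages.

```python
from collections import defaultdict, deque


class ThresholdRule:
    """Alert when `count` readings above `limit` arrive from one source within the window."""

    def __init__(self, limit, count, window_seconds):
        self.limit = limit
        self.count = count
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # source -> timestamps of recent violations

    def process(self, timestamp, source, value):
        if value <= self.limit:  # filter: ignore normal readings
            return None
        recent = self._recent[source]
        recent.append(timestamp)
        # Drop violations that have aged out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) >= self.count:  # correlate: enough violations close together
            recent.clear()  # reset so one burst produces one alert
            return (f"ALERT {source}: {self.count} readings above "
                    f"{self.limit} within {self.window_seconds}s")
        return None


rule = ThresholdRule(limit=100, count=3, window_seconds=10)

# Hypothetical event stream: (timestamp in seconds, source, reading)
events = [
    (0, "engine-1", 101),
    (2, "engine-2", 150),
    (4, "engine-1", 99),
    (5, "engine-1", 120),
    (9, "engine-1", 130),
    (30, "engine-2", 140),
]
alerts = [a for a in (rule.process(*e) for e in events) if a]
print(alerts)
```

The rule evaluates each event as it arrives rather than waiting for a batch load, which is the property that lets this kind of engine sit beside the data warehouse and deliver alerts in near real time.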
  15. This is the same BI architecture but with the five sandboxes highlighted in green. A virtual sandbox inside the DW is a set of dedicated partitions into which analysts can upload their own data and mix it with corporate data. To avoid contention among DW resources, many companies create a free-standing sandbox to house data or users that the DW can’t support. Basically, this option offloads complex processing to a separate machine. A local in-memory BI tool can also serve as a sandbox as long as it requires analysts to publish their findings to an IT-managed server rather than proliferate spreadmarts. Hadoop is a sandbox because it allows power users who know the atomic data well and can write code to submit queries against large volumes of unstructured or structured data. Like Hadoop, a DW can be a sandbox for those power users whom IT trusts to write well-designed SQL that doesn’t bog down performance for others.
  16. So, to wrap up, I have four recommendations for supporting big data analytics: #1. Harmonize top down and bottom up BI. For too long, organizations have tried to shoehorn all types of users into a single information architecture. That has never worked. Organizations need to recognize that casual users need top-down, interactive reports and dashboards, while power users need ad hoc exploratory tools and environments. #2. Implement a BI architecture that supports multiple intelligences. The BI architecture of the future supports both traditional data warehousing to handle detailed transactional data and file-based and nonrelational systems to handle unstructured and semi-structured data. It also supports continuous intelligence through CEP and streaming engines and analytical sandboxes for ad hoc exploration. #3. Create multiple types of analytic sandboxes. Analytic sandboxes bring power users more fully into the corporate data environment by enabling them to mix personal and corporate data and run complex, ad hoc queries with minimal restrictions. #4. Implement analytic platforms that meet business and technical requirements. There are four broad types of analytic platforms. Pick the one that is right for you. Appliances are quick to deploy and easy to maintain; analytic databases provide flexibility to run the software on the hardware of your choice; analytic services forgo the time and cost of provisioning software in your own data center if you have one; and file-based systems are ideal for processing unstructured and semi-structured data.