What is Big Data?



A new generation of technologies and architectures
designed to economically extract value from very large
volumes of a wide variety of data, by enabling high-velocity
capture, discovery and/or analysis
VELOCITY
VARIETY
VOLUME
+ VISUALIZATION
VALUE



Big Data’s impact can be expressed by The Five V’s
 E-Commerce Site fed by outsourced Ad Servers
 Ads appear on a wide range of sites with various offers
 Massive amount of data is generated by these servers:
 • Web logs and click stream data from the E-Commerce Site
 • Ad logs and click stream data from the Ad Servers
 • Results in relational transactions on the site


 Goal: Maximize Traffic Analysis for Business Value
 • Velocity Demo: Pinpoint activity in real time & react
 • Variety Demo: Examine historical trends across sources
 • Visualization Demo: Enable ad-hoc data analysis for insights



Demo Context
How to identify when Ad clicks result in Site Traffic?
 High-volume stream of log activity coming in:
 • Web logs and Ad Server logs
 Real-time stream analysis pinpoints activity as it happens
 Simultaneously join structured and unstructured data in a persistent query
 Can be used for A/B testing, Offer improvement, Site Dynamic behavior, or Fraud Detection

[Diagram labels: WEB SERVERS, LOG FILES, AD SERVERS]

Velocity Architecture
DEMO: StreamInsight
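The persistent query described on the Velocity Architecture slide can be pictured as a windowed join over two event streams. The sketch below is plain Python, not StreamInsight code (StreamInsight queries are written in LINQ on .NET). The `Event` type, the `"ad"` and `"web"` source labels, and the 30-second window are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float     # event time in seconds (hypothetical unit)
    source: str   # "ad" for an ad-server click, "web" for a site hit
    cookie: str   # visitor identifier shared by both logs


def join_clicks_to_visits(events, window_seconds=30.0):
    """Yield (ad_click, web_hit) pairs for the same cookie.

    A web hit is matched to every ad click from the same cookie that
    happened at most window_seconds earlier. Events must arrive in
    time order, as they would from a live stream.
    """
    recent_clicks = deque()
    for ev in events:
        # Expire ad clicks that have fallen out of the time window.
        while recent_clicks and ev.ts - recent_clicks[0].ts > window_seconds:
            recent_clicks.popleft()
        if ev.source == "ad":
            recent_clicks.append(ev)
        elif ev.source == "web":
            for click in recent_clicks:
                if click.cookie == ev.cookie:
                    yield (click, ev)
```

Because the function is a generator, it emits a match as soon as the web hit arrives, which is the property that makes the real-time reaction on the slide possible.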
How to do historical analysis on unstructured data?
 Ad Servers and Web Servers generate log files in different formats, making them hard to analyze
 Map/Reduce processing lets us execute a query across the varied data formats stored in Hadoop
 Hive provides a traditional query interface to Map/Reduce
 Correlate and connect high-variety data for trend analysis

[Diagram labels: WEB SERVERS, LOG FILES, M/R, AD SERVERS]

Variety Architecture
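To make the "different formats" problem concrete, here is a minimal Python sketch that normalizes two log formats into one record shape so they can be correlated. The web format follows the column order of the `logs` table in the Hive slide (date, time, action, page URI, cookie). The ad-server format, with its `ts`, `event`, `target`, and `cookie` keys, is purely hypothetical, since the deck does not show it.

```python
def parse_web_line(line):
    """Parse a space-delimited web log line: date time action page_uri cookie."""
    date1, time1, action, page_uri, cookie = line.split(" ")
    return {
        "source": "web",
        "timestamp": f"{date1} {time1}",
        "action": action.lower(),
        "page_uri": page_uri,
        "cookie": cookie,
    }


def parse_ad_line(line):
    """Parse a hypothetical ad-server line of ';'-separated key=value pairs."""
    fields = dict(pair.split("=", 1) for pair in line.split(";"))
    return {
        "source": "ad",
        "timestamp": fields["ts"],
        "action": fields["event"].lower(),
        "page_uri": fields["target"],
        "cookie": fields["cookie"],
    }


def normalize(web_lines, ad_lines):
    """Merge both logs into one list ordered by timestamp string."""
    records = [parse_web_line(w) for w in web_lines]
    records += [parse_ad_line(a) for a in ad_lines]
    # ISO-style "YYYY-MM-DD HH:MM:SS" strings sort chronologically as text.
    return sorted(records, key=lambda r: r["timestamp"])
```

Once both sources share the same keys, a single query or job can group by `cookie` across them, which is the correlation step the slide refers to.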
Access Azure blob storage via a Hive “view” and aggregate session data
 -- External table over the raw, space-delimited log files in Azure blob storage
 CREATE EXTERNAL TABLE logs (
 date1 STRING,
 time1 STRING,
 action STRING,
 page_uri STRING,
 cookie STRING)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
 STORED AS TEXTFILE
 LOCATION 'asv://logs/logs/';

 -- One summary row per (cookie, page_uri): latest click and earliest view
 CREATE TABLE log_summary AS
 SELECT l.cookie
 ,MAX(regexp_replace(l.cookie, '[-]', '') % 36) AS geo_hash
 ,MAX(l.time1) AS time1
 ,l.page_uri
 ,MAX(CASE LOWER(l.action) WHEN 'click' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS click_time
 ,MIN(CASE LOWER(l.action) WHEN 'view' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS view_time
 ,MAX(l.date1) AS date1
 FROM logs l
 GROUP BY l.cookie, l.page_uri;

Hive HQL Queries
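For readers less familiar with HiveQL, the grouping logic of the `log_summary` query can be sketched in a few lines of Python. This is an illustration of the aggregation semantics only, run on in-memory tuples rather than on Hive. It covers the `click_time` and `view_time` columns and leaves out `geo_hash`, `time1`, and `date1`. The function name `summarize` is an assumption, and timestamps are compared as strings, which is chronological only if the dates and times are in a sortable format such as `YYYY-MM-DD HH:MM:SS`.

```python
def summarize(rows):
    """Aggregate log rows per (cookie, page_uri).

    rows: iterable of (date1, time1, action, page_uri, cookie) string tuples.
    Returns {(cookie, page_uri): {"click_time": ..., "view_time": ...}},
    holding the latest click and the earliest view, or None when absent.
    """
    summary = {}
    for date1, time1, action, page_uri, cookie in rows:
        key = (cookie, page_uri)
        entry = summary.setdefault(key, {"click_time": None, "view_time": None})
        stamp = f"{date1} {time1}"
        act = action.lower()
        if act == "click":
            # MAX(...) over click rows: keep the latest click.
            if entry["click_time"] is None or stamp > entry["click_time"]:
                entry["click_time"] = stamp
        elif act == "view":
            # MIN(...) over view rows: keep the earliest view.
            if entry["view_time"] is None or stamp < entry["view_time"]:
                entry["view_time"] = stamp
    return summary
```

A group with no click rows ends up with `click_time` set to `None`, mirroring how the SQL aggregate returns NULL when every input to it is NULL.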
DEMO: Azure HDInsight
Hadoop is an open source framework for building large-scale, distributed, data-intensive applications
 • Hadoop is HDFS, the kernel & M/R
 • MapReduce brings the code to the data
 • An open set of tools exists to extend its functional uses and representations




Hadoop Ecosystem Overview
The "Map" step
 The mappers are responsible for reading the input data and emitting key/value pairs. The input file can be CSV, XML, or any format as long as it can be converted into k/v pairs.
The "Reduce" step
 Each reducer executes a function on all values for a given key. The framework ensures that all values for the same key are sent to the same reducer.




Map/Reduce Distributes Processing of Operations
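The two steps can be simulated in a single Python process to show how the data flows. This is a teaching sketch, not the Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `click_mapper`, `sum_reducer`, `run_job`) are hypothetical, and the in-memory `shuffle` stands in for the grouping that the framework performs across machines. The input line format is assumed to match the space-delimited log layout used in the Hive slide.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record and stream out key/value pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, so each key's values reach exactly one reducer call."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Run the reducer once per key over all of that key's values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def click_mapper(line):
    """Emit (page_uri, 1) for each click in a 'date time action page cookie' line."""
    date1, time1, action, page_uri, cookie = line.split(" ")
    if action.lower() == "click":
        yield (page_uri, 1)


def sum_reducer(key, values):
    """Total the counts emitted for one page."""
    return sum(values)


def run_job(lines):
    """Count clicks per page: map, then shuffle, then reduce."""
    return reduce_phase(shuffle(map_phase(lines, click_mapper)), sum_reducer)
```

In a real cluster the mappers run in parallel on the nodes that hold the input blocks and the reducers run in parallel per key range; the logic inside `click_mapper` and `sum_reducer` is the only part a job author would normally write.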
How to do ad-hoc data discovery and visualizations?
 Ad Servers and Web Servers generate log files in different formats, making them hard to analyze
 Map/Reduce processing lets us execute a query across the varied data formats stored in Hadoop
 Hive provides a traditional query interface to Map/Reduce
 Correlate and connect high-variety data for trend analysis

[Diagram labels: WEB SERVERS, LOG FILES, M/R, AD SERVERS]

Visualization Architecture
DEMO: Excel & Hive Adapter
 Big Data & Analytics Projects are often Additive
 • New Capabilities layered on top of existing data & apps
 • Analytics can drive Applications in new ways
 Visualizations put Big Data in the hands of the Business




Summary
We are BlueMetal Architects
Take the next steps – Imagine, Define, Build
 Envisioning & Strategy Briefing: Big Data, Analytics & Collaboration
 Envisioning Session: Data is the App – Envisioning the Next
  Generation, Data Driven Enterprise
 Architecture Design Session: Big Data & Analytics
 Healthcare / Life Sciences: Strategy Briefing or Architecture Design
  Session – Big Data Architecture, Cloud & Use Case Driven Analytics
  and applications, Portal, M-Health and UX design for Providers,
  Patients, Pharma & Biotechnology
 Financial Services: Strategy Briefing or Architecture Design Session –
  Big Data & Analytics for Banking, Capital Markets, Retail Brokerage or
  Insurance




Take the next steps - our offerings
Thank You
Differentiation: DESIGN
Specialization: UX, DATA, SOCIAL
Foundation: CODE

Who We Are
Differentiation: DESIGN (Strategy, Analysis, Creative)
Specialization: UX (Desktop, Mobile, Web Client), DATA (Analytics, Big Data, Core SQL), SOCIAL (Web Content, Intranets, Collaboration)
Foundation: .NET, Java, SERVICES, PPP, On-Premise, Cloud

Who We Are

Más contenido relacionado

La actualidad más candente

ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production MappingESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production Mappingmmarques_esri
 
RDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceRDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceChristopher Foot
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics SuiteJames Serra
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI RoleJames Serra
 
Cepta The Future of Data with Power BI
Cepta The Future of Data with Power BICepta The Future of Data with Power BI
Cepta The Future of Data with Power BIKellyn Pot'Vin-Gorman
 
Day1 concurrent fellows
Day1 concurrent fellowsDay1 concurrent fellows
Day1 concurrent fellowstoptrails
 
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsMark Kromer
 
Semantic Web Application Development
Semantic Web Application DevelopmentSemantic Web Application Development
Semantic Web Application DevelopmentDaniel Slamowitz
 
Evolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEvolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEsri South Africa
 
ArcGIS
ArcGISArcGIS
ArcGISEsri
 
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesOverview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesJames Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Big Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsBig Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsWSO2
 
Customer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveCustomer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveDatabricks
 
Data Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionData Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionKenneth Peeples
 
Exploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIExploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIGuillermo Caicedo
 
Open Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareOpen Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareSafe Software
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Mariano Gonzalez
 

La actualidad más candente (20)

ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production MappingESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
ESRI Mapping & Charting Solution: ArcGIS 10 Production Mapping
 
RDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business IntelligenceRDX Insights Presentation - Microsoft Business Intelligence
RDX Insights Presentation - Microsoft Business Intelligence
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
 
Transitioning to a BI Role
Transitioning to a BI RoleTransitioning to a BI Role
Transitioning to a BI Role
 
Cepta The Future of Data with Power BI
Cepta The Future of Data with Power BICepta The Future of Data with Power BI
Cepta The Future of Data with Power BI
 
Day1 concurrent fellows
Day1 concurrent fellowsDay1 concurrent fellows
Day1 concurrent fellows
 
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
Esri Ireland "ArcGIS - The Platform Story" Roadmap Session - Eamonn Doyle, Es...
 
Azure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analyticsAzure cafe marketplace with looker data analytics
Azure cafe marketplace with looker data analytics
 
Semantic Web Application Development
Semantic Web Application DevelopmentSemantic Web Application Development
Semantic Web Application Development
 
Modernizando plataforma de bi
Modernizando plataforma de biModernizando plataforma de bi
Modernizando plataforma de bi
 
Evolution of Esri Data Formats Seminar
Evolution of Esri Data Formats SeminarEvolution of Esri Data Formats Seminar
Evolution of Esri Data Formats Seminar
 
ArcGIS
ArcGISArcGIS
ArcGIS
 
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of TerabytesOverview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
Overview of Microsoft Appliances: Scaling SQL Server to Hundreds of Terabytes
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Big Data Storage Challenges and Solutions
Big Data Storage Challenges and SolutionsBig Data Storage Challenges and Solutions
Big Data Storage Challenges and Solutions
 
Customer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data PerspectiveCustomer Experience at Disney+ Through Data Perspective
Customer Experience at Disney+ Through Data Perspective
 
Data Virtualization Primer - Introduction
Data Virtualization Primer - IntroductionData Virtualization Primer - Introduction
Data Virtualization Primer - Introduction
 
Exploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BIExploring Puerto Rico Open Data with Power BI
Exploring Puerto Rico Open Data with Power BI
 
Open Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they CompareOpen Data Portals: 9 Solutions and How they Compare
Open Data Portals: 9 Solutions and How they Compare
 
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
Architecting Analytic Pipelines on GCP - Chicago Cloud Conference 2020
 

Destacado

El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)Egdares Futch H.
 
Iot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónIot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónCNT
 
Big data architectures
Big data architecturesBig data architectures
Big data architecturesDaan Gerits
 
User and IoT Data Analytics
User and IoT Data AnalyticsUser and IoT Data Analytics
User and IoT Data AnalyticsEricsson
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualizationlesterathayde
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics ArchitectureArvind Sathi
 
Big Data: Architectures and Approaches
Big Data: Architectures and ApproachesBig Data: Architectures and Approaches
Big Data: Architectures and ApproachesThoughtworks
 
The Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilityThe Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilitySenseware
 
Internet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergInternet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergDr. Mazlan Abbas
 
Internet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatiInternet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatinabati
 
IoT in Agriculture
IoT in AgricultureIoT in Agriculture
IoT in AgricultureTibbo
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of EverythingCharbel Zeaiter
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerLuminary Labs
 

Destacado (15)

El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)El "Internet de Todo" (IoT)
El "Internet de Todo" (IoT)
 
Iot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos CalderónIot- Construyendo negocios a través de la información - Carlos Calderón
Iot- Construyendo negocios a través de la información - Carlos Calderón
 
Big data architectures
Big data architecturesBig data architectures
Big data architectures
 
User and IoT Data Analytics
User and IoT Data AnalyticsUser and IoT Data Analytics
User and IoT Data Analytics
 
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Tableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data VisualizationTableau Software - Business Analytics and Data Visualization
Tableau Software - Business Analytics and Data Visualization
 
Big Data & Analytics Architecture
Big Data & Analytics ArchitectureBig Data & Analytics Architecture
Big Data & Analytics Architecture
 
Big Data: Architectures and Approaches
Big Data: Architectures and ApproachesBig Data: Architectures and Approaches
Big Data: Architectures and Approaches
 
The Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your FacilityThe Internet of Things is Here: Implementing IoT in Your Facility
The Internet of Things is Here: Implementing IoT in Your Facility
 
Internet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An IcebergInternet of Things (IoT) - We Are at the Tip of An Iceberg
Internet of Things (IoT) - We Are at the Tip of An Iceberg
 
Internet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabatiInternet of things (IoT) and big data- r.nabati
Internet of things (IoT) and big data- r.nabati
 
IoT in Agriculture
IoT in AgricultureIoT in Agriculture
IoT in Agriculture
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of Everything
 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
 

Similar a What is Big Data? The 5 V's and Real World Use Cases

Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016StampedeCon
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure CloudCaserta
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloudJames Serra
 
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONLogitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONAvinash Deshpande
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategyJames Serra
 
Kyvos Insights
Kyvos Insights Kyvos Insights
Kyvos Insights rebeccatho
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceSalesforce Developers
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?James Serra
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarRTTS
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Denodo
 
Denodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Self service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortSelf service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortEduardo Castro
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventTrivadis
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview Rajesh Menon
 
Technology Overview
Technology OverviewTechnology Overview
Technology OverviewLiran Zelkha
 
Apache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureApache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureBrad Sarsfield
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionJames Serra
 

Similar a What is Big Data? The 5 V's and Real World Use Cases (20)

Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
Best Practices For Building and Operating A Managed Data Lake - StampedeCon 2016
 
Benefits of the Azure Cloud
Benefits of the Azure CloudBenefits of the Azure Cloud
Benefits of the Azure Cloud
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloud
 
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATIONLogitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
Logitech - LOGITECH ACCELERATES CLOUD ANALYTICS USING DATA VIRTUALIZATION
 
Microsoft cloud big data strategy
Microsoft cloud big data strategyMicrosoft cloud big data strategy
Microsoft cloud big data strategy
 
Kyvos Insights
Kyvos Insights Kyvos Insights
Kyvos Insights
 
Bringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to SalesforceBringing the Power of Big Data Computation to Salesforce
Bringing the Power of Big Data Computation to Salesforce
 
How does Microsoft solve Big Data?
How does Microsoft solve Big Data?How does Microsoft solve Big Data?
How does Microsoft solve Big Data?
 
QuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing WebinarQuerySurge Slide Deck for Big Data Testing Webinar
QuerySurge Slide Deck for Big Data Testing Webinar
 
OpenSistemas Corporate Presentation
OpenSistemas Corporate PresentationOpenSistemas Corporate Presentation
OpenSistemas Corporate Presentation
 
Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)Data Virtualization: Introduction and Business Value (UK)
Data Virtualization: Introduction and Business Value (UK)
 
Denodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data ServicesDenodo Design Studio: Modeling and Creation of Data Services
Denodo Design Studio: Modeling and Creation of Data Services
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Self service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot shortSelf service BI with sql server 2008 R2 and microsoft power pivot short
Self service BI with sql server 2008 R2 and microsoft power pivot short
 
USQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake EventUSQL Trivadis Azure Data Lake Event
USQL Trivadis Azure Data Lake Event
 
SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview SMAC - Social, Mobile, Analytics and Cloud - An overview
SMAC - Social, Mobile, Analytics and Cloud - An overview
 
Neelima_Resume
Neelima_ResumeNeelima_Resume
Neelima_Resume
 
Technology Overview
Technology OverviewTechnology Overview
Technology Overview
 
Apache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azureApache hadoop for windows server and windwos azure
Apache hadoop for windows server and windwos azure
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
 

Más de BlueMetalInc

Field enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob FamiliarField enablement roadshow keynote - Bob Familiar
Field enablement roadshow keynote - Bob FamiliarBlueMetalInc
 
Field Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt BienfangField Enablement Business Drivers - Matt Bienfang
Field Enablement Business Drivers - Matt BienfangBlueMetalInc
 
Field enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John PelakField enablement roadshow - Real World Solutions - John Pelak
Field enablement roadshow - Real World Solutions - John PelakBlueMetalInc
 
BlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 SecondsBlueMetal - Our Company Culture in 30 Seconds
BlueMetal - Our Company Culture in 30 SecondsBlueMetalInc
 
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...
Automating Site Provisioning in SharePoint - Presented 7/27/13 at SharePoint ...BlueMetalInc
 
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...
Apps 101 - Moving to the SharePoint 2013 App Model - Presented 7/27/13 at Sha...BlueMetalInc
 
20130427 What's Your Social IQ?
20130427 What's Your Social IQ?20130427 What's Your Social IQ?
20130427 What's Your Social IQ?BlueMetalInc
 
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 Search
20130427 - Turbocharge SharePoint 2010 with SharePoint 2013 SearchBlueMetalInc
 
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010
Turbo-Charge Collaboration by Automating Site Provisioning in SharePoint 2010BlueMetalInc
 
Empowering business users with hybrid solutions
Empowering business users with hybrid solutionsEmpowering business users with hybrid solutions
Empowering business users with hybrid solutionsBlueMetalInc
 

Más de BlueMetalInc (10)

Field enablement roadshow keynote - Bob Familiar
What is Big Data? The 5 V's and Real World Use Cases

  • 2. What is Big Data? A new generation of technologies and architectures designed to economically extract value from very large volumes of a wide variety of data, by enabling high velocity capture, discovery and/or analysis
  • 3. Big Data’s impact can be expressed by the Five V’s: Velocity, Variety, Volume, Visualization, and Value
  • 4. Demo Context
      ▪ E-commerce site fed by outsourced ad servers
      ▪ Ads appear on a wide range of sites with various offers
      ▪ A massive amount of data is generated by these servers:
          • Web logs and clickstream data from the e-commerce site
          • Ad logs and clickstream data from the ad servers
          • Resulting relational transactions on the site
      ▪ Goal: maximize traffic analysis for business value
          • Velocity demo: pinpoint activity in real time and react
          • Variety demo: examine historical trends across sources
          • Visualization demo: enable ad-hoc data analysis for insights
  • 5. Velocity Architecture: How to identify when ad clicks result in site traffic?
      [Diagram labels: WEB SERVERS, LOG FILES, AD SERVERS]
      ▪ A high-volume stream of log activity is coming in:
          • Web logs and ad server logs
      ▪ Real-time stream analysis allows for pinpointing data when it happens
      ▪ Simultaneously join structured and unstructured data in a persistent query
      ▪ Can be used for A/B testing, offer improvement, dynamic site behavior, or fraud detection
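The idea of a persistent query that joins ad clicks to site visits as events arrive can be sketched in plain Python. This is only an illustration, not the StreamInsight query used in the demo: the `Event` type, the `correlate` function, the 30-second window, and the choice of the cookie as the join key are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float      # event time in seconds (hypothetical field)
    source: str    # "ad" for ad server logs, "web" for site logs
    cookie: str    # assumed join key shared by both log types


def correlate(events, window_seconds=30.0):
    """Yield (ad_click, site_visit) pairs that share a cookie and occur
    within window_seconds of each other.

    Assumes events arrive in timestamp order, as in a merged log stream.
    """
    recent_clicks = {}  # cookie -> deque of ad clicks, oldest first
    for ev in events:
        if ev.source == "ad":
            recent_clicks.setdefault(ev.cookie, deque()).append(ev)
        elif ev.source == "web":
            clicks = recent_clicks.get(ev.cookie)
            if not clicks:
                continue
            # Drop ad clicks that fell out of the time window.
            while clicks and ev.ts - clicks[0].ts > window_seconds:
                clicks.popleft()
            if clicks:
                # Attribute the visit to the most recent ad click.
                yield clicks[-1], ev


stream = [
    Event(0.0, "ad", "a"),
    Event(10.0, "web", "a"),    # within 30 s of the click: matched
    Event(100.0, "web", "a"),   # click has expired: not matched
    Event(105.0, "web", "b"),   # no ad click for this cookie
]
pairs = list(correlate(stream))
```

A real stream engine would also evict state for cookies that never reach the site; the sketch skips that for brevity.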
  • 7. Variety Architecture: How to do historical analysis on unstructured data?
      [Diagram labels: WEB SERVERS, LOG FILES, AD SERVERS, M/R]
      ▪ Ad servers and web servers generate log files in different formats, making them hard to analyze
      ▪ Map/Reduce processing lets us execute a query across variant data formats stored in Hadoop
      ▪ Hive provides a traditional query interface to Map/Reduce
      ▪ Correlate and connect high-variety data for trend analysis
  • 8. Hive HQL Queries: Access Azure blob storage via a Hive “view” and aggregate session data

      -- External table over the raw, space-delimited logs in Azure blob storage
      CREATE EXTERNAL TABLE logs (
          date1    STRING,
          time1    STRING,
          action   STRING,
          page_uri STRING,
          cookie   STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
      STORED AS TEXTFILE
      LOCATION 'asv://logs/logs/';

      -- One summary row per (cookie, page_uri) pair
      CREATE TABLE log_summary AS
      SELECT l.cookie
            ,MAX(regexp_replace(l.cookie, '[-]', '') % 36) AS geo_hash
            ,MAX(l.time1) AS time1
            ,l.page_uri
            ,MAX(CASE LOWER(l.action) WHEN 'click' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS click_time
            ,MIN(CASE LOWER(l.action) WHEN 'view' THEN concat(l.date1, ' ', l.time1) ELSE NULL END) AS view_time
            ,MAX(l.date1) AS date1
      FROM logs l
      GROUP BY l.cookie, l.page_uri;
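To make the grouping in the `log_summary` query easier to follow, here is a rough Python equivalent run on a few invented log lines. It is a sketch under assumptions: the sample lines and the `summarize` helper are made up for illustration, each line is assumed to hold exactly the five space-delimited fields of the `logs` table, and the `geo_hash` column is left out.

```python
from collections import defaultdict


def summarize(lines):
    """Group space-delimited log lines by (cookie, page_uri), mirroring the
    GROUP BY in the Hive query (geo_hash is intentionally omitted)."""
    groups = defaultdict(list)
    for line in lines:
        date1, time1, action, page_uri, cookie = line.split(" ")
        groups[(cookie, page_uri)].append((date1, time1, action.lower()))

    summary = {}
    for key, rows in groups.items():
        clicks = [f"{d} {t}" for d, t, a in rows if a == "click"]
        views = [f"{d} {t}" for d, t, a in rows if a == "view"]
        summary[key] = {
            "time1": max(t for _, t, _ in rows),
            "click_time": max(clicks) if clicks else None,  # latest click
            "view_time": min(views) if views else None,     # earliest view
            "date1": max(d for d, _, _ in rows),
        }
    return summary


# Invented sample data in the assumed five-field layout.
sample = [
    "2013-01-01 10:00:00 view /home c1",
    "2013-01-01 10:00:05 CLICK /home c1",
    "2013-01-01 10:01:00 view /home c1",
    "2013-01-02 09:00:00 view /cart c2",
]
result = summarize(sample)
```

As in the query, a group with no click (or no view) gets a null value for that column rather than being dropped.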
  • 10. Hadoop Ecosystem Overview
      Hadoop is an open source framework for building large-scale, distributed, data-intensive applications
      • Hadoop is HDFS, the kernel & M/R
      • MapReduce brings the code to the data
      • An open set of tools exists to extend its functional uses and representations
  • 11. Map/Reduce Distributes Processing of Operations
      The "Map" step: The mappers are responsible for reading the input data and emitting key/value pairs. The input file can be CSV, XML, or any format as long as it can be converted into k/v pairs.
      The "Reduce" step: Each reducer executes a function on all values for a given key. The framework ensures that all values for the same key are sent to the same reducer.
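The two steps can be illustrated with a single-process Python sketch. It is not Hadoop code: `run_job`, `page_mapper`, `count_reducer`, and the two-column CSV input are hypothetical names and data chosen for the example, and the in-memory `shuffle` stands in for the grouping a real framework performs across machines.

```python
from collections import defaultdict


def page_mapper(line):
    """Map step: turn one CSV record ("action,page") into key/value pairs."""
    action, page = line.split(",")
    yield page, 1


def count_reducer(key, values):
    """Reduce step: runs once per key with all values emitted for that key."""
    return sum(values)


def shuffle(pairs):
    """Group values by key so each key's values reach a single reducer call."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def run_job(records, mapper, reducer):
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    mapped = (pair for record in records for pair in mapper(record))
    grouped = shuffle(mapped)
    return {key: reducer(key, values) for key, values in grouped.items()}


hits = run_job(["view,/home", "click,/home", "view,/cart"],
               page_mapper, count_reducer)
```

Swapping in a different mapper is all it takes to handle another input format, which is the property the variety demo relies on.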
  • 12. Visualization Architecture: How to do ad-hoc data discovery and visualizations?
      [Diagram labels: WEB SERVERS, LOG FILES, AD SERVERS, M/R]
      ▪ Ad servers and web servers generate log files in different formats, making them hard to analyze
      ▪ Map/Reduce processing lets us execute a query across variant data formats stored in Hadoop
      ▪ Hive provides a traditional query interface to Map/Reduce
      ▪ Correlate and connect high-variety data for trend analysis
  • 13. DEMO: Excel & Hive Adapter
  • 14. Summary
      ▪ Big Data & Analytics projects are often additive
          • New capabilities layered on top of existing data & apps
          • Analytics can drive applications in new ways
      ▪ Visualizations put Big Data in the hands of the business
  • 15. We are BlueMetal Architects
  • 16. Take the next steps – Imagine, Define, Build
  • 17. Take the next steps: our offerings
      ▪ Envisioning & Strategy Briefing: Big Data, Analytics & Collaboration
      ▪ Envisioning Session: Data is the App – Envisioning the Next Generation, Data Driven Enterprise
      ▪ Architecture Design Session: Big Data & Analytics
      ▪ Healthcare / Life Sciences: Strategy Briefing or Architecture Design Session – Big Data Architecture, Cloud & Use Case Driven Analytics and applications, Portal, M-Health and UX design for Providers, Patients, Pharma & Biotechnology
      ▪ Financial Services: Strategy Briefing or Architecture Design Session – Big Data & Analytics for Banking, Capital Markets, Retail Brokerage or Insurance
  • 19. Who We Are
      [Diagram tiers: Differentiation: DESIGN · Specialization: UX, DATA, SOCIAL · Foundation: CODE]
  • 20. Who We Are
      [Diagram tiers, expanded]
      Differentiation (DESIGN): Strategy, Analysis, Creative
      Specialization (UX): Desktop, Mobile, Web Client
      Specialization (DATA): Analytics, Big Data, Core SQL
      Specialization (SOCIAL): Web Content, Intranets, Collaboration
      Foundation: .NET, SERVICES, On-Premise, Java, PPP, Cloud