UNIFIED DATA ARCHITECTURE

Chris Hillman
Teradata Principal Data Scientist
Need for a Unified Data Architecture for New Insights
Enabling Any User for Any Data Type from Data Capture to Analysis




          Java, C/C++, Python, R, SAS, SQL, Excel, BI, Visualization

          Discover and Explore          Reporting and Execution in the Enterprise

                         Capture, Store and Refine

     Audio/Video   Images   Docs   Text   Web & Social   Machine Logs   CRM   SCM   ERP



 2       4/23/12                           Teradata Confidential
UNIFIED DATA ARCHITECTURE
                  Data Scientists                Quants                Customers / Partners      Front-Line Workers
                        Engineers          Business Analysts                Executives           Operational Systems




                       LANGUAGES         MATH & STATS        DATA MINING      BUSINESS INTELLIGENCE   APPLICATIONS




                                         DISCOVERY                                      INTEGRATED
                                         PLATFORM                                     DATA WAREHOUSE




       AUDIO & VIDEO       IMAGES             TEXT          WEB & SOCIAL    MACHINE LOGS       CRM           SCM       ERP




3   Confidential and proprietary. Copyright © 2012 Teradata Corporation.
Requirements for an Integrated Data Warehouse
• Single View of Your Business
• Cross-Functional Analysis
• Shared Source for Analytics
• Load Once, Use Many Times
• Highest Business Value
• Lowest Total Cost of Ownership
• Fastest Time-to-Market For New Apps

          Customers/Partners   Marketing   Business Analysts   Front-line Workers
          Executives   Knowledge Workers   Operational Systems

          BUSINESS INTELLIGENCE         DATA MINING           APPLICATIONS

                         INTEGRATED DATA WAREHOUSE

Requirements of a Discovery Platform

DATA SOURCES
  Non-Relational Data
  Multi-Structured Data
  Structured Data
  OLTP DBMS’s

DISCOVERY
  Discovery Platform
  • Structured and multi-structured data
  • Doesn’t require extensive data modeling
  • Doesn’t balance the books
  • Data completeness can be good enough
  • No stringent SLAs

DISCOVERY TOOLS
  SQL
  MapReduce
  Statistical Functions
  • Fraud patterns
  • Customer behavior
  • Digital marketing optimization
  • Supply chain and supply line sensors

USERS
  Data Scientist
  Business Analyst

UNIFIED DATA ARCHITECTURE
                  Data Scientists                Quants                Customers / Partners        Front-Line Workers
                        Engineers          Business Analysts                 Executives            Operational Systems




                       LANGUAGES         MATH & STATS        DATA MINING        BUSINESS INTELLIGENCE   APPLICATIONS




    Big Data Analytics                   DISCOVERY                                         INTEGRATED
                                         PLATFORM                                        DATA WAREHOUSE




                                                                                                  Big Data Management
                                                              CAPTURE | STORE | REFINE




       AUDIO & VIDEO       IMAGES             TEXT          WEB & SOCIAL      MACHINE LOGS       CRM           SCM       ERP




TERADATA UNIFIED DATA ARCHITECTURE

                    Data Scientists         Quants                Customers / Partners        Front-Line Workers
                        Engineers      Business Analysts               Executives             Operational Systems




                        LANGUAGES     MATH & STATS    DATA MINING        BUSINESS INTELLIGENCE         APPLICATIONS


                                                                                                                      Productionize
                                                                                                            Analytic Score with Path Variable
 Golden Path Application Submit                                                                                      Event Triggers
    Fraud Sentiment Analysis                                                                                     Marketing Integration
Multi-Channel Customer Behavior                                                                               Customer Behavior Analysis
        Channel Hopping                                                                                           MySpending Report
          Attrition Paths                                                                                       Customer Segmentation
        Fraudulent Paths                                                                                          Credit Risk Analysis
  Digital Marketing Attribution       DISCOVERY                                     INTEGRATED                   Customer profitability
                                      PLATFORM                                    DATA WAREHOUSE                    Portfolio Analysis



                                                          Consumerization
                                                           Sessionization
                                                     Cross Platform Aggregation




                                                       CAPTURE | STORE | REFINE




           E-MAIL         STORE SVP      SURVEY         ON-LINE         BRANCH DATA      CALL CENTER          ATM          PROFILE
TERADATA UNIFIED DATA ARCHITECTURE
                  Data Scientists                Quants                Customers / Partners        Front-Line Workers
                        Engineers          Business Analysts                 Executives            Operational Systems




                       LANGUAGES         MATH & STATS        DATA MINING        BUSINESS INTELLIGENCE   APPLICATIONS




                                         DISCOVERY                                         INTEGRATED
                                         PLATFORM                                        DATA WAREHOUSE




                                                      SQL-H




                                                              CAPTURE | STORE | REFINE




       AUDIO & VIDEO       IMAGES            TEXT          WEB & SOCIAL     MACHINE LOGS       CRM           SCM       ERP
SQL-H In Action
Join Teradata, Hadoop, Aster tables; feed into Map Reduce
-- SQL manipulation for calculation
SELECT qrd_focus_area, count(*)
FROM nPath(
    ON (
      SELECT * FROM
            -- TD Connector to get OWNERSHIP data
            ( SELECT * FROM load_from_teradata(
                ON mr_driver          TDPID('dbc')
                USERNAME('name1')     PASSWORD('password1')
                QUERY('SELECT * FROM owner.prod_own_fact') ) ) AS td
      -- Include local Aster tables in JOIN
      JOIN owner.prod_dim proddim ON td.prod_id = proddim.product_id
      JOIN
            -- Hadoop Connector to get WARRANTY data
            ( SELECT * FROM load_from_hadoop(
                ON mr_driver SERVER ('10.10.3.139')
                USERNAME ('name2') DBNAME ('repair')
                TABLENAME ('transaction') ) ) AS sqlh
      ON sqlh.prod_ident_nbr = proddim.id )
    PARTITION BY party_id, prod_id ORDER BY repair_dt
    -- Any path you want, specified with the power of regular expressions!
    MODE (OVERLAPPING)
    PATTERN ( 'REPAIR{3}' )
    SYMBOLS ( event = 'REPAIR' AS REPAIR )
    RESULT (ACCUMULATE(qrd_focus_area OF ANY(REPAIR)) AS qrd_focus_area_path )
) n
GROUP BY 1 ORDER BY 2 desc ;
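
To make the PATTERN ('REPAIR{3}') clause above more concrete, here is a minimal Python sketch of what overlapping matching over partitioned, ordered rows amounts to. It is an illustration only, not how Aster implements nPath. The function names, the row layout, and the sample values are hypothetical; only the column names (party_id, prod_id, repair_dt, event, qrd_focus_area) are taken from the query.

```python
from collections import defaultdict


def repair_paths(rows, run_length=3):
    """Return the qrd_focus_area path of every run of `run_length`
    consecutive REPAIR events inside one (party_id, prod_id) partition."""
    # PARTITION BY party_id, prod_id
    partitions = defaultdict(list)
    for row in rows:
        partitions[(row["party_id"], row["prod_id"])].append(row)

    paths = []
    for key in sorted(partitions):
        # ORDER BY repair_dt (ISO date strings sort chronologically)
        ordered = sorted(partitions[key], key=lambda r: r["repair_dt"])
        # MODE (OVERLAPPING): try to start a match at every row
        for start in range(len(ordered) - run_length + 1):
            window = ordered[start:start + run_length]
            if all(r["event"] == "REPAIR" for r in window):
                # ACCUMULATE(qrd_focus_area OF ANY(REPAIR))
                paths.append([r["qrd_focus_area"] for r in window])
    return paths


def make_row(party_id, prod_id, repair_dt, event, area):
    """Build one hypothetical joined row."""
    return {"party_id": party_id, "prod_id": prod_id, "repair_dt": repair_dt,
            "event": event, "qrd_focus_area": area}


# Hypothetical sample data: four repairs in a row, then an interrupted run.
sample_rows = [
    make_row(1, 10, "2012-01-05", "REPAIR", "engine"),
    make_row(1, 10, "2012-02-11", "REPAIR", "engine"),
    make_row(1, 10, "2012-03-02", "REPAIR", "brakes"),
    make_row(1, 10, "2012-04-20", "REPAIR", "brakes"),
    make_row(2, 20, "2012-01-09", "REPAIR", "engine"),
    make_row(2, 20, "2012-02-14", "SERVICE", "engine"),
    make_row(2, 20, "2012-03-30", "REPAIR", "engine"),
]
sample_paths = repair_paths(sample_rows)
```

In this sketch the first partition yields two overlapping matches and the second yields none, because a non-REPAIR event interrupts the run.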
TERADATA UNIFIED DATA ARCHITECTURE
                     Data Scientists                Quants                Customers / Partners        Front-Line Workers
                           Engineers          Business Analysts                 Executives            Operational Systems




     VIEWPOINT            LANGUAGES         MATH & STATS        DATA MINING        BUSINESS INTELLIGENCE      APPLICATIONS      SUPPORT




                                            DISCOVERY              Aster Teradata             INTEGRATED
                                            PLATFORM                 Connector              DATA WAREHOUSE




                     Aster Connector for                 SQL-H                                             Teradata Connector
                           Hadoop                                                                              for Hadoop


             Aster Loader                                                                                           Teradata Loader
                                                                 CAPTURE | STORE | REFINE




          AUDIO & VIDEO         IMAGES            TEXT          WEB & SOCIAL     MACHINE LOGS       CRM             SCM         ERP
When to Use Which?
The best approach by workload and data type
Processing as a Function of Schema Requirements and Stage of Data Pipeline

Stage of data pipeline                                          Stable Schema      Evolving Schema                      Format, No Schema
Low Cost Storage and Fast Loading                               Teradata/Hadoop    Hadoop                               Hadoop
Data Pre-Processing, Refining, Cleansing                        Teradata           Aster / Hadoop                       Hadoop
“Simple math at scale” (Score, filter, sort, avg., count...)    Teradata           Aster / Hadoop                       Hadoop
Joins, Unions, Aggregates                                       Teradata           Aster                                Aster
Analytics (Iterative and data mining)                           Teradata           Aster                                Aster
Reporting                                                       Teradata           Aster (SQL + MapReduce Analytics)    Aster (MapReduce Analytics)

Stable Schema
  • Financial Analysis, Ad-Hoc/OLAP
  • Enterprise-Wide BI and Reporting
  • Spatial/Temporal
  • Active Execution

Evolving Schema
  • Interactive Data Discovery
  • Web Clickstream, Set-Top Box Analysis
  • CDRs, Sensor Logs, JSON

Format, No Schema
  • Social Feeds, Text, Image Processing
  • Audio/Video Storage and Refining
  • Storage and Batch Transformations

When to Use Which?
The best approach by workload and data type
Processing as a Function of Schema Requirements and Stage of Data Pipeline

Stage of data pipeline                                          Stable Schema      Evolving Schema                      Format, No Schema
Low Cost Storage and Fast Loading                               Teradata/Hadoop    Hadoop                               Hadoop
Data Pre-Processing, Refining, Cleansing                        Teradata           Aster / Hadoop                       Hadoop
“Simple math at scale” (Score, filter, sort, avg., count...)    Teradata           Aster / Hadoop                       Hadoop
Joins, Unions, Aggregates                                       Teradata           Aster                                Aster
Analytics (Iterative and data mining)                           Teradata           Aster                                Aster
Reporting                                                       Teradata           Aster (SQL + MapReduce Analytics)    Aster (MapReduce Analytics)

UDA IN PRACTICE
IPTV QUALITY OF SERVICE
Starting point: Complaints Data




Churners – and data quality




What events lead up to a reboot?



 Note number of paths with a reboot, following another reboot!

      CREATE dimension table wrk.npath_reboot_5events
      AS SELECT path, COUNT(*) AS path_count
      FROM nPath
             (ON wrk.w_event_f
              PARTITION BY srv_id
              ORDER BY evt_ts desc
              MODE (NONOVERLAPPING )
              PATTERN ('X{0,5}.reboot')
              SYMBOLS
                    (true as X,
                     evt_name = 'REBOOT' AS reboot)
              RESULT
                    (FIRST( srv_id OF X) AS srv_id,
                     ACCUMULATE (evt_name OF ANY (X,reboot)) AS path)
             ) GROUP BY 1 ;

      SELECT *
      FROM GraphGen (ON
                    (SELECT * from wrk.npath_reboot_5events
                     ORDER BY path_count
                     LIMIT 30 )
              PARTITION BY 1
              ORDER BY path_count desc
              item_format('npath')
              item1_col('path')
              score_col('path_count')
              output_format('sankey')
              justify('right'));
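
The idea behind the nPath query above (collect up to five events around each reboot, then count identical paths) can be sketched in plain Python. This is a simplified approximation, not a reproduction of nPath's matching semantics: events are ordered chronologically here so that each path reads as the events leading up to a reboot, and each reboot simply closes the current path. The function name, the tuple layout, and the event names other than REBOOT are hypothetical.

```python
from collections import Counter, defaultdict


def reboot_paths(events, max_prior=5):
    """Count paths made of up to `max_prior` events followed by a REBOOT.

    `events` is an iterable of (srv_id, evt_ts, evt_name) tuples.
    """
    # PARTITION BY srv_id
    by_service = defaultdict(list)
    for srv_id, evt_ts, evt_name in events:
        by_service[srv_id].append((evt_ts, evt_name))

    counts = Counter()
    for service_events in by_service.values():
        service_events.sort()  # chronological order (assumption of this sketch)
        names = [name for _, name in service_events]
        next_free = 0  # first event not yet used by an earlier path
        for i, name in enumerate(names):
            if name == "REBOOT":
                start = max(next_free, i - max_prior)
                counts[tuple(names[start:i + 1])] += 1
                next_free = i + 1  # non-overlapping: do not reuse events
    return counts


# Hypothetical sample: one service reboots twice in a row.
sample_events = [
    ("s1", 1, "SYNC_LOSS"), ("s1", 2, "STB_ERROR"), ("s1", 3, "REBOOT"),
    ("s1", 4, "REBOOT"),
    ("s2", 1, "SYNC_LOSS"), ("s2", 2, "STB_ERROR"), ("s2", 3, "REBOOT"),
]
sample_counts = reboot_paths(sample_events)
```

A path consisting of a single REBOOT, as produced for the second reboot of `s1`, is the kind of reboot-after-reboot path that the note on this slide points out.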




View events data in Tableau




 Looks like an issue with the data on 30th September and beyond: the reboot data for October seems to have been aggregated and added to 30th September.




Address data quality
 • Remove paths with all reboots and exclude data from 30th September

 It would appear that events with suffix 1 and 2 can be added together.
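
The two clean-up steps on this slide can be sketched as simple filters. This is a minimal illustration under stated assumptions: the function names, the tuple layout, and the sample values are hypothetical, and the year in the sample dates is assumed because the slides do not give one.

```python
from datetime import date, datetime


def drop_all_reboot_paths(path_counts):
    """Remove paths in which every event is a REBOOT."""
    return {path: count for path, count in path_counts.items()
            if not all(evt == "REBOOT" for evt in path)}


def exclude_day(events, bad_day):
    """Drop (srv_id, evt_ts, evt_name) events whose timestamp falls on `bad_day`."""
    return [event for event in events if event[1].date() != bad_day]


# Hypothetical path counts and events; the year 2012 is an assumption.
sample_path_counts = {
    ("REBOOT", "REBOOT"): 40,
    ("SYNC_LOSS", "REBOOT"): 12,
}
sample_events = [
    ("s1", datetime(2012, 9, 29, 10, 0), "SYNC_LOSS"),
    ("s1", datetime(2012, 9, 30, 0, 0), "REBOOT"),
    ("s2", datetime(2012, 9, 28, 8, 30), "REBOOT"),
]

clean_paths = drop_all_reboot_paths(sample_path_counts)
clean_events = exclude_day(sample_events, date(2012, 9, 30))
```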




Visualise as a Graph using Aster GraphGen

 Size of Node = number of customers
 Width of Edge = number of errors

 SELECT *
 FROM graphgen
   (ON
       (SELECT DISTINCT dmt_act_dslam,
               nra_id,
               nbr_of_srvid,
               errorspersrv,
               nbr_of_dslam
        FROM wrk.srvid_dslam_err)
    PARTITION BY 1
    ORDER BY errorspersrv
    item_format('cfilter')
    item1_col('dmt_act_dslam')
    item2_col('nra_id')
    score_col('errorspersrv')
    cnt1_col('nbr_of_srvid')
    cnt2_col('nbr_of_dslam')
    output_format('sigma')
    directed('false')
    width_max(10)
    width_min(1)
    nodesize_max (3)
    nodesize_min (1));
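
As a rough illustration of how a score column could drive edge width between the width_min and width_max bounds used above, here is a small Python sketch. How GraphGen maps scores to widths internally is not stated in the slides; the linear scaling below is an assumption, and the function names and sample rows are hypothetical. Node sizing is left out for brevity.

```python
def scale_linear(value, lo, hi, out_min, out_max):
    """Map `value` from the range [lo, hi] onto [out_min, out_max]."""
    if hi == lo:
        return float(out_max)  # all scores equal: use a single width
    return out_min + (value - lo) * (out_max - out_min) / (hi - lo)


def build_edges(rows, width_min=1, width_max=10):
    """Turn (DSLAM, NRA, errors-per-service) rows into weighted edges."""
    scores = [row["errorspersrv"] for row in rows]
    lo, hi = min(scores), max(scores)
    edges = []
    for row in rows:
        width = scale_linear(row["errorspersrv"], lo, hi, width_min, width_max)
        edges.append((row["dmt_act_dslam"], row["nra_id"], width))
    return edges


# Hypothetical rows shaped like wrk.srvid_dslam_err.
sample_rows = [
    {"dmt_act_dslam": "dslam_a", "nra_id": "nra_1", "errorspersrv": 0},
    {"dmt_act_dslam": "dslam_b", "nra_id": "nra_1", "errorspersrv": 5},
    {"dmt_act_dslam": "dslam_c", "nra_id": "nra_2", "errorspersrv": 10},
]
sample_edges = build_edges(sample_rows)
```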




Synch Issues by Hub Type




Error and Complaint rates by equipment type




UDA IN PRACTICE
PREDICTIVE MODELS
Input Data
 create table wrk.cih_dshb_ads as
 SELECT srv_id, sav_flag, offer, inseecode, code_postal, libelle, nom_dep, nom_region, longitude, latitude,
            coalesce(topo_nra, 'Unknown') as topo_nra, topo_dslam, coalesce(iad_hardwareversion, 'Unknown') as iad_hardwareversion,
            coalesce(iad_manufacturer, 'Unknown') as iad_manufacturer,
            coalesce(iad_modelname , 'Unknown') as iad_modelname,
            coalesce(iad_modemfirmwareversion , 'Unknown') as iad_modemfirmwareversion,
            coalesce(iad_productclass , 'Unknown') as iad_productclass,
            coalesce(iad_provisioningcode , 'Unknown') as iad_provisioningcode,
            coalesce(iad_softwareversion , 'Unknown') as iad_softwareversion,
            coalesce(iad_vendorconfigfiledescription_1 , 'Unknown') as iad_vendorconfigfiledescription_1,
            coalesce(iad_vendorconfigfilename_1 , 'Unknown') as iad_vendorconfigfilename_1,
            coalesce(iad_vendorconfigfilenumbofentries , 0) as iad_vendorconfigfilenumbofentries,
            coalesce(iad_vendorconfigfileversion_1 , 'Unknown') as iad_vendorconfigfileversion_1,
            coalesce(iad_x_000e50_boardversion , 'Unknown') as iad_x_000e50_boardversion,
            coalesce(stb_description , 'Unknown') as stb_description,
            coalesce(stb_devicestatus , 'Unknown') as stb_devicestatus,
            coalesce(stb_gwinfoproductclass , 'Unknown') as stb_gwinfoproductclass,
            coalesce(stb_hardwareversion , 'Unknown') as stb_hardwareversion,
            coalesce(stb_manufacturer , 'Unknown') as stb_manufacturer,
            coalesce(stb_productclass , 'Unknown') as stb_productclass,
            coalesce( stb_softwareversion, 'Unknown') as stb_softwareversion,
            dev_iad_uptime_diff,dsl_showtime_diff,dev_stb_uptime_diff,
            kpi_iad_uptime,kpi_iad_synctime,kpi_stb_uptime,
            dev_iad_uptime,dsl_showtime,dev_stb_uptime,
            dsl_downstr_att,dsl_downstr_cur,dsl_downstr_max,
            kpi_voip_nb_dropped_calls_diff,kpi_voip_nb_dropped_calls,kpi_dsl_nb_crc,kpi_dsl_dscurrate_ratio_qualite,
            kpi_voip_tx_appels_coupes,kpi_voip_qualite,kpi_voip_qualite_diff,kpi_iptv_plr_nb_bon,kpi_iptv_plr_nb_moyen,
            kpi_iptv_conso_heures,kpi_iptv_packetslosts,kpi_iptv_packetsreceived, kpi_dsl_dscurrate_before,kpi_dsl_dscurrate_after
   FROM wrk.cih_dshb_bis
  where network = 'BYT'
 and stb_manufacturer is not null
 and topo_dslam is not null
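
The preparation logic of the query above (filter rows, then replace missing text attributes with 'Unknown' and missing counts with 0) can be sketched in Python as follows. This is a minimal illustration; the function names and the sample rows are hypothetical, and only a few of the columns from the query are shown.

```python
def keep_row(row):
    """Mirror the WHERE clause: BYT network, known STB manufacturer and DSLAM."""
    return (row.get("network") == "BYT"
            and row.get("stb_manufacturer") is not None
            and row.get("topo_dslam") is not None)


def fill_missing(row, text_cols, numeric_cols):
    """Mirror the coalesce() calls: 'Unknown' for text, 0 for numeric columns."""
    cleaned = dict(row)
    for col in text_cols:
        if cleaned.get(col) is None:
            cleaned[col] = "Unknown"
    for col in numeric_cols:
        if cleaned.get(col) is None:
            cleaned[col] = 0
    return cleaned


TEXT_COLS = ["topo_nra", "iad_manufacturer", "stb_softwareversion"]
NUMERIC_COLS = ["iad_vendorconfigfilenumbofentries"]

# Hypothetical input rows.
raw_rows = [
    {"srv_id": 1, "network": "BYT", "stb_manufacturer": "acme", "topo_dslam": "d1",
     "topo_nra": None, "iad_manufacturer": "acme", "stb_softwareversion": None,
     "iad_vendorconfigfilenumbofentries": None},
    {"srv_id": 2, "network": "BYT", "stb_manufacturer": None, "topo_dslam": "d2",
     "topo_nra": "n2", "iad_manufacturer": "acme", "stb_softwareversion": "1.0",
     "iad_vendorconfigfilenumbofentries": 3},
]

prepared_rows = [fill_missing(r, TEXT_COLS, NUMERIC_COLS)
                 for r in raw_rows if keep_row(r)]
```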


Decision Trees
 SELECT *
 FROM forest_drive
 (ON (SELECT 1)
  PARTITION BY 1
  DATABASE('beehive')
  USERID('beehive')
  PASSWORD('beehive')
  INPUTTABLE('wrk.cih_dshb_tree_in')
  OUTPUTTABLE('wrk.cih_dshb_tree_out')
  RESPONSE('sav_flag')
  NUMERICINPUTS('KPI_SIGNAL')
  CATEGORICALINPUTS('offer', 'nom_dep', 'nom_region',
 'topo_nra','topo_dslam' , 'iad_modemfirmwareversion',
 'iad_vendorconfigfiledescription_1', 'iad_x_000e50_boardversion',
 'stb_description', 'stb_productclass', 'stb_softwareversion',
 'topo_dslam_brand')
  NUMTREES(4)
 )

Naïve Bayes
 CREATE TABLE wrk.cih_dshb_model (PARTITION KEY(class)) AS
 SELECT * FROM naiveBayesReduce(
   ON(SELECT * FROM naiveBayesMap(
       ON (select * from wrk.cih_dshb_ads_in_11 where kpi_iad_uptime is not null)
       RESPONSE('sav_flag')
       NUMERICINPUTS('dev_iad_uptime','dsl_showtime','dev_stb_uptime',
 'dsl_downstr_att','dsl_downstr_cur','dsl_downstr_max',
 'kpi_voip_nb_dropped_calls_diff','kpi_voip_nb_dropped_calls','kpi_dsl_nb_crc',
 'kpi_dsl_dscurrate_ratio_qualite','kpi_voip_tx_appels_coupes','kpi_voip_qualite',
 'kpi_voip_qualite_diff','kpi_iptv_plr_nb_bon','kpi_iptv_plr_nb_moyen','kpi_iptv_plr_nb_mauvais',
 'kpi_iptv_packetslosts','kpi_iptv_packetsreceived','kpi_stb_uptime','kpi_iad_synctime',
 'kpi_iad_uptime')
       CATEGORICALINPUTS('offer', 'nom_dep', 'nom_region', 'topo_nra','topo_dslam' ,
 'iad_modemfirmwareversion','iad_vendorconfigfiledescription_1','iad_x_000e50_boardversion',
 'stb_description', 'stb_productclass', 'stb_softwareversion', 'topo_dslam_brand')
    )
   )PARTITION BY class
 );




Support Vector Machine
 create table wrk.cih_svm_train2 distribute by hash(srv_id) as
 select srv_id, 'topo_nra_insee' as attr, topo_nra_insee::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'code_postal' as attr, code_postal::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'kpi_iad_uptime_avg' as attr, kpi_iad_uptime_avg::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'dev_iad_uptime_diff_avg' as attr, dev_iad_uptime_diff_avg::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'kpi_voip_nb_dropped_calls_diff_avg' as attr, kpi_voip_nb_dropped_calls_diff_avg::varchar as
 attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'sav_nb_contacts' as attr, sav_nb_contacts::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train union all
 select srv_id, 'nb_tr' as attr, nb_tr::varchar as attr_value, sav_all_tgt FROM wrk.cih_sav_train union all
 select srv_id, 'kpi_dsl_nb_crc_avg' as attr, kpi_dsl_nb_crc_avg::varchar as attr_value, sav_all_tgt
 FROM wrk.cih_sav_train;
 /*Run SVM*/

 CREATE TABLE wrk.cih_svm_model3 (PARTITION KEY(vec_index)) AS
 SELECT vec_index, avg(vec_value) as vec_value FROM
 svm( ON wrk.cih_svm_train2
 PARTITION BY srv_id
 OUTCOME( 'sav_flag' )
 ATTRIBUTE_NAME( 'attr' )
 ATTRIBUTE_VALUE( 'attr_value' )
 )GROUP BY vec_index;
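
The chain of UNION ALL selects above reshapes a wide training table into one row per (srv_id, attribute, value), which is the layout the svm call reads through ATTRIBUTE_NAME and ATTRIBUTE_VALUE. A minimal Python sketch of that reshaping follows. The function name and the sample rows are hypothetical; the column names come from the query.

```python
def unpivot(rows, id_col, target_col, attr_cols):
    """Turn each wide row into one (id, attr, attr_value, target) row per attribute."""
    long_rows = []
    for row in rows:
        for attr in attr_cols:
            long_rows.append({
                id_col: row[id_col],
                "attr": attr,
                "attr_value": str(row[attr]),  # mirrors the ::varchar cast
                target_col: row[target_col],
            })
    return long_rows


# Hypothetical wide rows shaped like wrk.cih_sav_train (two attributes only).
wide_rows = [
    {"srv_id": 1, "code_postal": 75001, "kpi_dsl_nb_crc_avg": 3.5, "sav_all_tgt": 1},
    {"srv_id": 2, "code_postal": 69002, "kpi_dsl_nb_crc_avg": 0.0, "sav_all_tgt": 0},
]

long_rows = unpivot(wide_rows, "srv_id", "sav_all_tgt",
                    ["code_postal", "kpi_dsl_nb_crc_avg"])
```

The row order differs from what the UNION ALL version would produce, which does not matter for a table that is later partitioned by srv_id.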


Lift Chart to View Predictive Model Performance
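
For readers unfamiliar with lift charts, here is a minimal Python sketch of how per-bucket lift is commonly computed: rank customers by model score, split them into equal-sized buckets, and divide each bucket's positive rate by the overall positive rate. The slides do not show how their chart was produced, so this is a generic illustration; the function name and the sample scores and labels are hypothetical, with the label standing in for a flag such as sav_flag.

```python
def lift_by_bucket(scores, labels, n_buckets=10):
    """Return the lift of each score-ranked bucket, best-scored bucket first."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n = len(labels)
    if n < n_buckets:
        raise ValueError("need at least one row per bucket")
    overall_rate = sum(labels) / n
    if overall_rate == 0:
        raise ValueError("no positive outcomes, lift is undefined")

    # Highest scores first; ties keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    lifts = []
    for b in range(n_buckets):
        start = b * n // n_buckets
        end = (b + 1) * n // n_buckets
        bucket = ranked[start:end]
        bucket_rate = sum(label for _, label in bucket) / len(bucket)
        lifts.append(bucket_rate / overall_rate)
    return lifts


# Hypothetical scores and outcomes for ten customers.
example_scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
example_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
example_lifts = lift_by_bucket(example_scores, example_labels, n_buckets=2)
```

A lift above 1 in the top buckets means the model concentrates positive cases among its highest-scored customers; a lift near 1 everywhere means it does no better than random ordering.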




Teradata Big Data London Seminar

  • 1. UNIFIED DATA ARCHITECTURE Chris Hillman Teradata Principal Data Scientist
  • 2. Need for a Unified Data Architecture for New Insights Enabling Any User for Any Data Type from Data Capture to Analysis Java, C/C++, Python, R, SAS, SQL, Excel, BI, Visualization Reporting and Execution Discover and Explore in the Enterprise Capture, Store and Refine Audio/ Web & Machine Images Docs Text CRM SCM ERP Video Social Logs 2 4/23/12 Teradata Confidential
  • 3. UNIFIED DATA ARCHITECTURE Data Scientists Quants Customers / Partners Front-Line Workers Engineers Business Analysts Executives Operational Systems LANGUAGES MATH & STATS DATA MINING BUSINESS INTELLIGENCE APPLICATIONS DISCOVERY INTEGRATED PLATFORM DATA WAREHOUSE AUDIO & VIDEO IMAGES TEXT WEB & SOCIAL MACHINE LOGS CRM SCM ERP 3 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 4. Requirements for an Integrated Data Warehouse Customers/Partners • Single View of Your Business Marketing Business Analysts Front-line Workers • Cross-Functional Analysis Executives Knowledge Workers Operational Systems • Shared Source for Analytics • Load Once, Use Many Times • Highest Business Value BUSINESS INTELLIGENCE DATA MINING APPLICATIONS • Lowest Total Cost of Ownership • Fastest Time-to-Market For New Apps INTEGRATED DATA WAREHOUSE 4 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 5. Requirements of a Discovery Platform. DATA SOURCES: Non-Relational Data, Multi-Structured Data, Structured Data, OLTP DBMS's. DISCOVERY PLATFORM: • Structured and multi-structured data • Doesn't require extensive data modeling • Doesn't balance the books • Data completeness can be good enough • No stringent SLAs. DISCOVERY: • Fraud patterns • Customer behavior • Digital marketing optimization • Supply chain and supply line sensors. DISCOVERY TOOLS: SQL, MapReduce, Statistical Functions. USERS: Data Scientist, Business Analyst. 5 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 6. UNIFIED DATA ARCHITECTURE Data Scientists Quants Customers / Partners Front-Line Workers Engineers Business Analysts Executives Operational Systems LANGUAGES MATH & STATS DATA MINING BUSINESS INTELLIGENCE APPLICATIONS Big Data Analytics DISCOVERY INTEGRATED PLATFORM DATA WAREHOUSE Big Data Management CAPTURE | STORE | REFINE AUDIO & VIDEO IMAGES TEXT WEB & SOCIAL MACHINE LOGS CRM SCM ERP 6 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 7. TERADATA UNIFIED DATA ARCHITECTURE Data Scientists Quants Customers / Partners Front-Line Workers Engineers Business Analysts Executives Operational Systems LANGUAGES MATH & STATS DATA MINING BUSINESS INTELLIGENCE APPLICATIONS Productionize Analytic Score with Path Variable Golden Path Application Submit Event Triggers Fraud Sentiment Analysis Marketing Integration Multi-Channel Customer Behavior Customer Behavior Analysis Channel Hopping MySpending Report Attrition Paths Customer Segmentation Fraudulent Paths Credit Risk Analysis Digital Marketing Attribution DISCOVERY INTEGRATED Customer profitability PLATFORM DATA WAREHOUSE Portfolio Analysis Consumerization Sessionization Cross Platform Aggregation CAPTURE | STORE | REFINE E-MAIL STORE SVP SURVEY ON-LINE BRANCH DATA CALL CENTER ATM PROFILE
  • 8. TERADATA UNIFIED DATA ARCHITECTURE Data Scientists Quants Customers / Partners Front-Line Workers Engineers Business Analysts Executives Operational Systems LANGUAGES MATH & STATS DATA MINING BUSINESS INTELLIGENCE APPLICATIONS DISCOVERY INTEGRATED PLATFORM DATA WAREHOUSE SQL-H CAPTURE | STORE | REFINE AUDIO & VIDEO IMAGES TEXT WEB & SOCIAL MACHINE LOGS CRM SCM ERP 8 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 9. SQL-H In Action: Join Teradata, Hadoop, Aster tables; feed into Map Reduce
      SELECT qrd_focus_area, count(*)              -- SQL manipulation for calculation
      FROM nPath(
        ON (
          SELECT * FROM (
            SELECT * FROM load_from_teradata(      -- TD Connector to get OWNERSHIP data
              ON mr_driver
              TDPID('dbc')
              USERNAME('name1')
              PASSWORD('password1')
              QUERY('SELECT * FROM owner.prod_own_fact')
            )
          ) AS td
          JOIN owner.prod_dim proddim              -- Include local Aster tables in JOIN
            ON td.prod_id = proddim.product_id
          JOIN (
            SELECT * FROM load_from_hadoop(        -- Hadoop Connector to get WARRANTY data
              ON mr_driver
              SERVER('10.10.3.139')
              USERNAME('name2')
              DBNAME('repair')
              TABLENAME('transaction')
            )
          ) AS sqlh
            ON sqlh.prod_ident_nbr = proddim.id
        )
        PARTITION BY party_id, prod_id
        ORDER BY repair_dt
        MODE (OVERLAPPING)
        PATTERN ('REPAIR{3}')                      -- Any path you want, specified with the power of regular expressions!
        SYMBOLS (event = 'REPAIR' AS REPAIR)
        RESULT (ACCUMULATE(qrd_focus_area OF ANY(REPAIR)) AS qrd_focus_area_path)
      ) n
      GROUP BY 1
      ORDER BY 2 desc;
    9 Confidential and proprietary. Copyright © 2012 Teradata Corporation.
  • 10. TERADATA UNIFIED DATA ARCHITECTURE Data Scientists Quants Customers / Partners Front-Line Workers Engineers Business Analysts Executives Operational Systems VIEWPOINT SUPPORT LANGUAGES MATH & STATS DATA MINING BUSINESS INTELLIGENCE APPLICATIONS DISCOVERY PLATFORM INTEGRATED DATA WAREHOUSE Aster Teradata Connector, Aster Connector for Hadoop, SQL-H, Teradata Connector for Hadoop, Aster Loader, Teradata Loader CAPTURE | STORE | REFINE AUDIO & VIDEO IMAGES TEXT WEB & SOCIAL MACHINE LOGS CRM SCM ERP
  • 11. When to Use Which? The best approach by workload and data type. Processing as a Function of Schema Requirements and Stage of Data Pipeline
      Pipeline stages: Low Cost Storage and Fast Loading | Data Pre-Processing, Refining, Cleansing | “Simple math at scale” (Score, filter, sort, avg., count...) | Joins, Unions, Aggregates | Analytics (Iterative and data mining) | Reporting
      Stable Schema (Financial Analysis, Ad-Hoc/OLAP, Enterprise-Wide BI and Reporting, Spatial/Temporal, Active Execution): Teradata/Hadoop | Teradata | Teradata | Teradata | Teradata | Teradata
      Evolving Schema (Interactive Data Discovery, Web Clickstream, Set-Top Box Analysis, CDRs, Sensor Logs, JSON): Aster and Hadoop; Aster (SQL + MapReduce Analytics)
      Format, No Schema (Social Feeds, Text, Image Processing, Audio/Video Storage and Refining, Storage and Batch Transformations): Hadoop and Aster; Aster (MapReduce Analytics)
  • 12. When to Use Which? The best approach by workload and data type. Processing as a Function of Schema Requirements and Stage of Data Pipeline
      Pipeline stages: Low Cost Storage and Fast Loading | Data Pre-Processing, Refining, Cleansing | “Simple math at scale” (Score, filter, sort, avg., count...) | Joins, Unions, Aggregates | Analytics (Iterative and data mining) | Reporting
      Stable Schema: Teradata/Hadoop | Teradata | Teradata | Teradata | Teradata | Teradata
      Evolving Schema: Aster and Hadoop; Aster (SQL + MapReduce Analytics)
      Format, No Schema: Hadoop and Aster; Aster (MapReduce Analytics)
  • 13. UDA IN PRACTICE IPTV QUALITY OF SERVICE
  • 14. Starting point: Complaints Data
  • 15. Churners – and data quality
  • 16. What events lead up to a reboot? Note the number of paths with a reboot following another reboot!
      CREATE dimension table wrk.npath_reboot_5events AS
      SELECT path, COUNT(*) AS path_count
      FROM nPath (
        ON wrk.w_event_f
        PARTITION BY srv_id
        ORDER BY evt_ts desc
        MODE (NONOVERLAPPING)
        PATTERN ('X{0,5}.reboot')
        SYMBOLS (true as X,
                 evt_name = 'REBOOT' AS reboot)
        RESULT (FIRST(srv_id OF X) AS srv_id,
                ACCUMULATE (evt_name OF ANY (X,reboot)) AS path)
      )
      GROUP BY 1 ;

      SELECT *
      FROM GraphGen (
        ON (SELECT * from wrk.npath_reboot_5events
            ORDER BY path_count
            LIMIT 30)
        PARTITION BY 1
        ORDER BY path_count desc
        item_format('npath')
        item1_col('path')
        score_col('path_count')
        output_format('sankey')
        justify('right'));
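The path-counting idea on slide 16 (up to five events of any kind followed by a reboot, matched without overlap and then counted per distinct path) can be sketched in plain Python. This is a rough illustration, not Aster's nPath: the function name `reboot_paths`, the tuple layout and the sample event names are hypothetical, Python's `re` module plays the role of the pattern matcher, and events are sorted chronologically here, which is an assumption (the slide orders by `evt_ts desc`).

```python
import re
from collections import Counter, defaultdict


def reboot_paths(events, max_prefix=5):
    """Count paths of up to `max_prefix` events that end in a reboot.

    events: iterable of (srv_id, evt_ts, evt_name) tuples (assumed layout).
    """
    by_service = defaultdict(list)
    for srv_id, evt_ts, evt_name in events:
        by_service[srv_id].append((evt_ts, evt_name))  # PARTITION BY srv_id

    # '.' plays the role of symbol X (always true), 'R' the reboot symbol.
    pattern = re.compile(r".{0,%d}R" % max_prefix)
    counts = Counter()
    for rows in by_service.values():
        rows.sort(key=lambda r: r[0])  # chronological order (assumption)
        names = [name for _, name in rows]
        encoded = "".join("R" if n == "REBOOT" else "x" for n in names)
        # finditer returns leftmost, non-overlapping matches.
        for match in pattern.finditer(encoded):
            path = names[match.start():match.end()]
            counts["[" + ", ".join(path) + "]"] += 1
    return counts
```

Because the wildcard symbol also matches reboot events, a path such as `[SYNC_LOSS, REBOOT, REBOOT]` can appear, which corresponds to the "reboot following another reboot" observation in the slide title.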
  • 17. View events data in Tableau. There appears to be an issue with the data from 30th September onwards: the reboot data for October seems to have been aggregated and added to 30th September.
  • 18. Address data quality • Remove paths with all reboots and exclude data from 30th September. It would appear that events with suffix 1 and 2 can be added together.
  • 19. Visualise as a Graph using Aster GraphGen
      Size of Node = number of customers
      Width of Edge = number of errors
      SELECT *
      FROM graphgen (
        ON (SELECT DISTINCT dmt_act_dslam, nra_id, nbr_of_srvid, errorspersrv, nbr_of_dslam
            FROM wrk.srvid_dslam_err)
        PARTITION BY 1
        ORDER BY errorspersrv
        item_format('cfilter')
        item1_col('dmt_act_dslam')
        item2_col('nra_id')
        score_col('errorspersrv')
        cnt1_col('nbr_of_srvid')
        cnt2_col('nbr_of_dslam')
        output_format('sigma')
        directed('false')
        width_max(10)
        width_min(1)
        nodesize_max (3)
        nodesize_min (1));
  • 20. Synch Issues by Hub Type
  • 21. Error and Complaint rates by equipment type
  • 22.
  • 23. UDA IN PRACTICE PREDICTIVE MODELS
  • 24. Input Data
      create table wrk.cih_dshb_ads as
      SELECT srv_id, sav_flag, offer, inseecode, code_postal, libelle, nom_dep, nom_region,
             longitude, latitude,
             coalesce(topo_nra, 'Unknown') as topo_nra,
             topo_dslam,
             coalesce(iad_hardwareversion, 'Unknown') as iad_hardwareversion,
             coalesce(iad_manufacturer, 'Unknown') as iad_manufacturer,
             coalesce(iad_modelname, 'Unknown') as iad_modelname,
             coalesce(iad_modemfirmwareversion, 'Unknown') as iad_modemfirmwareversion,
             coalesce(iad_productclass, 'Unknown') as iad_productclass,
             coalesce(iad_provisioningcode, 'Unknown') as iad_provisioningcode,
             coalesce(iad_softwareversion, 'Unknown') as iad_softwareversion,
             coalesce(iad_vendorconfigfiledescription_1, 'Unknown') as iad_vendorconfigfiledescription_1,
             coalesce(iad_vendorconfigfilename_1, 'Unknown') as iad_vendorconfigfilename_1,
             coalesce(iad_vendorconfigfilenumbofentries, 0) as iad_vendorconfigfilenumbofentries,
             coalesce(iad_vendorconfigfileversion_1, 'Unknown') as iad_vendorconfigfileversion_1,
             coalesce(iad_x_000e50_boardversion, 'Unknown') as iad_x_000e50_boardversion,
             coalesce(stb_description, 'Unknown') as stb_description,
             coalesce(stb_devicestatus, 'Unknown') as stb_devicestatus,
             coalesce(stb_gwinfoproductclass, 'Unknown') as stb_gwinfoproductclass,
             coalesce(stb_hardwareversion, 'Unknown') as stb_hardwareversion,
             coalesce(stb_manufacturer, 'Unknown') as stb_manufacturer,
             coalesce(stb_productclass, 'Unknown') as stb_productclass,
             coalesce(stb_softwareversion, 'Unknown') as stb_softwareversion,
             dev_iad_uptime_diff, dsl_showtime_diff, dev_stb_uptime_diff,
             kpi_iad_uptime, kpi_iad_synctime, kpi_stb_uptime,
             dev_iad_uptime, dsl_showtime, dev_stb_uptime,
             dsl_downstr_att, dsl_downstr_cur, dsl_downstr_max,
             kpi_voip_nb_dropped_calls_diff, kpi_voip_nb_dropped_calls, kpi_dsl_nb_crc,
             kpi_dsl_dscurrate_ratio_qualite,
             kpi_voip_tx_appels_coupes, kpi_voip_qualite, kpi_voip_qualite_diff,
             kpi_iptv_plr_nb_bon, kpi_iptv_plr_nb_moyen,
             kpi_iptv_conso_heures, kpi_iptv_packetslosts, kpi_iptv_packetsreceived,
             kpi_dsl_dscurrate_before, kpi_dsl_dscurrate_after
      FROM wrk.cih_dshb_bis
      where network = 'BYT'
        and stb_manufacturer is not null
        and topo_dslam is not null
  • 25. Decision Trees
      SELECT *
      FROM forest_drive (
        ON (SELECT 1)
        PARTITION BY 1
        DATABASE('beehive')
        USERID('beehive')
        PASSWORD('beehive')
        INPUTTABLE('wrk.cih_dshb_tree_in')
        OUTPUTTABLE('wrk.cih_dshb_tree_out')
        RESPONSE('sav_flag')
        NUMERICINPUTS('KPI_SIGNAL')
        CATEGORICALINPUTS('offer', 'nom_dep', 'nom_region', 'topo_nra', 'topo_dslam',
                          'iad_modemfirmwareversion', 'iad_vendorconfigfiledescription_1',
                          'iad_x_000e50_boardversion', 'stb_description', 'stb_productclass',
                          'stb_softwareversion', 'topo_dslam_brand')
        NUMTREES(4)
      )
  • 26. Naïve Bayes
      CREATE TABLE wrk.cih_dshb_model (PARTITION KEY(class)) AS
      SELECT *
      FROM naiveBayesReduce(
        ON (SELECT *
            FROM naiveBayesMap(
              ON (select * from wrk.cih_dshb_ads_in_11 where kpi_iad_uptime is not null)
              RESPONSE('sav_flag')
              NUMERICINPUTS('dev_iad_uptime', 'dsl_showtime', 'dev_stb_uptime',
                            'dsl_downstr_att', 'dsl_downstr_cur', 'dsl_downstr_max',
                            'kpi_voip_nb_dropped_calls_diff', 'kpi_voip_nb_dropped_calls',
                            'kpi_dsl_nb_crc', 'kpi_dsl_dscurrate_ratio_qualite',
                            'kpi_voip_tx_appels_coupes', 'kpi_voip_qualite', 'kpi_voip_qualite_diff',
                            'kpi_iptv_plr_nb_bon', 'kpi_iptv_plr_nb_moyen', 'kpi_iptv_plr_nb_mauvais',
                            'kpi_iptv_packetslosts', 'kpi_iptv_packetsreceived',
                            'kpi_stb_uptime', 'kpi_iad_synctime', 'kpi_iad_uptime')
              CATEGORICALINPUTS('offer', 'nom_dep', 'nom_region', 'topo_nra', 'topo_dslam',
                                'iad_modemfirmwareversion', 'iad_vendorconfigfiledescription_1',
                                'iad_x_000e50_boardversion', 'stb_description', 'stb_productclass',
                                'stb_softwareversion', 'topo_dslam_brand')
            )
        )
        PARTITION BY class
      );
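The map-then-reduce shape of the Naïve Bayes call on slide 26 can be illustrated with a small Python sketch for numeric inputs. It assumes a Gaussian naive Bayes model in which each data partition emits per-class counts, sums and sums of squares that are later merged. What Aster's `naiveBayesMap` and `naiveBayesReduce` compute internally is not shown on the slide, so the functions `nb_map` and `nb_reduce` and the dictionary-based rows are hypothetical.

```python
from collections import defaultdict


def nb_map(rows, response, numeric_inputs):
    """Per-partition step: count, sum and sum of squares per (class, feature)."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for row in rows:
        for feature in numeric_inputs:
            entry = stats[(row[response], feature)]
            value = row[feature]
            entry[0] += 1
            entry[1] += value
            entry[2] += value * value
    return dict(stats)


def nb_reduce(partials):
    """Merge partial statistics into (count, mean, variance) per (class, feature)."""
    merged = defaultdict(lambda: [0, 0.0, 0.0])
    for partial in partials:
        for key, (n, total, total_sq) in partial.items():
            entry = merged[key]
            entry[0] += n
            entry[1] += total
            entry[2] += total_sq
    model = {}
    for key, (n, total, total_sq) in merged.items():
        mean = total / n
        # Population variance; clamped at zero to absorb rounding error.
        variance = max(total_sq / n - mean * mean, 0.0)
        model[key] = (n, mean, variance)
    return model
```

The point of the split is that the map step only needs to see its own partition, while the reduce step works on small summaries, so the raw rows never have to be gathered in one place.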
  • 27. Support Vector Machine
      create table wrk.cih_svm_train2 distribute by hash(srv_id) as
      select srv_id, 'topo_nra_insee' as attr, topo_nra_insee::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'code_postal' as attr, code_postal::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'kpi_iad_uptime_avg' as attr, kpi_iad_uptime_avg::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'dev_iad_uptime_diff_avg' as attr, dev_iad_uptime_diff_avg::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'kpi_voip_nb_dropped_calls_diff_avg' as attr, kpi_voip_nb_dropped_calls_diff_avg::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'sav_nb_contacts' as attr, sav_nb_contacts::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'nb_tr' as attr, nb_tr::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train
      union all
      select srv_id, 'kpi_dsl_nb_crc_avg' as attr, kpi_dsl_nb_crc_avg::varchar as attr_value, sav_all_tgt
      FROM wrk.cih_sav_train;

      /*Run SVM*/
      CREATE TABLE wrk.cih_svm_model3 (PARTITION KEY(vec_index)) AS
      SELECT vec_index, avg(vec_value) as vec_value
      FROM svm(
        ON wrk.cih_svm_train2
        PARTITION BY srv_id
        OUTCOME( 'sav_flag' )
        ATTRIBUTE_NAME( 'attr' )
        ATTRIBUTE_VALUE( 'attr_value' )
      )
      GROUP BY vec_index;
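The chain of `union all` selects on slide 27 reshapes one wide row per service into one (id, attribute, value, target) row per attribute. A minimal Python sketch of that reshaping is given here; the function name `unpivot` and the dictionary-based rows are assumptions made for the example, and the output order differs from the SQL (grouped by row rather than by attribute), which does not affect the content.

```python
def unpivot(rows, id_col, target_col, attr_cols):
    """Turn each wide row into one (id, attr, attr_value, target) tuple per attribute."""
    long_rows = []
    for row in rows:
        for attr in attr_cols:
            # Values are cast to text, like the ::varchar casts on the slide.
            long_rows.append((row[id_col], attr, str(row[attr]), row[target_col]))
    return long_rows
```

For example, `unpivot(rows, "srv_id", "sav_all_tgt", ["code_postal", "nb_tr"])` would produce two output tuples for every input row.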
  • 28. Lift Chart to View Predictive Model Performance
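The slide shows the lift chart only as an image. As a rough guide to what such a chart plots, here is a Python sketch of cumulative lift: rank cases by model score, then compare the positive rate in the top buckets with the overall positive rate. The function name `cumulative_lift`, the bucket count and the 0/1 label encoding are assumptions, and this is not necessarily how the chart in the deck was produced.

```python
def cumulative_lift(scores, labels, n_bins=10):
    """Return cumulative lift for the top 1..n_bins score buckets.

    scores: model scores, higher meaning more likely positive.
    labels: 0/1 outcomes aligned with scores.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("lift is undefined without any positive labels")

    # Rank outcomes by descending score.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    base_rate = total_positives / n

    lifts = []
    for bucket in range(1, n_bins + 1):
        cutoff = max(1, (bucket * n) // n_bins)  # rows in the top `bucket` buckets
        top_rate = sum(ranked[:cutoff]) / cutoff
        lifts.append(top_rate / base_rate)
    return lifts
```

A lift above 1 in the first buckets means the model concentrates positives among its highest-scored cases; the final value is always 1 because the last bucket covers the whole population.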

Editor's Notes

  1. We want to help companies manage all of their data and get the best analytics value. People define big data around 3 V’s (volume, velocity, variety). Teradata sees the most value in “Big A” – Analytics. New analytics is what solves business problems which couldn’t be addressed before. To leverage Big Data you must give all the business analysts in your organization the right analytical tool on all the existing and new data available. Operationalizing these new insights drives competitive advantage. To do this we’ve developed the Unified Data Architecture™, an architecture that applies the right technology to the right analytical problems, leveraging best-of-breed technologies.
  2. Good slide. Important. Could be made prettier.