14/10/2011




Using Hadoop with Talend
  Mark Chapman
  Imad Rahman

Agenda
  Talend Introduction
  MapReduce and Hadoop
  Talend Integration Suite MPx
    Hadoop Features and TIS Components
  How to use Talend to simplify Hadoop
  Demo!
  Questions & Answers

© Talend 2011




Global leader in open source integration
  Venture-backed

Global operations (Talend across the world…)
  Corporate Headquarters
    San Francisco (Los Altos)
    Paris (Suresnes)
  Operations
    Orange County (Irvine)
    Boston (Burlington)
    New York (Tarrytown)
    London (Maidenhead)
    Utrecht
    Nuremberg
    Bonn
    Munich
    Milan (Bergamo)
    Tokyo
    Beijing








Customers By Industry
  Public Sector & Education
  Systems Integrators
  Services & Others
  Media & Telco
  Software
  Retail and Manufacturing
  Finance & Insurance

Market Positioning
  Application Integration
    Connect applications & services
  Data Quality
    Data profiling
    Data cleansing
  Master Data Management
    Model and master any data or domain
  Data Integration
    Analytics (ETL)
    Operational data integration




Talend Unified Platform
  Complete unified environment supports all integration approaches: data & application
  Uses consistent technology & leverages open standards

  Studio       Comprehensive Eclipse-based user interface
  Repository   Consolidated metadata & project information
  Deployment   Web-based deployment & scheduling
  Execution    Same containers for batch processing, message routing & services
  Monitoring   Single web-based monitoring console








Background: MapReduce and Hadoop
  MapReduce: Parallel Programming Model
    “Divide and Conquer”
    Many possible implementations
  Hadoop: Open Source Java MapReduce
    Simplified framework
  Cloud: flexible infrastructure
    e.g. Amazon Elastic MapReduce

Talend Integration Suite MPx for Big Data
  • One platform
  • All sources
  • All modes
  • All scales
  [Diagram: Right-Time, Batch ETL, High Volume (ELT), Big Data (Hadoop, Filescale)]
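The "divide and conquer" model named on the MapReduce background slide can be illustrated with a minimal single-process sketch. This is not Hadoop code and not anything shown in the deck: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample records are assumptions chosen for illustration. A real Hadoop job would distribute the same three phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: turn one input record into a list of (key, value) pairs."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records, for illustration only.
    print(map_reduce(["login alice", "login bob", "logout alice"]))
```

Each record is mapped independently, which is what lets a framework such as Hadoop split the input across nodes; only the shuffle step needs to bring values with the same key together.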




Talend’s Big Data Partnerships
  Partnering with Enterprise Big Data Leaders

  Cloudera: Enterprise Hadoop
    Talend: Open Source Cloudera Connect Partner for Data Integration

  Greenplum: Hadoop-Powered Analytics
    Big Data-scale Relational DB
    Talend supports Greenplum for Hadoop and ELT








Talend Integration Suite MPx
  Hadoop Features
    • Hadoop components for easy job design
    • HDFS: store, retrieve data
    • Cloudera Sqoop: Bulk ETL
    • Hive: Relational DB layer
    • Pig: In-Hadoop transformations
  Filescale Features
    • Use case: process structured flat files (e.g. logs)
    • Uses MapReduce techniques
    • Performance optimized for this use case
    • Native code, no Java

Talend Components for Hadoop Features
  HDFS (Hadoop Distributed File System) utilities: for loading/unloading files
  Sqoop: utility for RDBMS extract to HDFS (Cloudera only)
  Data Warehousing on Hadoop using Hive: SQL-like language to query and transform data
  Transforming Data in Hadoop using Pig: transform, normalize, clean HDFS data; very flexible
  Talend Integration Suite MPx Hadoop Support
    Components for HDFS and Sqoop loading/unloading
    Components for defining Pig and Hive jobs
    Integrate with any of Talend’s supported sources!




Applying Talend Big Data in Enterprise
  Landing data from operational systems
  Transforming it before loading DW
  [Diagram: Hadoop (HDFS, Sqoop, Pig, Hive), DW, BI]
  Performing additional analytics directly in Hadoop
  Keeping historical data online for queries








Today’s Demo Scenario

       View sample log data from an online game source
       Load log data into Hive
       Aggregate the data into 2 aggregate tables
       Load aggregated data into RDBMS
       Additional processing using Pig
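The aggregate-and-load steps of the scenario above can be sketched in plain Python. This is not the Talend job from the demo. The log format (`date,player,event,score`), the sample rows, and the table names `player_scores` and `daily_events` are all hypothetical. In-memory SQLite stands in for the target RDBMS, and ordinary Python counters stand in for the Hive aggregation.

```python
import sqlite3
from collections import Counter

# Hypothetical game log lines in the assumed format "date,player,event,score".
SAMPLE_LOG = [
    "2011-10-01,alice,kill,10",
    "2011-10-01,bob,kill,5",
    "2011-10-02,alice,kill,7",
    "2011-10-02,alice,death,0",
]


def parse(line):
    """Split one log line into typed fields."""
    day, player, event, score = line.split(",")
    return day, player, event, int(score)


def aggregate(lines):
    """Build the two aggregates: total score per player, event count per day."""
    score_by_player = Counter()
    events_by_day = Counter()
    for day, player, _event, score in map(parse, lines):
        score_by_player[player] += score
        events_by_day[day] += 1
    return score_by_player, events_by_day


def load(conn, score_by_player, events_by_day):
    """Create the two aggregate tables and insert the aggregated rows."""
    conn.execute(
        "CREATE TABLE player_scores (player TEXT PRIMARY KEY, total_score INTEGER)"
    )
    conn.execute(
        "CREATE TABLE daily_events (day TEXT PRIMARY KEY, event_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO player_scores VALUES (?, ?)", list(score_by_player.items())
    )
    conn.executemany(
        "INSERT INTO daily_events VALUES (?, ?)", list(events_by_day.items())
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for the real RDBMS
    load(conn, *aggregate(SAMPLE_LOG))
    print(conn.execute("SELECT * FROM player_scores ORDER BY player").fetchall())
    print(conn.execute("SELECT * FROM daily_events ORDER BY day").fetchall())
```

On a cluster, the aggregation would typically be expressed as grouped queries in Hive, with only the much smaller aggregate tables exported to the relational database.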
Show Time!








Wrap-up

    Talend Integration Suite MPx…
         delivers MapReduce technologies as part of a comprehensive data management solution
         makes using Hadoop like other data integration activities

    …is available for you to try
         Free 2 month license to Talend Integration Suite MPx
         Visit http://info.talend.com/hugoffer.html








Questions and Answers

 Mark Chapman
  Technical Manager
  mchapman@talend.com
  Skype: mchapman68

 Imad Rahman
  Technical Presales Consultant
  irahman@talend.com
  Skype: imadrahman.talend

Thank You!





Más contenido relacionado

La actualidad más candente

User Interfaces and SOA - OPITZ CONSULTING - Maier - Winterberg
User Interfaces and SOA - OPITZ CONSULTING - Maier - WinterbergUser Interfaces and SOA - OPITZ CONSULTING - Maier - Winterberg
User Interfaces and SOA - OPITZ CONSULTING - Maier - WinterbergOPITZ CONSULTING Deutschland
 
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012NXTKey Corporation
 
Webinar: Spagic and eForm Services: a practical approach to PDF support
Webinar: Spagic and eForm Services: a practical approach to PDF supportWebinar: Spagic and eForm Services: a practical approach to PDF support
Webinar: Spagic and eForm Services: a practical approach to PDF supportSpagoWorld
 
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosa
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosaLive to e-Learning, 
a lecture capture and delivery service based on MediaMosa
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosaMediaMosa
 
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera..."Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...eLiberatica
 
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...Kalman Graffi
 
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...eyefortransport
 
Customer Liaison - The Salford Journey
Customer Liaison - The Salford JourneyCustomer Liaison - The Salford Journey
Customer Liaison - The Salford Journeysue_cunningham
 
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentation
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentationIcws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentation
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentationFreddy Lecue
 
Impac Systems Iso Draw
Impac Systems   Iso DrawImpac Systems   Iso Draw
Impac Systems Iso Drawjackieliles
 
Apac summit ODCA - Allyson Klein
Apac summit ODCA - Allyson KleinApac summit ODCA - Allyson Klein
Apac summit ODCA - Allyson KleinIntelAPAC
 
Siemens Enterprise Communications Company Presentation
Siemens Enterprise Communications Company PresentationSiemens Enterprise Communications Company Presentation
Siemens Enterprise Communications Company Presentationguest0d3b260
 
Knowledge Management
Knowledge ManagementKnowledge Management
Knowledge ManagementAKAGroup
 
Linked in 4eme table ronde 20120601
Linked in 4eme table ronde 20120601Linked in 4eme table ronde 20120601
Linked in 4eme table ronde 20120601Dario Mangano
 
Forecast 2012: Cloud Storm Keynote Andy Stokes
Forecast 2012: Cloud Storm Keynote Andy StokesForecast 2012: Cloud Storm Keynote Andy Stokes
Forecast 2012: Cloud Storm Keynote Andy StokesOpen Data Center Alliance
 
Looking for Solutions
Looking for SolutionsLooking for Solutions
Looking for Solutionswfhdesign
 

La actualidad más candente (17)

User Interfaces and SOA - OPITZ CONSULTING - Maier - Winterberg
User Interfaces and SOA - OPITZ CONSULTING - Maier - WinterbergUser Interfaces and SOA - OPITZ CONSULTING - Maier - Winterberg
User Interfaces and SOA - OPITZ CONSULTING - Maier - Winterberg
 
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012
Future of Publishing-Content Workflow-Shivaji Sengupta-London Book Fair 2012
 
Webinar: Spagic and eForm Services: a practical approach to PDF support
Webinar: Spagic and eForm Services: a practical approach to PDF supportWebinar: Spagic and eForm Services: a practical approach to PDF support
Webinar: Spagic and eForm Services: a practical approach to PDF support
 
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosa
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosaLive to e-Learning, 
a lecture capture and delivery service based on MediaMosa
Live to e-Learning, 
a lecture capture and delivery service based on MediaMosa
 
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera..."Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...
"Nuxeo 5 a Complete Open Source ECM Solution" by Andreea Stefanescu @ eLibera...
 
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
IEEE P2P 2009 - Kalman Graffi - Monitoring and Management of Structured Peer-...
 
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...
Laurence Coudroy from Johnson & Johnson on ‘SCM as a Market Strategy Differen...
 
Customer Liaison - The Salford Journey
Customer Liaison - The Salford JourneyCustomer Liaison - The Salford Journey
Customer Liaison - The Salford Journey
 
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentation
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentationIcws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentation
Icws10 lecue-gorronogoitia-gonzalez-radzimski-villa-presentation
 
Impac Systems Iso Draw
Impac Systems   Iso DrawImpac Systems   Iso Draw
Impac Systems Iso Draw
 
Apac summit ODCA - Allyson Klein
Apac summit ODCA - Allyson KleinApac summit ODCA - Allyson Klein
Apac summit ODCA - Allyson Klein
 
Siemens Enterprise Communications Company Presentation
Siemens Enterprise Communications Company PresentationSiemens Enterprise Communications Company Presentation
Siemens Enterprise Communications Company Presentation
 
Knowledge Management
Knowledge ManagementKnowledge Management
Knowledge Management
 
Linked in 4eme table ronde 20120601
Linked in 4eme table ronde 20120601Linked in 4eme table ronde 20120601
Linked in 4eme table ronde 20120601
 
Forecast 2012: Cloud Storm Keynote Andy Stokes
Forecast 2012: Cloud Storm Keynote Andy StokesForecast 2012: Cloud Storm Keynote Andy Stokes
Forecast 2012: Cloud Storm Keynote Andy Stokes
 
Knowledge based institutional capacity building in the Hungarian Higher Educa...
Knowledge based institutional capacity building in the Hungarian Higher Educa...Knowledge based institutional capacity building in the Hungarian Higher Educa...
Knowledge based institutional capacity building in the Hungarian Higher Educa...
 
Looking for Solutions
Looking for SolutionsLooking for Solutions
Looking for Solutions
 

Similar a Taland Hadoop data integration

4 g world 2011 renesas mobile overview
4 g world 2011 renesas mobile overview4 g world 2011 renesas mobile overview
4 g world 2011 renesas mobile overviewDavid McTernan
 
Data Management Solutions based on Eclipse
Data Management Solutions based on EclipseData Management Solutions based on Eclipse
Data Management Solutions based on EclipseEclipse Day 2010 in Rome
 
The Changes In Service Delivery With Cloud Computing
The Changes In Service Delivery With Cloud ComputingThe Changes In Service Delivery With Cloud Computing
The Changes In Service Delivery With Cloud ComputingMartin Hingley
 
Talend Introduction by TSI
Talend Introduction by TSITalend Introduction by TSI
Talend Introduction by TSIRemain Software
 
Talend and Savoir-faire Linux Present Open Data Management
Talend and Savoir-faire Linux Present Open Data ManagementTalend and Savoir-faire Linux Present Open Data Management
Talend and Savoir-faire Linux Present Open Data ManagementSavoir-faire Linux
 
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?European Data Forum
 
SAP technology roadmap- 2012 Update
SAP technology roadmap- 2012 UpdateSAP technology roadmap- 2012 Update
SAP technology roadmap- 2012 UpdateA J
 
Tackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationTackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationDataWorks Summit
 
SunCorp Campaign Measurement
SunCorp Campaign MeasurementSunCorp Campaign Measurement
SunCorp Campaign MeasurementDatalicious
 
Transform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC TechnologiesTransform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC TechnologiesCenk Ersoy
 
Skelta Software Corporate Presentation
Skelta Software Corporate PresentationSkelta Software Corporate Presentation
Skelta Software Corporate PresentationSchneider Electric
 
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011Technopreneurs Association of Malaysia
 
Key criteria for selecting the best erp system
Key criteria for selecting the best erp systemKey criteria for selecting the best erp system
Key criteria for selecting the best erp systemAnish Kanaran
 
CBS Cloud presentation november 2012
CBS Cloud presentation november 2012CBS Cloud presentation november 2012
CBS Cloud presentation november 2012Henrik Hasselbalch
 
Manipulating Data with Talend.
Manipulating Data with Talend.Manipulating Data with Talend.
Manipulating Data with Talend.Edureka!
 
Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?Edureka!
 
Deg Group2009eng
Deg Group2009engDeg Group2009eng
Deg Group2009engjosjans
 

Similar a Taland Hadoop data integration (20)

4 g world 2011 renesas mobile overview
4 g world 2011 renesas mobile overview4 g world 2011 renesas mobile overview
4 g world 2011 renesas mobile overview
 
Data Management Solutions based on Eclipse
Data Management Solutions based on EclipseData Management Solutions based on Eclipse
Data Management Solutions based on Eclipse
 
Gtl Corporate Presentation For Customers V1
Gtl Corporate Presentation For Customers V1Gtl Corporate Presentation For Customers V1
Gtl Corporate Presentation For Customers V1
 
Company profile Metrasys
Company profile MetrasysCompany profile Metrasys
Company profile Metrasys
 
The Changes In Service Delivery With Cloud Computing
The Changes In Service Delivery With Cloud ComputingThe Changes In Service Delivery With Cloud Computing
The Changes In Service Delivery With Cloud Computing
 
Talend Introduction by TSI
Talend Introduction by TSITalend Introduction by TSI
Talend Introduction by TSI
 
Talend and Savoir-faire Linux Present Open Data Management
Talend and Savoir-faire Linux Present Open Data ManagementTalend and Savoir-faire Linux Present Open Data Management
Talend and Savoir-faire Linux Present Open Data Management
 
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?
EDF2013: Invited Talk Bastiaan Deblieck: Who remembers EDP?
 
SAP technology roadmap- 2012 Update
SAP technology roadmap- 2012 UpdateSAP technology roadmap- 2012 Update
SAP technology roadmap- 2012 Update
 
Tackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationTackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integration
 
SunCorp Campaign Measurement
SunCorp Campaign MeasurementSunCorp Campaign Measurement
SunCorp Campaign Measurement
 
Transform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC TechnologiesTransform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC Technologies
 
Skelta Software Corporate Presentation
Skelta Software Corporate PresentationSkelta Software Corporate Presentation
Skelta Software Corporate Presentation
 
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011
TeAM and iSentric - Mobile Business Commercialization Program - 31st Mar 2011
 
Key criteria for selecting the best erp system
Key criteria for selecting the best erp systemKey criteria for selecting the best erp system
Key criteria for selecting the best erp system
 
CBS Cloud presentation november 2012
CBS Cloud presentation november 2012CBS Cloud presentation november 2012
CBS Cloud presentation november 2012
 
Manipulating Data with Talend.
Manipulating Data with Talend.Manipulating Data with Talend.
Manipulating Data with Talend.
 
Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?Manipulating data with Talend. Learn how?
Manipulating data with Talend. Learn how?
 
Deg Group2009eng
Deg Group2009engDeg Group2009eng
Deg Group2009eng
 
Evento Sugar Crm 2009 - Talend
Evento Sugar Crm 2009 - TalendEvento Sugar Crm 2009 - Talend
Evento Sugar Crm 2009 - Talend
 

Talend Hadoop data integration

  • 1. 14/10/2011
      Using Hadoop with Talend
        Mark Chapman, Imad Rahman
      Agenda
        • Talend Introduction
        • MapReduce and Hadoop
        • Talend Integration Suite MPx
            • Hadoop Features and TIS Components
        • How to use Talend to simplify Hadoop
        • Demo!
        • Questions & Answers
      Global leader in open source integration
        • Venture-backed
        • Global operations (map: "Talend across the world…")
            • Corporate Headquarters: San Francisco (Los Altos), Paris (Suresnes)
            • Operations: Orange County (Irvine), Boston (Burlington), New York (Tarrytown), London (Maidenhead), Utrecht, Nuremberg, Bonn, Munich, Milan (Bergame), Tokyo, Beijing
      © Talend 2011
  • 2. Customers By Industry (chart labels)
        • Public Sector & Education
        • Systems Integrators
        • Services & Others
        • Media & Telco
        • Software
        • Retail and Manufacturing
        • Finance & Insurance
      Market Positioning
        • Application Integration: connect applications & services
        • Master Data Management: model and master any data or domain
        • Data Quality: data profiling, data cleansing
        • Data Integration (ETL): analytics, operational data integration
      Talend Unified Platform
        • Complete unified environment supports all integration approaches – data & application
        • Uses consistent technology & leverages open standards
        • Studio: comprehensive Eclipse-based user interface
        • Repository: consolidated metadata & project information
        • Deployment: web-based deployment & scheduling
        • Execution: same containers for batch processing, message routing & services
        • Monitoring: single web-based monitoring console
  • 3. Background: MapReduce and Hadoop
        • MapReduce: Parallel Programming Model
            • "Divide and Conquer"
            • Many possible implementations
        • Hadoop: Open Source Java MapReduce
            • Simplified framework
        • Cloud: flexible infrastructure
            • e.g. Amazon Elastic MapReduce
      Talend Integration Suite MPx for Big Data
        • One platform
        • All sources
        • All modes
        • All scales
        • Diagram labels: Big Data (Hadoop, Filescale), High Volume (ELT), Batch ETL, Right-Time
      Talend’s Big Data Partnerships
        Partnering with Enterprise Big Data Leaders
        • Cloudera: Enterprise Hadoop
            • Talend: Open Source Cloudera Connect Partner for Data Integration
        • Greenplum: Hadoop-Powered Analytics
            • Big Data-scale Relational DB
            • Talend supports Greenplum for Hadoop and ELT
  • 4. Talend Integration Suite MPx Features
        • Hadoop Features
            • Hadoop components for easy job design
            • HDFS: store, retrieve data
            • Cloudera Sqoop: Bulk ETL
            • Hive: Relational DB layer
            • Pig: In-Hadoop transformations
        • Filescale Features
            • Use case: process structured flat files (e.g. logs)
            • Uses MapReduce techniques
            • Performance optimized for this use case
            • Native code, no Java
      Talend Components for Hadoop
        • HDFS (Hadoop File System) utilities – for loading/unloading files
        • Sqoop – utility for RDBMS extract to HDFS (Cloudera only)
        • Data Warehousing on Hadoop using Hive – SQL-like language to query and transform data
        • Transforming Data in Hadoop using Pig – transform, normalize, clean HDFS data – very flexible
        • Talend Integration Suite MPx Hadoop Support
            • Components for HDFS and Sqoop loading/unloading
            • Components for defining Pig and Hive jobs
            • Integrate with any of Talend’s supported sources!
      Applying Talend Big Data in Enterprise
        • Landing data from operational systems
        • Transforming it before loading DW
        • Performing additional analytics directly in Hadoop
        • Keeping historical data online for queries
        • Diagram labels: DW, BI, Hadoop, Hive, Pig, Sqoop, HDFS
  • 5. Today’s Demo Scenario
        • View sample log data from an online game source
        • Load log data into Hive
        • Aggregate the data into 2 aggregate tables
        • Load aggregated data into RDBMS
        • Additional processing using Pig
      Show Time!
      Wrap-up
        • Talend Integration Suite MPx…
            • delivers MapReduce technologies as part of a comprehensive data management solution
            • makes using Hadoop like other data integration activities
        • …is available for you to try
            • Free 2 month license to Talend Integration Suite MPx
            • Visit http://info.talend.com/hugoffer.html
  • 6. Questions and Answers
        • Mark Chapman
            • Technical Manager
            • mchapman@talend.com
            • Skype: mchapman68
        • Imad Rahman
            • Technical Presales Consultant
            • irahman@talend.com
            • Skype: imadrahman.talend
      Thank You!
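
The slides describe MapReduce as a "divide and conquer" parallel programming model and the demo aggregates game log data, but the deck shows no code. The following is a minimal single-process Python sketch of the map, shuffle, and reduce steps, not Talend's or Hadoop's implementation. The log format (`timestamp,player,event`), the sample records, and every function name here are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample log lines; the format timestamp,player,event is assumed.
SAMPLE_LOG = [
    "2011-10-14T10:00:01,alice,login",
    "2011-10-14T10:00:05,bob,login",
    "2011-10-14T10:01:12,alice,score",
    "2011-10-14T10:02:30,alice,score",
    "2011-10-14T10:03:44,bob,logout",
]


def map_phase(line):
    """Map: emit one ((player, event), 1) pair for each log record."""
    _timestamp, player, event = line.split(",")
    yield (player, event), 1


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(SAMPLE_LOG)
for (player, event), n in sorted(counts.items()):
    print(player, event, n)
```

In a real cluster the map and reduce calls run in parallel on separate nodes and the framework performs the shuffle; here all three steps run in one process purely to show the data flow.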