SlideShare una empresa de Scribd logo
1 de 15
Talend: Solutions Overview
Presenter: Rajan Kanitkar
Talend Big Data Overview
© Talend 2012
The Drivers for Big Data
Volume
Velocity
Variety
© Talend 2012
The defacto standard for big data processing
How to process big data?
© Talend 2012
Apache Hadoop, an open-source software library, is a framework
that allows for the distributed processing of large data sets across
clusters of commodity hardware using simple programming
models. It is designed to scale up from single servers to
thousands of machines, each offering local computation and
storage.
What is Hadoop?
© Talend 2012
The Big Data Ecosystem
Hadoop: the core project
HDFS: the Hadoop Distributed File System
MapReduce: the software framework for distributed processing
of large data sets
Hive: a data warehouse infrastructure that provides data
summarization and a querying language
Pig: a high-level data-flow language and execution framework
for parallel computation
HBase: this is the Hadoop database. Use it when you need
random, realtime read/write access to your Big Data
And many many more: Sqoop, HCatalog, Zookeeper, Oozie,
Cassandra, MongoDB, Flume, Impala, Stinger, Neo4J, etc.
Thanks to you all!
Google, Amazon, Facebook, Twitter, Yahoo, 10gen, Cloudera,
Hortonworks, MapR, etc.
Talend Big Data Overview
© Talend 2012
Key differentiator Of Our Next Gen Architecture…
JAVA
ETL
Day-to-day
integration
Run everywhere
SQL
ELT
DW
appliance
Teradata, Netezza…
MapReduce
Hadoop
Highly
Scalable
Hadoop Grid
CAMEL
CAMEL
Message
transform-
ation
High Frequency
 No black-box engine
 Enables light-weight
distributed, customizable and
parallelizable run time
 Standards-Based
Code Generator
© Talend 2012 12
Talend Unique Integration Solution
Consolidated
metadata & project
information
Repository
2
Web-based
deployment &
scheduling
Deployment
3 Same container for
batch processing,
message routing &
services
Execution
4
Single web-based
monitoring console
Monitoring
5
Comprehensive
Eclipse-based
user interface
1
Studio
Data
Quality
Data
Integration MDM ESB BPM
Best-of-
Breed
Solutions
+
Talend
Unified
Platform
=
Unique
Integration
Solution
© Talend 2012
Talend Big Data Product Strategy
Big Data Integration
▶ Land data in a Big Data cluster without coding
▶ Code generation for MapReduce, HDFS, Hbase, Pig, Hive, Hcatalog, etc.
Big Data Manipulation
▶ Simplify manipulation, such as sort and filter
▶ Computational expensive functions using Hadoop
Big Data Quality & Governance
▶ Identify linkages & duplicates, validate big data
▶ Match component, execute basic quality features
Big Data Project Management
▶ Place frameworks around big data projects
▶ Common Repository, scheduling, monitoring
strategic pillars
© Talend 2012
…an open source
ecosystem
Talend Open Studio for Big Data
• Improves efficiency of big data job design
with graphic interface
• Generates Hadoop code and run transforms
inside Hadoop
• Native support for HDFS, Pig, Hbase,
Hcatalog, Sqoop and Hive
• 100% open source under an Apache License
• Standards based
Pig
Vision: Democratize big data
© Talend 2012
…an open source
ecosystem
Talend Platform for Big Data
• Builds on Talend Open Studio for Big Data
• Adds data quality, advanced scalability and
management functions
• MapReduce massively parallel data
processing
• Shared Repository and remote
deployment
• Data quality and profiling
• Data cleansing
• Reporting and dashboards
• Commercial support, warranty/IP
indemnity under a subscription license
Pig
Vision: Democratize big data
© Talend 2012
Talend Big Data Partnerships
Hadoop DistributionsTalend Big Data
Partners
© Talend 2012
Demonstration: ETL for Big Data with Talend
Extract
Transform
Load
Talend Demo
2013

Más contenido relacionado

La actualidad más candente

Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...HostedbyConfluent
 
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...Edureka!
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar ZecevicDataScienceConferenc1
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Simplilearn
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data FactorySlava Kokaev
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"Nicola Ferraro
 
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...Amazon Web Services
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Practical introduction to hadoop
Practical introduction to hadoopPractical introduction to hadoop
Practical introduction to hadoopinside-BigData.com
 
Etl overview training
Etl overview trainingEtl overview training
Etl overview trainingMondy Holten
 

La actualidad más candente (20)

Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...
Talend ETL Tutorial | Talend Tutorial For Beginners | Talend Online Training ...
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Data Lake,beyond the Data Warehouse
Data Lake,beyond the Data WarehouseData Lake,beyond the Data Warehouse
Data Lake,beyond the Data Warehouse
 
Ssis 2008
Ssis 2008Ssis 2008
Ssis 2008
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
[DSC Europe 22] Overview of the Databricks Platform - Petar Zecevic
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Preparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR PrinciplesPreparing Data for Sharing: The FAIR Principles
Preparing Data for Sharing: The FAIR Principles
 
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
Big Data Tutorial | What Is Big Data | Big Data Hadoop Tutorial For Beginners...
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 
A brief history of "big data"
A brief history of "big data"A brief history of "big data"
A brief history of "big data"
 
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...Keynote: Your Future With Cloud Computing - Dr. Werner Vogels  - AWS Summit 2...
Keynote: Your Future With Cloud Computing - Dr. Werner Vogels - AWS Summit 2...
 
Operational Data Vault
Operational Data VaultOperational Data Vault
Operational Data Vault
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Practical introduction to hadoop
Practical introduction to hadoopPractical introduction to hadoop
Practical introduction to hadoop
 
Etl overview training
Etl overview trainingEtl overview training
Etl overview training
 

Similar a Talend Big Data Capabilities Overview

Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.OW2
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxM. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxDr.Florence Dayana
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...NoSQLmatters
 
CCD-410 Cloudera Study Material
CCD-410 Cloudera Study MaterialCCD-410 Cloudera Study Material
CCD-410 Cloudera Study MaterialRoxycodone Online
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...EMC
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016MLconf
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksHortonworks
 
Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Rajan Kanitkar
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchHortonworks
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Pactera_US
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Ranjith Sekar
 

Similar a Talend Big Data Capabilities Overview (20)

Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
Scalable ETL with Talend and Hadoop, Cédric Carbone, Talend.
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptxM. Florence Dayana - Hadoop Foundation for Analytics.pptx
M. Florence Dayana - Hadoop Foundation for Analytics.pptx
 
Hadoop in action
Hadoop in actionHadoop in action
Hadoop in action
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
 
Talend for big_data_intorduction
Talend for big_data_intorductionTalend for big_data_intorduction
Talend for big_data_intorduction
 
CCD-410 Cloudera Study Material
CCD-410 Cloudera Study MaterialCCD-410 Cloudera Study Material
CCD-410 Cloudera Study Material
 
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
Pivotal: Hadoop for Powerful Processing of Unstructured Data for Valuable Ins...
 
Talend
TalendTalend
Talend
 
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
Arun Rathinasabapathy, Senior Software Engineer, LexisNexis at MLconf ATL 2016
 
Robin_Hadoop
Robin_HadoopRobin_Hadoop
Robin_Hadoop
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 
Transform You Business with Big Data and Hortonworks
Transform You Business with Big Data and HortonworksTransform You Business with Big Data and Hortonworks
Transform You Business with Big Data and Hortonworks
 
Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014Big Data Hoopla Simplified - TDWI Memphis 2014
Big Data Hoopla Simplified - TDWI Memphis 2014
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and Search
 
BIGDATA ppts
BIGDATA pptsBIGDATA ppts
BIGDATA ppts
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 

Último (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 

Talend Big Data Capabilities Overview

  • 2. Talend Big Data Overview
  • 3. © Talend 2012 The Drivers for Big Data Volume Velocity Variety
  • 4. © Talend 2012 The defacto standard for big data processing How to process big data?
  • 5. © Talend 2012 Apache Hadoop, an open-source software library, is a framework that allows for the distributed processing of large data sets across clusters of commodity hardware using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. What is Hadoop?
  • 6. © Talend 2012 The Big Data Ecosystem Hadoop: the core project HDFS: the Hadoop Distributed File System MapReduce: the software framework for distributed processing of large data sets Hive: a data warehouse infrastructure that provides data summarization and a querying language Pig: a high-level data-flow language and execution framework for parallel computation HBase: this is the Hadoop database. Use it when you need random, realtime read/write access to your Big Data And many many more: Sqoop, HCatalog, Zookeeper, Oozie, Cassandra, MongoDB, Flume, Impala, Stinger, Neo4J, etc. Thanks to you all! Google, Amazon, Facebook, Twitter, Yahoo, 10gen, Cloudera, Hortonworks, MapR, etc.
  • 7. Talend Big Data Overview
  • 8. © Talend 2012 Key differentiator Of Our Next Gen Architecture… JAVA ETL Day-to-day integration Run everywhere SQL ELT DW appliance Teradata, Netezza… MapReduce Hadoop Highly Scalable Hadoop Grid CAMEL CAMEL Message transform- ation High Frequency  No black-box engine  Enables light-weight distributed, customizable and parallelizable run time  Standards-Based Code Generator
  • 9. © Talend 2012 12 Talend Unique Integration Solution Consolidated metadata & project information Repository 2 Web-based deployment & scheduling Deployment 3 Same container for batch processing, message routing & services Execution 4 Single web-based monitoring console Monitoring 5 Comprehensive Eclipse-based user interface 1 Studio Data Quality Data Integration MDM ESB BPM Best-of- Breed Solutions + Talend Unified Platform = Unique Integration Solution
  • 10. © Talend 2012 Talend Big Data Product Strategy Big Data Integration ▶ Land data in a Big Data cluster without coding ▶ Code generation for MapReduce, HDFS, Hbase, Pig, Hive, Hcatalog, etc. Big Data Manipulation ▶ Simplify manipulation, such as sort and filter ▶ Computational expensive functions using Hadoop Big Data Quality & Governance ▶ Identify linkages & duplicates, validate big data ▶ Match component, execute basic quality features Big Data Project Management ▶ Place frameworks around big data projects ▶ Common Repository, scheduling, monitoring strategic pillars
  • 11. © Talend 2012 …an open source ecosystem Talend Open Studio for Big Data • Improves efficiency of big data job design with graphic interface • Generates Hadoop code and run transforms inside Hadoop • Native support for HDFS, Pig, Hbase, Hcatalog, Sqoop and Hive • 100% open source under an Apache License • Standards based Pig Vision: Democratize big data
  • 12. © Talend 2012 …an open source ecosystem Talend Platform for Big Data • Builds on Talend Open Studio for Big Data • Adds data quality, advanced scalability and management functions • MapReduce massively parallel data processing • Shared Repository and remote deployment • Data quality and profiling • Data cleansing • Reporting and dashboards • Commercial support, warranty/IP indemnity under a subscription license Pig Vision: Democratize big data
  • 13. © Talend 2012 Talend Big Data Partnerships Hadoop DistributionsTalend Big Data Partners
  • 14. © Talend 2012 Demonstration: ETL for Big Data with Talend Extract Transform Load

Notas del editor

  1. Finally, the entire big data world has been built as an open source ecosystem. This all makes sense… talend is the open source leader.To this end we will introduce the first compelte set of tools that will democratize big data. Talend Open Studio for Big Data
  2. Finally, the entire big data world has been built as an open source ecosystem. This all makes sense… talend is the open source leader.To this end we will introduce the first compelte set of tools that will democratize big data. Talend Open Studio for Big Data
  3. Purpose of the slide:Present the 2nd big change in the Integration Market: Big DataRemember, this is not (yet) about Talend solutions, this is about changes in the market that will impact the prospect’s company, no matter what.___________________________________________________________________________________________________________________Key themes: The Big Data RevolutionThe volume, variety and complexity of data is growing explosivelyTo serve the needs of customers for the next decade, next generation integration platforms must be designed & architected with these requirement in mind.Processing of data needs to be flexible based on customer’s use cases: eithercentralized for simple batch jobs OR distributed and parallelized for high volume complex transformations. Processing of data needs to be cost effective by leveraging and optimizing an organization’s full IT infrastructure (ELT for example) AND/OR external cloud platforms to take advantage of elasticity.New technologies for Big Data like NoSQL or Hadoop are not compatible with existing DatabasesIn the last 10-15 years companies have made massive investments in DW and relational databases that are not compatible with the Big Data technologies. Data integration technologies today need to address all different database and processing technologies that will co-exist as customers will need to leverage their past investments with new emerging Big Data technologies.Big Data is inevitable but relatively new and apt to change so any technology must also be Adaptable to knew technologies, architectures and topologies.Animportant attribute of this market trend is the need for scalability that Big Data necessitates, whether scaling up or scaling down based on the business needs. Note: Don’t forget garbage in, garbage out exponentially with Big Data. Therefore Data Quality is more important than ever with Big Data.