We’ll get started soon… 
Q&A box is available for your questions 
Webinar will be recorded for future viewing 
Thank you for joining! 
Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Deliver the Data Lake (demo/deep dive) 
…using HDP and Red Hat JBoss Data Virtualization 
We do Hadoop.
Your speakers… 
Raghu Thiagarajan, Dir, Partner Product Management, Hortonworks 
Kimberly Palko, Principal Product Manager, Red Hat 
Kenny Peeples, Principal Technical Marketing Manager, Red Hat 
An architectural shift towards an HDP Data Lake 
Unlocking the Data Lake 
[Diagram: SCALE and SCOPE axes showing RDBMS, MPP, EDW and the Data Lake; labeled "New Analytic Apps or IT Optimization"]
Data Lake, Enabled by YARN
• Single data repository, shared infrastructure
• Multiple biz apps accessing all the data
• Enable a shift from reactive to proactive interactions
• Gain new insight across the entire enterprise
HDP 2.1 stack: Governance & Integration, Security, Operations, Data Access, YARN, Data Management
What is a Data Lake? 
Architectural Pattern in the Data Center 
Uses Hadoop to deliver deeper insight across a large, broad, diverse set 
of data efficiently 
• Multipurpose, Open PLATFORM for Data (NOT a database)
• Land all data in a single place and interact with it in many ways
• Allows for the ecosystem to provide higher level services (SAS, SAP, Microsoft for Streaming, MPP, In-memory, etc.)
• First class data management capabilities (metadata management, security, transformation pipelines, replication, retention, etc.)
HDP Data Lake Solution Architecture
[Diagram, summarized]
• SOURCE DATA: Click Stream, Sales Transactions, Product Data, Marketing/Inventory, Social Data, EDW
• Step 1: Extract & Load. Ingestion: SQOOP, FLUME, Web HDFS, REST, HTTP, Streaming, NFS, STORM, JMS
• Step 2: Model/Apply Metadata. HCATALOG (table & user-defined metadata)
• Step 3: Transform, Aggregate & Materialize. HIVE, PIG, Cascading; data processing: MR2, TEZ, Mahout, Graph, SAS
• Step 4: Schedule and Orchestrate. FALCON (Data pipeline & flow management)
• Manage Steps 1-4: Data Management with Falcon, Security with HDP Advanced Security; Apache Argus (Unified Access Controls and Audit)
• Data Lake HDP Grid: compute & storage nodes, YARN, AMBARI
• YARN Apps (SolR, Storm; Stream Processing, Real-time Search, MPI, etc.) open up many new use cases
• Use Case Type 1: Materialize & Exchange. Sqoop/Hive, HBase Client; Downstream Data Sources: OLTP, HBase, EDW (Teradata)
• Use Case Type 2: Explore/Visualize. Interactive Hive Server (Tez/Stinger); Query/Analytics/Reporting Tools: Tableau, Excel, Microstrategy, Datameer, Platfora, Business Objects
HDP Data Lake Solution Architecture + Virtual Data Mart
[Diagram: the same HDP Data Lake solution architecture (Steps 1-4, Data Lake HDP Grid, Use Case Types 1 and 2), with a virtual data mart layer added: a Dept Base Virtual Database (VDB) with Team 1 VDB and Team 2 VDB, and View1, View2.]
YARN allows for new processing engines
[Diagram: the HDP Data Lake solution architecture, repeated. YARN Apps (Stream Processing, Real-time Search, MPI, etc.) open up many new use cases.]
Falcon enables Governance of Data Pipelines
[Diagram: the HDP Data Lake solution architecture, repeated. Manage Steps 1-4: Data Management with Falcon (data pipeline & flow management), Security with HDP Advanced Security.]
Apache Falcon: Data Governance in the Lake
Falcon adds the required data governance features
• Data pipeline (Raw, Clean, Prep) defined in Falcon
• Auto generate & orchestrate multiple complex Oozie workflows (Job1, Job2, ... JobN) and other Hadoop ecosystem tools, e.g. DistCp
• DEFINITION: Replication | Retention | Eviction | Late data
• MONITORING
• TRACING: Audit | Lineage | Tagging
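The retention and eviction features named on this slide can be pictured with a small sketch. This is not Falcon's API or entity format (Falcon entities are declared in XML); the `FeedInstance` type, the `select_for_eviction` function, the feed name, and the 90-day window are all hypothetical, chosen only to illustrate the idea of evicting feed instances that fall outside a retention window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FeedInstance:
    """One dated instance of a feed (hypothetical type, not a Falcon class)."""
    feed: str
    timestamp: datetime


def select_for_eviction(instances, now, retention):
    """Return the instances older than the retention window, oldest first."""
    cutoff = now - retention
    expired = [i for i in instances if i.timestamp < cutoff]
    return sorted(expired, key=lambda i: i.timestamp)


# Assumed sample data: three instances of a made-up "raw_clicks" feed.
now = datetime(2014, 9, 1)
instances = [
    FeedInstance("raw_clicks", datetime(2014, 5, 1)),
    FeedInstance("raw_clicks", datetime(2014, 8, 30)),
    FeedInstance("raw_clicks", datetime(2014, 3, 15)),
]

evicted = select_for_eviction(instances, now, timedelta(days=90))
for inst in evicted:
    print(inst.feed, inst.timestamp.date())
```

With a 90-day window, the March and May instances fall before the cutoff and are selected; the late-August instance is kept.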
Mashing up diverse data types in the Data Lake 
Virtual Data Marts with Red Hat JBoss 
Data Virtualization and Hortonworks HDP 
Kimberly Palko 
Data Supply and Integration Solution 
Data Virtualization sits in front of multiple data sources and
• allows them to be treated as a single source
• delivering the desired data
• in the required form
• at the right time
• to any application and/or user.
THINK VIRTUAL MACHINE FOR DATA
Easy Access to Big Data 
• Reporting tool accesses the data virtualization server via rich SQL dialect
• The data virtualization server translates rich SQL dialect to HiveQL
• Hive translates HiveQL to MapReduce
• MapReduce runs MR job on big data
[Diagram: Analytical Reporting Tool -> Data Virtualization Server -> Hive -> MapReduce -> HDFS (Hadoop, Big Data)]
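The Hive-to-MapReduce step on this slide can be pictured with a toy, single-process sketch of how a grouped count such as `SELECT page, COUNT(*) FROM clicks GROUP BY page` maps onto map, shuffle, and reduce phases. It illustrates the programming model only, not how Hive plans or runs queries; the `clicks` rows, the query, and all function names are made up for the example.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit a (key, 1) pair for every input row."""
    for row in rows:
        yield row["page"], 1


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


# Assumed sample input standing in for clickstream rows in HDFS.
clicks = [{"page": "/home"}, {"page": "/cart"}, {"page": "/home"}]

counts = reduce_phase(shuffle_phase(map_phase(clicks)))
print(counts)
```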
Use Case 1: Combine data from Hadoop with traditional data sources
Problem:
Data from new data sources like social media, clickstream and sensors needs to be combined with data from traditional sources to get the full value.
Solution:
Leverage JBoss Data Virtualization to mashup new data in Hadoop with data in traditional data sources without moving or copying any data and access it through a variety of BI tools and SOA technologies.
[Diagram: JBoss Data Virtualization (Connect, Compose, Consume) over Hive and relational sources]
• SOURCE 1: Hive/Hadoop contains data from new data sources like social media, clickstream and sensor data
• SOURCE 2: Traditional relational databases in the enterprise
• Data can be accessed by multiple tools and methods already in-house
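As a rough picture of the mashup idea, the sketch below joins rows from two separate sources in a single query without copying either table. Two attached in-memory SQLite databases stand in for Hive and a relational database; this is an analogy, not JBoss Data Virtualization itself, and the schema names, tables, and sample rows are assumptions made for the example.

```python
import sqlite3

# Two attached in-memory databases stand in for the two sources (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS hadoop_src")
conn.execute("ATTACH DATABASE ':memory:' AS rdbms_src")

conn.execute("CREATE TABLE hadoop_src.clicks (customer_id INTEGER, page TEXT)")
conn.execute("CREATE TABLE rdbms_src.customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO hadoop_src.clicks VALUES (?, ?)",
    [(1, "/home"), (1, "/cart"), (2, "/home")],
)
conn.executemany(
    "INSERT INTO rdbms_src.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def clicks_per_customer(connection):
    """Run one query that spans both sources; neither table is copied."""
    sql = """
        SELECT c.name, COUNT(*) AS clicks
        FROM rdbms_src.customers AS c
        JOIN hadoop_src.clicks AS k ON k.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(sql).fetchall()


result = clicks_per_customer(conn)
print(result)
```

The caller sees one result set and does not need to know which source each column came from, which is the point of the "Compose" layer in the diagram.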
Use Case 2: Federating across Geographically Distributed Hadoop Clusters
Problem:
Geographically distributed Hadoop clusters contain sensitive data like patient records or customer identification that cannot be accessed by other regions due to regulatory policy. IT needs access to all data, but users can only access the data in their region.
Solution:
Leverage JBoss Data Virtualization to provide Row Level Security and Masking of columns while federating across Hadoop clusters.
[Diagram: JBoss Data Virtualization (Connect, Compose, Consume) over two Hive sources]
• Hive: Hadoop cluster in one geographic region
• Hive: Hadoop cluster in a second geographic region
• Data can be accessed by multiple tools and methods already in-house
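Row-level security and column masking can be sketched in a few lines. The policy below is a hypothetical reading of the slide (IT sees every region unmasked, other users see only their own region with sensitive columns masked); the `apply_policy` function, the `is_it` flag, the mask string, and the sample records are illustrative assumptions, not JBoss Data Virtualization configuration.

```python
MASK = "****"


def apply_policy(rows, user_region, masked_columns, is_it=False):
    """Filter rows to the caller's region and mask sensitive columns.

    Hypothetical policy: IT users bypass both the row filter and the masking.
    """
    visible = []
    for row in rows:
        if not is_it and row["region"] != user_region:
            continue  # row-level security: drop rows from other regions
        if is_it:
            visible.append(dict(row))
        else:
            # column masking: replace sensitive values, keep the rest
            visible.append(
                {k: (MASK if k in masked_columns else v) for k, v in row.items()}
            )
    return visible


# Assumed sample records federated from two regional clusters.
records = [
    {"region": "EU", "patient_id": "P-001", "diagnosis": "flu"},
    {"region": "US", "patient_id": "P-002", "diagnosis": "cold"},
]

eu_view = apply_policy(records, "EU", {"patient_id"})
it_view = apply_policy(records, "EU", {"patient_id"}, is_it=True)
print(eu_view)
print(len(it_view))
```

The function returns new dictionaries, so the underlying records are never modified by applying a policy.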
Data for entire organization in Hadoop Data Lake 
Problem: How does IT control access and give business users just the 
data they need? 
- Does every line of business have access to everyone’s data? 
- How do business users get access to the data they need in a 
simple (even self-service) way? 
[Diagram: Hadoop Data Lake containing HR Employee Files, Server Logs, Marketing Clickstream Data, Finance Expense Reports, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
Secure, Self-Service Virtual Data Marts for Hadoop 
Solution: Use JBoss Data Virtualization to create virtual data marts 
on top of a Hadoop cluster 
- Lines of Business get access to the data they need in a simple manner 
- IT maintains the process and control it needs 
- All data remains in the data lake, nothing is copied or moved 
[Diagram: virtual data marts (Marketing, Finance, IT, Sales) on top of the Hadoop Data Lake, which contains Marketing Clickstream Data, HR Employee Files, Sales Transactions, Finance Expense Reports, Customer Accounts, Twitter Sentiment Data, Server Logs]
Optional hierarchical data architectures with virtual data mart 
Can be combined with security features like user role access and row and 
column masking 
[Diagram: Dept Base Virtual Database (VDB) with Team 1 VDB and Team 2 VDB, and View1, View2]
Want most recent data in an operational data store 
Problem: All the legacy and archived data is in the Hadoop data lake. 
We want to access the most recent, up to the minute, operational data 
often and quickly. 
[Diagram: Hadoop Data Lake holding Historical Data: Marketing Clickstream Data, Finance Expense Reports, HR Employee Files, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
Caching For Faster Performance – Materialized View 
[Diagram: Query 1 and Query 2 run against a Virtual Database (VDB); View 1 is served from a Cached or Materialized View 1]
• Same cached view for multiple queries
• Refreshed automatically or manually
• Cache repository can be any supported data source
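The three caching properties on this slide can be illustrated with a minimal sketch: one cached result shared by several queries, refreshed automatically after a time-to-live or manually on request. The `CachedView` class, the 60-second TTL, the injected clock, and the sample rows are hypothetical and do not reflect how JBoss Data Virtualization implements materialized views.

```python
import time


class CachedView:
    """Cache the result of a loader; refresh on a TTL or on demand."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows = None
        self._loaded_at = None

    def refresh(self):
        """Manual refresh: reload from the underlying source now."""
        self._rows = self._loader()
        self._loaded_at = self._clock()

    def rows(self):
        """Return cached rows, reloading automatically once the TTL has passed."""
        expired = (
            self._loaded_at is None
            or self._clock() - self._loaded_at >= self._ttl
        )
        if expired:
            self.refresh()
        return self._rows


loads = {"count": 0}


def slow_source_query():
    """Stand-in for an expensive query against the data lake (assumption)."""
    loads["count"] += 1
    return [("2014-08", 120), ("2014-09", 95)]


fake_now = [0.0]  # controllable clock so the example runs instantly
view = CachedView(slow_source_query, ttl_seconds=60, clock=lambda: fake_now[0])

query_1 = [r for r in view.rows() if r[1] > 100]  # first query loads the view
query_2 = sum(r[1] for r in view.rows())          # second query reuses the cache
fake_now[0] = 61.0
view.rows()                                       # TTL passed: automatic refresh
print(loads["count"])
```

Injecting the clock keeps the TTL behavior testable without sleeping; in real use the default `time.monotonic` would apply.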
Want most recent data in an operational data store 
Solution: Use JBoss Data Virtualization to integrate up to the minute data from 
multiple diverse data sources that can be quickly queried. 
- Use HDP for all data older than today. 
- Use JDV to materialize the data in HDP for faster access and to combine with operational VDB 
[Diagram: an Operational VDB with up to the minute data is combined with a Materialized View of the Historical Data in the Hadoop Data Lake (Marketing Clickstream Data, HR Employee Files, Finance Expense Reports, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data); Nightly Transfer from Data Sources]
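The split described on this slide (HDP for everything older than today, an operational source for today) amounts to a union with a date cutoff. The sketch below shows that idea only; the `combined_view` function, the row layout, and the sample figures are assumptions for illustration.

```python
from datetime import date


def combined_view(historical_rows, operational_rows, today):
    """Union of historical rows older than today with today's operational rows."""
    older = [r for r in historical_rows if r["day"] < today]
    current = [r for r in operational_rows if r["day"] >= today]
    return sorted(older + current, key=lambda r: r["day"])


# Assumed sample data: two historical days and one operational day.
today = date(2014, 9, 17)
historical = [
    {"day": date(2014, 9, 15), "orders": 40},
    {"day": date(2014, 9, 16), "orders": 52},
]
operational = [{"day": date(2014, 9, 17), "orders": 7}]

rows = combined_view(historical, operational, today)
print([r["orders"] for r in rows])
```

Filtering both sides on the same cutoff avoids double counting if the nightly transfer has already copied part of today's data into the lake.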
Demonstration: Virtual Data Marts with Hadoop Data Lake
Kenny Peeples
Use Case 3 - Overview 
Objective:
– Purpose-oriented data views for functional teams over a rich variety of semi-structured and structured data
Problem:
– Data Lakes have large volumes of consolidated clickstream data, product and customer data that need to be constrained for multi-departmental use.
Solution:
– Leverage HDP to mash up Clickstream analysis data with product and customer data on HDP to answer
– Leverage JBoss Data Virtualization to provide virtual data marts for each of Marketing and Product teams to …..
Use Case 3 - Architecture
[Diagram, summarized]
• APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
• VIRTUAL DATA MART
• DATA SYSTEM: HDP 2.1 (Governance & Integration, Security, Operations, Data Access, Data Management)
• SOURCES: Emerging Sources (Sensor, Sentiment, Geo, Unstructured); Existing Sources (CRM, ERP, Clickstream, Logs)
Use Case 3 - Resources 
• GUIDE 
How to guide: https://github.com/DataVirtualizationByExample/HortonworksUseCase3 
Tutorial: Available soon 
• VIDEOS: 
http://vimeo.com/user16928011/hwxuc3configuration 
http://vimeo.com/user16928011/hwxuc3run 
http://vimeo.com/user16928011/hwxuc3overview 
• SOURCE: 
https://github.com/DataVirtualizationByExample/HortonworksUseCase3 
Benefits of JBoss Data Virtualization with Hortonworks HDP 2.1
• Creates virtual databases for controlling access to data in a data lake while giving lines of business the autonomy they seek
• Combines new data in Hadoop with data in traditional data sources without moving or copying data
• Gives access to a variety of BI and analytics tools
• Provides caching for faster access to data
• Provides consistent security policy across multiple data sources
Thank you! 
Hortonworks and Red Hat JBoss Data Virtualization
Next Steps... 
More about Red Hat & Hortonworks 
http://hortonworks.com/partner/redhat 
Download the Hortonworks Sandbox 
Learn Hadoop 
Build Your Analytic App 
Try Hadoop 2 
Contact us: events@hortonworks.com 
Don’t Forget to Register for our Next Webinar! 
September 17th, 10 AM PST 
Red Hat JBoss Data Virtualization and Hortonworks Data Platform 
http://info.hortonworks.com/RedHatSeries_Hortonworks.html

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
 
Data Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop ImplementationData Lake for the Cloud: Extending your Hadoop Implementation
Data Lake for the Cloud: Extending your Hadoop Implementation
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Bigger Data For Your Budget
Bigger Data For Your BudgetBigger Data For Your Budget
Bigger Data For Your Budget
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
 
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior GraphsPredicting Customer Experience through Hadoop and Customer Behavior Graphs
Predicting Customer Experience through Hadoop and Customer Behavior Graphs
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 

Similar a Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3

Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Innovative Management Services
 
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGReal-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
skumpf
 

Similar a Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3 (20)

Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in Hadoop
 
OOP 2014
OOP 2014OOP 2014
OOP 2014
 
Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2Hortonworks and Red Hat Webinar - Part 2
Hortonworks and Red Hat Webinar - Part 2
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Cloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a championCloud Austin Meetup - Hadoop like a champion
Cloud Austin Meetup - Hadoop like a champion
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - final
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
YARN - Strata 2014
YARN - Strata 2014YARN - Strata 2014
YARN - Strata 2014
 
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUGReal-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
Real-Time Processing in Hadoop for IoT Use Cases - Phoenix HUG
 
How YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in HadoopHow YARN Enables Multiple Data Processing Engines in Hadoop
How YARN Enables Multiple Data Processing Engines in Hadoop
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
 
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in HadoopDiscover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
Discover HDP2.1: Apache Storm for Stream Data Processing in Hadoop
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun Murthy
 
Spark Summit EMEA - Arun Murthy's Keynote
Spark Summit EMEA - Arun Murthy's KeynoteSpark Summit EMEA - Arun Murthy's Keynote
Spark Summit EMEA - Arun Murthy's Keynote
 
Hadoop crashcourse v3
Hadoop crashcourse v3Hadoop crashcourse v3
Hadoop crashcourse v3
 
Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1Hortonworks Data in Motion Webinar Series - Part 1
Hortonworks Data in Motion Webinar Series - Part 1
 

Más de Hortonworks

Más de Hortonworks (20)

Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
 
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT StrategyIoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
IoT Predictions for 2019 and Beyond: Data at the Heart of Your IoT Strategy
 
Getting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with CloudbreakGetting the Most Out of Your Data in the Cloud with Cloudbreak
Getting the Most Out of Your Data in the Cloud with Cloudbreak
 
Johns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log EventsJohns Hopkins - Using Hadoop to Secure Access Log Events
Johns Hopkins - Using Hadoop to Secure Access Log Events
 
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad GuysCatch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
Catch a Hacker in Real-Time: Live Visuals of Bots and Bad Guys
 
HDF 3.2 - What's New
HDF 3.2 - What's NewHDF 3.2 - What's New
HDF 3.2 - What's New
 
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging ManagerCuring Kafka Blindness with Hortonworks Streams Messaging Manager
Curing Kafka Blindness with Hortonworks Streams Messaging Manager
 
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical EnvironmentsInterpretation Tool for Genomic Sequencing Data in Clinical Environments
Interpretation Tool for Genomic Sequencing Data in Clinical Environments
 
IBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data LandscapeIBM+Hortonworks = Transformation of the Big Data Landscape
IBM+Hortonworks = Transformation of the Big Data Landscape
 
Premier Inside-Out: Apache Druid
Premier Inside-Out: Apache DruidPremier Inside-Out: Apache Druid
Premier Inside-Out: Apache Druid
 
Accelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at ScaleAccelerating Data Science and Real Time Analytics at Scale
Accelerating Data Science and Real Time Analytics at Scale
 
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATATIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
TIME SERIES: APPLYING ADVANCED ANALYTICS TO INDUSTRIAL PROCESS DATA
 
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
Blockchain with Machine Learning Powered by Big Data: Trimble Transportation ...
 
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: ClearsenseDelivering Real-Time Streaming Data for Healthcare Customers: Clearsense
Delivering Real-Time Streaming Data for Healthcare Customers: Clearsense
 
Making Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with EaseMaking Enterprise Big Data Small with Ease
Making Enterprise Big Data Small with Ease
 
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World PresentationWebinewbie to Webinerd in 30 Days - Webinar World Presentation
Webinewbie to Webinerd in 30 Days - Webinar World Presentation
 
Driving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data ManagementDriving Digital Transformation Through Global Data Management
Driving Digital Transformation Through Global Data Management
 
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming FeaturesHDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
HDF 3.1 pt. 2: A Technical Deep-Dive on New Streaming Features
 
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
Hortonworks DataFlow (HDF) 3.1 - Redefining Data-In-Motion with Modern Data A...
 
Unlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDCUnlock Value from Big Data with Apache NiFi and Streaming CDC
Unlock Value from Big Data with Apache NiFi and Streaming CDC
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3

  • 1. We’ll get started soon… Q&A box is available for your questions Webinar will be recorded for future viewing Thank you for joining! Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
  • 2. Deliver the Data Lake (demo/deep dive) …using HDP and Red Hat JBoss Data Virtualization. We do Hadoop.
  • 3. Your speakers… Raghu Thiagarajan, Dir, Partner Product Management, Hortonworks; Kimberly Palko, Principal Product Manager, Red Hat; Kenny Peeples, Principal Technical Marketing Manager, Red Hat
  • 4. An architectural shift towards an HDP Data Lake. Unlocking the Data Lake [Diagram labels: SCALE, SCOPE; RDBMS, MPP, EDW; Data Lake]. Data Lake enabled by YARN: single data repository, shared infrastructure; multiple biz apps accessing all the data; enable a shift from reactive to proactive interactions; gain new insight across the entire enterprise. New analytic apps or IT optimization. [HDP 2.1 stack labels: Governance & Integration, Security, Operations, Data Access, YARN, Data Management]
  • 5. What is a Data Lake? An architectural pattern in the data center that uses Hadoop to deliver deeper insight across a large, broad, diverse set of data efficiently. It is a multipurpose, open PLATFORM for data (NOT a database); it lands all data in a single place so it can be interacted with in many ways; it allows the ecosystem to provide higher-level services (SAS, SAP, Microsoft for Streaming, MPP, In-memory, etc.); it offers first-class data management capabilities (metadata management, security, transformation pipelines, replication, retention, etc.).
  • 6. HDP Data Lake Solution Architecture. [Diagram] SOURCE DATA: click stream, sales transactions, product data, marketing/inventory, social data, EDW. Ingestion: Sqoop, Flume, NFS, WebHDFS, REST, HTTP, Streaming, Storm, JMS. Step 1: Extract & Load. Step 2: Model/Apply Metadata, with HCatalog (table & user-defined metadata). Step 3: Transform, Aggregate & Materialize, with Hive, Pig, Cascading (data processing). Step 4: Schedule and Orchestrate, with Falcon (data pipeline & flow management). Manage Steps 1-4: data management with Falcon, security with HDP Advanced Security; Apache Argus (unified access controls and audit). Data Lake HDP Grid: compute & storage nodes under YARN, managed with Ambari; YARN apps (MR2, Tez, Storm, Solr, Mahout, Graph, SAS; stream processing, real-time search, MPI, etc.) open up many new use cases. Use Case Type 1: Materialize & Exchange, via HBase client and Sqoop/Hive to downstream data sources (OLTP, HBase, EDW (Teradata)). Use Case Type 2: Explore/Visualize, via interactive Hive Server (Tez/Stinger) and query/analytics/reporting tools (Tableau, Excel, Microstrategy, Datameer, Platfora, Business Objects).
  • 7. HDP Data Lake Solution Architecture + Virtual Data Mart. [Diagram: the solution architecture of slide 6 with virtual data mart elements added: Dept Base Virtual Database (VDB), Team 1 VDB, Team 2 VDB, View1, View2]
  • 8. YARN allows for new processing engines. [Diagram: the solution architecture of slide 6]
  • 9. Falcon enables Governance of Data Pipelines. [Diagram: the solution architecture of slide 6]
  • 10. Apache Falcon: Data Governance in the Lake. Falcon adds the required data governance features. A data pipeline (Raw, Clean, Prep) is defined in Falcon, which auto-generates and orchestrates multiple complex Oozie workflows (Job1, Job2, … JobN) and other Hadoop ecosystem tools, e.g. DistCp. Features: DEFINITION (Replication, Retention, Eviction, Late data); MONITORING; TRACING (Audit, Lineage, Tagging).
  • 11. Mashing up diverse data types in the Data Lake
  • 12. Mashing up diverse data types in the Data Lake
  • 13. Mashing up diverse data types in the Data Lake
  • 14. Mashing up diverse data types in the Data Lake
  • 15. Mashing up diverse data types in the Data Lake
  • 16. Mashing up diverse data types in the Data Lake
  • 17. Virtual Data Marts with Red Hat JBoss Data Virtualization and Hortonworks HDP. Kimberly Palko
  • 18. Data Supply and Integration Solution. Data Virtualization sits in front of multiple data sources and allows them to be treated as a single source, delivering the desired data, in the required form, at the right time, to any application and/or user. THINK VIRTUAL MACHINE FOR DATA
  • 19. Easy Access to Big Data. The reporting tool accesses the data virtualization server via a rich SQL dialect; the data virtualization server translates the rich SQL dialect to HiveQL; Hive translates HiveQL to MapReduce; MapReduce runs the MR job on big data. [Diagram stack: Analytical Reporting Tool, Data Virtualization Server, Hive, MapReduce, HDFS (Hadoop Big Data)]
  • 20. Use Case 1: Combine data from Hadoop with traditional data sources. Problem: Data from new data sources like social media, clickstream and sensors needs to be combined with data from traditional sources to get the full value. Solution: Leverage JBoss Data Virtualization to mash up new data in Hadoop with data in traditional data sources without moving or copying any data, and access it through a variety of BI tools and SOA technologies. Data can be accessed by multiple tools and methods already in-house. [Diagram: JBoss Data Virtualization (Consume, Compose, Connect) over SOURCE 1: Hive/Hadoop, which contains data from new data sources like social media, clickstream and sensor data, and SOURCE 2: traditional relational databases in the enterprise]
  • 21. Use Case 2: Federating across Geographically Distributed Hadoop Clusters. Problem: Geographically distributed Hadoop clusters contain sensitive data, like patient records or customer identification, that cannot be accessed by other regions due to regulatory policy. IT needs access to all data, but users can only access the data in their region. Solution: Leverage JBoss Data Virtualization to provide row-level security and masking of columns while federating across Hadoop clusters. Data can be accessed by multiple tools and methods already in-house. [Diagram: JBoss Data Virtualization (Consume, Compose, Connect) over Hive on a Hadoop cluster in one geographic region and Hive on a Hadoop cluster in a second geographic region]
  • 22. Data for entire organization in Hadoop Data Lake. Problem: How does IT control access and give business users just the data they need? Does every line of business have access to everyone’s data? How do business users get access to the data they need in a simple (even self-service) way? [Diagram: Hadoop Data Lake containing HR Employee Files, Server Logs, Marketing Clickstream Data, Finance Expense Reports, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
  • 23. Secure, Self-Service Virtual Data Marts for Hadoop. Solution: Use JBoss Data Virtualization to create virtual data marts on top of a Hadoop cluster. Lines of business get access to the data they need in a simple manner; IT maintains the process and control it needs; all data remains in the data lake, nothing is copied or moved. [Diagram labels: Marketing, Finance, IT, Sales; Hadoop Data Lake containing Marketing Clickstream Data, HR Employee Files, Sales Transactions, Finance Expense Reports, Customer Accounts, Twitter Sentiment Data, Server Logs]
  • 24. Optional hierarchical data architectures with virtual data mart. Can be combined with security features like user role access and row and column masking. [Diagram: Dept Base Virtual Database (VDB), Team 1 VDB, Team 2 VDB, View1, View2]
  • 25. Want most recent data in an operational data store. Problem: All the legacy and archived data is in the Hadoop data lake. We want to access the most recent, up-to-the-minute operational data often and quickly. [Diagram: Hadoop Data Lake (Historical Data) containing Marketing Clickstream Data, Finance Expense Reports, HR Employee Files, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
  • 26. Caching For Faster Performance – Materialized View. Same cached view for multiple queries; refreshed automatically or manually; cache repository can be any supported data source. [Diagram: Query 1 and Query 2 against a Virtual Database (VDB), with View 1 backed by a cached or materialized View 1]
  • 27. Want most recent data in an operational data store. Solution: Use JBoss Data Virtualization to integrate up-to-the-minute data from multiple diverse data sources that can be quickly queried. Use HDP for all data older than today. Use JDV to materialize the data in HDP for faster access and to combine with operational VDB. [Diagram labels: Materialized View, Operational VDB, Historical Data with up to the minute data, Nightly Transfer from Data Sources; Hadoop Data Lake containing Marketing Clickstream Data, HR Employee Files, Finance Expense Reports, Server Logs, Sales Transactions, Customer Accounts, Twitter Sentiment Data]
  • 28. Demonstration: Virtual Data Marts with Hadoop Data Lake. Kenny Peeples
  • 29. Use Case 3 - Overview. Objective: Purpose-oriented data views for functional teams over a rich variety of semi-structured and structured data. Problem: Data Lakes have large volumes of consolidated clickstream data, product and customer data that need to be constrained for multi-departmental use. Solution: Leverage HDP to mash up Clickstream analysis data with product and customer data on HDP to answer - Leverage JBoss Data Virtualization to provide virtual data marts for each of the Marketing and Product teams to …..
  • 30. Use Case 3 - Architecture. [Diagram] APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications. DATA SYSTEM: VIRTUAL DATA MART over HDP 2.1 (Governance & Integration, Security, Operations, Data Access, Data Management). SOURCES: Emerging Sources (Sensor, Sentiment, Geo, Unstructured), Existing Sources (CRM, ERP, Clickstream, Logs).
  • 31. Use Case 3 - Resources. GUIDE: How-to guide: https://github.com/DataVirtualizationByExample/HortonworksUseCase3 ; Tutorial: Available soon. VIDEOS: http://vimeo.com/user16928011/hwxuc3configuration , http://vimeo.com/user16928011/hwxuc3run , http://vimeo.com/user16928011/hwxuc3overview . SOURCE: https://github.com/DataVirtualizationByExample/HortonworksUseCase3
  • 32. Benefits of JBoss Data Virtualization with Hortonworks HDP 2.1: creates virtual databases for controlling access to data in a data lake while giving lines of business the autonomy they seek; combines new data in Hadoop with data in traditional data sources without moving or copying data; gives access to a variety of BI and analytics tools; provides caching for faster access to data; provides consistent security policy across multiple data sources.
  • 33. Thank you! Hortonworks and Red Hat JBoss Data Virtualization
  • 34. Next Steps... More about Red Hat & Hortonworks: http://hortonworks.com/partner/redhat . Download the Hortonworks Sandbox: Learn Hadoop, Build Your Analytic App, Try Hadoop 2. Contact us: events@hortonworks.com
  • 35. Don’t Forget to Register for our Next Webinar! September 17th, 10 AM PST: Red Hat JBoss Data Virtualization and Hortonworks Data Platform. http://info.hortonworks.com/RedHatSeries_Hortonworks.html
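
The virtual data mart idea from slides 21 to 24 (per-team views over shared data, with row-level security and column masking, and nothing copied or moved) can be sketched in plain Python. This is only a minimal illustration and not the JBoss Data Virtualization API: the `VirtualView` class, the `CUSTOMERS` sample rows, the `mask_all_but_last4` helper, and the two example views are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]

# Hypothetical rows standing in for one table in the data lake.
CUSTOMERS: List[Row] = [
    {"id": 1, "region": "EU", "name": "Ana", "ssn": "111-22-3333"},
    {"id": 2, "region": "US", "name": "Bob", "ssn": "444-55-6666"},
    {"id": 3, "region": "EU", "name": "Cleo", "ssn": "777-88-9999"},
]


def mask_all_but_last4(value: object) -> str:
    """Column mask: hide everything except the last four characters."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


@dataclass(frozen=True)
class VirtualView:
    """A read-only view: a row filter plus optional per-column masks.

    The view holds no data of its own; every query reads the source
    and builds new result rows, so the source is never modified.
    """

    row_filter: Callable[[Row], bool]
    masks: Dict[str, Callable[[object], object]] = field(default_factory=dict)

    def query(self, source: Iterable[Row]) -> List[Row]:
        result: List[Row] = []
        for row in source:
            if not self.row_filter(row):
                continue  # row-level security: skip rows the view may not see
            visible = {}
            for column, value in row.items():
                mask = self.masks.get(column)
                visible[column] = mask(value) if mask else value
            result.append(visible)
        return result


# A regional analyst sees only EU rows, with the sensitive column masked.
eu_analyst_view = VirtualView(
    row_filter=lambda row: row["region"] == "EU",
    masks={"ssn": mask_all_but_last4},
)

# IT sees every row, unmasked.
it_view = VirtualView(row_filter=lambda row: True)

if __name__ == "__main__":
    for visible_row in eu_analyst_view.query(CUSTOMERS):
        print(visible_row)
```

Both views read the same underlying rows, which mirrors the slides' point that access control is applied at the view layer while the data itself stays in place.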