Tapping into the Big Data Reservoir (CON7934)
- 1. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration: CON7934
Tapping into the Big Data Reservoir with All Data
Jeff Pollock
Vice President, Oracle Data Integration
1 Oracle OpenWorld 2014
- 2. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
- 3. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Today’s Agenda
1. Oracle Data Integration Solutions
2. Big Data Reservoir
•Next generation data platform architecture on Hadoop
3. Oracle Data Integration for Big Data Reservoir
•Take complete advantage of the modern Big Data platform and leave legacy ETL tools behind
4. Proven Results with Big Data
•Beyond theory, early adopters getting benefits NOW!
- 4. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration Solutions and Proven Benefits
Improve Agility
•Deploy Projects Faster
•Reliable Real-Time
Reduce Risk
•Popular, Proven Tools
•Open, Not Proprietary
Reduce Costs
•Better Productivity
•Eliminate ETL Servers
Analytic Data Integration
•Big Data Integration & Governance
•Data Warehouse Integration
•Business Intelligence Applications
Enterprise Data Integration and Governance
•Enterprise Data Quality and Profiling
•Comprehensive, Heterogeneous Data Integration
•Business Glossary and Metadata Management
Business Continuity
•Active-Active for Maximum Availability
•Zero Downtime Migrations
•Data Consolidation / Application Modernization
24 x 7 x 365
- 5. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Comprehensive Data Integration & Governance Capabilities
Real-Time Data Movement
–Low impact capture, stage in Hadoop
–Continuous data availability
Data Transformation
–Bulk data movement
–Pushdown data processing
Data Federation
–Virtualized Data Services
Data Quality & Verification
–Fix quality at the source
–Verify data consistency
Metadata Management
–Lineage and Impact Analysis
–Business Glossary Semantics
Data Governance Foundation
•Oracle Data Integrator (Transformation)
•Oracle GoldenGate (Movement)
•Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate)
•Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
•Data Service Integrator (Federation)
•GoldenGate Veridata (Online Data Verification)
Fast Load
ELT Processing on Hadoop or SQL
Continuous Availability
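The "verify data consistency" capability listed on this slide can be pictured as comparing per-row fingerprints between a source and a target. The sketch below is only a conceptual illustration using Python's built-in sqlite3 and hashlib; it is not how GoldenGate Veridata works internally, and the `accounts` table, its columns, and the helper names are assumptions made up for the example.

```python
import hashlib
import sqlite3


def row_fingerprints(conn, table, key):
    """Map each key value to a SHA-256 hash of the full row."""
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    columns = [d[0] for d in cursor.description]
    key_index = columns.index(key)
    prints = {}
    for row in cursor:
        digest = hashlib.sha256(repr(row).encode("utf-8")).hexdigest()
        prints[row[key_index]] = digest
    return prints


def compare(source, target, table, key):
    """Return keys missing on the target, extra on the target, and mismatched."""
    src = row_fingerprints(source, table, key)
    tgt = row_fingerprints(target, table, key)
    missing = sorted(set(src) - set(tgt))
    extra = sorted(set(tgt) - set(src))
    changed = sorted(k for k in set(src) & set(tgt) if src[k] != tgt[k])
    return missing, extra, changed


def make_db(rows):
    """Build a hypothetical in-memory database holding one accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    return conn


source_db = make_db([(1, "ann", 10.0), (2, "bob", 20.0), (3, "cy", 30.0)])
target_db = make_db([(1, "ann", 10.0), (2, "bob", 25.0), (4, "dee", 40.0)])
report = compare(source_db, target_db, "accounts", "id")
print(report)
```

Comparing hashes rather than full rows keeps the amount of data exchanged between the two sides small, which is one plausible reason to verify this way while both systems stay online.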
- 6. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Differentiated Technical Approach
Dynamic Data Movement
–Real-time CDC is the default, not ETL
–Least invasive on sources
–Proven best performance
–Integrated Oracle capture/apply
No ETL Engines
–Take the processing to the data; don’t move the data to the process
–Leverage your data engines for the workloads (Hadoop or SQL)
Most Heterogeneous
–Leverage open source Hadoop, not proprietary distributions
–Hadoop is the Hub, not ETL tools
–Open metadata standards
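The "No ETL Engines" point on this slide (take the processing to the data) can be illustrated with a small contrast between engine-style ETL and E-LT pushdown. This is a minimal sketch using Python's built-in sqlite3 as a stand-in data engine; the table names and functions are hypothetical, and real ODI mappings generate engine-specific code rather than this hand-written SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_raw (region TEXT, amount REAL);
    CREATE TABLE sales_by_region_etl (region TEXT PRIMARY KEY, total REAL);
    CREATE TABLE sales_by_region_elt (region TEXT PRIMARY KEY, total REAL);
""")
conn.executemany(
    "INSERT INTO sales_raw VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 7.5)],
)


def etl_style(conn):
    """Engine-style ETL: pull every row out, aggregate in the client, write back."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales_raw"):
        totals[region] = totals.get(region, 0.0) + amount
    conn.executemany(
        "INSERT INTO sales_by_region_etl VALUES (?, ?)", list(totals.items())
    )


def elt_pushdown(conn):
    """E-LT pushdown: send one statement; the data engine does the work in place."""
    conn.execute("""
        INSERT INTO sales_by_region_elt
        SELECT region, SUM(amount) FROM sales_raw GROUP BY region
    """)


etl_style(conn)
elt_pushdown(conn)
etl_rows = conn.execute(
    "SELECT * FROM sales_by_region_etl ORDER BY region"
).fetchall()
elt_rows = conn.execute(
    "SELECT * FROM sales_by_region_elt ORDER BY region"
).fetchall()
print(etl_rows == elt_rows, elt_rows)
```

Both paths produce the same table, but in the pushdown version no row ever leaves the engine, so there is no separate transformation server to size and scale.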
- 7. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Today’s Agenda
1. Oracle Data Integration Solutions
2. Big Data Reservoir
•Next generation data platform architecture on Hadoop
3. Oracle Data Integration for Big Data Reservoir
•Take complete advantage of the modern Big Data platform and leave legacy ETL tools behind
4. Proven Results with Big Data
•Beyond theory, early adopters getting benefits NOW!
- 8. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Why the word “Reservoir?”
https://blogs.oracle.com/bigdata/entry/big_data_and_analytic_top
- 9. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
True Hadoop Opportunity: Big Data Reservoir
Deep Data Storage
Data Preparation
Data Discovery
Data staged / merged in Hadoop to provide single place to explore/discover data
External data staging and long running batch jobs run in Hadoop to make the most of the DB
Store more raw detail data for less cost, while keeping aggregates in the DB
DW
Support for Exploratory Analytics without time consuming data modeling
Lower cost data staging and data preparation
Lower cost storage for questionable business data
Data Staging & Preparation
New Data Discovery
Detailed, Deep Data
- 10. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 10
Operational Data Flow and Staffing Models
Diagram activities: Reports & Dashboards; Query Planning; Data Integration; Data Modelling; Database Mgmt.; Data Visualization; Query Construction; Data Enrichment; Data Preparation; Data Exploration; Data Acquisition
Diagram groupings: Operational Responsibilities; Data Science & Discovery
Diagram roles: Data Scientists; DBAs, Developers, Data Stewards, Analysts
- 11. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Logical Architecture – Seamless Data Integration is Crucial
Virtualisation & Query Federation
Enterprise Performance Management
Pre-built & Ad-hoc BI Assets
Information Services
Data Ingestion
Information Interpretation
Access & Performance Layer
Foundation Data Layer
Raw Data Reservoir
Data Science
Data Engines & Poly-structured sources
Content
Docs
Web & Social Media
SMS
Structured Data Sources
•Operational Data
•COTS Data
•Streaming & BAM
Immutable raw data reservoir
Raw data at rest is not interpreted
Immutable modelled data. Business Process Neutral form. Abstracted from business process changes
Past, current and future interpretation of enterprise data. Structured to support agile access & navigation
Discovery Lab Sandboxes
Rapid Development Sandboxes
Project based data stores to support specific discovery objectives
Project based data stored to facilitate rapid content / presentation delivery
Data Sources
Master & Reference Data Sources
Data Integration & Governance (DI&G)
- 12. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Concrete Business Value with Big Data Reservoir
Lower TCO for the Data Warehouse
LoB Faster Access to Analytic Data
New Types of Analytics for All Data
•Control the costs of the Data Warehouse
•Massive value multipliers for Teradata and Netezza customers
•Put an end to the annual upgrade cycle
•Give analytics to the business earlier in the data lifecycle
•Avoid up front modelling overhead for Discovery
•Empower IT to focus on highest value analytics
•Run BI queries faster
•Support Exploratory Analytics directly from Hadoop
•Run Streaming Analytics from OEP, Storm, Flume etc.
•Drive new business solutions (telematics data, machine data, log data, unstructured data)
COST / SPEED / VALUE
- 13. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 13
Top US Automaker
Oracle Data Integration for Real-Time Data Delivery to Hadoop Reservoir
Petabyte Scale
- 14. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Today’s Agenda
1. Oracle Data Integration Solutions
2. Big Data Reservoir
•Next generation data platform architecture on Hadoop
3. Oracle Data Integration for Big Data Reservoir
•Take complete advantage of the modern Big Data platform and leave legacy ETL tools behind
4. Proven Results with Big Data
•Beyond theory, early adopters getting benefits NOW!
- 15. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration – Powerful Big Data Solutions
Commodity Data Reservoir
Leverage Oracle Data Integration with a wide array of databases or data warehouse appliances
Support Hadoop distributions on commodity hardware
Oracle Engineered Systems
Deeply integrated with Oracle Big Data Appliance and Exadata
Take advantage of InfiniBand performance, Oracle Big Data SQL, Columnar Compression, and all integrated Loader technologies
Streaming Big Data
Integrate real-time transactional databases with streaming analytics
Filter, join and transform data while it is in motion, make business decisions while data is in memory
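The idea of filtering, joining and transforming data "while it is in motion" can be pictured with plain Python generators, which handle one event at a time without first landing the stream anywhere. This is a toy sketch only, not the API of Oracle Event Processor or any other streaming product; `event_source`, `customers`, and the field names are assumptions invented for the example.

```python
def event_source():
    """Hypothetical stand-in for a live feed of transaction events."""
    yield {"customer_id": 1, "amount": 120.0}
    yield {"customer_id": 2, "amount": 10.0}
    yield {"customer_id": 3, "amount": 75.5}
    yield {"customer_id": 9, "amount": 500.0}  # no matching customer


# Hypothetical in-memory reference data to join against.
customers = {
    1: {"name": "ann", "tier": "gold"},
    2: {"name": "bob", "tier": "basic"},
    3: {"name": "cy", "tier": "silver"},
}


def filter_join_transform(events, reference, min_amount):
    """Lazily process events one by one; nothing is buffered or staged."""
    for event in events:
        # Filter: drop small transactions.
        if event["amount"] < min_amount:
            continue
        # Join: enrich with in-memory reference data.
        customer = reference.get(event["customer_id"])
        if customer is None:
            continue
        # Transform: reshape into the record the consumer wants.
        yield {
            "customer": customer["name"],
            "tier": customer["tier"],
            "amount_cents": round(event["amount"] * 100),
        }


enriched = list(filter_join_transform(event_source(), customers, min_amount=50.0))
print(enriched)
```

Because each stage only sees the current event, a decision can be made as soon as that event arrives rather than after a batch load completes.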
- 16. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Heterogeneous Reservoir with Oracle Data Integration
Diagram labels: Flume; SQOOP; OGG; ODI; Hive on MR, Tez, Spark; Pig on MR, Tez, Spark; Spark; Oozie; OEDQ (Data Validation & Cleansing); OEMM (Metadata Mgmt & Lineage)
Diagram sources and targets: Logs; OLTP DB; API/File; NoSQL; Any DW; Hive/HCat, HDFS, HBase
- 17. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle OpenWorld 2014 17
European Energy Co.
Oracle Data Integration for Data Staging and Transforming in Hortonworks
Real-Time to Hive
- 18. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Red Stack Reservoir with Oracle Data Integration
•Transform: Hive, ODI
•Load to Oracle: OLH/OSCH
•Federate Hive/HDFS to Oracle: Big Data SQL
•Load from Oracle: CopyToBDA
•Federate Oracle to Hive: Query Provider for Hadoop
Other diagram labels: Oracle DB OLTP; Hive/HDFS; OGG
- 19. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Engineered System for Big Data from Oracle
DISK
PCI FLASH
DRAM
Warm Data
Hottest Data
Active Data
•Engineered data platform
•ODI Data Transformation at the speed of DRAM or the scale of Hadoop
•Utilize each data tier for specialized algorithms & compression
•Speed of DRAM
•I/Os of Flash
•Cost of Disk
•Scale of Hadoop
Hadoop DISKS
Deep Data
Oracle Data Integrator
Oracle GoldenGate
Fully exploit Big Data SQL, In-Memory and No-SQL Advancements from Oracle
- 20. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 20
Top European Bank
Oracle Data Integration MapReduce Data Transformations in Big Data Appliance
Massively Parallel
- 21. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Streaming Reservoir with NoSQL and DIS
•Transform (Hive, Pig/Oozie, Spark): ODI
•Federate Hive/HDFS: Big Data SQL
•Load to Oracle: OLH/OSCH
Other diagram labels: Oracle NoSQL; Any DB; Sensors & Events; OEP; OGG; Hive/HDFS
- 22. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 22
US Digital TV Provider
Oracle Data Integration with Hadoop & Kafka
100m Tx/Hr
- 23. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Today’s Agenda
1. Oracle Data Integration Solutions
2. Big Data Reservoir
•Next generation data platform architecture on Hadoop
3. Oracle Data Integration for Big Data Reservoir
•Take complete advantage of the modern Big Data platform and leave legacy ETL tools behind
4. Proven Results with Big Data
•Beyond theory, early adopters getting benefits NOW!
- 24. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle OpenWorld 2014 24
The 90’s Are Calling: Don’t Custom Code Data Integration!!
Hey big data coders!
Yes, all you out there writing your data load programs in Scala, Pig Latin, HiveQL or Java MR…
Custom coded data loading is BAD, stay away!
Been there, done that with C++, Pipes and Pro*C
Debugging kills, live data is always bad, downtime is a major bummer, projects can’t scale to large teams…
When you are past Discovery and into Operations, use enterprise tools for their reliability and reach into existing IT systems.
- 25. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Integration Better
Dynamic Data Movement
–CDC is the default, not an add-on
–Least invasive on sources
–Proven best performance
–Native Oracle capture/apply
vs. Batch Data Movement
–Typical ETL vendors all default to batch data movement in their reference architectures
–Some can “talk the talk” but their CDC tech can’t touch Oracle GoldenGate scale/performance
No ETL Engine
–Take the processing to the data; don’t move the data to the process
–Leverage your data engines for the workloads (Hadoop or SQL)
vs. ETL Engine Must Scale Alongside Hadoop
–Carefully watch how ETL engines scale out; parallelism runs via the Engine – more H/W to buy
–Map out the physical deployment architecture, compare to GG & ODI, the benefits will be clear
Most Heterogeneous
–Leverage open source Hadoop, not proprietary distributions
–Hadoop is the Hub, not ETL tools
–Open metadata standards
vs. Proprietary Vendor Lock-in
–One popular ETL vendor puts their engines at the center of the architecture, not Hadoop
–The mainframe of ETL vendors has proprietary features that mainly run in their own distro
–“Fake free” ETL vendors sell proprietary add-ons
- 26. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Better: Dynamic Data Movement
HDFS (Files)
HBase (NoSQL)
Hive / Hive Streaming (SQL)
Flume & Storm (Streaming)
Kafka (MPP Pub/Sub)
Spark Streaming (Machine Learning)
Capture Database Transactions and Deliver to Big Data in Real-Time
Capture
Trail
Route
Deliver
Pump GoldenGate
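The capture, trail, pump/route and deliver stages named on this slide can be pictured as a small file-based pipeline. The sketch below is a conceptual toy in plain Python, not GoldenGate's actual trail format, processes, or API; the JSON record layout, file names, and function names are all assumptions made for illustration.

```python
import json
import os
import tempfile


def capture(changes, trail_path):
    """Append each committed change to a local trail, one JSON record per line."""
    with open(trail_path, "a", encoding="utf-8") as trail:
        for change in changes:
            trail.write(json.dumps(change) + "\n")


def pump(local_trail, remote_trail, checkpoint=0):
    """Route records past the checkpoint from the local trail to a remote trail."""
    with open(local_trail, encoding="utf-8") as src:
        records = src.readlines()[checkpoint:]
    with open(remote_trail, "a", encoding="utf-8") as dst:
        dst.writelines(records)
    return checkpoint + len(records)


def deliver(remote_trail, target):
    """Apply trail records to the target in the order they were captured."""
    with open(remote_trail, encoding="utf-8") as trail:
        for line in trail:
            change = json.loads(line)
            if change["op"] == "delete":
                target.pop(change["key"], None)
            else:  # insert or update
                target[change["key"]] = change["row"]
    return target


workdir = tempfile.mkdtemp()
local = os.path.join(workdir, "local.trail")
remote = os.path.join(workdir, "remote.trail")

capture(
    [
        {"op": "insert", "key": 1, "row": {"status": "new"}},
        {"op": "insert", "key": 2, "row": {"status": "new"}},
        {"op": "update", "key": 1, "row": {"status": "shipped"}},
        {"op": "delete", "key": 2},
    ],
    local,
)
checkpoint = pump(local, remote)
target_table = deliver(remote, {})
print(checkpoint, target_table)
```

Persisting changes to a trail before shipping them decouples the source from the target: if delivery stalls, capture keeps appending and the pump can resume from its checkpoint.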
- 27. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Better: Invented Pushdown Processing
ORCL Investments in ELT/Pushdown Tech
Timeline eras (from the 1990’s): Eon of Scripts and PL-SQL; Era of Native SQL; Big Data Revolution
•Scripted SQL
•Stored Procs
•Warehouse Builder
•Data Integrator (Heterogeneous)
•ODI for Columnar DBs
•ODI for In-Memory DBs
•ODI for Engineered Systems
•ODI for Hadoop Hive
•ODI for Hadoop NoSQL
•ODI for Hadoop Pig & Oozie
•ODI for Spark
•ODI for …
Oracle’s tool maturity and operational know-how for E-LT is unmatched
10x bigger footprint with E-LT than next closest competitor using “pushdown”
Simple and easy way to blend Hadoop and SQL E-LT execution from one tool
- 28. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Better: No ETL Approach
One Logical Design:
•Modern design studio for simple map development
•Team-based GUI Tooling for work on Enterprise projects
•Integrated lifecycle and metadata management
•Automated support for Changed Data Capture
Many Engine Alternatives:
SQL / OLTP Database
–Examples: Oracle DBMS, any OLTP DBMS, DW appliances
–Engine I/O: SSD / disk based
–Best use: high volumes of transformations on relational data
MapReduce
–Examples: Hive / MR2, Pig / Oozie / MR2
–Engine I/O: SSD / disk based
–Best use: huge batch-like transformations on any data types
In Memory (SQL / Big Data)
–Examples: Oracle In-Memory, Hive / Tez / YARN, Spark / YARN, Cloudera Impala
–Engine I/O: D/RAM, with various built-in spill-to-disk approaches
–Best use: highly interactive data transformation patterns
Streaming Big Data
–Examples: Storm / YARN, Oracle Event Processor (OEP)
–Engine I/O: D/RAM; “always on” data pipeline
–Best use: very low latency transformations
SEPARATE ETL ENGINE NOT REQUIRED!
Data Integrator
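The "one logical design, many engine alternatives" idea on this slide can be sketched as a single engine-neutral mapping that is rendered into different statements per target engine. This is a simplified illustration only; `LogicalMapping` and `render` are hypothetical names, and ODI's real knowledge modules are far richer than two string templates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalMapping:
    """Engine-neutral description of a simple aggregate-and-load mapping."""
    source: str
    target: str
    group_by: str
    measure: str


def render(mapping, engine):
    """Generate the statement a given engine would run for the same design."""
    select = (
        f"SELECT {mapping.group_by}, SUM({mapping.measure}) AS total "
        f"FROM {mapping.source} GROUP BY {mapping.group_by}"
    )
    if engine == "sql":
        # Relational databases: plain INSERT ... SELECT.
        return f"INSERT INTO {mapping.target} {select}"
    if engine == "hive":
        # HiveQL: replace the target table's contents in one job.
        return f"INSERT OVERWRITE TABLE {mapping.target} {select}"
    raise ValueError(f"no code template for engine: {engine}")


mapping = LogicalMapping(
    source="sales_raw", target="sales_by_region", group_by="region", measure="amount"
)
sql_text = render(mapping, "sql")
hive_text = render(mapping, "hive")
print(sql_text)
print(hive_text)
```

Keeping the design separate from the generated code is what lets the same mapping move between a database and Hadoop without being redrawn.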
- 29. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Better: Most Open & Heterogeneous
Hadoop HBase
Hadoop Hive/Flume
HP Enscribe
HP NonStop
HP Neoview
Hypersonic SQL
IBM DB2 iSeries
IBM DB2 UDB
IBM DB2 z Series
IBM Informix
IBM Netezza
JMS / MQ
Microsoft Access
Microsoft SQL Server
MySQL
Pivotal Greenplum
PostgreSQL
Salesforce.com
SAP BW / BI
SAP ERP / ECC
SAS
SQL/MP
SQL/MX
Sybase ASE
Sybase IQ
Teradata
Adaptive
Altova
Apache HCatalog
Apache Hive/HQL
Borland
CA ERwin
Cloudera Impala
COBOL Copybook
DataStax
Embarcadero
EMC ProActivity
GentleWare
Google BigQuery
Grandite
Hadapt Hive
Hortonworks Hive
IBM Cognos
IBM DB2
IBM DataStage
IBM Discovery
IBM Federation Server
IBM Lotus Notes
IBM Netezza
IBM Rational Rose
IBM Rational Architect
Informatica Metadata Mgr.
Informatica PowerCenter
CoSORT
ISO SQL Standard (DDL)
MapR Hadoop Hive
MicroFocus
Microsoft Access
Microsoft Office Excel
Microsoft Visio
Microsoft SQL Server
Microsoft SSIS
Microsoft Visual Studio
Microstrategy
Magic Draw
OMG CWM Standard
OMG UML Standard
Oracle BI Answers
Oracle BI Enterprise Edition
Oracle BI Server
Oracle DAC
Oracle Data Integrator
Oracle Data Modeler
Oracle Database
Oracle Designer
Oracle Hyperion Applications
Oracle Hyperion Essbase
Oracle Warehouse Builder
Pivotal Greenplum
PostgreSQL
QlikView
SAP BO Crystal Reports
SAP BO Designer
SAP BO Desktop Intelligence
SAP BO Repository
SAP BO Data Integrator
SAP BO Data Steward
SAP Master Data Management
SAP Sybase PowerDesigner
SAP Sybase ASE Database
SAS Data Integration Studio
SAS BI Server
SAS Information Map
SAS Metadata Management
SAS OLAP Server
Select
Sparx Architect
Syncsort
Tableau
Talend
Teradata
Tigris
Visible
W3C DTD & XSD Schema
Operational Integration (Movement / Transformation)
Metadata Harvesting (Glossary, Lineage & Impact Analysis)
Oracle Database
Oracle Exadata
Oracle Big Data Appliance
Oracle TimesTen
Oracle OLAP
Oracle Business Intelligence
Oracle BI Applications
Oracle E-Business Suite
Oracle JD Edwards Enterprise One
Oracle JD Edwards World
Oracle Fusion Applications
Oracle Governance Risk and Compliance
Oracle Fusion AIA
Oracle Retail Applications
Oracle Agile BI/ DW
Oracle Agile PLM for Process
Oracle iFlex FlexCUBE
Oracle iFlex Mantas
Oracle Hyperion Applications
Oracle PeopleSoft
Oracle Siebel CRM / OnDemand
Oracle Communications
Oracle WebLogic Server
Oracle Coherence Data Grid
Oracle SOA Suite
Oracle Enterprise Service Bus
+ open APIs and standards based meta-model
- 30. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Does Big Data Better: Clear Business Benefits
Proven Technology
•Unlike custom coding, a tools based approach is proven to result in lower cost long term operations
•Oracle GoldenGate is industry standard for Data Replication
•Oracle invented E-LT Pushdown processing and is 10x more widely deployed than competitors
Better Architecture
•Oracle GoldenGate provides the most scalable, native integration for database replication
•Oracle Data Integrator provides ultimate scalability and choice for Hadoop data transformations
•Consistent agent-based architecture avoids having multiple, incompatible engines (e.g., old style ETL tools)
Best for Oracle
•Exadata – OGG and ODI are deeply integrated and are the only Replication and ETL processes certified to run on the appliance
•Big Data Appliance – deeply integrated technology part of core reference architecture
•Big Data Connectors – ODI included with core connector technologies for Hadoop
RISK / SCALE / COMPLETE
Heterogeneous Access
- 31. Copyright © 2014,Oracle and/or its affiliates. All rights reserved. |
Join the Community
#OOW14 #ODI12c #GoldenGate12c #EDQ12c
Oracle Data Integration blog: blogs.oracle.com/dataintegration
Connect with Oracle on Social Media
OR connect via the web
Oracle Data Integration Home Page: oracle.com/goto/dataintegration
- 32. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | 2014
2014 Oracle Excellence Award Ceremony for Fusion Middleware Innovation
ORACLE FUSION MIDDLEWARE: CELEBRATE THIS YEAR'S MOST INNOVATIVE CUSTOMER SOLUTIONS
Tuesday, September 30, 2014 5:00-5:45pm YBCA Theater (next to Moscone North)
Session ID: CON7029
- 33. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Fusion Middleware
The Cloud Platform for Digital Business
Cloud
On-Premise
DIGITAL ENGAGEMENT
APPLICATION & DATA INTEGRATION
IDENTITY MANAGEMENT
SYSTEMS MANAGEMENT
APPLICATION INFRASTRUCTURE & TOOLS
BUSINESS PROCESS MANAGEMENT
BUSINESS ANALYTICS
CONTENT & COLLABORATION
Web
Mobile
Social
Internet of Things
- 34. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Questions and Answers