SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
© 2019 Snowflake Computing Inc. All Rights Reserved
CONTINUOUS DATA
REPLICATION INTO
CLOUD STORAGE WITH
ORACLE GOLDENGATE
Michael Rainey | ITOUG Tech Days
February 2019
© 2019 Snowflake Computing Inc. All Rights Reserved
ABOUT ME
Michael Rainey
Senior Solutions Architect at Snowflake Computing
Oracle ACE Director
Twitter: @mRainey
Email: michael.rainey@snowflake.com
© 2019 Snowflake Computing Inc. All Rights Reserved
3 YEARS IN STEALTH + 3 YEARS GA
900+ employees
Over 2000 customers
today
Over $850M in venture
funding from leading
investors
First customers
2014, general
availability 2015
Founded 2012 by
industry veterans
with over 120
database patents
Queries processed in
Snowflake per day:
100 million
Largest single
table:
68 trillion rows
Largest number of
tables single DB:
200,000
Single customer
most data:
> 40PB
Single customer
most users:
> 10,000
Fun facts:
© 2019 Snowflake Computing Inc. All Rights Reserved
https://pixabay.com/en/fountain-water-flow-wet-3412242
© 2019 Snowflake Computing Inc. All Rights Reserved
ORACLE GOLDENGATE
© 2019 Snowflake Computing Inc. All Rights Reserved
In the beginning…
● Development of the “handler” was a manual effort using GoldenGate for Java adapter
● Very few targets available (HBase, Flume, Hive, etc)
Over the years…
● More targets, more handlers, more automated!
ORACLE GOLDENGATE FOR BIG DATA
© 2019 Snowflake Computing Inc. All Rights Reserved
ORACLE GOLDENGATE FOR BIG DATA
https://blogs.oracle.com/dataintegration/goldengate-for-big-data-123211-release-update
© 2019 Snowflake Computing Inc. All Rights Reserved
WRITE FILES WITH GOLDENGATE FOR BIG DATA
https://pixabay.com/en/alphabets-ancient-author-data-2365812/
© 2019 Snowflake Computing Inc. All Rights Reserved
File Writer Handler
● Data is formatted and staged locally
● Maintain state - previously, all was lost
● Templated strings (substitution variables) can be used throughout properties file for naming
Event Handlers
● Transforms files written by File Writer Handler to target format (Parquet, ORC, etc)
● Connects to 3rd party application APIs
● Loads files to 3rd party applications (HDFS, S3, etc)
Pluggable Formatters
● Provide format options and metadata options
GOLDENGATE FOR BIG DATA HANDLERS
© 2019 Snowflake Computing Inc. All Rights Reserved
Extracting data from the source remains the same, no change
Install and setup GoldenGate for Big Data to handle replicat functionality
Create replicat parameter file and properties file
GOLDENGATE FOR BIG DATA SETUP
© 2019 Snowflake Computing Inc. All Rights Reserved
Contains parameters required for...
● ...creating the file and where it will be created
● ...file format and metadata options
● ...connection to final target and additional parameters required
Create with name that matches replicat parameter file name
● Example: payroll.prm / payroll.properties
GOLDENGATE FOR BIG DATA PROPERTIES
© 2019 Snowflake Computing Inc. All Rights Reserved
REPLICAT TO TARGET PROCESS
© 2019 Snowflake Computing Inc. All Rights Reserved
FILE WRITER HANDLER STATE
© 2019 Snowflake Computing Inc. All Rights Reserved
GOLDENGATE FOR BIG DATA PROPERTIES
© 2019 Snowflake Computing Inc. All Rights Reserved
GOLDENGATE FOR BIG DATA PROPERTIES
Pluggable
Formatter
© 2019 Snowflake Computing Inc. All Rights Reserved
GOLDENGATE FOR BIG DATA PROPERTIES
© 2019 Snowflake Computing Inc. All Rights Reserved
GOLDENGATE FOR BIG DATA PROPERTIES
© 2019 Snowflake Computing Inc. All Rights Reserved
Cloud storage security (S3)
● User/role must be able to list buckets and create buckets, along with ability to write files
Properties file syntax
● Not many examples besides in the documentation...which can incorrect as well!
Add all required properties
● For example, goldengate.userexit.writers=javawriter is required, but not in the docs
LESSONS LEARNED
© 2019 Snowflake Computing Inc. All Rights Reserved
USING THE DATA IN CLOUD STORAGE
https://pixabay.com/en/hacking-cyber-blackandwhite-crime-2903156/
© 2019 Snowflake Computing Inc. All Rights Reserved
A fully managed extract, transform, and load (ETL) service
Data Catalog
● Central repository to store structural and operational metadata for all
your data assets
Crawlers
● Connects to data stores, determines the data structures, and writes tables into the Data Catalog
Jobs
● Business logic that performs the extract, transform, and load (ETL) work
● Developed using PySpark or Scala as scripts, generated by AWS Glue
● Built-in transforms used to process data
AWS GLUE
© 2019 Snowflake Computing Inc. All Rights Reserved
AWS GLUE JOB
© 2019 Snowflake Computing Inc. All Rights Reserved
Serverless query engine for reporting and analytics
Uses Presto DB as the underlying query engine
Supports CSV, JSON, Gzip files and columnar formats like Apache Parquet
● Use Athena’s own catalog or the centralized AWS Glue catalog
Serverless query engine
● Performance scales “automatically” based on query profiling
● Pay as you query - based on data scanned
● SELECT only - no DML
● Follow best practices for query optimization
Integrates with BI tools like Tableau, Looker, AWS QuickSight, etc. for advanced reports and visualizations
AMAZON ATHENA
© 2019 Snowflake Computing Inc. All Rights Reserved
AMAZON ATHENA
© 2019 Snowflake Computing Inc. All Rights Reserved
BI service used to create data visualizations and interactive dashboards
for insightful analysis
Connects to AWS sources (Athena, S3, etc), SQL databases, and SaaS
applications (Salesforce, etc)
Uses ‘pay-per-session’ pricing for cost-effective user access
Create and publish interactive dashboards, share reports via email, or embed
analytics into customer applications
Powered by super-fast, parallel, in-memory calculation engine (SPICE)
AWS QUICKSIGHT
© 2019 Snowflake Computing Inc. All Rights Reserved
© 2019 Snowflake Computing Inc. All Rights Reserved
Storage separated from compute
● Centralized, scale-out storage that expands
and contracts automatically
Resize compute instantly
● Scale up/down or turn off when not in use
Multiple clusters access data without
contention
● ETL, reporting, data science, and
applications all running at the same time
without performance impact.
Centralized management
● Separate metadata from storage and
compute
Full transactional consistency (ACID)
SNOWFLAKE DATA
WAREHOUSE
Management Optimization Security Availability Transactions Metadata
© 2019 Snowflake Computing Inc. All Rights Reserved
AUTO-INGEST WITH SNOWPIPE
Snowflake
Database
External
S3
Snowpipe Service
Server-less
Loader
S3 notification
File data
Data
replication/
streaming
application
© 2019 Snowflake Computing Inc. All Rights Reserved
CONTINUOUS REPLICATION TO SNOWFLAKE
Snowflake
Database
External
S3
Snowpipe Service
Server-less
Loader
S3 notification
File data
© 2019 Snowflake Computing Inc. All Rights Reserved
MERGE REPLICATED DATA
© 2019 Snowflake Computing Inc. All Rights Reserved
MERGE REPLICATED DATA
© 2019 Snowflake Computing Inc. All Rights Reserved
MERGE REPLICATED DATA
© 2019 Snowflake Computing Inc. All Rights Reserved
MERGE REPLICATED DATA
...
...
© 2019 Snowflake Computing Inc. All Rights Reserved
https://pixabay.com/en/lightbulbs-bulbs-light-idea-energy-1875257/
© 2019 Snowflake Computing Inc. All Rights Reserved
GoldenGate for Big Data docs:
https://docs.oracle.com/goldengate/bd123210/gg-bd/index.html
Oracle Data Integration blog:
https://blogs.oracle.com/dataintegration/data-integration
Continuous Data Replication into Snowflake with Oracle Goldengate blog post:
https://www.snowflake.com/blog/continuous-data-replication-into-snowflake-with-oracle-goldengate/
MORE INFORMATION
THANK YOU
© 2019 Snowflake Computing Inc. All Rights Reserved
© 2019 Snowflake Computing Inc. All Rights Reserved
DISCOVER THE PERFORMANCE, CONCURRENCY,
AND SIMPLICITY OF SNOWFLAKE
As easy as 1-2-3!
01
Visit Snowflake.com
02
Click “Try for Free”
03
Sign up & register
Snowflake is the only data warehouse built for the cloud.
You can automatically scale compute up, out, or
down—independent of storage. Plus, you have the power
of a complete SQL database, with zero management, that
can grow with you to support all of your data and all of
your users. With Snowflake On Demand™, pay only for
what you use.
Sign up and receive
$400 worth of free
usage for 30 days!

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Oracle Database Migration to Oracle Cloud Infrastructure
Oracle Database Migration to Oracle Cloud InfrastructureOracle Database Migration to Oracle Cloud Infrastructure
Oracle Database Migration to Oracle Cloud Infrastructure
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
 
What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0
 
Oracle Active Data Guard: Best Practices and New Features Deep Dive
Oracle Active Data Guard: Best Practices and New Features Deep Dive Oracle Active Data Guard: Best Practices and New Features Deep Dive
Oracle Active Data Guard: Best Practices and New Features Deep Dive
 
Snowflake Data Loading.pptx
Snowflake Data Loading.pptxSnowflake Data Loading.pptx
Snowflake Data Loading.pptx
 
Vue d'ensemble Dremio
Vue d'ensemble DremioVue d'ensemble Dremio
Vue d'ensemble Dremio
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best Practices
 
Oracle Database performance tuning using oratop
Oracle Database performance tuning using oratopOracle Database performance tuning using oratop
Oracle Database performance tuning using oratop
 
Enterprise manager 13c
Enterprise manager 13cEnterprise manager 13c
Enterprise manager 13c
 
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RACThe Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RAC
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentials
 
Oracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12cOracle Transparent Data Encryption (TDE) 12c
Oracle Transparent Data Encryption (TDE) 12c
 
Migration to Oracle Multitenant
Migration to Oracle MultitenantMigration to Oracle Multitenant
Migration to Oracle Multitenant
 
Introducing Change Data Capture with Debezium
Introducing Change Data Capture with DebeziumIntroducing Change Data Capture with Debezium
Introducing Change Data Capture with Debezium
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
 
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
 
Oracle data guard for beginners
Oracle data guard for beginnersOracle data guard for beginners
Oracle data guard for beginners
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 

Similar a Continuous Data Replication into Cloud Storage with Oracle GoldenGate

Similar a Continuous Data Replication into Cloud Storage with Oracle GoldenGate (20)

Data Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the CloudData Warehouse - Incremental Migration to the Cloud
Data Warehouse - Incremental Migration to the Cloud
 
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
Snowflake: Your Data. No Limits (Session sponsored by Snowflake) - AWS Summit...
 
Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)Demystifying Data Warehousing as a Service (GLOC 2019)
Demystifying Data Warehousing as a Service (GLOC 2019)
 
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the CloudHow to Take Advantage of an Enterprise Data Warehouse in the Cloud
How to Take Advantage of an Enterprise Data Warehouse in the Cloud
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data Engineering
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
Accelerate Productivity by Computing at the Edge - AWS Online Tech Talks
Accelerate Productivity by Computing at the Edge - AWS Online Tech TalksAccelerate Productivity by Computing at the Edge - AWS Online Tech Talks
Accelerate Productivity by Computing at the Edge - AWS Online Tech Talks
 
Make your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSMake your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWS
 
Get Savvy with Snowflake
Get Savvy with SnowflakeGet Savvy with Snowflake
Get Savvy with Snowflake
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data Science
 
Graph Gurus Episode 25: Unleash the Business Value of Your Data Lake with Gra...
Graph Gurus Episode 25: Unleash the Business Value of Your Data Lake with Gra...Graph Gurus Episode 25: Unleash the Business Value of Your Data Lake with Gra...
Graph Gurus Episode 25: Unleash the Business Value of Your Data Lake with Gra...
 
Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?
 
Creare e gestire Data Lake e Data Warehouses
Creare e gestire Data Lake e Data WarehousesCreare e gestire Data Lake e Data Warehouses
Creare e gestire Data Lake e Data Warehouses
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
 
Sydney: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cloud
Sydney: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cloud Sydney: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cloud
Sydney: Certus Data 2.0 Vault Meetup with Snowflake - Data Vault In The Cloud
 
Build multi region data warehouse on AWS - AWSVNUG
Build multi region data warehouse on AWS - AWSVNUGBuild multi region data warehouse on AWS - AWSVNUG
Build multi region data warehouse on AWS - AWSVNUG
 
Immersion Day - Democratize o acesso ao dado
Immersion Day - Democratize o acesso ao dadoImmersion Day - Democratize o acesso ao dado
Immersion Day - Democratize o acesso ao dado
 
Modern Data Platforms - Thinking Data Flywheel on the Cloud
Modern Data Platforms - Thinking Data Flywheel on the CloudModern Data Platforms - Thinking Data Flywheel on the Cloud
Modern Data Platforms - Thinking Data Flywheel on the Cloud
 
AWS Data Transfer Services: Deep Dive - SRV302 - Chicago AWS Summit
AWS Data Transfer Services: Deep Dive - SRV302 - Chicago AWS SummitAWS Data Transfer Services: Deep Dive - SRV302 - Chicago AWS Summit
AWS Data Transfer Services: Deep Dive - SRV302 - Chicago AWS Summit
 
Enabling the Active Data Warehouse with Apache Kudu
Enabling the Active Data Warehouse with Apache KuduEnabling the Active Data Warehouse with Apache Kudu
Enabling the Active Data Warehouse with Apache Kudu
 

Más de Michael Rainey

Más de Michael Rainey (20)

SQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle ProfessionalSQL on Hadoop for the Oracle Professional
SQL on Hadoop for the Oracle Professional
 
Going Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS GlueGoing Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS Glue
 
Offload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data IntegrationOffload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data Integration
 
Oracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting StartedOracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting Started
 
Streaming with Oracle Data Integration
Streaming with Oracle Data IntegrationStreaming with Oracle Data Integration
Streaming with Oracle Data Integration
 
Oracle data integrator 12c - getting started
Oracle data integrator 12c - getting startedOracle data integrator 12c - getting started
Oracle data integrator 12c - getting started
 
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
A Walk Through the Kimball ETL Subsystems with Oracle Data IntegrationA Walk Through the Kimball ETL Subsystems with Oracle Data Integration
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
 
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka A Deep Dive Into Real-Time Data Streaming
 
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration - Coll...
 
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g ImplementationsPractical Tips for Oracle Business Intelligence Applications 11g Implementations
Practical Tips for Oracle Business Intelligence Applications 11g Implementations
 
GoldenGate and Oracle Data Integrator - A Perfect Match- Upgrade to 12c
GoldenGate and Oracle Data Integrator - A Perfect Match- Upgrade to 12cGoldenGate and Oracle Data Integrator - A Perfect Match- Upgrade to 12c
GoldenGate and Oracle Data Integrator - A Perfect Match- Upgrade to 12c
 
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Real-Time Data Replication to Hadoop using GoldenGate 12c AdaptorsReal-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
 
Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data Integration
 
Real-time Data Warehouse Upgrade – Success Stories
Real-time Data Warehouse Upgrade – Success StoriesReal-time Data Warehouse Upgrade – Success Stories
Real-time Data Warehouse Upgrade – Success Stories
 
A Picture Can Replace A Thousand Words
A Picture Can Replace A Thousand WordsA Picture Can Replace A Thousand Words
A Picture Can Replace A Thousand Words
 
GoldenGate and Oracle Data Integrator - A Perfect Match...
GoldenGate and Oracle Data Integrator - A Perfect Match...GoldenGate and Oracle Data Integrator - A Perfect Match...
GoldenGate and Oracle Data Integrator - A Perfect Match...
 
GoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data WarehousingGoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
GoldenGate and ODI - A Perfect Match for Real-Time Data Warehousing
 
Data warehouse migration to oracle data integrator 11g
Data warehouse migration to oracle data integrator 11gData warehouse migration to oracle data integrator 11g
Data warehouse migration to oracle data integrator 11g
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Último (20)

Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 

Continuous Data Replication into Cloud Storage with Oracle GoldenGate

  • 1. © 2019 Snowflake Computing Inc. All Rights Reserved CONTINUOUS DATA REPLICATION INTO CLOUD STORAGE WITH ORACLE GOLDENGATE Michael Rainey | ITOUG Tech Days February 2019
  • 2. © 2019 Snowflake Computing Inc. All Rights Reserved ABOUT ME Michael Rainey Senior Solutions Architect at Snowflake Computing Oracle ACE Director Twitter: @mRainey Email: michael.rainey@snowflake.com
  • 3. © 2019 Snowflake Computing Inc. All Rights Reserved 3 YEARS IN STEALTH + 3 YEARS GA 900+ employees Over 2000 customers today Over $850M in venture funding from leading investors First customers 2014, general availability 2015 Founded 2012 by industry veterans with over 120 database patents Queries processed in Snowflake per day: 100 million Largest single table: 68 trillion rows Largest number of tables single DB: 200,000 Single customer most data: > 40PB Single customer most users: > 10,000 Fun facts:
  • 4. © 2019 Snowflake Computing Inc. All Rights Reserved https://pixabay.com/en/fountain-water-flow-wet-3412242
  • 5. © 2019 Snowflake Computing Inc. All Rights Reserved ORACLE GOLDENGATE
  • 6. © 2019 Snowflake Computing Inc. All Rights Reserved In the beginning… ● Development of the “handler” was a manual effort using GoldenGate for Java adapter ● Very few targets available (HBase, Flume, Hive, etc) Over the years… ● More targets, more handlers, more automated! ORACLE GOLDENGATE FOR BIG DATA
  • 7. © 2019 Snowflake Computing Inc. All Rights Reserved ORACLE GOLDENGATE FOR BIG DATA https://blogs.oracle.com/dataintegration/goldengate-for-big-data-123211-release-update
  • 8. © 2019 Snowflake Computing Inc. All Rights Reserved WRITE FILES WITH GOLDENGATE FOR BIG DATA https://pixabay.com/en/alphabets-ancient-author-data-2365812/
  • 9. © 2019 Snowflake Computing Inc. All Rights Reserved File Writer Handler ● Data is formatted and staged locally ● Maintain state - previously, all was lost ● Templated strings (substitution variables) can be used throughout properties file for naming Event Handlers ● Transforms files written by File Writer Handler to target format (Parquet, ORC, etc) ● Connects to 3rd party application APIs ● Loads files to 3rd party applications (HDFS, S3, etc) Pluggable Formatters ● Provide format options and metadata options GOLDENGATE FOR BIG DATA HANDLERS
  • 10. © 2019 Snowflake Computing Inc. All Rights Reserved Extracting data from the source remains the same, no change Install and setup GoldenGate for Big Data to handle replicat functionality Create replicat parameter file and properties file GOLDENGATE FOR BIG DATA SETUP
  • 11. © 2019 Snowflake Computing Inc. All Rights Reserved Contains parameters required for... ● ...creating the file and where it will be created ● ...file format and metadata options ● ...connection to final target and additional parameters required Create with name that matches replicat parameter file name ● Example: payroll.prm / payroll.properties GOLDENGATE FOR BIG DATA PROPERTIES
  • 12. © 2019 Snowflake Computing Inc. All Rights Reserved REPLICAT TO TARGET PROCESS
  • 13. © 2019 Snowflake Computing Inc. All Rights Reserved FILE WRITER HANDLER STATE
  • 14. © 2019 Snowflake Computing Inc. All Rights Reserved GOLDENGATE FOR BIG DATA PROPERTIES
  • 15. © 2019 Snowflake Computing Inc. All Rights Reserved GOLDENGATE FOR BIG DATA PROPERTIES Pluggable Formatter
  • 16. © 2019 Snowflake Computing Inc. All Rights Reserved GOLDENGATE FOR BIG DATA PROPERTIES
  • 17. © 2019 Snowflake Computing Inc. All Rights Reserved GOLDENGATE FOR BIG DATA PROPERTIES
  • 18. © 2019 Snowflake Computing Inc. All Rights Reserved Cloud storage security (S3) ● User/role must be able to list buckets and create buckets, along with ability to write files Properties file syntax ● Not many examples besides in the documentation...which can incorrect as well! Add all required properties ● For example, goldengate.userexit.writers=javawriter is required, but not in the docs LESSONS LEARNED
  • 19. © 2019 Snowflake Computing Inc. All Rights Reserved USING THE DATA IN CLOUD STORAGE https://pixabay.com/en/hacking-cyber-blackandwhite-crime-2903156/
  • 20. © 2019 Snowflake Computing Inc. All Rights Reserved A fully managed extract, transform, and load (ETL) service Data Catalog ● Central repository to store structural and operational metadata for all your data assets Crawlers ● Connects to data stores, determines the data structures, and writes tables into the Data Catalog Jobs ● Business logic that performs the extract, transform, and load (ETL) work ● Developed using PySpark or Scala as scripts, generated by AWS Glue ● Built-in transforms used to process data AWS GLUE
  • 21. © 2019 Snowflake Computing Inc. All Rights Reserved AWS GLUE JOB
  • 22. © 2019 Snowflake Computing Inc. All Rights Reserved Serverless query engine for reporting and analytics Uses Presto DB as the underlying query engine Supports CSV, JSON, Gzip files and columnar formats like Apache Parquet ● Use Athena’s own catalog or the centralized AWS Glue catalog Serverless query engine ● Performance scales “automatically” based on query profiling ● Pay as you query - based on data scanned ● SELECT only - no DML ● Follow best practices for query optimization Integrates with BI tools like Tableau, Looker, AWS QuickSight, etc. for advanced reports and visualizations AMAZON ATHENA
  • 23. © 2019 Snowflake Computing Inc. All Rights Reserved AMAZON ATHENA
  • 24. © 2019 Snowflake Computing Inc. All Rights Reserved BI service used to create data visualizations and interactive dashboards for insightful analysis Connects to AWS sources (Athena, S3, etc), SQL databases, and SaaS applications (Salesforce, etc) Uses ‘pay-per-session’ pricing for cost-effective user access Create and publish interactive dashboards, share reports via email, or embed analytics into customer applications Powered by super-fast, parallel, in-memory calculation engine (SPICE) AWS QUICKSIGHT
  • 25. © 2019 Snowflake Computing Inc. All Rights Reserved
  • 26. © 2019 Snowflake Computing Inc. All Rights Reserved Storage separated from compute ● Centralized, scale-out storage that expands and contracts automatically Resize compute instantly ● Scale up/down or turn off when not in use Multiple clusters access data without contention ● ETL, reporting, data science, and applications all running at the same time without performance impact. Centralized management ● Separate metadata from storage and compute Full transactional consistency (ACID) SNOWFLAKE DATA WAREHOUSE Management Optimization Security Availability Transactions Metadata
  • 27. © 2019 Snowflake Computing Inc. All Rights Reserved AUTO-INGEST WITH SNOWPIPE Snowflake Database External S3 Snowpipe Service Server-less Loader S3 notification File data Data replication/ streaming application
  • 28. © 2019 Snowflake Computing Inc. All Rights Reserved CONTINUOUS REPLICATION TO SNOWFLAKE Snowflake Database External S3 Snowpipe Service Server-less Loader S3 notification File data
  • 29. © 2019 Snowflake Computing Inc. All Rights Reserved MERGE REPLICATED DATA
  • 30. © 2019 Snowflake Computing Inc. All Rights Reserved MERGE REPLICATED DATA
  • 31. © 2019 Snowflake Computing Inc. All Rights Reserved MERGE REPLICATED DATA
  • 32. © 2019 Snowflake Computing Inc. All Rights Reserved MERGE REPLICATED DATA ... ...
  • 33. © 2019 Snowflake Computing Inc. All Rights Reserved https://pixabay.com/en/lightbulbs-bulbs-light-idea-energy-1875257/
  • 34. © 2019 Snowflake Computing Inc. All Rights Reserved GoldenGate for Big Data docs: https://docs.oracle.com/goldengate/bd123210/gg-bd/index.html Oracle Data Integration blog: https://blogs.oracle.com/dataintegration/data-integration Continuous Data Replication into Snowflake with Oracle Goldengate blog post: https://www.snowflake.com/blog/continuous-data-replication-into-snowflake-with-oracle-goldengate/ MORE INFORMATION
  • 35. THANK YOU © 2019 Snowflake Computing Inc. All Rights Reserved
  • 36. © 2019 Snowflake Computing Inc. All Rights Reserved DISCOVER THE PERFORMANCE, CONCURRENCY, AND SIMPLICITY OF SNOWFLAKE As easy as 1-2-3! 01 Visit Snowflake.com 02 Click “Try for Free” 03 Sign up & register Snowflake is the only data warehouse built for the cloud. You can automatically scale compute up, out, or down—independent of storage. Plus, you have the power of a complete SQL database, with zero management, that can grow with you to support all of your data and all of your users. With Snowflake On Demand™, pay only for what you use. Sign up and receive $400 worth of free usage for 30 days!