Which Change Data Capture
Strategy Is Right for You?
Presented by
Paige Roberts
Sr. Product Marketing Manager
Data Integration, Data Quality
Choosing a Change Data Capture Strategy
1 What is Change Data Capture?
2 Why Do Change Data Capture?
3 Strategies for Change Data Capture
4 Examples of Change Data Capture
5 Q and A
CDC is the process that ensures that changes made over time in
one dataset are automatically transferred to another dataset.
Change Data Capture or CDC is most often used with databases that hold important
transactional data to make sure that organizations are working with up-to-date information
across the enterprise.
Source - often used to record transactions or other business occurrences as they happen.
Target - often used to create a report or do analysis to determine a course of action.
Sometimes, data is replicated bi-directionally so that a source is also a target and vice versa.
What is Change Data Capture?
Replication Options
One Way
Two Way
Cascade
Bi-Directional
Distribute
Consolidate
Choose a topology or combine them to meet your data sharing needs
Integrated Architecture Use Case
ERP SYSTEM
Customer Orders
Payment Details
Product Catalogue
Price List
eCOMMERCE & WEB PORTALS
TEST & AUDIT ENVIRONMENT
DATA EXCHANGE WITH OUTSIDE VENDOR (FLAT FILE)
DR / BACKUP
Customer Example Architecture
EDGE NODE
CLUSTER DATA NODES
DATABASE SOURCES
MAINFRAME SOURCES
VSAM
Db2
CAPTURE AGENT
Reasons for CDC
1. Businesses have Multiple Databases
Multiple databases are the norm
• Merger or acquisition
• Choice of multiple apps or databases for best of breed solutions
• Combination of legacy and new databases
• Multi-organization supply chain
IT infrastructures are heterogeneous
• Database platforms
• Operating systems
• Hardware
Drivers Behind Change Data Capture
Does your organization rely on multiple databases?
Yes: 83%, No: 10%, I don't know: 8%
Does your organization share data between multiple databases?
73% of those with multiple databases share data among them
Source: Vision Solutions' 2017 State of Resilience Report
2. Enabling Analytics, Reporting and BI
• Protecting performance of production
database by offloading data to a reporting
system for queries, reports, business
intelligence or analytics
• Consolidating data into centralized
databases, data marts or data warehouses
for decision making or business processing
Drivers Behind Change Data Capture
3. Enabling Machine Learning, Advanced Analytics and AI
• Growing data volumes lead to new architectures for
data consolidation – data lakes and enterprise data hubs
based on Hadoop or Spark.
• New types of data and larger amounts of data from
multiple sources combined together create an ideal
environment for training and employing machine
learning and artificial intelligence.
• Businesses across many industries seek competitive
edge from these new technologies in use cases from
fraud detection to targeted marketing.
• ML and AI systems have a constant, voracious need for
more data, and must constantly have the latest, most
current data available to provide the promised insights.
Drivers Behind Change Data Capture
4. Varied Business and IT Goals
• Offloading data for maintenance, backup, or testing
on a secondary system without production impact
• Maintaining synchronization between siloed
databases or branch offices
• Feeding segmented data to customer or partner
applications
• Migrating data to new databases
• Re-platforming databases to new database or
operating system platforms
Drivers Behind Change Data Capture
Source: Vision Solutions' 2017 State of Resilience Report
For what business purpose does your organization share data between databases?
• Consolidating data from multiple sources into…
• Reporting on data offloaded from the…
• Synchronizing data between distributed…
• Testing on offloaded data
• Running business processes on offloaded data
• I don’t know
(Bar chart, horizontal axis 0% to 70%.)
Why do you need to capture and move the changes in your data?
• Populating centralized databases, data marts, data warehouses, or data lakes
• Enabling machine learning, advanced analytics and AI on modern data architectures like Hadoop and Spark
• Enabling queries, reports, business intelligence or analytics without production impact
• Feeding real-time data to employee, customer or partner applications
• Keeping data from siloed databases in sync
• Reducing the impact of database maintenance, backup or testing
• Re-platforming to new database or operating systems
• Consolidating databases
Goals for Change Data Capture
Strategies for CDC
Timestamps or Version Numbers
Advantages
• Simple
• Nearly every database can be queried with a WHERE clause.
Disadvantages
• Must be built into database
• Bloats database size
• Query requires considerable compute resources in source database
• Not always reliable
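A minimal sketch of timestamp-based capture, using Python's built-in sqlite3 module. The `customers` table, its `updated_at` column, and the `fetch_changes_since` helper are hypothetical names chosen for illustration and do not come from the presentation. The sketch also shows one way this approach can be unreliable: a hard delete leaves no row behind, so the polling query never sees it.

```python
import sqlite3


def fetch_changes_since(conn, last_seen):
    """Return rows whose updated_at is later than the last_seen watermark."""
    cur = conn.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    return cur.fetchall()


# Hypothetical source table with a timestamp column built into the schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ann", 100), (2, "Bob", 105), (3, "Cy", 110)],
)

watermark = 100  # highest timestamp the target has already received

# An update bumps the timestamp, so the row appears in the next poll.
conn.execute("UPDATE customers SET name = 'Ann B.', updated_at = 120 WHERE id = 1")
# A hard delete removes the row entirely, so the poll cannot report it.
conn.execute("DELETE FROM customers WHERE id = 2")

changes = fetch_changes_since(conn, watermark)
watermark = max(row[2] for row in changes)  # advance the watermark for the next poll
print(changes)
```

The target would never learn that customer 2 was deleted unless the source uses soft deletes or a separate mechanism records them.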
Table Triggers
Advantages
• Very reliable and detailed
• Changes can be captured almost as fast as they are made – real-time CDC.
Disadvantages
• Significant drag on database resources, both
compute and storage.
• Requires that the database have the capability.
• Negative impact on performance of applications that
depend on the source database.
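As an illustration of trigger-based capture, the sketch below uses SQLite through Python's sqlite3 module. The `orders` table, the `orders_changes` change table, and the trigger names are hypothetical. Every write to `orders` causes a second write to the change table, which is where the extra compute and storage cost listed above comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

-- Hypothetical change table: one row per captured insert, update, or delete.
CREATE TABLE orders_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT,
    order_id INTEGER,
    old_amount REAL,
    new_amount REAL
);

CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_amount, new_amount)
    VALUES ('I', NEW.id, NULL, NEW.amount);
END;

CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_amount, new_amount)
    VALUES ('U', NEW.id, OLD.amount, NEW.amount);
END;

CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_amount, new_amount)
    VALUES ('D', OLD.id, OLD.amount, NULL);
END;
""")

# Ordinary application writes; the triggers record each one as it happens.
conn.execute("INSERT INTO orders VALUES (1, 10.0)")
conn.execute("UPDATE orders SET amount = 12.5 WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

captured = conn.execute(
    "SELECT op, order_id, old_amount, new_amount FROM orders_changes ORDER BY seq"
).fetchall()
print(captured)
```

Unlike the timestamp approach, the delete and the intermediate update are both recorded, with before and after values.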
Snapshot or Table Comparison
Advantages
• Relatively easy to implement with
good ETL software.
• Requires no specialized knowledge
of the source database.
• Very dependable and accurate.
Disadvantages
• Requires repeatedly moving all data in monitored tables. May impact
target or staging system resources and network bandwidth.
• Moving lots of data can be slow and may not meet SLAs.
• Joining, comparing, and finding changes also takes time, making the process even slower.
• Not a complete record of intermediate changes between snapshot
captures.
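The comparison step can be sketched in a few lines of Python, assuming each snapshot has already been loaded into a dictionary keyed by primary key. The `diff_snapshots` function and the sample data are hypothetical; an ETL tool would typically do the same work with a sort and join over much larger tables.

```python
def diff_snapshots(old, new):
    """Compare two snapshots keyed by primary key.

    Returns (inserts, updates, deletes), where updates maps each changed
    key to a (before, after) pair.
    """
    inserts = {k: new[k] for k in new.keys() - old.keys()}
    deletes = {k: old[k] for k in old.keys() - new.keys()}
    updates = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return inserts, updates, deletes


# Two hypothetical snapshots of the same table, taken a day apart.
monday = {1: ("Ann", 100), 2: ("Bob", 250), 3: ("Cy", 75)}
tuesday = {1: ("Ann", 100), 2: ("Bob", 300), 4: ("Dee", 40)}

inserts, updates, deletes = diff_snapshots(monday, tuesday)
print(inserts, updates, deletes)
```

If Bob's balance went from 250 to 275 and then to 300 between the two captures, only the net change from 250 to 300 is visible, which is the incomplete-record limitation noted above.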
Log Scraping
Advantages
• Very reliable and detailed.
• Virtually no impact on database or application
performance.
• Changes captured in real-time.
• No database bloat.
Disadvantages
• Every RDBMS has a different log format, often
undocumented.
• Log formats often change between RDBMS
versions.
• Log files are frequently archived by the database.
CDC software must read them before they’re
archived, or be able to go read the archived logs.
• Requires specialized CDC software. Cannot be
easily accomplished with ETL software.
• Can fail if connectivity is lost on source or target,
causing lost data, duplicated data, or need to
restart from initial data load.
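The connectivity risk in the last bullet is commonly handled by checkpointing the log position on the target. The sketch below assumes the log has already been decoded into ordered events carrying a log sequence number (LSN); the `ChangeEvent` and `Target` types are hypothetical, and real database logs are proprietary formats that this sketch does not attempt to read.

```python
from typing import NamedTuple, Optional


class ChangeEvent(NamedTuple):
    lsn: int             # position in the (hypothetical) decoded change log
    op: str              # "insert", "update", or "delete"
    key: int             # primary key of the affected row
    row: Optional[dict]  # new row image; None for deletes


class Target:
    """In-memory stand-in for a target table plus its replication checkpoint."""

    def __init__(self):
        self.rows = {}
        self.checkpoint = 0  # highest LSN applied so far

    def apply(self, event):
        """Apply one event; return False if it was already applied."""
        if event.lsn <= self.checkpoint:
            return False  # duplicate delivery after a restart, skip it
        if event.op == "delete":
            self.rows.pop(event.key, None)
        else:
            self.rows[event.key] = event.row
        self.checkpoint = event.lsn
        return True


log = [
    ChangeEvent(1, "insert", 1, {"name": "Ann"}),
    ChangeEvent(2, "insert", 2, {"name": "Bob"}),
    ChangeEvent(3, "update", 1, {"name": "Ann B."}),
    ChangeEvent(4, "delete", 2, None),
]

target = Target()
for event in log[:2]:  # the connection drops after LSN 2
    target.apply(event)

# On reconnect the whole log is replayed; events 1 and 2 are skipped.
applied_on_retry = [target.apply(event) for event in log]
print(applied_on_retry, target.rows, target.checkpoint)
```

In a real system the row change and the checkpoint update would need to be committed together, otherwise a crash between the two could still lose or duplicate an event.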
CDC with Syncsort
Syncsort DMX & DMX-h:
Simple and Powerful Big Data Integration Software
Syncsort Data Integration and Data Quality for the Cloud
DMX
• GUI for developing MapReduce & Spark jobs
• Test & debug locally in Windows; deploy on Hadoop
• Use-case Accelerators to fast-track development
• Broad based connectivity with automated parallelism
• Simply the best mainframe access and integration with Hadoop
• Improved per node scalability and throughput
High Performance
ETL Software
• Template driven design for:
o High performance ETL
o SQL migration/DB offload
o Mainframe data movement
• Light weight footprint on commodity hardware
• High speed flat file processing
• Self tuning engine
High Performance
Hadoop ETL Software
DMX-h
DMX Change Data Capture
Keep data in sync in real-time
• Without overloading networks.
• Without affecting source database
performance.
• Without coding or tuning.
Reliable transfer of data you can trust even if connectivity fails on either side.
• Auto restart.
• No data loss.
Diagram: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing. Endpoints shown: Files, RDBMS, Streams, Data Lake, Mainframe, Cloud, OLAP.
DMX Change Data Capture Sources and Targets
SOURCES
• IBM Db2/z
• IBM Db2/i
• IBM Db2/LUW
• VSAM
• Kafka
• Oracle
• Oracle RAC (Real Application Clusters)
• MS SQL Server
• IBM Informix
• Sybase
TARGETS
• Kafka
• Amazon Kinesis
• Teradata
• HDFS
• Hive (HDFS, ORC, Avro, Parquet)
• Impala (Parquet, Kudu)
• IBM Db2
• SQL Server
• MS Azure SQL
• PostgreSQL
• MySQL
• Oracle
• Oracle RAC
• Sybase
• And more …
Diagram: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing. Endpoints shown: Files, RDBMS, Streams, Data Hub, Mainframe, Cloud, OLAP.
Design Once, Deploy Anywhere
Intelligent Execution - Insulate your organization from underlying complexities of Hadoop.
Get excellent performance every time
without tuning, load balancing, etc.
No re-design, re-compile, no re-work ever
• Future-proof job designs for emerging compute
frameworks, e.g. Spark 2.x
• Move from development to test to production
• Move from on-premise to Cloud
• Move from one Cloud to another
Use existing ETL and data quality skills
No parallel programming – Java, MapReduce, Spark …
No worries about:
• Mappers, Reducers
• Big side or small side of joins …
Design Once
in visual GUI
Deploy Anywhere!
On-Premise,
Cloud
MapReduce, Spark,
Future Platforms
Windows, Unix,
Linux
Batch,
Streaming
Single Node,
Cluster
Snapshot CDC with DMX/DMX-h
• Captures database changes on a
scheduled basis
• High speed sort and join
• Transforms and enhances data
during replication
• Supplies end-to-end lineage of data
for compliance, auditing
• Any source, any target, not limited
to sources with logging
• Fast development in template-based GUI
• Latency – Usually hourly to weekly
Integration in
the Cloud with
DMX ETL
“DMX allows Dickey’s to rapidly
collect, transform and load
thousands of very large files, with
diverse data types from multiple
servers across all of Dickey’s
locations, without performance
bottlenecks.”
Laura Rea, Dickey’s, CIO
Modernize antiquated, Excel-based
Point of Sales system analytics.
Must function with minimal on-site
infrastructure and support personnel.
• Standardize software across 500+ stores.
• 1000s of large files
• Diverse data types – financial, operations,
inventory, purchasing
• DMX ETL
• AWS cloud-based architecture designed and
implemented by iOLAP.
• Rapid job development in visual interface – no
hand coding or scripts to maintain.
• Everyday operations data available to non-technical business users.
AWS Cloud scales with project needs – Dickey’s pays only for what it uses.
Redshift updated every 15-20 minutes for quick, easy, current data-driven business insights.
Better reporting and analytics = more dollars saved and earned.
SOLUTION:
Log-Based Anything to Hadoop
• Real-time capture
• Minimizes bandwidth usage with LAN/WAN
friendly replication
• Parallel load on cluster
• Updates HDFS, Hive or Impala, backed by HDFS,
Parquet, ORC, or Kudu.
• Updates even versions of Hive that did not
support updating
• Latency – Minutes (less than 5)
Diagram: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing. Endpoints shown: Files, RDBMS, Streams, Mainframe, Data Lake, Cloud.
Case Study:
Guardian Life Insurance
"We found DMX-h to be very usable and
easy to ramp up in terms of skills. Most
of all, Syncsort has been a very good
partner in terms of support and listening
to our needs."
– Alex Rosenthal, Enterprise Data Office
CHALLENGE
• Enable visualization and BI on broad range of data sets.
• Reduce data preparation, transformation times
• Reduce time-to-market for analytics projects.
• Make data assets available to whole enterprise – including Mainframe.
SOLUTION
• Created Amazon-style data marketplace, supported by data lake,
Hadoop, NoSQL. New projects reuse and build upon existing
data assets. DMX-h adds new data to the Data Lake with
each new project.
• DMX DataFunnel quickly ingested hundreds of database
tables at push of a button
• DMX Change Data Capture pushes changes from Db2 to the
data lake in real time, keeping data current up to the minute.
BENEFITS
• Centralized standardized reusable data assets –
searchable, accessible and managed.
• DMX-h and DataFunnel accelerated
data acquisition, reduced time to
market for analytics and reporting.
Anything to Stream, or Stream to Anything
• Real-time capture
• Minimizes bandwidth usage with LAN/WAN
friendly replication
• Parallel load on cluster
• Updates HDFS, Hive or Impala, backed by
HDFS, Parquet, ORC, or Kudu.
• Updates even versions of Hive that did not
support updating
• Latency – Real-time, actual SLA varies
depending on update speed of target,
stream settings, etc. Usually, seconds.
Diagram: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing. Endpoints shown: Files, RDBMS, Streams, Data Lake, Mainframe, Cloud, OLAP.
Case Study:
Global Hotel Data Kept Current On the Cloud
CHALLENGE
• More timely collection & reporting on room availability, event bookings,
inventory and other hotel data from 4,000+ properties globally
SOLUTION
• Near real-time reporting - DMX-h consumes property updates from Kafka
every 10 seconds
• DMX-h processes data on HDP, loading to Teradata every 30 minutes
• Deployed on Google Cloud Platform
• Productivity: Leveraging ETL team for Hadoop
(Spark), visual understanding of data pipeline
• Insight: Up-to-date data = better business decisions
= happier customers
BENEFITS
• Time to Value: DMX-h ease of use drastically cut development time
• Agility: Global reports updated every 30 minutes, compared with every 24 hours before
Log-Based Database to Database
• Captures database changes as they happen
• Transforms and enhances data during replication
• Minimizes bandwidth usage with LAN/WAN
friendly replication
• Ensures data integrity with conflict resolution
and collision monitoring
• Enables tracking and auditing of transactions for
compliance
• Latency – sub-second
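The presentation does not say which conflict resolution policy the product applies. As one common example, the sketch below shows a last-writer-wins rule for two sites that changed the same row, with a collision flag that could feed monitoring. The `Version` type, its fields, and the `resolve` function are hypothetical.

```python
from typing import NamedTuple


class Version(NamedTuple):
    value: str
    updated_at: int  # hypothetical commit timestamp
    site: str        # which database produced this version


def resolve(local, remote):
    """Last-writer-wins: return (winner, collided).

    collided is True when the two sites hold different values for the row.
    Ties on timestamp are broken by site name so both sides pick the same winner.
    """
    collided = local.value != remote.value
    winner = max(local, remote, key=lambda v: (v.updated_at, v.site))
    return winner, collided


site_a = Version("gold", 100, "A")
site_b = Version("platinum", 105, "B")

winner, collided = resolve(site_a, site_b)
print(winner, collided)
```

Last-writer-wins silently discards the losing update, so other policies (for example, preferring a designated master site or routing collisions to manual review) may suit some data better.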
Diagram: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing. Endpoints shown: RDBMS, RDBMS, OLAP.
Centralized Reporting Use Case
Casino 1
IBM i Db2
Casino 2 Casino 3 Casino 4 Casino 5 Casino 6
Single Data Warehouse Database
Windows Cluster
MS SQL Server
Business intelligence
Real-time CDC replication
with transformation
• Customer loyalty
• Amounts paid
• Amounts won
• Time at the table
• Time at the machine
IBM i Db2 IBM i Db2 IBM i Db2 IBM i Db2 IBM i Db2
Gradual Database Re-Platforming Use Case
IBM i
Db2
Old System
Windows
SQL Server
New System
America II Corp
Active-Active replication eliminated need
for hard cutover and enabled partners to
move back and forth between systems
True zero downtime for
migration to new systems
Transformation between different OS and database platforms with completely different schemas
100s of partners moved to new server after training at their own pace
Syncsort Addresses All Your
Data Sharing Needs
✓ Enables centralization or consolidation of data
✓ Facilitates machine learning, advanced analytics and AI
✓ Facilitates real-time queries, reporting and business intelligence
✓ Transforms data for smooth data flow between databases
✓ Keeps distributed applications and data in sync
✓ Feeds real-time data to mission critical applications
✓ Offloads data for maintenance, testing and backup
✓ Migrates legacy data to new platforms
✓ And more!
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Which Change Data Capture Strategy is Right for You?

  • 1. Which Change Data Capture Strategy Is Right for You? Presented by Paige Roberts, Sr. Product Marketing Manager (Data Integration, Data Quality)
  • 2. Choosing a Change Data Capture Strategy: 1. What is Change Data Capture? 2. Why Do Change Data Capture? 3. Strategies for Change Data Capture 4. Examples of Change Data Capture 5. Q and A
  • 3. What is Change Data Capture? CDC is the process that ensures that changes made over time in one dataset are automatically transferred to another dataset. Change Data Capture, or CDC, is most often used with databases that hold important transactional data, to make sure that organizations are working with up-to-date information across the enterprise. Source: often used to record transactions or other business occurrences as they happen. Target: often used to create a report or do analysis to determine a course of action. Sometimes, data is replicated bi-directionally so that a source is also a target and vice versa.
  • 4. Replication Options: One Way, Two Way, Cascade, Bi-Directional, Distribute, Consolidate. Choose a topology or combine them to meet your data sharing needs.
  • 5. Integrated Architecture Use Case (diagram labels): ERP system (customer orders, payment details, product catalogue, price list); eCommerce & web portals; test & audit environment; data exchange with outside vendor (flat file); DR / backup
  • 6. Customer Example Architecture (diagram labels): edge node; cluster data nodes; database sources; mainframe sources; VSAM; Db2; capture agent
  • 8. Drivers Behind Change Data Capture: 1. Businesses Have Multiple Databases. Multiple databases are the norm: • Merger or acquisition • Choice of multiple apps or databases for best-of-breed solutions • Combination of legacy and new databases • Multi-organization supply chain. IT infrastructures are heterogeneous: • Database platforms • Operating systems • Hardware. Survey: Does your organization rely on multiple databases? Yes / No / I don't know: 83% / 10% / 8%. Does your organization share data between multiple databases? 73% of those with multiple databases share data among them. Source: Vision Solutions, 2017 State of Resilience Report
  • 9. Drivers Behind Change Data Capture: 2. Enabling Analytics, Reporting and BI • Protecting performance of the production database by offloading data to a reporting system for queries, reports, business intelligence or analytics • Consolidating data into centralized databases, data marts or data warehouses for decision making or business processing
  • 10. Drivers Behind Change Data Capture: 3. Enabling Machine Learning, Advanced Analytics and AI • Growing data volumes lead to new architectures for data consolidation: data lakes and enterprise data hubs based on Hadoop or Spark. • New types of data and larger amounts of data from multiple sources, combined together, create an ideal environment for training and employing machine learning and artificial intelligence. • Businesses across many industries seek a competitive edge from these new technologies in use cases from fraud detection to targeted marketing. • ML and AI systems have a constant, voracious need for more data, and must constantly have the latest, most current data available to provide the promised insights.
  • 11. Drivers Behind Change Data Capture: 4. Varied Business and IT Goals • Offloading data for maintenance, backup, or testing on a secondary system without production impact • Maintaining synchronization between siloed databases or branch offices • Feeding segmented data to customer or partner applications • Migrating data to new databases • Re-platforming databases to new database or operating system platforms. Survey chart: For what business purpose does your organization share data between databases? Answer categories: Consolidating data from multiple sources into…; Reporting on data offloaded from the…; Synchronizing data between distributed…; Testing on offloaded data; Running business processes on offloaded data; I don’t know (horizontal axis: 0% to 70%). Source: Vision Solutions, 2017 State of Resilience Report
  • 12. Goals for Change Data Capture: Why do you need to capture and move the changes in your data? • Populating centralized databases, data marts, data warehouses, or data lakes • Enabling machine learning, advanced analytics and AI on modern data architectures like Hadoop and Spark • Enabling queries, reports, business intelligence or analytics without production impact • Feeding real-time data to employee, customer or partner applications • Keeping data from siloed databases in sync • Reducing the impact of database maintenance, backup or testing • Re-platforming to new database or operating systems • Consolidating databases
  • 14. Timestamps or Version Numbers. Advantages: • Simple • Nearly every database can query with a WHERE clause. Disadvantages: • Must be built into the database • Bloats database size • Query requires considerable compute resources in the source database • Not always reliable
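The timestamp approach on this slide can be sketched in a few lines of Python. This is a minimal illustration, not anything from the deck: the `orders` table, the `last_modified` column, and the integer "timestamps" are assumptions chosen to keep the example deterministic. It also shows one common reason the approach is "not always reliable": a deleted row leaves nothing behind for the WHERE clause to find.

```python
import sqlite3

def fetch_changes_since(conn, last_sync):
    """Return rows whose last_modified value is later than the previous sync mark."""
    cur = conn.execute(
        "SELECT id, amount, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY last_modified",
        (last_sync,),
    )
    return cur.fetchall()

# Hypothetical source table; last_modified is a simple integer clock here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, last_modified INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, 100), (2, 25.0, 105)],
)

# First sync: every row is new relative to sync mark 0.
first_batch = fetch_changes_since(conn, 0)
last_sync = max(row[2] for row in first_batch)

# The application updates a row and must remember to bump its timestamp.
conn.execute("UPDATE orders SET amount = 12.5, last_modified = 110 WHERE id = 1")
# A delete removes the row entirely, so the query below cannot report it.
conn.execute("DELETE FROM orders WHERE id = 2")

second_batch = fetch_changes_since(conn, last_sync)
print(second_batch)
```

The second batch contains only the updated row; the delete of row 2 goes unnoticed unless the schema also records deletions, for example with a soft-delete flag.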
  • 15. Table Triggers. Advantages: • Very reliable and detailed • Changes can be captured almost as fast as they are made: real-time CDC. Disadvantages: • Significant drag on database resources, both compute and storage • Requires that the database have the capability • Negative impact on performance of applications that depend on the source database
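As a rough sketch of trigger-based capture, the example below uses SQLite through Python's `sqlite3` module. The `orders` and `orders_changes` tables and the trigger names are hypothetical; the point is only that every insert, update, and delete causes an extra write to a change table inside the source database, which is where both the detail and the overhead come from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

-- Change table filled by the triggers; seq preserves the order of changes.
CREATE TABLE orders_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT, id INTEGER, amount REAL
);

CREATE TRIGGER orders_ai AFTER INSERT ON orders BEGIN
    INSERT INTO orders_changes (op, id, amount) VALUES ('I', NEW.id, NEW.amount);
END;
CREATE TRIGGER orders_au AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_changes (op, id, amount) VALUES ('U', NEW.id, NEW.amount);
END;
CREATE TRIGGER orders_ad AFTER DELETE ON orders BEGIN
    INSERT INTO orders_changes (op, id, amount) VALUES ('D', OLD.id, OLD.amount);
END;
""")

# Ordinary application writes; each one also fires a trigger.
conn.execute("INSERT INTO orders VALUES (1, 10.0)")
conn.execute("UPDATE orders SET amount = 12.5 WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

# A CDC reader would poll this table and ship the rows to the target.
changes = conn.execute(
    "SELECT op, id, amount FROM orders_changes ORDER BY seq"
).fetchall()
print(changes)
```

Unlike the timestamp query, the change table records the delete and every intermediate value, at the cost of one additional write per source write.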
  • 16. Snapshot or Table Comparison. Advantages: • Relatively easy to implement with good ETL software • Requires no specialized knowledge of the source database • Very dependable and accurate. Disadvantages: • Requires repeatedly moving all data in monitored tables; may impact target or staging system resources and network bandwidth • Moving lots of data can be slow and may not meet SLAs • Joining, comparing, and finding changes also takes time, making the process even slower • Not a complete record of intermediate changes between snapshot captures
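The comparison step of snapshot CDC can be illustrated with plain Python dictionaries. This is a simplified sketch that assumes each snapshot fits in memory and is keyed by primary key; the sample rows are invented for illustration.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key; return inserts, updates, deletes."""
    inserts = {k: v for k, v in current.items() if k not in previous}
    deletes = {k: v for k, v in previous.items() if k not in current}
    updates = {
        k: v for k, v in current.items()
        if k in previous and previous[k] != v
    }
    return inserts, updates, deletes

# Two full copies of the same hypothetical table, taken at different times.
previous = {1: ("Ann", 10.0), 2: ("Bob", 25.0), 3: ("Cy", 7.0)}
current = {1: ("Ann", 12.5), 3: ("Cy", 7.0), 4: ("Dee", 40.0)}

inserts, updates, deletes = diff_snapshots(previous, current)
print(inserts, updates, deletes)
```

Both full copies have to be available to the comparison, and a row that changed twice between snapshots shows up only with its final value, which is the "not a complete record" limitation listed on the slide.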
  • 17. Log Scraping. Advantages: • Very reliable and detailed • Virtually no impact on database or application performance • Changes captured in real-time • No database bloat. Disadvantages: • Every RDBMS has a different log format, often not documented • Log formats often change between RDBMS versions • Log files are frequently archived by the database; CDC software must read them before they’re archived, or be able to read the archived logs • Requires specialized CDC software; cannot be easily accomplished with ETL software • Can fail if connectivity is lost on source or target, causing lost data, duplicated data, or the need to restart from the initial data load
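Real database logs use proprietary binary formats, so the sketch below does not read one. It assumes a hypothetical, already-decoded log: a list of entries ordered by a log sequence number (`lsn`). What it illustrates is the checkpoint idea that addresses the last disadvantage on the slide: if the reader remembers the last entry it applied, it can resume after a connectivity failure without losing or duplicating changes.

```python
def apply_from_log(log, target, checkpoint):
    """Apply log entries with lsn > checkpoint to target; return the new checkpoint."""
    for entry in log:
        if entry["lsn"] <= checkpoint:
            continue  # already applied before the restart
        if entry["op"] == "D":
            target.pop(entry["key"], None)
        else:
            # Inserts ("I") and updates ("U") are both applied as an upsert.
            target[entry["key"]] = entry["value"]
        checkpoint = entry["lsn"]
    return checkpoint

# Hypothetical decoded log entries, in commit order.
log = [
    {"lsn": 1, "op": "I", "key": 1, "value": 10.0},
    {"lsn": 2, "op": "U", "key": 1, "value": 12.5},
    {"lsn": 3, "op": "I", "key": 2, "value": 25.0},
]

target = {}
# Suppose the connection drops after the first two entries were applied.
checkpoint = apply_from_log(log[:2], target, 0)

# More changes arrive while the reader is down.
log.append({"lsn": 4, "op": "D", "key": 2, "value": None})

# On restart the reader resumes from the saved checkpoint, not from scratch.
checkpoint = apply_from_log(log, target, checkpoint)
print(target, checkpoint)
```

In a real deployment the checkpoint would have to be stored durably, ideally together with the applied change, which this in-memory sketch does not attempt.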
  • 19. Syncsort DMX & DMX-h: Simple and Powerful Big Data Integration Software. DMX, High Performance ETL Software: • Template-driven design for high performance ETL, SQL migration/DB offload, and mainframe data movement • Lightweight footprint on commodity hardware • High-speed flat file processing • Self-tuning engine. DMX-h, High Performance Hadoop ETL Software: • GUI for developing MapReduce & Spark jobs • Test & debug locally in Windows; deploy on Hadoop • Use-case Accelerators to fast-track development • Broad-based connectivity with automated parallelism • Simply the best mainframe access and integration with Hadoop • Improved per-node scalability and throughput. (Slide footer: Syncsort Data Integration and Data Quality for the Cloud)
  • 20. DMX Change Data Capture. Keep data in sync in real-time: • Without overloading networks • Without affecting source database performance • Without coding or tuning. Reliable transfer of data you can trust even if connectivity fails on either side: • Auto restart • No data loss. (Diagram labels: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing; Files, RDBMS, Streams, Data Lake, Mainframe, Cloud, OLAP)
  • 21. DMX Change Data Capture Sources and Targets. SOURCES: • IBM Db2/z • IBM Db2/i • IBM Db2/LUW • VSAM • Kafka • Oracle • Oracle RAC (Real Application Clusters) • MS SQL Server • IBM Informix • Sybase. TARGETS: • Kafka • Amazon Kinesis • Teradata • HDFS • Hive (HDFS, ORC, Avro, Parquet) • Impala (Parquet, Kudu) • IBM Db2 • SQL Server • MS Azure SQL • PostgreSQL • MySQL • Oracle • Oracle RAC • Sybase • And more … (Diagram labels: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing; Files, RDBMS, Streams, Data Hub, Mainframe, Cloud, OLAP)
  • 22. Design Once, Deploy Anywhere. Intelligent Execution: insulate your organization from the underlying complexities of Hadoop. Get excellent performance every time without tuning, load balancing, etc. No re-design, no re-compile, no re-work ever: • Future-proof job designs for emerging compute frameworks, e.g. Spark 2.x • Move from development to test to production • Move from on-premise to Cloud • Move from one Cloud to another. Use existing ETL and data quality skills. No parallel programming (Java, MapReduce, Spark …). No worries about: • Mappers, Reducers • Big side or small side of joins … Design once in the visual GUI, deploy anywhere: On-Premise, Cloud; MapReduce, Spark, Future Platforms; Windows, Unix, Linux; Batch, Streaming; Single Node, Cluster
  • 23. Snapshot CDC with DMX/DMX-h • Captures database changes on a scheduled basis • High-speed sort and join • Transforms and enhances data during replication • Supplies end-to-end lineage of data for compliance, auditing • Any source, any target, not limited to sources with logging • Fast development in template-based GUI • Latency: usually hourly to weekly
  • 24. Integration in the Cloud with DMX ETL. “DMX allows Dickey’s to rapidly collect, transform and load thousands of very large files, with diverse data types from multiple servers across all of Dickey’s locations, without performance bottlenecks.” Laura Rea, CIO, Dickey’s. Modernize antiquated, Excel-based Point of Sales system analytics. Must function with minimal on-site infrastructure and support personnel. • Standardize software across 500+ stores • 1000s of large files • Diverse data types: financial, operations, inventory, purchasing. SOLUTION: • DMX ETL • AWS cloud-based architecture designed and implemented by iOLAP • Rapid job development in visual interface: no hand coding or scripts to maintain • Everyday operations data available to non-technical business users. AWS Cloud scales with project needs: Dickey’s pays for only what they use. Redshift updated every 15-20 minutes for quick, easy, current data-driven business insights. Better reporting and analytics = more dollars saved and earned.
  • 25. Log-Based Anything to Hadoop • Real-time capture • Minimizes bandwidth usage with LAN/WAN friendly replication • Parallel load on cluster • Updates HDFS, Hive or Impala, backed by HDFS, Parquet, ORC, or Kudu • Updates even versions of Hive that did not support updating • Latency: minutes (less than 5). (Diagram labels: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing; Data Lake, Cloud, Files, RDBMS, Streams, Mainframe)
  • 26. Case Study: Guardian Life Insurance. "We found DMX-h to be very usable and easy to ramp up in terms of skills. Most of all, Syncsort has been a very good partner in terms of support and listening to our needs." Alex Rosenthal, Enterprise Data Office. CHALLENGE: • Enable visualization and BI on a broad range of data sets • Reduce data preparation and transformation times • Reduce time-to-market for analytics projects • Make data assets available to the whole enterprise, including Mainframe. SOLUTION: • Created Amazon-style data marketplace, supported by data lake, Hadoop, NoSQL. New projects reuse and build upon existing data assets. DMX-h adds new data to the Data Lake with each new project. • DMX DataFunnel quickly ingested hundreds of database tables at the push of a button • DMX Change Data Capture pushes changes from DB2 to the data lake in real-time. Current data, up to the minute. BENEFITS: • Centralized, standardized, reusable data assets: searchable, accessible and managed • DMX-h and DataFunnel accelerated data acquisition, reduced time to market for analytics and reporting.
  • 27. Anything to Stream, or Stream to Anything • Real-time capture • Minimizes bandwidth usage with LAN/WAN friendly replication • Parallel load on cluster • Updates HDFS, Hive or Impala, backed by HDFS, Parquet, ORC, or Kudu • Updates even versions of Hive that did not support updating • Latency: real-time; actual SLA varies depending on update speed of target, stream settings, etc. Usually seconds. (Diagram labels: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing; Files, RDBMS, Streams, Data Lake, Mainframe, Cloud, OLAP)
  • 28. Case Study: Global Hotel Data Kept Current On the Cloud. CHALLENGE: • More timely collection & reporting on room availability, event bookings, inventory and other hotel data from 4,000+ properties globally. SOLUTION: • Near real-time reporting: DMX-h consumes property updates from Kafka every 10 seconds • DMX-h processes data on HDP, loading to Teradata every 30 minutes • Deployed on Google Cloud Platform. BENEFITS: • Time to Value: DMX-h ease of use drastically cut development time • Agility: Global reports updated every 30 minutes (previously every 24 hours) • Productivity: Leveraging ETL team for Hadoop (Spark), visual understanding of data pipeline • Insight: Up-to-date data = better business decisions = happier customers
  • 29. Log-Based Database to Database • Captures database changes as they happen • Transforms and enhances data during replication • Minimizes bandwidth usage with LAN/WAN friendly replication • Ensures data integrity with conflict resolution and collision monitoring • Enables tracking and auditing of transactions for compliance • Latency: sub-second. (Diagram labels: Real-Time Replication with Transformation; Conflict Resolution, Collision Monitoring, Tracking and Auditing; RDBMS, RDBMS, OLAP)
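The slide names conflict resolution and collision monitoring without describing how they work, so the following is only a generic illustration of the idea, not Syncsort's method. It assumes each replicated update carries the value the source saw before the change (`before`); if the target row no longer holds that value, both sides have changed the row and a collision is reported. The function name and the two policies are hypothetical.

```python
def apply_change(target, change, resolve="source_wins"):
    """Apply one replicated update and report whether a collision occurred.

    A collision means the target row no longer matches the 'before' value
    that the source saw when it made its change.
    """
    key = change["key"]
    current = target.get(key)
    collided = current != change["before"]
    if collided and resolve == "target_wins":
        return "collision_kept_target"
    target[key] = change["after"]
    return "collision_overwritten" if collided else "applied"

target = {1: 100}

# Normal case: the target still holds the value the source started from.
r1 = apply_change(target, {"key": 1, "before": 100, "after": 120})

# Collision: another writer based its change on the stale value 100.
r2 = apply_change(
    target, {"key": 1, "before": 100, "after": 90}, resolve="target_wins"
)
print(r1, r2, target)
```

Returning a status rather than silently overwriting is what makes monitoring possible: the caller can log or count collisions regardless of which policy resolves them.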
  • 30. Centralized Reporting Use Case: Casinos 1 through 6 (each on IBM i Db2) feed a single data warehouse database (Windows cluster, MS SQL Server) for business intelligence, using real-time CDC replication with transformation. Data shown: • Customer loyalty • Amounts paid • Amounts won • Time at the table • Time at the machine
  • 31. Gradual Database Re-Platforming Use Case (America II Corp): old system on IBM i Db2, new system on Windows SQL Server. • Active-active replication eliminated the need for a hard cutover and enabled partners to move back and forth between systems • True zero downtime for migration to new systems • Transformation between different OS and database platforms with completely different schemas • 100s of partners moved to the new server after training, at their own pace
  • 32. Syncsort Addresses All Your Data Sharing Needs ✓ Enables centralization or consolidation of data ✓ Facilitates machine learning, advanced analytics and AI ✓ Facilitates real-time queries, reporting and business intelligence ✓ Transforms data for smooth data flow between databases ✓ Keeps distributed applications and data in sync ✓ Feeds real-time data to mission critical applications ✓ Offloads data for maintenance, testing and backup ✓ Migrates legacy data to new platforms ✓ And more!