WHAT YOU NEED TO KNOW
BEFORE MIGRATING A DATA PLATFORM
TO GCP
by
SERHII KHOLODNIUK
Serhii Kholodniuk
Senior Big Data
Engineer
Sigma Software
Ukraine
Kyiv office
My interest and goals:
• interested in designing and developing data platforms for the needs of
business intelligence and machine learning.
• constantly looking for opportunities to simplify and optimize solutions, their
implementation and maintenance.
• client value oriented.
Mastering GCP:
• currently building a data platform in GCP
• migrating data pipelines into GCP infrastructure
• optimizing data warehouse structure
AGENDA
Why GCP is becoming popular
Migration phases
Pipelines migration
Schema and data migration
Data stores
WHY IS GCP BECOMING POPULAR?
Cloud Infrastructure
Network
Cloud sustainability
Data cloud
Security out of the box (data encrypted at rest and in transit)
Powerful BigQuery features with ergonomic design
Cloud infrastructure for all data needs
Customized solutions for different industries
Best-practice industry solutions
Artificial intelligence solutions
Prebuilt ML model APIs
Custom Model Building with SQL in BigQuery ML
Custom Model Building with Cloud AutoML
CLOUD INFRASTRUCTURE
Network
29 regions
88 availability zones
146 edge locations
Cloud sustainability
100% renewable energy for all cloud regions
81% waste diverted from landfills
2x more efficient than a typical enterprise
data center
DATA CLOUD
Security out of the box
(data encrypted at rest and in transit)
Cloud infrastructure for all data needs
Powerful BigQuery features with
ergonomic design
CUSTOMIZED SOLUTIONS FOR DIFFERENT INDUSTRIES
Best-practice industry solutions
Industry solutions
Retail
Consumer packaged goods
Manufacturing
Automotive
Supply chain and logistics
Energy
Healthcare and life sciences
Media and entertainment
Gaming
Telecommunications
Financial services
Financial services
Capital markets
Government and public sector
Government
State and local government
Federal government
Education
AI SOLUTIONS
MIGRATION PHASES
1. Pre-migration phase
• complete an inventory of the workloads and assets to be
migrated
• calculate Total Cost of Ownership and future
business value
• build a use case backlog
• select use cases for iteration
2. Migration phase
• schema migration
• pipelines migration
• data migration
3. Post-migration phase
• cost and performance optimization
• schema denormalization for BigQuery
• using nested and repeated schema fields
• clustering and partitioning
• slots reservation for BigQuery
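The clustering and partitioning step can be sketched as a small helper that builds BigQuery DDL. This is a minimal illustration, not the speaker's implementation: the function name `partitioned_table_ddl` and the table and column names in the usage note (`analytics.events`, `event_ts`, `customer_id`, `country`) are hypothetical. It assumes the partition column is a TIMESTAMP and that daily partitions are wanted.

```python
from typing import Sequence


def partitioned_table_ddl(
    table: str,
    columns: Sequence[tuple[str, str]],
    partition_column: str,
    cluster_columns: Sequence[str] = (),
) -> str:
    """Build a CREATE TABLE statement with daily partitioning and optional clustering."""
    if len(cluster_columns) > 4:
        # BigQuery accepts at most four clustering columns per table.
        raise ValueError("BigQuery supports at most 4 clustering columns")
    column_sql = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns)
    ddl = (
        f"CREATE TABLE {table} (\n  {column_sql}\n)\n"
        f"PARTITION BY DATE({partition_column})"
    )
    if cluster_columns:
        # Clustering sorts data inside each partition by these columns.
        ddl += "\nCLUSTER BY " + ", ".join(cluster_columns)
    return ddl
```

Calling `partitioned_table_ddl("analytics.events", [("event_ts", "TIMESTAMP"), ("customer_id", "STRING"), ("country", "STRING")], "event_ts", ["customer_id", "country"])` returns a statement that can be reviewed before it is run in BigQuery.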
ITERATIVE APPROACH IN AN AGILE WAY
Prioritize use case backlog -> Select use cases for iteration -> Execution -> Release
1. Setup and data governance
2. Migrate schema and data
3. Translate queries
4. Migrate services and apps
5. Migrate data pipelines
6. Optimise performance
7. Verify and validate
Next iteration
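The "prioritize backlog, select use cases for iteration" loop can be sketched as below. The scoring model is an assumption made for illustration only: the talk does not say how use cases are ranked, so the `business_value` and `effort` scores, the `UseCase` class, and the `select_for_iteration` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    business_value: int  # hypothetical score, 1 (low) to 10 (high)
    effort: int          # hypothetical score, 1 (small) to 10 (large); must be >= 1


def select_for_iteration(backlog: list[UseCase], capacity: int) -> list[UseCase]:
    """Pick use cases by value-to-effort ratio until the effort capacity is used up."""
    ranked = sorted(
        backlog, key=lambda uc: uc.business_value / uc.effort, reverse=True
    )
    selected: list[UseCase] = []
    remaining = capacity
    for use_case in ranked:
        # Skip anything that no longer fits; it stays in the backlog
        # for the next iteration.
        if use_case.effort <= remaining:
            selected.append(use_case)
            remaining -= use_case.effort
    return selected
```

Whatever is not selected stays in the backlog and is re-prioritized before the next iteration.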
PIPELINES MIGRATION
Cloud Composer
Cloud Dataflow
Cloud Dataproc
Cloud Compute Engine
WHAT TO CHOOSE?
DATAFLOW vs DATAPROC
Criterion | Cloud Dataproc | Cloud Dataflow
Recommended for | Existing Hadoop/Spark applications, machine learning/data science ecosystem, large-batch jobs, preemptible VMs | New data processing pipelines, unified batch and streaming
Fully managed | No | Yes
Managed by | DevOps | Serverless
Auto-scaling | Yes, based on cluster utilization (reactive) | Yes, transform-by-transform (adaptive)
Expertise | Hadoop, Hive, Pig, Apache Big Data ecosystem, Spark, Flink, Presto, Druid | Apache Beam
DATAFLOW vs SPARK SERVERLESS
Criterion | Spark Serverless | Cloud Dataflow
Recommended for | Existing Spark applications (from Spark 3.2), machine learning/data science ecosystem, large-batch jobs | New data processing pipelines, unified batch and streaming
Fully managed | Yes | Yes
Managed by | Serverless | Serverless
Auto-scaling | Yes, transform-by-transform (adaptive) | Yes, transform-by-transform (adaptive)
Expertise | PySpark, Spark SQL, SparkR, Spark Java/Scala | Apache Beam
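The two comparisons above can be condensed into a small decision helper. This is a rough sketch of one way to read the tables, not an official selection rule: the function name `recommend_engine`, the labels `"none"`, `"hadoop"`, and `"spark"`, and the `wants_serverless` flag are hypothetical.

```python
def recommend_engine(existing_code: str, wants_serverless: bool = True) -> str:
    """Suggest a processing service from the kind of code being migrated.

    existing_code: "none" for new pipelines, "hadoop" or "spark" for
    existing applications (hypothetical labels for this sketch).
    """
    if existing_code == "none":
        # New pipelines, unified batch and streaming: Apache Beam on Dataflow.
        return "Cloud Dataflow"
    if existing_code == "spark" and wants_serverless:
        # Existing Spark applications (Spark 3.2 or later) without cluster management.
        return "Spark Serverless"
    if existing_code in ("hadoop", "spark"):
        # Existing Hadoop/Spark applications on clusters the team manages itself.
        return "Cloud Dataproc"
    raise ValueError(f"unknown kind of existing code: {existing_code!r}")
```

For example, `recommend_engine("hadoop")` returns `"Cloud Dataproc"`, while `recommend_engine("spark")` returns `"Spark Serverless"`.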
SCHEMA AND DATA MIGRATION
Database Migration Service: helps migrate MySQL and PostgreSQL to Cloud SQL
BigQuery Data Transfer Service
Google recommends transferring large data volumes with Storage Transfer Service, preferably in
Avro, Parquet, or ORC format rather than CSV or JSON
Migration strategies for Oracle workloads: rehost (on Bare Metal Solution), replatform, rewrite
HBase to Bigtable migration path: HDFS -> Cloud Storage -> Storage Transfer Service -> Bigtable
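The format preference can be sketched as a helper that builds a `bq load` command for files already staged in Cloud Storage. This is a minimal sketch under stated assumptions: the helper name `bq_load_command`, the bucket, and the table name are hypothetical, and the source format is inferred from the file extension only. Avro, Parquet, and ORC files carry their own schema, so the sketch adds `--autodetect` only for CSV and JSON.

```python
# File extension -> value accepted by `bq load --source_format`.
SOURCE_FORMATS = {
    "avro": "AVRO",
    "parquet": "PARQUET",
    "orc": "ORC",
    "csv": "CSV",
    "json": "NEWLINE_DELIMITED_JSON",
}
# Self-describing formats, preferred for large loads.
PREFERRED_FORMATS = {"AVRO", "PARQUET", "ORC"}


def bq_load_command(table: str, gcs_uri: str) -> list[str]:
    """Build the argument list for loading Cloud Storage files into a BigQuery table."""
    extension = gcs_uri.rsplit(".", 1)[-1].lower()
    if extension not in SOURCE_FORMATS:
        raise ValueError(f"cannot infer a source format from {gcs_uri!r}")
    source_format = SOURCE_FORMATS[extension]
    command = ["bq", "load", f"--source_format={source_format}"]
    if source_format not in PREFERRED_FORMATS:
        # CSV and JSON carry no schema, so ask BigQuery to infer one.
        command.append("--autodetect")
    return command + [table, gcs_uri]
```

The returned list can be passed to `subprocess.run` on a machine where the `bq` tool is installed and authenticated.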
DATA STORES FOR DIFFERENT USE CASES
Data
• Unstructured: Cloud Storage
• Structured
  • Transactional workloads
    • NoSQL: Firestore
    • SQL, one database is enough: Cloud SQL
    • SQL, horizontal scalability: Cloud Spanner
  • Data analytics workloads
    • Millisecond latency: Cloud Bigtable
    • Latency in seconds: BigQuery
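The decision tree on this slide can be written as a short function. This is one reading of the diagram, offered as a sketch: the function name `choose_data_store` and its parameter names are hypothetical, and real choices usually also weigh cost, data volume, and consistency needs.

```python
def choose_data_store(
    structured: bool,
    workload: str = "analytics",  # "transactional" or "analytics"
    sql: bool = True,
    horizontal_scaling: bool = False,
    millisecond_latency: bool = False,
) -> str:
    """Walk the slide's decision tree and return the suggested GCP data store."""
    if not structured:
        return "Cloud Storage"
    if workload == "transactional":
        if not sql:
            return "Firestore"
        # One database is enough -> Cloud SQL; otherwise scale out with Spanner.
        return "Cloud Spanner" if horizontal_scaling else "Cloud SQL"
    if workload == "analytics":
        # Millisecond reads -> Bigtable; latency in seconds is fine -> BigQuery.
        return "Cloud Bigtable" if millisecond_latency else "BigQuery"
    raise ValueError(f"unknown workload: {workload!r}")
```

For example, `choose_data_store(structured=True, workload="transactional", horizontal_scaling=True)` returns `"Cloud Spanner"`.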
THANK YOU!

Psychic Reading | Spiritual Guidance – Astro Ganesh Jiastral oracle
 
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...ssuserf63bd7
 
NAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataNAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataExhibitors Data
 
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdftrending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdfMintel Group
 
Driving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerDriving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerAggregage
 
Welding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsWelding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsIndiaMART InterMESH Limited
 
Healthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterHealthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterJamesConcepcion7
 
TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024Adnet Communications
 
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxThe-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxmbikashkanyari
 
WSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfWSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfJamesConcepcion7
 
Cyber Security Training in Office Environment
Cyber Security Training in Office EnvironmentCyber Security Training in Office Environment
Cyber Security Training in Office Environmentelijahj01012
 
Jewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource CentreJewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource CentreNZSG
 
14680-51-4.pdf Good quality CAS Good quality CAS
14680-51-4.pdf  Good  quality CAS Good  quality CAS14680-51-4.pdf  Good  quality CAS Good  quality CAS
14680-51-4.pdf Good quality CAS Good quality CAScathy664059
 
Interoperability and ecosystems: Assembling the industrial metaverse
Interoperability and ecosystems:  Assembling the industrial metaverseInteroperability and ecosystems:  Assembling the industrial metaverse
Interoperability and ecosystems: Assembling the industrial metaverseSiemens
 

Último (20)

Appkodes Tinder Clone Script with Customisable Solutions.pptx
Appkodes Tinder Clone Script with Customisable Solutions.pptxAppkodes Tinder Clone Script with Customisable Solutions.pptx
Appkodes Tinder Clone Script with Customisable Solutions.pptx
 
EUDR Info Meeting Ethiopian coffee exporters
EUDR Info Meeting Ethiopian coffee exportersEUDR Info Meeting Ethiopian coffee exporters
EUDR Info Meeting Ethiopian coffee exporters
 
WAM Corporate Presentation April 12 2024.pdf
WAM Corporate Presentation April 12 2024.pdfWAM Corporate Presentation April 12 2024.pdf
WAM Corporate Presentation April 12 2024.pdf
 
digital marketing , introduction of digital marketing
digital marketing , introduction of digital marketingdigital marketing , introduction of digital marketing
digital marketing , introduction of digital marketing
 
Excvation Safety for safety officers reference
Excvation Safety for safety officers referenceExcvation Safety for safety officers reference
Excvation Safety for safety officers reference
 
Psychic Reading | Spiritual Guidance – Astro Ganesh Ji
Psychic Reading | Spiritual Guidance – Astro Ganesh JiPsychic Reading | Spiritual Guidance – Astro Ganesh Ji
Psychic Reading | Spiritual Guidance – Astro Ganesh Ji
 
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
Intermediate Accounting, Volume 2, 13th Canadian Edition by Donald E. Kieso t...
 
NAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors DataNAB Show Exhibitor List 2024 - Exhibitors Data
NAB Show Exhibitor List 2024 - Exhibitors Data
 
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdftrending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
trending-flavors-and-ingredients-in-salty-snacks-us-2024_Redacted-V2.pdf
 
Driving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon HarmerDriving Business Impact for PMs with Jon Harmer
Driving Business Impact for PMs with Jon Harmer
 
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptxThe Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
The Bizz Quiz-E-Summit-E-Cell-IITPatna.pptx
 
Welding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan DynamicsWelding Electrode Making Machine By Deccan Dynamics
Welding Electrode Making Machine By Deccan Dynamics
 
Healthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare NewsletterHealthcare Feb. & Mar. Healthcare Newsletter
Healthcare Feb. & Mar. Healthcare Newsletter
 
TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024TriStar Gold Corporate Presentation - April 2024
TriStar Gold Corporate Presentation - April 2024
 
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptxThe-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
The-Ethical-issues-ghhhhhhhhjof-Byjus.pptx
 
WSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdfWSMM Media and Entertainment Feb_March_Final.pdf
WSMM Media and Entertainment Feb_March_Final.pdf
 
Cyber Security Training in Office Environment
Cyber Security Training in Office EnvironmentCyber Security Training in Office Environment
Cyber Security Training in Office Environment
 
Jewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource CentreJewish Resources in the Family Resource Centre
Jewish Resources in the Family Resource Centre
 
14680-51-4.pdf Good quality CAS Good quality CAS
14680-51-4.pdf  Good  quality CAS Good  quality CAS14680-51-4.pdf  Good  quality CAS Good  quality CAS
14680-51-4.pdf Good quality CAS Good quality CAS
 
Interoperability and ecosystems: Assembling the industrial metaverse
Interoperability and ecosystems:  Assembling the industrial metaverseInteroperability and ecosystems:  Assembling the industrial metaverse
Interoperability and ecosystems: Assembling the industrial metaverse
 

Everything You Need to Know Before Migrating Your Data Platform to GCP

  • 1. WHAT YOU NEED TO KNOW, BEFORE MIGRATING DATA PLATFORM TO GCP by SERHII KHOLODNIUK
  • 2. Serhii Kholodniuk, Senior Big Data Engineer, Sigma Software Ukraine, Kyiv office
      My interests and goals:
        • designing and developing data platforms for the needs of business intelligence and machine learning
        • constantly looking for opportunities to simplify and optimize solutions, their implementation, and maintenance
        • client-value oriented
      Mastering GCP:
        • currently building a data platform in GCP
        • migrating data pipelines into GCP infrastructure
        • optimizing data warehouse structure
  • 3. AGENDA
        • Why GCP is becoming popular (slide 4)
        • Migration phases (slide 9)
        • Pipelines migration (slide 11)
        • Schema and data migration (slide 14)
        • Data storages (slide 15)
  • 4. WHY IS GCP BECOMING POPULAR?
      Cloud infrastructure
        • Network
        • Cloud sustainability
      Data cloud
        • Security out of the box (data encrypted at rest and in transit)
        • Powerful BigQuery features with ergonomic design
        • Cloud infrastructure for all data needs
      Customized solutions for different industries
        • Best-practice industry solutions
      Artificial intelligence solutions
        • Prebuilt ML model APIs
        • Custom model building with SQL in BigQuery ML
        • Custom model building with Cloud AutoML
  • 5. CLOUD INFRASTRUCTURE
      Network
        • 29 regions
        • 88 availability zones
        • 146 edge locations
      Cloud sustainability
        • 100% renewable energy for all cloud regions
        • 81% of waste diverted from landfills
        • 2x more efficient than a typical enterprise data center
  • 6. DATA CLOUD
        • Security out of the box (data encrypted at rest and in transit)
        • Cloud infrastructure for all data needs
        • Powerful BigQuery features with ergonomic design
  • 7. CUSTOMIZED SOLUTIONS FOR DIFFERENT INDUSTRIES
      Best-practice industry solutions:
        • Retail
        • Consumer packaged goods
        • Manufacturing
        • Automotive
        • Supply chain and logistics
        • Energy
        • Healthcare and life sciences
        • Media and entertainment
        • Gaming
        • Telecommunications
        • Financial services: financial services, capital markets
        • Government and public sector: government, state and local government, federal government, education
  • 9. MIGRATION PHASES
      1. Pre-migration phase
        • complete an inventory of the workloads and assets to be migrated
        • calculate Total Cost of Ownership and future business value
        • build a use case backlog
        • select use cases for the iteration
      2. Migration phase
        • schema migration
        • pipelines migration
        • data migration
      3. Post-migration phase
        • cost and performance optimization
        • schema denormalization for BigQuery
        • removing nested and repeated schema fields
        • clustering and partitioning
        • slots reservation for BigQuery
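The "clustering and partitioning" item of the post-migration phase can be illustrated with a small helper that builds a BigQuery `CREATE TABLE` statement. This is a minimal sketch and not taken from the slides: the helper name `build_partitioned_table_ddl`, the table `analytics.events`, and the columns `event_ts`, `customer_id`, and `country` are all hypothetical, and the sketch assumes daily partitioning on a TIMESTAMP column.

```python
def build_partitioned_table_ddl(table, columns, partition_column, cluster_columns):
    """Build BigQuery DDL for a day-partitioned, clustered table.

    columns: list of (name, type) pairs, e.g. ("event_ts", "TIMESTAMP").
    partition_column: TIMESTAMP column; rows are partitioned by its date.
    cluster_columns: columns to cluster by (BigQuery allows at most four).
    """
    if not 1 <= len(cluster_columns) <= 4:
        raise ValueError("BigQuery allows between 1 and 4 clustering columns")

    # Every partitioning/clustering column must be declared in the schema.
    names = {name for name, _ in columns}
    missing = [c for c in [partition_column, *cluster_columns] if c not in names]
    if missing:
        raise ValueError(f"unknown columns: {missing}")

    column_sql = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE TABLE `{table}` (\n  {column_sql}\n)\n"
        f"PARTITION BY DATE({partition_column})\n"
        f"CLUSTER BY {', '.join(cluster_columns)}"
    )


if __name__ == "__main__":
    # Hypothetical table and columns, used only for illustration.
    print(build_partitioned_table_ddl(
        "analytics.events",
        [("event_ts", "TIMESTAMP"), ("customer_id", "STRING"), ("country", "STRING")],
        partition_column="event_ts",
        cluster_columns=["customer_id", "country"],
    ))
```

Partitioning lets queries that filter on the date skip whole partitions, and clustering sorts data within each partition by the listed columns, which is why both usually appear together in cost and performance tuning.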
  • 10. ITERATIVE APPROACH IN AGILE WAY
      Prioritize use case backlog -> Select use cases for iteration -> Execution -> Release -> Next iteration
      Execution steps:
        1. Set up and data governance
        2. Migrate schema and data
        3. Translate queries
        4. Migrate services and apps
        5. Migrate data pipelines
        6. Optimize performance
        7. Verify and validate
  • 11. PIPELINES MIGRATION: WHAT TO CHOOSE?
        • Cloud Composer
        • Cloud Dataflow
        • Cloud Dataproc
        • Cloud Compute Engine
  • 12. DATAFLOW vs DATAPROC
      Recommended for:
        • Cloud Dataproc: existing Hadoop/Spark applications, machine learning/data science ecosystem, large-batch jobs, preemptible VMs
        • Cloud Dataflow: new data processing pipelines, unified batch and streaming
      Fully managed:
        • Cloud Dataproc: No
        • Cloud Dataflow: Yes
      Managed by:
        • Cloud Dataproc: DevOps
        • Cloud Dataflow: Serverless
      Auto-scaling:
        • Cloud Dataproc: Yes, based on cluster utilization (reactive)
        • Cloud Dataflow: Yes, transform-by-transform (adaptive)
      Expertise:
        • Cloud Dataproc: Hadoop, Hive, Pig, Apache Big Data ecosystem, Spark, Flink, Presto, Druid
        • Cloud Dataflow: Apache Beam
  • 13. DATAFLOW vs SPARK SERVERLESS
      Recommended for:
        • Spark Serverless: existing Spark applications (from Spark 3.2), machine learning/data science ecosystem, large-batch jobs
        • Cloud Dataflow: new data processing pipelines, unified batch and streaming
      Fully managed:
        • Spark Serverless: Yes
        • Cloud Dataflow: Yes
      Managed by:
        • Spark Serverless: Serverless
        • Cloud Dataflow: Serverless
      Auto-scaling:
        • Spark Serverless: Yes, transform-by-transform (adaptive)
        • Cloud Dataflow: Yes, transform-by-transform (adaptive)
      Expertise:
        • Spark Serverless: PySpark, Spark SQL, SparkR, Spark Java/Scala
        • Cloud Dataflow: Apache Beam
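The "what to choose?" question from slides 11 to 13 can be sketched as a small decision helper. This is only an illustrative reading of the two comparison tables, not a rule stated by the author: the `Workload` fields and the `recommend_engine` function are hypothetical names, the branch order is an assumption, and Cloud Composer and Compute Engine are left out because the slides do not compare them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Workload:
    # True when the pipeline is already written for Hadoop or Spark.
    existing_hadoop_or_spark: bool
    # (major, minor) Spark version, or None for non-Spark workloads.
    spark_version: Optional[Tuple[int, int]] = None
    # True when a DevOps team is available and willing to manage clusters.
    team_runs_clusters: bool = False


def recommend_engine(workload: Workload) -> str:
    """Map a workload to one of the three engines compared on the slides."""
    # New pipelines: the slides recommend Dataflow (Apache Beam).
    if not workload.existing_hadoop_or_spark:
        return "Cloud Dataflow"
    # Existing Spark 3.2+ code with no wish to manage clusters: serverless Spark.
    if (
        workload.spark_version is not None
        and workload.spark_version >= (3, 2)
        and not workload.team_runs_clusters
    ):
        return "Spark Serverless"
    # Everything else from the Hadoop/Spark ecosystem: managed clusters.
    return "Cloud Dataproc"


if __name__ == "__main__":
    print(recommend_engine(Workload(existing_hadoop_or_spark=False)))
    print(recommend_engine(Workload(True, spark_version=(3, 3))))
    print(recommend_engine(Workload(True, spark_version=(2, 4))))
```

In practice the choice also depends on factors the tables list but this sketch ignores, such as the team's expertise with Apache Beam versus Spark.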
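  • 14. SCHEMA AND DATA MIGRATION
        • Database Migration Service helps migrate MySQL and PostgreSQL to Cloud SQL
        • BigQuery Data Transfer Service
        • Google recommends loading large data volumes with Cloud Storage Transfer Service, preferably in Avro, Parquet, or ORC format rather than CSV or JSON
        • Migration strategies for Oracle workloads: rehost (on Bare Metal Solution), replatform, rewrite
        • HBase to Bigtable migration path: HDFS -> Cloud Storage -> Storage Transfer Service -> Bigtable
  • 15. DATA STORES FOR DIFFERENT USE CASES
      Data
        • Unstructured: Cloud Storage
        • Structured
            • Transactional workloads
                • NoSQL: Firestore
                • SQL
                    • One database is enough: Cloud SQL
                    • Horizontal scalability: Cloud Spanner
            • Data analytics workloads
                • Millisecond latency: Cloud Bigtable
                • Latency in seconds: BigQuery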
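The data store selection on slide 15 reads as a decision tree, which can be sketched in a few lines. This is a minimal sketch under assumptions: the function name `choose_data_store` and its parameter names and string values are hypothetical, and the branch layout follows one reading of the flattened slide (unstructured versus structured, then transactional versus analytics).

```python
def choose_data_store(
    *,
    structured,
    workload="analytics",      # "transactional" or "analytics"
    relational=True,           # SQL (True) or NoSQL (False); transactional only
    horizontal_scale=False,    # needs scaling beyond one database; SQL only
    latency="seconds",         # "milliseconds" or "seconds"; analytics only
):
    """Walk the slide-15 decision tree and return a GCP storage product name."""
    if not structured:
        return "Cloud Storage"

    if workload == "transactional":
        if not relational:
            return "Firestore"
        return "Cloud Spanner" if horizontal_scale else "Cloud SQL"

    if workload == "analytics":
        if latency == "milliseconds":
            return "Cloud Bigtable"
        if latency == "seconds":
            return "BigQuery"
        raise ValueError(f"unknown latency: {latency!r}")

    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    print(choose_data_store(structured=False))
    print(choose_data_store(structured=True, workload="transactional",
                            relational=True, horizontal_scale=True))
    print(choose_data_store(structured=True, workload="analytics",
                            latency="milliseconds"))
```

Keyword-only arguments keep each call readable as a path through the tree, and unknown values raise instead of silently falling through to a default product.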
  • 15. DATA STORES FOR DIFFERENT USE CASES — 15 Data Unstructured Structured Cloud Storage Transactional workloads Data analytics workloads Millisecond latency Latency in seconds Cloud Bigtable BigQuery Firestore NoSQL SQL One database enough Horisontal scalability Cloud SQL Cloud Spanner