Serverless Big Data Architecture on Google Cloud Platform at Credit OK
Presented by Kriangkrai Chaonithi @spicydog
On 25/11/2018, at Barcamp Bangkhen 9
Hello! My name is Gap
Education
● BS Applied Computer Science (KMUTT)
● MS Applied Computer Engineering (KMUTT)
Work Experience
● Former Android, iOS & PHP Developer at Longdo.COM
● Former R&D Manager at Insightera
● CTO & co-founder at Credit OK
Fields of Interest
● Software Engineering
● Computer Security
● Servers & Cloud & Distributed Computing
● Machine Learning & NLP
https://spicydog.me
Agenda
● Server & application deployment history
● Introduction to Google Cloud Platform products
○ Computing
○ Storage & databases
○ Data analytics
● Big data architecture at Credit OK
○ About Credit OK
○ Why we use serverless
○ Our requirements
○ Our solutions
○ The summary
Server & Application Deployment History
Bare Metal Server
● Pre-cloud era (probably..)
● Install OS and dependencies on a machine
● One machine - one server
● Expose the network to the internet
● Colocation/on-premise
● SSH/FTP/Git to the server
Virtualization
● One machine - many servers
● One machine - multiple customers
● VPS / Cloud
● SSH/FTP/Git to the server
IaaS
Containers & Microservices
● Docker / Kubernetes
● Auto deployment
● Auto scale (automatically spawns new nodes)
● Pay based on the number of nodes
● Infrastructure as code! (new concept!)
PaaS
Why Containers?
Why Container Orchestration?
https://blog.risingstack.com/what-is-kubernetes-how-to-get-started/
Serverless
● Write code and deploy!
● Auto deploy
● Auto scale
● Pay per request
● No infrastructure!!
SaaS
It’s time to talk about..
Some Famous Features on GCP
GCP Computing
Virtual Machine
Containers
Serverless
Let’s Review Types of Databases
SQL NoSQL
CAP Theorem
GCP Storage & Databases
Non-serverless
Serverless
GCP Data Analytics
Pipeline Analytics Visualization
Credit Scoring Platform on Big Data Analytics
creditok.co
Why use serverless on big data?
● Scalable & super high performance
● No more server maintenance :)
● Easier to optimize
● Only pay per use
Requirements
● Need a HUGE data warehouse for batch processing
● Our customers have on-premise data at >400 sites
● A data ingestor app must be installed at every site
● Data ingestor app must be able to run on
● Data ingestor app must be super robust and easy to install
● Must run automatically every day via a task scheduler
When >400 sites upload large files to your server at the same time...
This is an unintentional DDoS!
So we mainly use Cloud Functions
● Auto scale
● But only accepts a <10 MB body size
and also use Compute Engine/App Engine for >10 MB files
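The size-based split above can be sketched as a small routing function inside the data ingestor. This is a minimal sketch, not the actual Credit OK implementation: both endpoint URLs are hypothetical placeholders, 10 MB is assumed to mean 10 × 1024 × 1024 bytes, and sending a file of exactly 10 MB to the large-file endpoint is a choice made here because the slide does not cover that boundary.

```python
# Hypothetical endpoints; the slides do not name the real ones.
CLOUD_FUNCTION_URL = "https://example.invalid/ingest-small"
LARGE_UPLOAD_URL = "https://example.invalid/ingest-large"

# Body-size limit quoted on the slide, assumed to be in binary megabytes.
MAX_FUNCTION_BODY_BYTES = 10 * 1024 * 1024


def choose_upload_endpoint(file_size_bytes: int) -> str:
    """Return the endpoint an ingestor should send a file of this size to."""
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    if file_size_bytes < MAX_FUNCTION_BODY_BYTES:
        # Small enough for the auto-scaling function endpoint.
        return CLOUD_FUNCTION_URL
    # At or above the limit: fall back to the Compute/App Engine endpoint.
    return LARGE_UPLOAD_URL
```

An ingestor could call `choose_upload_endpoint(os.path.getsize(path))` before each upload, so most traffic goes to the auto-scaling endpoint and only oversized files reach the fixed-capacity service.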
Data Flow Architecture
[Diagram: data flow starting from the raw data sources]
Serverless Big Data Architecture in Summary
● Focus on design & coding
● A few people can accomplish a huge task
● No cost for idle servers; pay as you go
(GCS storage ~$0.02 per GB)
● Processing cost is surprisingly low when optimized
(Beware of BigQuery cost!)
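To put the storage figure in perspective, here is a rough back-of-the-envelope sketch. It uses only the approximate ~$0.02 per GB rate from the slide and assumes that rate is per GB per month; the function name and the 5,000 GB example volume are made up for illustration, and real pricing varies by storage class and region.

```python
# Approximate rate from the slide; assumed here to be per GB per month.
GCS_PRICE_PER_GB_USD = 0.02


def estimate_storage_cost_usd(
    stored_gb: float, price_per_gb: float = GCS_PRICE_PER_GB_USD
) -> float:
    """Estimate the monthly storage bill for a given volume of data."""
    if stored_gb < 0:
        raise ValueError("stored_gb cannot be negative")
    return stored_gb * price_per_gb


# Hypothetical volume: about 5 TB of raw data.
print(f"~${estimate_storage_cost_usd(5000):.2f} per month")
```

At that rate, roughly 5 TB costs on the order of $100 per month to keep, which is why storage is rarely the item to worry about compared with query processing.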
Beware of ZONE_RESOURCE_POOL_EXHAUSTED
● Serverless doesn’t mean there are no servers; you just don’t need to spawn servers/workers yourself
● Worker pools have limits, so do not run your app at peak time (but when is that?!)
● Hopefully Google will solve the problem soon :)
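Since the peak time is hard to predict, one common workaround (not something the slides describe) is to retry a failed job with exponential backoff and random jitter. The sketch below is a generic pattern under stated assumptions: `ResourcePoolExhausted` is a hypothetical exception standing in for however the error surfaces in your client, and `run_with_backoff` is an illustrative helper, not a GCP API.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ResourcePoolExhausted(Exception):
    """Hypothetical stand-in for a ZONE_RESOURCE_POOL_EXHAUSTED failure."""


def run_with_backoff(
    job: Callable[[], T],
    max_attempts: int = 5,
    base_delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run job, retrying with jittered exponential backoff on exhaustion."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return job()
        except ResourcePoolExhausted:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time up to base * 2^attempt seconds.
            sleep(random.uniform(0, base_delay_seconds * 2 ** attempt))
    raise AssertionError("unreachable")
```

The jitter matters when many jobs fail at once: without it, they would all retry at the same moment and hit the exhausted pool together again.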
We Are Hiring!
● PHP Laravel/Lumen Developer
● Data Engineer
● Credit Risk Analyst
hr@creditok.co
https://jobs.blognone.com/company/creditok
Questions & Answers
Time is short, let’s utilize the networks.
Feel free to connect with me via spicydog.me
