Ta Virot Chiraphadhanakul, PhD (@tvirot)
GDE in Machine Learning | Managing Director @ Skooldio
Data Science on Google Cloud Platform
Google Developers
Launchpad Build for Cloud Meetup, Bangkok
90% of the data in the world
today has been created in
the last two years alone.
— IBM
Turning data into …
Metrics
Insights
Data Products
Data Science Process
Collect, Manipulate, Analyze, Model, Communicate
A lot of tools needed
Pig, Airflow
Administrative and operational issues
• Deployment and configuration
• Managing scale and optimizing utilization
• Reliability
• Resource provisioning
• etc.
How about a serverless big data stack that scales automatically?
Serverless Data Processing
• Focus on insights, not administration
• Practically infinite scale, exactly when you need it
• Pay only for what you use
• Freedom to experiment, fail quickly, and iterate. Successful experiments are ready to go live right away
Data Science on Google Cloud Platform
Storage & Databases, Big Data, Machine Learning

Storage & Databases
Cloud Storage
A scalable object storage service suitable for all kinds of unstructured data
Cloud SQL
A fully managed database service that makes it easy to set up, maintain, manage, and administer your relational MySQL and PostgreSQL databases in the cloud
Cloud Datastore
A highly scalable NoSQL database for your applications. Cloud Datastore automatically handles sharding and replication.
Cloud Bigtable
A massively scalable NoSQL database suitable for low-latency and high-throughput workloads. It supports the open-source, industry-standard HBase API.
Cloud Pub/Sub
Fully managed real-time messaging service that allows you to send and receive messages between independent applications
Connect Anything to Everything
Use Cloud Pub/Sub to publish and subscribe to data from multiple sources, reducing dependencies between components of distributed applications
Highly Scalable
Any customer can send up to 10,000 messages per second, by default
Guaranteed Delivery
Designed to provide “at least once” delivery
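With "at least once" delivery, a subscriber may occasionally receive the same message more than once, so message handlers are usually written to be idempotent. The following is a minimal sketch of that idea in plain Java, with no Pub/Sub client library involved. The `Message` and `IdempotentHandler` classes and their fields are hypothetical names chosen for illustration, and the in-memory set stands in for whatever durable store a real subscriber would need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AtLeastOnceDemo {
    // Hypothetical message shape: a unique ID plus a payload.
    static final class Message {
        final String id;
        final String payload;

        Message(String id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    // Processes each message ID once, even if the same message is delivered again.
    static final class IdempotentHandler {
        private final Set<String> seenIds = new HashSet<>();
        private final List<String> processed = new ArrayList<>();

        // Returns true if the message was processed, false if it was a redelivery.
        boolean handle(Message m) {
            if (!seenIds.add(m.id)) {
                return false; // ID already seen: skip the duplicate
            }
            processed.add(m.payload);
            return true;
        }

        List<String> processed() {
            return processed;
        }
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        // Simulated delivery sequence in which message "1" arrives twice.
        List<Message> deliveries = Arrays.asList(
            new Message("1", "signup"),
            new Message("2", "purchase"),
            new Message("1", "signup"));
        for (Message m : deliveries) {
            handler.handle(m);
        }
        System.out.println(handler.processed()); // prints [signup, purchase]
    }
}
```

Deduplicating by message ID is only one possible design; making the downstream write itself idempotent (for example, an upsert keyed by ID) achieves the same effect without a separate "seen" set.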

Cloud Dataflow
Fully managed data processing service, supporting both stream and batch execution of pipelines
Fully Managed
Dynamically provisions resources to minimize latency while maintaining high utilization efficiency
Unified Programming Model
Express computational requirements regardless of data source
// Create a Pipeline (options is a PipelineOptions object built beforehand)
Pipeline p = Pipeline.create(options);
// Read lines
p.apply(TextIO.Read.from("gs://dataflow-samples/shakespeare/kinglear.txt"))
  // Tokenize lines into words
  .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {
    @Override
    public void processElement(ProcessContext c) {
      for (String word : c.element().split("[^a-zA-Z']+")) {
        if (!word.isEmpty()) {
          c.output(word);
        }
      }
    }
  }))
  // Count words
  .apply(Count.<String>perElement())
  // Format strings
  .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {
    @Override
    public String apply(KV<String, Long> element) {
      return element.getKey() + ": " + element.getValue();
    }
  }))
  // Write to file
  .apply(TextIO.Write.to("gs://my-bucket/counts.txt"));
// Run the pipeline
p.run();
https://cloud.google.com/dataflow/examples/wordcount-example
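To check the tokenize, count, and format logic without a Dataflow project, the same steps can be sketched on an in-memory list with Java streams. This is not the Dataflow SDK and nothing here runs on Google Cloud. `LocalWordCount` and `countWords` are hypothetical names, and sorting the words alphabetically is an assumption made only so the output is deterministic.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class LocalWordCount {
    // Mirrors the pipeline steps on an in-memory list: tokenize, count, format.
    static List<String> countWords(List<String> lines) {
        Map<String, Long> counts = lines.stream()
            // Tokenize lines into words, using the same regex as the pipeline.
            .flatMap(line -> Arrays.stream(line.split("[^a-zA-Z']+")))
            .filter(word -> !word.isEmpty())
            // Count words; a TreeMap keeps the words in alphabetical order.
            .collect(Collectors.groupingBy(
                Function.identity(), TreeMap::new, Collectors.counting()));
        // Format each entry as "word: count".
        return counts.entrySet().stream()
            .map(e -> e.getKey() + ": " + e.getValue())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be");
        // Prints one line per word: be: 2, not: 1, or: 1, to: 2
        countWords(lines).forEach(System.out::println);
    }
}
```

The local version holds everything in memory, whereas the pipeline expresses the same steps as transforms that the service can distribute across workers.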
Different Pipeline Shapes
Multiple transforms + merging
Joining multiple sources
https://cloud.google.com/dataflow/pipelines/design-principles
Diagram: stream sources (events, metrics) enter through Cloud Pub/Sub and batch sources (raw logs, databases, etc.) through Cloud Storage; both feed Cloud Dataflow.
Cloud Dataproc
Managed Spark and Hadoop service that is fast, easy to use, and low cost
Fast & Scalable Data Processing
Create clusters in minutes and resize them at any time
Affordable Pricing
Based on actual use, measured by the minute
Open Source Ecosystem
Move existing projects or ETL pipelines without redevelopment
Diagram (updated): Cloud Dataproc is added as a batch processing option next to Cloud Dataflow, with Cloud Pub/Sub (stream: events, metrics) and Cloud Storage (batch: raw logs, databases, etc.) as the entry points.
Cloud Dataprep
An intelligent data service for visually exploring, cleaning, and preparing data
Visually explore data
Intelligent data manipulation
Serverless and works at any scale
https://cloud.google.com/dataprep/
BigQuery
Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
Fully Managed
No infrastructure to manage, and you don't need a database administrator
Speed & Scale
Scans terabytes in seconds and petabytes in minutes
Convenience of SQL
Makes it more accessible
Security & Reliability
Automatically encrypts and replicates data
Flexible Data Ingestion
Load your data from Google Cloud Storage or Google Cloud Datastore, or stream it
Fully Integrated
With other Google Cloud products and third-party applications
How fast is BigQuery really?
https://cloud.google.com/blog/big-data/2016/01/anatomy-of-a-bigquery-query
Diagram (updated): BigQuery is added to the architecture of Cloud Pub/Sub (stream: events, metrics), Cloud Storage (batch: raw logs, databases, etc.), Cloud Dataflow, and Cloud Dataproc (batch).

Cloud Datalab
An easy-to-use interactive tool for data exploration, analysis, visualization, and machine learning
Integrated & Open Source
Built on Jupyter (formerly IPython). Enables analysis of your data on BigQuery, ML Engine, Compute Engine, and Cloud Storage
Photo: https://github.com/googledatalab/datalab
Cloud Data Studio
Turns your data into informative dashboards and reports that are easy to read, easy to share, and fully customizable
Put all your data to work
Easily access all the data sources you need to understand your business and make better decisions
Build engaging visualizations
Create beautiful charts and graphs that bring your data to life
Leverage teamwork that works
Share and collaborate in real time. Work together quickly, from anywhere.
https://datastudio.google.com

Artificial Intelligence
is the new electricity.
— Andrew Ng
AlphaGo
The first computer program to
beat a professional human Go
player
Photo: Nature
Waymo
The Google self-driving car
project became Waymo with a
mission to make it easy and
safe for people and things to
move around
Photo: Waymo
Machine Learning engine and APIs
Custom ML models: TensorFlow, Machine Learning Engine
Pre-trained ML models: Vision API, Translation API, Natural Language API, Speech API, Jobs API
Google Cloud Vision API
Understand the content of images
• Label Detection
• Optical Character Recognition
• Explicit Content Detection
• etc.
https://m.me/youpin.city | https://youpin.city/app
Google Cloud Speech API
Convert audio to text by applying powerful neural network models in an easy-to-use API
Cloud ML Engine
A managed service that enables you to easily build machine learning models that work on any type of data, of any size
Scalable Service
Managed distributed training infrastructure that supports CPUs and GPUs
HyperTune
Automatically tune your hyperparameters with HyperTune
Deep Learning Capabilities
Supports any TensorFlow model
Diagram (updated): Cloud Machine Learning Engine is added to the architecture of Cloud Pub/Sub (stream: events, metrics), Cloud Storage (batch: raw logs, databases, etc.), Cloud Dataflow, Cloud Dataproc (batch), and BigQuery.
Large-Scale Deep Learning for Intelligent Computer Systems, Jeff Dean, WSDM 2016
http://playground.tensorflow.org/
TensorKart
Self-driving MarioKart with
TensorFlow
http://kevinhughes.ca/blog/tensor-kart
Cucumber Sorter
“Farmers want to focus and spend their time on growing delicious vegetables.”
— Makoto Koike
Photos: Google Cloud Platform / Kaz Sato
https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist
Serverless
Less ops and administration
No waiting
Queries that used to take hours or days now take minutes or seconds
Machine Intelligence
Gives everyone access to deep learning systems
Thank you!
Ta Virot Chiraphadhanakul, PhD (@tvirot)
GDE in Machine Learning | Managing Director @ Skooldio

Más contenido relacionado

La actualidad más candente

Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
DataWorks Summit
 

La actualidad más candente (20)

Discover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statementDiscover BigQuery ML, build your own CREATE MODEL statement
Discover BigQuery ML, build your own CREATE MODEL statement
 
Your Raw Data is Ready - Introduction to Analytics Engineering | SMX Advanced...
Your Raw Data is Ready - Introduction to Analytics Engineering | SMX Advanced...Your Raw Data is Ready - Introduction to Analytics Engineering | SMX Advanced...
Your Raw Data is Ready - Introduction to Analytics Engineering | SMX Advanced...
 
Unified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model DeploymentUnified MLOps: Feature Stores & Model Deployment
Unified MLOps: Feature Stores & Model Deployment
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
 
Serverless with Google Cloud Functions
Serverless with Google Cloud FunctionsServerless with Google Cloud Functions
Serverless with Google Cloud Functions
 
Introduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted ConfIntroduction to Google Cloud Platform for Big Data - Trusted Conf
Introduction to Google Cloud Platform for Big Data - Trusted Conf
 
Amazon SageMaker
Amazon SageMakerAmazon SageMaker
Amazon SageMaker
 
BigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQLBigQuery ML - Machine learning at scale using SQL
BigQuery ML - Machine learning at scale using SQL
 
Understanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud PlatformUnderstanding cloud with Google Cloud Platform
Understanding cloud with Google Cloud Platform
 
Machine Learning on AWS
Machine Learning on AWSMachine Learning on AWS
Machine Learning on AWS
 
Google cloud platform
Google cloud platformGoogle cloud platform
Google cloud platform
 
Google Cloud Platform Training | Introduction To GCP | Google Cloud Platform ...
Google Cloud Platform Training | Introduction To GCP | Google Cloud Platform ...Google Cloud Platform Training | Introduction To GCP | Google Cloud Platform ...
Google Cloud Platform Training | Introduction To GCP | Google Cloud Platform ...
 
Introduction to Google Cloud Platform (GCP) | Google Cloud Tutorial for Begin...
Introduction to Google Cloud Platform (GCP) | Google Cloud Tutorial for Begin...Introduction to Google Cloud Platform (GCP) | Google Cloud Tutorial for Begin...
Introduction to Google Cloud Platform (GCP) | Google Cloud Tutorial for Begin...
 
Introduction to GCP BigQuery and DataPrep
Introduction to GCP BigQuery and DataPrepIntroduction to GCP BigQuery and DataPrep
Introduction to GCP BigQuery and DataPrep
 
Introduction to Google Cloud Platform
Introduction to Google Cloud PlatformIntroduction to Google Cloud Platform
Introduction to Google Cloud Platform
 
Google Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline PatternsGoogle Cloud and Data Pipeline Patterns
Google Cloud and Data Pipeline Patterns
 
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine LearningDeploy and Serve Model from Azure Databricks onto Azure Machine Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
 
bigquery.pptx
bigquery.pptxbigquery.pptx
bigquery.pptx
 
Introduction to AWS Glue
Introduction to AWS Glue Introduction to AWS Glue
Introduction to AWS Glue
 
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJAEvaluation of TPC-H on Spark and Spark SQL in ALOJA
Evaluation of TPC-H on Spark and Spark SQL in ALOJA
 

Similar a Data Science on Google Cloud Platform

Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
Guido Schmutz
 

Similar a Data Science on Google Cloud Platform (20)

Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
Building Modern Data Pipelines for Time Series Data on GCP with InfluxData by...
 
Data Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platformsData Ingestion in Big Data and IoT platforms
Data Ingestion in Big Data and IoT platforms
 
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
Integrating Google Cloud Dataproc with Alluxio for faster performance in the ...
 
Google Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teamsGoogle Cloud Platform for Data Science teams
Google Cloud Platform for Data Science teams
 
Architecting Solutions Leveraging The Cloud
Architecting Solutions Leveraging The CloudArchitecting Solutions Leveraging The Cloud
Architecting Solutions Leveraging The Cloud
 
Supercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuerySupercharge your data analytics with BigQuery
Supercharge your data analytics with BigQuery
 
Google cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache FlinkGoogle cloud Dataflow & Apache Flink
Google cloud Dataflow & Apache Flink
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
DataFinder: A Python Application for Scientific Data Management
DataFinder: A Python Application for Scientific Data ManagementDataFinder: A Python Application for Scientific Data Management
DataFinder: A Python Application for Scientific Data Management
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
IoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoTIoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoT
 
Organizing the Data Chaos of Scientists
Organizing the Data Chaos of ScientistsOrganizing the Data Chaos of Scientists
Organizing the Data Chaos of Scientists
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
 
Apache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing dataApache Beam: A unified model for batch and stream processing data
Apache Beam: A unified model for batch and stream processing data
 
Fundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvediFundamental question and answer in cloud computing quiz by animesh chaturvedi
Fundamental question and answer in cloud computing quiz by animesh chaturvedi
 
Giga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching OverviewGiga Spaces Data Grid / Data Caching Overview
Giga Spaces Data Grid / Data Caching Overview
 
Google Cloud lightning talk @MHacks
Google Cloud lightning talk @MHacksGoogle Cloud lightning talk @MHacks
Google Cloud lightning talk @MHacks
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 

Último

Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
only4webmaster01
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
MarinCaroMartnezBerg
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 

Último (20)

Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

Data Science on Google Cloud Platform

  • 1. Ta Virot Chiraphadhanakul, PhD (@tvirot) GDE in Machine Learning | Managing Director @ Skooldio Data Science on Google Cloud Platform Google Developers Launchpad Build for Cloud Meetup, Bangkok
  • 2. 90% of the data in the world today has been created in the last two years alone. — IBM
  • 3. Turning data into … ǡ Metrics Insights Data Products @tvirot
  • 4. Data Science Process  Collect Manipulate Analyze Model | Communicate @tvirot
  • 5. A lot of tools needed Pig Airflow @tvirot
  • 6. Administrative and operational issues • Deployment and configuration • Managing scale and optimizing utilization • Reliability • Resource provisioning • etc.
  • 8. How about a serverless big data stack that scales automatically?
  • 9. Serverless Data Processing • Focus on insights, not administration • Practically infinite scale, exactly when you need it • Pay only for what you use • Freedom to experiment, fail quickly, and iterate. Successful experiments are ready to go live right away
  • 10. Storage & Databases Big Data Machine Learning Data Science on Google Cloud Platform
  • 11.  Collect Manipulate Analyze Model | Communicate
  • 12. Storage & Databases Cloud Storage A scalable object storage service suitable for all kinds of unstructured data Cloud SQL A fully-managed database service that makes it easy to set up, maintain, manage, and administer your relational MySQL and PostgreSQL databases in the cloud Cloud Datastore A highly-scalable NoSQL database for your applications. Cloud Datastore automatically handles sharding and replication. Cloud BigTable A massively scalable NoSQL database suitable for low-latency and high-throughput workloads. It supports the open-source, industry- standard HBase API
  • 13. Fully-managed real-time messaging service that allows you to send and receive messages between independent applications Connect Anything to Everything
 Use Cloud Pub/Sub to publish and subscribe to data from multiple sources, reducing dependencies between components of distributed applications Highly Scalable
 Any customer can send up to 10,000 messages per second, by default
 
 Guaranteed Delivery
 Designed to provide “at least once” delivery Cloud Pub/Sub
  • 14.  Collect Manipulate Analyze Model | Communicate
  • 15. Fully-managed data processing service, supporting both stream and batch execution of pipelines Fully Managed
 Dynamically provision resources to minimize latency while maintaining high utilization efficiency Unified Programming Model
 Express computational requirements regardless of data source Cloud Dataflow
  • 22. https://cloud.google.com/dataflow/examples/wordcount-example
 // Create a Pipeline
 Pipeline p = Pipeline.create(options);
 // Read lines
 p.apply(TextIO.Read.from("gs://dataflow-samples/shakespeare/kinglear.txt"))
  // Tokenize lines into words
  .apply(ParDo.named("ExtractWords").of(new DoFn<String, String>() {
     @Override
     public void processElement(ProcessContext c) {
       for (String word : c.element().split("[^a-zA-Z']+")) {
         if (!word.isEmpty()) {
           c.output(word);
         }
       }
     }
  }))
  // Count words
  .apply(Count.<String>perElement())
  // Format strings
  .apply(MapElements.via(new SimpleFunction<KV<String, Long>, String>() {
     @Override
     public String apply(KV<String, Long> element) {
       return element.getKey() + ": " + element.getValue();
     }
  }))
  // Write to file
  .apply(TextIO.Write.to("gs://my-bucket/counts.txt"));
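The same steps (tokenize lines into words, count words, format strings) can be tried locally without the Dataflow SDK. The following is a minimal plain-Java sketch under that assumption; the class name `WordCountSketch` and its methods are hypothetical and are not part of the slides or of any Google library. It reuses the slide's split pattern and, like the slide's code, is case-sensitive.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java sketch of the WordCount steps; it does not use the Dataflow SDK.
class WordCountSketch {

    // Tokenize lines into words (same regex as the slide), then count each word.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split("[^a-zA-Z']+")))
                .filter(word -> !word.isEmpty())
                // TreeMap keeps the words sorted so the output order is stable.
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    // Format each (word, count) pair as "word: count".
    static List<String> formatCounts(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .map(entry -> entry.getKey() + ": " + entry.getValue())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // In-memory stand-in for the text file read by the pipeline.
        List<String> lines = Arrays.asList("to be or not to be");
        formatCounts(countWords(lines)).forEach(System.out::println);
    }
}
```

In the real pipeline each of these steps is a distributed transform; here they run in a single process, which is only useful for checking the logic on small inputs.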
  • 24. Diagram: events, metrics (stream) → Cloud Pub/Sub → Cloud Dataflow; raw logs, databases, etc. (batch) → Cloud Storage → Cloud Dataflow
  • 25. Cloud Dataproc: a managed Spark and Hadoop service that is fast, easy to use, and low cost
 Fast & Scalable Data Processing: Create a cluster in minutes and resize it at any time
 Affordable Pricing: Based on actual use, measured by the minute
 Open Source Ecosystem: Move existing projects or ETL pipelines without redevelopment
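To see why billing "measured by the minute" matters for short jobs, here is a small arithmetic sketch. The rate (60 cents per node-hour), the cluster size, and the class `BillingSketch` are hypothetical values chosen for easy arithmetic; they are not actual Cloud Dataproc prices, and any minimum charge is ignored.

```java
// Compares per-minute billing with billing rounded up to whole hours.
// All rates are hypothetical and expressed in cents to avoid floating point.
class BillingSketch {

    // Cost when every minute of use is billed (fractional cents are truncated).
    static long perMinuteCostCents(long minutesUsed, int nodes, long centsPerNodeHour) {
        return minutesUsed * nodes * centsPerNodeHour / 60;
    }

    // Cost when usage is rounded up to whole hours, for comparison.
    static long perHourCostCents(long minutesUsed, int nodes, long centsPerNodeHour) {
        long billedHours = (minutesUsed + 59) / 60; // round up to the next hour
        return billedHours * nodes * centsPerNodeHour;
    }

    public static void main(String[] args) {
        long minutes = 75;   // hypothetical job duration
        int nodes = 10;      // hypothetical cluster size
        long rate = 60;      // hypothetical cents per node-hour
        System.out.println("per-minute billing: " + perMinuteCostCents(minutes, nodes, rate) + " cents");
        System.out.println("per-hour billing:   " + perHourCostCents(minutes, nodes, rate) + " cents");
    }
}
```

With these made-up numbers, a 75-minute job on 10 nodes costs 750 cents under per-minute billing and 1200 cents when rounded up to two full hours.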
  • 27. Diagram: events, metrics (stream) → Cloud Pub/Sub; raw logs, databases, etc. (batch) → Cloud Storage; processing in Cloud Dataflow (stream and batch) and Cloud Dataproc (batch)
  • 28. Cloud Dataprep: an intelligent data service for visually exploring, cleaning, and preparing data
 Visually explore data
 Intelligent data manipulation
 Serverless and works at any scale
  • 30. BigQuery: Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
 Fully Managed: No infrastructure to manage, and you don't need a database administrator
 Speed & Scale: Scans terabytes in seconds and petabytes in minutes
 Convenience of SQL: Makes it more accessible
 Security & Reliability: Automatically encrypts and replicates data
  • 31. BigQuery: Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
 Flexible Data Ingestion: Load your data from Google Cloud Storage or Google Cloud Datastore, or stream it
 Fully Integrated: With other Google Cloud products and third-party applications
  • 33. Diagram: events, metrics (stream) → Cloud Pub/Sub; raw logs, databases, etc. (batch) → Cloud Storage; processing in Cloud Dataflow (stream and batch) and Cloud Dataproc (batch); BigQuery
  • 34.  Collect Manipulate Analyze Model | Communicate
  • 35. Cloud Datalab: an easy-to-use interactive tool for data exploration, analysis, visualization, and machine learning
 Integrated & Open Source: Built on Jupyter (formerly IPython). Enables analysis of your data on BigQuery, ML Engine, Compute Engine, and Cloud Storage
  • 37. Cloud Data Studio: turns your data into informative dashboards and reports that are easy to read, easy to share, and fully customizable
 Put all your data to work: Easily access all the data sources you need to understand your business and make better decisions
 Build engaging visualizations: Create beautiful charts and graphs that bring your data to life
 Leverage teamwork that works: Share and collaborate in real time. Work together quickly, from anywhere.
  • 40.  Collect Manipulate Analyze Model | Communicate
  • 41. Artificial Intelligence is the new electricity. — Andrew Ng
  • 42. AlphaGo: the first computer program to beat a professional human Go player (Photo: Nature)
  • 43. Waymo: the Google self-driving car project became Waymo, with a mission to make it easy and safe for people and things to move around (Photo: Waymo)
  • 44. Machine Learning engine and APIs
 Custom ML models: Machine Learning Engine, TensorFlow
 Pre-trained ML models: Vision API, Translation API, Natural Language API, Speech API, Jobs API
  • 45. Google Cloud Vision API: understand the content of images
 • Label Detection
 • Optical Character Recognition
 • Explicit Content Detection
 • etc.
 https://m.me/youpin.city | https://youpin.city/app
  • 46. Google Cloud Speech API: convert audio to text by applying powerful neural network models in an easy-to-use API
  • 47. Cloud ML Engine: a managed service that enables you to easily build machine learning models that work on any type of data, of any size
 Scalable Service: Managed distributed training infrastructure that supports CPUs and GPUs
 HyperTune: Automatically tunes your hyperparameters
 Deep Learning Capabilities: Supports any TensorFlow model
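To make "automatically tuning hyperparameters" concrete, here is a minimal sketch of random search, one common tuning strategy. It is not HyperTune's algorithm or API; the class `RandomSearchSketch`, the toy `validationLoss` function, and the search range are all assumptions made for illustration. In a real setting each trial would train a model and report a validation metric.

```java
import java.util.Random;

// Generic random search over one hyperparameter (learning rate).
// This illustrates the idea of automated tuning; it is NOT HyperTune.
class RandomSearchSketch {

    // Result of one trial: the sampled value and its loss.
    static final class Trial {
        final double learningRate;
        final double loss;

        Trial(double learningRate, double loss) {
            this.learningRate = learningRate;
            this.loss = loss;
        }
    }

    // Toy stand-in for "train a model and measure validation loss".
    // By construction it is minimized at learningRate = 0.01.
    static double validationLoss(double learningRate) {
        double distance = Math.log10(learningRate) + 2.0;
        return distance * distance;
    }

    // Sample learning rates log-uniformly from [1e-4, 1] and keep the best.
    static Trial search(long seed, int trials) {
        if (trials < 1) {
            throw new IllegalArgumentException("trials must be at least 1");
        }
        Random random = new Random(seed);
        Trial best = null;
        for (int i = 0; i < trials; i++) {
            double exponent = -4.0 + 4.0 * random.nextDouble();
            double learningRate = Math.pow(10.0, exponent);
            double loss = validationLoss(learningRate);
            if (best == null || loss < best.loss) {
                best = new Trial(learningRate, loss);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Trial best = search(42L, 200);
        System.out.println("best learning rate: " + best.learningRate + ", loss: " + best.loss);
    }
}
```

A managed tuning service replaces the loop above with parallel training jobs and typically a smarter sampling strategy, but the contract is the same: propose hyperparameters, measure a metric, keep the best.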
  • 48. Diagram: events, metrics (stream) → Cloud Pub/Sub; raw logs, databases, etc. (batch) → Cloud Storage; processing in Cloud Dataflow (stream and batch) and Cloud Dataproc (batch); BigQuery; Cloud Machine Learning Engine
  • 49. Large-Scale Deep Learning for Intelligent Computer Systems, Jeff Dean, WSDM 2016
  • 54. Cucumber Sorter: "Farmers want to focus and spend their time on growing delicious vegetables." (Makoto Koike) Photos: Google Cloud Platform / Kaz Sato
  • 57. Serverless: Less ops and administration
 No waiting: Queries that used to take hours or days now take minutes or seconds
 Machine Intelligence: Gives everyone access to deep learning systems
  • 59. Thank you! Ta Virot Chiraphadhanakul, PhD (@tvirot) GDE in Machine Learning | Managing Director @ Skooldio