Data & Analytics
3rd Generation Data Platform:
From Buckets of Bits to Understanding
Data is Everything.
How well you use your data can determine the degree of your success.
[Figure: Street View image with detected elements labelled: street name, street number, sign, business facade, business name, traffic light, traffic sign]
3rd Gen Data Platform Challenges
Data access to a variety of data sources.
Develop and build analytic models.
Data preparation, exploration and visualization.
Deploy models and integrate them into business processes and applications.
High performance and scalability for both development and deployment.
Perform platform, project and model management.
“Google is living a few years in the future and sending the rest of us messages”
– Doug Cutting, Hadoop Co-Creator
Google Cloud Platform Vision
Single-node computing: “Some assembly required”
1st Wave: Colocation. Your kit, someone else’s building. Yours to manage.
2nd Wave: Today's Cloud: Virtualized Data Centers. Standard virtual kit, for rent. Still yours to manage.
3rd Wave: True, on-demand cloud. An actual, global elastic cloud. Invest your energy in great apps.
Bridging the Waves
Analyze, Store, Capture
BigQuery (large scale SQL)
Cloud Machine Learning
Cloud Pub/Sub
Logs, App Engine
BigQuery streaming
Process
Cloud Dataflow (stream and batch)
Cloud Storage (objects)
Cloud SQL (SQL)
Cloud Datastore (NoSQL)
BigQuery (structured)
Cloud Dataproc (Hadoop & Ecosystem)
Cloud Bigtable (NoSQL HBase)
Cassandra, HBase, MongoDB, RabbitMQ, Kafka
Wave 2
Wave 3
Visualise
Cloud Datalab (IPython/Jupyter)
Data Studio 360
Tableau, Qlik
Google Cloud Data Platform
Exploration & Collaboration
Databases / Storage
Data Preparation & Processing
Analytics
Advanced Analytics & Intelligence
Mobile apps
Sensors and devices
Web apps
Relational
Key-value
Document
SQL
Wide Column
Object
Stream processing
Batch processing
Data preparation
Federated query
Data catalog
Data exploration
Data visualization
Developers
Data scientists
Business analysts
Development environment for Machine Learning
Pre-Trained Machine Learning models
Data Ingestion
Messaging
Logs
Data Preparation & Processing
Cloud Dataflow
Cloud Dataproc
Exploration & Collaboration
Google BigQuery
Cloud Datalab
Google Analytics 360
Cloud Dataproc
Google Cloud Data Platform
Mobile apps
Sensors and devices
Web apps
Developers
Data scientists
Business analysts
Data Ingestion
Cloud Pub/Sub
App Engine
Databases/Storage
Cloud SQL
Cloud Bigtable
Cloud Datastore
Cloud Storage
Analytics
Google BigQuery
Google Analytics 360
Cloud Dataproc
Google Drive
Advanced Analytics & Intelligence
Cloud Machine Learning
Translate API
Vision API
Speech API
Managed Data Services - Focus on Insight vs Infrastructure
PB+ Scale, No-Ops, Batch & Streaming of Data
Insights/Programming
Resource Provisioning
Performance Tuning
Monitoring
Reliability
Deployment & Configuration
Handling Growing Scale
Utilization Improvements
Insights/Programming
"We are very excited about the productivity benefits offered by Cloud Dataflow and Cloud Pub/Sub. It took half a day to rewrite something that had previously taken over six months to build using Spark"
Paul Clarke, Director of Technology, Ocado
http://googlecloudplatform.blogspot.co.uk/2015/08/Announcing-General-Availability-of-Google-Cloud-Dataflow-and-Cloud-Pub-Sub.html
Hadoop + Local SSD
5X the IOPS at 0.5 the cost of AWS local SSD
Up to 1.5TB per instance
680,000 read IOPS and < 1 ms latency
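The two ratios on this slide can be combined into a single price-performance figure. The following is only a worked-arithmetic sketch: it assumes "5X the IOPS" and "0.5 the cost" are both measured against the same AWS local SSD baseline, and the variable names are illustrative.

```python
# Ratios quoted on the slide, relative to an AWS local SSD baseline of 1.0.
iops_ratio = 5.0   # "5X the IOPS"
cost_ratio = 0.5   # "at 0.5 the cost"

# IOPS delivered per unit of cost, relative to the same baseline.
iops_per_cost_ratio = iops_ratio / cost_ratio

print(f"IOPS per unit of cost vs baseline: {iops_per_cost_ratio:.0f}x")
```

Under that assumption, the slide's claim amounts to roughly ten times the IOPS per unit of spend.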
– Mattias P Johansson, Software Engineer, Spotify
“With Google Cloud Platform, we benefitted by having a virtual supercomputer on demand, without having to deal with all the usual space, power, cooling and networking issues. Just a few years ago, we would have needed to use the largest supercomputers on the planet to do what we’re now able to do with Google”
– Mark Johnson, CEO, Descartes Labs
“Right at the start of the partnership we were able to reduce time to insight from 96 hours to 30 minutes by using BigQuery.”
– Gary Sanders, Head of Digital Analytics, Lloyds Banking Group
“Everyone involved unanimously picked GCP. It came down to this: we believe the core technology is better.”
– Peter Bakkam, Platform Lead, Quizlet
Do you feel this way about your Data Warehouse?
Data Warehouse is the foundation of something bigger
Data Warehouses/Lakes
Machine Intelligence
Predictive + Prescriptive analytics = Advanced analytics
Cloud
On Premises
Machine Learning APIs
Train your own Models
Machine intelligence is already making a huge difference and there are many, many more opportunities
1. Identify categorizations that provide value, categories you’re already evaluating for by hand today
2. Capture lots of examples of correct evaluations for that categorization, and use them to train an ML model
3. Evaluate the model by applying it against additional manually categorized data, correct and tune
4. Automatically categorize, and automatically extract value
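The four steps on this slide (identify a categorization, capture labelled examples, evaluate on further hand-labelled data, then categorize automatically) can be sketched with a tiny naive Bayes text classifier. This is a minimal illustration in plain Python, not the method used in the talk or any Google Cloud API; the support-ticket categories, the example texts and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def train(examples):
    """Step 2: learn per-category word counts from hand-labelled examples."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return {"words": word_counts, "docs": doc_counts, "vocab": vocab}


def predict(model, text):
    """Step 4: return the category with the highest naive Bayes log-score."""
    total_docs = sum(model["docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["docs"].items():
        counts = model["words"][label]
        n_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for word in tokenize(text):
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (n_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labelled):
    """Step 3: fraction of additional hand-labelled items predicted correctly."""
    hits = sum(predict(model, text) == label for text, label in labelled)
    return hits / len(labelled)


# Step 1: a categorization already done by hand (hypothetical support tickets).
training = [
    ("refund my order please", "billing"),
    ("charged twice on my card", "billing"),
    ("invoice shows wrong amount", "billing"),
    ("app crashes on login", "technical"),
    ("cannot reset my password", "technical"),
    ("error when uploading file", "technical"),
]
held_out = [
    ("card charged wrong amount", "billing"),
    ("login error in app", "technical"),
]

model = train(training)
print("held-out accuracy:", accuracy(model, held_out))
print("new ticket ->", predict(model, "password reset error"))
```

In practice the held-out set would be far larger, and a low accuracy at step 3 would send you back to collect more examples or tune the model before automating step 4.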
Rapidly accelerating use of deep learning at Google
Used across areas: AlphaGo, Android, Apps, Gmail, Maps, Photos, Robotics, Speech, Search, Translation, YouTube and many others ...
[Chart: number of directories containing model description files, 2012 to 2015; y-axis from 0 to 1500]
Ready to use Machine Learning models
Cloud Translate API (GA)
Cloud Vision API (GA)
Cloud Speech API (Beta)
Natural Language API (Beta)
“Machine learning will drive every successful huge IPO win in the next 5 years.”
Eric Schmidt, Executive Chairman, Alphabet Inc
Now: Stores and Analyzes
Next: Understands
Thanks!

"Big Data & Machine Learning Innovation with Google Cloud Platform", Shira Kimchi Google Cloud Platform, Business Manager of MEA


Editor's notes

  1. ML + Google = :-) The mission is to organize the world’s information. Information = data; data = oxygen. Use of data can determine success.
  2. There is a lot of information in pictures: 23 billion words in Wikipedia, 40 billion textual lines in Street View. Make the point that we didn’t realize how valuable the pictures were originally, and that we later revisited them and extracted all this additional value. Willing to bet that everyone in the audience has similarly valuable data.
  3. Data access to a variety of data sources. The Gartner advanced analytics customer reference survey indicates that while the majority of users are analyzing transactional data, new data sources (such as text, log and sensor data, and location data) are becoming increasingly common.
     Data preparation, exploration and visualization is a key area of functionality, as analysis is performed by users who may lack familiarity with the data and have increasingly high expectations of tools for automating data discovery, visualization and preparation.
     The ability to develop and build analytic models, including clustering, classification and predictive models, forecasting models, simulation models and optimization models.
     The ability to deploy models and integrate them into business processes and applications. Deployment is a significant pain point for many organizations, so allowing easy adoption of models as part of a business process or application, rather than just exporting them as code or a database score, improves project success rates.
     Capabilities to perform platform, project and model management. Being able to validate the performance of models and track them once deployed is necessary; the ability to reuse models and audit their development and usage can be mandatory, rather than just desired, in certain more regulated industries and environments.
     High performance and scalability for both development and deployment. The ability to perform at high levels of speed and accuracy with large volumes of data and streaming data is still critical for organizations, and with rising data volumes it becomes even more of a differentiator.
  4. Speaker: Right now, Big Data = Big problems.
     1 - Removing the complexity of building and maintaining a Big Data system: Unlike other cloud services, Google provides the industry’s only NoOps Big Data platform. NoOps means that application developers will never have to speak with an operations professional again. NoOps achieves this by using cloud infrastructure-as-a-service to give developers the resources they need when they need them.
     2 - Capture and store all data for all business functions: Developers can capture data using Pub/Sub or by porting data from other Google services (e.g. Google Analytics). In addition, Google Cloud Storage offers developers durable and highly available object storage. Google created three simple storage product options to help developers improve the performance of their applications while keeping their costs low. These three product options use the same API, providing a simple and consistent method of access.
     3 - Continuously accommodating greater data volumes and new data sources: Google understands that the amount of data companies have to store and analyze is growing exponentially. This is why we’re constantly innovating in order to offer cheaper and faster storage services (Nearline), but also to make analysis tools such as BigQuery faster.
     4 - Finding value in existing data very easily: Google BigQuery is designed to make it easy to analyze large amounts of data quickly. BigQuery enables analysts and developers to run fast SQL-like join and aggregate queries on datasets without the need for batch-based processing.
     5 - Reducing the time from data collection to action: Google Cloud Platform for Big Data offers a proven and integrated end-to-end solution to make sense of large amounts of information in a very short amount of time. The end-to-end process of data management happens in the following stages:
         Capture data using Pub/Sub, or by porting data from other Google services, e.g. Google Analytics.
         Process data using Dataflow or 3rd party offerings, e.g. Hadoop.
         Store data using Google Cloud Storage (Standard, DRA and Nearline), Bigtable or BigQuery.
         Analyze data using BigQuery or 3rd party offerings, e.g. Spark.
     6 - Removing the hurdles to innovate and iterate with Big Data: Google has led the industry with innovations in software infrastructure such as MapReduce, Bigtable and Dremel. Today, Google is pushing the next generation of innovation with products such as Spanner and Flume. When you build on Cloud Platform, you get access to Google’s technology innovations faster.
  5. By 2020, predictive and prescriptive analytics will attract 40% of enterprises' net new investment in business intelligence and analytics. By 2018, more than half of large organizations globally will compete using advanced analytics and proprietary algorithms, causing the disruption of entire industries. Advanced analytics is the analysis of all kinds of data using sophisticated quantitative methods (such as statistics, descriptive and predictive data mining, machine learning, simulation and optimization) to produce insights that traditional approaches to business intelligence (BI), such as query and reporting, are unlikely to discover.
  6. Understanding that the space is categorizable and testable. Categorically. Do you have sources of data that you have already categorized correctly, where humans have interpreted the data and put it into categories? This might be a computationally/operationally expensive way to do it... how little data do I need? Test and tune. A process/app can be used to collect examples. Automatic clustering/k-means: are there things grouped in there, categories I can’t see? Some groupings will seem irrelevant or artificial. Google products: AlphaGo, Apps, Maps, Photos, Gmail, Speech, Android, YouTube, Translation, Robotics Research, Image Understanding, Natural Language Understanding, Drug Discovery. Outside of Google, the most popular use cases are:
  7. To summarize, Cloud Machine Learning provides the latest innovations in vision and speech from Google Research and from services like Photos, the Google app, Translate, and Inbox. These ML-driven capabilities are now available as simple APIs: Translate API, Vision API, and Speech API. Translate and Vision are fully launched. Cloud Speech and Natural Language have entered Beta. We’re very excited to bring this innovative technology to you guys.
  8. Thanks so much!
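Note 4 above describes BigQuery as running "SQL-like join and aggregate queries" directly on datasets. The shape of such a query can be sketched locally; the example below uses Python's built-in sqlite3 module purely as a stand-in engine (it is not BigQuery and says nothing about BigQuery's performance or exact SQL dialect), and the `customers` and `orders` tables are hypothetical.

```python
import sqlite3

# In-memory database as a local stand-in; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'DE'), (2, 'DE'), (3, 'FR');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 10.0), (13, 3, 7.5);
    """
)

# Join orders to customers, then aggregate order count and revenue per country.
rows = conn.execute(
    """
    SELECT c.country, COUNT(*) AS n_orders, SUM(o.amount) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.country
    ORDER BY revenue DESC
    """
).fetchall()

for country, n_orders, revenue in rows:
    print(country, n_orders, revenue)
```

The point of the note is that the analyst writes only the query; where and how it executes at scale is the managed service's concern.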
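Note 6 above mentions automatic clustering with k-means as a way to find groupings that nobody has labelled yet. A minimal sketch of the algorithm in plain Python follows; the 2-D points are invented, and seeding the centroids with the first k points is an assumption made only to keep the run deterministic (real implementations usually seed randomly or with k-means++).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [points[i] for i in range(k)]  # assumption: deterministic seeding
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabelled data with two visually obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, k=2)

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

As the note warns, some of the groupings an algorithm like this finds will look irrelevant or artificial, so the clusters still need a human to judge whether they correspond to useful categories.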