3. 3
● Brazilian Data Analyst
● Database Management Student
● Google fan
● Mom of 1 / Pet Mom of 8
● Plant Based Geek
● Crazy about Nature
4. 4
WHAT AM I GOING TO TALK ABOUT?
■ Big Data Beyond the Hype
[ What Is | The 5 Vs ]
■ What is the Google Cloud Platform?
[ What Is | The Ecosystem ]
■ GCP Products for Big Data
[ Example of Big Data Lifecycle | Ingesting | Storing | Processing | Analysing ]
■ GCP Big Data Solutions for IMWT's Portfolio
[ Challenges | Example | Steps to Success ]
6. 6
WHAT IS BIG DATA?
High-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Source: Gartner IT Glossary
7. 7
BIG DATA: THE 5 Vs
Source: Adapted from Michael Walker (2012)
■ VOLUME: Data at Rest
[ Terabytes to exabytes of existing data to process ]
■ VELOCITY: Data in Motion
[ Milliseconds to seconds to process ]
■ VARIETY: Data in Many Forms
[ Structured, unstructured, text, multimedia... ]
■ VERACITY: Data in Doubt
[ Uncertainty due to data inconsistency, incompleteness, ambiguities, model approximations... ]
■ VALUE: Data into Money
[ Business models can be associated with the data ]
9. 9
WHAT IS GOOGLE CLOUD PLATFORM?
A suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products.
Source: GCP Website (2018)
14. 14
INGESTION
Source: GCP Website (2018)
Serverless, fully managed, scalable and pay-for-use platform for apps and backends.
Save money by focusing on code rather than infrastructure.
Integrated, open and global real-time event
stream ingestion, delivery and analysis
platform.
Fast reporting, targeting and
optimization in advertising and media
15. 15
PROCESSING
Source: GCP Website (2018)
Simple, automated
and reliable stream
and batch data
processing platform.
Fast, easy-to-use and
fully managed cloud
service for running
Apache Spark and
Hadoop clusters.
Minimize latency and
maximize utilization.
Low costs. Focus on the
data, not on the cluster.
16. 16
STORAGE
Source: GCP Website (2018)
In-memory, relational,
non-relational, object
and warehouse cloud
storage solutions.
Secure, cost-effective and easily
accessible storage for every need.
17. 17
EXPLORATION
Source: GCP Website (2018)
Easy-to-use and interactive
tool for data exploration,
analysis, visualization and
machine learning.
Fast, scalable, cost-effective
and fully managed cloud
data warehouse for
analytics.
Set of integrated data and
marketing analysis products.
Free. May incur charges for compute,
storage and other cloud services.
Serverless, with built-in
machine learning.
18. 18
ANALYTICS
Source: GCP Website (2018)
Fast, large-scale and easy-to-use
AI products and services.
Easy-to-use deep learning
models for speech-to-text /
image-to-JSON conversion
and dynamic translation.
Pre-trained models.
No advanced ML
skills required.
Better training performance
compared to other
deep learning systems.
22. 22
Source: Adapted from IBM (2014)
STEPS TO SUCCESS
Identify high-value opportunities
Establish the right architecture and funding model
Prove value to business through pilot programs
Scale by expanding to additional use cases
Transform to a data-driven culture