2. WELCOME TO BIG DATA
Introduction
2017 has been a great year for big data, the silent companion in our lives. It seems that we are losing our fear of something
whose name, starting with BIG, is already intimidating.
Big data will no longer be used only by the big players: it will be present in, and become a fundamental axis of, the strategy of
any company. We are starting, at last, to democratize the use of data.
We have seen the birth of many companies that are sure to change the way we see and do things, giving us the
transparency and visibility the market needs.
GDPR is a hot topic nowadays. It does not come to change things but to regulate them, so that each of us knows
what position we are in at every moment and takes on the necessary responsibilities, leaving the user better informed and
therefore better protected.
3. INDEX
Introduction
What is big data?
The expanding digital universe
The future is now
Where do we get the data?
Method of collection
Data structure
Where is all that data stored?
Data Warehouse VS Data Lake
How does the data arrive at the DMP?
Data management platform
The 7 Vs of Big Data
Lack of professional data profiles
New Professional Profiles
Data transparency
GDPR
4. What is big data?
Big data is a term that describes the
large volume of data that inundates
a business.
But it’s not the amount of data
that’s important. It’s what
organizations do with the data that
matters. Big data can be analyzed
for insights that lead to better
decisions and strategic business
moves.
Source: Verve systems
Source: sas.com
5. THE EXPANDING DIGITAL UNIVERSE
The digital universe has been growing exponentially since 2013. It is estimated that, by
2020, the digital and data universe will reach a size that was unimaginable a few
years ago.
Source: @Tiffani Bova
6. THE FUTURE IS NOW
Big Data in Europe
Six million people in Europe worked in data-related jobs in 2015 and 6.16 million in
2016. As far as medium-term developments are concerned, it is estimated
that under a high-growth scenario the number of data workers in Europe will
increase to 10.43 million by 2020, a compound annual growth rate of 14.1%.
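The growth rate quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the rate is measured from the 2016 figure (6.16 million) to the 2020 forecast (10.43 million), that is, over four years; the function name is illustrative.

```python
def compound_growth_rate(start: float, end: float, years: int) -> float:
    """Return the compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


# Figures quoted on the slide, in millions of data workers.
workers_2016 = 6.16
workers_2020 = 10.43

# Assumption: growth is compounded over the four years from 2016 to 2020.
rate = compound_growth_rate(workers_2016, workers_2020, years=2020 - 2016)
print(f"Implied growth rate 2016-2020: {rate:.1%}")
```

Under that assumption the result comes out at roughly 14% a year, in line with the figure on the slide.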
7. THE FUTURE IS NOW
Source: IDC, Big Data Market Forecast,
Big Data in Spain
The big data market in Spain has grown exponentially over the past
few years, and in 2019 it is expected to reach $313.7 million.
Big Data market forecast in Spain
8. Where do we get the data?
Using big data, organizations can
generate actionable insights that
enable them to drive their business
forward. Rapid integration of the
ever-expanding pool of data sources
and types is opening a whole new
world of possibilities.
Source: nextgov.com
Source: columnfivemedia.com
9. Method of collection
#01 FIRST PARTY DATA
DATA COLLECTED DIRECTLY BY THE ORGANIZATION
Browser and server-side cookies set and recorded on visitors to web and app properties
Cross-device IDs
Device model, operating system, connection type, mobile network
Personally Identifiable Information (PII) like name, email, phone, postal address
Behavioural information such as who bought what and how often
#02 SECOND PARTY DATA
DATA SHARED BY A TRUSTED SOURCE
It is first party data made available for use by another organisation, shared with transparency. The organisation using the data knows where it came from, how it was collected, and what it signifies. The data sources are identical to those given above.
#03 THIRD PARTY DATA
AGGREGATED DATA FROM OTHER SOURCES
Data ingestion from multiple online and offline sources
Data storage
Data mapping
Customer profiling
Cross-device identity graphing
Activation in media buying platforms
Marketplaces for sharing and monetising data sets
Analytics
Source: xelsionmedia.com
10. Data structure
Unstructured: Data that does not
reside in fixed locations. The term generally
refers to free-form text, which is
ubiquitous.
Semi-structured: Between the two
forms, where “tags” or “structure”
are associated with or embedded within
unstructured data.
Structured: Data that resides in fixed
fields within a record or file.
Source: medium.com
THE CHALLENGE IS TO STRUCTURE ALL THAT DATA
Source: sherpasoftware.com
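The three kinds of data described on this slide can be illustrated with a short Python sketch. The sample values (a CSV row, a JSON document and a sentence of free text) are invented for illustration and do not come from the deck.

```python
import csv
import io
import json
import re

# Structured: fixed fields within a record (here, a CSV row under a header).
structured = io.StringIO("customer_id,product,quantity\n42,notebook,3\n")
row = next(csv.DictReader(structured))

# Semi-structured: tags embedded in the data describe it (here, JSON keys).
semi_structured = '{"customer_id": 42, "tags": ["stationery"], "note": "gift"}'
document = json.loads(semi_structured)

# Unstructured: free-form text; any structure has to be extracted from it.
unstructured = "Customer 42 called to say the 3 notebooks arrived late."
numbers = [int(n) for n in re.findall(r"\d+", unstructured)]

print(row["product"], document["tags"], numbers)
```

The structured and semi-structured samples can be read by field name, while the free text only yields bare numbers without their meaning, which is the structuring challenge the slide refers to.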
11. DATA WAREHOUSE (DWH)
Where is all that data stored?
A data warehouse is a large store of
data accumulated from a varied
range of sources within an
organization. It is used to guide
management decisions.
ETL (Extract, Transform, Load) is normally a continuous, ongoing
process with a well-defined workflow.
It extracts data from homogeneous
or heterogeneous data sources.
Then, data is cleansed, enriched,
transformed, and stored either back
in the lake or in a data warehouse.
Source: Xplenty
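As a minimal sketch of the ETL flow described above, the following Python example extracts a few records, cleanses them in application code, and only then loads them into a warehouse table. The sample records, the function names and the use of an in-memory SQLite database as the "warehouse" are assumptions made for illustration.

```python
import sqlite3

# Hypothetical raw records from heterogeneous sources (illustrative values).
raw_orders = [
    {"customer": " Alice ", "amount": "19.90", "country": "es"},
    {"customer": "Bob", "amount": "n/a", "country": "ES"},
    {"customer": "carol", "amount": "5.00", "country": "fr"},
]


def extract():
    """Extract: pull records from the source systems."""
    return list(raw_orders)


def transform(records):
    """Transform: cleanse and normalise the records before loading."""
    clean = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        name = record["customer"].strip().title()
        clean.append((name, amount, record["country"].upper()))
    return clean


def load(rows, connection):
    """Load: write the already transformed rows into the warehouse table."""
    connection.execute(
        "CREATE TABLE orders (customer TEXT, amount REAL, country TEXT)"
    )
    connection.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    connection.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(transform(extract()), warehouse)
etl_rows = warehouse.execute(
    "SELECT customer, amount, country FROM orders ORDER BY customer"
).fetchall()
print(etl_rows)
```

Only cleansed rows ever reach the `orders` table; the unparseable record is discarded before loading.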
12. DATA LAKE
A data lake is a storage repository or
a storage bank that holds a huge
amount of raw data in its original
format until it’s needed.
ELT (Extract, Load, Transform) is a
variant of ETL wherein the extracted
data is first loaded into the target
system. Transformations are
performed after the data is loaded
into the data warehouse. ELT
typically works well when the target
system is powerful enough to handle
transformations.
Source: Xplenty
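For contrast, here is a minimal ELT sketch: the raw data is loaded untouched and the transformation is then run by the target system itself. An in-memory SQLite database stands in for the target system, and the table names and sample values are assumptions made for illustration.

```python
import sqlite3

# Hypothetical raw records; values are kept exactly as they arrive.
raw_orders = [
    (" Alice ", "19.90", "es"),
    ("Bob", "n/a", "ES"),
    ("carol", "5.00", "fr"),
]

target = sqlite3.connect(":memory:")  # stand-in for the target system

# Extract + Load: land the raw data in the target system untouched.
target.execute(
    "CREATE TABLE raw_orders (customer TEXT, amount TEXT, country TEXT)"
)
target.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: the target system's own SQL engine does the cleansing.
target.execute(
    """
    CREATE TABLE orders AS
    SELECT TRIM(customer)       AS customer,
           CAST(amount AS REAL) AS amount,
           UPPER(country)       AS country
    FROM raw_orders
    WHERE amount GLOB '[0-9]*'  -- keep rows whose amount starts with a digit
    """
)
elt_rows = target.execute(
    "SELECT customer, amount, country FROM orders ORDER BY customer"
).fetchall()
raw_count = target.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]
print(raw_count, elt_rows)
```

The raw table keeps every record, including the one that fails cleansing, so the transformation can be rerun or changed later without going back to the sources.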
13. Data Warehouse VS Data Lake
             Data Warehouse                      Data Lake
DATA         Structured, processed               Structured, semi-structured, unstructured, raw
PROCESSING   Schema-on-write                     Schema-on-read
STORAGE      Expensive for large data volumes    Designed for low-cost storage
AGILITY      Less agile, fixed configuration     Highly agile, configure & reconfigure as needed
SECURITY     Mature                              Maturing
USERS        Business professionals              Data scientists
Source: datamation.com
14. How does the data arrive at the DMP?
Diagram labels: Data Lake, Data Warehouse, DMP, DSP.
Steps shown: Capture, Curate, Aggregate data, Build customer profiles, Engage customers, Activate.
Source: prnewswire.com
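The stages named in this diagram (capture, curate, aggregate, build customer profiles, activate) can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the event fields, the function names and the "two or more events" rule for the audience segment are hypothetical and are not taken from any real DMP.

```python
from collections import defaultdict

# Capture: hypothetical events collected from web and app properties.
events = [
    {"user_id": "u1", "category": "shoes", "device": "mobile"},
    {"user_id": "u1", "category": "shoes", "device": "desktop"},
    {"user_id": "u2", "category": "books", "device": "mobile"},
    {"user_id": None, "category": "shoes", "device": "mobile"},
]


def curate(raw_events):
    """Curate: discard events that cannot be tied to a user."""
    return [event for event in raw_events if event["user_id"]]


def build_profiles(clean_events):
    """Aggregate events into one profile per user."""
    profiles = defaultdict(lambda: {"interests": defaultdict(int), "devices": set()})
    for event in clean_events:
        profile = profiles[event["user_id"]]
        profile["interests"][event["category"]] += 1
        profile["devices"].add(event["device"])
    return profiles


def activate(profiles, category, min_events=2):
    """Activate: pick the audience segment to pass on to a buying platform."""
    return sorted(
        user_id
        for user_id, profile in profiles.items()
        if profile["interests"].get(category, 0) >= min_events
    )


profiles = build_profiles(curate(events))
segment = activate(profiles, "shoes")
print(segment)
```

In this sketch the anonymous event is dropped at the curation step, and only users with repeated interest in a category end up in the activated segment.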
16. THE 7 Vs OF BIG DATA
Chart with seven equal segments (14% each): Velocity, Variety, Veracity, Visualisation, Viability, Volume, Value
17. 1. VOLUME
The amount of data that is generated in our environment.
Source: Medium
18. 2. VELOCITY
The speed at which data is accessible.
19. 3. VARIETY
The forms, types and sources from which data are recorded.
Source: Medium
20. 4. VERACITY
Veracity is all about making sure the data is accurate.
Source: Medium
21. 5. VIABILITY
The capacity of companies to make effective use of the large volume of data that they handle.
Source: Medium
22. 6. VISUALIZATION
The importance of representing data visually, in an understandable pictorial or graphical format.
Source: Mtkander
23. 7. VALUE
Be sure that your organization is getting value from the data.
Getting Business Value from Big Data
#01 Estimate: Estimate expenditure & hardware investment
#02 Analyze: Analyze streaming Big Data
#03 Integrate: Integrate Big Data with older enterprise sources
#04 Discover: Discover new business opportunities
Big Data is the ability to achieve greater value through insights from superior analytics
Source: Medium
25. New Professional Profiles
DATA SCIENTIST: Cleans, massages and organizes (big) data
DATA ANALYST: Collects, processes and performs statistical data analysis
DATA ARCHITECT: Creates blueprints for data management systems to integrate, centralize, protect and maintain data sources
DATA ENGINEER: Develops, constructs, tests and maintains architectures (such as databases and large-scale processing systems)
STATISTICIAN: Collects, analyzes and interprets qualitative as well as quantitative data with statistical theories and methods
DATABASE ADMINISTRATOR: Ensures that the database is available to all relevant users, is performing properly and is being kept safe
Source: Pinterest.com
26. DATA TRANSPARENCY AND VISIBILITY
Source: businessinsider.com
Transparency is a key element when purchasing data. Advertisers need
to know the source the data is coming from, and data suppliers
should always provide this information.
85% of marketing and business experts think that it is necessary
to increase the visibility of the data used to define audiences in order to
take real advantage of all that data.
27. DATA QUALITY
Data quality refers to the quality of a set of information collected in a database, an information system or a data
warehouse, with attributes such as accuracy, integrity, currency, coherence, relevance, accessibility and reliability,
which are necessary for the data to be useful for processing, analysis, and any other purpose a user may have.
Source: powerdata.es
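Some of these attributes can be measured with very simple checks. The sketch below is an assumption-laden illustration rather than a standard method: the records, the field names and the three proxies (non-empty fields for completeness, distinct identifiers for integrity, a maximum age for being up to date) are all hypothetical.

```python
from datetime import date

# Hypothetical customer records; field names are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2017, 11, 2)},
    {"id": 2, "email": "", "updated": date(2017, 10, 15)},
    {"id": 2, "email": "luis@example.com", "updated": date(2015, 1, 8)},
]


def quality_report(rows, as_of, max_age_days=365):
    """Return the share of rows that pass three simple quality proxies."""
    total = len(rows)
    # Completeness: the email field is filled in.
    complete = sum(1 for r in rows if r["email"])
    # Integrity: identifiers are not duplicated.
    distinct_ids = len({r["id"] for r in rows})
    # Currency: the record was updated within the allowed age.
    fresh = sum(1 for r in rows if (as_of - r["updated"]).days <= max_age_days)
    return {
        "completeness": complete / total,
        "uniqueness": distinct_ids / total,
        "freshness": fresh / total,
    }


report = quality_report(records, as_of=date(2017, 12, 31))
print(report)
```

Each score is a fraction between 0 and 1, so a team could track them over time or set a threshold below which a data set is not used.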
28. The GDPR was born of a need to
regulate the flow of data and protect
it, requiring organisations to develop clear policies and
procedures to protect personal data
and to adopt appropriate technical and
organisational measures.