DEBUNKING THE MYTHS 
Speaker 10 of 17 
Martin Willcox 
@Willcoxmnk 
What is a Data Lake, Anyway? 
Followed by 
Anthony Miller
One of the Big Data labels that we risk over-loading to 
complete abstraction is the idea of a “Data Lake”… 
2 © 2014 Teradata 
“…store all data present 
and future and create a 
centralised data archive 
location.” 
“A large 
object-based 
repository that 
holds data in 
its native 
format” 
“Sometimes 
called the bit 
bucket or the 
landing zone” 
“All Water 
and Little 
Substance” 
“As more and more applications 
are created that derive value 
from… new types of data… the 
Data Lake forms”
“Data lakes can 
help resolve the 
nagging problem of 
accessibility and 
data integration” 
…and some of the discussions sound eerily familiar 
Data accessibility 
and integration? 
Isn’t that what the 
Data Warehouse is 
for?
So is the Data Lake a new architectural construct? 
Or are we just re-platforming Data Marts? 
Simple, single subject area Dimensional 
Data Marts – with all of the dimensions 
pre-joined to the fact table? One-per-workload 
/ application? 
Is this really the future of Enterprise 
Analytics? Or circa 1995 silo, 
departmental Decision Support Systems 
warmed-over?
Take the merits of the different technologies out of the 
equation – and this is what some of us are thinking… 
…but there are no free lunches in Information 
Management – merely more and different options 
Explicit, or implicit, there 
is always, always, always 
(at least one) schema 
Agile application 
development, versus 
agile data acquisition 
None of the information 
management 
strategies / technologies 
are magic - “pay me 
now, or pay me later”
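A minimal sketch of the "always at least one schema" point, assuming a handful of hypothetical JSON events that were landed raw with no declared structure. The field names (`customer_id`, `amount`, `currency`, `ccy`) and the helper `read_with_schema` are illustrative assumptions, not part of the talk. The schema still exists; it has simply moved into the reading code, which is where the deferred cost gets paid.

```python
import json

# Hypothetical raw events, landed as-is with no declared schema.
RAW_EVENTS = [
    '{"customer_id": 17, "amount": "12.50", "currency": "GBP"}',
    '{"customer_id": "18", "amount": 7, "ccy": "GBP"}',  # drifted field name and types
    '{"cust": 19, "amount": "n/a"}',                     # missing key, malformed value
]


def read_with_schema(line):
    """Apply the implicit schema at read time; return None if the record does not fit."""
    record = json.loads(line)
    try:
        return {
            "customer_id": int(record["customer_id"]),
            "amount": float(record["amount"]),
            # The reader has to know about every historical spelling of the field.
            "currency": record.get("currency") or record.get("ccy"),
        }
    except (KeyError, ValueError, TypeError):
        return None


parsed = [read_with_schema(line) for line in RAW_EVENTS]
usable = [row for row in parsed if row is not None]
rejected = len(parsed) - len(usable)
print(f"usable={len(usable)} rejected={rejected}")
```

Every consumer of the raw files has to repeat, or share, this kind of interpretation logic. Loading into a modelled store would instead reject or repair the third record once, at write time.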
Big Data Are Plural 
For the foreseeable future, we will need multiple Information 
Management strategies - and multiple Information 
Management technologies 
DATA WAREHOUSE 
DISCOVERY PLATFORM 
Integration 
becomes a 
critical concern 
DATA 
PLATFORM 
– Gartner – 
Logical Data Warehouse 
– Forrester – 
Enterprise Data Hub 
– Teradata – 
Unified Data Architecture
A definition of the Data Lake (Data Reservoir) 
A centralised, consolidated, persistent store of raw, un-modelled and un-transformed data from 
multiple sources / silos (without an explicit, pre-defined schema, without externally defined metadata – 
and without guarantees about the quality, provenance and security of the data) 
Agile data acquisition – 
a haystack to go looking 
for needles… 
…with a natural storage 
model for complex, 
multi-structured data… 
…support for efficient 
non-relational 
computation… 
…and provision for cost-effective 
storage of large 
and noisy data-sets.
Now that is new, interesting and (potentially) very, very useful… 
Data. Science
Left to its own devices, does nature tend to give us a single, beautiful lake? Or a messy patchwork of lakes, plural? 
STOP PRESS: Laws of Physics* Unchanged! 
(* More specifically, the 2nd Law of Thermodynamics) 
None of the new information management strategies and technologies is by itself a cure 
for information entropy – data silos form naturally, just like lakes form naturally
Summary and conclusions

High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...kumargunjan9515
 
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...
Sonagachi * best call girls in Kolkata | ₹,9500 Pay Cash 8005736733 Free Home...HyderabadDolls
 

Martin Willcox - What is a Data Lake, Anyway?

  • 1. DEBUNKING THE MYTHS. Speaker 10 of 17: Martin Willcox (@Willcoxmnk), “What is a Data Lake, Anyway?” Followed by Anthony Miller.
  • 2. One of the Big Data labels that we risk over-loading to complete abstraction is the idea of a “Data Lake”… “…store all data present and future and create a centralised data archive location.” “A large object-based repository that holds data in its native format.” “Sometimes called the bit bucket or the landing zone.” “All Water and Little Substance.” “As more and more applications are created that derive value from… new types of data… the Data Lake forms.” © 2014 Teradata
  • 3. “Data lakes can help resolve the nagging problem of accessibility and data integration” …and some of the discussions sound eerily familiar. Data accessibility and integration? Isn’t that what the Data Warehouse is for?
  • 4. So is the Data Lake a new architectural construct? Or are we just re-platforming Data Marts? Simple, single-subject-area Dimensional Data Marts, with all of the dimensions pre-joined to the fact table? One per workload / application? Is this really the future of Enterprise Analytics? Or circa-1995 siloed, departmental Decision Support Systems, warmed over?
  • 5. Take the merits of the different technologies out of the equation – and this is what some of us are thinking…
  • 6. …but there are no free lunches in Information Management – merely more and different options. Explicit, or implicit, there is always, always, always (at least one) schema. Agile application development, versus agile data acquisition. None of the information management strategies / technologies are magic: “pay me now, or pay me later”.
  • 7. Big Data Are Plural. For the foreseeable future, we will need multiple Information Management strategies and multiple Information Management technologies: DATA WAREHOUSE, DISCOVERY PLATFORM, DATA PLATFORM. Integration becomes a critical concern. Gartner – Logical Data Warehouse; Forrester – Enterprise Data Hub; Teradata – Unified Data Architecture.
  • 8. A definition of the Data Lake (Data Reservoir): a centralised, consolidated, persistent store of raw, un-modelled and un-transformed data from multiple sources / silos (without an explicit, pre-defined schema, without externally defined metadata – and without guarantees about the quality, provenance and security of the data). Agile data acquisition – a haystack to go looking for needles… …with a natural storage model for complex, multi-structured data… …support for efficient non-relational computation… …and provision for cost-effective storage of large and noisy data-sets. Now that is new, interesting and (potentially) very, very useful…
  • 9. Data. Science
  • 10. Left to its own devices, does nature tend to give us a single, beautiful lake? Or a messy patchwork of lakes, plural? STOP PRESS: Laws of Physics* Unchanged! (* More specifically, the 2nd Law of Thermodynamics.) None of the new information management strategies and technologies is by itself a cure for information entropy – data silos form naturally, just like lakes form naturally.
  • 11. Summary and conclusions
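
Slide 6's claim that "there is always (at least one) schema" can be illustrated with a small sketch. It is not taken from the talk: the records, field names and the `total_amount` helper below are hypothetical. The data is landed raw with no declared schema, yet the reading code still has to assume which fields exist and how to interpret them, which is the "pay me later" half of the trade-off.

```python
import json

# Hypothetical raw records, landed as-is with no declared schema.
RAW_LINES = [
    '{"customer_id": 1, "amount": "19.99"}',
    '{"customer_id": 2, "amount": 5}',
    '{"cust_id": 3, "amount": "7.50"}',  # a source that names the key differently
]


def total_amount(lines):
    """Sum 'amount' over the records that match the reader's assumptions.

    The field names and the numeric coercion below are the implicit
    schema: nothing was declared at write time, so it is applied at read time.
    """
    total = 0.0
    rejected = []
    for line in lines:
        record = json.loads(line)
        # Implicit schema check: both fields must be present.
        if "customer_id" not in record or "amount" not in record:
            rejected.append(record)
            continue
        # Implicit type rule: amounts may arrive as strings or numbers.
        total += float(record["amount"])
    return total, rejected


total, rejected = total_amount(RAW_LINES)
print(f"total={total:.2f}, rejected={len(rejected)}")
```

The third record is rejected only because this particular reader expects `customer_id`; a different consumer could apply a different implicit schema to the same raw data and reach a different answer.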