When and How Data Lakes Fit into a Modern Data Architecture
Presented by: William McKnight
“#1 Global Influencer in Data Warehousing” Onalytica
President, McKnight Consulting Group
An Inc. 5000 Company in 2018 and 2017
@williammcknight
www.mcknightcg.com
(214) 514-1444
Second Thursday of Every Month, at 2:00 ET
#AdvAnalytics
From Data Lakes to Data Experiences
Joel McKelvey, Director, Product Marketing
Digital-fueled Growth is the Top Investment Priority For Technology Leaders.1
Rebalance your technology portfolio toward digital transformation
Percent of respondents increasing investment / decreasing investment:
Business Intelligence or data analytics solution: 45% / 1%
Cyber/information security: 40% / 1%
Cloud services or solutions (SaaS, PaaS, etc.): 33% / 2%
Core system improvements/transformation: 31% / 10%
1 https://emtemp.gcom.cloud/ngw/globalassets/en/information-technology/documents/trends/gartner-2019-cio-agenda-key-takeaways.pdf
Insights-driven businesses harness and implement digital insights strategically and at scale to drive growth and create differentiating experiences, products, and services.1
7x Faster growth than global GDP
30% Growth or more using advanced analytics in a transformational way
2.3x More likely to succeed during disruption
1 https://www.forrester.com/report/InsightsDriven+Businesses+Set+The+Pace+For+Global+Growth/-/E-RES130848
Looker Data Platform
Governed metrics | Best-in-class APIs | In-database | Git version-control | Security | Cloud
Integrated Insights
Contextual | Passive | Where you work
Sales reps at Slack have the metrics their customers care about most within a pre-populated slide deck
Sales reps have more context on customer calls with valuable usage data embedded within Salesforce
Data-driven Workflows
Operational | Time-sensitive | Task-driven
Reduce churn with automated alerts and email follow ups for success managers based on customer health
Increase ROI on digital ad spend by optimizing bids in real-time with ML ‘bid-bot’, trained with governed data
Custom Applications
Job to be done | Larger purpose
Top Broadcaster: Custom application ensures ads are sold for the optimal price, regardless of time slot or market
Top Retailer: Maintain optimal inventory levels and pricing with merchandising and supply chain management application
Modern BI & Analytics
Analytical | Explorable | Data-centric
Namely customers access reports and dashboards to better understand their staffing needs and trends
Holistic understanding of customers with a 360-degree view across channels: web, apps, print, and more
Data Lake
1 in 2 customers integrate insights/experiences beyond Looker
2000+ Customers
5000+ Developers
900+ Employees
Santa Cruz, San Francisco, New York, Chicago, Boulder, Tokyo, Dublin, London
Empower people with the smarter use of data
Looker Recognized as Challenger in the Gartner 2020 Magic Quadrant for Analytics and Business Intelligence Platforms
“The growing demand for tools that close the gap between discovering insights and taking action is creating a profound change in the way we use data in the workplace. At Looker our vision is to meet this demand by enabling data experiences that go far beyond traditional business intelligence.”
- Nick Caldwell, Chief Product Officer at Looker
1 Gartner “Magic Quadrant for Analytics and Business Intelligence Platforms,” by James Richardson, Rita Sallam, Kurt Schlegel, Austin Kronz, and Julian Sun, February 13, 2020
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
William McKnight
President, McKnight Consulting Group
• Frequent keynote speaker and trainer internationally
• Consulted to Pfizer, Scotiabank, Fidelity, TD Ameritrade, Teva Pharmaceuticals, Verizon, and many other Global 1000 companies
• Hundreds of articles, blogs, benchmarks and white papers in publication
• Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management
• Former Database Engineer, Fortune 50 Information Technology executive and Ernst & Young Entrepreneur of the Year Finalist
• Owner/consultant: 2018 & 2017 Inc. 5000 Data Strategy and Implementation consulting firm
• Brings 25+ years of information management and DBMS experience
McKnight Consulting Group Offerings
Strategy
• Trusted Advisor
• Action Plans
• Roadmaps
• Tool Selections
• Program Management
Training
• Classes
• Workshops
Implementation
• Data/Data Warehousing/Business Intelligence/Analytics
• Master Data Management
• Governance/Quality
• Big Data
Analytic Data Stores
3 Major Decisions
• Decision #1: The Data Store Type
– The largest factor in choosing between a database and a file-based scale-out system is the data profile. File-based scale-out systems are best for data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller volumes of all data -- still belong in a relational database.
• Decision #2: Data Store Placement
– You must also decide where to place your data store: on-premises or in the cloud (and which cloud). In the past, the only clear choice for most organizations was on-premises. However, the costs of scale are gnawing away at the notion that this remains the best approach for a data platform. For more on why databases are moving to the cloud, please read this article.
• Decision #3: The Workload Architecture
– Finally, you must keep in mind the distinction between operational and analytical workloads. Short transactional requests and more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the analytics workload.
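As a rough illustration, Decisions #1 and #3 can be written down as a small rule of thumb. This is only a sketch: the `DataProfile` type, the `recommend_store` function, and the `LARGE_VOLUME_TB` cutoff are hypothetical names and values introduced here, since the slides give no numeric threshold. Decision #2 (placement) is left out because it depends on cost and organizational factors.

```python
from dataclasses import dataclass

# Hypothetical cutoff for "smaller volumes"; the slides give no number.
LARGE_VOLUME_TB = 100.0


@dataclass
class DataProfile:
    structured: bool   # True for traditional, relational-style data
    volume_tb: float   # approximate volume in terabytes
    workload: str      # "operational" or "analytical"


def recommend_store(profile: DataProfile) -> str:
    """Return a rough platform suggestion for one data set."""
    if profile.workload == "operational":
        # Decision #3: short transactional requests need their own architecture.
        return "operational database"
    if not profile.structured and profile.volume_tb >= LARGE_VOLUME_TB:
        # Decision #1: large unstructured or semi-structured data.
        return "file-based scale-out system"
    # Decision #1: traditional data, and smaller volumes of all data.
    return "relational analytics database"


if __name__ == "__main__":
    print(recommend_store(DataProfile(False, 500.0, "analytical")))
    print(recommend_store(DataProfile(True, 500.0, "analytical")))
```

In practice the cutoff would be tuned per organization; the point is only that data profile and workload type drive the choice.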
Whither the idea of the Data Warehouse?
[Diagram labels. Tier 1, Intake: Export Files, Txn App Data; Full, Delta, Stream; Structured, Big Data. Tier 2, Distribution: Conformed Dimensions, Common Summary and Derived Values, Master Data, Reference Data Hub, Transaction Data Hub. Tier 3, Access 1..n: Regional and Departmental Views, ADS, Applications & Engines, Operational Analytics & Hot Views, Data Marts (Independent, Dependent), Relational Data.]
Data Warehousing
• Data Warehouses (still) have a lower total cost of ownership than data marts
• A data warehouse is a SHARED platform
– Build once, use many
– Access at Data Warehouse
– Access by creating a mart off the DW
• Still A LOT cheaper than building from scratch
“… a subject-oriented, integrated, non-volatile, time-variant collection of data, organized to support management needs.” - Bill Inmon
Reasons for Analytic Architecture Change
• Take Advantage Of…
– Cloud Databases
– Get into a Columnar Data Orientation
– Get into the Data Architecture you want
– Cloud Storage
• Projects Requiring Consolidated Data
The Key is Right-Fitting Platforms
• THE Data Warehouse
– Value-Added Components: Modeling for Access, Data Quality, Tooling, Conformed Dimensions, Data Governance, Etc.
• A Dependent Data Mart (Fed from the Data Warehouse)
• A Data Lake
• A Big Data Cluster
• An Independent Data Mart
• An Operational Hub
• An Operational Data Lake
Sensible Divisions of Analytic Platforms
[Chart: Data Lake, Data Warehouse, and Data Mart positioned along two axes, Usage Understanding by the Builders and Data Cultivation]
The Post-Operational Ecosystem
[Diagram labels: Data Lake, DW, DM, DM]
What If?
[Chart: a combined Data Warehouse/Lake and a Data Mart positioned along two axes, Usage Understanding by the Builders and Data Cultivation]
Deploying the Data Lake
Data Lake: Data Scientist Workbench and Data Warehouse Staging
[Diagram labels: OLTP Systems (ERP, CRM, Supply Chain, MDM, …); Stream or Batch Updates; DI; Real-Time, Event-Driven Apps; Data Lake; Data Scientists; Data Warehouse; Data Mart]
Data Lake Patterns
• Data Refinery
– Do Data Warehouse ETL in the Data Lake
• Archive Storage
• Data Science Lab
• [Data Lake as the Data Warehouse]
Data Lake Example Components
[Diagram labels. Stages: Data Sources (Files, RDBMS, Streaming), Ingest, Process, Central Data Store, Analyze, Governance, Catalog & User Interface, Access Management. Components: Kafka, Pulsar, Snowball, Kinesis, QuickSight, Hadoop, Cloud Storage, EMR, Glue, DynamoDB, ElasticSearch, Web Interface, API Gateway, IAM & Cognito, Python, R, Machine Learning.]
Data Lake Setup
• Managed deployments in the Hadoop family of products
• External tables in Hive metastore that point at cloud storage (Amazon S3, Google Cloud Storage, Azure Data Lake Storage Gen2)
– To run SQL against the data
– HiveQL and Spark SQL require entries in the metastore
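A minimal sketch of what such a metastore entry can look like: the helper below assembles a HiveQL `CREATE EXTERNAL TABLE` statement that points at a cloud storage path. The function name `external_table_ddl`, the table `lake.web_events`, its columns, and the bucket `example-bucket` are all hypothetical and chosen only for illustration. The helper does no identifier or string escaping, so it is not suitable for untrusted input.

```python
def external_table_ddl(table, columns, location, partition_cols=None):
    """Build a HiveQL CREATE EXTERNAL TABLE statement for Parquet files."""
    col_sql = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    ddl = f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {col_sql}\n)"
    if partition_cols:
        part_sql = ", ".join(f"{name} {hive_type}" for name, hive_type in partition_cols)
        ddl += f"\nPARTITIONED BY ({part_sql})"
    # The data stays in cloud storage; the metastore only records where it is.
    ddl += f"\nSTORED AS PARQUET\nLOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    # Hypothetical table, columns, and bucket.
    print(external_table_ddl(
        table="lake.web_events",
        columns=[("user_id", "BIGINT"), ("url", "STRING")],
        location="s3a://example-bucket/web_events/",
        partition_cols=[("event_date", "STRING")],
    ))
```

The generated statement would be run once through a Hive or Spark SQL client; after that, queries against the table read the files in place.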
Object Storage Instances
• Object Storage instances/clusters have local storage, i.e., on the physical drives mounted to the instances themselves, which is HDFS and Hive
• Object Storage technologies access their cloud vendor’s respective cloud storage, viz.:
– Amazon EMR accesses S3
– Dataproc accesses Google Cloud Storage
– HDI accesses Azure Data Lake Storage Gen2
• Local storage is used by the Object Storage platform for housekeeping
The Data Warehouse of the Future
• Pair a lake with an analytical engine that charges only for what you use
• If you have a ton of data that can sit in cold storage and only needs to be accessed or analyzed occasionally, store it in Amazon S3/Azure Blob Storage/Google Cloud Storage
– Use a database (on-premises or in the cloud) that can create external tables that point at the storage
– Analysts can query directly against it, or draw down a subset for some deeper/intensive analysis
– The GB/month storage fee plus data transfer/egress fees will be much cheaper than leaving it in a data warehouse
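The cost comparison in the last bullet is simple arithmetic, sketched below. Every rate here is a hypothetical placeholder rather than a vendor price, and the function names are introduced only for this example; substitute current pricing for your provider and region before drawing conclusions.

```python
from decimal import Decimal

# All rates are hypothetical placeholders, not vendor prices.
STORAGE_RATE = Decimal("0.02")    # $ per GB per month in object storage
EGRESS_RATE = Decimal("0.09")     # $ per GB transferred out
WAREHOUSE_RATE = Decimal("0.10")  # $ per GB per month inside a warehouse


def monthly_lake_cost(stored_gb, egress_gb):
    """Storage fee plus data transfer/egress fee for one month."""
    return stored_gb * STORAGE_RATE + egress_gb * EGRESS_RATE


def monthly_warehouse_cost(stored_gb):
    """Cost of keeping the same data inside the warehouse for one month."""
    return stored_gb * WAREHOUSE_RATE


if __name__ == "__main__":
    stored_gb, egress_gb = 50_000, 1_000
    print("lake:", monthly_lake_cost(stored_gb, egress_gb))
    print("warehouse:", monthly_warehouse_cost(stored_gb))
```

With these placeholder rates the lake side is cheaper, but the gap narrows as the volume drawn down each month grows, which is why the pattern suits data that is only accessed occasionally.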
Notes on the Data Warehouse of the Future
• A separate compute and storage architecture is more achievable
• Compute resources (MapReduce, Hive, Spark, etc.) can be taken down, scaled up or out, or interchanged without data movement
• Storage can be centralized, but compute can be distributed
• Major players have mechanisms to ensure consistency and achieve ACID-like compliance
• Remote data replication ensures redundancy and recovery
• Most of the query execution time is processing, not data transport, so if cloud compute and storage are in the same cloud vendor region, performance is hardly impacted
Sample Cluster Configuration
Google BigQuery
Cloud Provider: Google Cloud
Platform Version: 3.6
Hadoop Version: 2.7.3
Hive Version: 1.2.1
Spark Version: 2.3.2
Instance Type: n1-highmem-16
Head/Master Nodes: 1
Worker Nodes: 16 and 32
vCPUs (per node): 16
RAM (per node): 104 GB
Compute Cost (per node per hour): $0.947
Platform Premium (per node per hour): $0.160
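The per-node figures above translate into an hourly cluster cost as follows. This sketch assumes the head/master node is billed at the same compute and premium rates as a worker node, which the slide does not state; the function name is hypothetical.

```python
from decimal import Decimal

COMPUTE_PER_NODE_HOUR = Decimal("0.947")  # from the slide
PREMIUM_PER_NODE_HOUR = Decimal("0.160")  # from the slide
HEAD_NODES = 1                            # from the slide


def cluster_cost_per_hour(worker_nodes, head_nodes=HEAD_NODES):
    """Hourly cost, assuming head nodes are billed like worker nodes."""
    nodes = worker_nodes + head_nodes
    return nodes * (COMPUTE_PER_NODE_HOUR + PREMIUM_PER_NODE_HOUR)


if __name__ == "__main__":
    for workers in (16, 32):
        # Prints 18.819 for 16 workers and 36.531 for 32 workers.
        print(workers, cluster_cost_per_hour(workers))
```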
Tips
• If possible, configure remote data to be stored in Parquet format, as opposed to comma-separated or other text formats
• As new data sources are added to cloud storage, use a code distribution system (like GitHub) to distribute new table definitions to distributed teams
• Use data partitioning to improve performance, but don’t forget that new partitions have to be declared to the Hive metastore when they are added to the data
• Co-locate compute and storage in the same region
• Use AES-256 encryption on cloud storage buckets to ensure encryption at rest
• Hold the remotely stored data to the same governance and data quality standards you would if it were on-premises; consider a data catalog or other metadata technique to keep the data organized and easy to find for new compute engines
• Drop commonly used data in the lake, like master data from MDM
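The partitioning tip is the easiest one to forget, so here is a minimal sketch of declaring a newly landed partition. The helper builds a HiveQL `ALTER TABLE ... ADD PARTITION` statement; the function name, the table `lake.web_events`, the partition column `event_date`, and the bucket path are hypothetical. Hive also offers `MSCK REPAIR TABLE` to discover partitions in bulk when the directory layout follows the `column=value` convention.

```python
def add_partition_ddl(table, partition, location):
    """Build a HiveQL statement that registers one partition in the metastore."""
    spec = ", ".join(f"{col}='{val}'" for col, val in partition.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical table, partition value, and bucket path.
    print(add_partition_ddl(
        "lake.web_events",
        {"event_date": "2020-03-01"},
        "s3a://example-bucket/web_events/event_date=2020-03-01/",
    ))
```

Keeping statements like this in version control alongside the table definitions fits the tip about distributing definitions to distributed teams.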
The Data Science Lab Role of the Data Lake
Artificial Intelligence and Machine Learning
• Looming on the horizon is an injection of AI/ML into every piece of software
• Consider the domain of data integration
– Predicting with high accuracy the steps ahead
– Fixing its bugs
• Machine learning is being built into databases so the data will be analyzed as it is loaded
– E.g., Python with TensorFlow and Scala on Spark.
• The split of the necessary AI/ML between the "edge" of corporate users and the software itself is still to be determined
Training Data for Machine Learning & Artificial Intelligence
• You must have enough data to analyze to build models
• Your data determines the depth of AI you can achieve -- for example, statistical modeling, machine learning, or deep learning -- and its accuracy
AI Data
• Call center recordings and chat logs
• Streaming sensor data, historical maintenance records and search logs
• Customer account data and purchase history
• Email response metrics
• Product catalogs and data sheets
• Public references
• YouTube video content audio tracks
• User website behaviors
• Sentiment analysis, user-generated content, social graph data, and other external data sources

Más contenido relacionado

La actualidad más candente

ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesDATAVERSITY
 
Drive your business with predictive analytics
Drive your business with predictive analyticsDrive your business with predictive analytics
Drive your business with predictive analyticsThe Marketing Distillery
 
Slides: Data Governance Reality Check
Slides: Data Governance Reality CheckSlides: Data Governance Reality Check
Slides: Data Governance Reality CheckDATAVERSITY
 
How to get data lineage right
How to get data lineage rightHow to get data lineage right
How to get data lineage rightLeigh Hill
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeDATAVERSITY
 
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...DATAVERSITY
 
5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...
 5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen... 5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...
5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...Ganes Kesari
 
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...DATAVERSITY
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects FailSense Corp
 
You Can’t Have Best in Class Governance Without Best in Class Data Lineage
You Can’t Have Best in Class Governance Without Best in Class Data LineageYou Can’t Have Best in Class Governance Without Best in Class Data Lineage
You Can’t Have Best in Class Governance Without Best in Class Data LineageDATAVERSITY
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitMing Yuan
 
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...DATAVERSITY
 
Data strategy in a Big Data world
Data strategy in a Big Data worldData strategy in a Big Data world
Data strategy in a Big Data worldCraig Milroy
 
Using Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROIUsing Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROIDATAVERSITY
 
2011 digital trends webinar presentation
2011 digital trends webinar presentation2011 digital trends webinar presentation
2011 digital trends webinar presentationEconsultancy
 
DAMA International Symposium San Diego CA 03-17-2008
DAMA International Symposium San Diego CA 03-17-2008DAMA International Symposium San Diego CA 03-17-2008
DAMA International Symposium San Diego CA 03-17-2008Robert J. Abate, CBIP, CDMP
 
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?DATAVERSITY
 
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...North Texas Chapter of the ISSA
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata ManagementDATAVERSITY
 
Microsoft Crm Analytics
Microsoft Crm AnalyticsMicrosoft Crm Analytics
Microsoft Crm AnalyticsNic Smith
 

La actualidad más candente (20)

ADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data LakesADV Slides: Building and Growing Organizational Analytics with Data Lakes
ADV Slides: Building and Growing Organizational Analytics with Data Lakes
 
Drive your business with predictive analytics
Drive your business with predictive analyticsDrive your business with predictive analytics
Drive your business with predictive analytics
 
Slides: Data Governance Reality Check
Slides: Data Governance Reality CheckSlides: Data Governance Reality Check
Slides: Data Governance Reality Check
 
How to get data lineage right
How to get data lineage rightHow to get data lineage right
How to get data lineage right
 
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data LandscapeData Architecture Best Practices for Today’s Rapidly Changing Data Landscape
Data Architecture Best Practices for Today’s Rapidly Changing Data Landscape
 
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...
Slides: Applying Artificial Intelligence (AI) in All the Right Places in the ...
 
5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...
 5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen... 5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...
5 Steps to Transform into a Data-Driven Organization - Ganes Kesari - Gramen...
 
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...
Data Architecture Strategies Webinar: Emerging Trends in Data Architecture – ...
 
Why Data Science Projects Fail
Why Data Science Projects FailWhy Data Science Projects Fail
Why Data Science Projects Fail
 
You Can’t Have Best in Class Governance Without Best in Class Data Lineage
You Can’t Have Best in Class Governance Without Best in Class Data LineageYou Can’t Have Best in Class Governance Without Best in Class Data Lineage
You Can’t Have Best in Class Governance Without Best in Class Data Lineage
 
Cloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummitCloud and Analytics -- 2020 sparksummit
Cloud and Analytics -- 2020 sparksummit
 
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...
Data as a Profit Driver – Emerging Techniques to Monetize Data as a Strategic...
 
Data strategy in a Big Data world
Data strategy in a Big Data worldData strategy in a Big Data world
Data strategy in a Big Data world
 
Using Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROIUsing Machine Learning to Understand and Predict Marketing ROI
Using Machine Learning to Understand and Predict Marketing ROI
 
2011 digital trends webinar presentation
2011 digital trends webinar presentation2011 digital trends webinar presentation
2011 digital trends webinar presentation
 
DAMA International Symposium San Diego CA 03-17-2008
DAMA International Symposium San Diego CA 03-17-2008DAMA International Symposium San Diego CA 03-17-2008
DAMA International Symposium San Diego CA 03-17-2008
 
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?
Data Insights and Analytics Webinar: CDO vs. CAO - What’s the Difference?
 
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...
NTXISSACSC3 - Why Enterprise Information Management is the Key to GRC by Mika...
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata Management
 
Microsoft Crm Analytics
Microsoft Crm AnalyticsMicrosoft Crm Analytics
Microsoft Crm Analytics
 

Similar a When and How Data Lakes Fit into a Modern Data Architecture

ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Denodo
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityDATAVERSITY
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Bridging the Last Mile: Getting Data to the People Who Need It (APAC)
Bridging the Last Mile: Getting Data to the People Who Need It (APAC)Denodo
 
Accelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationAccelerate Self-Service Analytics with Data Virtualization and Visualization
Accelerate Self-Service Analytics with Data Virtualization and VisualizationDenodo
 
Bridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItBridging the Last Mile: Getting Data to the People Who Need It
Bridging the Last Mile: Getting Data to the People Who Need ItDenodo
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...DATAVERSITY
 
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...
Introducing Trillium DQ for Big Data: Powerful Profiling and Data Quality for...Precisely
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
LinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchLinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchSheetal Pratik
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...
How Data Virtualization Puts Enterprise Machine Learning Programs into Produc...Denodo
 
A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)A Logical Architecture is Always a Flexible Architecture (ASEAN)
A Logical Architecture is Always a Flexible Architecture (ASEAN)Denodo
 
Trends in Enterprise Advanced Analytics
Trends in Enterprise Advanced AnalyticsTrends in Enterprise Advanced Analytics
Trends in Enterprise Advanced AnalyticsDATAVERSITY
 
Unlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationUnlock Your Data for ML & AI using Data Virtualization
Unlock Your Data for ML & AI using Data VirtualizationDenodo
 

Similar a When and How Data Lakes Fit into a Modern Data Architecture (20)

ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
Data Fabric - Why Should Organizations Implement a Logical and Not a Physical...
 
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture MaturityADV Slides: How to Improve Your Analytic Data Architecture Maturity
ADV Slides: How to Improve Your Analytic Data Architecture Maturity
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
When and How Data Lakes Fit into a Modern Data Architecture

  • 1. When and How Data Lakes Fit into a Modern Data Architecture Presented by: William McKnight “#1 Global Influencer in Data Warehousing” Onalytica President, McKnight Consulting Group An Inc. 5000 Company in 2018 and 2017 @williammcknight www.mcknightcg.com (214) 514-1444 Second Thursday of Every Month, at 2:00 ET #AdvAnalytics
  • 2. From Data Lakes to Data Experiences Joel McKelvey, Director, Product Marketing
  • 3. Digital-fueled Growth is the Top Investment Priority For Technology Leaders (1). Rebalance your technology portfolio toward digital transformation. Percent of respondents increasing / decreasing investment:
      • Cyber/information security: 40% / 1%
      • Cloud services or solutions (SaaS, PaaS, etc.): 33% / 2%
      • Core system improvements/transformation: 31% / 10%
      • Business Intelligence or data analytics solution: 45% / 1%
      Chart label: How to implement product-centric delivery by percentage of respondents
      Source (1): https://emtemp.gcom.cloud/ngw/globalassets/en/information-technology/documents/trends/gartner-2019-cio-agenda-key-takeaways.pdf
  • 4. Insights-driven businesses harness and implement digital insights strategically and at scale to drive growth and create differentiating experiences, products, and services (1).
      • 7x: faster growth than global GDP
      • 30%: growth or more using advanced analytics in a transformational way
      • 2.3x: more likely to succeed during disruption
      Source (1): https://www.forrester.com/report/InsightsDriven+Businesses+Set+The+Pace+For+Global+Growth/-/E-RES130848
  • 5. Looker Data Platform: Governed metrics | Best-in-class APIs | In-database | Git version-control | Security | Cloud
      • Integrated Insights (Contextual | Passive | Where you work)
          – Sales reps at Slack have the metrics their customers care about most within a pre-populated slide deck
          – Sales reps have more context on customer calls with valuable usage data embedded within Salesforce
      • Data-driven Workflows (Operational | Time-sensitive | Task-driven)
          – Reduce churn with automated alerts and email follow-ups for success managers based on customer health
          – Increase ROI on digital ad spend by optimizing bids in real time with an ML "bid-bot" trained with governed data
      • Custom Applications (Job to be done | Larger purpose)
          – Top Broadcaster: custom application ensures ads are sold for the optimal price, regardless of time slot or market
          – Top Retailer: maintain optimal inventory levels and pricing with a merchandising and supply chain management application
      • Modern BI & Analytics (Analytical | Explorable | Data-centric)
          – Namely customers access reports and dashboards to better understand their staffing needs and trends
          – Holistic understanding of customers with a 360-degree view across channels: web, apps, print, and more
      • Data Lake
  • 6. Empower people with the smarter use of data
      • 1 in 2 customers integrate insights/experiences beyond Looker
      • 2000+ customers, 5000+ developers, 900+ employees
      • Offices: Santa Cruz, San Francisco, New York, Chicago, Boulder, Tokyo, Dublin, London
  • 7. Looker Recognized as Challenger in the Gartner 2020 Magic Quadrant for Analytics and Business Intelligence Platforms “The growing demand for tools that close the gap between discovering insights and taking action is creating a profound change in the way we use data in the workplace. At Looker our vision is to meet this demand by enabling data experiences that go far beyond traditional business intelligence.” - Nick Caldwell, Chief Product Officer at Looker 1 Gartner “Magic Quadrant for Analytics and Business Intelligence Platforms,” by James Richardson, Rita Sallam, Kurt Schlegel, Austin Kronz, and Julian Sun February 13, 2020 Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
  • 8. William McKnight, President, McKnight Consulting Group
      • Frequent keynote speaker and trainer internationally
      • Consulted to Pfizer, Scotiabank, Fidelity, TD Ameritrade, Teva Pharmaceuticals, Verizon, and many other Global 1000 companies
      • Hundreds of articles, blogs, benchmarks and white papers in publication
      • Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management
      • Former Database Engineer, Fortune 50 Information Technology executive and Ernst & Young Entrepreneur of the Year Finalist
      • Owner/consultant: 2018 & 2017 Inc. 5000 Data Strategy and Implementation consulting firm
      • Brings 25+ years of information management and DBMS experience
  • 9. McKnight Consulting Group Offerings
      • Strategy: Trusted Advisor, Action Plans, Roadmaps, Tool Selections, Program Management
      • Training: Classes, Workshops
      • Implementation: Data/Data Warehousing/Business Intelligence/Analytics, Master Data Management, Governance/Quality, Big Data Implementation
  • 11. 3 Major Decisions
      • Decision #1: The Data Store Type
          – The largest factor in choosing between a database and a file-based scale-out system is the data profile. File-based scale-out systems are best for data that fits the loose label of "unstructured" (or semi-structured) data, while more traditional data, and smaller volumes of all data, still belong in a relational database.
      • Decision #2: Data Store Placement
          – You must also decide where to place your data store: on-premises or in the cloud (and which cloud). In the past, the only clear choice for most organizations was on-premises. However, the costs of scale are gnawing away at the notion that this remains the best approach for a data platform. For more on why databases are moving to the cloud, please read this article.
      • Decision #3: The Workload Architecture
          – Finally, you must keep in mind the distinction between operational and analytical workloads. Short transactional requests and more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the analytics workload.
  • 12. Whither the idea of the Data Warehouse? [Diagram: three-tier data architecture]
      • Tier 1, Intake: Export Files; Txn App Data (Full, Delta, Stream); Structured; Big Data
      • Tier 2, Distribution: Conformed Dimensions; Common Summary and Derived Values; Master Data; Reference Data Hub; Transaction Data Hub
      • Tier 3, Access 1..n: Regional and Departmental Views; ADS; Applications & Engines; Operational Analytics & Hot Views; Data Marts (Independent, Dependent); Relational Data
  • 13. Data Warehousing
      • Data Warehouses (still) have a lower total cost of ownership than data marts
      • A data warehouse is a SHARED platform
          – Build once, use many
          – Access at Data Warehouse
          – Access by creating a mart off the DW
      • Still A LOT cheaper than building from scratch
      • "… a subject-oriented, integrated, non-volatile, time-variant collection of data, organized to support management needs." (Bill Inmon)
  • 14. Reasons for Analytic Architecture Change
      • Take Advantage Of…
          – Cloud Databases
          – Get into a Columnar Data Orientation
          – Get into the Data Architecture you want
          – Cloud Storage
      • Projects Requiring Consolidated Data
  • 15. The Key is Right-Fitting Platforms
      • THE Data Warehouse
          – Value-Added Components: Modeling for Access, Data Quality, Tooling, Conformed Dimensions, Data Governance, Etc.
      • A Dependent Data Mart (Fed from the Data Warehouse)
      • A Data Lake
      • A Big Data Cluster
      • An Independent Data Mart
      • An Operational Hub
      • An Operational Data Lake
  • 16. Sensible Divisions of Analytic Platforms [Chart: Data Lake, Data Warehouse, and Data Mart positioned on two axes, "Usage Understanding by the Builders" and "Data Cultivation"]
  • 18. What If? [Chart: Data Warehouse/Lake and Data Mart positioned on two axes, "Usage Understanding by the Builders" and "Data Cultivation"]
  • 20. Data Lake: Data Scientist Workbench and Data Warehouse Staging [Diagram labels: OLTP Systems (ERP, CRM, Supply Chain, MDM, …); Stream or Batch Updates; DI; Data Lake; Data Scientists; Data Warehouse; Data Mart; Real-Time, Event-Driven Apps]
  • 21. Data Lake Patterns
      • Data Refinery
          – Do Data Warehouse ETL in the Data Lake
      • Archive Storage
      • Data Science Lab
      • [Data Lake as the Data Warehouse]
  • 22. Data Lake Example Components [Diagram labels]
      • Data Sources: Files, RDBMS, Streaming
      • Stages: Ingest, Governance, Process, Central Data Store, Catalog & User Interface, Access Management, Analyze
      • Example technologies: Kafka, Pulsar, Snowball, Kinesis, QuickSight, Hadoop, Cloud Storage, EMR, Glue, DynamoDB, ElasticSearch, Web Interface, API Gateway, IAM & Cognito, Python, R, Machine Learning
  • 23. Data Lake Setup
      • Managed deployments in the Hadoop family of products
      • External tables in Hive metastore that point at cloud storage (Amazon S3, Google Cloud Storage, Azure Data Lake Storage Gen2)
          – To run SQL against the data
          – HiveQL and Spark SQL require entries in the metastore
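
The external-table setup on slide 23 could look roughly like the sketch below. It only builds the HiveQL text; the table name, columns, and `s3a://example-bucket/...` location are hypothetical placeholders, and the statement would still have to be submitted to Hive or Spark SQL so that the entry lands in the metastore.

```python
def external_table_ddl(table, columns, location, partition_cols=None):
    """Build a HiveQL CREATE EXTERNAL TABLE statement for Parquet files
    that already sit in cloud object storage."""
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    ddl = f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)"
    if partition_cols:
        parts = ", ".join(f"`{name}` {dtype}" for name, dtype in partition_cols)
        ddl += f"\nPARTITIONED BY ({parts})"
    # EXTERNAL means dropping the table removes only the metastore entry,
    # not the files at LOCATION.
    ddl += f"\nSTORED AS PARQUET\nLOCATION '{location}'"
    return ddl


# Hypothetical example: a sales table partitioned by date.
ddl = external_table_ddl(
    "lake.sales",
    [("order_id", "BIGINT"), ("amount", "DOUBLE")],
    "s3a://example-bucket/sales/",
    partition_cols=[("sale_date", "STRING")],
)
print(ddl)
```

Keeping the definition as generated text makes it easy to version and share, which fits the later tip about distributing table definitions through a code repository.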
  • 24. Object Storage Instances
      • Object Storage instances/clusters have local storage (HDFS and Hive) on the physical drives mounted to the instances themselves
      • Object Storage technologies access their cloud vendor's respective cloud storage:
          – Amazon EMR accesses S3
          – Dataproc accesses Google Cloud Storage
          – HDInsight (HDI) accesses Azure Data Lake Storage Gen2
      • Local storage is used by the Object Storage platform for housekeeping
  • 25. The Data Warehouse of the Future
      • Pair a lake with an analytical engine that charges only for what you use
      • If you have a ton of data that can sit in cold storage and only needs to be accessed or analyzed occasionally, store it in Amazon S3/Azure Blob Storage/Google Cloud Storage
          – Use a database (on-premises or in the cloud) that can create external tables that point at the storage
          – Analysts can query directly against it, or draw down a subset for some deeper/intensive analysis
          – The GB/month storage fee plus data transfer/egress fees will be much cheaper than leaving it in a data warehouse
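
The cost argument on slide 25 can be checked with simple arithmetic. The sketch below is illustrative only: every price and volume is a made-up placeholder (the slides quote none), so substitute your provider's current rates before drawing conclusions.

```python
def monthly_lake_cost(stored_gb, storage_price, egress_gb, egress_price):
    """Object storage fee plus data transfer/egress fee for one month."""
    return stored_gb * storage_price + egress_gb * egress_price


def monthly_warehouse_cost(stored_gb, warehouse_price):
    """Cost of keeping the same data in warehouse-managed storage."""
    return stored_gb * warehouse_price


# Hypothetical inputs, NOT real vendor prices.
STORED_GB = 50_000          # cold data kept in the lake
EGRESS_GB = 2_000           # data pulled out for analysis each month
STORAGE_PRICE = 0.02        # $ per GB-month in object storage (assumed)
EGRESS_PRICE = 0.09         # $ per GB transferred out (assumed)
WAREHOUSE_PRICE = 0.10      # $ per GB-month in the warehouse (assumed)

lake = monthly_lake_cost(STORED_GB, STORAGE_PRICE, EGRESS_GB, EGRESS_PRICE)
warehouse = monthly_warehouse_cost(STORED_GB, WAREHOUSE_PRICE)
print(f"lake: ${lake:,.2f}  warehouse: ${warehouse:,.2f}")
```

The comparison favors the lake only while egress stays small relative to what is stored; heavy, repeated draw-downs shift the balance.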
  • 26. Notes on the Data Warehouse of the Future
      • An architecture that separates compute from storage is more achievable
      • Compute resources (MapReduce, Hive, Spark, etc.) can be taken down, scaled up or out, or interchanged without data movement
      • Storage can be centralized, but compute can be distributed
      • Major players have mechanisms to ensure consistency and achieve ACID-like compliance
      • Remote data replication ensures redundancy and recovery
      • Most of the query execution is processing time, not data transport, so if cloud compute and storage are in the same cloud vendor region, performance is hardly impacted
  • 27. Sample Cluster Configuration (column heading: Google BigQuery)
      • Cloud Provider: Google Cloud Platform
      • Version: 3.6
      • Hadoop Version: 2.7.3
      • Hive Version: 1.2.1
      • Spark Version: 2.3.2
      • Instance Type: n1-highmem-16
      • Head/Master Nodes: 1
      • Worker Nodes: 16 and 32
      • vCPUs (per node): 16
      • RAM (per node): 104 GB
      • Compute Cost (per node per hour): $0.947
      • Platform Premium (per node per hour): $0.160
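
The per-node figures on slide 27 can be rolled up into an hourly cluster cost. This is a rough sketch under two assumptions the slide does not state: that the head node is billed at the same rate as a worker, and that the platform premium applies to every node.

```python
COMPUTE_COST = 0.947      # $ per node per hour (from the slide)
PLATFORM_PREMIUM = 0.160  # $ per node per hour (from the slide)
HEAD_NODES = 1            # from the slide


def cluster_cost_per_hour(worker_nodes, head_nodes=HEAD_NODES):
    """Hourly cost of the whole cluster.

    Assumes head and worker nodes carry the same compute cost and premium.
    """
    total_nodes = worker_nodes + head_nodes
    return total_nodes * (COMPUTE_COST + PLATFORM_PREMIUM)


for workers in (16, 32):  # the two worker counts listed on the slide
    print(f"{workers} workers: ${cluster_cost_per_hour(workers):.3f} per hour")
# 16 workers: $18.819 per hour
# 32 workers: $36.531 per hour
```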
  • 28. Tips
      • If possible, configure remote data to be stored in Parquet format, as opposed to comma-separated or other text formats
      • As new data sources are added to cloud storage, use a code distribution system, like GitHub, to distribute new table definitions to distributed teams
      • Use data partitioning to improve performance, but don't forget that new partitions have to be declared to the Hive metastore when they are added to the data
      • Co-locate compute and storage in the same region
      • Use AES-256 encryption on the cloud storage bucket to ensure encryption at rest
      • Hold the remotely stored data to the same governance and data quality standards you would if it were on-premises; consider a data catalog or other metadata technique to keep the data organized and easy to find for new compute engines
      • Drop commonly used data in the lake, like master data from MDM
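
The partitioning tip on slide 28 is easy to miss in practice: files written under a new partition directory are invisible to queries until the partition is registered. A minimal sketch of generating the HiveQL for that follows; the table name, partition column, and bucket path are hypothetical, and it assumes the common `column=value` directory layout.

```python
def add_partition_statements(table, partition_col, values, base_location):
    """Build one ALTER TABLE ... ADD PARTITION statement per new value."""
    base = base_location.rstrip("/")
    statements = []
    for value in values:
        # Assumed layout: <base>/<partition_col>=<value>/
        location = f"{base}/{partition_col}={value}/"
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({partition_col}='{value}') LOCATION '{location}'"
        )
    return statements


# Hypothetical example: two new daily partitions landed in the bucket.
stmts = add_partition_statements(
    "lake.sales",
    "sale_date",
    ["2020-03-01", "2020-03-02"],
    "s3a://example-bucket/sales/",
)
for stmt in stmts:
    print(stmt + ";")
```

When the directory layout follows the `column=value` convention, Hive's `MSCK REPAIR TABLE` can discover partitions in bulk instead; explicit statements like these are the more targeted option.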
  • 29. The Data Science Lab Role of the Data Lake
  • 30. Artificial Intelligence and Machine Learning
      • Looming on the horizon is an injection of AI/ML into every piece of software
      • Consider the domain of data integration
          – Predicting with high accuracy the steps ahead
          – Fixing its bugs
      • Machine learning is being built into databases so the data will be analyzed as it is loaded
          – E.g., Python with TensorFlow and Scala on Spark
      • The split of the necessary AI/ML between the "edge" of corporate users and the software itself is still to be determined
  • 31. Training Data for Machine Learning & Artificial Intelligence
      • You must have enough data to analyze to build models
      • Your data determines the depth of AI you can achieve (for example, statistical modeling, machine learning, or deep learning) and its accuracy
  • 32. AI Data
      • Call center recordings and chat logs
      • Streaming sensor data, historical maintenance records and search logs
      • Customer account data and purchase history
      • Email response metrics
      • Product catalogs and data sheets
      • Public references
      • YouTube video content audio tracks
      • User website behaviors
      • Sentiment analysis, user-generated content, social graph data, and other external data sources