2. WE ARE MEETING EXACTLY 62 YEARS AFTER THE SPUTNIK 1 LAUNCH ☺
PDM MEETUP 2
3. PRAGUE DATA MANAGEMENT MEETUP (PDM MEETUP)
– Open professional group
– Based on www.meetup.com
– Everyone is welcome
– There are no bad topics, only bad speakers☺
– You can show anything to others
– Operational since September 2015
– Sponsored by ADASTRA
DATA MANAGEMENT
DATA ACQUISITION
DATA STORING
DATA INTEGRATION
DATA ANALYTICS
DATA USAGE
4. MEETUP HISTORY
# Date Topics
1 10. 9. 2015 Data Management
2 14. 10. 2015 Data Lake
3 23. 11. 2015 Dark Data (without Dark Energy and Dark Force)
4 12. 1. 2016 Data Lake
5 7. 3. 2016 Sad Stories About DW/BI Modeling (sad only)
6 23. 3. 2016 Self-service BI Street Battle
7 27. 4. 2016 Let's explore the new Microsoft PowerBI!
8 22. 9. 2016 Data Management pro začátečníky / Data Management for Beginners
9 17. 10. 2016 Small Big Data
10 22. 11. 2016 Základy modelování DW/BI / DW/BI Modeling Basics
11 23.1.2017 Komponenty datových skladů / Data Warehouse Components
12 28.2.2017 Operational Data Store
13 28.3.2017 Metadata v DW/BI / DW/BI Metadata
14 25.4.2017 Jak se stát DW/BI konzultantem / Be a DW/BI Consultant
15 16.5.2017 SQL
16 29.5.2017 From IoT to AI: Applications of time series data
17 26.9.2017 Aktuální trendy v data managementu / Current Trends in Data Management
18 24.10.2017 Datové platformy na technologiích Oracle / Data Platforms Based on Oracle
19 21.11.2017 Big Data rychle a zběsile / Big Data Fast and Furious
20 30.1.2018 Jak se staví velké datové sklady / How to Build Large Data Warehouses
21 27.2.2018 Základy modelování DW/BI #2 / DW/BI Modeling Basics #2
22 27.3.2018 Big Data: How to deal with sensorics (floating) data easily
23 17.4.2018 DW/BIaaS
24 22.5.2018 Be a Consultant / Jak se stát konzultantem
25 19.6.2018 Building AI-Powered Retail Store
26 17.9.2018 Information Management 101
27 23.10.2018 Blockchain
28 29.1.2019 DW & BI trendy v roce 2019 / DW & BI Trends in 2019
29 26.3.2019 Data Warehouse Automation
30 10.4.2019 Next Gen Data Integration Patterns With Jeff Pollock
What Next? Meet Pepper? Big Data? Cloud DW? DW Basics?
6. BRIEF DATA MANAGEMENT HISTORY
Prehistory (1985 - 1995)
Controlled Chaos
Best Practice Awakening
Manual Scripting
Primeval Relational Analytics
Antiquity (1995 - 2005)
Titans: Kimball vs. Inmon
Maturing Best Practices
Enterprise Data Warehouse
ETL
OLAP
Reference Data Management
Classic Relational Analytics
Middle Age (2005 - 2015)
Traditional Data Warehouse
Hub-and-Spoke Architecture
Data Governance
Master Data Management
Metadata-Driven Development
ELT
Data Vault
Data Mining
DW Appliance
Columnar DB
In-memory DB
Hadoop Stack Dawn
Unstructured Data Analytics
Modern Age (2015 - 2025)
Cloud
Automation
Logical Data Warehouse
Extended Data Warehouse
Data Lake
Polyglot Architecture
Kappa / Lambda
Databus
Data Pipeline
Real-time Data Integration
Big Data ETL
Open Source Analytics
Big Data Analytics
Self-service BI & ETL
Data Science
Machine Learning & AI
Hadoop without Hadoop
Stream Analytics
All Data Analytics
Data Management Platform
Autonomous Technologies
Decoupled Compute & Storage
Serverless
Future? (2025 - ∞)
7. THE INFINITE DATA MANAGEMENT LOOP IS STILL THE SAME
Collect
Integrate
Enrich
Store
Analyze
Discover
Use
Curate
8. CLASSICAL DATA WAREHOUSE
– The key data platform for decades, but no longer
– A data system used for reporting and data analysis, considered a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources.
– A large amount of information from a company, stored on a computer and used for making business decisions
– An old but very mature concept with some potential
– Core Features
  – Database (usually RDBMS)
  – Subject Orientation
  – Data Integration
  – History
  – Structure Stability
  – Batch processing & significant data latencies
– DW, DWH, MIS, ADS, ADW, EDW, DP
– NEXT GEN DATA INTEGRATION NEEDED
Diagram: Data Warehouse. Components: Data Sources, Data Acquisition, Data Staging, Data Integration, Data Repository, Reporting & Other Data Usage, Analytics
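The classical data warehouse flow on this slide (acquire from sources, stage, integrate, store with history, report) can be sketched in a few lines. This is only an illustration under stated assumptions: the table names, the sample rows, and the helpers `make_warehouse`, `load_batch`, and `revenue_report` are hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3

# Hypothetical extracts from two source systems (assumed sample data).
CRM_ROWS = [("c1", "Alice", "Prague"), ("c2", "Bob", "Brno")]
BILLING_ROWS = [("c1", 120.0), ("c2", 80.0), ("c1", 30.0)]


def make_warehouse():
    """Create an in-memory database with staging and repository tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stg_customer (customer_id TEXT, name TEXT, city TEXT);
        CREATE TABLE stg_invoice (customer_id TEXT, amount REAL);
        CREATE TABLE dw_customer_revenue (
            load_date TEXT, customer_id TEXT, name TEXT, city TEXT, revenue REAL
        );
        """
    )
    return conn


def load_batch(conn, load_date, crm_rows, billing_rows):
    """One batch run: acquisition -> staging -> integration -> repository."""
    cur = conn.cursor()
    # Staging: land the raw extracts unchanged; staging is emptied every run.
    cur.execute("DELETE FROM stg_customer")
    cur.execute("DELETE FROM stg_invoice")
    cur.executemany("INSERT INTO stg_customer VALUES (?, ?, ?)", crm_rows)
    cur.executemany("INSERT INTO stg_invoice VALUES (?, ?)", billing_rows)
    # Integration: combine both sources per customer and append a snapshot,
    # so the repository keeps history by load date.
    cur.execute(
        """
        INSERT INTO dw_customer_revenue
        SELECT ?, c.customer_id, c.name, c.city, COALESCE(SUM(i.amount), 0)
        FROM stg_customer c
        LEFT JOIN stg_invoice i ON i.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name, c.city
        """,
        (load_date,),
    )
    conn.commit()


def revenue_report(conn, load_date):
    """Reporting: revenue per city for one batch snapshot."""
    return conn.execute(
        "SELECT city, SUM(revenue) FROM dw_customer_revenue "
        "WHERE load_date = ? GROUP BY city ORDER BY city",
        (load_date,),
    ).fetchall()
```

Calling `load_batch` once per day appends one snapshot per load date, which is also where the batch latency mentioned above comes from: a report can only be as fresh as the last completed run.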
9. BUSINESS PRIORITIES VS. CLASSICAL DATA WAREHOUSES
Grow revenue & profit
Improve CX
Improve products and services
360 degree view
Digital transformation
Accelerate responses to business and market changes
Real-time data-driven decisions
Faster predictive insights
Smarter intelligent business
Structured static data only
Melting with data growth
Business demand exceeds IT capacities & IT budgets
Data siloed across multiple platforms
Growing operational overhead
Missing real-time insights
Unscalable
Limited advanced analytics
Really expensive TCO
Outdated governance and security
10. CLASSICAL DATA WAREHOUSE ARCHITECTURES (HUB-AND-SPOKE)
Diagram: three architectures side by side. Building blocks that recur in the diagram: Data Sources, Data Staging Area, Technical Transformation, Business Transformation, Data Marts, RDBMS, Reporting, Data Apps
– Ralph Kimball: Data Warehouse Bus (DW), Bottom-Up. Conformed Data Marts (Kimball’s Data Warehouse) built on Conformed Dimensions
– Bill Inmon: Enterprise Data Warehouse (EDW), Top-Down. A central Data Warehouse feeds the Data Marts
– Dan Linstedt: Data Vault (DV), Top-Down. A Data Vault and a Business Vault feed the Data Marts
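To make "conformed dimensions" concrete, here is a small sketch of two data marts that share one dimension table, so results from both can be combined consistently. It is an assumption-laden illustration, not Kimball's reference model: the table names, the sample data, and the functions `build_marts` and `sold_and_returned_by_category` are hypothetical, and SQLite stands in for the RDBMS.

```python
import sqlite3


def build_marts():
    """Create two fact tables (marts) that share one conformed dimension."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Conformed dimension: a single definition of "product" for all marts.
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT
        );
        -- Two marts for two different business processes.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER
        );
        CREATE TABLE fact_returns (
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER
        );
        INSERT INTO dim_product VALUES (1, 'Kettle', 'Kitchen'), (2, 'Lamp', 'Lighting');
        INSERT INTO fact_sales VALUES (1, 10), (1, 5), (2, 8);
        INSERT INTO fact_returns VALUES (1, 3), (2, 1);
        """
    )
    return conn


def sold_and_returned_by_category(conn):
    """Drill across both marts through the shared dimension."""
    query = """
        WITH s AS (
            SELECT product_key, SUM(quantity) AS sold
            FROM fact_sales GROUP BY product_key
        ),
        r AS (
            SELECT product_key, SUM(quantity) AS returned
            FROM fact_returns GROUP BY product_key
        )
        SELECT d.category, SUM(s.sold), SUM(COALESCE(r.returned, 0))
        FROM dim_product d
        JOIN s ON s.product_key = d.product_key
        LEFT JOIN r ON r.product_key = d.product_key
        GROUP BY d.category
        ORDER BY d.category
    """
    return conn.execute(query).fetchall()
```

Each fact table is aggregated separately before the join, which avoids double counting when a product has several rows in both marts.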
11. DATA ARCHITECTURE EVOLUTION
Diagram: four architectures plotted along a Complexity axis
– Hub-and-Spoke: Data Sources, Data Acquisition, Data Integration, Data Warehouse, Data Marts, RDBMS, Reporting, Data Apps, Analytics
– Polyglot (Data Federation, Data Virtualization, Logical Data Warehouse): Data Sources, Data Acquisition, Data Integration, Data Warehouse, Data Marts, RDBMS, Query Engine, Reporting, Data Apps, Analytics
– Kappa (Databus): Data Sources, Data Ingest (Messaging, CDC, Bulk Copy, Files, Data Extractor), Speed Layer, Pipeline Manager, Data Lake, Serving Layer (REST, SQL, Pub/Sub), Data Integration, Data Warehouse, Data Marts, RDBMS, Reporting, Data Apps, Analytics
– Lambda: Data Sources, Data Ingest (Messaging, CDC, Bulk Copy, Files, Data Extractor), Speed Layer, Batch Layer, Pipeline Manager, Object Storage, Data Lake, Serving Layer (REST, SQL, Pub/Sub), Data Integration, Data Warehouse, Data Marts, RDBMS, Reporting, Data Apps, Analytics
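The difference between the speed layer and the batch layer in the Lambda layout can be shown with a toy event counter. This is a minimal sketch, assuming an in-memory list as the master dataset; the class `LambdaCounter` and its methods are hypothetical names, not part of any framework.

```python
from collections import Counter


class LambdaCounter:
    """Counts events per key from a batch view plus a real-time view."""

    def __init__(self):
        self.master_log = []         # append-only master dataset
        self.batch_view = Counter()  # rebuilt from scratch by the batch layer
        self.speed_view = Counter()  # incremental view of events since the last batch

    def ingest(self, key):
        """Data ingest: every event goes to the master log and the speed layer."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        """Batch layer: recompute the full view, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        """Serving layer: merge the batch and real-time views."""
        return self.batch_view[key] + self.speed_view[key]
```

In a Kappa-style setup the separate `run_batch` path would disappear: a view would be rebuilt by replaying the log through the same streaming code, which is one reason that layout needs fewer moving parts.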
12. DATA INTEGRATION PATTERNS
Diagram columns: Source, Mediator, Target
– ETL: Extract, Transform, Load
– ELT: Extract, Load, Transform
– TEL: Transform, Extract, Load
– ETLT: Extract, Transform, Load, Transform
– API: API Call, API Logic, Data API
– CDC: Change Capture, Extract, Transport, Replication, Load
– Pub/Sub: Publisher, Broker, Subscription
– Data Pipeline
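ETL and ELT differ only in where the transformation runs: in the mediator before loading, or inside the target after loading. The sketch below contrasts the two on the same input. The sample rows, table names, and function names are hypothetical, with Python playing the mediator and SQLite the target.

```python
import sqlite3

# Hypothetical raw extract: untrimmed names, amounts delivered as text.
SOURCE_ROWS = [(" alice ", "10"), ("BOB", "20"), ("alice", "5")]


def extract():
    """Extract: read rows from the source (here, a constant list)."""
    return list(SOURCE_ROWS)


def etl(conn):
    """ETL: transform in the mediator (Python), then load clean rows."""
    clean = [(name.strip().lower(), int(amount)) for name, amount in extract()]
    conn.execute("CREATE TABLE target_etl (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO target_etl VALUES (?, ?)", clean)


def elt(conn):
    """ELT: load raw rows first, then transform inside the target with SQL."""
    conn.execute("CREATE TABLE raw_elt (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_elt VALUES (?, ?)", extract())
    conn.execute(
        """
        CREATE TABLE target_elt AS
        SELECT LOWER(TRIM(name)) AS name, CAST(amount AS INTEGER) AS amount
        FROM raw_elt
        """
    )


def totals(conn, table):
    """Sum amounts per name; `table` must be a trusted, hard-coded name."""
    return conn.execute(
        f"SELECT name, SUM(amount) FROM {table} GROUP BY name ORDER BY name"
    ).fetchall()
```

Both routes should end with the same target content; ELT additionally keeps the raw rows in the target, which costs storage but allows the transformation to be rerun without going back to the source.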
13. CLOUD & CLOUD COMPUTING LONG HISTORY
– About 4 billion years ago: the first cloud on Earth
– 1977: the cloud icon was used for a large computer network
– 1994: Andy Hertzfeld used the term for a computing platform
– 2006: Amazon Web Services
– 2008: Google Cloud Platform
– 2008: Alibaba Cloud
– 2010: Microsoft Azure
– 2012: Oracle Cloud
– 2014: IBM Bluemix
15. CLOUD WARS
Source: https://cloudwars.co
Top 10 Cloud Vendors Hit $37 Billion in Q2 As Microsoft, Amazon Drive 50%
Why Microsoft’s Beating Amazon: Cloud Deals with SAP, Oracle and ServiceNow
Microsoft and Oracle to interconnect Microsoft Azure and Oracle Cloud
18. MODERN DATA MANAGEMENT SUMMARY
Diagram: overview map. Areas: Data Ingest, Data Integration, Data Management, Architecture, Data Model, Database, Data Repository, Deployment, Data Usage
Topics shown on the map: E-R Model, Hub & Spoke, Kappa / Databus, Graph Data Model, Key Management, Data Discovery, Data Science, On-premise, Cloud, Hybrid Cloud, Multi-Cloud, Data Warehouse, Data Mart, Sandbox, Business Intelligence, Reporting, Machine Learning, Data Lake, RDBMS, In-memory, Document Store, Multidimensional DB, Graph DBMS, Columnar DBMS, Object Store, NoSQL, Multidimensional Model, Data Archive, Time Variance, Data Latency, Audit, Data Tiering, Data Retention, Data Security, Automation, Orchestration, Aggregation, Reconciliation, ETL, Cleansing, Standardization, Data Loading, Data Replication, Change Data Capture, Manual Inputs, Stream Processing, Legacy, Lambda, Operational Data Store, Snowflake Schema, Big Data Fabric, Star Schema, Metadata, Reference Data Management, Data Catalog, Data Governance, Data Querying, Data Literacy, Master Data Management, TCO Management, Governance, Polyglot, Key-Value, Column Family, Data API, File Repository, Distributed File System, Data Pipelines, Master Data Repository