Maxim Tereschenko (BigData Lead, Provectus) with the talk "Analytical Systems Evolution - From Excel to Big Data Platforms and Data Lakes".
Description: Over the last ten years, analytical systems have changed dramatically. From Excel and Data Warehouses, we have moved to Big Data platforms and Data Lakes. It is no longer a fantasy to talk to an analytical system by voice or to walk among data visualizations wearing 3D glasses. In this talk, I want to trace this evolution, identify its main trends, and speculate about the future.
2. About Me
Timeline: 2005, 2008, 2009, 2010, 2015, 2017
Roles, from BI Developer onward: BI Developer, BI Business Analyst, BI Consultant, Delivery Manager, Product Owner, Practice Lead, Business Development
Company types: Consulting, Product, Outsourcing, Enterprise, Consulting
3. Agenda
● Data Analysis Stages
● Relational Data Warehouse
● Extended Relational Data Warehouse
● Big Data Challenges
● Modern Analytic Landscape
● Big Data Platform
● Data Lakes
● Future Trends Predictions
● BigData & Data Analytics Practice
10. Ext DWH Use Cases
Criteria: Agility, Scalability, Cost, Performance, Consistency, Velocity, Security, Data Type(s)
Use Cases: Corporate Reporting, Pixel Perfect Reporting, Ad-hoc Analysis, Real-Time Analytics, Advanced Analytics, All Data Analysis, Self-Service BI
11. Big Data Challenges
> 1 billion users
> 3 billion photos daily (12 000 per sec)
> 5 billion comments daily (58 000 per sec)
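The per-second figures on this slide follow from dividing a daily total by the 86,400 seconds in a day. The sketch below is only an illustration of that arithmetic; the helper names `per_second` and `per_day` are hypothetical and not part of the talk. Under a flat daily average, 5 billion comments per day comes to roughly 58,000 per second, which matches the slide. The photo rate of 12,000 per second corresponds to about 1 billion per day, so the slide's photo numbers may rest on a different basis that the slide does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_count: int) -> float:
    """Average number of events per second for a given daily total."""
    return daily_count / SECONDS_PER_DAY


def per_day(rate_per_second: int) -> int:
    """Daily total implied by a constant per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    # Daily totals taken from the slide.
    for label, daily in [("photos", 3_000_000_000), ("comments", 5_000_000_000)]:
        print(f"{label}: {per_second(daily):,.0f} per second on average")

    # Per-second rates taken from the slide, converted back to daily totals.
    for label, rate in [("photos", 12_000), ("comments", 58_000)]:
        print(f"{label}: {per_day(rate):,} per day at {rate:,} per second")
```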
12. Typical Big Data Challenges
Data Sources: Archives, Docs, Business Apps, Media, Social Networks, Public Web, Data Storages, Machine Log Data, Sensor Data
Structure: Unstructured, Structured
Dimensions: Velocity, Variety, Volume, Complexity
Levels: High, Medium, Low
Architecture Concerns:
• Scalability
• Performance
• Extensibility
• Data Quality
• Fault-Tolerance and Availability
• Security
• Cost
• Skills Availability
14. Big Data Questions
Consumers: Business Users, Intelligent Agents, Data Scientists/Analysts
Areas: Data Discovery, Dashboards and Business Reporting, Real-Time Intelligence
Questions:
● How to provide real-time dashboards or self-service BI with high data quality and good performance over terabytes and petabytes?
● How to enable the Data Science/Advanced Analytics team for predictive and advanced analytics?
● How to implement recommendations or anomaly detection while achieving low latency?
Operations
15. Modern Analytic Landscape
A modern integrated approach for solving Big Data/Business Analytics needs across multiple verticals and domains
All Data: Application data; Media data (images, video, etc.); Social data; Enterprise content data; Machine, sensor, log data; Docs and archives data
Components: Data Acquisition and Storing; Data Integration; Real-time Data Processing; Complex Event Processing; Real-time Query and Search; Enterprise Data Warehousing; Data Lake (Landing, Exploration and Archiving); Data Management (Governance, Security, Quality, MDM); Analytics; Reporting and Analysis; Predictive Modeling; Data Mining; UX and Visualization
Applications: Customer Analytics; Marketing Analytics; Web/Mobile/Social Analytics; IT Operational Analytics; Fraud and Risk Analytics
20. Big Data Platform
Criteria: Agility, Scalability, Cost, Performance, Consistency, Velocity, Security, Data Type(s)
Use Cases: Real-Time Analytics, Self-Service BI, Streaming, Pixel Perfect Reporting, Advanced Analytics, All Data Analysis, Corporate Reporting
21. Data Lakes
This is not what I had in mind when I wanted to spend a couple of days at the lake…
22. Data Lake. What’s the difference?
All Data, All Data Types
Easy To Change
Fast Insights
24. Data Lakes Architecture (Example)
https://www.searchtechnologies.com/blog/search-data-lake-with-big-data
25. Data Lakes
Criteria: Agility, Scalability, Cost, Performance, Consistency, Velocity, Security, Data Type(s)
Use Cases: Self-Service BI, Advanced Analytics, Predictive Analytics, All Data Analysis, Text Mining, Pixel Perfect Reporting, Corporate Reporting
26. Future Predictions by Gartner
● Next-Generation Data Discovery
● Smart Data Discovery Capabilities
● Natural-Language Generation and Artificial Intelligence
● 50% of analytic queries will be generated using search, natural-language processing or voice, or will be autogenerated
● Organizations that offer users access to a curated catalog of internal and external data will realize twice the business value from analytics investments than those that do not
https://www.gartner.com/doc/reprints?id=1-3TYE0CD&ct=170221&st=sb