Business intelligence (BI) and data analytics are increasing in popularity as more organizations are looking to become more data-driven. Many tools have powerful visualization techniques that can create dynamic displays of critical information. To ensure that the data displayed on these visualizations is accurate and timely, a strong Data Architecture is needed. Join this webinar to understand how to create a robust Data Architecture for BI and data analytics that takes both business and technology needs into consideration.
Business Intelligence & Data Analytics: An Architected Approach
1. Copyright Global Data Strategy, Ltd. 2022
Business Intelligence & Data Analytics:
An Architected Approach
Donna Burbank
Global Data Strategy, Ltd.
June 23rd, 2022
3. KATANA GRAPH |
TM
Company Overview
High Performance Scale-out Graph Processing & Analytics
Founded in March 2020, offices in Austin, Bay Area, NYC, Denver
Co-founders: Keshav Pingali and Chris Rossbach
Investors: Intel Capital, Dell Venture Capital, Redline Ventures, Walden International
Katana team: Leaders in graph algorithms, programming languages, runtimes, virtualization and storage.
Commercial engagements with several Fortune 100 companies
Website: www.katanagraph.com
4. KATANA GRAPH |
TM
Leadership Team
• Gurbinder Gill: PhD UT Austin; VMware, Facebook, MSR, IBM Research
• Roshan Dathathri: PhD UT Austin; NI, MSR, HP Labs
• Emmett Witchel: Prof UT Austin; InCert, Veritas, Symantec
• Bo Wu: Prof Colorado School of Mines; Graph mining expert
• Donald Nguyen: PhD UT Austin; Google, Synthace, Determined AI
• Tyler Hunt: PhD UT Austin; MSR, Visa Research, Bell Labs
• Jon Currey: University of Cambridge; Distributed Systems, Machine Learning; MSR, Apple (iTunes), Oracle
• Yige Hu: PhD UT Austin; File System, Fault Tolerance
• Amy Chang: Board Advisor; BOD P&G, Cisco, Disney; UCSF Hospital Exec Committee; Deans Advisory Council, Stanford University
• Ying Ding: Data Science Advisor; Professor UT Austin; Medical/Pharma Knowledge Graph, Machine Learning; Co-founder Data2Discovery
• Keshav Pingali: CEO, Co-founder; Prof UT Austin; Fellow ACM, IEEE, AAAS
• Chris Rossbach: CTO, Co-founder; Prof UT Austin; MSR, VMware, Canesta
• Farshid Sabet: CBO, Co-founder; Intel, Movidius, Aptina, SanDisk
5. KATANA GRAPH |
TM
Graph Technology Application Areas
Platforms
• Finance, Healthcare, Retail, Energy, Industrial, Telecom, Genomics
• Anti Money Laundering, Drug Discovery, Identity Graph, Precision Medicine, Electronic Circuit Design Tools, Knowledge Graph, Predictive Monitoring, Intrusion Detection, Supply Chain Optimization, Fraud Detection, Real Time Analytics, Customer 360, Recommendation, Social Networks
6. KATANA GRAPH |
TM
Why Katana Graph
Architected to handle massive graphs
• Tested with largest publicly available
web-crawl: WDC12 (3.5B vertices, 128B edges)
Unmatched performance
• 10x to 100x faster than competing solutions
Massive scalability
• Proven on Open Cloud HPC Clusters (AWS, Azure, Google Cloud)
• Scales up to 256 machines on Stampede Xeon (Skylake) Cluster
Native AI/ML with Graphs
• Health and Life Sciences (HLS), Financial, Identity
Management, Intrusion detection, EDA (Electronic
Design Automation), HPC (High Performance
Computing) application: 3D mesh generation
8. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Donna Burbank
Donna is a recognised industry expert in
information management with over 20 years
of experience in data strategy, information
management, data modeling, metadata
management, and enterprise architecture.
Her background is multi-faceted across
consulting, product development, product
management, brand strategy, marketing, and
business leadership.
She is currently the Managing Director at
Global Data Strategy, Ltd., an international
information management consulting company
that specializes in the alignment of business
drivers with data-centric technology.
In past roles, she has served in key brand
strategy and product management roles at CA
Technologies and Embarcadero Technologies
for several of the leading data management
products in the market.
As an active contributor to the data
management community, she is a long-time
DAMA International member, Past President
and Advisor to the DAMA Rocky Mountain
chapter, and was awarded the Excellence in
Data Management Award from DAMA
International.
She has worked with dozens of Fortune 500
companies worldwide in the Americas,
Europe, Asia, and Africa and speaks regularly
at industry conferences. She has co-authored
several books and is a regular contributor to
industry publications. She can be reached at
donna.burbank@globaldatastrategy.com
Donna is based in Boulder, Colorado, USA.
Follow on Twitter @donnaburbank
@GlobalDataStrat
Twitter Event hashtag: #DAStrategies
9. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
DATAVERSITY Data Architecture Strategies
This Year’s Lineup
• January: Emerging Trends in Data Architecture – What’s the Next Big Thing?
• February: Building a Data Strategy - Practical Steps for Aligning with Business Goals
• March: Master Data Management – Aligning Data, Process, and Governance
• April: Data Governance & Data Architecture: Alignment & Synergies
• May: Improving Data Literacy Around Data Architecture
• June: Business Intelligence & Data Analytics: An Architected Approach
• July: Best Practices in Metadata Management
• August: Data Quality Best Practices
• September: Business-centric Data Modeling
• October: Graph Databases: Benefits & Risks
• December: Enterprise Architecture vs. Data Architecture
10. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
What We’ll Cover Today
• Business intelligence (BI) and data analytics are increasing in popularity as more organizations are looking to become more data-driven.
• Many tools have powerful visualization techniques that can create dynamic displays of critical information.
• To ensure that the data displayed on these visualizations is accurate and timely, a strong Data Architecture is needed.
• This webinar will discuss how to create a robust Data Architecture for BI and data analytics that takes both business and technology needs into consideration.
11. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Data-Driven Business
70% of respondents feel that their organization sees data as a strategic asset.*
70% of respondents indicated that reporting and analytics were key drivers for data management.**
>50% identified improved collaboration through using a defined data architecture.**
* based on research from a 2019 DATAVERSITY survey on “Trends in Data Management” by Donna Burbank and Michelle Knight
** based on research from a 2021 DATAVERSITY survey on “Trends in Data Management” by Donna Burbank and Michelle Knight
12. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Main Business Goals & Drivers for Data Management
[Bar chart: percentage of respondents selecting each goal (select all that apply), axis 0% to 80%]
• Gaining Competitive Advantage
• Improving Outcomes (e.g. health, education, etc.)
• Improving Product Quality
• Increasing Revenue and Growth
• Improving Customer Satisfaction
• Complying with Regulations
• Saving Cost and Increasing Efficiency
• Reducing Risk
• Supporting Digital Transformation
• Gaining Insights through Reporting and Analytics
13. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Supporting Reporting & Analytics
[Figure: ACME Inc. Sales Dashboard, filtered by Product: Widget 1 and Region: NA, covering 2018 to 2022; legend: Super Widget Pack, Widget 1, Widget 2]
Successful reporting & analytics includes:
• Data-driven culture
  • Do we use dashboards in our sales meetings, or go by “gut feel”?
  • How can we integrate analytics into our sales cycle (e.g. predictive next best offer)?
• Data Governance
  • How do we define “Total Revenue”?
  • What countries are included in South America?
• Data Quality
  • Are these revenue numbers accurate?
  • What’s the source of the product data?
• Data Architecture
  • How are we storing the data so that we can accurately & efficiently slice and dice it for these reports?
What about the data?
14. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
What is the Correct Architecture to Power Reporting & Analytics?
… There is a Cacophony of Options …
• Data Warehouse
• Data Lake
• Data Lake House
• Data Marketplace
• Metadata Catalog
Relational, Nonrelational, Star Schema, SQL, NoSQL, Graph, Document Store, Real-time Streaming, Time series…
15. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
What are Current Organizational Priorities?
* based on research from a 2021 DATAVERSITY survey on “Trends in Data Management” by Donna Burbank and Michelle Knight
16. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Using a Data Lake in Conjunction with a Data Warehouse
17. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Integrating Multiple Paradigms
• The Data Lake has a different architecture & purpose than traditional data sources such as data warehouses.
• But the two environments can co-exist to share relevant information.
[Diagram: the two environments side by side, with shared layers]
• Data Analysis & Discovery – Data Lake: Sandbox, Lightly Modeled Data, Data Exploration
• Enterprise Systems of Record: Master & Reference Data, Data Warehouse, Data Marts, Operational Data
• Reporting & Analytics: Advanced Analytics, Self-Service BI, Standard BI Reports
• Shared layers: Data Governance & Collaboration, Security & Privacy
18. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
A Holistic Approach is Needed
• Level 1: “Top-Down” alignment with business priorities
• Level 2: Managing the people, process, policies & culture around data
• Level 3: Leveraging data for strategic advantage
• Level 4: Coordinating & integrating disparate data sources
• Level 5: “Bottom-Up” management & inventory of data sources
19. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
The Design Aspect of
Data Architecture for BI & Analytics
20. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
A little data modeling up-front … prevents headaches down the road
• It’s often tempting to skip data modeling documentation because it’s “faster”.
• But long-term, it ultimately takes longer, because the resulting errors and inconsistencies need to be fixed.
“If you don’t have time to do it right, do you have time to do it again?”
From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
21. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Levels of Data Models
• Enterprise: Subject Areas
  Purpose: Organization & Scoping of main business domain areas
  Audience: Business Stakeholders, Data Architects
• Conceptual: Business Concepts
  Purpose: Communication & Definition of Business Concepts & Rules
  Audience: Business Stakeholders, Data Architects
• Logical: Data Entities
  Purpose: Clarification & Detail of Business Rules & Data Structures
  Audience: Data Architects, Business Analysts
• Physical: Physical Tables
  Purpose: Technical Implementation on a Physical Database
  Audience: DBAs, Developers
22. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Different Physical Models for Different Use Cases
Relational – Normal Form
• Reduce redundancy for operational data
• Increase data quality
• Ensure consistency (ACID transactions)
Dimensional – Star Schema
• Ease of reporting for summarized and historical data
• Ability to easily “slice and dice” for self-service reporting
• Performance and flexibility
NoSQL
• Speed of retrieval, low latency
• High data volumes
• Flexibility for change
…And More!
• There are numerous ways to model and store data.
• Hierarchical/XML
• Graph
• COBOL Copybook!
• S3 “buckets”
• Data Vault
• Etc…
No modeling technique is inherently “better” than another. Data use cases & purpose drive what “good” looks like.
…Rant over…
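The trade-off between a normalized form (less redundancy) and a flattened form (easier reporting) can be sketched in a few lines of Python. This is only an illustration: the customer and order records, the field names, and the helper functions `normalize` and `flatten` are all hypothetical and do not come from the webinar.

```python
# Hypothetical flat rows: the customer's city is repeated on every order.
flat_orders = [
    {"order_id": 1, "customer": "Acme", "city": "Boulder", "amount": 100},
    {"order_id": 2, "customer": "Acme", "city": "Boulder", "amount": 250},
    {"order_id": 3, "customer": "Zenith", "city": "Denver", "amount": 75},
]


def normalize(rows):
    """Split flat rows into a customer table and an order table."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer's attributes are stored once, keyed by customer name.
        customers[row["customer"]] = {"city": row["city"]}
        orders.append(
            {
                "order_id": row["order_id"],
                "customer": row["customer"],
                "amount": row["amount"],
            }
        )
    return customers, orders


def flatten(customers, orders):
    """Join orders back to customers to rebuild reporting-friendly flat rows."""
    return [{**order, "city": customers[order["customer"]]["city"]} for order in orders]


customers, orders = normalize(flat_orders)

# In the normalized form, a change of city is a single update...
customers["Acme"]["city"] = "Austin"

# ...and every flattened row for that customer picks it up consistently.
rebuilt = flatten(customers, orders)
print(rebuilt)
```

In the flat form the same change would have to be applied to every Acme row, which is where inconsistencies tend to creep in. The flat form, on the other hand, needs no join at read time.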
23. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Is the Star Schema Dead?
24. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
The Star Schema
[Diagram: a central Fact (Measure) table surrounded by five Dimension tables. Example: Sales, reported By Month, By Customer, By Region, By Sales Rep, By Product]
Facts/Measures: Contain the actual values to be reported on. What are we measuring? e.g. Activities (sales transaction, patient visit, etc.)
• Few attributes (just numbers with links to the dimensions)
• Many values (e.g. all sales transactions)
Dimensions: Contain the details that describe the central fact, i.e. the things we want to report by, e.g. Date, Region, Quarter
• Many attributes (Individual name, DOB, gender, etc.)
• Few values
Note: Your Master Data domains often feed these dimensions.
The Star Schema is still a user-friendly and performant way to “slice and dice” data for reporting.
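As a minimal sketch of the Sales example, the following uses Python's built-in `sqlite3` module to build one fact table with two dimensions and then "slice" revenue by each of them. The table names, column names, sample rows, and the helper `revenue_by` are illustrative assumptions, not part of the webinar.

```python
import sqlite3

# Hypothetical star schema: one fact table linked to two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        revenue    REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget 1'), (2, 'Widget 2');
    INSERT INTO dim_region  VALUES (1, 'NA'), (2, 'EMEA');
    INSERT INTO fact_sales  VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 75.0), (2, 1, 25.0);
    """
)


def revenue_by(dimension_table, key, label):
    """Sum the revenue measure grouped by one dimension attribute.

    The identifiers are interpolated directly, so this helper is only
    suitable for trusted, hard-coded table and column names.
    """
    sql = (
        f"SELECT d.{label}, SUM(f.revenue) FROM fact_sales f "
        f"JOIN {dimension_table} d ON f.{key} = d.{key} "
        f"GROUP BY d.{label} ORDER BY d.{label}"
    )
    return conn.execute(sql).fetchall()


by_region = revenue_by("dim_region", "region_id", "region_name")
by_product = revenue_by("dim_product", "product_id", "product_name")
print(by_region)
print(by_product)
```

The fact table holds only the measure and the keys. Each new way of reporting is one more join to a dimension, not a change to the stored facts.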
25. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
The Bus Matrix
A Bus Matrix is a simple way to keep track of what you want to report “on” (Facts) and what you want to report “by” (Dimensions).
Report “by” (Dimensions): Location | Sales Rep | Product | Customer
Report “on” (Facts):
• Total Sales Revenue: X X X X
• Wholesale Revenue: X X
• Number of Returned Items
• Etc.
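A bus matrix can also be kept as a small data structure, which makes it easy to check whether a requested report is supported. The sketch below is hypothetical: the fact and dimension names echo the slide, but which dimensions apply to "Wholesale Revenue" and "Number of Returned Items" is an assumption made purely for illustration, as are the helper names.

```python
# Hypothetical bus matrix: each fact maps to the dimensions it can be reported by.
# The cell assignments for the last two facts are assumptions for illustration.
BUS_MATRIX = {
    "Total Sales Revenue": {"Location", "Sales Rep", "Product", "Customer"},
    "Wholesale Revenue": {"Location", "Product"},
    "Number of Returned Items": {"Product", "Customer"},
}


def can_report(fact, dimension):
    """Return True if the matrix marks this fact/dimension cell with an X."""
    return dimension in BUS_MATRIX.get(fact, set())


def facts_by_dimension(dimension):
    """List every fact that can be sliced by the given dimension."""
    return sorted(fact for fact, dims in BUS_MATRIX.items() if dimension in dims)


def render(matrix):
    """Render the matrix as plain-text rows of X marks."""
    dims = sorted({d for ds in matrix.values() for d in ds})
    lines = [" | ".join(["Fact"] + dims)]
    for fact, ds in matrix.items():
        marks = ["X" if d in ds else "-" for d in dims]
        lines.append(" | ".join([fact] + marks))
    return "\n".join(lines)


print(render(BUS_MATRIX))
print(facts_by_dimension("Customer"))
```

Keeping the matrix in one place gives business and technical stakeholders a shared checklist before any tables are built.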
26. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Design Patterns
There are a number of design patterns available to fit a variety of use cases (again – there is no “one size fits all”).
• Inmon vs. Kimball: The battle still rages...
• Data Vault: Hubs, Links and Satellites
• Flatten Everything: Popular with Data Science
• Columnar: Columns vs. Rows
• Graph: Good for discovering connections
• And More…: Choices abound…
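To make the "Columns vs. Rows" distinction concrete, here is a small sketch that pivots row-oriented records into a column-oriented layout. It is a toy illustration in plain Python, not how any particular columnar database is implemented; the sample records and the `to_columnar` helper are hypothetical.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"product": "Widget 1", "region": "NA", "revenue": 100.0},
    {"product": "Widget 1", "region": "EMEA", "revenue": 50.0},
    {"product": "Widget 2", "region": "NA", "revenue": 75.0},
]


def to_columnar(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columnar(rows)

# An aggregate over one measure only needs to read that one column.
total_revenue = sum(columns["revenue"])
print(columns["region"])
print(total_revenue)
```

Row layouts suit workloads that read or write whole records at a time, while column layouts suit analytic queries that scan a few attributes across many records.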
27. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
In a Typical Organization, there are many Use Cases for Data Models
The following is just a subset of options that exist….
• Operational Usage
  • Web Application: NoSQL Key Value Pair for web session info
  • Operational System: Relational Database for Operational Data
• Transfer / Exchange
  • JSON
  • XML
  • … Etc.
• Storage for Analytics / Reporting
  • Relational for Consistency & Standards
  • Reporting for Analytic “Slicing & Dicing”
  • Data Vault for Flexible Storage
• Consumption for Analytics & Reporting
  • Cubes for Business Intelligence Reporting
  • Flattened tables for Analytics & Data Science
• Master Data & Hierarchies for Data Quality & Consistency
• Graph Database for Connections & Patterns
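The "connections & patterns" use case for graphs can be illustrated with a breadth-first search over a small in-memory graph. This sketch does not use a graph database; the entities, relationships, and function names are all hypothetical and chosen only to show the kind of question a graph model answers directly.

```python
from collections import deque

# Hypothetical relationships between customers, accounts, and devices.
EDGES = [
    ("Customer A", "Account 1"),
    ("Account 1", "Device X"),
    ("Device X", "Account 2"),
    ("Account 2", "Customer B"),
    ("Customer C", "Account 3"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_connection(graph, start, goal):
    """Breadth-first search: return the shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbors so the traversal order is deterministic.
        for neighbor in sorted(graph[node]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(EDGES)
linked = find_connection(graph, "Customer A", "Customer B")
unlinked = find_connection(graph, "Customer A", "Customer C")
print(linked)
print(unlinked)
```

Here two customers turn out to be linked through a shared device. In a relational model, answering the same question would take a chain of self-joins whose depth is not known in advance.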
28. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Summary
• Analytics and Reporting are key priorities for today’s data-driven business.
• A strong data architecture is needed to support successful analytics.
• There are many choices in the marketplace, and at the same time, core fundamentals still apply.
• Choose your architecture wisely, and have fun and success with the numerous options available in today’s market.
30. Global Data Strategy, Ltd. 2022
Who We Are: Business-Focused Data Strategy
Maximize the Organizational Value of Your Data Investment
In today’s business environment, showing rapid time to value for
any technical investment is critical.
But technology and data can be complex. At Global Data Strategy,
we help demystify technical complexity to help you:
• Demonstrate the ROI and business value of data to your
management
• Build a data strategy at your pace to match your unique culture
and organizational style.
• Create an actionable roadmap for “quick wins”, while building
towards a long-term scalable architecture.
Global Data Strategy shares experience from some of the largest
international organizations scaled to the pace of your unique team.
www.globaldatastrategy.com
Global Data Strategy has worked with organizations globally in the
following industries:
Finance · Retail · Social Services · Health Care · Education · Manufacturing
· Government · Public Utilities · Construction · Media & Entertainment ·
Insurance …. and more
31. Global Data Strategy, Ltd. 2022 www.globaldatastrategy.com
Questions?
Thoughts? Ideas?