2. OBJECTIVE
This presentation will cover some
fundamental principles around designing
systems with the goal of enabling
businesses to make data-driven decisions.
3. AGENDA
• Typical Users & Ecosystem of Data
• Pitfalls & Benefits of Data Design
• Data Governance
• The Future of Predictive Analytics
• Data Tools & Resources
4. TYPICAL USAGE IN AN ORGANIZATION
Type of User | Usage Explained
Day-to-day users of a system | Need daily hands-on interaction with the data, such as a transactional system for managing a particular line of business or work
Managers or team leads | Need to oversee the work, and data reports will help here if designed right
Leaders or company executives | Need to see key performance indicator (KPI) metrics in order to run a business and make decisions for business success
5. ECOSYSTEM OF SOURCE DATA – STANDARD OUTPUTS
[Diagram: Source Data at the center, feeding the following standard outputs]
• Day to Day User Queues
• Manager & Team Lead Dashboards
• Leadership & Executive Dashboards/KPIs
• Data Science & Artificial Intelligence Data Trends
• Standard & Ad hoc Built in Reports
• Files Extracted & System Integrations
6. CLASSIC PITFALLS
What happens when a system and its data are designed poorly?
• Creates a lot of rework and data entry
• Band-Aid solutions develop, such as one-off Excel spreadsheets, to cover the gaps that were missed
• Often creates a lot of manual work for status reporting when data is not captured correctly at the source
• Users do not update data correctly, causing hours of downstream cleanup work
• Systems die or are terminated for lack of user adoption
• Other?
7. BENEFITS OF DOING IT RIGHT
Data is a company’s primary asset and should be recognized for its value.
• Allows for scalability and growth
• Sets the foundation for automation to occur
• Reduces the administrative burden
• Enables businesses to focus on what they are chartered to do
• Other?
8. DESIGN FOR HIGH QUALITY
Requirements, requirements, requirements… Really understand what the
user needs to do their work. Ask why and how, do job shadows, prototype
solutions, etc.
Keep it simple… Tables should use a normalized data design with referential
integrity to reduce redundancy.
Keep it clear… Avoid acronyms, and if they are used make sure they are defined.
Use common naming conventions to avoid confusion or misuse.
Quality controls… Put in field-level edits from the start, such as defined
field types (e.g. text, date/time) and drop-down/lookup lists, to
have a solid foundation.
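The design points above (normalized tables, referential integrity, defined field types and lookup lists) can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (request_status, work_request), the column names, and the helper try_insert are hypothetical and do not come from the presentation.

```python
import sqlite3

# Hypothetical schema for illustration only; all names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE request_status (            -- lookup list that would back a drop-down
    status_id   INTEGER PRIMARY KEY,
    status_name TEXT NOT NULL UNIQUE
);
CREATE TABLE work_request (
    request_id INTEGER PRIMARY KEY,
    title      TEXT NOT NULL CHECK (length(title) > 0),
    -- field-level edit: the date must look like YYYY-MM-DD
    opened_on  TEXT NOT NULL
               CHECK (opened_on GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'),
    -- referential integrity: status must exist in the lookup list
    status_id  INTEGER NOT NULL REFERENCES request_status(status_id)
);
""")

conn.executemany(
    "INSERT INTO request_status (status_name) VALUES (?)",
    [("Open",), ("Closed",)],
)
conn.commit()


def try_insert(title, opened_on, status_id):
    """Return True if the row passes every constraint, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO work_request (title, opened_on, status_id) "
                "VALUES (?, ?, ?)",
                (title, opened_on, status_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this sketch, a row with a valid title, ISO date, and known status is accepted, while a row that points at an unknown status, uses a free-form date, or has an empty title is rejected at the source instead of being cleaned up downstream.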
11. THE FUTURE OF PREDICTIVE ANALYTICS
• Once data is captured at scale, artificial
intelligence (AI), data science and
machine learning can take things one
step further
• This allows data to be analyzed en masse
for predictive analytics, anticipating
future trends based on past data
Supervised learning trains a model on labeled examples from previous experience, so
that it can predict the output for new data.
Unsupervised learning is a machine learning technique where you do not need to
supervise the model. Instead, you allow the model to work on its own to
discover information in unlabeled data.
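The difference between the two approaches can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production method: fit_line uses ordinary least squares as a stand-in for supervised learning, two_means uses a one-dimensional k-means with two groups as a stand-in for unsupervised learning, and the function names and sample numbers are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: learn slope and intercept from labeled (x, y) pairs
    using ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1-D values into two groups
    (k-means with k = 2); no labels are supplied."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Hypothetical past data: ticket counts per month (labeled with the known outcome).
months = [1, 2, 3, 4, 5]
tickets = [110, 120, 130, 140, 150]
slope, intercept = fit_line(months, tickets)
forecast_month_6 = slope * 6 + intercept  # prediction based on past data

# Hypothetical unlabeled values that the model groups on its own.
cohorts = two_means([1, 2, 3, 20, 21, 22])
```

The supervised function needs the known outcomes (tickets) to learn from, while the unsupervised function is given only raw values and finds the groupings itself.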
12. REAL WORLD EXAMPLES OF AI
Social Media
• Was one of the pioneers, due to open-source code and no regulations on the data
• Examples – Analyzing text, pictures, avoiding propaganda, deciding content flow, advertising, data gathering
https://klintmarketing.com/ai-social-media/
Healthcare
• Now emerging for trending and improving overall health outcomes based on prior patient data
• Example 1 – Hospitals predicting length of stay in beds for patients, formerly done by tribal knowledge
• Example 2 – Pharmacy trends, such as identifying high-usage patients and the prescribing of certain medications, e.g. the
Advocate Aurora Health opioid study https://customers.microsoft.com/en-us/story/811799-AAH-Kensci-Azure
Technical Support
• In terms of monitoring and chatbots for user support
• Example – Data center technical issues and failures
https://www.forbes.com/sites/cognitiveworld/2019/05/31/exploring-the-impact-of-ai-in-the-data-center/#23e2b88067c4
Other?
13. A FEW MACHINE LEARNING CHARTS
Theil-Sen Slope: A robust trend line, used for finding outliers in the data
Cluster Chart: Used for displaying cohorts/groupings in the data
Stacked Bar Chart: Used for locating solid bar trends in the data
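For the first chart type, a minimal sketch of the Theil-Sen estimator follows, assuming its usual definition: the slope is the median of the slopes between every pair of points, which keeps a few extreme values from pulling the trend line. The helper flag_outliers, its threshold parameter, and the sample data are hypothetical and only show one way such a line could be used to spot outliers.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Return (slope, intercept) where the slope is the median of all
    pairwise slopes and the intercept is the median of y - slope * x."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip vertical pairs
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def flag_outliers(xs, ys, threshold):
    """Hypothetical helper: points whose distance from the robust
    trend line exceeds the given threshold."""
    slope, intercept = theil_sen(xs, ys)
    return [
        (x, y)
        for x, y in zip(xs, ys)
        if abs(y - (slope * x + intercept)) > threshold
    ]


# Hypothetical data: a steady trend with one bad reading at x = 5.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 50, 12]
suspects = flag_outliers(xs, ys, threshold=5)
```

Because the median ignores the handful of slopes distorted by the bad reading, the fitted line follows the bulk of the data and the bad reading stands out by its large residual.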
14. DATA TOOLS
• SQL Server, Oracle, other Backend Databases
• Excel
• SharePoint Lists
• MS Access
• Power BI, Tableau, other Data Visualization Tools
• Azure Cloud, Data Warehouse and other Big Data Tools
• Other?
15. RESOURCES
Entity Relationship Diagram (ERD) / Normalized Data Design: https://www.visual-paradigm.com/guide/data-modeling/what-is-entity-relationship-diagram/
Data Governance Standards: https://www.informatica.com/solutions/what-is-intelligent-data-governance.html
Data Science Certification: https://learn-xpro.mit.edu/data-science