This document discusses data lineage and strategies for improving data governance. It begins with an overview of a webcast on data lineage and introduces the speakers. It then discusses the importance of data governance and the challenges organizations face with data complexity, volume, and regulatory compliance. It explores specific challenges to effective data lineage: transitioning to cloud, a growing number of data sources, and rising data volumes. The document presents a case study of a global bank that used data integration and quality tools to build an anti-money laundering process. It concludes with recommendations to assess current data understanding and lineage use within the next 90 days to strengthen governance.
2. Housekeeping
Webcast Audio
• Today’s webcast audio is streamed through your computer
speakers.
• If you need technical assistance with the web interface or audio,
please reach out to us using the Q&A box.
Questions Welcome
• Submit your questions at any time during the presentation using the
Q&A box.
• We will answer them during our Q&A session following the
presentation.
Recording and slides
• This webcast is being recorded. You will receive an email following
the webcast with a link to download both the recording and the
slides.
Andy Reid
Director, Product Marketing
Arianna Valentini
Product Marketing Manager
3. What You Will Learn Today
• Review of the ingredients of successful Big Data
• What is the cost of lost data governance
• Overcoming data lineage challenges
• How one company is using DI + DQ for lineage that
fuels their anti-money laundering requirements
• What you can do in the next 90 days to take action on
data lineage
• Wrap up with Q&A
4. Ingredients of Successful Big Data
1. Clear Business Case 2. Extract Data 3. Understand Data 4. Trace Lineage
Data Governance
5. The Cost of Lost Governance
64% of IT executives have
trouble finding and cleaning
the right data for strategic
data projects
Sierra Venture, 2020
90% of executives are concerned
about how misused data
can impact corporate
reputation
PWC, 22nd Annual Global CEO Survey, 2019
Only 2% of firms consider
themselves fully CCPA
compliant today
International Association of Privacy Professionals,
October 2019
GDPR Fines 2019: 27
$462,635,765
https://alpin.io/blog/gdpr-fines-list/
December 15, 2019
The importance of data quality
and integration in the enterprise:
• Compliance
• Decision making
• Customer centricity
• Brand reputation
• Risk Mitigation
6. Goals and Challenges of
Data Governance
GOALS
• Regulatory compliance
• Understand data context,
meaning
• Accuracy, completeness,
consistency, relevancy,
timeliness, validity of data
CHALLENGES
• Multi-platform, data
volume and complexity
• Diversity and consistency of
sources
• Compliance demands:
broader, deeper & evolving
7. Regulation Pressures Continue to Grow
Broader and deeper compliance & regulation
Volume and complexity of data are growing
May 2018 Jan 2020
9. Why is Data Lineage Important for Data Governance?
• See linkages to external data sources and
targets
• Gain insight into the flow of data across
the enterprise
• Trace usage and assess the impact of
changes across the data lifecycle
• Diagnose problems faster
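The points above can be illustrated with a small sketch that treats lineage as a directed graph: walking downstream shows what a change would affect, and walking upstream traces a report back to its sources. This is a minimal illustration only, not the method used by any product mentioned in this webcast; the class, method, and dataset names are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph where an edge source -> target means target is derived from source."""

    def __init__(self):
        self.downstream = defaultdict(set)
        self.upstream = defaultdict(set)

    def add_flow(self, source, target):
        # Record the flow in both directions so either end can be traced.
        self.downstream[source].add(target)
        self.upstream[target].add(source)

    def _walk(self, start, edges):
        # Breadth-first traversal collecting every node reachable from start.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in edges.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen

    def impact_of_change(self, dataset):
        """Everything downstream that a change to `dataset` could affect."""
        return self._walk(dataset, self.downstream)

    def trace_to_sources(self, dataset):
        """Everything upstream that `dataset` was derived from."""
        return self._walk(dataset, self.upstream)


# Hypothetical dataset names, for illustration only.
graph = LineageGraph()
graph.add_flow("mainframe.accounts", "staging.accounts")
graph.add_flow("vendor_feed.sanctions", "staging.sanctions")
graph.add_flow("staging.accounts", "lake.customer_profile")
graph.add_flow("staging.sanctions", "lake.customer_profile")
graph.add_flow("lake.customer_profile", "reports.aml_alerts")

print(sorted(graph.impact_of_change("mainframe.accounts")))
print(sorted(graph.trace_to_sources("reports.aml_alerts")))
```

Storing both directions keeps impact analysis and root-cause tracing equally cheap, which is the practical reason lineage is usually modeled as a graph rather than a flat list of jobs.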
10. Challenges to Effective Data Lineage
• Transitioning to new cloud deployments
• Increasing data lineage complexity
• Rising data volumes, sources, and variety
• Growing regulatory requirements
11. Growing Regulations
• Track data from access to integration to ensure sensitive
data is being used in a compliant way
• Regardless of the data source, mainframe, IBM i or cloud,
establish a process for lineage analysis
• See the flow of any piece of data through a job
• Consider how next-gen projects such as Machine
Learning might affect your data lineage processes
• Do you have what is needed for audits?
Data needs to meet quality levels but also be traced to original source
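One way to picture "seeing the flow of any piece of data through a job" is to have each job step append an entry to an audit trail that travels with the record. The sketch below is only an assumption about how such a trail could look; `TracedRecord`, `apply_step`, and the source name `mainframe.CUSTMAST` are hypothetical and do not describe any specific tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

_MISSING = object()  # sentinel so that newly added fields count as changed


@dataclass
class TracedRecord:
    value: dict
    origin: str  # the original source the record can be traced back to
    trail: list = field(default_factory=list)


def apply_step(record, step_name, transform):
    """Run one job step and log which fields it changed."""
    before = dict(record.value)
    record.value = transform(record.value)
    changed = sorted(
        key for key, new in record.value.items()
        if before.get(key, _MISSING) != new
    )
    record.trail.append({
        "step": step_name,
        "at": datetime.now(timezone.utc).isoformat(),
        "changed_fields": changed,
    })
    return record


# Hypothetical record and steps, for illustration only.
record = TracedRecord(
    value={"name": " ada lovelace ", "country": "gb"},
    origin="mainframe.CUSTMAST",
)
apply_step(record, "trim_name", lambda v: {**v, "name": v["name"].strip()})
apply_step(record, "uppercase_country", lambda v: {**v, "country": v["country"].upper()})

for entry in record.trail:
    print(entry["step"], entry["changed_fields"])
```

An auditor reading the trail can see the origin, the order of the steps, and which fields each step touched, without rerunning the job.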
12. Rising Data Volumes, Sources, and Variety
• Consider how you will address data lineage for a growing
expanse of data
• Do the integration solutions you use today create data
lineage challenges for source data?
• Example: mainframe data to a cloud data warehouse
• Establish data lineage processes that can cover requirements
for both batch and real-time data delivery
• Do not forget data quality!
Regardless of complexities, continuous trusted data delivery is a must
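As a rough sketch of the idea that one lineage process should cover both batch and real-time delivery without dropping data quality, the function below applies the same quality rule and emits the same lineage summary whether it receives a full batch or a single event. The rule, the field names, and the source and target names are assumptions made for illustration.

```python
# Hypothetical quality rule: these fields must be present and non-empty.
REQUIRED_FIELDS = ("account_id", "amount")


def passes_quality(record):
    return all(record.get(name) not in (None, "") for name in REQUIRED_FIELDS)


def deliver(records, source, target, mode):
    """Split records by quality and describe the delivery in one lineage summary."""
    delivered, rejected = [], []
    for record in records:
        (delivered if passes_quality(record) else rejected).append(record)
    lineage = {
        "source": source,
        "target": target,
        "mode": mode,  # "batch" or "real-time"
        "rows_in": len(delivered) + len(rejected),
        "rows_delivered": len(delivered),
        "rows_rejected": len(rejected),
    }
    return delivered, lineage


# Batch delivery: many records at once.
batch = [
    {"account_id": "A1", "amount": 120.0},
    {"account_id": "", "amount": 75.5},   # fails the quality rule
    {"account_id": "A3", "amount": 0.0},
]
_, batch_lineage = deliver(batch, "mainframe.transactions", "cloud_dw.transactions", "batch")

# Real-time delivery: the same code path, one event at a time.
event = {"account_id": "A4", "amount": 9.99}
_, stream_lineage = deliver([event], "cdc.transactions", "cloud_dw.transactions", "real-time")

print(batch_lineage)
print(stream_lineage)
```

Because both modes share one code path, the lineage records have the same shape and can be queried together.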
13. Increasing Data Lineage Complexity
• Consider whether you have auditability and transparency in your current
data lineage processes
• Need full insight into the flow of data across the enterprise
• Is there a clear link to external data sources and targets?
• As data moves through its life cycle, can you clearly trace usage
and assess the impact of changes?
As your environment complexity grows, you must have a data
lineage map to follow data throughout the enterprise
15. The Reality is… Cloud is Here
46% of IT professionals have said that
cloud or hybrid-cloud computing
was part of their 2019 initiatives
Data Trends for 2019, Syncsort 2019
84% of organizations have a multi-
cloud strategy
State of the Cloud 2019, Flexera
16. Transitioning to New Cloud Deployments
• When moving from source to cloud target, you need to pass
source-to-cluster data lineage information on
• Understand how a hybrid, multi-cloud, or full cloud deployment can
affect your data governance scalability
• Ask: How will this affect my current data lineage process?
• Consider which elements of your current DI/DQ strategy
need to adapt
Cloud deployments need to satisfy governance and compliance needs
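One possible way to pass lineage information on when data moves from a source to a cloud target is to write a small manifest next to each copied file. The sketch below uses a local temporary directory as a stand-in for the cloud landing area; the manifest layout, the `.lineage.json` suffix, and the function name are assumptions, not a description of any particular product.

```python
import json
import shutil
import tempfile
from pathlib import Path


def copy_with_lineage(source_path, target_dir, source_system):
    """Copy a file and write a lineage manifest beside the copy."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target_path = target_dir / source_path.name
    shutil.copy2(source_path, target_path)

    manifest = {
        "source_system": source_system,
        "source_path": str(source_path),
        "target_path": str(target_path),
        "bytes": target_path.stat().st_size,
    }
    manifest_path = target_path.with_name(target_path.name + ".lineage.json")
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path


# A temporary directory stands in for both the source and the cloud target.
with tempfile.TemporaryDirectory() as workdir:
    root = Path(workdir)
    payload = b"account_id,amount\nA1,120.00\n"
    source_file = root / "on_prem" / "transactions.csv"
    source_file.parent.mkdir()
    source_file.write_bytes(payload)

    manifest_file = copy_with_lineage(source_file, root / "cloud_landing", "mainframe")
    loaded = json.loads(manifest_file.read_text())
    print(loaded["source_system"], "->", loaded["target_path"])
```

Keeping the manifest beside the data means the lineage travels with the file even if the pipeline that produced it is later replaced.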
17. Global Bank
Building an AML process with DI + DQ
Goal
Meet AML transaction monitoring
and Financial Conduct Authority
(FCA) compliance
Challenges
• Data volume too large and too
widely scattered to analyze
• Disparate data sources –
Mainframe, RDBMS, Cloud,
etc.
• Maximize the value/ROI of the
data lake
Requirements
• Consolidated and clean data
• End-to-end data lineage
• Secure integrations
• Unmodified mainframe data
for archive/backup
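The requirement for unmodified mainframe data in the archive can be checked, in principle, by comparing a cryptographic fingerprint of the original bytes with one taken from the archived copy. The sketch below is an assumption about how such a check might look, not the bank's actual process; it uses Python's `cp037` codec as a stand-in for mainframe EBCDIC data, and the sample value is made up.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the raw bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def archive_is_unmodified(original: bytes, archived: bytes) -> bool:
    return fingerprint(original) == fingerprint(archived)


# Hypothetical mainframe field encoded in EBCDIC (code page 037).
original = "ACCT0001".encode("cp037")

# A byte-for-byte copy keeps the same fingerprint.
archived_raw = bytes(original)

# A copy that was converted to ASCII on the way holds the same text but different bytes.
archived_converted = original.decode("cp037").encode("ascii")

print(archive_is_unmodified(original, archived_raw))
print(archive_is_unmodified(original, archived_converted))
```

The second check fails even though the text reads the same, which is why an archive meant to hold unmodified mainframe data should store the raw bytes rather than a converted copy.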
18. Global Bank
Results: Data Integration Driving Improved CX
Solution
• Connect CDC
• Connect for Big Data
• Trillium for Big Data
Benefits Achieved
• High performance AML
results
• Faster time to value
• Data lake is trusted source
• Data feeding critical
machine learning-based
fraud detection
What’s Next
• Expanding to additional
Customer Engagement
solutions and applications
19. Looking at the Next 90 Days…
• Determine if you have an understanding of your
organizational data
• Consider how you use data lineage to support
governance today
• How will you use business lineage AND technical
lineage to ensure governance?