CIT modernized its data architecture in response to intense regulatory scrutiny. This presentation shows how data virtualization is being used to drive standardization, enable cross-company data integration, and serve as a common provisioning point from which to access all authoritative sources of data.
This presentation is part of the Fast Data Strategy Conference; you can watch the video here: goo.gl/CCqUeT.
Data Virtualization for Compliance – Creating a Controlled Data Environment
1. Data Virtualization for Compliance – Creating a Controlled Data Environment
Stan Sobol
Head of Data Architecture and Data Services
CIT Group, Inc.
2. Abstract
Data Virtualization for Compliance – Creating a Controlled Data Environment
Enterprises face a variety of data management challenges that influence their ability to leverage accurate, meaningful information quickly and efficiently. Data virtualization is an enabling technology which can address many of these challenges.
This session will explore how data virtualization is being used to dramatically reduce data proliferation and ensure that all consumers are working from a single source of the truth. It will also look at how data virtualization can drive standardization, measure & improve data quality, abstract data consumers from data providers, expose data lineage, enable cross-company data integration, and serve as a common provisioning point from which to access all authoritative sources of data.
3. The Problem: Data Everywhere
• Whether within IT or the business, employees find ways to access the data they need to do their jobs.
• Oftentimes, data is copied, processed offline (e.g., Excel & Access), and somehow fed back into the sausage grinder of data movement that exists in many enterprises.
• Data is often enriched and adjusted along the way, potentially resulting in inconsistent information across internal organizations, sometimes requiring additional reconciliations and duplicated effort.
• Self-serve data can be powerful, but it requires the guide rails of standards, access control, certified provisioning points, and strong data governance.
• Old habits die hard. Culture is a difficult thing to change.
4. Regulatory Backdrop
• Financial services companies are experiencing unprecedented regulatory scrutiny, with an increased focus on data management practices.
• Banks need to evolve their critical data flows to provide increased frequency, granularity, and auditable aggregation of the data used to manage risk.
• Systemically Important Financial Institutions (SIFIs), or “too big to fail” banks, must operate with strict controls around their data and increasingly mature their data management practices to meet evolving industry standards.
• Regulatory authorities continue to move the goalposts with publications like BCBS 239, which is positioned as a guideline but expected to become a requirement.
• Other industries like Pharma, Insurance, and Energy face similar challenges with their own data management practices and challenging regulatory requirements.
5. The Solution: The Data Services Layer (or “DSL”)
• A common provisioning point from which to access all authoritative sources of data.
• Beyond data integration capabilities, the DSL provides usage metering, monitoring of in-flight data movement, and orchestration of data APIs.
• The DSL is not a data repository; it is a framework to leverage data that is persisted, mastered, and managed elsewhere.
• Created with a collection of technologies, from traditional ETL and SFTP to more modern RESTful interfaces supported by data virtualization and API gateway technology.
• Provides metadata and lineage around data flows that leverage the DSL.
• Data virtualization can reduce unnecessary copies, the root of data proliferation.
• Consumers need to trust that historical data is durable and consistent.
• “Publish ready” APIs don’t just serve up data; they apply data quality monitoring rules and trigger data stewardship activities (sketched in code below).
• Data virtualization is a foundational technology within the DSL.
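To make the “publish ready” API idea concrete, here is a minimal Python sketch. It is not CIT's implementation; the rule names, record shapes, and the stewardship queue are illustrative assumptions. The provisioning endpoint runs data quality rules over each record before serving it and opens a stewardship task for every failure.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class QualityRule:
    """A named data quality check applied to each record before publication."""
    name: str
    check: Callable[[dict], bool]

@dataclass
class StewardshipQueue:
    """Collects failed-rule events for data stewards to remediate (illustrative)."""
    tasks: list = field(default_factory=list)

    def open_task(self, rule: str, record: dict) -> None:
        self.tasks.append({"rule": rule, "record": record})

def publish_ready(records: list[dict],
                  rules: list[QualityRule],
                  queue: StewardshipQueue) -> list[dict]:
    """Serve only records that pass every quality rule; route failures to stewards."""
    published = []
    for record in records:
        failed = [r.name for r in rules if not r.check(record)]
        for rule_name in failed:
            queue.open_task(rule_name, record)
        if not failed:
            published.append(record)
    return published

# Usage: two simple rules over hypothetical loan records.
rules = [
    QualityRule("balance_not_null", lambda r: r.get("balance") is not None),
    QualityRule("valid_currency", lambda r: r.get("currency") in {"USD", "EUR"}),
]
queue = StewardshipQueue()
loans = [
    {"loan_id": 1, "balance": 1000.0, "currency": "USD"},
    {"loan_id": 2, "balance": None, "currency": "USD"},  # fails completeness
]
print(publish_ready(loans, rules, queue))  # only loan 1 is served
print(queue.tasks)                         # loan 2 is queued for stewardship
```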
6. Data Architecture – Key Principles
• Realize value from data
  • Access all data through common provisioning point
  • Avoid point-to-point integration
  • Build once, use many times
• Minimize data replication and proliferation
  • Eliminate data redundancy, unnecessary copies
  • Eliminate redundant data reconciliation efforts
• Enables effective Data Governance
  • Enforce policies, standards and procedures
  • Define & publish authoritative sources of data
  • Efficient data lineage and metadata management
  • Monitoring of data quality before consumption
• Pragmatic data integration strategy
  • Faster time-to-market delivery
  • Incremental information delivery
[Diagram – Data Virtualization in the Target State Architecture: authoritative data sources (Party Master, Finance Platform, Lease, Loan, Bank, Mortgage), a Data Provisioning Layer with Data Quality Monitoring, a Data Access (Integration) Layer, downstream systems (Risk, Finance), fit-for-purpose data marts, and a Reporting Layer (AR Systems, Risk Systems, HR Systems), with certified golden-source, system-of-record data delivery.]
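As a rough illustration of the provisioning layer in the diagram above, the following Python sketch (the source names and fields are invented for illustration) exposes a single logical view that joins two authoritative sources lazily at access time. Nothing is copied or persisted, which is the "minimize data replication and proliferation" principle in miniature.

```python
# Hypothetical in-memory stand-ins for two authoritative sources.
party_master = {
    "P-1": {"party_id": "P-1", "name": "Acme Corp"},
    "P-2": {"party_id": "P-2", "name": "Globex"},
}
loan_system = [
    {"loan_id": "L-100", "party_id": "P-1", "balance": 250_000.0},
    {"loan_id": "L-101", "party_id": "P-2", "balance": 90_000.0},
]

def loans_with_party():
    """A virtual view: joins the sources at access time; nothing is copied
    or persisted, so consumers always see current source-of-record data."""
    for loan in loan_system:
        party = party_master.get(loan["party_id"], {})
        yield {**loan, "party_name": party.get("name")}

# Consumers query the view as if it were one dataset.
for row in loans_with_party():
    print(row)
```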
7. We have the technology … Now what?
• Build the team and the infrastructure capacity to provide an enterprise service.
• Establish policy requiring all strategic data flows to go through the DSL.
• Validate data lineage; ensure data consumption from authoritative sources.
• Disassemble the sausage grinder of data movement:
  • Start to unwind legacy ETL and rewire strategic data flows through the DSL
  • Aspire to have all data movement occur within a “single hop” of the DSL
  • Explore metadata discovery tools to understand non-DSL data movement
• Smart automation (not everything) – quality rules, remediation workflow, etc.
• Study metering data; understand how data is consumed to help optimize services.
• Establish standards around how data is exposed:
  • Everyone consumes data via a shared canonical model (see the sketch after this slide).
  • Expose data as services at the finest granularity that makes sense.
  • Rationalize data service APIs; ensure consistency & referential integrity across business segments.
• Establish a foundational data management platform with an evolutionary path towards a microservices architecture.
• Start small, evolve with demand and growth.
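The "shared canonical model" standard lends itself to a short sketch. The canonical Loan type and the two source-specific mappers below are hypothetical; the point is that every consumer sees one schema regardless of which source system supplied the record.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Loan:
    """Canonical loan record: the one shape every consumer sees."""
    loan_id: str
    party_id: str
    balance: float
    origination_date: date

def from_bank_system(raw: dict) -> Loan:
    """Mapper for a hypothetical bank source with its own field names."""
    return Loan(
        loan_id=raw["LN_NBR"],
        party_id=raw["CUST_ID"],
        balance=float(raw["CUR_BAL"]),
        origination_date=date.fromisoformat(raw["ORIG_DT"]),
    )

def from_leasing_system(raw: dict) -> Loan:
    """Mapper for a hypothetical leasing source with a different layout."""
    return Loan(
        loan_id=raw["contract"]["id"],
        party_id=raw["lessee"],
        balance=raw["outstanding"],
        origination_date=date.fromisoformat(raw["startDate"]),
    )

# Both sources resolve to the same canonical shape.
print(from_bank_system(
    {"LN_NBR": "L-100", "CUST_ID": "P-1", "CUR_BAL": "250000", "ORIG_DT": "2015-06-01"}))
print(from_leasing_system(
    {"contract": {"id": "C-7"}, "lessee": "P-2", "outstanding": 90000.0,
     "startDate": "2016-02-15"}))
```

Onboarding a new source then means writing one mapper to the canonical model, not another point-to-point feed.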
8. Data Governance is Critical
• Data governance is critical to the success of any data virtualization effort.
• Consumers need to trust that data is owned, managed, and certified.
• Establish a data governance framework that ensures accountability, empowers owners of data, and fosters a culture of good data hygiene:
  • Firm-wide policy establishing the data governance framework & governing bodies
  • Data management committee aligned to the senior-most governing body of the firm
  • Accountable executives on point for data quality by segment
  • Data standards against which to measure data quality
  • Data stewards empowered to own & remediate critical data elements
  • Insightful, actionable metrics / dashboards targeted at executives, stewards, and data consumers
• Data quality has many dimensions; prioritize the ones that matter most (e.g., completeness, validity, accuracy, timeliness, granularity), as sketched in code below.
• Use a canonical model with shared terms defined in a firm-wide business glossary.
• The Chief Data Officer owns the policy, provides stewardship of the data governance framework, and serves as an evangelist for good data management practices.
• Data virtualization can be an enabling technology for smart data governance.
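To make the data quality dimensions concrete, here is a minimal Python sketch that scores a toy dataset on three of the dimensions listed above (completeness, validity, timeliness). The field rules and example data are illustrative assumptions, not firm standards.

```python
from datetime import date, timedelta

def completeness(records, field):
    """Share of records where the field is populated."""
    return sum(r.get(field) is not None for r in records) / len(records)

def validity(records, field, allowed):
    """Share of records whose field value falls in the allowed domain."""
    return sum(r.get(field) in allowed for r in records) / len(records)

def timeliness(records, field, max_age_days, today):
    """Share of records updated within the allowed staleness window."""
    cutoff = today - timedelta(days=max_age_days)
    return sum(r[field] >= cutoff for r in records) / len(records)

# Toy loan records with one completeness, one validity, one staleness defect.
loans = [
    {"balance": 1000.0, "currency": "USD", "as_of": date(2017, 9, 29)},
    {"balance": None,   "currency": "USD", "as_of": date(2017, 9, 29)},
    {"balance": 500.0,  "currency": "XYZ", "as_of": date(2017, 8, 1)},
]
today = date(2017, 10, 1)
print(f"completeness(balance): {completeness(loans, 'balance'):.0%}")
print(f"validity(currency):    {validity(loans, 'currency', {'USD', 'EUR'}):.0%}")
print(f"timeliness(as_of, 7d): {timeliness(loans, 'as_of', 7, today):.0%}")
```

Scores like these are the raw material for the steward and executive dashboards described above.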