This presentation describes the analytics-to-cloud migration initiative underway at Fannie Mae. The goal of this effort is threefold: (1) build a sustainable process for data lake hydration in the cloud, (2) modernize the Fannie Mae enterprise data warehouse infrastructure, and (3) retire Netezza.
Fannie Mae partnered with Impetus to modernize its legacy Netezza analytics platform. This involved the Impetus Workload Migration solution, a sophisticated translation engine that automated the migration of Fannie Mae's complex Netezza stored procedures and shell and scheduler scripts to Apache Spark-compatible scripts. This delivered substantial savings in time, effort, and cost while reducing overall project risk.
The scope of the automation project included an automated assessment capability that performed detailed profiling of the current workloads. The output of the assessment stage was a data-driven offloading blueprint and a roadmap for which workloads to migrate; a hybrid, cloud-based big data solution was designed from that blueprint. In addition to fulfilling the essential requirements of historical (and incremental) data migration and automated logic translation, the solution also recommends optimal storage formats for the data in the cloud, performs SCD Type 1 and Type 2 processing for mission-critical attributes, and reloads the transformed data for reporting and analytical consumption.
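The SCD handling mentioned above can be illustrated with a short, hypothetical sketch. The function below is not Fannie Mae's or Impetus's implementation; it is a minimal plain-Python version of a Type 2 merge (Type 1 would simply overwrite the changed attributes in place), with invented field names such as `valid_from` and `is_current`.

```python
from datetime import date

def scd2_merge(dim_rows, incoming, key, tracked, today=None):
    """Type 2 merge sketch: expire changed rows and append new versions.

    dim_rows: existing dimension rows, dicts carrying 'valid_from',
              'valid_to', and 'is_current' (expired rows are mutated in place).
    incoming: new source rows, dicts keyed by `key` with tracked attributes.
    """
    today = today or date.today().isoformat()
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    out = list(dim_rows)
    for row in incoming:
        live = current.get(row[key])
        if live and all(live[c] == row[c] for c in tracked):
            continue  # no tracked attribute changed; keep the current version
        if live:
            live["is_current"] = False  # close out the superseded version
            live["valid_to"] = today
        out.append({**row, "valid_from": today, "valid_to": None,
                    "is_current": True})
    return out
```

In a Spark-based migration the same pattern would typically be expressed as a join-and-union (or a `MERGE INTO` on a table format that supports it) rather than a row loop; the sketch only shows the versioning logic.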
This will include the following topics:
i. Fannie Mae analytics overview
ii. Why cloud migration for analytics?
iii. Approach, major challenges, lessons learned
Speaker
Kevin Bates, Vice President for Enterprise Data Strategy Execution, Fannie Mae
7. Analytics and reporting tools will continue to propagate
AI libraries, data federation & virtualization, data integration & ETL, analytics LOB applications, self-serve analytics & visualization, data catalogs & metadata, data science platforms, self-serve data prep, data & compute platforms, and traditional BI.
Rather than fight the changes and limit choice, we need a platform that enables choice and manages the complexity.
8. Opportunities to drive efficiency and sharing…
[Diagram: an Active Analytic Catalog spanning 3rd-party, cloud, and on-prem data. Data science tools, BI tools, line-of-business tools, and new/custom applications each repeat the same first three steps: (1) connect to data tables; (2) join, massage, aggregate, or shape the data; (3) create calculations, derivations, expressions, and aggregations. Only step (4), tool-specific functions such as sending a campaign or viewing a model, differs by tool. Capturing steps 1–3 once (1x) in the catalog enables unlimited (∞) analytic reuse across tools.]
10. Fannie Mae's experience with Data Lakes
2014: open-source Hadoop
2015: analytics cluster using a proprietary Hadoop distribution
2016: Data Lake using a proprietary Hadoop distribution
2017: Data Lake using cloud-native technologies
2018: driving Data Lake adoption
Fannie Mae has been at the forefront of adopting cloud industry advancements.
11. Approach #1: Take a Governance View
[Diagram: the Enterprise Data Lake takes in business transaction data, 3rd-party data, reference data, and deal and delivery documents (structured, semi-structured, and unstructured) and serves BI reports and dashboards, ad-hoc and what-if queries, data as a service, and data science results.]
Governance follows the data life cycle: what goes in (Ingested), what's done with it (Processed, via preparation and transformation), and what goes out (Consumed).
Focus areas to automate, or to enable with tools, in managing the data lake: metadata, data security, data lineage, data zones (app, user, enterprise), data usage, data standards, access control, platform utilization, data certification, and compliance requirements.
12. Approach #2: Think about Personas
[Diagram: personas mapped onto Enterprise Data Lake (EDL) zones; the Enterprise Zone and App Zone each carry the same data layers: Landing, Prep, and Insight.]
Data Scientist / Analyst (User Zone): data discovery across the EDL; data reads or copies from other zones into the User Zone; data is contained (no outward movement); user data and results fall under local governance; no NPI.
Developer: data ingestion from external sources (catalog provided); NPI classification of external data; schema design and data catalog; data reads and movement between zones (controls and metadata); external movement and disclosures (controls and catalog); access governed by EDL RBAC; governance of the process and insight layer via extended metadata.
* Not all personas shown
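The containment rules on this slide (no outward movement from the User Zone, NPI classification of external data) could be expressed as a simple policy check. The sketch below is an illustrative assumption, not the actual EDL controls; the zone names and the allow-list entries are invented for the example.

```python
# Zone-to-zone moves explicitly permitted (illustrative allow-list).
ALLOWED_MOVES = {
    ("enterprise", "user"),  # analysts read/copy data into their User Zone
    ("app", "user"),
    ("enterprise", "app"),   # assumed for the sketch; not stated on the slide
}

def movement_allowed(src_zone, dst_zone, contains_npi):
    """Return True if a data move between zones passes the policy sketch."""
    if src_zone == "user":
        return False  # User Zone data is contained: no outward movement
    if dst_zone == "user" and contains_npi:
        return False  # keep NPI out of the self-service User Zone
    return (src_zone, dst_zone) in ALLOWED_MOVES
```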
13. Approach #3: How can we bring two worlds together?
[Diagram: SDLC-driven tech platforms (traditional BI, AI libraries, data catalogs & metadata, data & compute platforms, data federation & virtualization, data integration & ETL) and business analytics (analytics LOB applications, self-serve analytics & visualization, data science platforms, self-serve data prep) converge on a data collaboration platform: a centralized service catalog with federated delivery and maintained lineage.]
14. Approach #4: It's new and evolving, so leverage partners who can think end-to-end
Worked with Impetus to establish new patterns for analytics data provisioning. The use case involved a retirement project and a cloud transition, and implementation required full production context (real production, real users).
Solution included:
• One-time historical data migration (on-prem to cloud)
• Migration of existing base tables and snapshots
• New build for cloud-hosted dimensions and snapshots
• New build for ongoing data flows (end-to-end)
Framework steps:
1. Establish data extraction and ingestion framework
2. Job orchestration
3. Data transformation and change capture
4. Establish audit framework (operations, controls)
5. Capture reusable utilities and build the library
6. Monitor and report performance for each step
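The audit and performance-monitoring steps above can be sketched as a wrapper applied to each pipeline stage. This is a hypothetical illustration, not the framework that was built: the decorator name `audited` and the audit-entry fields are invented for the example.

```python
import functools
import time

def audited(step_name, audit_log):
    """Record status, row count, and elapsed time for one pipeline step."""
    def wrap(fn):
        @functools.wraps(fn)
        def run(*args, **kwargs):
            entry = {"step": step_name, "status": "running",
                     "started": time.time()}
            try:
                result = fn(*args, **kwargs)
                entry["status"] = "succeeded"
                # Row count for list-like results; None for anything else.
                entry["rows"] = (len(result)
                                 if hasattr(result, "__len__") else None)
                return result
            except Exception as exc:
                entry["status"] = "failed"
                entry["error"] = str(exc)
                raise
            finally:
                entry["elapsed_s"] = time.time() - entry["started"]
                audit_log.append(entry)
        return run
    return wrap
```

Each extraction, transformation, or load step wrapped this way contributes one audit record, which gives the operations and controls view of the run without touching the step's own logic.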
16. Challenges
• Cloud adoption and Data Lake development can require manual processes, hand-coding, and reliance on command-line tools
• Keeping track of your data and its lineage, and making it easy to find
• Coupling of ingestion and processing drives architecture decisions
• Operationalizing processes for production and maintaining SLAs
• Ensuring data is in canonical forms with a shared schema usable by others
• Coding or filing tickets to perform new ingestion and processing tasks
• Multiple architectures and technologies used by different teams on different clusters
• Guaranteeing compliance in a system designed for schema-on-read and raw data
• Sharing infrastructure in a multi-tenant "self-service" environment
• Business awareness and buy-in
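One of the challenges above, keeping data in canonical forms with a shared schema, is often addressed with a conformance check at ingestion. The sketch below is hypothetical; the schema contents (field names and types) are invented for illustration.

```python
# Illustrative canonical schema: required field names mapped to Python types.
CANONICAL_SCHEMA = {"record_id": str, "amount": float, "as_of_date": str}

def conforms(record, schema=CANONICAL_SCHEMA):
    """True if the record has exactly the canonical fields, correctly typed."""
    if set(record) != set(schema):
        return False  # missing or extra fields
    return all(isinstance(record[field], ftype)
               for field, ftype in schema.items())
```

Records failing the check would be routed to a quarantine area rather than promoted out of the landing layer, so downstream consumers can rely on the shared schema.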
17. What we have learned
Review your development practices holistically
• You need new patterns for data movement
• Don't lift and shift!
Think governance first!
• Incorporate new processes into your data governance strategy
• Focus on sustainable practices that fully envision how the end-to-end process fits together
Engage strategic partners where it makes sense, and keep engaging your business partners to ensure alignment.
As the center of gravity of data moves toward the cloud, hybrid strategies will become increasingly important. This is a migration that, for seasoned companies, will take time. Don't migrate to the cloud for tech reasons alone; engage your business!