The traditional approach to insurance pricing involves fitting a generalized linear model (GLM) to data collected on historical claims payments and premiums received. The explosive growth in data availability and increasing competitiveness in the marketplace are challenging actuaries to find new insights in their data and make predictions with more granularity, improved speed and efficiency, and with tighter integration among business units to support strategic decisions.
In this session we will share our experience implementing deep hierarchical neural networks using TensorFlow and PySpark on Databricks. We will discuss the benefits of the ML Runtime, our experience using the goofys mount, our process for hyperparameter tuning, and specific considerations for the large dataset size and extreme volatility present in insurance data, among other topics.
Authors: Bryn Clarke, Krish Rajaram
2. Krish Rajaram & Bryn Clarke, Nationwide Insurance
Deploying Enterprise Scale Deep Learning in Actuarial Modeling
3. Agenda
• About Nationwide
• About Enterprise Data Office
• Nationwide’s journey with Databricks
• Use case deep dive
4. Nationwide Ranks with the Best
• #1 in 457 retirement plans, based on number of plans (PLANSPONSOR, 2017 Recordkeeping Survey)
• #1 total small business insurer (Conning, 2014 Strategic Study: The Small Business Sector for Property-Casualty Insurance: Market Shift Coming)
• #1 writer of farms and ranches (A.M. Best, 2016 DWP)
• #1 pet insurer (North American Pet Health Insurance Assn., 2016)
• #1 writer of corporate life (IBIS Associates, Inc., February 2018)
• 2nd largest domestic specialty (Excess & Surplus) commercial lines insurer (A.M. Best, 2016 DWP)
• 7th largest homeowners insurer (A.M. Best, 2016 DWP)
• 7th largest commercial lines insurer (A.M. Best, 2016 DWP)
• 7th largest writer of variable annuities (Morningstar, YE 2016, based on total flows)
• 8th largest auto insurer (A.M. Best, 2016 DWP)
• 8th largest life insurer (LIMRA, YE 2016, based on total premiums)
• #9 provider of defined contribution retirement plans (PLANSPONSOR, 2017 Recordkeeping Survey)
• Nationwide is committing more than $100 million of venture capital to invent and reinvent customer-centric solutions.
5. Fortune 100 Company
• FORTUNE 100 Best Companies to Work For
• Black Enterprise 50 Best Companies for Diversity
• 2018 Catalyst Award honoree
• Human Rights Campaign Best Place to Work for LGBTQ Equality
• $49 billion in total sales/direct written premium
• $26.9 billion in net operating revenue
• $1.2 billion in net operating income
• $225.5 billion in total assets
• Financial strength ratings: A+ from A.M. Best (received 10/17/2002, affirmed 10/2/2017); A+ from Standard & Poor's (received 12/22/2008, affirmed 5/24/2017); A1 from Moody's (received 3/10/2009, affirmed 11/7/2017)
6. Lines of Business
• Financial Services: Individual Life, Annuities, Retirement Plans, Corporate Life, Mutual Funds, Banking
• Commercial Lines: Standard Commercial, Farm and Ranch, Commercial Agribusiness, Excess and Surplus/Specialty
• Personal Lines: Standard Auto, Homeowners and Renters, Pet, Sport Vehicles, Personal Liability
7. Enterprise Data Office
Purpose: Give data a voice
Mission: The EDO is dedicated to empowering the business of Nationwide by delivering trusted solutions through complete data & analytics services.
The Chief Data Officer (CDO) oversees five functions:
• Data Advisory Services: manages relationships with our IT and business partners
• Data Governance and Quality Assurance: oversees and optimizes data integrity, availability, usability, and trustworthiness
• Data Architecture and Strategy: owns the One Nationwide data strategy, enabling the Enterprise's ability to leverage data as a competitive asset
• Data Management: manages Enterprise data, allowing for insights into business activities and enabling achievement of business goals
• Data Analytics and Decision Sciences: deploys data and analytics tools and processes to solve complex business problems
9. Databricks deployment at Nationwide
[Architecture diagram] The Databricks control plane (web frontend with SSO, support via Access Genie, and the Databricks CLI) manages a data plane running in Nationwide's own AWS account, with data stored in S3 buckets and in AWS data sources such as Redshift, RDS, and DynamoDB. On-prem data sources (Hadoop, SQL databases) and business-partner extranet traffic reach the environment over an MPLS WAN and VGW. Outbound access is permitted for downloading packages and third-party datasets. Users include admins, data scientists, and engineers.
10. Databricks adoption at Nationwide
[Adoption matrix] Adoption is profiled across four personas, each with a representative tool stack:
• Information worker: Excel, Access, SAS
• Data Analyst: Tableau, Paxata, Python/R, SQL
• Data Engineer: Hadoop/Hive/Spark/Zeppelin, SQL
• Data Scientist: R & RStudio, Python & Jupyter, SAS/SAS Grid, IBM SPSS, H2O, DriverlessAI, TensorFlow
Each persona is assessed on use-case value (efficiency gain, de-risk, revenue generating, or not applicable), Databricks adoption stage (experiment, dev, prod), data sources (well known vs. variable), tool maturity (standard, emerging, specialized), and number of tools (1-3, 4-5, 6+).
11. Enterprise Analytics Office
We deliver wisdom in data by interacting with partners to translate problems into analytical solutions, utilizing methodologies to accelerate decision-making by leveraging cutting-edge data & technology.
• Methodologies: Statistical Modeling, AI & Machine Learning, Experimental Design, and Modeling & Forecasting, spanning Ensembled Machine Learning, Traditional Statistical Learning, Deep Learning, Time Series Forecasting, Text & Speech Analytics, GPU Acceleration, Recommenders, Regression, Bayesian Hierarchical Modeling, Segmentation Modeling, Survivor Modeling, and Model-as-Service
• Data: Nationwide internal data, Social, Demographic, Geographic, Financial, Macro-economic
• Technology: R, Python, H2O, Tableau, Java, SPSS Modeler, TensorFlow
What is the tangible benefit of our data product solutions?
• Tailored support combining business knowledge & statistical expertise
• Easy understanding of the data for instant & actionable usability
• Automated & seamless access that integrates with your processes
• Scalable utility solving advanced analytical problems across domains
12. Focus Use Case
• Predict insurance claims frequency and
severity (average cost of claims)
• Large dataset (100s of millions of records)
• Volatile data
– Insurance claims are infrequent
– Most often arise due to chance
13. Traditional Approach
• Batch (1-5 years) of data aggregated across
linear predictors (state, vehicle model year,
driver age, etc.)
• Trained actuary fits a Generalized Linear Model
(GLM) to determine slope/intercept for each
linear predictor
• Result is a multiplicative “rating plan”
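As a concrete illustration, a frequency GLM of this kind can be fit with Spark ML. This is a minimal sketch, assuming Spark 3.x; the DataFrame policies_df and its columns (state, model_year, driver_age, claim_count, log_exposure) are hypothetical placeholders, not the actual rating schema.

```python
from pyspark.ml import Pipeline
from pyspark.ml.feature import StringIndexer, OneHotEncoder, VectorAssembler
from pyspark.ml.regression import GeneralizedLinearRegression

# Index and one-hot encode a categorical rating variable.
indexer = StringIndexer(inputCol="state", outputCol="state_idx")
encoder = OneHotEncoder(inputCols=["state_idx"], outputCols=["state_vec"])
assembler = VectorAssembler(
    inputCols=["state_vec", "model_year", "driver_age"],
    outputCol="features")

# Poisson GLM with log link for claim frequency; log(exposure) enters as
# an offset so the fitted coefficients are log-relativities.
glm = GeneralizedLinearRegression(
    family="poisson", link="log",
    labelCol="claim_count", offsetCol="log_exposure")

model = Pipeline(stages=[indexer, encoder, assembler, glm]).fit(policies_df)
```

Exponentiating the coefficients of the log-link model yields the multiplicative factors that make up the rating plan; severity is typically handled by a companion Gamma GLM on claim amounts.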
14. Novel Approach
• Deep learning (hierarchical neural network)
• Adequately models non-linearity of latent
variables
• Multiple heads
– Frequency & Severity
– Coverage Type & Cause of Loss
• Compare to traditional GLM
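To make the multi-head idea concrete, here is a minimal Keras sketch, not the production architecture: an embedded categorical input and numeric inputs feed shared layers, which branch into separate frequency and severity heads. The input names and sizes are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow import keras

cat_in = keras.Input(shape=(1,), dtype="int32", name="state")  # hypothetical
num_in = keras.Input(shape=(10,), name="numeric")              # hypothetical

# Learn a dense representation of the categorical levels (51 = 50 states + 1).
emb = keras.layers.Embedding(input_dim=51, output_dim=8)(cat_in)
emb = keras.layers.Flatten()(emb)

shared = keras.layers.Concatenate()([emb, num_in])
shared = keras.layers.Dense(128, activation="relu")(shared)
shared = keras.layers.Dense(64, activation="relu")(shared)

# One head per target; the exponential activation keeps predictions positive.
freq = keras.layers.Dense(1, activation="exponential", name="frequency")(shared)
sev = keras.layers.Dense(1, activation="exponential", name="severity")(shared)

model = keras.Model(inputs=[cat_in, num_in], outputs=[freq, sev])
```

Additional heads for coverage type and cause of loss attach to the same shared trunk in the same way.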
15. Performance Evaluation
• Custom loss functions
– Poisson and Gamma negative log-likelihood
• Custom metric functions
– Normalized Gini index / AUC
• Online monitoring using TensorBoard
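Continuing the sketch above, the custom losses can be written directly against TensorFlow ops. These sketches drop constants that do not depend on y_pred and assume unit dispersion for the Gamma; a normalized Gini metric can be passed to compile() in the same way.

```python
import tensorflow as tf

EPS = 1e-7  # guards the log against zero predictions

def poisson_nll(y_true, y_pred):
    # Poisson negative log-likelihood, up to the log(y!) constant.
    y_pred = tf.maximum(y_pred, EPS)
    return tf.reduce_mean(y_pred - y_true * tf.math.log(y_pred))

def gamma_nll(y_true, y_pred):
    # Gamma negative log-likelihood with unit shape, up to constants.
    y_pred = tf.maximum(y_pred, EPS)
    return tf.reduce_mean(y_true / y_pred + tf.math.log(y_pred))

model.compile(optimizer="adam",
              loss={"frequency": poisson_nll, "severity": gamma_nll})
```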
16. Model Search Space is Vast
• Size and number of layers
• Embedding dimensionality
• Activation functions (ReLU, tanh, linear)
• Regularization (L1/L2, dropout)
• Many others (autoencoder, combining levels of
prediction, skip connections, etc.)
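One way to picture how quickly this space grows: enumerate even a few illustrative values per dimension and the cross product is already in the hundreds.

```python
import itertools

# Illustrative values only; the real search space is larger still.
search_space = {
    "hidden_layers": [[128, 64], [256, 128, 64], [512, 256]],
    "embedding_dim": [4, 8, 16],
    "activation":    ["relu", "tanh", "linear"],
    "dropout":       [0.0, 0.2, 0.5],
    "l2":            [0.0, 1e-4, 1e-3],
}
configs = [dict(zip(search_space, values))
           for values in itertools.product(*search_space.values())]
print(len(configs))  # 3 * 3 * 3 * 3 * 3 = 243 configurations
```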
17. Why Spark?
• Many aspects of preprocessing are
embarrassingly parallel
– Conversion between data formats (SAS, CSV,
Parquet, TFRecords)
– Encoding of category labels
• Scoring is also embarrassingly parallel
• Primary limitation is hyperparameter/model
configuration search
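A minimal sketch of the embarrassingly parallel steps; the S3 paths and column names are hypothetical.

```python
from pyspark.ml.feature import StringIndexer

# Format conversion: each partition converts independently.
df = spark.read.csv("s3://bucket/claims/*.csv", header=True, inferSchema=True)
df.write.mode("overwrite").parquet("s3://bucket/claims_parquet/")

# Category-label encoding: once the label mapping is fit, applying it is a
# narrow, per-row transformation.
indexer = StringIndexer(inputCol="vehicle_make", outputCol="vehicle_make_idx")
encoded = indexer.fit(df).transform(df)

# TFRecords can be written similarly, e.g. via the spark-tensorflow-connector.
```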
19. Benchmark Timings

Step               | Local Workstation | Spark
CSV Conversion     | 10-12 hrs         | < 5 mins
Random Shuffling   | ~ 8 hrs           | < 5 mins
Featurization      | ~ 5 hrs           | 20 mins
TFRecords Examples | ~ 5 hrs           | < 5 mins
Model Training     | ~ 6 hrs           | ~ 3 hrs (single node*)
Model Scoring      | ~ 3 hrs           | < 5 mins

* Utilizing Spark we are able to test many model configurations concurrently; in a local workstation environment, each configuration must be tested sequentially.
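The footnote is the key point: with a cluster, candidate configurations can be fanned out and trained concurrently. A sketch, where train_and_evaluate is a hypothetical function that builds, trains, and scores one configuration on a worker:

```python
results = (
    spark.sparkContext
    .parallelize(configs, numSlices=len(configs))  # one task per configuration
    .map(lambda cfg: (cfg, train_and_evaluate(cfg)))
    .collect()
)
best_cfg, best_gini = max(results, key=lambda pair: pair[1])
```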
20. Lessons Learned
• Loading/exporting data
• Conversion of the model from Keras to TensorFlow
• Initializing TensorFlow models on individual nodes
• Using goofys mounts
• Syncing with DBFS to store model checkpoints
• Utilizing Databricks Jobs/notebook parameters
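Two of these lessons in sketch form, assuming a Databricks notebook environment; the paths and the config_id parameter are hypothetical.

```python
import shutil

# Notebook parameters: Databricks Jobs can pass values in through widgets.
dbutils.widgets.text("config_id", "0")
config_id = int(dbutils.widgets.get("config_id"))

# Checkpoint to fast local disk during training, then sync to DBFS so the
# checkpoint outlives the cluster.
local_ckpt = "/tmp/checkpoints/model.h5"
model.save(local_ckpt)
shutil.copy(local_ckpt, "/dbfs/checkpoints/model_%d.h5" % config_id)
```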
21. Conclusion & Next Steps
• Utilizing Spark on Databricks with the ML Runtime, we reduced the end-to-end modeling pipeline from ~34 hours to less than 4 hours
• Further opportunity exists in utilizing Horovod for
multi-GPU training
– Reduce time needed to evaluate individual model
configurations
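A minimal sketch of that next step using HorovodRunner from the Databricks ML Runtime; build_model and make_dataset are hypothetical helpers that construct the network and shard the training data per worker.

```python
from sparkdl import HorovodRunner

def train():
    import tensorflow as tf
    import horovod.tensorflow.keras as hvd

    hvd.init()
    model = build_model()
    # Scale the learning rate with the number of workers, per Horovod guidance.
    opt = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(1e-3 * hvd.size()))
    model.compile(optimizer=opt, loss=poisson_nll)
    model.fit(make_dataset(hvd.rank(), hvd.size()), epochs=10,
              callbacks=[hvd.callbacks.BroadcastGlobalVariablesCallback(0)])

HorovodRunner(np=4).run(train)  # np=4: four GPU workers
```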
22. General Observation
• Using the Databricks ML Runtime, notebooks, scalable compute instances, and scheduling features, we were able to rapidly prototype the methodology.
• The path to production and integration with our current model deployment framework are works in progress.
• DBU consumption is challenging to predict across business units, which makes cost forecasting difficult.
• There is no automatic integration with GitHub Enterprise.