SlideShare una empresa de Scribd logo
1 de 27
Descargar para leer sin conexión
what’s in your workflow?
reproducible business analysis at Capital One
Emily Riederer
Sr. Analyst, Capital One
@EmilyRiederer / emily.riederer@capitalone.com
the reproducibility “brand”
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
The replicability crisis empowered the open science movement and turned reproducibility
and workflows into hot topics
"The Myth of Self-Correcting Science"
The Atlantic
Sarah Estes
December 20, 2012
"How science goes wrong"
The Economist
Cover Story
October 21, 2013
"Psychology's Replication Crisis has a Silver Lining"
The Atlantic
Paul Bloom
Feb 18, 2016
"When the Revolution Came for Amy Cuddy"
New York Times Magazine
Susan Dominus
October 18, 2017
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Many organizations emerged to develop and evangelize better practices for scientific
transparency and collaboration
• Offer two-day bootcamps on scientific computing
and reproducible research
• Target researchers in diverse academic disciplines
(e.g. ecology)
• Over 22,000 workshop attendees and 1,000
instructors since 2012
• “Good Enough Practices in Scientific Computing”
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Industry embraced advancements in reproducibility for gains to quality and efficiency
• Developed internal R package for tooling and
community building among data scientists
• Promotes efficient work and standardized style
• Share analyses on Knowledge Repo (built on
GitHub)
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Analyst
Scientist
Traditional business analysis’ focus on solving a specific case is a barrier to reproducible
thinking
Abstraction
Concrete Business Cases
Reproducibility Taxonomy
(Stodden, et al, 2013)
Open/Reproducible
Auditable
Confirmable
Replicable
Reviewable
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
The traditional lens of business analysis is reinforced by the standard tools of the trade
Abstraction
Concrete Business Cases
Spreadsheets expose
data, hide computation
Scripting logs computation,
abstracts data
Analyst
Scientist
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
The language business analysts use to articulate their problems obscures the link to
technology-driven solutions
Reproducibility
Version Control
Peer/Code Review
Re-Work
Tribal Knowledge
Team Work
Sanity Checking
the tidycf R package at Capital One
turning business analysis on its head by turning cashflows on their side
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Decision
Making
Validation
&
Monitoring
Modeling
Scenario
Analysis
Cashflow analysis is integral to many interrelated pieces of business analytics
Documentation
& Governance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Decision
Making
Validation
&
Monitoring
Modeling
Scenario
Analysis
Documentation
& Governance
Database
System
BI Visualization
Tool
Legacy Statistical
Computing Platform
Legacy Statistical
Computing Platform
Legacy Statistical
Computing
Platform
FTP Client
FTP Client
Spreadsheet
Software
Spreadsheet
Software
Word Processor
Word Processor
Spreadsheet
Software
Presentation
Software
• Black box
• Limited capability
• Manual documentation
• Highly manual process
• System-specific
knowledge
• Slow iteration
Patchwork processes lead to inefficiency, poor documentation, and limited reproducibility
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Decision
Making
Validation
&
Monitoring
Modeling
Scenario
Analysis
Building an end-to-end R package enabled an efficient and reproducible workflow
• Accessible code
• Extensible code
• Real-time
documentation
• Automated &
reproducible
• General versus system-
specific knowledge
• Rapid iteration
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Nuanced business decisions are driven by a remarkably standard analytical “engine”
Business Processes
Workflow
(R for Data Science)
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Cashflow statements are a typical representation of valuations models in the world of
financial analysis
Time Period 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
Total Revenue 4.7$ 21.1$ 47.1$ 34.3$ 1.6$ 13.2$ 7.9$ 57.1$ 4.1$ 14.7$ 4.7$ 21.1$ 47.1$ 34.3$ 1.6$ 13.2$ 7.9$ 57.1$
Interchange Revenue 2.0$ 19.5$ 45.8$ 31.7$ (0.7)$ 11.0$ 6.8$ 54.9$ 3.0$ 13.7$ 2.0$ 19.5$ 45.8$ 31.7$ (0.7)$ 11.0$ 6.8$ 54.9$
Spend 195.4$ 1,945.4$ 4,583.6$ 3,174.0$ (71.5)$ 1,096.2$ 678.6$ 5,486.5$ 304.0$ 1,366.2$ 195.4$ 1,945.4$ 4,583.6$ 3,174.0$ (71.5)$ 1,096.2$ 678.6$ 5,486.5$
Interchange Rate 1.0%
Interest Revenue 1.6$ 0.2$ 0.2$ 1.4$ 0.8$ 1.5$ 0.9$ 0.9$ 0.7$ 0.5$ 1.6$ 0.2$ 0.2$ 1.4$ 0.8$ 1.5$ 0.9$ 0.9$
Fee Revenue 0.9$ 0.4$ 0.9$ 0.0$ 0.7$ 0.1$ 0.2$ 0.6$ 0.3$ 0.5$ 0.9$ 0.4$ 0.9$ 0.0$ 0.7$ 0.1$ 0.2$ 0.6$
Other Revenues 0.3$ 1.0$ 0.1$ 1.1$ 0.8$ 0.6$ 0.1$ 0.7$ 0.1$ 0.1$ 0.3$ 1.0$ 0.1$ 1.1$ 0.8$ 0.6$ 0.1$ 0.7$
Total Expense 14.2$ 3.2$ 9.9$ 2.0$ 8.9$ 10.9$ 5.0$ 1.6$ 16.0$ 19.9$ 14.2$ 3.2$ 9.9$ 2.0$ 8.9$ 10.9$ 5.0$ 1.6$
Operating Expenses 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$
Marketing Expenses 5.0$ 2.0$ -$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$ 5.0$ 2.0$ 2.0$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$
Credit Losses -$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 1.0$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$
Recoveries & Coll -$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 2.0$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$
Cost of Funds 12.0$ 8.1$ 0.5$ 16.8$ 1.3$ 1.6$ 5.5$ 2.6$ 1.6$ 0.4$ 5.2$ 11.7$ 7.4$ 20.6$ 4.8$ 12.4$ 3.5$ 5.0$
Outstandings 447.3$ 346.8$ 19.0$ 353.9$ 297.6$ 42.9$ 276.3$ 358.8$ 50.7$ 101.7$ 433.2$ 479.1$ 229.6$ 430.4$ 272.0$ 415.2$ 123.1$ 154.0$
Loan Rate 2.7% 2.3% 2.6% 4.7% 0.4% 3.7% 2.0% 0.7% 3.1% 0.4% 1.2% 2.5% 3.2% 4.8% 1.8% 3.0% 2.9% 3.2%
Other Expenses (3.8)$ (8.2)$ 8.2$ (17.1)$ 4.3$ 7.0$ (3.7)$ (3.3)$ 11.1$ 17.3$ 0.0$ (11.9)$ (0.8)$ (20.9)$ 0.7$ (3.9)$ (1.8)$ (5.7)$
NIBT (9.5)$ 17.9$ 37.1$ 32.4$ (7.3)$ 2.3$ 2.9$ 55.5$ (11.9)$ (5.2)$ (9.5)$ 17.9$ 37.1$ 32.4$ (7.3)$ 2.3$ 2.9$ 55.5$
Tax (6.2)$ 11.6$ 24.1$ 21.1$ (4.7)$ 1.5$ 1.9$ 36.1$ (7.7)$ (3.4)$ (6.2)$ 11.6$ 24.1$ 21.1$ (4.7)$ 1.5$ 1.9$ 36.1$
Tax Rate 36.0%
NIAT (3.3)$ 6.3$ 13.0$ 11.3$ (2.6)$ 0.8$ 1.0$ 19.4$ (4.2)$ (1.8)$ (3.3)$ 6.3$ 13.0$ 11.3$ (2.6)$ 0.8$ 1.0$ 19.4$
Equity Flow (5.0)$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ (5.0)$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$
Cashflow (8.3)$ 7.3$ 14.0$ 12.3$ (1.6)$ 1.8$ 2.0$ 20.4$ (3.2)$ 4.0$ (8.3)$ 7.3$ 14.0$ 12.3$ (1.6)$ 1.8$ 2.0$ 20.4$
Discounted CF (8.3)$ 7.2$ 13.8$ 12.1$ (1.5)$ 1.7$ 1.9$ 19.5$ (3.0)$ 3.8$ (8.3)$ 7.2$ 13.8$ 12.1$ (1.5)$ 1.7$ 1.9$ 19.5$
Lifetime DCF 47.2$
TV 10.0$
PV 57.2$
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
However, cashflow statements are not optimized for either human or machine readability
Time
Mix of time series and pointwise fields; data and assumptions
Variable information contained in formatting – bold, italics, indentations
Mix of data and calculations, with every cell exposed to potential typos
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Data defined by specific location on sheet so adding a
new line-item can perturb downstream calculations
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Instead, we used tidy cashflows, applying the principles of tidy data analysis
• “Happy families are all alike; every unhappy family is unhappy in its own way “ – Leo Tolstoy, Anna Karenina
• “Like families, tidy datasets are all alike but every messy dataset is messy in its own way.” – Hadley Wickham, “Tidy Data”
Segment Time Tot_Rev Tot_Exp NIBT Tax NIAT Eq_Flow Cashflow
Super 1 4.69 14.20 -9.50 -6.18 -3.33 -5.00 -8.33
Super 2 21.08 3.19 17.90 11.63 6.26 1.00 7.26
Super 3 47.07 9.94 37.13 24.13 13.00 1.00 14.00
Super 4 34.35 1.96 32.39 21.05 11.34 1.00 12.34
Super 5 1.59 8.89 -7.30 -4.75 -2.56 1.00 -1.56
Prime 1 57.61 47.45 10.16 6.60 3.56 -5.00 -1.44
Prime 2 93.78 5.52 88.26 57.37 30.89 1.00 31.89
Prime 3 17.74 54.17 -36.43 -23.68 -12.75 1.00 -11.75
Prime 4 36.98 3.93 33.05 21.48 11.57 1.00 12.57
Prime 5 78.72 55.98 22.74 14.78 7.96 1.00 8.96
Time 1 2 3 4 5
Superprime
Total Revenue $4.69 $21.08 $47.07 $34.35 $1.59
Total Expense $14.20 $3.19 $9.94 $1.96 $8.89
NIBT -$9.50 $17.90 $37.13 $32.39 -$7.30
Tax -$6.18 $11.63 $24.13 $21.05 -$4.75
NIAT -$3.33 $6.26 $13.00 $11.34 -$2.56
Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00
Cashflow -$8.33 $7.26 $14.00 $12.34 -$1.56
Prime
Total Revenue $57.61 $93.78 $17.74 $36.98 $78.72
Total Expense $47.45 $5.52 $54.17 $3.93 $55.98
NIBT $10.16 $88.26 -$36.43 $33.05 $22.74
Tax $6.60 $57.37 -$23.68 $21.48 $14.78
NIAT $3.56 $30.89 -$12.75 $11.57 $7.96
Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00
Cashflow -$1.44 $31.89 -$11.75 $12.57 $8.96
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Tidy cashflows streamline the workflow to facilitate advanced analytics like bootstrapping
error bars and indifference curves
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Organically evolving the tidycf package while addressing business problems led to
efficient and empathetic development
Task:
Valuations
Process 1:
Data Validation
Process 2:
Data Exploration
Process n-1:
Model Validation
Process n+1 … z:
Analysis with Model
Framework 1:
Data Validation
Framework 2:
Data Exploration
Framework n-1:
Model Validation
Framework n+1 … z:
Analysis with Model
calc
functions
viz
functions
tbl
functions
… …
Process 3:
Model Building
Framework 3:
Model Building
Process n:
Model Intuition
Framework n:
Model Intuition
Business Problems R Markdown Templates R functions
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
RMarkdown templates enable package discoverability, immersion into the broader R
language, and contextual knowledge transfer while generating documentation
Code comments explain syntax and
suggest new functions to try
Text commentary facilitates knowledge
transfer of business context and intuition
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Functions manipulate tidy data and provide output consistent with their taxonomy and
compatible with any tidyverse pipeline
calc
functions
viz
functions
tbl
functions
data frame
data frame
graphic
pivoted data
get
functions
DB Conns,
Internet Proxies,
etc.
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
These functions are intuitively related to help users quickly generate standard views
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Integration with the tidyverse allows functions to provide both structure and flexibility
Fake data is provided for illustrative purposes only and does not represent Capital One performance
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
tidycf’s R Project template standardizes file structure for better project management
RStudio > File > New Project > New Directory
• /analysis/ : core scripts (.Rmd) and final outputs (.HTML)
• /data/ : raw data (or in our case, pulled from SQL)
• /doc/: text files providing context and documentation
• /ext/ : external files needed for project
• /output/ : intermediate/final data formats
• /src/ : other helper scripts (e.g. SQL, python)
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
RMarkdown templates document data transformations and demonstrate relative paths to
save intermediate artifacts in the appropriate directory
Data Validation
Exploratory Data Analysis
Model Validation
Model Analytics
(Multiple Modeling Steps)
…
Model Monitoring
raw
out1
out_t
out_t+1
model
model
out1
out2
out_t+1
model
R Markdown Templates Output DataInput Data
./data/
./analysis/
./output/
Directoryexternal
External Source
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Industry, academia, and the open-source community served as inspirations for our three-
pronged training approach
Community BuildingBootcampsIndividual Resources
Set-Up Guides, Self-Directed
Learning, and Prerequisites
Three day bootcamp –
conceptual and hands-on
Interaction (forum) and
engagement (contribution)
opportunities
Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md
Our resulting package emphasizes reproducibility while immersing business analysts in R
odbc
DBI
dbplyr
rmarkdown
knitr
httr
devtools
plotly
tidyverse
what’s in your workflow?
reproducible business analysis at Capital One
Emily Riederer
Sr. Analyst, Capital One
@EmilyRiederer / emily.riederer@capitalone.com

Más contenido relacionado

La actualidad más candente

Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013DFitzmorris
 
Goldmoney inc june ir 06.07.16 (1)
Goldmoney inc   june ir 06.07.16 (1)Goldmoney inc   june ir 06.07.16 (1)
Goldmoney inc june ir 06.07.16 (1)BitGold Inc
 
Epic research malaysia daily klse report for 31st december 2015
Epic research malaysia   daily klse report for 31st december 2015Epic research malaysia   daily klse report for 31st december 2015
Epic research malaysia daily klse report for 31st december 2015Epic Research Pte. Ltd.
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016Nicole Chan
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016Nicole Chan
 

La actualidad más candente (6)

Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013Inv 03.statistics review a_macias_in_class_fall2013
Inv 03.statistics review a_macias_in_class_fall2013
 
Goldmoney inc june ir 06.07.16 (1)
Goldmoney inc   june ir 06.07.16 (1)Goldmoney inc   june ir 06.07.16 (1)
Goldmoney inc june ir 06.07.16 (1)
 
Weekly Market Analysis
Weekly Market AnalysisWeekly Market Analysis
Weekly Market Analysis
 
Epic research malaysia daily klse report for 31st december 2015
Epic research malaysia   daily klse report for 31st december 2015Epic research malaysia   daily klse report for 31st december 2015
Epic research malaysia daily klse report for 31st december 2015
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 26 July 2016
 
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
EPIC RESEARCH SINGAPORE - Daily SGX Singapore report of 28 July 2016
 

Similar a What's in your workflow? Bringing data science workflows to business analysis at Capital One

ACG_Cup_-_Valuation
ACG_Cup_-_ValuationACG_Cup_-_Valuation
ACG_Cup_-_ValuationIke Ekeh
 
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxRound 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxjoellemurphey
 
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxRound 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxjoellemurphey
 
WatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyWatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyJohn W. Quinn
 
Looptworks case analysis
Looptworks case analysisLooptworks case analysis
Looptworks case analysisFred Wu
 
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYBrunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYJonathan Chang
 
Financial Forecasting
Financial ForecastingFinancial Forecasting
Financial ForecastingJamariHodges1
 
RR - Technicals 1.pdf
RR - Technicals 1.pdfRR - Technicals 1.pdf
RR - Technicals 1.pdfmerag76668
 
C3 comp redesign 6-3-13 (2)
C3 comp redesign  6-3-13 (2)C3 comp redesign  6-3-13 (2)
C3 comp redesign 6-3-13 (2)Mark Wolkove
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxgerardkortney
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxalfred4lewis58146
 
DiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDustiBuckner14
 
Pat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsPat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsJohn Blue
 
21st Century Compensation Planning
21st Century Compensation Planning21st Century Compensation Planning
21st Century Compensation PlanningRelax, It's Handled
 
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxRound 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxdaniely50
 
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
Excerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docxExcerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docx
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docxcravennichole326
 

Similar a What's in your workflow? Bringing data science workflows to business analysis at Capital One (20)

ACG Cup - Valuation
ACG Cup - ValuationACG Cup - Valuation
ACG Cup - Valuation
 
ACG_Cup_-_Valuation
ACG_Cup_-_ValuationACG_Cup_-_Valuation
ACG_Cup_-_Valuation
 
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docxRound 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
Round 2Dec. 31, 2015 C57912AndrewsTracy CalhounVict.docx
 
Sample LBO Model Template – 2
Sample LBO Model Template – 2Sample LBO Model Template – 2
Sample LBO Model Template – 2
 
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docxRound 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
Round 5Dec. 31, 2018 C58538AndrewsEugene EllisPhili.docx
 
WatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable TechnologyWatchOver - Alzheimer's Caregiver Wearable Technology
WatchOver - Alzheimer's Caregiver Wearable Technology
 
Looptworks case analysis
Looptworks case analysisLooptworks case analysis
Looptworks case analysis
 
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPYBrunswick (BC) Pitch - Jonathan Chang - FINAL COPY
Brunswick (BC) Pitch - Jonathan Chang - FINAL COPY
 
Financial Forecasting
Financial ForecastingFinancial Forecasting
Financial Forecasting
 
RR - Technicals 1.pdf
RR - Technicals 1.pdfRR - Technicals 1.pdf
RR - Technicals 1.pdf
 
C3 comp redesign 6-3-13 (2)
C3 comp redesign  6-3-13 (2)C3 comp redesign  6-3-13 (2)
C3 comp redesign 6-3-13 (2)
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
 
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docxPage 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
Page 1 Front PagePage 2 Stocks & BondsPage 3 Financial Sum.docx
 
Sample LBO Model Template
Sample LBO Model TemplateSample LBO Model Template
Sample LBO Model Template
 
DiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication trainingDiscussionThe utilization of functional communication training
DiscussionThe utilization of functional communication training
 
Pat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market DownturnsPat Von Tersch - Managing Margins in Market Downturns
Pat Von Tersch - Managing Margins in Market Downturns
 
21st Century Compensation Planning
21st Century Compensation Planning21st Century Compensation Planning
21st Century Compensation Planning
 
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docxRound 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
Round 2 - 2020Sim ID Z79546_8High Level OverviewTe.docx
 
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
Excerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docxExcerpts taken from  Lucas, S.E. (12th ed.) (2015).  The Art .docx
Excerpts taken from Lucas, S.E. (12th ed.) (2015). The Art .docx
 
SCM PROJECT
SCM PROJECTSCM PROJECT
SCM PROJECT
 

Más de Domino Data Lab

The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...Domino Data Lab
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataDomino Data Lab
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itDomino Data Lab
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationDomino Data Lab
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryDomino Data Lab
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusDomino Data Lab
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterDomino Data Lab
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceDomino Data Lab
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Domino Data Lab
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Domino Data Lab
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at ScaleDomino Data Lab
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataDomino Data Lab
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data ScientistsDomino Data Lab
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Domino Data Lab
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyDomino Data Lab
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsDomino Data Lab
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino Data Lab
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceDomino Data Lab
 
Fuzzy Matching to the Rescue
Fuzzy Matching to the RescueFuzzy Matching to the Rescue
Fuzzy Matching to the RescueDomino Data Lab
 

Más de Domino Data Lab (20)

The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...The Proliferation of New Database Technologies and Implications for Data Scie...
The Proliferation of New Database Technologies and Implications for Data Scie...
 
Racial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops dataRacial Bias in Policing: an analysis of Illinois traffic stops data
Racial Bias in Policing: an analysis of Illinois traffic stops data
 
Data Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using itData Quality Analytics: Understanding what is in your data, before using it
Data Quality Analytics: Understanding what is in your data, before using it
 
Supporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentationSupporting innovation in insurance with randomized experimentation
Supporting innovation in insurance with randomized experimentation
 
Leveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive IndustryLeveraging Data Science in the Automotive Industry
Leveraging Data Science in the Automotive Industry
 
Summertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile VirusSummertime Analytics: Predicting E. coli and West Nile Virus
Summertime Analytics: Predicting E. coli and West Nile Virus
 
Reproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with JupyterReproducible Dashboards and other great things to do with Jupyter
Reproducible Dashboards and other great things to do with Jupyter
 
GeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data ScienceGeoViz: A Canvas for Data Science
GeoViz: A Canvas for Data Science
 
Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field Managing Data Science | Lessons from the Field
Managing Data Science | Lessons from the Field
 
Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)Doing your first Kaggle (Python for Big Data sets)
Doing your first Kaggle (Python for Big Data sets)
 
Leveraged Analytics at Scale
Leveraged Analytics at ScaleLeveraged Analytics at Scale
Leveraged Analytics at Scale
 
How I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked DataHow I Learned to Stop Worrying and Love Linked Data
How I Learned to Stop Worrying and Love Linked Data
 
Software Engineering for Data Scientists
Software Engineering for Data ScientistsSoftware Engineering for Data Scientists
Software Engineering for Data Scientists
 
Making Big Data Smart
Making Big Data SmartMaking Big Data Smart
Making Big Data Smart
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
 
Building Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technologyBuilding Data Analytics pipelines in the cloud using serverless technology
Building Data Analytics pipelines in the cloud using serverless technology
 
Leveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science ToolsLeveraging Open Source Automated Data Science Tools
Leveraging Open Source Automated Data Science Tools
 
Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...Domino and AWS: collaborative analytics and model governance at financial ser...
Domino and AWS: collaborative analytics and model governance at financial ser...
 
The Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data ScienceThe Role and Importance of Curiosity in Data Science
The Role and Importance of Curiosity in Data Science
 
Fuzzy Matching to the Rescue
Fuzzy Matching to the RescueFuzzy Matching to the Rescue
Fuzzy Matching to the Rescue
 

Último

Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degreeyuu sss
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...ttt fff
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excelysmaelreyes
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 

Último (20)

Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
毕业文凭制作#回国入职#diploma#degree美国加州州立大学北岭分校毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#de...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excel
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 

What's in your workflow? Bringing data science workflows to business analysis at Capital One

  • 1. what’s in your workflow? reproducible business analysis at Capital One Emily Riederer Sr. Analyst, Capital One @EmilyRiederer / emily.riederer@capitalone.com
  • 3. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md The replicability crisis empowered the open science movement and turned reproducibility and workflows into hot topics "The Myth of Self-Correcting Science" The Atlantic Sarah Estes December 20, 2012 "How science goes wrong" The Economist Cover Story October 21, 2013 "Psychology's Replication Crisis has a Silver Lining" The Atlantic Paul Bloom Feb 18, 2016 "When the Revolution Came for Amy Cuddy" New York Times Magazine Susan Dominus October 18, 2017
  • 4. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Many organizations emerged to develop and evangelize better practices for scientific transparency and collaboration • Offer two-day bootcamps on scientific computing and reproducible research • Target researchers in diverse academic disciplines (e.g. ecology) • Over 22,000 workshop attendees and 1,000 instructors since 2012 • “Good Enough Practices in Scientific Computing”
  • 5. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Industry embraced advancements in reproducibility for gains to quality and efficiency • Developed internal R package for tooling and community building among data scientists • Promotes efficient work and standardized style • Share analyses on Knowledge Repo (built on GitHub)
  • 6. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Analyst Scientist Traditional business analysis’ focus on solving a specific case is a barrier to reproducible thinking Abstraction Concrete Business Cases Reproducibility Taxonomy (Stodden, et al, 2013) Open/Reproducible Auditable Confirmable Replicable Reviewable
  • 7. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md The traditional lens of business analysis is reinforced by the standard tools of the trade Abstraction Concrete Business Cases Spreadsheets expose data, hide computation Scripting logs computation, abstracts data Analyst Scientist
  • 8. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md The language business analysts use to articulate their problems obscures the link to technology-driven solutions Reproducibility Version Control Peer/Code Review Re-Work Tribal Knowledge Team Work Sanity Checking
  • 9. the tidycf R package at Capital One turning business analysis on its head by turning cashflows on their side
  • 10. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Decision Making Validation & Monitoring Modeling Scenario Analysis Cashflow analysis is integral to many interrelated pieces of business analytics Documentation & Governance
  • 11. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Decision Making Validation & Monitoring Modeling Scenario Analysis Documentation & Governance Database System BI Visualization Tool Legacy Statistical Computing Platform Legacy Statistical Computing Platform Legacy Statistical Computing Platform FTP Client FTP Client Spreadsheet Software Spreadsheet Software Word Processor Word Processor Spreadsheet Software Presentation Software • Black box • Limited capability • Manual documentation • Highly manual process • System-specific knowledge • Slow iteration Patchwork processes lead to inefficiency, poor documentation, and limited reproducibility
  • 12. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Decision Making Validation & Monitoring Modeling Scenario Analysis Building an end-to-end R package enabled an efficient and reproducible workflow • Accessible code • Extensible code • Real-time documentation • Automated & reproducible • General versus system- specific knowledge • Rapid iteration
  • 13. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Nuanced business decisions are driven by a remarkably standard analytical “engine” Business Processes Workflow (R for Data Science)
  • 14. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Cashflow statements are a typical representation of valuations models in the world of financial analysis Time Period 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Total Revenue 4.7$ 21.1$ 47.1$ 34.3$ 1.6$ 13.2$ 7.9$ 57.1$ 4.1$ 14.7$ 4.7$ 21.1$ 47.1$ 34.3$ 1.6$ 13.2$ 7.9$ 57.1$ Interchange Revenue 2.0$ 19.5$ 45.8$ 31.7$ (0.7)$ 11.0$ 6.8$ 54.9$ 3.0$ 13.7$ 2.0$ 19.5$ 45.8$ 31.7$ (0.7)$ 11.0$ 6.8$ 54.9$ Spend 195.4$ 1,945.4$ 4,583.6$ 3,174.0$ (71.5)$ 1,096.2$ 678.6$ 5,486.5$ 304.0$ 1,366.2$ 195.4$ 1,945.4$ 4,583.6$ 3,174.0$ (71.5)$ 1,096.2$ 678.6$ 5,486.5$ Interchange Rate 1.0% Interest Revenue 1.6$ 0.2$ 0.2$ 1.4$ 0.8$ 1.5$ 0.9$ 0.9$ 0.7$ 0.5$ 1.6$ 0.2$ 0.2$ 1.4$ 0.8$ 1.5$ 0.9$ 0.9$ Fee Revenue 0.9$ 0.4$ 0.9$ 0.0$ 0.7$ 0.1$ 0.2$ 0.6$ 0.3$ 0.5$ 0.9$ 0.4$ 0.9$ 0.0$ 0.7$ 0.1$ 0.2$ 0.6$ Other Revenues 0.3$ 1.0$ 0.1$ 1.1$ 0.8$ 0.6$ 0.1$ 0.7$ 0.1$ 0.1$ 0.3$ 1.0$ 0.1$ 1.1$ 0.8$ 0.6$ 0.1$ 0.7$ Total Expense 14.2$ 3.2$ 9.9$ 2.0$ 8.9$ 10.9$ 5.0$ 1.6$ 16.0$ 19.9$ 14.2$ 3.2$ 9.9$ 2.0$ 8.9$ 10.9$ 5.0$ 1.6$ Operating Expenses 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ Marketing Expenses 5.0$ 2.0$ -$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$ 5.0$ 2.0$ 2.0$ 1.0$ 2.0$ 1.0$ 2.0$ 1.0$ Credit Losses -$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 1.0$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ 0.2$ Recoveries & Coll -$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 2.0$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ 0.1$ Cost of Funds 12.0$ 8.1$ 0.5$ 16.8$ 1.3$ 1.6$ 5.5$ 2.6$ 1.6$ 0.4$ 5.2$ 11.7$ 7.4$ 20.6$ 4.8$ 12.4$ 3.5$ 5.0$ Outstandings 447.3$ 346.8$ 19.0$ 353.9$ 297.6$ 42.9$ 276.3$ 358.8$ 50.7$ 101.7$ 433.2$ 479.1$ 229.6$ 430.4$ 272.0$ 415.2$ 123.1$ 154.0$ Loan Rate 2.7% 2.3% 2.6% 4.7% 0.4% 3.7% 2.0% 0.7% 3.1% 0.4% 1.2% 2.5% 3.2% 4.8% 1.8% 3.0% 2.9% 3.2% Other Expenses (3.8)$ (8.2)$ 8.2$ (17.1)$ 4.3$ 7.0$ (3.7)$ (3.3)$ 11.1$ 17.3$ 0.0$ (11.9)$ (0.8)$ (20.9)$ 0.7$ (3.9)$ (1.8)$ (5.7)$ NIBT (9.5)$ 17.9$ 37.1$ 32.4$ (7.3)$ 2.3$ 2.9$ 55.5$ (11.9)$ (5.2)$ (9.5)$ 17.9$ 37.1$ 32.4$ (7.3)$ 2.3$ 2.9$ 55.5$ Tax (6.2)$ 11.6$ 24.1$ 21.1$ (4.7)$ 1.5$ 1.9$ 36.1$ (7.7)$ (3.4)$ (6.2)$ 11.6$ 24.1$ 21.1$ (4.7)$ 1.5$ 1.9$ 36.1$ Tax Rate 36.0% NIAT (3.3)$ 6.3$ 13.0$ 11.3$ (2.6)$ 0.8$ 1.0$ 19.4$ (4.2)$ (1.8)$ (3.3)$ 6.3$ 13.0$ 11.3$ (2.6)$ 0.8$ 1.0$ 19.4$ Equity Flow (5.0)$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ (5.0)$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ 1.0$ Cashflow (8.3)$ 7.3$ 14.0$ 12.3$ (1.6)$ 1.8$ 2.0$ 20.4$ (3.2)$ 4.0$ (8.3)$ 7.3$ 14.0$ 12.3$ (1.6)$ 1.8$ 2.0$ 20.4$ Discounted CF (8.3)$ 7.2$ 13.8$ 12.1$ (1.5)$ 1.7$ 1.9$ 19.5$ (3.0)$ 3.8$ (8.3)$ 7.2$ 13.8$ 12.1$ (1.5)$ 1.7$ 1.9$ 19.5$ Lifetime DCF 47.2$ TV 10.0$ PV 57.2$ Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 15. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md However, cashflow statements are not optimized for either human or machine readability Time Mix of time series and pointwise fields; data and assumptions Variable information contained in formatting – bold, italics, indentations Mix of data and calculations, with every cell exposed to potential typos Fake data is provided for illustrative purposes only and does not represent Capital One performance Data defined by specific location on sheet so adding a new line-item can perturb downstream calculations
  • 16. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Instead, we used tidy cashflows, applying the principles of tidy data analysis • “Happy families are all alike; every unhappy family is unhappy in its own way “ – Leo Tolstoy, Anna Karenina • “Like families, tidy datasets are all alike but every messy dataset is messy in its own way.” – Hadley Wickham, “Tidy Data” Segment Time Tot_Rev Tot_Exp NIBT Tax NIAT Eq_Flow Cashflow Super 1 4.69 14.20 -9.50 -6.18 -3.33 -5.00 -8.33 Super 2 21.08 3.19 17.90 11.63 6.26 1.00 7.26 Super 3 47.07 9.94 37.13 24.13 13.00 1.00 14.00 Super 4 34.35 1.96 32.39 21.05 11.34 1.00 12.34 Super 5 1.59 8.89 -7.30 -4.75 -2.56 1.00 -1.56 Prime 1 57.61 47.45 10.16 6.60 3.56 -5.00 -1.44 Prime 2 93.78 5.52 88.26 57.37 30.89 1.00 31.89 Prime 3 17.74 54.17 -36.43 -23.68 -12.75 1.00 -11.75 Prime 4 36.98 3.93 33.05 21.48 11.57 1.00 12.57 Prime 5 78.72 55.98 22.74 14.78 7.96 1.00 8.96 Time 1 2 3 4 5 Superprime Total Revenue $4.69 $21.08 $47.07 $34.35 $1.59 Total Expense $14.20 $3.19 $9.94 $1.96 $8.89 NIBT -$9.50 $17.90 $37.13 $32.39 -$7.30 Tax -$6.18 $11.63 $24.13 $21.05 -$4.75 NIAT -$3.33 $6.26 $13.00 $11.34 -$2.56 Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00 Cashflow -$8.33 $7.26 $14.00 $12.34 -$1.56 Prime Total Revenue $57.61 $93.78 $17.74 $36.98 $78.72 Total Expense $47.45 $5.52 $54.17 $3.93 $55.98 NIBT $10.16 $88.26 -$36.43 $33.05 $22.74 Tax $6.60 $57.37 -$23.68 $21.48 $14.78 NIAT $3.56 $30.89 -$12.75 $11.57 $7.96 Equity Flow -$5.00 $1.00 $1.00 $1.00 $1.00 Cashflow -$1.44 $31.89 -$11.75 $12.57 $8.96 Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 17. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Tidy cashflows streamline the workflow to facilitate advanced analytics like bootstrapping error bars and indifference curves Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 18. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Organically evolving the tidycf package while addressing business problems led to efficient and empathetic development Task: Valuations Process 1: Data Validation Process 2: Data Exploration Process n-1: Model Validation Process n+1 … z: Analysis with Model Framework 1: Data Validation Framework 2: Data Exploration Framework n-1: Model Validation Framework n+1 … z: Analysis with Model calc functions viz functions tbl functions … … Process 3: Model Building Framework 3: Model Building Process n: Model Intuition Framework n: Model Intuition Business Problems R Markdown Templates R functions
  • 19. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md RMarkdown templates enable package discoverability, immersion into the broader R language, and contextual knowledge transfer while generating documentation Code comments explain syntax and suggest new functions to try Text commentary facilitates knowledge transfer of business context and intuition
  • 20. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Functions manipulate tidy data and provide output consistent with their taxonomy and compatible with any tidyverse pipeline calc functions viz functions tbl functions data frame data frame graphic pivoted data get functions DB Conns, Internet Proxies, etc.
  • 21. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md These functions are intuitively related to help users quickly generate standard views Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 22. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Integration with the tidyverse allows functions to provide both structure and flexibility Fake data is provided for illustrative purposes only and does not represent Capital One performance
  • 23. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md tidycf’s R Project template standardizes file structure for better project management RStudio > File > New Project > New Directory • /analysis/ : core scripts (.Rmd) and final outputs (.HTML) • /data/ : raw data (or in our case, pulled from SQL) • /doc/: text files providing context and documentation • /ext/ : external files needed for project • /output/ : intermediate/final data formats • /src/ : other helper scripts (e.g. SQL, python)
  • 24. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md RMarkdown templates document data transformations and demonstrate relative paths to save intermediate artifacts in the appropriate directory Data Validation Exploratory Data Analysis Model Validation Model Analytics (Multiple Modeling Steps) … Model Monitoring raw out1 out_t out_t+1 model model out1 out2 out_t+1 model R Markdown Templates Output DataInput Data ./data/ ./analysis/ ./output/ Directoryexternal External Source
  • 25. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Industry, academia, and the open-source community served as inspirations for our three- pronged training approach Community BuildingBootcampsIndividual Resources Set-Up Guides, Self-Directed Learning, and Prerequisites Three day bootcamp – conceptual and hands-on Interaction (forum) and engagement (contribution) opportunities
  • 26. Emily Riederer, Capital One (@EmilyRiederer) References on GitHub: emilyriederer/references/WIYW.md Our resulting package emphasizes reproducibility while immersing business analysts in R odbc DBI dbplyr rmarkdown knitr httr devtools plotly tidyverse
  • 27. what’s in your workflow? reproducible business analysis at Capital One Emily Riederer Sr. Analyst, Capital One @EmilyRiederer / emily.riederer@capitalone.com