© 2015 ZestFinance, Inc.
Data Science ≠ Big Data
Jim McGuire
ZestFinance
Disregard (Possibly) If You Work Here:
Real “Big Data” Challenges
Company | Amount of Data
Facebook | 1.2B Users
Google | 60T Pages
Amazon | 200M products
XBox | 50M Users
ZestFinance and Big Data
• Big Data Lends New Zest to Banks’ Credit Judgments
• How Big Data is Revolutionizing the Credit Scoring Industry
• Beware Big Data’s Easy Answers
• How Big Data Might Mean Better Business for Big Banks
• ZestFinance… Uses Big Data to Weed Out Deadbeats
• For ZestFinance, Big Data Comes With Big Responsibility
• Big Data Uncovers Some Weird Correlations
• How Big Data Could Replace Your Credit Score
• Can ‘Big Data’ Lift People Out of Cycles of Debt?
ZestFinance
• Started in 2009 by Douglas Merrill, ramped up in 2011
• Mission: to provide fair and transparent credit to everyone
• Initial focus on the underbanked in the US
• Extensive use of ML
• Provide models and services for:
  • Underwriting
  • Fraud Prevention
  • Customer Service
  • Collections
  • Marketing
• Consult with external clients
  • Traditional lenders
  • Subprime lenders
  • Media providers
  • eCommerce
  • Telemarketing
  • Insurance
  • International
Extending Credit to the Underbanked
• Traditional credit scoring methodology in use for 40+ years
• FICO et al. work well for those with ample credit history
• Lumped together at the bottom score bands:
  • Little credit history
  • Bankruptcies
  • Poor credit history
  • Use of alternative sources of credit
• Problems in modeling credit for <600 FICO
  • Traditional credit bureau data is sketchy
  • Alternative providers of “credit” data with limited coverage and quality
  • Higher incidence of fraud
  • Most borrowers already live financially “on the edge”
ZestFinance & JD.com
• Credit in China:
  • Historically a “savings” culture
  • < 25% of population has access to credit
  • Growing middle class
  • Growing population of young workers in tech industry
  • No credit bureaus per se
  • Credit available for corporations and the wealthy
  • Companies don’t have data or tools to assess risk
• Tech companies are being given licenses to lend and build credit scores
• Learnings from “deep subprime” in America translate to new markets
  • No extensive credit data
  • Must combine noisy data from disparate sources
  • Traditional credit scoring philosophies may not apply
Why Data Science is More Than Just a Lot of Data
• Interpretability / insights necessary for buy-in
• Past business decisions leave imprint on your data
• Expectations of model must be financially grounded
• Noisy data with lots of holes and biases
• Often trying to capture ephemeral “human behavior”
• Complicated structure / relations between fields
EC2 + Favorite ML Software
Shaped Datasets
• Past business decisions shape datasets used for building models
• More extreme for some verticals, like lending
• Toy example to demonstrate effect
Shaped Datasets
• Launch new model, change
• Collect more data
• Iterate on model build
Shaped Datasets
• After several iterations, end up in this situation
• How to predict swap-in population after next model iteration?
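The shaping effect described on these slides can be sketched with a small simulation. This is not the toy example from the talk (its figures did not survive extraction); the score, the linear default relationship, and the cutoffs below are all assumptions chosen only for illustration.

```python
import random

def simulate_applicants(n, seed=0):
    """Return (score, defaulted) pairs where default risk falls as score rises."""
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        score = rng.random()              # hypothetical underwriting score in [0, 1)
        p_default = 0.5 - 0.4 * score     # assumed "true" relationship, not from the talk
        applicants.append((score, rng.random() < p_default))
    return applicants

def default_rate(rows):
    """Share of rows that defaulted."""
    return sum(defaulted for _, defaulted in rows) / len(rows)

population = simulate_applicants(20_000)

# Each model iteration raises the approval cutoff. Outcomes are only observed
# for funded applicants, so every new training set covers a narrower slice.
history = []
for cutoff in (0.2, 0.4, 0.6):
    funded = [(s, d) for s, d in population if s >= cutoff]
    history.append({
        "cutoff": cutoff,
        "funded": len(funded),
        "lowest_score_seen": min(s for s, _ in funded),
        "observed_default_rate": default_rate(funded),
    })

for row in history:
    print(row)
print("population default rate:", default_rate(population))
```

After the last iteration the training data contains no applicant below the cutoff, so any estimate for a swap-in population below it would be an extrapolation rather than something the data supports.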
Blind Spots in Data Collection
[Figure: diagram labeled Blind Spot, Training Set, and Entire Applicant Population]
• Model is not validated in regions of feature space that exist in whole population
• Capping variables may help, or may give you false security
• Box cuts can exacerbate issue
• Univariate analysis may be misleading
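A minimal sketch of why per-variable checks and capping can miss a blind spot, assuming a made-up two-feature training set that lies on a diagonal. The helper names and the nearest-neighbor distance check are illustrative choices, not a method taken from the talk.

```python
import math

# Hypothetical training set: both features move together (points on x = y).
training = [(i / 10, i / 10) for i in range(11)]

def within_univariate_ranges(point, data):
    """True if every coordinate lies inside that feature's observed min/max."""
    for dim, value in enumerate(point):
        column = [row[dim] for row in data]
        if not (min(column) <= value <= max(column)):
            return False
    return True

def nearest_distance(point, data):
    """Euclidean distance from the point to its closest training row."""
    return min(math.dist(point, row) for row in data)

def cap_to_training_range(point, data):
    """Clip each coordinate to the range seen in training."""
    capped = []
    for dim, value in enumerate(point):
        column = [row[dim] for row in data]
        capped.append(min(max(value, min(column)), max(column)))
    return tuple(capped)

for applicant in [(0.5, 0.5), (0.1, 0.9), (1.4, 0.2)]:
    capped = cap_to_training_range(applicant, training)
    print(
        applicant,
        "in univariate ranges:", within_univariate_ranges(applicant, training),
        "| capped:", capped,
        "| distance of capped point to training data:",
        round(nearest_distance(capped, training), 3),
    )
```

The applicant at (0.1, 0.9) passes every univariate range check yet sits far from any training row, and capping (1.4, 0.2) moves it inside the box without moving it near the data the model was validated on.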
Blind Spots in Data Collection
[Figure panels: Volume in Blind Spot; Performance in Blind Spot]
Univariate Signals and Shaping
• In business settings, predictors are often intuitive and interpretable
• After building a model, this variable bubbles up as significant
Univariate Signals and Shaping
• In this case, the univariate behavior of the variable is exactly the opposite of expectations
[Figure axis: # Delinquent Accounts]
Univariate Signals and Shaping
[Figure axes: # Trades >30 DQ Ever vs. # Trades Ever Opened]
• Possible to have a high number of delinquent accounts if they are only a small fraction of total accounts
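The reversal can be reproduced with a few made-up segments. The counts below are invented purely to illustrate the mechanism (they are not data from the talk): borrowers with many trades are assumed to be lower risk overall, and they are also the only ones who can accumulate a high delinquent-account count.

```python
from collections import defaultdict

# (delinquent_accounts, trades_ever_opened, loans, defaults) -- made-up counts
segments = [
    (0, 3, 1000, 200),    # thin file, no delinquencies
    (1, 3, 1000, 300),    # thin file, one delinquency
    (1, 30, 1000, 50),    # thick file, one delinquency
    (3, 30, 1000, 100),   # thick file, three delinquencies
]

def default_rate_by(key):
    """Aggregate loans and defaults by key(segment) and return default rates."""
    loans, defaults = defaultdict(int), defaultdict(int)
    for segment in segments:
        k = key(segment)
        loans[k] += segment[2]
        defaults[k] += segment[3]
    return {k: defaults[k] / loans[k] for k in loans}

# Univariate view: group only by number of delinquent accounts.
univariate = default_rate_by(lambda s: s[0])

# Conditional view: hold total trades fixed, then compare delinquency counts.
conditional = default_rate_by(lambda s: (s[1], s[0]))

print("by # delinquent accounts:", univariate)
print("by (# trades, # delinquent accounts):", conditional)
```

In this sketch the univariate default rate falls as delinquent accounts rise, while within each total-trades group it rises, which matches the intuitive direction once the variable is read alongside trades ever opened.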
Historical Business Performance and Unintentional Proxies
• Categorical feature showed good risk splits
Category | Metric
Red | 8.3%
Green | 7.9%
Blue | 6.4%
• But really it just proxied for time
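One way to check for this kind of proxy is to recompute the split within each time period. The sketch below uses invented cohort counts and an assumed two-period split (not the figures on the slide): the category mix shifts over time while the default rate within each period is identical across categories.

```python
from collections import defaultdict

# (category, origination_period, loans, defaults) -- made-up counts
cohorts = [
    ("Red",   "early", 900, 90),
    ("Green", "early", 100, 10),
    ("Red",   "late",  100, 6),
    ("Green", "late",  900, 54),
]

def default_rates(rows, key):
    """Aggregate loans and defaults by key(row) and return default rates."""
    loans, defaults = defaultdict(int), defaultdict(int)
    for row in rows:
        k = key(row)
        loans[k] += row[2]
        defaults[k] += row[3]
    return {k: defaults[k] / loans[k] for k in loans}

by_category = default_rates(cohorts, key=lambda r: r[0])
by_category_and_period = default_rates(cohorts, key=lambda r: (r[0], r[1]))

print("overall:", by_category)
print("within period:", by_category_and_period)
```

Here the overall rates suggest the category separates risk, but the within-period rates show no difference between categories, so the apparent split comes entirely from when each category was common.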
Lots of Missing Data That Have Different Meanings
• Context relevant
• Combination of ML expertise and business acumen to handle appropriately
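As a hedged illustration of missing values carrying different meanings, the sketch below encodes the reason a field is empty instead of imputing a single value. The field names (`bureau_hit`, `months_since_delinquency`) and the two reasons are hypothetical, chosen only to show the pattern.

```python
from enum import Enum

class MissingReason(Enum):
    PRESENT = "present"
    NO_FILE_FOUND = "no_file_found"     # data source returned no record at all
    NOT_APPLICABLE = "not_applicable"   # record exists, but the event never happened

def encode_months_since_delinquency(record):
    """Turn one raw field into a numeric value plus explicit missingness flags."""
    value = record.get("months_since_delinquency")
    if not record.get("bureau_hit", False):
        reason = MissingReason.NO_FILE_FOUND
    elif value is None:
        reason = MissingReason.NOT_APPLICABLE
    else:
        reason = MissingReason.PRESENT
    return {
        # Placeholder 0 is only safe because the flags below tell the model
        # which kind of "missing" it is looking at.
        "months_since_delinquency": value if reason is MissingReason.PRESENT else 0,
        "msd_no_file": int(reason is MissingReason.NO_FILE_FOUND),
        "msd_never_delinquent": int(reason is MissingReason.NOT_APPLICABLE),
    }

examples = [
    {"bureau_hit": True, "months_since_delinquency": 14},
    {"bureau_hit": True, "months_since_delinquency": None},
    {"bureau_hit": False},
]
for record in examples:
    print(encode_months_since_delinquency(record))
```

Which reasons exist, and whether they should be separate features at all, depends on the business context of each data source.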
Hire to Solve Your Business Problem
We Need Data Artists To Save Zombie Borrowers:
The hottest job right now is ‘data scientist’, but that label is wrong. It’s not just about science, it’s also about art. The hottest job should be data artist – those who understand the quirks of data and think through and enjoy the nature of what you’re talking about… People think you get a whole bunch of bits and drop them into a stats box and get all the answers. Turns out every answer you get is incorrect.
• “You can find a great developer and a great researcher who has a background in statistics, and maybe you can find a great problem solver, but to find that in the same person is hard.”
  • Stan Humphries, Zillow
• “Practice. I strongly believe that being a data scientist is an interesting merger of science and craftsmanship. You need to understand the theory but at the same time you also need to exercise your gut feeling.”
  • Michael Berthold, KNIME
Hiring for Diversity
• Naturally an interdisciplinary field that is rapidly evolving
• “Data Scientist” means a lot of different things
• “Full stack analyst” vs. “data engineer” vs. …
• Are coding skills important? Are big data skills important?
• Draw from range of backgrounds:
  • Psychology
  • Biostatistics
  • Physics
  • Applied Math
  • Computer Science
  • Statistics
  • Engineering
  • MBA
Q & A
• Thanks!

Más contenido relacionado

La actualidad más candente

How Big Data identifies early indicators of Mental Stress
How Big Data identifies early indicators of Mental StressHow Big Data identifies early indicators of Mental Stress
How Big Data identifies early indicators of Mental StressCoert Du Plessis (杜康)
 
Seven Trends in Government Business Intelligence
Seven Trends in Government Business IntelligenceSeven Trends in Government Business Intelligence
Seven Trends in Government Business IntelligenceTableau Software
 
Big Data, Big Investment
Big Data, Big InvestmentBig Data, Big Investment
Big Data, Big InvestmentGGV Capital
 
Analytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataAnalytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataMicrosoft
 
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelDecision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelLorien Pratt
 
The Five Data Questions
The Five Data QuestionsThe Five Data Questions
The Five Data Questionscrystalpullen
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)mark madsen
 
What is Big Data? - Business Plans
What is Big Data? - Business PlansWhat is Big Data? - Business Plans
What is Big Data? - Business PlansOur Business Ladder
 
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)Chief Analytics Officer Forum
 
Big Data – From Strategy to Production
Big Data – From Strategy to ProductionBig Data – From Strategy to Production
Big Data – From Strategy to ProductionSemantic Web Company
 
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudStrata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudJaipaul Agonus
 
Digital Mines ver 2.0 | 7 lessons on automation i learnt leading digital in...
Digital Mines ver 2.0  |  7 lessons on automation i learnt leading digital in...Digital Mines ver 2.0  |  7 lessons on automation i learnt leading digital in...
Digital Mines ver 2.0 | 7 lessons on automation i learnt leading digital in...Coert Du Plessis (杜康)
 
Is big data handicapped by "design"? Seven design principles for communicatin...
Is big data handicapped by "design"? Seven design principles for communicatin...Is big data handicapped by "design"? Seven design principles for communicatin...
Is big data handicapped by "design"? Seven design principles for communicatin...Zach Gemignani
 
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleO'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleVasu S
 
Big Data Innovation
Big Data InnovationBig Data Innovation
Big Data Innovationpaul.hawking
 
Data science capabilities
Data science capabilitiesData science capabilities
Data science capabilitiesMathieu Boucher
 

La actualidad más candente (20)

How Big Data identifies early indicators of Mental Stress
How Big Data identifies early indicators of Mental StressHow Big Data identifies early indicators of Mental Stress
How Big Data identifies early indicators of Mental Stress
 
Moving Big Data to Big Value
Moving Big Data to Big ValueMoving Big Data to Big Value
Moving Big Data to Big Value
 
How does big data impact you
How does big data impact youHow does big data impact you
How does big data impact you
 
Seven Trends in Government Business Intelligence
Seven Trends in Government Business IntelligenceSeven Trends in Government Business Intelligence
Seven Trends in Government Business Intelligence
 
Big Data, Big Investment
Big Data, Big InvestmentBig Data, Big Investment
Big Data, Big Investment
 
Analytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big dataAnalytics 3.0 Measurable business impact from analytics & big data
Analytics 3.0 Measurable business impact from analytics & big data
 
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next LevelDecision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
Decision Intelligence: How AI and DI (and YOU) are Evolving to the Next Level
 
The Five Data Questions
The Five Data QuestionsThe Five Data Questions
The Five Data Questions
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
What is Big Data? - Business Plans
What is Big Data? - Business PlansWhat is Big Data? - Business Plans
What is Big Data? - Business Plans
 
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)
IBM presentation at the Chief Analytics Officer Forum East Coast USA (#CAOForum)
 
Big Data – From Strategy to Production
Big Data – From Strategy to ProductionBig Data – From Strategy to Production
Big Data – From Strategy to Production
 
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the CloudStrata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud
 
Digital Mines ver 2.0 | 7 lessons on automation i learnt leading digital in...
Digital Mines ver 2.0  |  7 lessons on automation i learnt leading digital in...Digital Mines ver 2.0  |  7 lessons on automation i learnt leading digital in...
Digital Mines ver 2.0 | 7 lessons on automation i learnt leading digital in...
 
Business analytics
Business analyticsBusiness analytics
Business analytics
 
Is big data handicapped by "design"? Seven design principles for communicatin...
Is big data handicapped by "design"? Seven design principles for communicatin...Is big data handicapped by "design"? Seven design principles for communicatin...
Is big data handicapped by "design"? Seven design principles for communicatin...
 
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | QuboleO'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
O'Reilly ebook: Machine Learning at Enterprise Scale | Qubole
 
"Big Data Dreams"
"Big Data Dreams""Big Data Dreams"
"Big Data Dreams"
 
Big Data Innovation
Big Data InnovationBig Data Innovation
Big Data Innovation
 
Data science capabilities
Data science capabilitiesData science capabilities
Data science capabilities
 

Destacado

Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiTajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiData Con LA
 
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...Data Con LA
 
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...Data Con LA
 
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...Data Con LA
 
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Data Con LA
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxData Con LA
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaData Con LA
 
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of AmazonBig Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of AmazonData Con LA
 
Data science and good questions eric kostello
Data science and good questions eric kostelloData science and good questions eric kostello
Data science and good questions eric kostelloData Con LA
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksData Con LA
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineData Con LA
 
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Data Con LA
 
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...Data Con LA
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Data Con LA
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...Data Con LA
 
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...Data Con LA
 
Defigo Security Solutions
Defigo Security Solutions Defigo Security Solutions
Defigo Security Solutions Bizofit
 
Cobra Guard Powerpoint
Cobra Guard PowerpointCobra Guard Powerpoint
Cobra Guard Powerpointltcinfo
 
FieldEZ_Corporate_Presentation_BFSI Ver1.0
FieldEZ_Corporate_Presentation_BFSI Ver1.0FieldEZ_Corporate_Presentation_BFSI Ver1.0
FieldEZ_Corporate_Presentation_BFSI Ver1.0Saroj Kumar Sharma
 

Destacado (20)

Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choiTajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
Tajolabigdatacamp2014 140618135810-phpapp01 hyunsik-choi
 
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...
Big Data Day LA 2015 - Tips for Building Self Service Data Science Platform b...
 
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...
Big Data Day LA 2015 - Data mining, forecasting, and BI at the RRCC by Benjam...
 
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...
Big Data Day LA 2015 - Machine Learning on Largish Data by Szilard Pafka of E...
 
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
Big Data Day LA 2015 - What's new and next in Apache Tez by Bikas Saha of Hor...
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of Datastax
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of AmazonBig Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
Big Data Day LA 2015 - The AWS Big Data Platform by Michael Limcaco of Amazon
 
Data science and good questions eric kostello
Data science and good questions eric kostelloData science and good questions eric kostello
Data science and good questions eric kostello
 
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder HortonworksThe Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
The Future of Hadoop by Arun Murthy, PMC Apache Hadoop & Cofounder Hortonworks
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
 
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...
Big Data Day LA 2016/ Data Science Track - Enabling Cross-Screen Advertising ...
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Deep Learning at Scale - A...
 
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...
Big Data Day LA 2016/ Use Case Driven track - BI is broken, Dave Fryer, Produ...
 
Defigo Security Solutions
Defigo Security Solutions Defigo Security Solutions
Defigo Security Solutions
 
Cobra Guard Powerpoint
Cobra Guard PowerpointCobra Guard Powerpoint
Cobra Guard Powerpoint
 
Robol
Robol Robol
Robol
 
FieldEZ_Corporate_Presentation_BFSI Ver1.0
FieldEZ_Corporate_Presentation_BFSI Ver1.0FieldEZ_Corporate_Presentation_BFSI Ver1.0
FieldEZ_Corporate_Presentation_BFSI Ver1.0
 

Similar a Big Data Day LA 2015 - Data Science ≠ Big Data by Jim McGuire of ZestFinance

How CIOs are grappling with big data analytics in Canada
How CIOs are grappling with big data analytics in CanadaHow CIOs are grappling with big data analytics in Canada
How CIOs are grappling with big data analytics in CanadaCanadianCIO (IT World Canada)
 
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013StampedeCon
 
Usama Fayyad talk in South Africa: From BigData to Data Science
Usama Fayyad talk in South Africa:  From BigData to Data ScienceUsama Fayyad talk in South Africa:  From BigData to Data Science
Usama Fayyad talk in South Africa: From BigData to Data ScienceUsama Fayyad
 
Oceans of big data: Take the plunge or wade in slowly?
Oceans of big data: Take the plunge or wade in slowly?Oceans of big data: Take the plunge or wade in slowly?
Oceans of big data: Take the plunge or wade in slowly?Deloitte Canada
 
Unlocking the Value of Big Data (Innovation Summit 2014)
Unlocking the Value of Big Data (Innovation Summit 2014)Unlocking the Value of Big Data (Innovation Summit 2014)
Unlocking the Value of Big Data (Innovation Summit 2014)Dun & Bradstreet
 
big data analytics pgpmx2015
big data analytics pgpmx2015big data analytics pgpmx2015
big data analytics pgpmx2015Sanmeet Dhokay
 
Big Data for the Next Big Idea in Financial Services (Whitepaper)
Big Data for the Next Big Idea in Financial Services (Whitepaper)Big Data for the Next Big Idea in Financial Services (Whitepaper)
Big Data for the Next Big Idea in Financial Services (Whitepaper)NAFCU Services Corporation
 
Little Steps to BIG Data
Little Steps to BIG DataLittle Steps to BIG Data
Little Steps to BIG DataAptera Inc
 
Big Data 101, What It Means for Business - BDI 12/4/13 The Future of Financia...
Big Data 101, What It Means for Business - BDI 12/4/13 The Future of Financia...Big Data 101, What It Means for Business - BDI 12/4/13 The Future of Financia...
Big Data 101, What It Means for Business - BDI 12/4/13 The Future of Financia...Business Development Institute
 
Startups Aiming to Disrupt Consumer Banking
Startups Aiming to Disrupt Consumer BankingStartups Aiming to Disrupt Consumer Banking
Startups Aiming to Disrupt Consumer BankingKevin Weeks
 
Bigdata for sme-industrial intelligence information-24july2017-final
Bigdata for sme-industrial intelligence information-24july2017-finalBigdata for sme-industrial intelligence information-24july2017-final
Bigdata for sme-industrial intelligence information-24july2017-finalstelligence
 
Creating Big Data Success with the Collaboration of Business and IT
Creating Big Data Success with the Collaboration of Business and ITCreating Big Data Success with the Collaboration of Business and IT
Creating Big Data Success with the Collaboration of Business and ITEdward Chenard
 
Big Data Jujitsu Walkthru Client x Client
Big Data Jujitsu Walkthru Client x ClientBig Data Jujitsu Walkthru Client x Client
Big Data Jujitsu Walkthru Client x ClientClient X Client
 
Big Data for Small Businesses
Big Data for Small BusinessesBig Data for Small Businesses
Big Data for Small BusinessesVivastream
 
Importance of Big data for your Business
Importance of Big data for your BusinessImportance of Big data for your Business
Importance of Big data for your Businessazuyo.com
 
Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.Bda assignment can also be used for BDA notes and concept understanding.
Bda assignment can also be used for BDA notes and concept understanding.Aditya205306
 
S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7S ba0881 big-data-use-cases-pearson-edge2015-v7
S ba0881 big-data-use-cases-pearson-edge2015-v7Tony Pearson
 
Get Smart: The Present and Future of Data Discovery
Get Smart: The Present and Future of Data DiscoveryGet Smart: The Present and Future of Data Discovery
Get Smart: The Present and Future of Data DiscoveryInside Analysis
 

Similar a Big Data Day LA 2015 - Data Science ≠ Big Data by Jim McGuire of ZestFinance (20)

How CIOs are grappling with big data analytics in Canada
How CIOs are grappling with big data analytics in CanadaHow CIOs are grappling with big data analytics in Canada
How CIOs are grappling with big data analytics in Canada
 
Thriving in the world of Big Data
Thriving in the world of Big DataThriving in the world of Big Data
Thriving in the world of Big Data
 
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013
Five Trends in Analytics - How to Take Advantage Today - StampedeCon 2013
 
Usama Fayyad talk in South Africa: From BigData to Data Science
Usama Fayyad talk in South Africa:  From BigData to Data ScienceUsama Fayyad talk in South Africa:  From BigData to Data Science
Usama Fayyad talk in South Africa: From BigData to Data Science
 

Big Data Day LA 2015 - Data Science ≠ Big Data by Jim McGuire of ZestFinance

  • 1. Data Science ≠ Big Data. Jim McGuire, ZestFinance. © 2015 ZestFinance, Inc.
  • 2. Disregard (Possibly) If You Work Here:
  • 3. Real “Big Data” Challenges
      Company     Amount of Data
      Facebook    1.2B Users
      Google      60T Pages
      Amazon      200M products
      XBox        50M Users
  • 4. ZestFinance and Big Data
      • Big Data Lends New Zest to Banks’ Credit Judgments
      • How Big Data is Revolutionizing the Credit Scoring Industry
      • Beware Big Data’s Easy Answers
      • How Big Data Might Mean Better Business for Big Banks
      • ZestFinance… Uses Big Data to Weed Out Deadbeats
      • For ZestFinance, Big Data Comes With Big Responsibility
      • Big Data Uncovers Some Weird Correlations
      • How Big Data Could Replace Your Credit Score
      • Can ‘Big Data’ Lift People Out of Cycles of Debt?
  • 5. ZestFinance
      • Started in 2009 by Douglas Merrill, ramped up in 2011
      • Mission: to provide fair and transparent credit to everyone
      • Initial focus on the underbanked in the US
      • Extensive use of ML
      • Provide models and services for:
          • Underwriting
          • Fraud Prevention
          • Customer Service
          • Collections
          • Marketing
      • Consult with external clients
          • Traditional lenders
          • Subprime lenders
          • Media providers
          • eCommerce
          • Telemarketing
          • Insurance
          • International
  • 6. Extending Credit to the Underbanked
      • Traditional credit scoring methodology in use for 40+ years
      • FICO et al work great for those with ample credit history
      • Lumped together at the bottom score bands:
          • Little credit history
          • Bankruptcies
          • Poor credit history
          • Use of alternative sources of credit
      • Problems in modeling credit for <600 FICO
          • Traditional credit bureau data is sketchy
          • Alternative providers of “credit” data with limited coverage and quality
          • Higher incidence of fraud
          • Most borrowers already live financially “on the edge”
  • 7. ZestFinance & JD.com
      • Credit in China:
          • Historically a “savings” culture
          • < 25% of population has access to credit
          • Growing middle class
          • Growing population of young workers in tech industry
          • No credit bureaus per se
          • Credit available for corporations and the wealthy
          • Companies don’t have data or tools to assess risk
          • Tech companies are being given licenses to lend and build credit scores
      • Learnings from “deep subprime” in America translate to new markets
          • No extensive credit data
          • Must combine noisy data from disparate sources
          • Traditional credit scoring philosophies may not apply
  • 8. Why Data Science is More Than Just a Lot of Data
      • Interpretability / insights necessary for buy-in
      • Past business decisions leave imprint on your data
      • Expectations of model must be financially grounded
      • Noisy data with lots of holes and biases
      • Often trying to capture ephemeral “human behavior”
      • Complicated structure / relations between fields
  • 9. EC2 + Favorite ML Software
  • 10. Shaped Datasets
      • Past business decisions shape datasets used for building models
      • More extreme for some verticals, like lending
      • Toy example to demonstrate effect
  • 11. Shaped Datasets
      • Launch new model, change
      • Collect more data
      • Iterate on model build
  • 12. Shaped Datasets
      • After several iterations, end up in this situation
      • How to predict swap-in population after next model iteration?
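The toy example on the slides is not reproduced in this transcript, so the following is only a minimal sketch of the shaping effect, under stated assumptions: each applicant has a hypothetical risk score in [0, 1), the probability of default equals that score, and a past business rule funded only applicants below a cutoff. The function names, the cutoff, and the data are illustrative and do not come from the talk.

```python
import random


def simulate_shaping(n=10_000, cutoff=0.5, seed=0):
    """Toy illustration: outcomes are only observed for funded applicants."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        score = rng.random()              # hypothetical risk score, higher = riskier
        defaulted = rng.random() < score  # assumed: default probability equals the score
        population.append((score, defaulted))
    # Assumed past business decision: only applicants below the cutoff were funded,
    # so only their outcomes ever reach the training set.
    approved = [(s, d) for s, d in population if s < cutoff]
    return population, approved


def default_rate(rows):
    """Share of rows whose outcome is a default."""
    return sum(d for _, d in rows) / len(rows)


population, approved = simulate_shaping()
print(f"population default rate:    {default_rate(population):.3f}")
print(f"training-set default rate:  {default_rate(approved):.3f}")
print(f"max score seen in training: {max(s for s, _ in approved):.3f}")
```

The training set never contains a score at or above the cutoff, so a model rebuilt on it has no observed outcomes for the applicants a looser policy would swap in.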
  • 13. Blind Spots in Data Collection
      [Figure labels: Blind Spot; Training Set; Entire Applicant Population]
      • Model is not validated in regions of feature space that exist in whole population
      • Capping variables may help, or may give you false security
      • Box cuts can exacerbate issue
      • Univariate analysis may be misleading
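One way to see why a univariate check can miss a blind spot is to compare per-feature range checks with a coarse joint-coverage check. The sketch below is an assumption-laden illustration, not the method used in the talk: the helper names, the grid of 4 bins per feature, and the toy points are all hypothetical.

```python
def feature_ranges(rows):
    """Per-feature (min, max) over a list of equal-length numeric tuples."""
    return [(min(col), max(col)) for col in zip(*rows)]


def outside_univariate(row, ranges):
    """True if any single feature lies outside its training range."""
    return any(not (lo <= x <= hi) for x, (lo, hi) in zip(row, ranges))


def grid_cell(row, ranges, bins=4):
    """Map a row to a coarse grid cell; values are clamped to the training range."""
    cell = []
    for x, (lo, hi) in zip(row, ranges):
        width = (hi - lo) / bins or 1.0   # avoid zero width for constant features
        idx = int((x - lo) / width)
        cell.append(min(max(idx, 0), bins - 1))
    return tuple(cell)


def blind_spot_share(training, applicants, bins=4):
    """Share of applicants flagged by the univariate check and by the joint check."""
    ranges = feature_ranges(training)
    seen = {grid_cell(r, ranges, bins) for r in training}
    uni = sum(outside_univariate(r, ranges) for r in applicants)
    joint = sum(grid_cell(r, ranges, bins) not in seen for r in applicants)
    return uni / len(applicants), joint / len(applicants)


# Training data lies along a diagonal; some applicants fall off it.
training = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
applicants = [(0.5, 3.5), (2.5, 2.5), (5, 1), (1.2, 1.8)]
uni_share, joint_share = blind_spot_share(training, applicants)
print(uni_share, joint_share)
```

In this toy data the applicant (0.5, 3.5) passes both per-feature range checks yet sits in a region with no training points, which is the kind of case a univariate analysis would not reveal.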
  • 14. Blind Spots in Data Collection
      [Figure panels: Volume in Blind Spot; Performance in Blind Spot]
  • 15. Univariate Signals and Shaping
      • In business settings, predictors are often intuitive and interpretable
      • After building a model, this variable bubbles up as significant
  • 16. Univariate Signals and Shaping
      • In this case, the univariate behavior of the variable is exactly the opposite of expectations
      [Figure axis: # Delinquent Accounts]
  • 17. Univariate Signals and Shaping
      [Figure axes: # Trades > 30 DQ Ever vs. # Trades Ever Opened]
      • Possible to have a high number of delinquent accounts if they are only a small fraction of total accounts
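The point about counts versus fractions can be illustrated with a small sketch. The two borrowers and the helper name `delinquency_share` are hypothetical and chosen only to show that ranking by raw delinquent count can disagree with ranking by share of trades ever opened.

```python
def delinquency_share(n_delinquent, n_opened):
    """Delinquent trades as a share of all trades ever opened (None if no trades)."""
    if n_opened <= 0:
        return None
    return n_delinquent / n_opened


# Hypothetical borrowers: (# trades ever delinquent, # trades ever opened)
borrowers = {"A": (4, 40), "B": (2, 3)}
shares = {name: delinquency_share(d, t) for name, (d, t) in borrowers.items()}

worst_by_count = max(borrowers, key=lambda name: borrowers[name][0])
worst_by_share = max(shares, key=shares.get)
print(worst_by_count, worst_by_share)
```

Borrower A has more delinquent accounts in absolute terms but a long credit history, while borrower B has fewer delinquencies on a much thinner file.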
  • 18. Historical Business Performance and Unintentional Proxies
      • Categorical feature showed good risk splits
      Category    Metric
      Red         8.3%
      Green       7.9%
      Blue        6.4%
      • But really it just proxied for time
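A simple way to test whether a category is only standing in for time is to compare its overall bad rates with its bad rates inside each time period. The sketch below uses made-up records (not the figures from the slide) in which the category mix shifts between two periods while the bad rate within each period is the same for every category; the field names and helpers are assumptions for illustration.

```python
from collections import defaultdict


def bad_rates(records, key):
    """Bad rate per group, where key maps a record to its group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [bad count, total count]
    for rec in records:
        group = key(rec)
        totals[group][0] += rec["bad"]
        totals[group][1] += 1
    return {group: bad / n for group, (bad, n) in totals.items()}


def make_records(period, category, n, n_bad):
    """Build n hypothetical loan records, the first n_bad of which went bad."""
    return [{"period": period, "category": category, "bad": int(i < n_bad)}
            for i in range(n)]


# Hypothetical book: Red dominates the earlier, worse period; Blue the later one.
records = (
    make_records("2013", "Red", 80, 8) + make_records("2013", "Blue", 20, 2)
    + make_records("2014", "Red", 20, 1) + make_records("2014", "Blue", 80, 4)
)

overall = bad_rates(records, key=lambda r: r["category"])
within = bad_rates(records, key=lambda r: (r["period"], r["category"]))
print(overall)
print(within)
```

Here the overall rates separate the categories, but within each period the categories are indistinguishable, so the apparent risk split comes entirely from when the loans were made.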
  • 19. Lots of Missing Data That Have Different Meanings
      • Context relevant
      • Combination of ML expertise and business acumen to handle appropriately
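As a sketch of what handling missing values by context might look like, the example below keeps the reason a value is missing as its own feature instead of collapsing every gap into one imputed number. The field (months since last delinquency), the `bureau_hit` flag, and the reason labels are hypothetical and not taken from the talk.

```python
def encode_months_since_delinquency(value, bureau_hit):
    """Return (numeric value, missing-reason label) rather than one imputed number."""
    if value is not None:
        return float(value), "observed"
    if not bureau_hit:
        # No bureau file was found at all: nothing is known about this applicant.
        return 0.0, "no_file"
    # A file exists but records no delinquency: the event never happened.
    return 0.0, "never_delinquent"


rows = [
    {"months_since_dq": 7, "bureau_hit": True},
    {"months_since_dq": None, "bureau_hit": True},
    {"months_since_dq": None, "bureau_hit": False},
]
encoded = [encode_months_since_delinquency(r["months_since_dq"], r["bureau_hit"])
           for r in rows]
print(encoded)
```

The two missing rows get the same placeholder number but different labels, so a downstream model can treat "clean file" and "no file" as the distinct situations they are.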
  • 20. Hire to Solve Your Business Problem
      We Need Data Artists To Save Zombie Borrowers: The hottest job right now is ‘data scientist’, but that label is wrong. It’s not just about science, it’s also about art. The hottest job should be data artist – those who understand the quirks of data and think through and enjoy the nature of what you’re talking about… People think you get a whole bunch of bits and drop them into a stats box and get all the answers. Turns out every answer you get is incorrect.
      • “You can find a great developer and a great researcher who has a background in statistics, and maybe you can find a great problem solver, but to find that in the same person is hard.”
          • Stan Humphries, Zillow
      • “Practice. I strongly believe that being a data scientist is an interesting merger of science and craftsmanship. You need to understand the theory but at the same time you also need to exercise your gut feeling.”
          • Michael Berthold, KNIME
  • 21. Hiring for Diversity
      • Naturally an interdisciplinary field that is rapidly evolving
      • “Data Scientist” means a lot of different things
          • “Full stack analyst” vs. “data engineer” vs. …
          • Are coding skills important? Are big data skills important?
      • Draw from range of backgrounds:
          • Psychology
          • Biostatistics
          • Physics
          • Applied Math
          • Computer Science
          • Statistics
          • Engineering
          • MBA
  • 22. Q & A
      • Thanks!