Valid Statistical Analysis at John Deere and Use of the R Programming Language
Derek Hoffman
Nov-8-2012
A bit about your speaker…

 • BS in Statistics and Material Science @ Winona State University
 • Masters in Statistics @ Iowa State University
 • 5 Years @ John Deere
Forecasting Group in 2012




 •   Improvements due to the science of forecasting
 •   Explosion in value and statistician hiring
 •   Increase in problem solving flexibility due to use of R
 •   Large company savings from dropping a failed forecasting software package
• Revenue of roughly $35 billion, with 8.7% profit
• Has been a Fortune 500 company for the last 56 years, ranked roughly 94th.
• Employs about 50,000 people worldwide, roughly 5,000 of them at the Moline headquarters.
Deere & Company – 3 parts

 • Agriculture ~70%
 • Turf ~15%
 • Construction ~15%
Why does Deere hire forecasters?

 • Availability needs to match demand OR you
   lose market share
 • Inventory needs to stay low OR you pay lots
   in taxes and storage costs
 • New factories need to be built at the right
   size and time OR you make a multimillion-dollar
   mistake.
 • Work force needs to be hired/cut depending
   on production plans OR you lose a great deal
   on training and severance.
My group’s reach at John Deere

 Forecasts (at the center) reach:
 • CEO, Presidents, Financials
 • Flexibility of Inventory Next Month
 • New Markets, 10 Years Out
 • Factory Shifts and Production
Why do statisticians love R?

 • Common statistical methods are available as
   packages (advantage over C++)
 • Large support group of users worldwide
 • Credibility due to submission standards and
   university usage.
 • Often the program of choice during education
 • Easy to send results to another person (even
   if just text files for data and code)
Why does Deere love R?

• The cost is right
• Open source – no black-box mysteries, no
  proprietary lockdowns
• Easy to share across the business
• Relatively easy to learn
• Often works better or faster than Microsoft
  products for data and analysis
• Infinitely customizable to your problem and
  your products – vertical integration
Case Studies at John Deere

•   Short Term Demand Forecasting
•   Crop Forecasting
•   Long Term Demand Forecasting
•   Parts Decision Tree (APO)
•   Order Line Up
•   Data Coordinator
Short Term Demand Forecasting

 • Inputs: Marketing Forecast, Factory Forecast, Estimate Group Forecast
 • Output: Composite Forecast

 Potential Good:
 • Multiple view points
 • Buy-in from all players
 • Disciplined in forecast creation

 Potential Bad:
 • Group-think
 • Pressures other than accuracy
 • Poor information digestion
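The slide does not say how the marketing, factory, and estimate group forecasts are merged into the composite. As a minimal sketch, assuming a simple (optionally weighted) average, the combination could look like the Python below. The function name, the weights, and the numbers are all hypothetical, and the talk's own work was done in R.

```python
from statistics import mean


def composite_forecast(forecasts, weights=None):
    """Combine several forecasts of the same quantity into one number.

    forecasts: dict mapping a source name to its forecast.
    weights:   optional dict of non-negative weights; equal weights if omitted.
    Assumption: a weighted average is used; the slide does not specify this.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    if weights is None:
        return mean(forecasts.values())
    total = sum(weights[name] for name in forecasts)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(forecasts[name] * weights[name] for name in forecasts) / total


# Hypothetical unit forecasts for one product and month (illustrative only).
example = {"marketing": 1200.0, "factory": 1000.0, "estimate_group": 1100.0}
print(composite_forecast(example))
print(composite_forecast(example, {"marketing": 1, "factory": 1, "estimate_group": 2}))
```

Weights would normally come from each source's track record, which is one way to counter the "pressures other than accuracy" risk listed on the slide.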
Bad Forecasting Philosophies

 •   Executive Override
     – Inputs: News, Experience
     – Process: Experience + Feelings on that Day + Outside pressures
     – Output: “Forecasts” and directives and goals
 •   Gut Feel / Art
     – Inputs: News, Experience, Last YR’s #’s
     – Process: Math Comparisons, Financial Forecasting, Experience, Outside forecasts
     – Output: Forecasts
 •   Blackbox Forecasts
     – Inputs: History
     – Process: ?
     – Output: Forecasts (NO estimates of accuracy, NO interpretation)
Forecasting Philosophies

 •   Statistical Models
     – Inputs: Historical Data (known because it is in the past or current)
     – Process: Data + Math/Statistics as calculated by a trained statistician
     – Output: Forecasts and MEANINGFUL plus/minus intervals (flexibility and bad forecast detection)
 •   Assumption Models
     – Inputs: Assumptions (user generated assumptions about the future)
     – Process: Data + Math/Statistics as calculated by a trained statistician
     – Output: Forecasts and Analysis of Forecast Error Contributions by Assumptions
 •   Economic Models
     – Inputs: Data, Assumptions, News, ???, Outside Forecasts
     – Process: Data + Economics + ??? as created by a trained economist
     – Output: Forecasts, Outside Forecasts, Current Economic News
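To make "meaningful plus/minus intervals" and "bad forecast detection" concrete, here is a rough Python sketch. It assumes a stable series with roughly normal errors and uses the historical mean as the point forecast. These are illustration choices, not a description of the models used at Deere, and the band ignores uncertainty in the estimated mean.

```python
from statistics import mean, stdev


def forecast_with_interval(history, z=1.96):
    """Naive mean forecast with a plus/minus band from historical spread.

    Assumes a roughly stable series and roughly normal errors (an
    illustration choice, not a claim about any production model).
    """
    if len(history) < 2:
        raise ValueError("need at least two observations")
    point = mean(history)
    half_width = z * stdev(history)
    return point, point - half_width, point + half_width


def looks_like_bad_forecast(actual, lower, upper):
    """Flag an actual value that lands outside the interval."""
    return not (lower <= actual <= upper)


# Hypothetical monthly demand history (illustrative numbers only).
history = [100, 104, 98, 102, 96]
point, lower, upper = forecast_with_interval(history)
print(point, lower, upper)
print(looks_like_bad_forecast(120, lower, upper))
```

The width of the band is what gives planners flexibility: it says how much inventory or capacity slack is consistent with the data, and an actual outside the band is a signal to revisit the model.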
Use of Data-Driven Analysis

 Analysis done in my group using R and company data.
Crop Yields Forecasting
Relative Land Area and Use (figure: circle = total land)
Acres in Major World Crops (figure: circle = total crop land)
Crop Yields Forecasting

 Forecast sequence: History → 1 Year OUT → 2nd Year OUT → 3rd Year OUT

 The whole time, calculating the valid forecast error and influences.

 A large computational task, heavily using programs written in R.
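One common way to get an honest forecast error for 1, 2, and 3 years out is rolling-origin evaluation: refit on the history up to each cutoff, forecast ahead, and compare with what actually happened. The Python sketch below illustrates the idea under that assumption. The slides do not describe the actual procedure, and the last-value forecaster is only a placeholder model.

```python
def rolling_origin_errors(series, forecaster, max_horizon=3, min_train=3):
    """Mean absolute error by horizon from repeated out-of-sample forecasts.

    forecaster(train, h) must return a list of h forecasts made from
    the training data only, so no future values leak into the forecast.
    """
    abs_errors = {h: [] for h in range(1, max_horizon + 1)}
    for origin in range(min_train, len(series)):
        train = series[:origin]
        steps = min(max_horizon, len(series) - origin)
        predictions = forecaster(train, steps)
        for h in range(1, steps + 1):
            actual = series[origin + h - 1]
            abs_errors[h].append(abs(actual - predictions[h - 1]))
    return {h: sum(e) / len(e) for h, e in abs_errors.items() if e}


def last_value_forecaster(train, h):
    """Placeholder model: repeat the last observed value h times."""
    return [train[-1]] * h


# Hypothetical yearly yields (illustrative numbers only).
yields = [10, 12, 14, 16, 18, 20]
print(rolling_origin_errors(yields, last_value_forecaster))
```

Because every cutoff requires a refit, the amount of computation grows quickly with the number of series and horizons, which fits the slide's remark that this is a large computational task.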
Changes in Crop Splits
Corn Yields
The Wrong way – Growth f(t)

 • The real problem is that we are looking at a
   correlation with time, not a causation. We
   will also always be extrapolating, because any
   future value of time lies outside our
   historical data set.
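A small Python sketch of the point about extrapolation: a least-squares trend fitted against time can be evaluated at a future year, but that year is by definition outside the range the model was fitted on. The data and function names are hypothetical.

```python
def fit_time_trend(times, values):
    """Ordinary least squares fit of values = intercept + slope * time."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired observations")
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time values must not all be equal")
    slope = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values)) / sxx
    intercept = y_bar - slope * t_bar
    return intercept, slope


def is_extrapolation(times, new_time):
    """True when new_time lies outside the range of the fitted data."""
    return new_time < min(times) or new_time > max(times)


# Hypothetical sales by year (illustrative numbers only).
years = [2008, 2009, 2010, 2011, 2012]
sales = [100, 110, 120, 130, 140]
intercept, slope = fit_time_trend(years, sales)
print(intercept + slope * 2013)        # trend "forecast" for next year
print(is_extrapolation(years, 2013))   # every future year is outside the data
```

A model built on causal drivers can at least be asked about driver values it has seen before, which a pure time trend never can.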
What are Likely Causes?

 •   Crop Yields
 •   Planted Acres
 •   Crop Prices
 •   Population
 •   Gross Domestic Product
 •   Farm Size
 •   Government
 •   Mechanization Level of Farming
 •   Crop Choices (Corn damages combines faster than
     wheat.)
Example of Calculations




    The whole time, calculating the valid forecast error and influences.

    A large computational task, heavily using programs written in R.
Parts Forecasting

 • Tons of parts; need direction on how best to
   forecast them with SAP.
Parts Forecasting – Trilingual?
Order Scheduling

 Constraint on Feature A: at most 2 per 4 in a row.

 We’re OK!
Order Scheduling

 Constraint on Feature B: at most 1 per 3 in a row.

 We’re OK!
Order Scheduling

 Constraint on Feature A: at most 1 per 3 in a row.

 We’ve got a problem!

 Have to move Matt or Shawn’s tractor to another spot and recheck it all!
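The "at most N per M in a row" rule on these slides can be checked mechanically with a sliding window over the build order. The Python sketch below is an assumption about how such a check might look, not the program from the talk. The lineup and feature names are hypothetical.

```python
def window_violations(lineup, feature, max_count, window):
    """Start positions of windows that break 'at most max_count per window in a row'.

    lineup: list of sets, one set of feature names per order, in build order.
    A lineup shorter than the window is checked as a single window.
    """
    violations = []
    for start in range(max(1, len(lineup) - window + 1)):
        count = sum(1 for order in lineup[start:start + window] if feature in order)
        if count > max_count:
            violations.append(start)
    return violations


# Hypothetical lineup of six orders (illustrative only).
lineup = [{"A"}, {"B"}, {"A"}, set(), {"B"}, {"A", "B"}]
print(window_violations(lineup, "A", max_count=2, window=4))  # Feature A rule
print(window_violations(lineup, "B", max_count=1, window=3))  # Feature B rule
```

Moving one order to fix a violation can create a new one elsewhere, so every rule has to be rechecked after each move. That recheck loop is what makes the manual process slow and a program attractive.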
Harvester Lineup – Random Guess
Harvester Lineup – Program Results
Order Scheduling – Time
Order Scheduling = $$$

 •   Old Process
     – Done manually by hand
     – Weekly
     – Duration: 8 hours
     – Not necessarily perfect
 •   Derek’s Process
     – Automates the process
     – Duration: 1.5-2 hours
     – Human time: 15 mins
     – Saves about 8 hours per week
     – Saves ~$12K per year, per product implementation
Data Coordinator Uses

 • Multiple data sources and data types: DB2, SQL, DB2, Oracle
 • Multiple ODBC connections
 • Single R source code
 • Scheduled tasks, batch file execution
 • Export channels: DB2
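The pattern on this slide, one script pulling from several databases and pushing results to an export channel, can be sketched as follows. This is an illustration in Python, with the standard-library sqlite3 module standing in for the ODBC connections. The talk's implementation was a single R source file that is not shown, and the table, column, and source names here are hypothetical.

```python
import csv
import io
import sqlite3


def make_source(rows):
    """Create an in-memory SQLite database standing in for one upstream system."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (product TEXT, units INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn


def collect(connections, query):
    """Run the same query against every source and tag rows with the source name."""
    combined = []
    for name, conn in connections.items():
        for product, units in conn.execute(query):
            combined.append({"source": name, "product": product, "units": units})
    return combined


def export_csv(records):
    """Write the combined records to CSV text (one possible export channel)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["source", "product", "units"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Two hypothetical upstream systems (illustrative data only).
sources = {
    "system_a": make_source([("tractor", 5), ("combine", 2)]),
    "system_b": make_source([("tractor", 3)]),
}
records = collect(sources, "SELECT product, units FROM orders ORDER BY product")
print(export_csv(records))
```

Keeping all connection and export logic in one script is what lets a scheduler or batch file run the whole flow unattended.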
A forecast of “Analytics”

 • A short history of “cool topics”

 • The future of forecasters

 • The coming data flood and analytics boom

     increase in scalpels ≠ increase in surgeons
The cool word of the year – Dot-com
The cool word of the year - Radiation
The cool word of the year – Big Data


                       How can we grow responsibly as data
                       scientists and statisticians?
Signs you are in the hype

 •   Everyone claims it will change the world
 •   It’s taught in business schools
 •   Featured on the covers of general-interest magazines
 •   TONS of snake-oil salesmen
 •   Legitimate ease in access to the new thing
Cautionary tale:

 • Thousands spent on a weather “forecast”
 • Ridiculous accuracy measures
 • Business users don’t know the shortfalls until it’s too late
Growing Need of Forecasting Professionals


 • A need for educated gatekeepers to separate
   bad analysis from good.
 • More people are needed to practice
   forecasting as a profession – or the whole
   industry will suffer.
 • More data, more ease, more computing
   needed, with greater need for responsible
   use.
Statistics and R at John Deere

 • John Deere is among the best of the large
   manufacturers at applying good
   forecasting methods to demand planning
 • There are still huge areas to grow: nowhere
   near the data usage of companies like
   Amazon or Wal-Mart
 • The challenge is to increase usage and
   access while maintaining a good internal and
   external reputation

Order Fulfillment Forecasting at John Deere: How R Facilitates Creativity and Flexibility

  • 1. Valid Statistical Analysis at John Deere and Use of the R Programming Language Derek Hoffman Nov-8-2012
  • 2. A bit about your speaker… • BS in Statistics and Material Science @ Winona State University • Masters in Statistics @ Iowa State University • 5 Years @ John Deere
  • 3. Forecasting Group in 2012 • Improvements due to the science of forecasting • Explosion in value and statistician hiring • Increase in problem solving flexibility due to use of R • Huge company saving with dropping flop forecasting software
  • 4. • Revenue of roughly 35 billion, 8.7% profit • Has been a Fortune 500 company for the last 56 years, roughly 94th in rank. • Employs about 50,000 people world wide – roughly 5,000 of them in the Moline headquarters.
  • 5. Deere & Company – 3 parts • Agriculture ~70% • Turf~15% • Construction ~15%
  • 6. Why does Deere hire forecasters? • Availability needs to match demand OR you lose market share • Inventory needs to stay low OR you pay lots in taxes and storage costs • New factories need to be built at the right size and time OR you made a multi million dollar mistake. • Work force needs to be hired/cut depending on production plans OR you lose tons training and severance.
  • 7. My group’s reach at John Deere: Forecasts (center of the diagram) feed • CEO, Presidents, Financials • Flexibility of Inventory Next Month • New Markets, 10 Years Out • Factory Shifts and Production
  • 9. Why do statisticians love R? • Common statistical methods are available as packages (advantage over C++) • Large support group of users worldwide • Credibility due to submission standards and university usage. • Often the program of choice during education • Easy to send results to another person (even if just text files for data and code)
  • 10. Why does Deere love R? • The cost is right • Open source – no black box mysteries, no proprietary lockdowns • Easy to share across the business • Relatively easy to learn • Often works better or faster than Microsoft products for data and analysis • Infinitely customizable to your problem and your products – vertical integration
  • 11. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 12. Short Term Demand Forecasting • Marketing Forecast, Factory Forecast and Estimate Group Forecast combine into a Composite Forecast • Potential Good: Multiple viewpoints; Buy-in from all players; Disciplined in forecast creation • Potential Bad: Group-think; Pressures other than accuracy; Poor information digestion
  • 13. Bad Forecasting Philosophies • Executive Override: News, Experience; Experience + Feelings on that Day + Outside pressures; “Forecasts” and directives and goals • Gut Feel / Art: News, Experience, Last YR’s #’s; Math Comparisons, Financial Forecasting, Experience, Outside forecasts; Forecasts • Blackbox Forecasts: History; ?; Forecasts (NO estimates of accuracy, NO interpretation)
  • 14. Forecasting Philosophies • Statistical Models: Historical Data (known because it is in the past or current); Data + Math/Statistics as calculated by a trained statistician; Forecasts and MEANINGFUL plus/minus intervals (flexibility and bad forecast detection) • Assumption Models: Assumptions (user generated assumptions about the future); Data + Math/Statistics as calculated by a trained statistician; Forecasts and Analysis of Forecast Error Contributions by Assumptions • Economic Models: Data, Assumptions, News, ???, Outside Forecasts; Data + Economics + ??? as created by a trained economist; Forecasts, Outside Forecasts, Current Economic News
  • 15. Use of Data-Driven Analysis Analysis done in my group using R and company data.
  • 16. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 18. Relative Land Area and Use Circle = Total Land
  • 19. Acres in Major World Crops Circle = Total Crop Land
  • 21. Crop Yields Forecasting (chart panels: History, 1 Year OUT, 2nd Year OUT, 3rd Year OUT) The whole time, calculating the valid forecast error and influences. A large computational task, heavily using programs written in R.
  • 22. Changes in Crop Splits
  • 24. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 25. The Wrong way – Growth f(t) • The problem really is that we are looking at a correlation with time, not a causation. Also, we will always be extrapolating (because the future value of time is outside our historical data set).
  • 26. What are Likely Causes? • Crop Yields • Planted Acres • Crop Prices • Population • Gross Domestic Product • Farm Size • Government • Mechanization Level of Farming • Crop Choices (Corn damages combines faster than wheat.)
  • 27. Example of Calculations The whole time, calculating the valid forecast error and influences. A large computational task, heavily using programs written in R.
  • 28. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 29. Parts Forecasting • Tons of parts; need direction on how best to forecast with SAP.
  • 30. Parts Forecasting – Trilingual?
  • 31. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 33. Order Scheduling Restraint on Feature A: At most 2 per 4 in a row. We’re OK!
  • 35. Order Scheduling Restraint on Feature B: At most 1 per 3 in a row. We’re OK!
  • 36. Order Scheduling Restraint on Feature A: At most 1 per 3 in a row. We’ve got a problem! Have to move Matt’s or Shawn’s tractor to another spot and recheck it all!
  • 37. Harvester Lineup – Random Guess
  • 38. Harvester Lineup – Program Results
  • 40. Order Scheduling = $$$ • Old Process – Done manually by hand – Weekly – Duration: 8 Hours – Not necessarily perfect • Derek’s Process – Automates the process – Duration: 1.5-2 hours – Human time: 15 mins – Saves about 8 hours per week – Saves ~$12K per year, per product implementation
  • 41. Case Studies at John Deere • Short Term Demand Forecasting • Crop Forecasting • Long Term Demand Forecasting • Parts Decision Tree (APO) • Order Line Up • Data Coordinator
  • 42. Data Coordinator (diagram labels) • Uses Scheduled Tasks and Batch File execution • Multiple Data sources and Data types • Multiple ODBC Connections • Single source R Code • Export Channels • Databases: DB2, SQL, Oracle
  • 43. A forecast of “Analytics” • A short history of “cool topics” • The future of forecasters • The coming data flood and analytics boom: increase in scalpels ≠ increase in surgeons
  • 44. The cool word of the year – Dot-com
  • 45. The cool word of the year - Radiation
  • 46. The cool word of the year – Big Data How can we grow responsibly as data scientists and statisticians?
  • 47. Signs you are in the hype • Everyone claims it will change the world • It’s taught in business schools • Features on covers of general magazines • TONS of snake-oil salesmen • Legitimate ease in access to the new thing
  • 48. Cautionary tale: • Thousands spent on a weather “forecast” • Ridiculous accuracy measures • Business users don’t know the shortfalls until it’s too late
  • 49. Growing Need of Forecasting Professionals • A need for educated gatekeepers to weed bad analysis from good. • More people are needed to practice forecasting as a profession – or the whole industry will suffer. • More data, more ease, more computing needed, with greater need for responsible use.
  • 50. Statistics and R at John Deere • John Deere is among the best of the large manufacturers at applying good forecasting methods to demand planning • There are still huge areas to grow – nowhere near the data usage of companies like Amazon or Wal-Mart • The challenge is to increase usage and access while maintaining a good internal and external reputation
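
The order scheduling restraints on slides 33 to 36 ("at most 2 per 4 in a row", "at most 1 per 3 in a row") amount to a sliding-window count over the lineup. The deck does not show the program behind the slide 38 results, so the following is only a minimal sketch of the check itself, written in Python rather than R. The function name `window_violations` and the example lineup are hypothetical, not taken from the talk.

```python
from typing import List, Sequence, Tuple


def window_violations(
    has_feature: Sequence[bool], max_count: int, window: int
) -> List[Tuple[int, int]]:
    """Return (start_index, count) for each run of `window` consecutive
    slots holding more than `max_count` orders with the feature.

    Assumption: a lineup shorter than `window` is checked as one window.
    """
    violations = []
    last_start = max(1, len(has_feature) - window + 1)
    for start in range(last_start):
        count = sum(has_feature[start:start + window])
        if count > max_count:
            violations.append((start, count))
    return violations


# Hypothetical lineup: True means the order in that slot has the feature.
lineup = [True, False, True, False, False, True, False, False]

# "At most 2 per 4 in a row": no window of 4 slots exceeds 2.
ok_2_per_4 = window_violations(lineup, max_count=2, window=4)

# "At most 1 per 3 in a row" is stricter; slots 0-2 hold two such orders.
bad_1_per_3 = window_violations(lineup, max_count=1, window=3)

print(ok_2_per_4)
print(bad_1_per_3)
```

A checker like this only detects problems. Repairing a lineup (moving an order to another slot and rechecking every restraint, as slide 36 describes) is a separate search step that this sketch does not attempt.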