SlideShare una empresa de Scribd logo
1 de 21
Euro area GDP forecasting using large survey datasets A random forest approach Olivier Biau – Angela D’Elia Directorate General for Economic and Financial Affairs European Commission Groupe Travail prévision Paris, 12 May 2010   Views expressed represent exclusively the positions of the authors and do not necessarily correspond to those of the European Commission
Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object]
Introduction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Outline ,[object Object],[object Object],[object Object],[object Object],[object Object]
DG ECFIN: mission statement ,[object Object],[object Object],[object Object],[object Object]
Data ,[object Object],[object Object],[object Object],[object Object],[object Object]
Data ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
From binary trees to Random Forest ,[object Object],[object Object],[object Object]
How to grow a tree? with CART ,[object Object],[object Object],[object Object],[object Object]
What is the tree predictor? ,[object Object],[object Object],[object Object]
RF algorithm ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
RF algorithm   ,[object Object],[object Object],[object Object],[object Object],[object Object]
Benchmark ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
Results (1) ,[object Object],[object Object],[object Object],[object Object]
Results (2) ,[object Object],[object Object]
Results (2) ,[object Object],[object Object]
Results (2) ,[object Object],[object Object],[object Object]
Conclusion ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Más contenido relacionado

La actualidad más candente

JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing News
JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing NewsJDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing News
JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing NewsNational Bank of Belgium
 
Nowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNational Bank of Belgium
 
Presentation of sm on distibution lag model
Presentation of sm on distibution lag modelPresentation of sm on distibution lag model
Presentation of sm on distibution lag modelsumit gyawali
 
Modelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large dataModelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large dataUniversity of Salerno
 
Professor Alejandro Diaz Bautista Input Output Conference March 2013.
Professor Alejandro Diaz Bautista Input Output Conference March 2013.Professor Alejandro Diaz Bautista Input Output Conference March 2013.
Professor Alejandro Diaz Bautista Input Output Conference March 2013.Economist
 
The CIA (consistency in aggregation) approach - A new economic approach to el...
The CIA (consistency in aggregation) approach - A new economic approach to el...The CIA (consistency in aggregation) approach - A new economic approach to el...
The CIA (consistency in aggregation) approach - A new economic approach to el...Istituto nazionale di statistica
 

La actualidad más candente (6)

JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing News
JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing NewsJDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing News
JDemetra+Nowcasting: Macroeconomic Monitoring and Visualizing News
 
Nowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflowNowcasting German GDP growth and the real time newsflow
Nowcasting German GDP growth and the real time newsflow
 
Presentation of sm on distibution lag model
Presentation of sm on distibution lag modelPresentation of sm on distibution lag model
Presentation of sm on distibution lag model
 
Modelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large dataModelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large data
 
Professor Alejandro Diaz Bautista Input Output Conference March 2013.
Professor Alejandro Diaz Bautista Input Output Conference March 2013.Professor Alejandro Diaz Bautista Input Output Conference March 2013.
Professor Alejandro Diaz Bautista Input Output Conference March 2013.
 
The CIA (consistency in aggregation) approach - A new economic approach to el...
The CIA (consistency in aggregation) approach - A new economic approach to el...The CIA (consistency in aggregation) approach - A new economic approach to el...
The CIA (consistency in aggregation) approach - A new economic approach to el...
 

Destacado

Forecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business SurveysForecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business SurveysCdiscount
 
Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses Cdiscount
 
Prévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMPrévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMCdiscount
 
Prévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnellesPrévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnellesCdiscount
 
Prediction in dynamic Graphs
Prediction in dynamic GraphsPrediction in dynamic Graphs
Prediction in dynamic GraphsCdiscount
 
Paris2012 session1
Paris2012 session1Paris2012 session1
Paris2012 session1Cdiscount
 
Paris2012 session3b
Paris2012 session3bParis2012 session3b
Paris2012 session3bCdiscount
 
State Space Model
State Space ModelState Space Model
State Space ModelCdiscount
 
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...Cdiscount
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4Cdiscount
 
Scm prix blé_2012_11_06
Scm prix blé_2012_11_06Scm prix blé_2012_11_06
Scm prix blé_2012_11_06Cdiscount
 
Paris2012 session2
Paris2012 session2Paris2012 session2
Paris2012 session2Cdiscount
 
Robust sequentiel learning
Robust sequentiel learningRobust sequentiel learning
Robust sequentiel learningCdiscount
 
Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06Cdiscount
 
Prévisions trafic aérien
Prévisions trafic aérienPrévisions trafic aérien
Prévisions trafic aérienCdiscount
 
Présentation G.Biau Random Forests
Présentation G.Biau Random ForestsPrésentation G.Biau Random Forests
Présentation G.Biau Random ForestsCdiscount
 

Destacado (17)

Forecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business SurveysForecasting GDP profile with an application to French Business Surveys
Forecasting GDP profile with an application to French Business Surveys
 
Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses Ranking binaire, agrégation multiclasses
Ranking binaire, agrégation multiclasses
 
Prévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAMPrévision de consommation électrique avec adaptive GAM
Prévision de consommation électrique avec adaptive GAM
 
Prévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnellesPrévision consommation électrique par processus à valeurs fonctionnelles
Prévision consommation électrique par processus à valeurs fonctionnelles
 
Prediction in dynamic Graphs
Prediction in dynamic GraphsPrediction in dynamic Graphs
Prediction in dynamic Graphs
 
Paris2012 session1
Paris2012 session1Paris2012 session1
Paris2012 session1
 
Paris2012 session3b
Paris2012 session3bParis2012 session3b
Paris2012 session3b
 
State Space Model
State Space ModelState Space Model
State Space Model
 
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
Prediction of Quantiles by Statistical Learning and Application to GDP Foreca...
 
Paris2012 session4
Paris2012 session4Paris2012 session4
Paris2012 session4
 
Scm prix blé_2012_11_06
Scm prix blé_2012_11_06Scm prix blé_2012_11_06
Scm prix blé_2012_11_06
 
Paris2012 session2
Paris2012 session2Paris2012 session2
Paris2012 session2
 
Scm risques
Scm risquesScm risques
Scm risques
 
Robust sequentiel learning
Robust sequentiel learningRobust sequentiel learning
Robust sequentiel learning
 
Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06Scm indicateurs prospectifs_2012_11_06
Scm indicateurs prospectifs_2012_11_06
 
Prévisions trafic aérien
Prévisions trafic aérienPrévisions trafic aérien
Prévisions trafic aérien
 
Présentation G.Biau Random Forests
Présentation G.Biau Random ForestsPrésentation G.Biau Random Forests
Présentation G.Biau Random Forests
 

Similar a Présentation Olivier Biau Random forests et conjoncture

Advanced Econometrics L1-2.pptx
Advanced Econometrics L1-2.pptxAdvanced Econometrics L1-2.pptx
Advanced Econometrics L1-2.pptxakashayosha
 
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...OECD Environment
 
Advanced Econometrics L3-4.pptx
Advanced Econometrics L3-4.pptxAdvanced Econometrics L3-4.pptx
Advanced Econometrics L3-4.pptxakashayosha
 
Predicting the economic public opinions in Europe
Predicting the economic public opinions in EuropePredicting the economic public opinions in Europe
Predicting the economic public opinions in EuropeSYRTO Project
 
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...ijaia
 
Sectoral allocation and macroeconomic imbalances in EMU
Sectoral allocation and macroeconomic imbalances in EMUSectoral allocation and macroeconomic imbalances in EMU
Sectoral allocation and macroeconomic imbalances in EMUADEMU_Project
 
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...Developing a Soft Linkage between a detailed dynamic input-output macroeconom...
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...IEA-ETSAP
 
Session III - Census and Registers - M. Scanu, G.Donariello, D. Frattarola, ...
Session III - Census and Registers -  M. Scanu, G.Donariello, D. Frattarola, ...Session III - Census and Registers -  M. Scanu, G.Donariello, D. Frattarola, ...
Session III - Census and Registers - M. Scanu, G.Donariello, D. Frattarola, ...Istituto nazionale di statistica
 
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTING
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTINGINTRODUCTION TO TIME SERIES REGRESSION AND FORCASTING
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTINGSPICEGODDESS
 
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...The Role of Production Factor Quality and Technology Diffusion in 20th Centur...
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...Structuralpolicyanalysis
 
Wavelet Multi-resolution Analysis of High Frequency FX Rates
Wavelet Multi-resolution Analysis of High Frequency FX RatesWavelet Multi-resolution Analysis of High Frequency FX Rates
Wavelet Multi-resolution Analysis of High Frequency FX RatesaiQUANT
 
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...Discussion of “Anatomy of sovereign distress: The role of financial sector fr...
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...Latvijas Banka
 
Polynomial regression model of making cost prediction in mixed cost analysis
Polynomial regression model of making cost prediction in mixed cost analysisPolynomial regression model of making cost prediction in mixed cost analysis
Polynomial regression model of making cost prediction in mixed cost analysisAlexander Decker
 
11.polynomial regression model of making cost prediction in mixed cost analysis
11.polynomial regression model of making cost prediction in mixed cost analysis11.polynomial regression model of making cost prediction in mixed cost analysis
11.polynomial regression model of making cost prediction in mixed cost analysisAlexander Decker
 
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...ijmpict
 
论文 yanning zang
论文 yanning zang论文 yanning zang
论文 yanning zangYanning Zang
 
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...Soledad Zignago
 

Similar a Présentation Olivier Biau Random forests et conjoncture (20)

Advanced Econometrics L1-2.pptx
Advanced Econometrics L1-2.pptxAdvanced Econometrics L1-2.pptx
Advanced Econometrics L1-2.pptx
 
Labour Productivity Dynamics Regularities Analyses by Manufacturing in Europe...
Labour Productivity Dynamics Regularities Analyses by Manufacturing in Europe...Labour Productivity Dynamics Regularities Analyses by Manufacturing in Europe...
Labour Productivity Dynamics Regularities Analyses by Manufacturing in Europe...
 
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...
Paul Hiebert, ECB - OECD Workshop on “Climate change, Assumptions, Uncertaint...
 
Advanced Econometrics L3-4.pptx
Advanced Econometrics L3-4.pptxAdvanced Econometrics L3-4.pptx
Advanced Econometrics L3-4.pptx
 
Predicting the economic public opinions in Europe
Predicting the economic public opinions in EuropePredicting the economic public opinions in Europe
Predicting the economic public opinions in Europe
 
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...
ARTIFICIAL INTELLIGENCE TO OPTIMIZE COUNTRIES’ MACROECONOMIC AND ENVIRONMENTA...
 
Sectoral allocation and macroeconomic imbalances in EMU
Sectoral allocation and macroeconomic imbalances in EMUSectoral allocation and macroeconomic imbalances in EMU
Sectoral allocation and macroeconomic imbalances in EMU
 
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...Developing a Soft Linkage between a detailed dynamic input-output macroeconom...
Developing a Soft Linkage between a detailed dynamic input-output macroeconom...
 
Session III - Census and Registers - M. Scanu, G.Donariello, D. Frattarola, ...
Session III - Census and Registers -  M. Scanu, G.Donariello, D. Frattarola, ...Session III - Census and Registers -  M. Scanu, G.Donariello, D. Frattarola, ...
Session III - Census and Registers - M. Scanu, G.Donariello, D. Frattarola, ...
 
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTING
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTINGINTRODUCTION TO TIME SERIES REGRESSION AND FORCASTING
INTRODUCTION TO TIME SERIES REGRESSION AND FORCASTING
 
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...The Role of Production Factor Quality and Technology Diffusion in 20th Centur...
The Role of Production Factor Quality and Technology Diffusion in 20th Centur...
 
Comparing the Economic Structure and Carbon Dioxide Emission between China an...
Comparing the Economic Structure and Carbon Dioxide Emission between China an...Comparing the Economic Structure and Carbon Dioxide Emission between China an...
Comparing the Economic Structure and Carbon Dioxide Emission between China an...
 
Wavelet Multi-resolution Analysis of High Frequency FX Rates
Wavelet Multi-resolution Analysis of High Frequency FX RatesWavelet Multi-resolution Analysis of High Frequency FX Rates
Wavelet Multi-resolution Analysis of High Frequency FX Rates
 
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...Discussion of “Anatomy of sovereign distress: The role of financial sector fr...
Discussion of “Anatomy of sovereign distress: The role of financial sector fr...
 
Polynomial regression model of making cost prediction in mixed cost analysis
Polynomial regression model of making cost prediction in mixed cost analysisPolynomial regression model of making cost prediction in mixed cost analysis
Polynomial regression model of making cost prediction in mixed cost analysis
 
11.polynomial regression model of making cost prediction in mixed cost analysis
11.polynomial regression model of making cost prediction in mixed cost analysis11.polynomial regression model of making cost prediction in mixed cost analysis
11.polynomial regression model of making cost prediction in mixed cost analysis
 
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...
AN IMPROVED DECISION SUPPORT SYSTEM BASED ON THE BDM (BIT DECISION MAKING) ME...
 
Artur-eea-presentation
Artur-eea-presentationArtur-eea-presentation
Artur-eea-presentation
 
论文 yanning zang
论文 yanning zang论文 yanning zang
论文 yanning zang
 
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...
Productivity and GDP per capita growth: A long-term perspective, Bergeaud, Ce...
 

Más de Cdiscount

Presentation r markdown
Presentation r markdown Presentation r markdown
Presentation r markdown Cdiscount
 
R2DOCX : R + WORD
R2DOCX : R + WORDR2DOCX : R + WORD
R2DOCX : R + WORDCdiscount
 
Fltau r interface
Fltau r interfaceFltau r interface
Fltau r interfaceCdiscount
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2Cdiscount
 
Introduction à la cartographie avec R
Introduction à la cartographie avec RIntroduction à la cartographie avec R
Introduction à la cartographie avec RCdiscount
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Cdiscount
 
Premier pas de web scrapping avec R
Premier pas de  web scrapping avec RPremier pas de  web scrapping avec R
Premier pas de web scrapping avec RCdiscount
 
Incorporer du C dans R, créer son package
Incorporer du C dans R, créer son packageIncorporer du C dans R, créer son package
Incorporer du C dans R, créer son packageCdiscount
 
Comptabilité Nationale avec R
Comptabilité Nationale avec RComptabilité Nationale avec R
Comptabilité Nationale avec RCdiscount
 
Cartographie avec igraph sous R (Partie 2)
Cartographie avec igraph sous R (Partie 2)Cartographie avec igraph sous R (Partie 2)
Cartographie avec igraph sous R (Partie 2)Cdiscount
 
Cartographie avec igraph sous R (Partie 1)
Cartographie avec igraph sous R (Partie 1) Cartographie avec igraph sous R (Partie 1)
Cartographie avec igraph sous R (Partie 1) Cdiscount
 
RStudio is good for you
RStudio is good for youRStudio is good for you
RStudio is good for youCdiscount
 
R fait du la tex
R fait du la texR fait du la tex
R fait du la texCdiscount
 
Première approche de cartographie sous R
Première approche de cartographie sous RPremière approche de cartographie sous R
Première approche de cartographie sous RCdiscount
 

Más de Cdiscount (17)

R Devtools
R DevtoolsR Devtools
R Devtools
 
Presentation r markdown
Presentation r markdown Presentation r markdown
Presentation r markdown
 
R2DOCX : R + WORD
R2DOCX : R + WORDR2DOCX : R + WORD
R2DOCX : R + WORD
 
Gur1009
Gur1009Gur1009
Gur1009
 
Fltau r interface
Fltau r interfaceFltau r interface
Fltau r interface
 
Dataiku r users group v2
Dataiku   r users group v2Dataiku   r users group v2
Dataiku r users group v2
 
Introduction à la cartographie avec R
Introduction à la cartographie avec RIntroduction à la cartographie avec R
Introduction à la cartographie avec R
 
HADOOP + R
HADOOP + RHADOOP + R
HADOOP + R
 
Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)Parallel R in snow (english after 2nd slide)
Parallel R in snow (english after 2nd slide)
 
Premier pas de web scrapping avec R
Premier pas de  web scrapping avec RPremier pas de  web scrapping avec R
Premier pas de web scrapping avec R
 
Incorporer du C dans R, créer son package
Incorporer du C dans R, créer son packageIncorporer du C dans R, créer son package
Incorporer du C dans R, créer son package
 
Comptabilité Nationale avec R
Comptabilité Nationale avec RComptabilité Nationale avec R
Comptabilité Nationale avec R
 
Cartographie avec igraph sous R (Partie 2)
Cartographie avec igraph sous R (Partie 2)Cartographie avec igraph sous R (Partie 2)
Cartographie avec igraph sous R (Partie 2)
 
Cartographie avec igraph sous R (Partie 1)
Cartographie avec igraph sous R (Partie 1) Cartographie avec igraph sous R (Partie 1)
Cartographie avec igraph sous R (Partie 1)
 
RStudio is good for you
RStudio is good for youRStudio is good for you
RStudio is good for you
 
R fait du la tex
R fait du la texR fait du la tex
R fait du la tex
 
Première approche de cartographie sous R
Première approche de cartographie sous RPremière approche de cartographie sous R
Première approche de cartographie sous R
 

Último

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

Présentation Olivier Biau Random forests et conjoncture

  • 1. Euro area GDP forecasting using large survey datasets A random forest approach Olivier Biau – Angela D’Elia Directorate General for Economic and Financial Affairs European Commission Groupe Travail prévision Paris, 12 May 2010 Views expressed represent exclusively the positions of the authors and do not necessarily correspond to those of the European Commission
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.  
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.

Notas del editor

  1. In recent years there has been increasing interest in forecasting methods that utilise large data sets . Indeed, there is a huge quantity of information available in the economic arena which might be useful for forecasting, but standard econometric techniques are not well suited to extract this in a useful form. This is not only an issue of academic interest . Central bankers and policy makers are interested in summarising large data sets for forecasting purposes (Eklund and Kapetanios gave a wide review of the recent literature on this issue) Since the reference paper of Stock and Watson (2002), factors methods have been at the fore front of developments in forecasting with large data set. The methodology to extract a few number of common factors has become more and more sophisticated from principal components to dynamic factor models, for example in Forni et al. Or to deal with unbalanced data sets at the end of the sample (jagged edge feature of the data), like in the recent contribution from Giannone on real time nowcast Finally, factor analysis combined with linear models (bridge models) has been among the main tool to summarise large data set for forecasting purpose.
  2. In our study, we would like to present a new statistical approach to forecasting macro economic aggregates based on the RF techniques , proposed by Breiman in the 2000’s. This technique is widely used in biostatistics, becomes more and more popular and appears to be very powerful in a lot of different applications (classification problems and also regression problems). But, it is largely unknown in economics . To our knowledge, the only application in the economic field is the paper I wrote with my co-authosr (gérard Biau ans Laurent Rouvière) when I was working for the French INSEE. In this paper, we use the micro data from the French Industry Business Tendency Survey to track the manufacturing output.
  3. If RF are so widely used in the medical research, it is because it enjoys good prediction properties, is robust to noise and can handle a very large number of input variables (which is very often the case in medicine, as you can collect a lot of variable on your patients but very often the number of observation is limited. RF is considered to be one of the most accurate general-purpose learning techniques available, independent of any functional and distributional assumptions RF is considered to be very powerful… but it is not clearly elucidated from a mathematical point of view . Although the mechanism appears simple, it involves many different driving forces which make it difficult to analyse. In fact, its mathematical properties remain to date largely unknown and, up to now, most theoretical studies have concentrated on isolated parts or stylized versions of the algorithm. Lin and Jeon: connection with the adaptative nearest neighbour The most recent paper from Gérard Biau is a step frowards as it shows consistency theorem of the RF algorithm.
  4. After this introduction, here the outline of the presentation: I will first present the dataset In order to explain the algorithm I will take one basic example Then, I will present the benchmarks, I mean, the competitors of the Random forest outputs Finally, we will see our results and the perspectives of this work
  5. After this introduction, here the outline of the presentation: I will first present the dataset In order to explain the algorithm I will take one basic example Then, I will present the benchmarks, I mean, the competitors of the Random forest outputs Finally, we will see our results and the perspectives of this work
  6. Well, the data set used in this paper is based on the J H EU Programme of BCS . It covers 5 sectors: A key aspect of the business surveys is that most questions ask for qualitative responses, reflecting the sentiment or confidence (optimistic, pessimistic or neutral) of managers and consumers The Programme covers all the 27 Members States, Croatia, the FYROM and Turkey More than 125 000 firms and over 40 000 consumers are surveyed every month
  7. To be more precise, The dataset mainly consists of the euro area balances of opinion (%positive - % negative) The time series used in the analysis are those available at the end of the third month of each quarter: the level series: monthly St or quarterly Sq , the difference series ( St - St-1 , St - St-2 , St - St-3 for monthly questions, Sq - Sq-1 for quarterly questions) The dataset is composed of p = 172 ‘soft’ series: X i The only ‘hard’ variable is the euro area GDP qoq growth series: Y i Finally, we have –what we called in this literature- a « learning set » L = { ( X 1 , Y 1 ) … ( X n , Y n ) } n=57 (from 3 rd Q 1995 to 3 rd Q 2009)
  8. I will now speak about the RF. But, before dealing with the Forest, I have to explain what is a tree and, above all, how it is grown . What is a tree? It is a partition of space into regions. The most important is the binary tree, since they have just 2 children per node Here we have an example: We first split the space into two regions. One or both of these regions are split into two more regions, and this process is continued, until some stopping rule is applied How to grow a tree (the response is for example with CART) and what is the tree predictor (or tree regressor) . In my next slide, I will take a basic example…
  9. Imagine I want to predict the income of someone who enters this room using the information I have collected about your self: So I know about you a lot of characteristics Xi such as your gender, size, age, your position –I mean do you have responsibility or not, are you head of unit, etc…), …and of course I have collected your income Y. We have to find the first node of the tree, that is the first splitting variable X j and the first split point s which discriminate the most (by solving this expression if we want to model the response forexample by a constant in each region) If we take the variable ‘size’, N1 represents the small ones (smaller than I) and N2 the tall ones. C1 is the average income for the small ones and C2 the average income of the all one. Do you think, this would be the best partition in term of minimum sum of square ? Undoubtedly the answer would be NO. The first node will be the position (the chiefs / the others…) or the age. So having found the first split by scanning all the covariates, CART repeats the process with other variables until for example the terminal nodes contains a user specified number of individuals.
  10. Now, the tree is grow… Given the characteristics X of the new entering the room (gender, size, position, …) I put him down the tree, He falls in a terminal node N(X) And I predict his income by averaging the observed Yi over the observations i ‘falling’ in that node. For example, is the new entering this room is a young Man, he is 150cm high, he is French and not HoU, … I will predict his income by averaging those of the young small French Men who are not chiefs.
  11. Breiman has demonstrated that consequential gains in prediction accuracy can be achieve by using a set of simpler trees. Here we have the RF algorithm as described in the book of Hastie More precisely, a random forest is a collection of K tree predictors, Where each tree is constructed from a bootstrap sample from the learning dataset L However, instead of determining the optimal split on a given node by evaluating all possible covariates, a subset of covariates drawn at random is used. Finally, we aggregate K tree predictors of simpler size trees. By doing that, it is possible to approximate a rich class of functions. And it gives an accurate approximation of the conditional mean. For the free parameters K , nodesize and mtry (the variables chosen at random from p) , we used the default values 500, 5 and p /3 of the random forest R-package
  12. In addition, RF posses a number of features, that can be used for example to deal with the problem of missing values. In this study, we use the important feature to reduce data dimensionality . Here n the number of observations is much lower than p the dimension of the explanatory variables The variable importance allows to discriminate between informative and noninformative variables. For each variable, the idea is to compare the prediction error with the prediction error where the variable is randomly permuted - Large positive values for a variable indicate that this variable is predictive, since noising it up increases prediction error, - zero or negative importance values indicate non-predictive variables In my previous example, the importance for size would have been very likely negative, while age or position would have been positive…
  13. So, our aim is to predict GDP growth for quarter Q, based on data available at the end on quarter Q To be clear, based on the information available until end of June 2010, we want to forecast 2 nd Q of 2010 (now cast, remember that Flash estimate GDP will be release by Eurostat mid august) To asses the quality of our forecast, we will do an out-of-sample analysis: 2004q1 – 2009q3 The criterium will be the MSE (out of sample MSE) We started our study an unvariate AR model … which appear to be a poor competitor. And we decided to compared our results with a fair competitor: the quarterly projections of the EZEO (jointly release by three major European economic institutes, the German IFO, the French Insee and the Italian ISAE) EZEO is a quarterly publication Based on the information available at the end of the previous quarter, EZEO forecast GDP growth of quarter Q. In fact, this publication provides also 2-steps -ahead projections (Q+1 and Q+2) for GDP, IP, Consumption, Inflation And describes also economic links explaining these forecasts… Here, however, our concern was just to asses How a data-driven model like the RF perform relative to competitors for GDP nowcasting.
  14. Results: We compare the pure random forest (non parametric) to our benchmarks. Based on MSE computed on the whole out of sample period, RF outperforms the AR but not the euro zone economic outlook You have quarter after quarter all the forecast in the paper, but in this graph I plotted how MSE develop over time (real time MSE) The graphs highlights the good performance of RF before the crisis Poor performance during the crisis… in fact for 2008Q4 and 2009Q1, we made a huge error. But it is not surprising, as the pure RF are based on average of values observed in the learning set. And before the crisis, no negative vales were present in the learning set ! However, it is worth noting that the RF algorithm “learns” and will be able to predict negative output in the future.
  15. To square the problem of non negative values in our learning set, we try to use the power of RF to select the 25 most important variables And then to insert this variable in a bridge model. We use the General to Specific procedure to choose the best linear model. RF_LIMOD is the model we retained. Remarkably, 3 of five explanatory variables come from the Industry (whose variability explain most of the GDP cycles), moreover the variable are related to orders book, which are well known to be among the most factual and reliable soft variable We have also one question about order in retail trader and on question from the consumer survey.
  16. To square the problem of non negative values in our learning set, we try to use the power of RF to select the 25 most important variables And then to insert this variable in a bridge model. We use the General to Specific procedure to choose the best linear model. RF_LIMOD is the model we retained. Remarkably, 3 of five explanatory variables come from the Industry (whose variability explain most of the GDP cycles), moreover the variable are related to orders book, which are well known to be among the most factual and reliable soft variable We have also one question about order in retail trader and on question from the consumer survey.
  17. Results: We continue compare RF_LINMOD outputs using the same criterion Based on MSE computed on the whole out of sample period, it performs as well as the euro zone economic outlook outperforms the pure RF and compares well to the euro zone economic outlook during the ‘crisis’
  18. For further development, we would like continue this analysis by adding in our dataset the hard variable that are available at the end of each quarter (for example the carry over of industrial production, first registration of private cars)… and according to our tests, the carry over of the IP appears to be one of the most important variable! Moreover, this kind of information is for sure used by our colleagues in the euro zone economic outlook… So, this work is still ongoing but promising.