Rob J Hyndman
Automatic time series forecasting
Talk given at National Tsing Hua University on 7 January 2015
Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
Motivation
1 Common in business to have over 1000 products that need forecasting at least monthly.
2 Forecasts are often required by people who are untrained in time series analysis.

Specifications
Automatic forecasting algorithms must:
- determine an appropriate time series model;
- estimate the parameters;
- compute the forecasts with prediction intervals.
Example: Asian sheep
[Figure: Numbers of sheep in Asia, 1960-2010, in millions of sheep]
[Figure: Automatic ETS forecasts of the same series]
Example: Corticosteroid sales
[Figure: Monthly corticosteroid drug sales in Australia, 1995-2010, in total scripts (millions)]
[Figure: Automatic ARIMA forecasts of the same series]
2 Forecasting competitions
Makridakis and Hibon (1979)
This was the first large-scale empirical evaluation of time series forecasting methods. Highly controversial at the time.

Difficulties:
- How to measure forecast accuracy?
- How to apply methods consistently and objectively?
- How to explain unexpected results?

Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed ARIMA models were not best, it must be because the analyst was unskilled.
Makridakis and Hibon (1979)
I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley

. . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller
Makridakis and Hibon (1979)
Modern man is fascinated with the subject of forecasting. — W.G. Gilchrist

It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist
Makridakis and Hibon (1979)
I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods. — C. Chatfield

Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield

. . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield
Makridakis and Hibon (1979)
There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis & M. Hibon

Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis & M. Hibon
Consequences of MH (1979)
As a result of this paper, researchers started to:
- consider how to automate forecasting methods;
- study what methods give the best forecasts;
- be aware of the dangers of over-fitting;
- treat forecasting as a different problem from time series analysis.

Makridakis & Hibon followed up with a new competition in 1982:
- 1001 series
- Anyone could submit forecasts (avoiding the charge of incompetence)
- Multiple forecast measures used.
M-competition
Main findings (taken from Makridakis & Hibon, 2000):
1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones.
2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used.
3 The accuracy when various methods are being combined outperforms, on average, the individual methods being combined and does very well in comparison to other methods.
4 The accuracy of the various methods depends upon the length of the forecasting horizon involved.
M3 competition
Makridakis and Hibon (2000)
“The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . . The extension involves the inclusion of more methods/researchers (in particular in the areas of neural networks and expert systems) and more series.”
- 3003 series
- All data from business, demography, finance and economics.
- Series length between 14 and 126.
- Either non-seasonal, monthly or quarterly.
- All time series positive.
MH claimed that the M3-competition supported the findings of their earlier work. However, the best performing methods were far from “simple”.

Best methods:
- Theta: a very confusing explanation. Shown by Hyndman and Billah (2003) to be an average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data. Later, the original authors claimed that their explanation was incorrect.
- Forecast Pro: a commercial software package with an unknown algorithm. Known to fit either exponential smoothing or ARIMA models using BIC.
M3 results (recalculated)

Method           MAPE   sMAPE   MASE
Theta            17.42  12.76   1.39
ForecastPro      18.00  13.06   1.47
ForecastX        17.35  13.09   1.42
Automatic ANN    17.18  13.98   1.53
B-J automatic    19.13  13.72   1.54

Notes: these calculations do not match the published paper, and some contestants apparently submitted multiple entries but only the best ones were published.
3 Forecasting the PBS
Forecasting the PBS
The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme.
- Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs.
- The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion).
- The total cost is budgeted based on forecasts of drug usage.
Forecasting the PBS
- In 2001: $4.5 billion budget, under-forecasted by $800 million.
- Thousands of products. Seasonal demand.
- Subject to covert marketing, volatile products, uncontrollable expenditure.
- Although monthly data are available for 10 years, the data are aggregated to annual values, and only the first three years are used in estimating the forecasts.
- All forecasts were being done with the FORECAST function in MS-Excel!
PBS data
[Figures: monthly total cost ($thousands) from 1995 onwards for five groups: A03 concession safety net, A05 general copayments, D01 general copayments, S01 general copayments, R03 general copayments]
4 Exponential smoothing
Exponential smoothing methods

                              Seasonal Component
Trend Component               N (None)   A (Additive)   M (Multiplicative)
N (None)                      N,N        N,A            N,M
A (Additive)                  A,N        A,A            A,M
Ad (Additive damped)          Ad,N       Ad,A           Ad,M
M (Multiplicative)            M,N        M,A            M,M
Md (Multiplicative damped)    Md,N       Md,A           Md,M

- N,N: Simple exponential smoothing
- A,N: Holt’s linear method
- Ad,N: Additive damped trend method
- M,N: Exponential trend method
- Md,N: Multiplicative damped trend method
- A,A: Additive Holt-Winters’ method
- A,M: Multiplicative Holt-Winters’ method

There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Only 19 models are numerically stable.

General notation: ETS (Error, Trend, Seasonal), for ExponenTial Smoothing. Examples:
- A,N,N: Simple exponential smoothing with additive errors
- A,A,N: Holt’s linear method with additive errors
- M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors

Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.
ETS state space model
[Figure: chain diagram in which each state x_{t−1}, together with an innovation ε_t, generates the observation y_t and the next state x_t, and so on for y_{t+1}, y_{t+2}, . . .]

State space model: x_t = (level, slope, seasonal)

Estimation: compute the likelihood L from ε_1, ε_2, . . . , ε_T, then optimize L with respect to the model parameters.
Innovations state space models
Let x_t = (ℓ_t, b_t, s_t, s_{t−1}, . . . , s_{t−m+1}) and ε_t iid ∼ N(0, σ²).

y_t = h(x_{t−1}) + k(x_{t−1}) ε_t   (observation equation, with mean µ_t = h(x_{t−1}) and error e_t = k(x_{t−1}) ε_t)
x_t = f(x_{t−1}) + g(x_{t−1}) ε_t   (state equation)

Additive errors: k(x_{t−1}) = 1, so y_t = µ_t + ε_t.
Multiplicative errors: k(x_{t−1}) = µ_t, so y_t = µ_t(1 + ε_t), and ε_t = (y_t − µ_t)/µ_t is a relative error.

All models can be written in state space form. Additive and multiplicative versions give the same point forecasts but different prediction intervals.

Estimation
L*(θ, x_0) = n log( Σ_{t=1}^n ε_t² / k²(x_{t−1}) ) + 2 Σ_{t=1}^n log |k(x_{t−1})|
           = −2 log(Likelihood) + constant

Minimize with respect to θ = (α, β, γ, φ) and the initial states x_0 = (ℓ_0, b_0, s_0, s_{−1}, . . . , s_{−m+1}).

Q: How to choose between the 19 different ETS models?
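To make the estimation step concrete before turning to that question, here is a minimal sketch (not the ets() implementation) of the simplest case, ETS(A,N,N): the state is just the level, h(x) = x and k(x) = 1, so the criterion reduces to L* = n log Σ ε_t². The function name and the example series are illustrative assumptions.

# Sketch of ETS(A,N,N) in innovations state space form:
#   y_t = l_{t-1} + e_t           (observation equation)
#   l_t = l_{t-1} + alpha * e_t   (state equation)
ann_filter <- function(y, alpha, l0) {
  n <- length(y)
  e <- numeric(n)
  l <- l0
  for (t in seq_len(n)) {
    e[t] <- y[t] - l        # innovation: observation minus one-step forecast
    l <- l + alpha * e[t]   # update the level state
  }
  n * log(sum(e^2))         # L*: with k(x) = 1 the second term vanishes
}

# Estimate alpha and the initial level l0 by minimizing L*
y <- as.numeric(LakeHuron)  # any univariate series will do
opt <- optim(c(alpha = 0.2, l0 = y[1]), function(p) ann_filter(y, p[1], p[2]))
opt$par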
Cross-validation
[Figure: schematic contrasting three schemes: traditional evaluation (one training set followed by one test set), standard cross-validation (randomly held-out points), and time series cross-validation (training sets grow forward in time, with each test point following its training set)]

Time series cross-validation is also known as “evaluation on a rolling forecast origin”.
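The following sketch evaluates on a rolling forecast origin by hand, using the livestock series from the examples later in the talk (available in the expsmooth/fpp packages); the minimum training size and the candidate models are assumptions, and later versions of the forecast package wrap this pattern in tsCV().

library(forecast)

y <- livestock   # annual sheep series used elsewhere in the talk
k <- 20          # minimum training size (assumption)
n <- length(y)
e1 <- e2 <- rep(NA, n)

for (t in k:(n - 1)) {
  train <- subset(y, end = t)   # y_1, ..., y_t
  e1[t + 1] <- y[t + 1] - forecast(ets(train), h = 1)$mean[1]
  e2[t + 1] <- y[t + 1] - forecast(auto.arima(train), h = 1)$mean[1]
}

# Cross-validated one-step MSE for each candidate
c(ETS = mean(e1^2, na.rm = TRUE), ARIMA = mean(e2^2, na.rm = TRUE))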
Akaike’s Information Criterion
AIC = −2 log(L) + 2k
where L is the likelihood and k is the number of estimated parameters in the model. This is a penalized likelihood approach.

If L is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is from one-step forecasts on the training set, and T is the length of the series.

Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the MSE of one-step forecasts on the test set via time series cross-validation.

Corrected AIC
For small T, the AIC tends to over-fit. Bias-corrected version:
AICc = AIC + 2(k+1)(k+2)/(T−k)

Bayesian Information Criterion
BIC = AIC + k[log(T) − 2]
BIC penalizes extra terms more heavily than AIC, and minimizing BIC is consistent if there is a true model.
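In R these criteria can be read straight off a fitted model; a small sketch, assuming the forecast package and the livestock series used elsewhere in the talk:

library(forecast)

fit <- ets(livestock)
fit$aic    # AIC  = -2 log(L) + 2k
fit$aicc   # AICc = AIC + 2(k+1)(k+2)/(T-k)
fit$bic    # BIC  = AIC + k[log(T) - 2]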
What to use?
Choice: AIC, AICc, BIC, CV-MSE.
CV-MSE is too time consuming for most automatic forecasting purposes. Also requires …
ets algorithm in R
Based on Hyndman, Koehler, Snyder & Grose (IJF, 2002) …
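As a sketch of how that algorithm is exposed in the forecast package: ets() searches the stable models and selects by information criterion, and the model, damped and ic arguments restrict or change the search (treat the exact defaults as version-dependent).

library(forecast)

fit1 <- ets(livestock)                 # automatic selection (AICc by default)
fit2 <- ets(livestock, ic = "bic")     # select by BIC instead
fit3 <- ets(livestock, model = "MAN")  # force ETS(M,A,N): error, trend, seasonal
fit4 <- ets(livestock, model = "ZZN", damped = FALSE)  # "Z" components chosen automatically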
Exponential smoothing
[Figure: Forecasts from ETS(M,A,N) for the Asian sheep series (millions of sheep)]

fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
Exponential smoothing
[Figure: Forecasts from ETS(M,Md,M) for the monthly corticosteroid sales series (total scripts, millions)]

fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)
Exponential smoothing
> fit
ETS(M,Md,M)
Smoothing parameters:
  alpha = 0.3318
  beta  = 4e-04
  gamma = 1e-04
  phi   = 0.9695
Initial states: …
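Beyond the printout, the fit can be interrogated directly; a brief sketch using functions from the forecast package (accuracy() reports training-set errors):

library(forecast)

fit <- ets(h02)
coef(fit)      # smoothing parameters and initial states
accuracy(fit)  # training-set accuracy measures (ME, RMSE, MAPE, MASE, ...)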
M3 comparisons

Method           MAPE   sMAPE   MASE
Theta            17.42  12.76   1.39
ForecastPro      18.00  13.06   1.47
ForecastX        17.35  13.09   1.42
Automatic ANN    17.18  13.98   1.53
B-J automatic    19.13  13.72   1.54
…
Exponential smoothing
Further reading: www.OTexts.com/fpp
5 ARIMA modelling
ARIMA models
[Figure: network diagram: lagged values y_{t−1}, y_{t−2}, y_{t−3} and lagged errors ε_{t−1}, ε_{t−2} (plus the current error ε_t) are the inputs; the output is y_t]

Autoregression (AR) model: lagged values of the series as inputs. Adding lagged errors as inputs gives a moving average component, i.e. an ARMA model.
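To see the two kinds of inputs separately, here is a small simulation sketch using base R's arima.sim(); the coefficients are arbitrary stationary/invertible choices.

set.seed(1)

# AR(2): y_t is driven by its own lagged values
ar2 <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 200)

# MA(2): y_t is driven by lagged errors
ma2 <- arima.sim(model = list(ma = c(0.6, 0.3)), n = 200)

# ARMA(2,2): both kinds of inputs, as in the diagram above
arma22 <- arima.sim(model = list(ar = c(0.6, -0.3), ma = c(0.6, 0.3)), n = 200)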
Auto ARIMA
[Figure: Forecasts from ARIMA(0,1,0) with drift for the Asian sheep series (millions of sheep)]

fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)
Auto ARIMA
[Figure: Forecasts from ARIMA(3,1,3)(0,1,1)[12] for the monthly corticosteroid sales series (total scripts, millions)]

fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
Auto ARIMA
> fit
Series: h02
ARIMA(3,1,3)(0,1,1)[12]
Coefficients:
      ar1      ar2     ar3      ma1  ma2  ma3
  -0.3648  -0.0636  0.3568  -0.4850  …
How does auto.arima() work?
A non-seasonal ARIMA process:
φ(B)(1 − B)^d y_t = c + θ(B)ε_t
Need to select appropriate orders p, d, q, and decide whether to include c.
A seasonal ARIMA process:
Φ(B^m) φ(B) (1 − B)^d (1 − B^m)^D y_t = c + Θ(B^m) θ(B) ε_t
Need to select appropriate orders p, d, q and P, D, Q, and decide whether to include c.
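A sketch of the corresponding search in R: auto.arima() chooses d and D using unit-root tests and then searches over p, q, P, Q (and the constant) by information criterion. The arguments shown exist in the forecast package, although defaults vary across versions.

library(forecast)

fit <- auto.arima(h02)  # default: stepwise search after unit-root tests for d, D

# Slower but more thorough: exhaustive search without likelihood approximations
fit2 <- auto.arima(h02, stepwise = FALSE, approximation = FALSE)

# Constrain the search space explicitly
fit3 <- auto.arima(h02, max.p = 3, max.q = 3, d = 1, D = 1)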
M3 comparisons

Method           MAPE   sMAPE   MASE
Theta            17.42  12.76   1.39
ForecastPro      18.00  13.06   1.47
B-J automatic    19.13  13.72   1.54
…
M3 conclusions
MYTHS
- Simple methods do better.
- Exponential smoothing is better than ARIMA.
FACTS
- The best methods are hybrid …
6 Automatic nonlinear forecasting?
Automatic nonlinear forecasting
- Automatic ANN in the M3 competition did poorly.
- Linear methods did best in the NN3 competition …
7 Time series with complex seasonality
Examples
[Figure: US finished motor gasoline products (weekly data)]
[Figure: Number of calls to a large American bank …]
[Figure: Turkish electricity demand (daily data)]
TBATS model
TBATS:
- Trigonometric terms for seasonality
- Box-Cox transformations for heterogeneity
- ARMA errors for short-term dynamics
- …
TBATS model
y_t = observation at time t

Box-Cox transformation:
y_t^(ω) = (y_t^ω − 1)/ω   if ω ≠ 0;
y_t^(ω) = log y_t         if ω = 0.

Transformed observation equation:
y_t^(ω) = ℓ_{t−1} + φ b_{t−1} + Σ_{i=1}^M s^(i) …
where ℓ_t is the level, b_t the damped trend, and s^(i) the i-th of M seasonal components.
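Before calling tbats() as in the examples below, a series with several seasonal patterns needs to carry its periods; a sketch using msts() from the forecast package, with the period values and raw-data names as illustrative assumptions:

library(forecast)

# Weekly data with an annual (non-integer) period
gas <- msts(gasoline_raw, seasonal.periods = 365.25 / 7)  # gasoline_raw: hypothetical vector

# Intraday call-centre data with daily and weekly periods (values assumed)
calls <- msts(calls_raw, seasonal.periods = c(169, 845))  # calls_raw: hypothetical vector

fit <- tbats(gas)
plot(forecast(fit))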
Examples

fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)

fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)

fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
8 Recent developments
Further competitions
1 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: Point …
Forecasts about forecasting
1 Automatic algorithms will become more general — handling a wide variety of time series.
2 Mo…
For further information
robjhyndman.com
- Slides and references for this talk.
- Links to all papers and books.
- Links to R packages.
  1. 1. Rob J Hyndman Automatic time series forecasting
  2. 2. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Motivation 2
  3. 3. Motivation Automatic time series forecasting Motivation 3
  4. 4. Motivation Automatic time series forecasting Motivation 3
  5. 5. Motivation Automatic time series forecasting Motivation 3
  6. 6. Motivation Automatic time series forecasting Motivation 3
  7. 7. Motivation Automatic time series forecasting Motivation 3
  8. 8. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic time series forecasting Motivation 4
  9. 9. Motivation 1 Common in business to have over 1000 products that need forecasting at least monthly. 2 Forecasts are often required by people who are untrained in time series analysis. Specifications Automatic forecasting algorithms must: ¯ determine an appropriate time series model; ¯ estimate the parameters; ¯ compute the forecasts with prediction intervals. Automatic time series forecasting Motivation 4
  10. 10. Example: Asian sheep Automatic time series forecasting Motivation 5 Numbers of sheep in Asia Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  11. 11. Example: Asian sheep Automatic time series forecasting Motivation 5 Automatic ETS forecasts Year millionsofsheep 1960 1970 1980 1990 2000 2010 250300350400450500550
  12. 12. Example: Cortecosteroid sales Automatic time series forecasting Motivation 6 Monthly cortecosteroid drug sales in Australia Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.4
  13. 13. Example: Cortecosteroid sales Automatic time series forecasting Motivation 6 Automatic ARIMA forecasts Year Totalscripts(millions) 1995 2000 2005 2010 0.40.60.81.01.21.4
  14. 14. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Forecasting competitions 7
  15. 15. Makridakis and Hibon (1979) Automatic time series forecasting Forecasting competitions 8
  16. 16. Makridakis and Hibon (1979) Automatic time series forecasting Forecasting competitions 8
  17. 17. Makridakis and Hibon (1979) This was the first large-scale empirical evaluation of time series forecasting methods. Highly controversial at the time. Difficulties: How to measure forecast accuracy? How to apply methods consistently and objectively? How to explain unexpected results? Common thinking was that the more sophisticated mathematical models (ARIMA models at the time) were necessarily better. If results showed ARIMA models not best, it must be because analyst was unskilled. Automatic time series forecasting Forecasting competitions 9
  18. 18. Makridakis and Hibon (1979) I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley . . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller Automatic time series forecasting Forecasting competitions 10
  19. 19. Makridakis and Hibon (1979) I do not believe that it is very fruitful to attempt to classify series according to which forecasting techniques perform “best”. The performance of any particular technique when applied to a particular series depends essentially on (a) the model which the series obeys; (b) our ability to identify and fit this model correctly and (c) the criterion chosen to measure the forecasting accuracy. — M.B. Priestley . . . the paper suggests the application of normal scientific experimental design to forecasting, with measures of unbiased testing of forecasts against subsequent reality, for success or failure. A long overdue reform. — F.H. Hansford-Miller Automatic time series forecasting Forecasting competitions 10
  20. 20. Makridakis and Hibon (1979) Modern man is fascinated with the subject of forecasting — W.G. Gilchrist It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist Automatic time series forecasting Forecasting competitions 11
  21. 21. Makridakis and Hibon (1979) Modern man is fascinated with the subject of forecasting — W.G. Gilchrist It is amazing to me, however, that after all this exercise in identifying models, transforming and so on, that the autoregressive moving averages come out so badly. I wonder whether it might be partly due to the authors not using the backwards forecasting approach to obtain the initial errors. — W.G. Gilchrist Automatic time series forecasting Forecasting competitions 11
  22. 22. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  23. 23. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  24. 24. Makridakis and Hibon (1979) I find it hard to believe that Box-Jenkins, if properly applied, can actually be worse than so many of the simple methods — C. Chatfield Why do empirical studies sometimes give different answers? It may depend on the selected sample of time series, but I suspect it is more likely to depend on the skill of the analyst and on their individual interpretations of what is meant by Method X. — C. Chatfield . . . these authors are more at home with simple procedures than with Box-Jenkins. — C. Chatfield Automatic time series forecasting Forecasting competitions 12
  25. 25. Makridakis and Hibon (1979) There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis M. Hibon Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis M. Hibon Automatic time series forecasting Forecasting competitions 13
  26. 26. Makridakis and Hibon (1979) There is a fact that Professor Priestley must accept: empirical evidence is in disagreement with his theoretical arguments. — S. Makridakis M. Hibon Dr Chatfield expresses some personal views about the first author . . . It might be useful for Dr Chatfield to read some of the psychological literature quoted in the main paper, and he can then learn a little more about biases and how they affect prior probabilities. — S. Makridakis M. Hibon Automatic time series forecasting Forecasting competitions 13
  27. 27. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic time series forecasting Forecasting competitions 14
  28. 28. Consequences of MH (1979) As a result of this paper, researchers started to: ¯ consider how to automate forecasting methods; ¯ study what methods give the best forecasts; ¯ be aware of the dangers of over-fitting; ¯ treat forecasting as a different problem from time series analysis. Makridakis Hibon followed up with a new competition in 1982: 1001 series Anyone could submit forecasts (avoiding the charge of incompetence) Multiple forecast measures used. Automatic time series forecasting Forecasting competitions 14
  29. 29. M-competition Automatic time series forecasting Forecasting competitions 15
  30. 30. M-competition Main findings (taken from Makridakis Hibon, 2000) 1 Statistically sophisticated or complex methods do not necessarily provide more accurate forecasts than simpler ones. 2 The relative ranking of the performance of the various methods varies according to the accuracy measure being used. 3 The accuracy when various methods are being combined outperforms, on average, the individual methods being combined and does very well in comparison to other methods. 4 The accuracy of the various methods depends upon the length of the forecasting horizon involved. Automatic time series forecasting Forecasting competitions 16
  31. 31. M3 competition Automatic time series forecasting Forecasting competitions 17
  32. 32. Makridakis and Hibon (2000) “The M3-Competition is a final attempt by the authors to settle the accuracy issue of various time series methods. . . The extension involves the inclusion of more methods/ researchers (in particular in the areas of neural networks and expert systems) and more series.” 3003 series All data from business, demography, finance and economics. Series length between 14 and 126. Either non-seasonal, monthly or quarterly. All time series positive. MH claimed that the M3-competition supported the findings of their earlier work. However, best performing methods far from “simple”. Automatic time series forecasting Forecasting competitions 18
  33. 33. Makridakis and Hibon (2000) Best methods: Theta A very confusing explanation. Shown by Hyndman and Billah (2003) to be average of linear regression and simple exponential smoothing with drift, applied to seasonally adjusted data. Later, the original authors claimed that their explanation was incorrect. Forecast Pro A commercial software package with an unknown algorithm. Known to fit either exponential smoothing or ARIMA models using BIC. Automatic time series forecasting Forecasting competitions 19
  34. 34. M3 results (recalculated) Method MAPE sMAPE MASE Theta 17.42 12.76 1.39 ForecastPro 18.00 13.06 1.47 ForecastX 17.35 13.09 1.42 Automatic ANN 17.18 13.98 1.53 B-J automatic 19.13 13.72 1.54 Automatic time series forecasting Forecasting competitions 20
  35. 35. M3 results (recalculated) Method MAPE sMAPE MASE Theta 17.42 12.76 1.39 ForecastPro 18.00 13.06 1.47 ForecastX 17.35 13.09 1.42 Automatic ANN 17.18 13.98 1.53 B-J automatic 19.13 13.72 1.54 Automatic time series forecasting Forecasting competitions 20 ® Calculations do not match published paper. ® Some contestants apparently submitted multiple entries but only best ones published.
  36. 36. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Forecasting the PBS 21
  37. 37. Forecasting the PBS Automatic time series forecasting Forecasting the PBS 22
  38. 38. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  39. 39. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  40. 40. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  41. 41. Forecasting the PBS The Pharmaceutical Benefits Scheme (PBS) is the Australian government drugs subsidy scheme. Many drugs bought from pharmacies are subsidised to allow more equitable access to modern drugs. The cost to government is determined by the number and types of drugs purchased. Currently nearly 1% of GDP ($14 billion). The total cost is budgeted based on forecasts of drug usage. Automatic time series forecasting Forecasting the PBS 23
  42. 42. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  43. 43. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  44. 44. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  45. 45. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  46. 46. Forecasting the PBS In 2001: $4.5 billion budget, under-forecasted by $800 million. Thousands of products. Seasonal demand. Subject to covert marketing, volatile products, uncontrollable expenditure. Although monthly data available for 10 years, data are aggregated to annual values, and only the first three years are used in estimating the forecasts. All forecasts being done with the FORECAST function in MS-Excel! Automatic time series forecasting Forecasting the PBS 24
  47. 47. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: A03 concession safety net group Time $thousands 1995 2000 2005 020040060080010001200
  48. 48. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: A05 general copayments group Time $thousands 1995 2000 2005 050100150200
  49. 49. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: D01 general copayments group Time $thousands 1995 2000 2005 0100200300400500600700
  50. 50. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: S01 general copayments group Time $thousands 1995 2000 2005 050010001500
  51. 51. PBS data Automatic time series forecasting Forecasting the PBS 25 Total cost: R03 general copayments group Time $thousands 1995 2000 2005 10002000300040005000
  52. 52. Outline 1 Motivation 2 Forecasting competitions 3 Forecasting the PBS 4 Exponential smoothing 5 ARIMA modelling 6 Automatic nonlinear forecasting? 7 Time series with complex seasonality 8 Recent developments Automatic time series forecasting Exponential smoothing 26
  53. 53. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M Automatic time series forecasting Exponential smoothing 27
  54. 54. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing Automatic time series forecasting Exponential smoothing 27
  55. 55. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Automatic time series forecasting Exponential smoothing 27
  56. 56. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Ad,N: Additive damped trend method Automatic time series forecasting Exponential smoothing 27
  57. 57. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Ad,N: Additive damped trend method M,N: Exponential trend method Automatic time series forecasting Exponential smoothing 27
  58. 58. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Ad,N: Additive damped trend method M,N: Exponential trend method Md,N: Multiplicative damped trend method Automatic time series forecasting Exponential smoothing 27
  59. 59. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Ad,N: Additive damped trend method M,N: Exponential trend method Md,N: Multiplicative damped trend method A,A: Additive Holt-Winters’ method Automatic time series forecasting Exponential smoothing 27
  60. 60. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M N,N: Simple exponential smoothing A,N: Holt’s linear method Ad,N: Additive damped trend method M,N: Exponential trend method Md,N: Multiplicative damped trend method A,A: Additive Holt-Winters’ method A,M: Multiplicative Holt-Winters’ method Automatic time series forecasting Exponential smoothing 27
  61. 61. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M There are 15 separate exponential smoothing methods. Automatic time series forecasting Exponential smoothing 27
  62. 62. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Automatic time series forecasting Exponential smoothing 27
  63. 63. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M There are 15 separate exponential smoothing methods. Each can have an additive or multiplicative error, giving 30 separate models. Only 19 models are numerically stable. Automatic time series forecasting Exponential smoothing 27
  64. 64. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  65. 65. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  66. 66. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing ↑ Trend Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  67. 67. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing ↑ Trend Seasonal Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  68. 68. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing ↑ Error Trend Seasonal Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  69. 69. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing ↑ Error Trend Seasonal Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28
  70. 70. Exponential smoothing methods Seasonal Component Trend N A M Component (None) (Additive) (Multiplicative) N (None) N,N N,A N,M A (Additive) A,N A,A A,M Ad (Additive damped) Ad,N Ad,A Ad,M M (Multiplicative) M,N M,A M,M Md (Multiplicative damped) Md,N Md,A Md,M General notation E T S : ExponenTial Smoothing ↑ Error Trend Seasonal Examples: A,N,N: Simple exponential smoothing with additive errors A,A,N: Holt’s linear method with additive errors M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors Automatic time series forecasting Exponential smoothing 28 Innovations state space models ¯ All ETS models can be written in innovations state space form (IJF, 2002). ¯ Additive and multiplicative versions give the same point forecasts but different prediction intervals.
  71. 71. ETS state space model xt−1 εt yt Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  72. 72. ETS state space model xt−1 εt yt xt Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  73. 73. ETS state space model xt−1 εt yt xt yt+1 εt+1 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  74. 74. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  75. 75. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  76. 76. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 xt+2 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  77. 77. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 xt+2 yt+3 εt+3 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  78. 78. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 xt+2 yt+3 εt+3 xt+3 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  79. 79. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 xt+2 yt+3 εt+3 xt+3 yt+4 εt+4 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal)
  80. 80. ETS state space model xt−1 εt yt xt yt+1 εt+1 xt+1 yt+2 εt+2 xt+2 yt+3 εt+3 xt+3 yt+4 εt+4 Automatic time series forecasting Exponential smoothing 29 State space model xt = (level, slope, seasonal) Estimation Compute likelihood L from ε1, ε2, . . . , εT. Optimize L wrt model parameters.
  81. 81. Innovations state space models Let xt = ( t, bt, st, st−1, . . . , st−m+1) and εt iid ∼ N(0, σ2 ). yt = h(xt−1) + k(xt−1)εt Observation equation µt et xt = f(xt−1) + g(xt−1)εt State equation Additive errors: k(xt−1) = 1. yt = µt + εt. Multiplicative errors: k(xt−1) = µt. yt = µt(1 + εt). εt = (yt − µt)/µt is relative error. Automatic time series forecasting Exponential smoothing 30
  82. 82. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31
  83. 83. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31
  84. 84. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31
  85. 85. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31
  86. 86. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31
  87. 87. Innovations state space models All models can be written in state space form. Additive and multiplicative versions give same point forecasts but different prediction intervals. Estimation L∗ (θ, x0) = n log n t=1 ε2 t /k2 (xt−1) + 2 n t=1 log |k(xt−1)| = −2 log(Likelihood) + constant Minimize wrt θ = (α, β, γ, φ) and initial states x0 = ( 0, b0, s0, s−1, . . . , s−m+1). Automatic time series forecasting Exponential smoothing 31 Q: How to choose between the 19 different ETS models?
  88. 88. Cross-validation Traditional evaluation Automatic time series forecasting Exponential smoothing 32 q q q q q q q q q q q q q q q q q q q q q q q q q time Training data Test data
  89. 89. Cross-validation Traditional evaluation Standard cross-validation Automatic time series forecasting Exponential smoothing 32 q q q q q q q q q q q q q q q q q q q q q q q q q time Training data Test data q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q qqq qq q q q q q q q q q q q q q q q q q q q q qq qqq q q q q q q q q q q q q q q q q q q q q qqqq q q q q q q q q q q q q q q q q q q q q q qq q qq q
  90. 90. Cross-validation Traditional evaluation Standard cross-validation Time series cross-validation Automatic time series forecasting Exponential smoothing 32 q q q q q q q q q q q q q q q q q q q q q q q q q time Training data Test data q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q qqq qq q q q q q q q q q q q q q q q q q q q q qq qqq q q q q q q q q q q q q q q q q q q q q qqqq q q q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q
  91. 91. Cross-validation Traditional evaluation Standard cross-validation Time series cross-validation Automatic time series forecasting Exponential smoothing 32 q q q q q q q q q q q q q q q q q q q q q q q q q time Training data Test data q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q qqq qq q q q q q q q q q q q q q q q q q q q q qq qqq q q q q q q q q q q q q q q q q q q q q qqqq q q q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q
  92. 92. Cross-validation Traditional evaluation Standard cross-validation Time series cross-validation Automatic time series forecasting Exponential smoothing 32 q q q q q q q q q q q q q q q q q q q q q q q q q time Training data Test data q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q qqq qq q q q q q q q q q q q q q q q q q q q q qq qqq q q q q q q q q q q q q q q q q q q q q qqqq q q q q q q q q q q q q q q q q q q q q q qq q qq q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Also known as “Evaluation on a rolling forecast origin”
93. Akaike's Information Criterion
AIC = -2 log(L) + 2k,
where L is the likelihood and k is the number of estimated parameters in the model. This is a penalized likelihood approach.
If the likelihood is Gaussian, then AIC ≈ c + T log(MSE) + 2k, where c is a constant, MSE is the mean squared error of one-step forecasts on the training set, and T is the length of the series.
Minimizing the Gaussian AIC is asymptotically equivalent (as T → ∞) to minimizing the one-step MSE obtained on test sets via time series cross-validation.
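To make the Gaussian approximation concrete, here is a hedged check in R, assuming the forecast package; AirPassengers and the additive-error model "AAA" are illustrative choices, and the two values agree only up to the additive constant c:

    # Compare the reported AIC with T*log(MSE) + 2k for an additive-error
    # (Gaussian) ETS model. The two differ by a constant, as stated above.
    library(forecast)
    fit <- ets(AirPassengers, model = "AAA")   # additive errors => Gaussian likelihood
    T <- length(AirPassengers)
    k <- length(fit$par)                       # smoothing parameters + initial states
    mse <- mean(residuals(fit)^2)              # one-step training MSE
    fit$aic                                    # AIC reported by ets()
    T * log(mse) + 2 * k                       # approximation, up to the constant c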
98. Akaike's Information Criterion
AIC = -2 log(L) + 2k
Corrected AIC: for small T, the AIC tends to over-fit. A bias-corrected version is
AICc = AIC + 2(k+1)(k+2)/(T-k).
Bayesian Information Criterion:
BIC = AIC + k[log(T) - 2].
The BIC penalizes extra terms more heavily than the AIC. Minimizing the BIC is consistent if there is a true model.
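Both corrections are simple functions of the AIC, T, and k, so they are one-liners to compute; the helper names below are mine, not from the talk:

    # AICc and BIC from the AIC, following the formulas above.
    # T = length of the series, k = number of estimated parameters.
    aicc <- function(aic, k, T) aic + 2 * (k + 1) * (k + 2) / (T - k)
    bic  <- function(aic, k, T) aic + k * (log(T) - 2)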
101. What to use?
Choice: AIC, AICc, BIC, or CV-MSE.
CV-MSE is too time-consuming for most automatic forecasting purposes, and it requires a large T.
As T → ∞, the BIC selects the true model if there is one. But there never is a true model!
The AICc focuses on forecasting performance, can be used on small samples, and is very fast to compute.
Empirical studies in forecasting show the AIC gives better forecast accuracy than the BIC.
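In the forecast package the selection criterion can be chosen directly; a small hedged illustration, assuming the livestock series from the talk is available (e.g. via the fpp package):

    # ets() selects among candidate models by AICc by default; the ic argument
    # switches the criterion. The selected models may differ.
    library(forecast)
    fit_aicc <- ets(livestock, ic = "aicc")
    fit_bic  <- ets(livestock, ic = "bic")
    c(fit_aicc$method, fit_bic$method)   # the chosen ETS specifications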
106. The ets algorithm in R
Based on Hyndman, Koehler, Snyder & Grose (IJF, 2002):
Apply each of the 19 models that are appropriate to the data.
Optimize the parameters and initial values using maximum likelihood estimation.
Select the best model using the AICc.
Produce forecasts using the best model.
Obtain prediction intervals from the underlying state space model.
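The selection step can be mimicked by hand over a few candidates; a hedged sketch (ets() performs this search over all admissible models automatically), again assuming the livestock series is available:

    # AICc-based selection over a handful of non-seasonal ETS candidates.
    # livestock is annual, so the seasonal component is "N" throughout.
    library(forecast)
    candidates <- c("ANN", "AAN", "MNN", "MAN")
    fits <- lapply(candidates, function(m) ets(livestock, model = m))
    aiccs <- sapply(fits, function(f) f$aicc)
    names(aiccs) <- candidates
    best <- fits[[which.min(aiccs)]]   # the lowest-AICc model wins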
110. Exponential smoothing
fit <- ets(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from ETS(M,A,N); Year 1960-2010 on the x-axis, millions of sheep on the y-axis.]
112. Exponential smoothing
fit <- ets(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from ETS(M,Md,M); Year 1995-2010 on the x-axis, total scripts (millions) on the y-axis.]
114. Exponential smoothing
fit
    ETS(M,Md,M)
    Smoothing parameters:
      alpha = 0.3318, beta = 4e-04, gamma = 1e-04, phi = 0.9695
    Initial states:
      l = 0.4003, b = 1.0233
      s = 0.8575 0.8183 0.7559 0.7627 0.6873 1.2884
          1.3456 1.1867 1.1653 1.1033 1.0398 0.9893
    sigma: 0.0651
           AIC       AICc        BIC
    -121.97999 -118.68967  -65.57195
115. M3 comparisons
    Method          MAPE   sMAPE   MASE
    Theta           17.42  12.76   1.39
    ForecastPro     18.00  13.06   1.47
    ForecastX       17.35  13.09   1.42
    Automatic ANN   17.18  13.98   1.53
    B-J automatic   19.13  13.72   1.54
    ETS             18.06  13.38   1.52
116. Exponential smoothing
www.OTexts.com/fpp
119. Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
120. ARIMA models
[Diagram: lagged values y(t-1), y(t-2), y(t-3) as inputs to the output y(t) give an autoregression (AR) model; adding lagged errors ε(t), ε(t-1), ε(t-2) as inputs gives an autoregression moving average (ARMA) model.]
Estimation
Compute the likelihood L from ε1, ε2, ..., εT.
Use an optimization algorithm to maximize L.
ARIMA model
An ARMA model applied to differences of the series.
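As a concrete illustration of the estimation step, the forecast package fits a specified ARIMA model by maximum likelihood; the order (2,1,1) and the drift term below are arbitrary choices for the example, not from the talk:

    # An ARMA(2,1) applied to first differences, i.e. ARIMA(2,1,1) with drift,
    # estimated by maximising the likelihood computed from the errors.
    library(forecast)
    fit <- Arima(livestock, order = c(2, 1, 1), include.drift = TRUE)
    summary(fit)   # coefficients, sigma^2, log likelihood, AIC/AICc/BIC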
128. Auto ARIMA
fit <- auto.arima(livestock)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from ARIMA(0,1,0) with drift; Year 1960-2010 on the x-axis, millions of sheep on the y-axis.]
130. Auto ARIMA
fit <- auto.arima(h02)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from ARIMA(3,1,3)(0,1,1)[12]; Year 1995-2010 on the x-axis, total scripts (millions) on the y-axis.]
132. Auto ARIMA
fit
    Series: h02
    ARIMA(3,1,3)(0,1,1)[12]
    Coefficients:
              ar1      ar2     ar3      ma1     ma2     ma3     sma1
          -0.3648  -0.0636  0.3568  -0.4850  0.0479  -0.353  -0.5931
    s.e.   0.2198   0.3293  0.1268   0.2227  0.2755   0.212   0.0651
    sigma^2 estimated as 0.002706: log likelihood = 290.25
    AIC = -564.5   AICc = -563.71   BIC = -538.48
133. How does auto.arima() work?
A non-seasonal ARIMA process:
φ(B)(1 − B)^d y_t = c + θ(B)ε_t
We need to select appropriate orders p, q, d, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
Select the number of differences d via the KPSS unit root test.
Select p, q, and c by minimising the AICc.
Use a stepwise search to traverse the model space, starting from a simple model and considering nearby variants.
Algorithm choices are driven by forecast accuracy.
136. How does auto.arima() work?
A seasonal ARIMA process:
Φ(B^m)φ(B)(1 − B)^d(1 − B^m)^D y_t = c + Θ(B^m)θ(B)ε_t
We need to select appropriate orders p, q, d, P, Q, D, and whether to include c.
Hyndman & Khandakar (JSS, 2008) algorithm:
Select the number of differences d via the KPSS unit root test.
Select D using the OCSB unit root test.
Select p, q, P, Q, and c by minimising the AICc.
Use a stepwise search to traverse the model space, starting from a simple model and considering nearby variants.
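The individual steps are also exposed as standalone functions; a hedged sketch, assuming the forecast package and the h02 series (the default unit-root tests have varied across package versions, so they are written out explicitly):

    # The pieces that auto.arima() combines, for a seasonal monthly series.
    library(forecast)
    d <- ndiffs(h02, test = "kpss")    # KPSS test chooses d
    D <- nsdiffs(h02, test = "ocsb")   # OCSB test chooses D
    fit <- auto.arima(h02, stepwise = TRUE)   # stepwise AICc search over p,q,P,Q,c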
137. M3 comparisons
    Method          MAPE   sMAPE   MASE
    Theta           17.42  12.76   1.39
    ForecastPro     18.00  13.06   1.47
    B-J automatic   19.13  13.72   1.54
    ETS             18.06  13.38   1.52
    AutoARIMA       19.04  13.86   1.47
    ETS-ARIMA       17.92  13.02   1.44
139. M3 conclusions
MYTHS
Simple methods do better.
Exponential smoothing is better than ARIMA.
FACTS
The best methods are hybrid approaches.
ETS-ARIMA (the simple average of the additive-error ETS and AutoARIMA forecasts; see the sketch below) is the only fully documented method that is comparable to the M3 competition winners.
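The ETS-ARIMA hybrid is straightforward to reproduce; a minimal sketch, assuming the forecast package and the h02 series, with the forecast horizon chosen arbitrarily:

    # Simple average of additive-error ETS forecasts and auto.arima forecasts.
    library(forecast)
    h <- 18
    f1 <- forecast(ets(h02, additive.only = TRUE), h = h)$mean
    f2 <- forecast(auto.arima(h02), h = h)$mean
    hybrid <- (f1 + f2) / 2   # the ETS-ARIMA combination forecast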
144. Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
145. Automatic nonlinear forecasting
The automatic ANN did poorly in the M3 competition.
Linear methods did best in the NN3 competition!
Very few machine learning methods get published in the IJF because the authors cannot demonstrate that their methods give better forecasts than linear benchmark methods, even on supposedly nonlinear data.
There is some good recent work by Kourentzes and Crone (Neurocomputing, 2010) on automated ANNs for time series. Watch this space!
150. Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
151. Examples
[Figure: US finished motor gasoline products, weekly, 1992-2004; thousands of barrels per day.]
[Figure: number of calls to a large American bank (7am-9pm), 5-minute intervals, 3 March to 12 May; number of call arrivals.]
[Figure: Turkish electricity demand, daily, 2000-2008; electricity demand (GW).]
154. TBATS model
TBATS:
Trigonometric terms for seasonality
Box-Cox transformations for heterogeneity
ARMA errors for short-term dynamics
Trend (possibly damped)
Seasonal (including multiple and non-integer periods)
The automatic algorithm is described in A.M. De Livera, R.J. Hyndman and R.D. Snyder (2011), "Forecasting time series with complex seasonal patterns using exponential smoothing", Journal of the American Statistical Association 106(496), 1513-1527.
155. TBATS model
Let y_t denote the observation at time t.

Box-Cox transformation:
    y_t^{(\omega)} = \begin{cases} (y_t^{\omega} - 1)/\omega, & \omega \neq 0; \\ \log y_t, & \omega = 0. \end{cases}

Measurement equation (M seasonal periods):
    y_t^{(\omega)} = \ell_{t-1} + \phi b_{t-1} + \sum_{i=1}^{M} s^{(i)}_{t-m_i} + d_t

Global and local trend:
    \ell_t = \ell_{t-1} + \phi b_{t-1} + \alpha d_t
    b_t = (1 - \phi) b + \phi b_{t-1} + \beta d_t

ARMA error:
    d_t = \sum_{i=1}^{p} \varphi_i d_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t

Fourier-like seasonal terms:
    s^{(i)}_t = \sum_{j=1}^{k_i} s^{(i)}_{j,t}
    s^{(i)}_{j,t} = s^{(i)}_{j,t-1} \cos \lambda^{(i)}_j + s^{*(i)}_{j,t-1} \sin \lambda^{(i)}_j + \gamma^{(i)}_1 d_t
    s^{*(i)}_{j,t} = -s^{(i)}_{j,t-1} \sin \lambda^{(i)}_j + s^{*(i)}_{j,t-1} \cos \lambda^{(i)}_j + \gamma^{(i)}_2 d_t

TBATS: Trigonometric, Box-Cox, ARMA, Trend, Seasonal.
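A hedged sketch of fitting such a model in R, assuming the forecast package; the simulated half-hourly series and its two periods (daily 48, weekly 336) are illustrative stand-ins for the series used in the talk:

    # Build a series with two nested seasonal periods and fit TBATS.
    library(forecast)
    t <- seq_len(4 * 336)   # four weeks of half-hourly observations
    y <- msts(10 + sin(2 * pi * t / 48) + 2 * sin(2 * pi * t / 336) +
                rnorm(length(t), sd = 0.2),
              seasonal.periods = c(48, 336))
    fit <- tbats(y)
    fc <- forecast(fit, h = 336)   # one week ahead
    plot(fc)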
162. Examples
fit <- tbats(gasoline)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from TBATS(0.999, {2,2}, 1, {52.1785714285714,8}) for the weekly US gasoline series.]
163. Examples
fit <- tbats(callcentre)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from TBATS(1, {3,1}, 0.987, {169,5, 845,3}) for the 5-minute call centre series.]
164. Examples
fit <- tbats(turk)
fcast <- forecast(fit)
plot(fcast)
[Figure: forecasts from TBATS(0, {5,3}, 0.997, {7,3, 354.37,12, 365.25,4}) for daily Turkish electricity demand.]
165. Outline
1 Motivation
2 Forecasting competitions
3 Forecasting the PBS
4 Exponential smoothing
5 ARIMA modelling
6 Automatic nonlinear forecasting?
7 Time series with complex seasonality
8 Recent developments
166. Further competitions
1 The 2011 tourism forecasting competition.
2 Kaggle and other forecasting platforms.
3 GEFCom 2012: point forecasting of electricity load and wind power.
4 GEFCom 2014: probabilistic forecasting of electricity load, electricity price, wind energy and solar energy.
170. Forecasts about forecasting
1 Automatic algorithms will become more general, handling a wide variety of time series.
2 Model selection methods will take account of multi-step forecast accuracy as well as one-step forecast accuracy.
3 Automatic forecasting algorithms for multivariate time series will be developed.
4 Automatic forecasting algorithms that include covariate information will be developed.
174. For further information: robjhyndman.com
Slides and references for this talk.
Links to all papers and books.
Links to R packages.
A blog about forecasting research.

Talk given at National Tsing Hua University on 7 January 2015.