SlideShare una empresa de Scribd logo
1 de 43
Descargar para leer sin conexión
Data Analysis Course
Time Series Analysis &
Forecasting(Version-1)
Venkat Reddy
Data Analysis Course
•   Data analysis design document
•   Introduction to statistical data analysis
•   Descriptive statistics
•   Data exploration, validation & sanitization
•   Probability distributions examples and applications




                                                                Venkat Reddy
                                                          Data Analysis Course
•   Simple correlation and regression analysis
•   Multiple liner regression analysis
•   Logistic regression analysis
•   Testing of hypothesis
•   Clustering and Decision trees

• Time series analysis and forecasting
• Credit Risk Model building-1                                   2
• Credit Risk Model building-2
Note
• This presentation is just class notes. The course notes for Data
  Analysis Training is by written by me, as an aid for myself.
• The best way to treat this is as a high-level summary; the
  actual session went more in depth and contained other




                                                                           Venkat Reddy
                                                                     Data Analysis Course
  information.
• Most of this material was written as informal notes, not
  intended for publication
• Please send questions/comments/corrections to
  venkat@trenwiseanalytics.com or 21.venkat@gmail.com
• Please check my website for latest version of this document
                                         -Venkat Reddy                      3
Contents
• What is a Time Series
• Applications of Time Series Analysis
• Time series model building & Forecasting Methodologies
  • TCSI Method




                                                                 Venkat Reddy
                                                           Data Analysis Course
     • Components of time series
     • Goodness of fit
     • Forecasting using TCSI model
  • ARIMA
     • Main steps in ARIMA
     • Goodness of fit
     • Forecasting using ARIMA model

                                                                  4
What is a Time Series
• Time series data is a sequence of observations collected from
  a process with equally spaced periods of time.
• Examples
  •   Dow Jones Industrial Averages




                                                                        Venkat Reddy
                                                                  Data Analysis Course
  •   Daily data on sales
  •   Monthly inventory
  •   Daily Customers ,
  •   Monthly interest rates, costs
  •   Monthly unemployment rates,
  •   Weekly measures of money supply,
  •   Daily closing prices of stock indices, and so on
                                                                         5
Forecasting Methodologies
• There are many different time series techniques.
• It is usually impossible to know which technique will be best
  for a particular data set.
• It is customary to try out several different techniques and




                                                                        Venkat Reddy
                                                                  Data Analysis Course
  select the one that seems to work best.
• To be an effective time series modeler, you need to keep
  several time series techniques in your “tool box.”
• Simple ideas
  • Moving averages
  • TSI method
• Complex statistical concepts
  • Box-Jenkins methodology                                              6
Building Model Using TSI Method
•   Very Simple technique
•   Spreadsheet is sufficient
•   Less ambiguous
•   Easy to understand & Interpret




                                                 Venkat Reddy
                                           Data Analysis Course
•   Serves the purpose most of the times




                                                  7
Components of a Time series
1. Secular Trend(T): Gradual long term movement(up or down). Easiest to detect
     •     Eg: Population growth In India
2. Cyclical Patterns(C): Results from events recurrent but not periodic in nature.
   An up-and-down repetitive movement in demand. repeats itself over a long
   period of time




                                                                                              Venkat Reddy
                                                                                        Data Analysis Course
     •     Eg. Recession in US Economy
3.       Seasonal Pattern(S): Results from events that are periodic and recurrent in
         nature. An up-and-down repetitive movement within a trend occurring
         periodically. Often weather related but could be daily or weekly occurrence
     •       Eg. Sales in festive seasons
4.       Irregular Component(I): Disturbances or residual variation that remain after
         all the other behaviors have been accounted for. Erratic movements that are
         not predictable because they do not follow a pattern
                                                                                               8
     •       Eg. Earthquake
Building Time Series Model: TCSI
    O(t) = T(t) + S(t) + I(t) or O(t) = T(t) * S(t) * I(t)
    Where O(t) = observed series, T(t) = Trend component, S(t) = Seasonal , I(t) = Irregular component
      Step 1 : Smooth the series & de trend the series




                                                                                                               Venkat Reddy
                                                                                                         Data Analysis Course
      Step 2 : Find out Seasonal component and adjust the data for seasonality



      Step 3 : See if there is still some trend/seasinality in the data & quantify it



                                                                                                                9
Step-1: Smoothening the series
• Why Smoothening ? Basically to find average value and detrend the
  series
• After removing the effect of trends series is left with seasonal and
  irregular components
Moving Average
• Series of arithmetic means, Used only for smoothing. Provides overall
  impression of data over time




                                                                                Venkat Reddy
                                                                          Data Analysis Course
• MA(3)= (yt + yt-1 + yt-2)/3
  Actual Series                        Series after Removing the Trend




                                                                              10
Smoothening the series
Exponential Smoothing
• Form of weighted moving average. Weights decline exponentially
  and most recent data weighted most
• Requires smoothing constant (W). Ranges from 0 to 1, Subjectively
  chosen
• Involves little record keeping of past data




                                                                             Venkat Reddy
                                                                       Data Analysis Course
                Ei = W·Yi + (1 - W)·Ei-1
Actual Series                        Series after Removing the Trend




                                                                           11
Step-2 : Capturing Seasonality - Seasonal
  indices
Year   Quarter   Sales
 1       Q1      362
 1       Q2      385
 1       Q3      432
 1       Q4      678
 2       Q1      382
 2       Q2      409
 2       Q3      498
 2       Q4      642




                                                                                         Venkat Reddy
                                                                                   Data Analysis Course
 3       Q1      473
 3       Q2      513
 3       Q3      582
 3       Q4      789
 4       Q1      544
 4       Q2      582
 4       Q3      681
 4       Q4      899
 5       Q1      628     • Quantify the effect of seasonality in each year
 5               707
 5
         Q2
         Q3      773     • Calculate overall average for each quarter for across
 5       Q4      1008      all years
                         • De seasionalize the dataset with by diving with
                           seasonal indices
                                                                                       12
Seasonal Indices & De Seasionalising
        Step 1 : Reformating the Dataset
          Qtr 1       Qtr 2      Qtr 3          Qtr 4
Y1        362         385        432             678
Y2        382         409        498             642
Y3        473         513        582             789
Y4        544         582        681             899
Y5        628         707        773            1008




                                                              Venkat Reddy
                                                        Data Analysis Course
     Step 2 : Calculation Of Seasonal Indices
          Qtr 1       Qtr 2      Qtr 3          Qtr 4
Y1        0.78        0.83       0.93           1.46
Y2        0.79        0.85       1.03           1.33
Y3        0.80        0.87       0.99           1.34
Y4        0.80        0.86       1.01           1.33
Y5        0.81        0.91       0.99           1.29
SI        0.80        0.86       0.99           1.35




          Step 3: Deseanolised Dataset
          Qtr 1       Qtr 2      Qtr 3          Qtr 4
Y1        454         446        436            502
Y2        479         474        503            475
Y3        594         594        588            584         13
Y4        683         674        688            666
Y5        788         819        781            746
Step-3: Irregular Component
• Irregular No Trend  Cant be modeled
• After first level of de trending & de seasionalising, if there is
  still any pattern left in the time series, repeat the above steps
  once again




                                                                            Venkat Reddy
                                                                      Data Analysis Course
                                                                          14
Lab
•   Download the data from here
•   Draw the trend chart, can you see trend
•   Calculate MA2 and MA3
•   De-trend the series and create the graph again




                                                                         Venkat Reddy
                                                                   Data Analysis Course
•   Is there any trend now?
•   Can you judge the seasonality in de trended data?
•   Find seasonality indices for each quarter
•   De-seasinalise the series by diving with seasonality indices
•   Forecast sales for next four months, one by one
    • First assume I as 1, multiply with S, multiply with MA
                                                                       15
Residual Analysis
 • Residual analysis is done after the model is built
 • There should be no Pattern (information) left in the residuals.
 • Residual should be totally random




                                                                           Venkat Reddy
                                                   Desired Pattern




                                                                     Data Analysis Course
Trend Not Fully Accounted for

Error                                       Error

 0                                          0


          Time (Years)                                Time (Years)
                                                                         16
Residual Analysis
e                                 e

0                                 0




                                                                                      Venkat Reddy
                                                                                Data Analysis Course
                              T                                             T
        Random errors                  Cyclical effects not accounted for

e                                 e

0                                 0

                                                                                    17
                              T                                             T
    Trend not accounted for           Seasonal effects not accounted for
Goodness of fit
• We need a way to compare different time series techniques
  for a given data set.
• Four common techniques are the:
                                            n          ˆ
                                                  Yi  Yi
                                    MAD = 




                                                                             Venkat Reddy
                                                                       Data Analysis Course
  • Mean absolute deviation,
                                           i 1       n
                                                  n            ˆ
                                                          Yi  Yi
  • Mean absolute percent error          100
                                  MAPE =     
                                          n i 1              Yi

  • Mean square error,              MSE = 
                                            Y  Y 
                                            n    ˆ
                                                      i        i
                                                                   2


                                           i 1           n
  • Root mean square error.       RMSE           MSE                      18
Additive and Multiplicative. What model
 to use?
• Is there any striking difference between these two time series?




                                                                          Venkat Reddy
                                                                    Data Analysis Course
                                                                        19
Additive and Multiplicative. What model
to use?

• In many time series, the amplitude of both the seasonal and
  irregular variations increase as the level of the trend rises. In
  this situation, a multiplicative model is usually appropriate.




                                                                             Venkat Reddy
                                                                       Data Analysis Course
• In some time series, the amplitude of both the seasonal and
  irregular variations do not change as the level of the trend rises
  or falls. In such cases, an additive model is appropriate.




                                                                           20
Lab-2
• Find the error at each point in above problem
• Conduct the residual analysis for problem in lab-1
Data Set-2: Predicting the Stock Price
• Download Coal India Limited Stock data from Yahoo finance




                                                                        Venkat Reddy
                                                                  Data Analysis Course
• De trend it by calculating MA5(one week)
• Use day function and pivot table to identify seasonality
  interval
• Are 5-26 & 26 to 4 are two seasons in every month ?
• Find the seasonal indices and remove the seasonality
• Conduct residual analysis
• Forecast future stock price values, buy and sell at the right       21
  time and become rich
Time Series Models : Type 2
• Autoregressive (AR) process: series current values depend on
  its own previous values
• Moving average (MA) process: The current deviation from
  mean depends on previous deviations
• Autoregressive Moving average (ARMA) process




                                                                                       Venkat Reddy
                                                                                 Data Analysis Course
• Autoregressive Integrated Moving average
  (ARIMA)process.

• ARIMA is also known as Box-Jenkins approach. It is popular because of its
  generality; It can handle any series, with or without seasonal elements, and
  it has well-documented computer programs


                                                                                     22
AR Process and MA Process




     Data Analysis Course
23




           Venkat Reddy
ARIMA Model
Autoregressive Integrated Moving Average
     → AR filter → Integration filter → MA filter → εt
        (long term)      (stochastic trend)      (short term)    (white noise error)




                                                                                             Venkat Reddy
                                                                                       Data Analysis Course
ARIMA (2,0,1) yt = a1yt-1 + a2yt-2 + εt + b1εt-1
ARIMA (3,0,1) yt = a1yt-1 + a2yt-2 + a3yt-3+ εt + b1εt-1
ARIMA (1,1,0) Δyt = a1 Δ yt-1 + εt , where Δyt = yt - yt-1
ARIMA (2,1,0) Δyt = a1 Δ yt-1 + a2Δ yt-2 + εt where Δyt = yt - yt-1
ARIMA (2,1,1) =? ARIMA (2,1,2) =? ARIMA (3,1,2) =?
ARIMA(1,0,0) =? ARIMA(2,0,0) =?
ARIMA(0,0,1) =? ARIMA(0,0,2) =?
To build a time series model issuing ARIMA, we need to study the time                      24
series and identify p,d,q…..…simple :P
ARIMA (p,d,q) modeling
To build a time series model issuing ARIMA, we need to study the time
series and identify p,d,q
• Identification:
  • Determine the appropriate values of p, d, & q using the ACF, PACF,
    and unit root tests




                                                                                  Venkat Reddy
                                                                            Data Analysis Course
  • p is the AR order, d is the integration order, q is the MA order
• Estimation :
  • Estimate an ARIMA model using values of p, d, & q you think are
    appropriate.
• Diagnostic checking:
  • Check residuals of estimated ARIMA model(s) to see if they are white
    noise; pick best model with well behaved residuals.
• Forecasting:
  • Produce out of sample forecasts or set aside last few data points for       25
    in-sample forecasting.
The Box-Jenkins Approach

Differencing the                                        Estimate the
series to achieve               Identify the model    parameters of the
   stationary                                              model




                                                                                  Venkat Reddy
                                                                            Data Analysis Course
                                                     Diagnostic checking.
                                        No               Is the model
                                                          adequate?




                                                                                26
                    Use Model for forecasting                Yes
The time series has to be Stationary
Processes
• In order to model a time series with the Box-Jenkins approach,
  the series has to be stationary
• In practical terms, the series is stationary if tends to wonder
  more or less uniformly about some fixed level




                                                                               Venkat Reddy
                                                                         Data Analysis Course
• In statistical terms, a stationary process is assumed to be in a
  particular state of statistical equilibrium, i.e., p(xt) is the same
  for all t
• In particular, if zt is a stationary process, then the first
  difference zt = zt - zt-1and higher differences  zt are
                                                        d

  stationary
                                                                             27
• BTW …..Most time series are nonstationary
1




     3
                            2
                                Some non stationary series




     Data Analysis Course
28




           Venkat Reddy
Achieving Stationarity
• Regular differencing (RD)
(1st order)    xt = (1 – B)xt = xt – xt-1
(2nd order)    2xt = (1 – B)2xt = xt – 2xt-1 + xt-2




                                                                         Venkat Reddy
                                                                   Data Analysis Course
“B” is the backward shift operator
• It is unlikely that more than two regular differencing would
  ever be needed
• Sometimes regular differencing by itself is not sufficient and
  prior transformation is also needed
                                                                       29
Differentiation

  Actual Series




                         Venkat Reddy
                   Data Analysis Course
  Series After
 Differentiation


                       30
Identification of orders p and q
• Identification starts with d
• ARIMA(p,d,q)
• What is Integration here?




                                                                      Venkat Reddy
                                                                Data Analysis Course
• First we need to make the time series stationary



• We need to learn about ACF & PACF to identify p,q
• Once we are working with a stationary time series, we can
  examine the ACF and PACF to help identify the proper number
  of lagged y (AR) terms and ε (MA) terms.                          31
Autocorrelation Function (ACF)
• Correlation with lag-1, lag2, lag3 etc.,
• The ACF represents the degree of persistence over respective
  lags of a variable.
ρk = γk / γ0 = covariance at lag k/ variance




                                                                       Venkat Reddy
                                                                 Data Analysis Course
ρk = E[(yt – μ)(yt-k – μ)]2
        E[(yt – μ)2]
ACF (0) = 1, ACF (k) = ACF (-k)




                                                                     32
ACF Graph
       1.00
       0.50




                                                                                  Venkat Reddy
                                                                            Data Analysis Course
       0.00
   -0.50




              0                     10                     20     30   40
                                                           Lag
              Bartlett's formula for MA(q) 95% confidence bands




                                                                                33
Partial Autocorrelation Function (PACF)
• Partial regression coefficient -
• The lag k partial autocorrelation is the partial regression
  coefficient, θkk in the kth order autoregression
• yt = θk1yt-1 + θk2yt-2 + …+ θkkyt-k + εt




                                                                       Venkat Reddy
                                                                 Data Analysis Course
• Partial correlation measures the degree
  of association between two random variables, with the effect
  of a set of controlling random variables removed.




                                                                     34
PACF Graph
      1.00
      0.50




                                                                           Venkat Reddy
                                                                     Data Analysis Course
      0.00
  -0.50




             0                     10                20    30   40
                                                     Lag
             95% Confidence bands [se = 1/sqrt(n)]




                                                                         35
Identification of AR Processes & its order -p
• What is AR procerss
• For AR models, the ACF will dampen exponentially
• The PACF will identify the order of the AR model:
  • The AR(1) model (yt = a1yt-1 + εt) would have one significant spike at
    lag 1 on the PACF.
  • The AR(3) model (yt = a1yt-1+a2yt-2+a3yt-3+εt) would have significant
    spikes on the PACF at lags 1, 2, & 3.




                                                                                   Venkat Reddy
                                                                             Data Analysis Course
                                                                                 36
Identification of MA Processes & its
order - q
• Recall that a MA(q) can be represented as an AR(∞), thus we expect the
  opposite patterns for MA processes.
• The PACF will dampen exponentially.
• The ACF will be used to identify the order of the MA process.
• MA(1) (yt = εt + b1 εt-1) has one significant spike in the ACF at lag 1.
• MA (3) (yt = εt + b1 εt-1 + b2 εt-2 + b3 εt-3) has three significant spikes in




                                                                                         Venkat Reddy
                                                                                   Data Analysis Course
  the ACF at lags 1, 2, & 3.




                                                                                       37
Parameter Estimate
• We already know the model equation. AR(1,0,0) or AR(2,1,0)
  or ARIMA(2,1,1)
• We need to estimate the coefficients using Least squares.
  Minimizing the sum of squares of deviations




                                                                     Venkat Reddy
                                                               Data Analysis Course
                                                                   38
Interpreting Coefficients
• If we include lagged variables for the dependent variable in an
  OLS model, we cannot simply interpret the β coefficients in
  the standard way.
• Consider the model, Yt = a0 + a1Yt-1 + b1Xt + εt
• The effect of Xt on Yt occurs in period t, but also influences Yt




                                                                            Venkat Reddy
                                                                      Data Analysis Course
  in period t+1 because we include a lagged value of Yt-1 in the
  model.
• To capture these effects, we must calculate multipliers
  (impact, interim, total) or mean/median lags (how long it
  takes for the average effect to occur).

                                                                          39
How good is my model?
• Does our model really give an adequate description of the
  data
• Two criteria to check the goodness of fit
  • Akaike information criterion (AIC)




                                                                            Venkat Reddy
                                                                      Data Analysis Course
  • Schwartz Bayesiancriterion (SBC)/Bayesian information criterion
    (BIC).




                                                                          40
Goodness of fit
• Remember… Residual analysis and Mean deviation, Mean
  Absolute Deviation and Root Mean Square errors?
• Four common techniques are the:
                                            n          ˆ
                                                  Yi  Yi
                                    MAD = 




                                                                             Venkat Reddy
                                                                       Data Analysis Course
  • Mean absolute deviation,
                                           i 1       n
                                                  n            ˆ
                                                          Yi  Yi
  • Mean absolute percent error          100
                                  MAPE =     
                                          n i 1              Yi

  • Mean square error,              MSE = 
                                            Y  Y 
                                            n    ˆ
                                                      i        i
                                                                   2


                                           i 1           n
  • Root mean square error.       RMSE           MSE                      41
Lab
•   Download time_slales data
•   Draw the trend graph
•   Test the stationary using stationarity=(DICKEY) option
•   Fit a time series model




                                                                   Venkat Reddy
                                                             Data Analysis Course
    •   Correlograms
    •   Do we need to differentiate the model?
    •   Test the stationary using stationarity again
    •   Identify p, q
• Draw a residual plot
• Predict the sales for Mar-2013
                                                                 42
Venkat Reddy Konasani
Manager at Trendwise Analytics
venkat@TrendwiseAnalytics.com
21.venkat@gmail.com




                                       Venkat Reddy
                                 Data Analysis Course
+91 9886 768879




                                     43

Más contenido relacionado

La actualidad más candente

Lesson 2 stationary_time_series
Lesson 2 stationary_time_seriesLesson 2 stationary_time_series
Lesson 2 stationary_time_seriesankit_ppt
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)Kumar P
 
ForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential SmoothingForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential SmoothingDeepThought, Inc.
 
Time Series - Auto Regressive Models
Time Series - Auto Regressive ModelsTime Series - Auto Regressive Models
Time Series - Auto Regressive ModelsBhaskar T
 
Arima model
Arima modelArima model
Arima modelJassika
 
Exponential smoothing
Exponential smoothingExponential smoothing
Exponential smoothingJairo Moreno
 
Time-series Analysis in Minutes
Time-series Analysis in MinutesTime-series Analysis in Minutes
Time-series Analysis in MinutesOrzota
 
Decision Trees
Decision TreesDecision Trees
Decision TreesStudent
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionsaba khan
 
Lesson 5 arima
Lesson 5 arimaLesson 5 arima
Lesson 5 arimaankit_ppt
 
Time Series, Moving Average
Time Series, Moving AverageTime Series, Moving Average
Time Series, Moving AverageSOMASUNDARAM T
 
Time Series Analysis.pptx
Time Series Analysis.pptxTime Series Analysis.pptx
Time Series Analysis.pptxSunny429247
 
Mba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisMba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisChandra Kodituwakku
 
Forecasting exponential smoothing
Forecasting exponential smoothingForecasting exponential smoothing
Forecasting exponential smoothingDoiyan
 
Box jenkins method of forecasting
Box jenkins method of forecastingBox jenkins method of forecasting
Box jenkins method of forecastingEr. Vaibhav Agarwal
 

La actualidad más candente (20)

Time series
Time seriesTime series
Time series
 
Lesson 2 stationary_time_series
Lesson 2 stationary_time_seriesLesson 2 stationary_time_series
Lesson 2 stationary_time_series
 
Time series Analysis
Time series AnalysisTime series Analysis
Time series Analysis
 
Arima model (time series)
Arima model (time series)Arima model (time series)
Arima model (time series)
 
ForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential SmoothingForecastIT 3. Simple Exponential Smoothing
ForecastIT 3. Simple Exponential Smoothing
 
Time Series - Auto Regressive Models
Time Series - Auto Regressive ModelsTime Series - Auto Regressive Models
Time Series - Auto Regressive Models
 
Arima model
Arima modelArima model
Arima model
 
Exponential smoothing
Exponential smoothingExponential smoothing
Exponential smoothing
 
Time-series Analysis in Minutes
Time-series Analysis in MinutesTime-series Analysis in Minutes
Time-series Analysis in Minutes
 
Time series slideshare
Time series slideshareTime series slideshare
Time series slideshare
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Time series
Time seriesTime series
Time series
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Lesson 5 arima
Lesson 5 arimaLesson 5 arima
Lesson 5 arima
 
Time Series, Moving Average
Time Series, Moving AverageTime Series, Moving Average
Time Series, Moving Average
 
Time Series Analysis.pptx
Time Series Analysis.pptxTime Series Analysis.pptx
Time Series Analysis.pptx
 
Mba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysisMba 532 2011_part_3_time_series_analysis
Mba 532 2011_part_3_time_series_analysis
 
Time series
Time seriesTime series
Time series
 
Forecasting exponential smoothing
Forecasting exponential smoothingForecasting exponential smoothing
Forecasting exponential smoothing
 
Box jenkins method of forecasting
Box jenkins method of forecastingBox jenkins method of forecasting
Box jenkins method of forecasting
 

Similar a Data Analysis Time Series Forecasting Course

Data Collection Preparation
Data Collection PreparationData Collection Preparation
Data Collection PreparationBusiness Student
 
Es 08 forecasting topic final
Es 08 forecasting topic finalEs 08 forecasting topic final
Es 08 forecasting topic finalTim Arroyo
 
TRend analysis os time series data .pptx
TRend analysis os time series data .pptxTRend analysis os time series data .pptx
TRend analysis os time series data .pptxAnupPadhy
 
Ops management lecture 2 forecasting
Ops management lecture 2 forecastingOps management lecture 2 forecasting
Ops management lecture 2 forecastingjillmitchell8778
 
Time Series Analysis and Forecasting.ppt
Time Series Analysis and Forecasting.pptTime Series Analysis and Forecasting.ppt
Time Series Analysis and Forecasting.pptssuser220491
 
Strategies for SQL Server Index Analysis
Strategies for SQL Server Index AnalysisStrategies for SQL Server Index Analysis
Strategies for SQL Server Index AnalysisJason Strate
 
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...WilfredGitaari2
 
Smoothing method,abshor marantika, kelompok 10
Smoothing method,abshor marantika, kelompok 10Smoothing method,abshor marantika, kelompok 10
Smoothing method,abshor marantika, kelompok 10muhammadfadhil105
 
Machine learning & Time Series Analysis , Finlab CTO 韓承佑
Machine learning & Time Series Analysis ,  Finlab CTO 韓承佑Machine learning & Time Series Analysis ,  Finlab CTO 韓承佑
Machine learning & Time Series Analysis , Finlab CTO 韓承佑TaiLiLuo
 
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxDATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxrandyburney60861
 
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptx
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptxModule_1___Analytical_Thinking___Problem_Solving.ppt.pptx
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptxsolomonvijayanand2
 

Similar a Data Analysis Time Series Forecasting Course (20)

Descriptive statistics
Descriptive statisticsDescriptive statistics
Descriptive statistics
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Data analysis Design Document
Data analysis Design DocumentData analysis Design Document
Data analysis Design Document
 
Multiple regression
Multiple regressionMultiple regression
Multiple regression
 
Data Collection Preparation
Data Collection PreparationData Collection Preparation
Data Collection Preparation
 
Es 08 forecasting topic final
Es 08 forecasting topic finalEs 08 forecasting topic final
Es 08 forecasting topic final
 
TRend analysis os time series data .pptx
TRend analysis os time series data .pptxTRend analysis os time series data .pptx
TRend analysis os time series data .pptx
 
Ops management lecture 2 forecasting
Ops management lecture 2 forecastingOps management lecture 2 forecasting
Ops management lecture 2 forecasting
 
Time Series Analysis and Forecasting.ppt
Time Series Analysis and Forecasting.pptTime Series Analysis and Forecasting.ppt
Time Series Analysis and Forecasting.ppt
 
Strategies for SQL Server Index Analysis
Strategies for SQL Server Index AnalysisStrategies for SQL Server Index Analysis
Strategies for SQL Server Index Analysis
 
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...
Forecasting and Quantification - Presentation - EMBA Degree Individual Assign...
 
Smoothing method,abshor marantika, kelompok 10
Smoothing method,abshor marantika, kelompok 10Smoothing method,abshor marantika, kelompok 10
Smoothing method,abshor marantika, kelompok 10
 
Rm presentation
Rm presentationRm presentation
Rm presentation
 
Rm presentation
Rm presentationRm presentation
Rm presentation
 
Machine learning & Time Series Analysis , Finlab CTO 韓承佑
Machine learning & Time Series Analysis ,  Finlab CTO 韓承佑Machine learning & Time Series Analysis ,  Finlab CTO 韓承佑
Machine learning & Time Series Analysis , Finlab CTO 韓承佑
 
Machine learning & Time Series Analysis
Machine learning & Time Series AnalysisMachine learning & Time Series Analysis
Machine learning & Time Series Analysis
 
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docxDATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
DATA SCIENCE AND BIG DATA ANALYTICSCHAPTER 2 DATA ANA.docx
 
Qa improvement
Qa improvementQa improvement
Qa improvement
 
Basic forecasting
Basic forecastingBasic forecasting
Basic forecasting
 
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptx
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptxModule_1___Analytical_Thinking___Problem_Solving.ppt.pptx
Module_1___Analytical_Thinking___Problem_Solving.ppt.pptx
 

Más de Venkata Reddy Konasani

Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Venkata Reddy Konasani
 
Model selection and cross validation techniques
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniquesVenkata Reddy Konasani
 
Table of Contents - Practical Business Analytics using SAS
Table of Contents - Practical Business Analytics using SAS Table of Contents - Practical Business Analytics using SAS
Table of Contents - Practical Business Analytics using SAS Venkata Reddy Konasani
 
Learning Tableau - Data, Graphs, Filters, Dashboards and Advanced features
Learning Tableau -  Data, Graphs, Filters, Dashboards and Advanced featuresLearning Tableau -  Data, Graphs, Filters, Dashboards and Advanced features
Learning Tableau - Data, Graphs, Filters, Dashboards and Advanced featuresVenkata Reddy Konasani
 
Data exploration validation and sanitization
Data exploration validation and sanitizationData exploration validation and sanitization
Data exploration validation and sanitizationVenkata Reddy Konasani
 

Más de Venkata Reddy Konasani (20)

Transformers 101
Transformers 101 Transformers 101
Transformers 101
 
Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science Machine Learning Deep Learning AI and Data Science
Machine Learning Deep Learning AI and Data Science
 
Model selection and cross validation techniques
Model selection and cross validation techniquesModel selection and cross validation techniques
Model selection and cross validation techniques
 
Neural Network Part-2
Neural Network Part-2Neural Network Part-2
Neural Network Part-2
 
GBM theory code and parameters
GBM theory code and parametersGBM theory code and parameters
GBM theory code and parameters
 
Neural Networks made easy
Neural Networks made easyNeural Networks made easy
Neural Networks made easy
 
Decision tree
Decision treeDecision tree
Decision tree
 
Step By Step Guide to Learn R
Step By Step Guide to Learn RStep By Step Guide to Learn R
Step By Step Guide to Learn R
 
Credit Risk Model Building Steps
Credit Risk Model Building StepsCredit Risk Model Building Steps
Credit Risk Model Building Steps
 
Table of Contents - Practical Business Analytics using SAS
Table of Contents - Practical Business Analytics using SAS Table of Contents - Practical Business Analytics using SAS
Table of Contents - Practical Business Analytics using SAS
 
SAS basics Step by step learning
SAS basics Step by step learningSAS basics Step by step learning
SAS basics Step by step learning
 
Testing of hypothesis case study
Testing of hypothesis case study Testing of hypothesis case study
Testing of hypothesis case study
 
L101 predictive modeling case_study
L101 predictive modeling case_studyL101 predictive modeling case_study
L101 predictive modeling case_study
 
Learning Tableau - Data, Graphs, Filters, Dashboards and Advanced features
Learning Tableau -  Data, Graphs, Filters, Dashboards and Advanced featuresLearning Tableau -  Data, Graphs, Filters, Dashboards and Advanced features
Learning Tableau - Data, Graphs, Filters, Dashboards and Advanced features
 
Machine Learning for Dummies
Machine Learning for DummiesMachine Learning for Dummies
Machine Learning for Dummies
 
Online data sources for analaysis
Online data sources for analaysis Online data sources for analaysis
Online data sources for analaysis
 
A data analyst view of Bigdata
A data analyst view of Bigdata A data analyst view of Bigdata
A data analyst view of Bigdata
 
R- Introduction
R- IntroductionR- Introduction
R- Introduction
 
Cluster Analysis for Dummies
Cluster Analysis for DummiesCluster Analysis for Dummies
Cluster Analysis for Dummies
 
Data exploration validation and sanitization
Data exploration validation and sanitizationData exploration validation and sanitization
Data exploration validation and sanitization
 

Último

Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 

Último (20)

YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 

Data Analysis Time Series Forecasting Course

  • 1. Data Analysis Course Time Series Analysis & Forecasting(Version-1) Venkat Reddy
  • 2. Data Analysis Course • Data analysis design document • Introduction to statistical data analysis • Descriptive statistics • Data exploration, validation & sanitization • Probability distributions examples and applications Venkat Reddy Data Analysis Course • Simple correlation and regression analysis • Multiple liner regression analysis • Logistic regression analysis • Testing of hypothesis • Clustering and Decision trees • Time series analysis and forecasting • Credit Risk Model building-1 2 • Credit Risk Model building-2
  • 3. Note • This presentation is just class notes. The course notes for Data Analysis Training is by written by me, as an aid for myself. • The best way to treat this is as a high-level summary; the actual session went more in depth and contained other Venkat Reddy Data Analysis Course information. • Most of this material was written as informal notes, not intended for publication • Please send questions/comments/corrections to venkat@trenwiseanalytics.com or 21.venkat@gmail.com • Please check my website for latest version of this document -Venkat Reddy 3
  • 4. Contents • What is a Time Series • Applications of Time Series Analysis • Time series model building & Forecasting Methodologies • TCSI Method Venkat Reddy Data Analysis Course • Components of time series • Goodness of fit • Forecasting using TCSI model • ARIMA • Main steps in ARIMA • Goodness of fit • Forecasting using ARIMA model 4
  • 5. What is a Time Series • Time series data is a sequence of observations collected from a process with equally spaced periods of time. • Examples • Dow Jones Industrial Averages Venkat Reddy Data Analysis Course • Daily data on sales • Monthly inventory • Daily Customers , • Monthly interest rates, costs • Monthly unemployment rates, • Weekly measures of money supply, • Daily closing prices of stock indices, and so on 5
  • 6. Forecasting Methodologies • There are many different time series techniques. • It is usually impossible to know which technique will be best for a particular data set. • It is customary to try out several different techniques and Venkat Reddy Data Analysis Course select the one that seems to work best. • To be an effective time series modeler, you need to keep several time series techniques in your “tool box.” • Simple ideas • Moving averages • TSI method • Complex statistical concepts • Box-Jenkins methodology 6
  • 7. Building Model Using TSI Method • Very Simple technique • Spreadsheet is sufficient • Less ambiguous • Easy to understand & Interpret Venkat Reddy Data Analysis Course • Serves the purpose most of the times 7
  • 8. Components of a Time series 1. Secular Trend(T): Gradual long term movement(up or down). Easiest to detect • Eg: Population growth In India 2. Cyclical Patterns(C): Results from events recurrent but not periodic in nature. An up-and-down repetitive movement in demand. repeats itself over a long period of time Venkat Reddy Data Analysis Course • Eg. Recession in US Economy 3. Seasonal Pattern(S): Results from events that are periodic and recurrent in nature. An up-and-down repetitive movement within a trend occurring periodically. Often weather related but could be daily or weekly occurrence • Eg. Sales in festive seasons 4. Irregular Component(I): Disturbances or residual variation that remain after all the other behaviors have been accounted for. Erratic movements that are not predictable because they do not follow a pattern 8 • Eg. Earthquake
  • 9. Building Time Series Model: TCSI O(t) = T(t) + S(t) + I(t) or O(t) = T(t) * S(t) * I(t) Where O(t) = observed series, T(t) = Trend component, S(t) = Seasonal , I(t) = Irregular component  Step 1 : Smooth the series & de trend the series Venkat Reddy Data Analysis Course  Step 2 : Find out Seasonal component and adjust the data for seasonality  Step 3 : See if there is still some trend/seasinality in the data & quantify it 9
  • 10. Step-1: Smoothening the series • Why Smoothening ? Basically to find average value and detrend the series • After removing the effect of trends series is left with seasonal and irregular components Moving Average • Series of arithmetic means, Used only for smoothing. Provides overall impression of data over time Venkat Reddy Data Analysis Course • MA(3)= (yt + yt-1 + yt-2)/3 Actual Series Series after Removing the Trend 10
  • 11. Smoothening the series Exponential Smoothing • Form of weighted moving average. Weights decline exponentially and most recent data weighted most • Requires smoothing constant (W). Ranges from 0 to 1, Subjectively chosen • Involves little record keeping of past data Venkat Reddy Data Analysis Course Ei = W·Yi + (1 - W)·Ei-1 Actual Series Series after Removing the Trend 11
  • 12. Step-2 : Capturing Seasonality - Seasonal indices Year Quarter Sales 1 Q1 362 1 Q2 385 1 Q3 432 1 Q4 678 2 Q1 382 2 Q2 409 2 Q3 498 2 Q4 642 Venkat Reddy Data Analysis Course 3 Q1 473 3 Q2 513 3 Q3 582 3 Q4 789 4 Q1 544 4 Q2 582 4 Q3 681 4 Q4 899 5 Q1 628 • Quantify the effect of seasonality in each year 5 707 5 Q2 Q3 773 • Calculate overall average for each quarter for across 5 Q4 1008 all years • De seasionalize the dataset with by diving with seasonal indices 12
  • 13. Seasonal Indices & De Seasionalising Step 1 : Reformating the Dataset Qtr 1 Qtr 2 Qtr 3 Qtr 4 Y1 362 385 432 678 Y2 382 409 498 642 Y3 473 513 582 789 Y4 544 582 681 899 Y5 628 707 773 1008 Venkat Reddy Data Analysis Course Step 2 : Calculation Of Seasonal Indices Qtr 1 Qtr 2 Qtr 3 Qtr 4 Y1 0.78 0.83 0.93 1.46 Y2 0.79 0.85 1.03 1.33 Y3 0.80 0.87 0.99 1.34 Y4 0.80 0.86 1.01 1.33 Y5 0.81 0.91 0.99 1.29 SI 0.80 0.86 0.99 1.35 Step 3: Deseanolised Dataset Qtr 1 Qtr 2 Qtr 3 Qtr 4 Y1 454 446 436 502 Y2 479 474 503 475 Y3 594 594 588 584 13 Y4 683 674 688 666 Y5 788 819 781 746
  • 14. Step-3: Irregular Component • Irregular No Trend  Cant be modeled • After first level of de trending & de seasionalising, if there is still any pattern left in the time series, repeat the above steps once again Venkat Reddy Data Analysis Course 14
  • 15. Lab • Download the data from here • Draw the trend chart, can you see trend • Calculate MA2 and MA3 • De-trend the series and create the graph again Venkat Reddy Data Analysis Course • Is there any trend now? • Can you judge the seasonality in de trended data? • Find seasonality indices for each quarter • De-seasinalise the series by diving with seasonality indices • Forecast sales for next four months, one by one • First assume I as 1, multiply with S, multiply with MA 15
  • 16. Residual Analysis • Residual analysis is done after the model is built • There should be no Pattern (information) left in the residuals. • Residual should be totally random Venkat Reddy Desired Pattern Data Analysis Course Trend Not Fully Accounted for Error Error 0 0 Time (Years) Time (Years) 16
  • 17. Residual Analysis e e 0 0 Venkat Reddy Data Analysis Course T T Random errors Cyclical effects not accounted for e e 0 0 17 T T Trend not accounted for Seasonal effects not accounted for
  • 18. Goodness of fit • We need a way to compare different time series techniques for a given data set. • Four common techniques are the: n ˆ Yi  Yi MAD =  Venkat Reddy Data Analysis Course • Mean absolute deviation, i 1 n n ˆ Yi  Yi • Mean absolute percent error 100 MAPE =  n i 1 Yi • Mean square error, MSE =  Y  Y  n ˆ i i 2 i 1 n • Root mean square error. RMSE  MSE 18
  • 19. Additive and Multiplicative. What model to use? • Is there any striking difference between these two time series? Venkat Reddy Data Analysis Course 19
  • 20. Additive and Multiplicative. What model to use? • In many time series, the amplitude of both the seasonal and irregular variations increase as the level of the trend rises. In this situation, a multiplicative model is usually appropriate. Venkat Reddy Data Analysis Course • In some time series, the amplitude of both the seasonal and irregular variations do not change as the level of the trend rises or falls. In such cases, an additive model is appropriate. 20
  • 21. Lab-2 • Find the error at each point in above problem • Conduct the residual analysis for problem in lab-1 Data Set-2: Predicting the Stock Price • Download Coal India Limited Stock data from Yahoo finance Venkat Reddy Data Analysis Course • De trend it by calculating MA5(one week) • Use day function and pivot table to identify seasonality interval • Are 5-26 & 26 to 4 are two seasons in every month ? • Find the seasonal indices and remove the seasonality • Conduct residual analysis • Forecast future stock price values, buy and sell at the right 21 time and become rich
  • 22. Time Series Models : Type 2 • Autoregressive (AR) process: series current values depend on its own previous values • Moving average (MA) process: The current deviation from mean depends on previous deviations • Autoregressive Moving average (ARMA) process Venkat Reddy Data Analysis Course • Autoregressive Integrated Moving average (ARIMA)process. • ARIMA is also known as Box-Jenkins approach. It is popular because of its generality; It can handle any series, with or without seasonal elements, and it has well-documented computer programs 22
  • 23. AR Process and MA Process Data Analysis Course 23 Venkat Reddy
  • 24. ARIMA Model Autoregressive Integrated Moving Average → AR filter → Integration filter → MA filter → εt (long term) (stochastic trend) (short term) (white noise error) Venkat Reddy Data Analysis Course ARIMA (2,0,1) yt = a1yt-1 + a2yt-2 + εt + b1εt-1 ARIMA (3,0,1) yt = a1yt-1 + a2yt-2 + a3yt-3+ εt + b1εt-1 ARIMA (1,1,0) Δyt = a1 Δ yt-1 + εt , where Δyt = yt - yt-1 ARIMA (2,1,0) Δyt = a1 Δ yt-1 + a2Δ yt-2 + εt where Δyt = yt - yt-1 ARIMA (2,1,1) =? ARIMA (2,1,2) =? ARIMA (3,1,2) =? ARIMA(1,0,0) =? ARIMA(2,0,0) =? ARIMA(0,0,1) =? ARIMA(0,0,2) =? To build a time series model issuing ARIMA, we need to study the time 24 series and identify p,d,q…..…simple :P
  • 25. ARIMA (p,d,q) modeling To build a time series model issuing ARIMA, we need to study the time series and identify p,d,q • Identification: • Determine the appropriate values of p, d, & q using the ACF, PACF, and unit root tests Venkat Reddy Data Analysis Course • p is the AR order, d is the integration order, q is the MA order • Estimation : • Estimate an ARIMA model using values of p, d, & q you think are appropriate. • Diagnostic checking: • Check residuals of estimated ARIMA model(s) to see if they are white noise; pick best model with well behaved residuals. • Forecasting: • Produce out of sample forecasts or set aside last few data points for 25 in-sample forecasting.
  • 26. The Box-Jenkins Approach Differencing the Estimate the series to achieve Identify the model parameters of the stationary model Venkat Reddy Data Analysis Course Diagnostic checking. No Is the model adequate? 26 Use Model for forecasting Yes
  • 27. The time series has to be Stationary Processes • In order to model a time series with the Box-Jenkins approach, the series has to be stationary • In practical terms, the series is stationary if tends to wonder more or less uniformly about some fixed level Venkat Reddy Data Analysis Course • In statistical terms, a stationary process is assumed to be in a particular state of statistical equilibrium, i.e., p(xt) is the same for all t • In particular, if zt is a stationary process, then the first difference zt = zt - zt-1and higher differences  zt are d stationary 27 • BTW …..Most time series are nonstationary
  • 28. 1 3 2 Some non stationary series Data Analysis Course 28 Venkat Reddy
  • 29. Achieving Stationarity • Regular differencing (RD) (1st order) xt = (1 – B)xt = xt – xt-1 (2nd order) 2xt = (1 – B)2xt = xt – 2xt-1 + xt-2 Venkat Reddy Data Analysis Course “B” is the backward shift operator • It is unlikely that more than two regular differencing would ever be needed • Sometimes regular differencing by itself is not sufficient and prior transformation is also needed 29
  • 30. Differentiation Actual Series Venkat Reddy Data Analysis Course Series After Differentiation 30
  • 31. Identification of orders p and q • Identification starts with d • ARIMA(p,d,q) • What is Integration here? Venkat Reddy Data Analysis Course • First we need to make the time series stationary • We need to learn about ACF & PACF to identify p,q • Once we are working with a stationary time series, we can examine the ACF and PACF to help identify the proper number of lagged y (AR) terms and ε (MA) terms. 31
  • 32. Autocorrelation Function (ACF) • Correlation with lag-1, lag2, lag3 etc., • The ACF represents the degree of persistence over respective lags of a variable. ρk = γk / γ0 = covariance at lag k/ variance Venkat Reddy Data Analysis Course ρk = E[(yt – μ)(yt-k – μ)]2 E[(yt – μ)2] ACF (0) = 1, ACF (k) = ACF (-k) 32
  • 33. ACF Graph 1.00 0.50 Venkat Reddy Data Analysis Course 0.00 -0.50 0 10 20 30 40 Lag Bartlett's formula for MA(q) 95% confidence bands 33
  • 34. Partial Autocorrelation Function (PACF) • Partial regression coefficient - • The lag k partial autocorrelation is the partial regression coefficient, θkk in the kth order autoregression • yt = θk1yt-1 + θk2yt-2 + …+ θkkyt-k + εt Venkat Reddy Data Analysis Course • Partial correlation measures the degree of association between two random variables, with the effect of a set of controlling random variables removed. 34
  • 35. PACF Graph 1.00 0.50 Venkat Reddy Data Analysis Course 0.00 -0.50 0 10 20 30 40 Lag 95% Confidence bands [se = 1/sqrt(n)] 35
  • 36. Identification of AR Processes & its order -p • What is AR procerss • For AR models, the ACF will dampen exponentially • The PACF will identify the order of the AR model: • The AR(1) model (yt = a1yt-1 + εt) would have one significant spike at lag 1 on the PACF. • The AR(3) model (yt = a1yt-1+a2yt-2+a3yt-3+εt) would have significant spikes on the PACF at lags 1, 2, & 3. Venkat Reddy Data Analysis Course 36
  • 37. Identification of MA Processes & its order - q • Recall that a MA(q) can be represented as an AR(∞), thus we expect the opposite patterns for MA processes. • The PACF will dampen exponentially. • The ACF will be used to identify the order of the MA process. • MA(1) (yt = εt + b1 εt-1) has one significant spike in the ACF at lag 1. • MA (3) (yt = εt + b1 εt-1 + b2 εt-2 + b3 εt-3) has three significant spikes in Venkat Reddy Data Analysis Course the ACF at lags 1, 2, & 3. 37
  • 38. Parameter Estimate • We already know the model equation. AR(1,0,0) or AR(2,1,0) or ARIMA(2,1,1) • We need to estimate the coefficients using Least squares. Minimizing the sum of squares of deviations Venkat Reddy Data Analysis Course 38
  • 39. Interpreting Coefficients • If we include lagged variables for the dependent variable in an OLS model, we cannot simply interpret the β coefficients in the standard way. • Consider the model, Yt = a0 + a1Yt-1 + b1Xt + εt • The effect of Xt on Yt occurs in period t, but also influences Yt Venkat Reddy Data Analysis Course in period t+1 because we include a lagged value of Yt-1 in the model. • To capture these effects, we must calculate multipliers (impact, interim, total) or mean/median lags (how long it takes for the average effect to occur). 39
  • 40. How good is my model? • Does our model really give an adequate description of the data • Two criteria to check the goodness of fit • Akaike information criterion (AIC) Venkat Reddy Data Analysis Course • Schwartz Bayesiancriterion (SBC)/Bayesian information criterion (BIC). 40
  • 41. Goodness of fit • Remember… Residual analysis and Mean deviation, Mean Absolute Deviation and Root Mean Square errors? • Four common techniques are the: n ˆ Yi  Yi MAD =  Venkat Reddy Data Analysis Course • Mean absolute deviation, i 1 n n ˆ Yi  Yi • Mean absolute percent error 100 MAPE =  n i 1 Yi • Mean square error, MSE =  Y  Y  n ˆ i i 2 i 1 n • Root mean square error. RMSE  MSE 41
  • 42. Lab • Download time_slales data • Draw the trend graph • Test the stationary using stationarity=(DICKEY) option • Fit a time series model Venkat Reddy Data Analysis Course • Correlograms • Do we need to differentiate the model? • Test the stationary using stationarity again • Identify p, q • Draw a residual plot • Predict the sales for Mar-2013 42
  • 43. Venkat Reddy Konasani Manager at Trendwise Analytics venkat@TrendwiseAnalytics.com 21.venkat@gmail.com Venkat Reddy Data Analysis Course +91 9886 768879 43