Time Series Analysis in Python with statsmodels

                   Wes McKinney (1)              Josef Perktold (2)            Skipper Seabold (3)

                   (1) Department of Statistical Science, Duke University
                   (2) Department of Economics, University of North Carolina at Chapel Hill
                   (3) Department of Economics, American University


                       10th Python in Science Conference, 13 July 2011



McKinney, Perktold, Seabold (statsmodels)        Python Time Series Analysis          SciPy Conference 2011   1 / 29
What is statsmodels?




          A library for statistical modeling, implementing standard statistical
          models in Python using NumPy and SciPy
          Includes:
                  Linear (regression) models of many forms
                  Descriptive statistics
                  Statistical tests
                  Time series analysis
                  ...and much more




What is Time Series Analysis?




          Statistical modeling of time-ordered data observations
          Inferring structure, forecasting and simulation, and testing
          distributional assumptions about the data
          Modeling dynamic relationships among multiple time series
          Broad applications e.g. in economics, finance, neuroscience, signal
          processing...




Talk Overview



          Brief update on statsmodels development
          Aside: user interface and data structures
          Descriptive statistics and tests
          Auto-regressive moving average models (ARMA)
          Vector autoregression (VAR) models
          Filtering tools (Hodrick-Prescott and others)
          Near future: Bayesian dynamic linear models (DLMs), ARCH /
          GARCH volatility models and beyond




Statsmodels development update



          We’re now on GitHub! Join us:

                         http://github.com/statsmodels/statsmodels

          Check out the slick Sphinx docs:

                                http://statsmodels.sourceforge.net

          Development focus has been largely computational, i.e. writing
          correct, tested implementations of all the common classes of
          statistical models




Statsmodels development update




          Major work to be done on providing a nice integrated user interface
          We must work together to close the gap between R and Python!
          Some important areas:
                  Formula framework, for specifying model design matrices
                  Need integrated rich statistical data structures (pandas)
                  Data visualization of results should always be a few keystrokes away
                  Write a “Statsmodels for R users” guide




Aside: statistical data structures and user interface



          While I have a captive audience...
          Controversial fact: pandas is the only Python library currently
          providing data structures matching (and in many places exceeding)
          the richness of R’s data structures (for statistics)
                  Let’s have a BoF session so I can justify this statement
          Feedback I hear is that end users find the fragmented, incohesive set
          of Python tools for data analysis and statistics to be confusing,
          frustrating, and certainly not compelling them to use Python...
                  (Not to mention the packaging headaches)




Aside: statistical data structures and user interface




          We need to “commit” ASAP (not 12 months from now) to a high
          level data structure(s) as the “primary data structure(s) for statistical
          data analysis” and communicate that clearly to end users
                  Or we might as well all start programming in R...




Example data: EEG trace data


          [Figure: time series plot of the EEG trace]




Example data: Macroeconomic data


          [Figure: three stacked panels showing the cpi, m1, and realgdp series, 1960 to 2008]




Example data: Stock data


          [Figure: stock data series for AAPL, GOOG, MSFT, and YHOO, 2001 to 2009]




Descriptive statistics
            Autocorrelation, partial autocorrelation plots
            Commonly used for identification in ARMA(p,q) and ARIMA(p,d,q)
            models
            acf = tsa.acf(eeg, 50)
            pacf = tsa.pacf(eeg, 50)

          [Figure: two panels, "Autocorrelation" and "Partial Autocorrelation", for the EEG series at lags 0 to 50]
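
As a rough illustration of what an autocorrelation plot displays, the sketch below estimates sample autocorrelations in plain Python. The function name `sample_acf` and the toy series are assumptions made for this example; it is not the statsmodels implementation, which may normalize differently.

```python
def sample_acf(x, nlags):
    """Sample autocorrelations r_0..r_nlags using the biased (1/n) estimator."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev) / n  # lag-0 autocovariance
    acf = []
    for k in range(nlags + 1):
        # lag-k autocovariance, still divided by n rather than n - k
        ck = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf


# Toy series (hypothetical data): a strict alternation, so odd lags are
# strongly negative and even lags strongly positive.
series = [1.0, -1.0] * 20
r = sample_acf(series, 3)
```

For ARMA identification, the usual reading is that a slowly decaying ACF with a sharp cutoff in the PACF suggests an AR component, and the reverse suggests an MA component.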

Statistical tests




          Ljung-Box test for zero autocorrelation
          Unit root test for cointegration (Augmented Dickey-Fuller test)
          Granger-causality
          Whiteness (iid-ness) and normality
          See our conference paper (when the proceedings get published!)
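
To make the first test in this list concrete, here is a minimal sketch of the Ljung-Box statistic, Q = n(n+2) * sum over k = 1..h of r_k^2 / (n - k), written in plain Python. The function name `ljung_box_q` and the toy data are assumptions for illustration, and the sketch stops at the statistic itself rather than computing a p-value.

```python
def ljung_box_q(x, nlags):
    """Ljung-Box Q statistic over lags 1..nlags."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, nlags + 1):
        # sample autocorrelation at lag k
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy series (hypothetical data) with very strong lag-1 dependence
series = [1.0, -1.0] * 20
q1 = ljung_box_q(series, 1)
```

Under the null of zero autocorrelation, Q is compared with a chi-square distribution with `nlags` degrees of freedom; at the 5% level with one lag the critical value is about 3.84, so this toy series rejects the null decisively.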




Autoregressive moving average (ARMA) models
          One of the most common univariate time series models:

                   $y_t = \mu + a_1 y_{t-1} + \dots + a_p y_{t-p} + \epsilon_t + b_1 \epsilon_{t-1} + \dots + b_q \epsilon_{t-q}$

                   where $E(\epsilon_t, \epsilon_s) = 0$ for $t \neq s$ and $\epsilon_t \sim N(0, \sigma^2)$


          Exact log-likelihood can be evaluated via the Kalman filter, but the
          “conditional” likelihood is easier and commonly used
          statsmodels has tools for simulating ARMA processes with known
          coefficients $a_i$, $b_i$ and also estimation given specified lag orders
              import scikits.statsmodels.tsa.arima_process as ap
              # lag-polynomial coefficients, each including the zero-lag term
              ar_coef = [1, .75, -.25]
              ma_coef = [1, -.5]
              nobs = 100
              y = ap.arma_generate_sample(ar_coef, ma_coef, nobs)
              y += 4  # add in constant
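
For readers who want to see the recursion behind such a simulation, the following sketch generates an ARMA(p, q) sample directly from the model equation, using only the standard library. The function `simulate_arma`, its parameters, and the coefficient values are assumptions for illustration. Note that here `ar` and `ma` hold the coefficients exactly as they appear in the equation, whereas `arma_generate_sample` takes lag-polynomial coefficients, so signs may differ between the two conventions.

```python
import random


def simulate_arma(ar, ma, nobs, mu=0.0, sigma=1.0, burnin=100, seed=0):
    """Simulate y_t = mu + sum_i ar[i]*y_{t-1-i} + e_t + sum_j ma[j]*e_{t-1-j}."""
    rng = random.Random(seed)
    p, q = len(ar), len(ma)
    y, e = [], []
    for t in range(nobs + burnin):
        e_t = rng.gauss(0.0, sigma)
        y_t = mu + e_t
        for i in range(min(p, t)):   # autoregressive part
            y_t += ar[i] * y[t - 1 - i]
        for j in range(min(q, t)):   # moving average part
            y_t += ma[j] * e[t - 1 - j]
        y.append(y_t)
        e.append(e_t)
    # Discard the burn-in so the start-up values have little influence
    return y[burnin:]


# Hypothetical stationary ARMA(2, 1) coefficients
y = simulate_arma(ar=[0.5, -0.2], ma=[0.4], nobs=200)
```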

ARMA Estimation



          Several likelihood-based estimators implemented (see docs)
              model = tsa.ARMA(y)
              result = model.fit(order=(2, 1), trend='c',
                                 method='css-mle', disp=-1)
              result.params
              # array([ 3.97, -0.97, -0.05, -0.13])


          Standard model diagnostics, standard errors, information criteria
          (AIC, BIC, ...), etc. are available in the returned ARMAResults object
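
The "conditional" likelihood mentioned for ARMA models can be sketched in a few lines: condition on the first p observations, set pre-sample errors to zero, and run the residual recursion. The function `css_loglike` and the toy inputs below are assumptions for illustration, not the estimator statsmodels uses internally.

```python
import math


def css_loglike(y, mu, ar, ma, sigma2):
    """Conditional Gaussian log-likelihood of an ARMA(p, q) model.

    The first p observations are conditioned on and pre-sample errors are
    taken to be zero, so the residuals follow from a simple recursion.
    """
    p, q = len(ar), len(ma)
    resid = [0.0] * len(y)
    for t in range(p, len(y)):
        pred = mu
        for i in range(p):
            pred += ar[i] * y[t - 1 - i]
        for j in range(q):
            if t - 1 - j >= 0:
                pred += ma[j] * resid[t - 1 - j]
        resid[t] = y[t] - pred
    n_eff = len(y) - p
    ssr = sum(r * r for r in resid[p:])
    return -0.5 * n_eff * math.log(2 * math.pi * sigma2) - ssr / (2 * sigma2)


# Toy data (hypothetical) generated exactly by y_t = 1 + 0.5 * y_{t-1}
y_demo = [1.0, 1.5, 1.75]
best = css_loglike(y_demo, mu=1.0, ar=[0.5], ma=[], sigma2=1.0)
worse = css_loglike(y_demo, mu=1.0, ar=[0.0], ma=[], sigma2=1.0)
```

Maximizing this function over the parameters gives a conditional estimate; one plausible reading of the `css-mle` option is that such an estimate seeds the exact likelihood maximization, though that reading is an assumption here.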




Vector Autoregression (VAR) models



          Widely used model for modeling multiple ($K$-variate) time series,
          especially in macroeconomics:

                           $Y_t = A_1 Y_{t-1} + \dots + A_p Y_{t-p} + \epsilon_t, \quad \epsilon_t \sim N(0, \Sigma)$

          Matrices $A_i$ are $K \times K$.
          $Y_t$ must be a stationary process (sometimes achieved by
          differencing). Related class of models (VECM) for modeling
          nonstationary (including cointegrated) processes




Vector Autoregression (VAR) models

   >>> model = VAR(data); model.select_order(8)
                    VAR Order Selection
   =====================================================
              aic          bic          fpe         hqic
   -----------------------------------------------------
   0       -27.83       -27.78    8.214e-13       -27.81
   1       -28.77       -28.57    3.189e-13       -28.69
   2       -29.00      -28.64*    2.556e-13       -28.85
   3       -29.10       -28.60    2.304e-13      -28.90*
   4       -29.09       -28.43    2.330e-13       -28.82
   5       -29.13       -28.33    2.228e-13       -28.81
   6      -29.14*       -28.18   2.213e-13*       -28.75
   7       -29.07       -27.96    2.387e-13       -28.62
   =====================================================
   * Minimum
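
The four columns of the table can be sketched from the log-determinant of the residual covariance estimate. The formulas below are the common textbook forms found in standard treatments of VAR order selection; the function name and the choice to count only lag coefficients are assumptions, and the values statsmodels prints may rest on a different parameter count.

```python
import math


def var_information_criteria(log_det_sigma, nobs, neqs, lags):
    """Textbook order-selection criteria for a VAR(p) with K = neqs equations."""
    k = lags * neqs ** 2  # number of lag coefficients across all equations
    aic = log_det_sigma + 2.0 * k / nobs
    bic = log_det_sigma + math.log(nobs) * k / nobs
    hqic = log_det_sigma + 2.0 * math.log(math.log(nobs)) * k / nobs
    ratio = (nobs + neqs * lags + 1) / (nobs - neqs * lags - 1)
    fpe = ratio ** neqs * math.exp(log_det_sigma)
    return {"aic": aic, "bic": bic, "hqic": hqic, "fpe": fpe}


# Hypothetical inputs: 3 variables, 2 lags, 100 observations, log|Sigma| = 0
crit = var_information_criteria(log_det_sigma=0.0, nobs=100, neqs=3, lags=2)
```

Because BIC penalizes each extra coefficient more heavily than HQIC, and HQIC more heavily than AIC, the criteria often disagree on the preferred lag order, as they do in the table above.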

Vector Autoregression (VAR) models

   >>> result = model.fit(2)
   >>> result.summary() # print summary for each variable
   <snip>
   Results for equation m1
   ====================================================
               coefficient    std. error t-stat    prob
   ----------------------------------------------------
   const          0.004968      0.001850   2.685 0.008
   L1.m1          0.363636      0.071307   5.100 0.000
   L1.realgdp    -0.077460      0.092975 -0.833 0.406
   L1.cpi        -0.052387      0.128161 -0.409 0.683
   L2.m1          0.250589      0.072050   3.478 0.001
   L2.realgdp    -0.085874      0.092032 -0.933 0.352
   L2.cpi         0.169803      0.128376   1.323 0.188
   ====================================================
   <snip>


Vector Autoregression (VAR) models




   >>> result = model.fit(2)
   >>> result.summary() # print summary for each variable
   <snip>
   Correlation matrix of residuals
                    m1   realgdp       cpi
   m1         1.000000 -0.055690 -0.297494
   realgdp   -0.055690  1.000000  0.115597
   cpi       -0.297494  0.115597  1.000000




VAR: Impulse Response analysis
          Analyze systematic impact of unit “shock” to a single variable

   irf = result.irf(10)
   irf.plot()

          [Figure: "Impulse responses", a 3 x 3 grid of panels over horizons 0 to 10:
           m1 → m1, realgdp → m1, cpi → m1; m1 → realgdp, realgdp → realgdp, cpi → realgdp;
           m1 → cpi, realgdp → cpi, cpi → cpi]
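
The quantity being plotted can be sketched with the standard moving-average recursion Phi_0 = I, Phi_h = sum over j = 1..min(h, p) of Phi_{h-j} A_j. The helper names and coefficient values below are assumptions for illustration; the sketch returns non-orthogonalized responses only and is not the statsmodels implementation.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def impulse_responses(coefs, horizon):
    """Return [Phi_0, ..., Phi_horizon] for a VAR(p) with coefs = [A_1, ..., A_p].

    Entry [r][c] of Phi_h is the response of variable r, h periods later,
    to a unit shock in variable c.
    """
    k = len(coefs[0])
    identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    phis = [identity]
    for h in range(1, horizon + 1):
        acc = [[0.0] * k for _ in range(k)]
        for j in range(1, min(h, len(coefs)) + 1):
            term = matmul(phis[h - j], coefs[j - 1])
            acc = [[acc[r][c] + term[r][c] for c in range(k)] for r in range(k)]
        phis.append(acc)
    return phis


# Hypothetical bivariate VAR(1) coefficient matrix
a1 = [[0.5, 0.1], [0.0, 0.3]]
phis = impulse_responses([a1], horizon=2)
```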



VAR: Forecast Error Variance Decomposition
          Analyze contribution of each variable to forecasting error

   fevd = result.fevd(20)
   fevd.plot()

          [Figure: "Forecast error variance decomposition (FEVD)", one panel each for m1, realgdp,
           and cpi, showing the shares attributed to m1, realgdp, and cpi over horizons 0 to 20]
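
A minimal sketch of the decomposition itself, restricted to a VAR(1) for brevity: orthogonalize the shocks with a Cholesky factor P of the residual covariance, accumulate squared orthogonalized responses Theta_i = Phi_i P, and normalize each row. All names and numbers below are assumptions for illustration, and the result depends on the variable ordering, as with any Cholesky-based decomposition.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cholesky(s):
    """Lower-triangular P with P P' = s, for symmetric positive definite s."""
    k = len(s)
    p = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            acc = s[i][j] - sum(p[i][m] * p[j][m] for m in range(j))
            p[i][j] = math.sqrt(acc) if i == j else acc / p[j][j]
    return p


def fevd_var1(a, sigma, horizon):
    """Share of each variable's h-step forecast error variance (rows)
    attributed to each orthogonalized shock (columns), for a VAR(1)."""
    k = len(a)
    p = cholesky(sigma)
    phi = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    contrib = [[0.0] * k for _ in range(k)]
    for _ in range(horizon):
        theta = matmul(phi, p)      # orthogonalized responses Theta_i
        for i in range(k):
            for j in range(k):
                contrib[i][j] += theta[i][j] ** 2
        phi = matmul(phi, a)        # Phi_{i+1} = Phi_i A for a VAR(1)
    return [[c / sum(row) for c in row] for row in contrib]


# Hypothetical coefficients and an identity residual covariance
shares = fevd_var1([[0.5, 0.2], [0.0, 0.3]], [[1.0, 0.0], [0.0, 1.0]], horizon=2)
```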



VAR: Statistical tests



   In [137]: result.test_causality('m1', ['cpi', 'realgdp'])
   Granger causality f-test
   =========================================================
      Test statistic   Critical Value      p-value        df
   ---------------------------------------------------------
            1.248787         2.387325        0.289 (4, 579)
   =========================================================
   H_0: ['cpi', 'realgdp'] do not Granger-cause m1
   Conclusion: fail to reject H_0 at 5.00% significance level




Filtering

          Hodrick-Prescott (HP) filter separates a time series $y_t$ into a trend $\tau_t$
          and a cyclical component $\zeta_t$, so that $y_t = \tau_t + \zeta_t$.

          [Figure: Inflation with its HP cyclical component and trend component, 1962 to 2006]
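
One common way to compute the HP decomposition is to solve the linear system (I + lambda * D'D) tau = y, where D takes second differences, and then set the cycle to y - tau. The sketch below does this with dense Gaussian elimination in plain Python, which is only sensible for short series. The function name `hp_filter` and the test data are assumptions, and lambda = 1600 is the value conventionally used for quarterly data; this is not the statsmodels implementation.

```python
def hp_filter(y, lamb=1600.0):
    """Return (cycle, trend) from the Hodrick-Prescott decomposition of y."""
    n = len(y)
    # Build A = I + lamb * D'D, where each row of D is (1, -2, 1).
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for r in range(n - 2):
        d = {r: 1.0, r + 1: -2.0, r + 2: 1.0}
        for i, di in d.items():
            for j, dj in d.items():
                a[i][j] += lamb * di * dj
    # Solve A * trend = y by Gaussian elimination with partial pivoting.
    aug = [row + [yi] for row, yi in zip(a, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * trend[j] for j in range(i + 1, n))
        trend[i] = (aug[i][n] - s) / aug[i][i]
    cycle = [yi - ti for yi, ti in zip(y, trend)]
    return cycle, trend


# A perfectly linear series (hypothetical data) has no curvature to penalize,
# so the trend should reproduce it and the cycle should vanish.
y = [2.0 * t + 1.0 for t in range(10)]
cycle, trend = hp_filter(y, lamb=1600.0)
```

Larger values of `lamb` force a smoother trend; with `lamb=0` the trend equals the data.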

Filtering

          In addition to the HP filter, two other filters popular in finance and
          economics, Baxter-King and Christiano-Fitzgerald, are available
          We refer you to our paper and the documentation for details on these:

          [Figure: two panels, "Inflation and Unemployment: BK Filtered" and
           "Inflation and Unemployment: CF Filtered", each showing the INFL and UNEMP series]

Preview: Bayesian dynamic linear models (DLM)



          A state space model by another name:

                                      $y_t = F_t \theta_t + \nu_t, \quad \nu_t \sim N(0, V_t)$
                                      $\theta_t = G \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$

          Estimation of the basic model by Kalman filter recursions. This provides
          an elegant way to do time-varying linear regressions for forecasting
          Extensions: multivariate DLMs, stochastic volatility (SV) models,
          MCMC-based posterior sampling, mixtures of DLMs
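
The Kalman filter recursions are easiest to see in the scalar special case F = G = 1 with constant variances V and W (a local level model). The sketch below is a minimal illustration under those assumptions; the function name, the diffuse prior values, and the toy data are all hypothetical, and it is not the DLM code previewed in this talk.

```python
def kalman_filter_local_level(y, v, w, m0=0.0, c0=1e6):
    """Filtered means and variances of theta_t for the model
    y_t = theta_t + nu_t, theta_t = theta_{t-1} + omega_t,
    with Var(nu_t) = v and Var(omega_t) = w."""
    m, c = m0, c0
    means, variances = [], []
    for obs in y:
        a = m                 # prior mean of theta_t (G = 1)
        r = c + w             # prior variance of theta_t
        f = a                 # one-step forecast of y_t (F = 1)
        q = r + v             # forecast variance
        gain = r / q          # Kalman gain
        m = a + gain * (obs - f)      # posterior mean
        c = r - gain * gain * q       # posterior variance, equal to r * v / q
        means.append(m)
        variances.append(c)
    return means, variances


# Toy data (hypothetical): a constant signal observed without noise
means, variances = kalman_filter_local_level([5.0] * 50, v=1.0, w=0.1)
```

With a diffuse prior the first observation dominates the initial estimate, and the posterior variance settles quickly to a steady-state value determined by v and w alone.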




Preview: DLM Example (Constant+Trend model)

   model = Polynomial(2)
   dlm = DLM(close_px['AAPL'], model.F, G=model.G, # model
             m0=m0, C0=C0, n0=n0, s0=s0, # priors
             state_discount=.95) # discount factor
          [Figure: "Constant + Trend DLM", November 2008 to November 2009]
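
For orientation, the textbook form of a second-order polynomial (constant plus trend) DLM uses F = (1, 0) and G = [[1, 1], [0, 1]], and a discount factor delta replaces an explicit W_t by inflating the prior covariance: R_t = G C_{t-1} G' / delta. The sketch below encodes just those two pieces. Whether the `Polynomial` class and the `state_discount` argument shown above work exactly this way is an assumption, and the function names here are hypothetical.

```python
def polynomial_dlm_matrices(order):
    """Textbook F vector and G matrix for a polynomial DLM of the given order."""
    f = [1.0] + [0.0] * (order - 1)
    # G has ones on the diagonal and on the first superdiagonal
    g = [[1.0 if j in (i, i + 1) else 0.0 for j in range(order)]
         for i in range(order)]
    return f, g


def discounted_prior_cov(g, c_prev, discount):
    """Prior state covariance R_t = G C_{t-1} G' / discount."""
    k = len(g)
    gc = [[sum(g[i][m] * c_prev[m][j] for m in range(k)) for j in range(k)]
          for i in range(k)]
    gcg = [[sum(gc[i][m] * g[j][m] for m in range(k)) for j in range(k)]
           for i in range(k)]
    return [[value / discount for value in row] for row in gcg]


f, g = polynomial_dlm_matrices(2)
# Hypothetical previous posterior covariance: the identity matrix
r = discounted_prior_cov(g, [[1.0, 0.0], [0.0, 1.0]], discount=0.95)
```

A discount factor near 1 implies a slowly evolving state, while smaller values let the level and trend adapt faster at the cost of noisier estimates.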

Preview: Stochastic volatility models


          [Figure: "JPY-USD Exchange Rate Volatility Process", time index 0 to 1000]



Future: sandbox and beyond




          ARCH / GARCH models for volatility
          Structural VAR and error correction models (ECM) for cointegrated
          processes
          Models with non-normally distributed errors
          Better data description, visualization, and interactive research tools
          More sophisticated Bayesian time series models




Conclusions




          We’ve implemented many foundational models for time series
          analysis, but the field is very broad
          User interface can and should be much improved
          Repo: http://github.com/statsmodels/statsmodels
          Docs: http://statsmodels.sourceforge.net
          Contact: pystatsmodels@googlegroups.com





ET_with_EEGXuan Guo
 
How Chile used social media during the Earthquake
How Chile used social media during the EarthquakeHow Chile used social media during the Earthquake
How Chile used social media during the EarthquakeSebastian Salazar
 
Structured Data Challenges in Finance and Statistics
Structured Data Challenges in Finance and StatisticsStructured Data Challenges in Finance and Statistics
Structured Data Challenges in Finance and StatisticsWes McKinney
 
Multivariate time series
Multivariate time seriesMultivariate time series
Multivariate time seriesLuigi Piva CQF
 
What's new in pandas and the SciPy stack for financial users
What's new in pandas and the SciPy stack for financial usersWhat's new in pandas and the SciPy stack for financial users
What's new in pandas and the SciPy stack for financial usersWes McKinney
 
Productive Data Tools for Quants
Productive Data Tools for QuantsProductive Data Tools for Quants
Productive Data Tools for QuantsWes McKinney
 
Analysis of EEG data Using ICA and Algorithm Development for Energy Comparison
Analysis of EEG data Using ICA and Algorithm Development for Energy ComparisonAnalysis of EEG data Using ICA and Algorithm Development for Energy Comparison
Analysis of EEG data Using ICA and Algorithm Development for Energy Comparisonijsrd.com
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionChittagong Independent University
 
Time series database, InfluxDB & PHP
Time series database, InfluxDB & PHPTime series database, InfluxDB & PHP
Time series database, InfluxDB & PHPCorley S.r.l.
 
ForecastIT 4. Holt's Exponential Smoothing
ForecastIT 4. Holt's Exponential SmoothingForecastIT 4. Holt's Exponential Smoothing
ForecastIT 4. Holt's Exponential SmoothingDeepThought, Inc.
 

Destacado (20)

Python for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasPython for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandas
 
pandas: a Foundational Python Library for Data Analysis and Statistics
pandas: a Foundational Python Library for Data Analysis and Statisticspandas: a Foundational Python Library for Data Analysis and Statistics
pandas: a Foundational Python Library for Data Analysis and Statistics
 
Data Structures for Statistical Computing in Python
Data Structures for Statistical Computing in PythonData Structures for Statistical Computing in Python
Data Structures for Statistical Computing in Python
 
Time travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodelsTime travel and time series analysis with pandas + statsmodels
Time travel and time series analysis with pandas + statsmodels
 
Revenue Growth through Machine Learning
Revenue Growth through Machine LearningRevenue Growth through Machine Learning
Revenue Growth through Machine Learning
 
SciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talkSciPy 2011 pandas lightning talk
SciPy 2011 pandas lightning talk
 
PyDataDC- Forecasting critical food violations at restaurants using open data
PyDataDC- Forecasting critical food violations at restaurants using open dataPyDataDC- Forecasting critical food violations at restaurants using open data
PyDataDC- Forecasting critical food violations at restaurants using open data
 
ET_with_EEG
ET_with_EEGET_with_EEG
ET_with_EEG
 
How Chile used social media during the Earthquake
How Chile used social media during the EarthquakeHow Chile used social media during the Earthquake
How Chile used social media during the Earthquake
 
Laughing Squid Opportunity Analysis Project
Laughing Squid Opportunity Analysis ProjectLaughing Squid Opportunity Analysis Project
Laughing Squid Opportunity Analysis Project
 
Structured Data Challenges in Finance and Statistics
Structured Data Challenges in Finance and StatisticsStructured Data Challenges in Finance and Statistics
Structured Data Challenges in Finance and Statistics
 
Multivariate time series
Multivariate time seriesMultivariate time series
Multivariate time series
 
What's new in pandas and the SciPy stack for financial users
What's new in pandas and the SciPy stack for financial usersWhat's new in pandas and the SciPy stack for financial users
What's new in pandas and the SciPy stack for financial users
 
Productive Data Tools for Quants
Productive Data Tools for QuantsProductive Data Tools for Quants
Productive Data Tools for Quants
 
Analysis of EEG data Using ICA and Algorithm Development for Energy Comparison
Analysis of EEG data Using ICA and Algorithm Development for Energy ComparisonAnalysis of EEG data Using ICA and Algorithm Development for Energy Comparison
Analysis of EEG data Using ICA and Algorithm Development for Energy Comparison
 
Time series Forecasting using svm
Time series Forecasting using  svmTime series Forecasting using  svm
Time series Forecasting using svm
 
Pocoyo
PocoyoPocoyo
Pocoyo
 
Predicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector RegressionPredicting Stock Market Price Using Support Vector Regression
Predicting Stock Market Price Using Support Vector Regression
 
Time series database, InfluxDB & PHP
Time series database, InfluxDB & PHPTime series database, InfluxDB & PHP
Time series database, InfluxDB & PHP
 
ForecastIT 4. Holt's Exponential Smoothing
ForecastIT 4. Holt's Exponential SmoothingForecastIT 4. Holt's Exponential Smoothing
ForecastIT 4. Holt's Exponential Smoothing
 

Similar a Scipy 2011 Time Series Analysis in Python

Antao Biopython Bosc2008
Antao Biopython Bosc2008Antao Biopython Bosc2008
Antao Biopython Bosc2008bosc_2008
 
BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopythontiago
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientistsaeberspaecher
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningAnubhav Jain
 
人工知能の基本問題:これまでとこれから
人工知能の基本問題:これまでとこれから人工知能の基本問題:これまでとこれから
人工知能の基本問題:これまでとこれからIchigaku Takigawa
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Jim Dowling
 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAijun Zhang
 
Introduction to Apache Hivemall v0.5.0
Introduction to Apache Hivemall v0.5.0Introduction to Apache Hivemall v0.5.0
Introduction to Apache Hivemall v0.5.0Makoto Yui
 
Polyglot metadata for Hadoop
Polyglot metadata for HadoopPolyglot metadata for Hadoop
Polyglot metadata for HadoopJim Dowling
 
Colored petri nets theory and applications
Colored petri nets theory and applicationsColored petri nets theory and applications
Colored petri nets theory and applicationsAbu Hussein
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistryguest5929fa7
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistrybaoilleach
 
The Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingThe Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingUniversity of Washington
 
Crude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimizationCrude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimizationBrenno Menezes
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsIRJET Journal
 

Similar a Scipy 2011 Time Series Analysis in Python (20)

Antao Biopython Bosc2008
Antao Biopython Bosc2008Antao Biopython Bosc2008
Antao Biopython Bosc2008
 
BOSC 2008 Biopython
BOSC 2008 BiopythonBOSC 2008 Biopython
BOSC 2008 Biopython
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientists
 
Software tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data miningSoftware tools for high-throughput materials data generation and data mining
Software tools for high-throughput materials data generation and data mining
 
人工知能の基本問題:これまでとこれから
人工知能の基本問題:これまでとこれから人工知能の基本問題:これまでとこれから
人工知能の基本問題:これまでとこれから
 
Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks Metadata and Provenance for ML Pipelines with Hopsworks
Metadata and Provenance for ML Pipelines with Hopsworks
 
Sci computing using python
Sci computing using pythonSci computing using python
Sci computing using python
 
Automated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform DesignsAutomated Machine Learning via Sequential Uniform Designs
Automated Machine Learning via Sequential Uniform Designs
 
Introduction to Apache Hivemall v0.5.0
Introduction to Apache Hivemall v0.5.0Introduction to Apache Hivemall v0.5.0
Introduction to Apache Hivemall v0.5.0
 
Python Orientation
Python OrientationPython Orientation
Python Orientation
 
Polyglot metadata for Hadoop
Polyglot metadata for HadoopPolyglot metadata for Hadoop
Polyglot metadata for Hadoop
 
Colored petri nets theory and applications
Colored petri nets theory and applicationsColored petri nets theory and applications
Colored petri nets theory and applications
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistry
 
Python for Chemistry
Python for ChemistryPython for Chemistry
Python for Chemistry
 
2015 03-28-eb-final
2015 03-28-eb-final2015 03-28-eb-final
2015 03-28-eb-final
 
The Other HPC: High Productivity Computing
The Other HPC: High Productivity ComputingThe Other HPC: High Productivity Computing
The Other HPC: High Productivity Computing
 
Crude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimizationCrude-Oil Scheduling Technology: moving from simulation to optimization
Crude-Oil Scheduling Technology: moving from simulation to optimization
 
Ibmr 2014
Ibmr 2014Ibmr 2014
Ibmr 2014
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data Analytics
 
CMSI計算科学技術特論C (2015) ALPS と量子多体問題②
CMSI計算科学技術特論C (2015) ALPS と量子多体問題②CMSI計算科学技術特論C (2015) ALPS と量子多体問題②
CMSI計算科学技術特論C (2015) ALPS と量子多体問題②
 

Más de Wes McKinney

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowWes McKinney
 
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityWes McKinney
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkWes McKinney
 
New Directions for Apache Arrow
New Directions for Apache ArrowNew Directions for Apache Arrow
New Directions for Apache ArrowWes McKinney
 
Apache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data TransportApache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data TransportWes McKinney
 
ACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data FramesACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data FramesWes McKinney
 
Apache Arrow: Present and Future @ ScaledML 2020
Apache Arrow: Present and Future @ ScaledML 2020Apache Arrow: Present and Future @ ScaledML 2020
Apache Arrow: Present and Future @ ScaledML 2020Wes McKinney
 
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future Wes McKinney
 
Apache Arrow: Leveling Up the Analytics Stack
Apache Arrow: Leveling Up the Analytics StackApache Arrow: Leveling Up the Analytics Stack
Apache Arrow: Leveling Up the Analytics StackWes McKinney
 
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionWes McKinney
 
Apache Arrow: Leveling Up the Data Science Stack
Apache Arrow: Leveling Up the Data Science StackApache Arrow: Leveling Up the Data Science Stack
Apache Arrow: Leveling Up the Data Science StackWes McKinney
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Wes McKinney
 
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"Wes McKinney
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Wes McKinney
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataWes McKinney
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataWes McKinney
 
Shared Infrastructure for Data Science
Shared Infrastructure for Data ScienceShared Infrastructure for Data Science
Shared Infrastructure for Data ScienceWes McKinney
 
Data Science Without Borders (JupyterCon 2017)
Data Science Without Borders (JupyterCon 2017)Data Science Without Borders (JupyterCon 2017)
Data Science Without Borders (JupyterCon 2017)Wes McKinney
 
Memory Interoperability in Analytics and Machine Learning
Memory Interoperability in Analytics and Machine LearningMemory Interoperability in Analytics and Machine Learning
Memory Interoperability in Analytics and Machine LearningWes McKinney
 

Más de Wes McKinney (20)

The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache Arrow
 
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise NecessityApache Arrow: Open Source Standard Becomes an Enterprise Necessity
Apache Arrow: Open Source Standard Becomes an Enterprise Necessity
 
Apache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data FrameworkApache Arrow: High Performance Columnar Data Framework
Apache Arrow: High Performance Columnar Data Framework
 
New Directions for Apache Arrow
New Directions for Apache ArrowNew Directions for Apache Arrow
New Directions for Apache Arrow
 
Apache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data TransportApache Arrow Flight: A New Gold Standard for Data Transport
Apache Arrow Flight: A New Gold Standard for Data Transport
 
ACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data FramesACM TechTalks : Apache Arrow and the Future of Data Frames
ACM TechTalks : Apache Arrow and the Future of Data Frames
 
Apache Arrow: Present and Future @ ScaledML 2020
Apache Arrow: Present and Future @ ScaledML 2020Apache Arrow: Present and Future @ ScaledML 2020
Apache Arrow: Present and Future @ ScaledML 2020
 
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
PyCon Colombia 2020 Python for Data Analysis: Past, Present, and Future
 
Apache Arrow: Leveling Up the Analytics Stack
Apache Arrow: Leveling Up the Analytics StackApache Arrow: Leveling Up the Analytics Stack
Apache Arrow: Leveling Up the Analytics Stack
 
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
 
Apache Arrow: Leveling Up the Data Science Stack
Apache Arrow: Leveling Up the Data Science StackApache Arrow: Leveling Up the Data Science Stack
Apache Arrow: Leveling Up the Data Science Stack
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019
 
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
PyCon.DE / PyData Karlsruhe keynote: "Looking backward, looking forward"
 
Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018Apache Arrow at DataEngConf Barcelona 2018
Apache Arrow at DataEngConf Barcelona 2018
 
Apache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory DataApache Arrow: Cross-language Development Platform for In-memory Data
Apache Arrow: Cross-language Development Platform for In-memory Data
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
 
Shared Infrastructure for Data Science
Shared Infrastructure for Data ScienceShared Infrastructure for Data Science
Shared Infrastructure for Data Science
 
Data Science Without Borders (JupyterCon 2017)
Data Science Without Borders (JupyterCon 2017)Data Science Without Borders (JupyterCon 2017)
Data Science Without Borders (JupyterCon 2017)
 
Memory Interoperability in Analytics and Machine Learning
Memory Interoperability in Analytics and Machine LearningMemory Interoperability in Analytics and Machine Learning
Memory Interoperability in Analytics and Machine Learning
 

Último

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Último (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Scipy 2011 Time Series Analysis in Python

  • 1. Time Series Analysis in Python with statsmodels
      Wes McKinney (1), Josef Perktold (2), Skipper Seabold (3)
      (1) Department of Statistical Science, Duke University
      (2) Department of Economics, University of North Carolina at Chapel Hill
      (3) Department of Economics, American University
      10th Python in Science Conference, 13 July 2011
      McKinney, Perktold, Seabold (statsmodels), Python Time Series Analysis, SciPy Conference 2011, 1 / 29
  • 2. What is statsmodels?
      A library for statistical modeling, implementing standard statistical models in Python using NumPy and SciPy
      Includes:
          Linear (regression) models of many forms
          Descriptive statistics
          Statistical tests
          Time series analysis
          ...and much more
  • 3. What is Time Series Analysis?
      Statistical modeling of time-ordered data observations
      Inferring structure, forecasting and simulation, and testing distributional assumptions about the data
      Modeling dynamic relationships among multiple time series
      Broad applications, e.g. in economics, finance, neuroscience, signal processing...
  • 4. Talk Overview
      Brief update on statsmodels development
      Aside: user interface and data structures
      Descriptive statistics and tests
      Auto-regressive moving average models (ARMA)
      Vector autoregression (VAR) models
      Filtering tools (Hodrick-Prescott and others)
      Near future: Bayesian dynamic linear models (DLMs), ARCH / GARCH volatility models and beyond
  • 5. Statsmodels development update
      We’re now on GitHub! Join us: http://github.com/statsmodels/statsmodels
      Check out the slick Sphinx docs: http://statsmodels.sourceforge.net
      Development focus has been largely computational, i.e. writing correct, tested implementations of all the common classes of statistical models
  • 6. Statsmodels development update
      Major work to be done on providing a nice integrated user interface
      We must work together to close the gap between R and Python!
      Some important areas:
          Formula framework, for specifying model design matrices
          Need integrated rich statistical data structures (pandas)
          Data visualization of results should always be a few keystrokes away
          Write a “Statsmodels for R users” guide
  • 7. Aside: statistical data structures and user interface
      While I have a captive audience...
      Controversial fact: pandas is the only Python library currently providing data structures matching (and in many places exceeding) the richness of R’s data structures (for statistics)
          Let’s have a BoF session so I can justify this statement
      Feedback I hear is that end users find the fragmented, incohesive set of Python tools for data analysis and statistics to be confusing, frustrating, and certainly not compelling them to use Python...
          (Not to mention the packaging headaches)
  • 8. Aside: statistical data structures and user interface
      We need to “commit” ASAP (not 12 months from now) to a high level data structure(s) as the “primary data structure(s) for statistical data analysis” and communicate that clearly to end users
      Or we might as well all start programming in R...
  • 9. Example data: EEG trace data
      [Figure: line plot of the EEG trace]
  • 10. Example data: Macroeconomic data
      [Figure: three stacked time series panels labeled cpi, m1, and realgdp; x-axis ticks run from 1960 to 2008]
  • 11. Example data: Stock data
      [Figure: time series for AAPL, GOOG, MSFT, and YHOO; x-axis ticks run from 2001 to 2009]
  • 12. Descriptive statistics
      Autocorrelation, partial autocorrelation plots
      Commonly used for identification in ARMA(p,q) and ARIMA(p,d,q) models
          acf = tsa.acf(eeg, 50)
          pacf = tsa.pacf(eeg, 50)
      [Figure: two panels, "Autocorrelation" and "Partial Autocorrelation", for lags 0 to 50]
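To make the quantity behind `tsa.acf` on slide 12 concrete, here is a minimal sketch of the usual sample autocorrelation in plain Python. The function name `sample_acf` and the toy series are hypothetical and chosen for illustration; this is not the statsmodels implementation, which may differ in normalization options.

```python
def sample_acf(x, nlags):
    """Return sample autocorrelations r_0 .. r_nlags of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # n times the lag-0 autocovariance
    acf = []
    for k in range(nlags + 1):
        # n times the lag-k autocovariance; the factor n cancels in the ratio
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(ck / c0)
    return acf


# Toy series (an assumption for illustration): strictly alternating values
series = [1.0, -1.0] * 10
r = sample_acf(series, 3)
```

An alternating series gives a strongly negative lag-1 autocorrelation and a strongly positive lag-2 autocorrelation, which is the kind of pattern the ACF plot is meant to expose.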
  • 13. Statistical tests
      Ljung-Box test for zero autocorrelation
      Unit root test for cointegration (Augmented Dickey-Fuller test)
      Granger-causality
      Whiteness (iid-ness) and normality
      See our conference paper (when the proceedings get published!)
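As an illustration of the first test listed on slide 13, the Ljung-Box statistic can be sketched as Q = n(n+2) Σ_{k=1..h} r_k² / (n − k), where r_k is the lag-k sample autocorrelation. The helper name `ljung_box_q` is hypothetical, and the sketch stops at the statistic itself; under the null of zero autocorrelation Q is compared with a chi-square distribution with h degrees of freedom.

```python
def ljung_box_q(x, h):
    """Ljung-Box Q statistic using sample autocorrelations at lags 1..h."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Toy series (an assumption for illustration): strongly autocorrelated
series = [1.0, -1.0] * 10
q = ljung_box_q(series, 1)
# The 5% critical value of chi-square with 1 degree of freedom is about 3.84,
# so a Q this large rejects the hypothesis of zero autocorrelation.
```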
  • 14. Autoregressive moving average (ARMA) models
      One of the most common univariate time series models:
          $y_t = \mu + a_1 y_{t-1} + \dots + a_p y_{t-p} + \epsilon_t + b_1 \epsilon_{t-1} + \dots + b_q \epsilon_{t-q}$
      where $E(\epsilon_t \epsilon_s) = 0$ for $t \neq s$ and $\epsilon_t \sim N(0, \sigma^2)$
      Exact log-likelihood can be evaluated via the Kalman filter, but the “conditional” likelihood is easier and commonly used
      statsmodels has tools for simulating ARMA processes with known coefficients $a_i$, $b_i$ and also estimation given specified lag orders
          import scikits.statsmodels.tsa.arima_process as ap
          ar_coef = [1, .75, -.25]; ma_coef = [1, -.5]
          nobs = 100
          y = ap.arma_generate_sample(ar_coef, ma_coef, nobs)
          y += 4 # add in constant
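The ARMA recursion on slide 14 can be sketched directly in plain Python. The function `simulate_arma` and its arguments are hypothetical; it takes the coefficients a_i and b_i exactly as they appear in the equation and treats pre-sample values as zero, which is an assumption of this sketch. By contrast, statsmodels' `arma_generate_sample` expects lag-polynomial coefficients including the zero-lag 1, with the AR signs negated relative to the equation.

```python
import random


def simulate_arma(a, b, shocks, mu=0.0):
    """Simulate y_t = mu + sum_i a[i]*y_{t-1-i} + e_t + sum_j b[j]*e_{t-1-j}.

    Pre-sample values of y and e are taken to be zero (sketch assumption).
    """
    y = []
    for t, e_t in enumerate(shocks):
        value = mu + e_t
        for i, a_i in enumerate(a):
            if t - 1 - i >= 0:
                value += a_i * y[t - 1 - i]
        for j, b_j in enumerate(b):
            if t - 1 - j >= 0:
                value += b_j * shocks[t - 1 - j]
        y.append(value)
    return y


# Illustrative ARMA(2, 1) draw with Gaussian shocks; the seed is arbitrary
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(100)]
y = simulate_arma(a=[0.75, -0.25], b=[-0.5], shocks=shocks, mu=4.0)
```

Passing the shocks in explicitly keeps the recursion deterministic, so a single unit shock traces out the impulse response of the process.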
  • 15. ARMA Estimation
      Several likelihood-based estimators implemented (see docs)
          model = tsa.ARMA(y)
          result = model.fit(order=(2, 1), trend='c', method='css-mle', disp=-1)
          result.params
          # array([ 3.97, -0.97, -0.05, -0.13])
      Standard model diagnostics, standard errors, information criteria (AIC, BIC, ...), etc. are available in the returned ARMAResults object
  • 16. Vector Autoregression (VAR) models
      Widely used model for modeling multiple (K-variate) time series, especially in macroeconomics:
          $Y_t = A_1 Y_{t-1} + \dots + A_p Y_{t-p} + \epsilon_t, \quad \epsilon_t \sim N(0, \Sigma)$
      Matrices $A_i$ are $K \times K$.
      $Y_t$ must be a stationary process (sometimes achieved by differencing). Related class of models (VECM) for modeling nonstationary (including cointegrated) processes
  • 17. Vector Autoregression (VAR) models
      >>> model = VAR(data); model.select_order(8)
                       VAR Order Selection
      =====================================================
                 aic          bic          fpe         hqic
      -----------------------------------------------------
      0       -27.83       -27.78    8.214e-13       -27.81
      1       -28.77       -28.57    3.189e-13       -28.69
      2       -29.00      -28.64*    2.556e-13       -28.85
      3       -29.10       -28.60    2.304e-13      -28.90*
      4       -29.09       -28.43    2.330e-13       -28.82
      5       -29.13       -28.33    2.228e-13       -28.81
      6      -29.14*       -28.18   2.213e-13*       -28.75
      7       -29.07       -27.96    2.387e-13       -28.62
      =====================================================
      * Minimum
  • 18. Vector Autoregression (VAR) models
      >>> result = model.fit(2)
      >>> result.summary() # print summary for each variable
      <snip>
      Results for equation m1
      ====================================================
                   coefficient  std. error  t-stat    prob
      ----------------------------------------------------
      const           0.004968    0.001850   2.685   0.008
      L1.m1           0.363636    0.071307   5.100   0.000
      L1.realgdp     -0.077460    0.092975  -0.833   0.406
      L1.cpi         -0.052387    0.128161  -0.409   0.683
      L2.m1           0.250589    0.072050   3.478   0.001
      L2.realgdp     -0.085874    0.092032  -0.933   0.352
      L2.cpi          0.169803    0.128376   1.323   0.188
      ====================================================
      <snip>
  • 19. Vector Autoregression (VAR) models
      >>> result = model.fit(2)
      >>> result.summary() # print summary for each variable
      <snip>
      Correlation matrix of residuals
                       m1    realgdp        cpi
      m1         1.000000  -0.055690  -0.297494
      realgdp   -0.055690   1.000000   0.115597
      cpi       -0.297494   0.115597   1.000000
  • 20. VAR: Impulse Response analysis
      Analyze systematic impact of unit “shock” to a single variable
          irf = result.irf(10)
          irf.plot()
      [Figure: "Impulse responses", a 3 × 3 grid of panels (m1 → m1, realgdp → m1, cpi → m1, m1 → realgdp, realgdp → realgdp, cpi → realgdp, m1 → cpi, realgdp → cpi, cpi → cpi) over horizons 0 to 10]
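For intuition about what the impulse response plot on slide 20 shows, consider the simplest case of a VAR(1), where the (non-orthogonalized) response at horizon h to a unit shock is the matrix power A_1^h. The sketch below assumes that special case only; the helper names and the coefficient matrix are hypothetical, and this is not the code behind `result.irf`.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def var1_impulse_responses(a1, horizon):
    """Return [I, A1, A1^2, ..., A1^horizon] for a VAR(1) with matrix A1.

    Entry [h][i][j] is the response of variable i, h steps after a unit
    shock to variable j.
    """
    k = len(a1)
    identity = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    responses = [identity]
    for _ in range(horizon):
        responses.append(matmul(a1, responses[-1]))
    return responses


# Hypothetical stationary 2-variable coefficient matrix
a1 = [[0.5, 0.1], [0.0, 0.4]]
irf = var1_impulse_responses(a1, 2)
```

For a stationary process the entries shrink toward zero as the horizon grows, which is the decay visible in each panel of an impulse response plot.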
  • 21. VAR: Forecast Error Variance Decomposition
      Analyze contribution of each variable to forecasting error
          fevd = result.fevd(20)
          fevd.plot()
      [Figure: "Forecast error variance decomposition (FEVD)", three panels (m1, realgdp, cpi), each showing the shares attributed to m1, realgdp, and cpi over horizons 0 to 20]
  • 22. VAR: Statistical tests
      In [137]: result.test_causality('m1', ['cpi', 'realgdp'])
      Granger causality f-test
      =========================================================
      Test statistic   Critical Value   p-value          df
      ---------------------------------------------------------
            1.248787         2.387325     0.289    (4, 579)
      =========================================================
      H_0: ['cpi', 'realgdp'] do not Granger-cause m1
      Conclusion: fail to reject H_0 at 5.00% significance level
  • 23. Filtering
      Hodrick-Prescott (HP) filter separates a time series $y_t$ into a trend $\tau_t$ and a cyclical component $\zeta_t$, so that $y_t = \tau_t + \zeta_t$.
      [Figure: inflation series with its cyclical component and trend component; x-axis ticks run from 1962 to 2006]
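The decomposition on slide 23 can be made concrete with a small sketch. The HP trend is usually defined as the minimizer of Σ (y_t − τ_t)² + λ Σ ((τ_{t+1} − τ_t) − (τ_t − τ_{t−1}))², which leads to the linear system (I + λ K'K) τ = y with K the second-difference operator. The function name `hp_filter`, the dense pure-Python solver, and the default λ = 1600 (the conventional choice for quarterly data) are assumptions of this sketch, not the statsmodels implementation.

```python
def hp_filter(y, lamb=1600.0):
    """Return (cycle, trend) from the Hodrick-Prescott decomposition of y."""
    n = len(y)
    # Build A = I + lamb * K'K, where K takes second differences.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[r + p][r + q] += lamb * stencil[p] * stencil[q]
    # Solve A * trend = y by Gaussian elimination. A is symmetric positive
    # definite with bandwidth 2, so no pivoting is needed and only the two
    # rows below each pivot have to be updated.
    trend = [float(v) for v in y]
    for col in range(n):
        for row in range(col + 1, min(col + 3, n)):
            factor = a[row][col] / a[col][col]
            for j in range(col, min(col + 3, n)):
                a[row][j] -= factor * a[col][j]
            trend[row] -= factor * trend[col]
    # Back substitution on the banded upper-triangular system.
    for row in range(n - 1, -1, -1):
        for j in range(row + 1, min(row + 3, n)):
            trend[row] -= a[row][j] * trend[j]
        trend[row] /= a[row][row]
    cycle = [obs - tr for obs, tr in zip(y, trend)]
    return cycle, trend


# A perfectly linear series has zero second differences, so the filter
# should return it unchanged as the trend, with a cycle of zero.
line = [2.0 * t + 1.0 for t in range(12)]
cycle, trend = hp_filter(line)
```

Larger values of λ penalize curvature more heavily and push the trend toward a straight line; λ = 0 returns the series itself as the trend.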
  • 24. Filtering
      In addition to the HP filter, two other filters popular in finance and economics, Baxter-King and Christiano-Fitzgerald, are available
      We refer you to our paper and the documentation for details on these.
      [Figure: two panels, "Inflation and Unemployment: BK Filtered" and "Inflation and Unemployment: CF Filtered", each showing the INFL and UNEMP series]
  • 25. Preview: Bayesian dynamic linear models (DLM)
      A state space model by another name:
          $y_t = F_t \theta_t + \nu_t, \quad \nu_t \sim N(0, V_t)$
          $\theta_t = G \theta_{t-1} + \omega_t, \quad \omega_t \sim N(0, W_t)$
      Estimation of basic model by Kalman filter recursions. Provides elegant way to do time-varying linear regressions for forecasting
      Extensions: multivariate DLMs, stochastic volatility (SV) models, MCMC-based posterior sampling, mixtures of DLMs
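The Kalman filter recursions mentioned on slide 25 are easiest to see in the scalar "local level" special case, with F = 1, G = 1 and known constant variances V and W. The sketch below assumes exactly that simplification; the function name and argument names are hypothetical, and it does not cover the discount-factor or unknown-variance machinery that a full DLM implementation would use.

```python
def local_level_filter(y, m0, c0, v, w):
    """Kalman filter for y_t = theta_t + nu_t, theta_t = theta_{t-1} + omega_t.

    m0, c0: prior mean and variance of the state; v, w: observation and
    state noise variances. Returns the filtered means and variances.
    """
    means, variances = [], []
    m, c = m0, c0
    for obs in y:
        # Predict the state: theta_t | y_1..t-1 ~ N(a, r)   (G = 1)
        a = m
        r = c + w
        # Predict the observation: y_t | y_1..t-1 ~ N(f, q)  (F = 1)
        f = a
        q = r + v
        # Update with the forecast error, weighted by the Kalman gain
        gain = r / q
        m = a + gain * (obs - f)
        c = r - gain * gain * q
        means.append(m)
        variances.append(c)
    return means, variances


# With a static state (w = 0) and observations equal to the prior mean,
# the filtered mean stays put while the posterior variance shrinks.
means, variances = local_level_filter([5.0, 5.0, 5.0], m0=5.0, c0=1.0, v=1.0, w=0.0)
```

Setting w above zero lets the level drift over time, which is what makes the same recursion usable for time-varying regressions.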
  • 26. Preview: DLM Example (Constant+Trend model)
          model = Polynomial(2)
          dlm = DLM(close_px['AAPL'], model.F, G=model.G,  # model
                    m0=m0, C0=C0, n0=n0, s0=s0,            # priors
                    state_discount=.95)                    # discount factor
      [Figure: "Constant + Trend DLM"; x-axis ticks run from Nov 2008 to Nov 2009]
  • 27. Preview: Stochastic volatility models
      [Figure: "JPY-USD Exchange Rate Volatility Process"; x-axis runs from 0 to 1000]
  • 28. Future: sandbox and beyond
      ARCH / GARCH models for volatility
      Structural VAR and error correction models (ECM) for cointegrated processes
      Models with non-normally distributed errors
      Better data description, visualization, and interactive research tools
      More sophisticated Bayesian time series models
  • 29. Conclusions
      We’ve implemented many foundational models for time series analysis, but the field is very broad
      User interface can and should be much improved
      Repo: http://github.com/statsmodels/statsmodels
      Docs: http://statsmodels.sourceforge.net
      Contact: pystatsmodels@googlegroups.com