Hands-On Time-Series
       Analysis with Matlab



Michalis Vlachos and Spiros Papadimitriou
       IBM T.J. Watson Research Center
Tutorial | Time-Series with Matlab


Disclaimer
 Feel free to use any of the following slides
 for educational purposes, however kindly
 acknowledge the source.

 We would also like to know how you have
 used these slides, so please send us emails
 with comments or suggestions.


    About this tutorial
    The goal of this tutorial is to show you that time-series research
    (or research in general) can be fun when it involves visualizing
    ideas, which can often be done with concise programming.
   Matlab enables us to do that.



    "Will I be able to use this MATLAB right away after the tutorial?"

    "I am definitely smarter than her, but I am not a time-series person, per se.
     I wonder what I gain from this tutorial…"


     Disclaimer
   We are not affiliated with Mathworks in any way
   … but we do like using Matlab a lot
     since it makes our lives easier


   Errors and bugs are most likely contained in this tutorial.
   We might be responsible for some of them.


    What this tutorial is NOT about
 Moving averages
 Autoregressive models
 Forecasting/Prediction
 Stationarity
 Seasonality


Overview
PART A — The Matlab programming environment

PART B — Basic mathematics
   Introduction / geometric intuition
   Coordinates and transforms
   Quantized representations
   Non-Euclidean distances

PART C — Similarity Search and Applications
   Introduction
   Representations
   Distance Measures
   Lower Bounding
   Clustering/Classification/Visualization
   Applications




PART A: Matlab Introduction

Why does anyone need Matlab?
 Matlab enables the efficient
 Exploratory Data Analysis (EDA)

“Science progresses through observation”
  -- Isaac Newton

“The greatest value of a picture is when it forces us to
  notice what we never expected to see”
  -- John Tukey

Matlab
 Interpreted Language
 – Easy code maintenance (code is very compact)
 – Very fast array/vector manipulation
 – Support for OOP
 Easy plotting and visualization
 Easy Integration with other Languages/OS’s
 – Interact with C/C++, COM Objects, DLLs
 – Built-in Java support (and compiler)
 – Ability to make executable files
 – Multi-Platform Support (Windows, Mac, Linux)
 Extensive number of Toolboxes
 – Image, Statistics, Bioinformatics, etc

History of Matlab (MATrix LABoratory)
“The most important thing in the programming language is the name.
I have recently invented a very good name and now I am looking for a
suitable language”. -- D. Knuth

 Programmed by Cleve Moler as an interface for
  EISPACK & LINPACK
  1957: Moler goes to Caltech. Studies numerical
    Analysis
  1961: Goes to Stanford. Works with G. Forsythe on
    Laplacian eigenvalues.
  1977: First edition of Matlab; 2000 lines of Fortran
    – 80 functions (now more than 8000 functions)
  1979: Met with Jack Little in Stanford. Started working
    on porting it to C
  1984: Mathworks is founded
Video:http://www.mathworks.com/company/aboutus/founders/origins_of_matlab_wm.html

Current State of Matlab/Mathworks
 Matlab, Simulink, Stateflow
 Matlab version 7.3, R2006b
 Used in a variety of industries
 – Aerospace, defense, computers, communication, biotech
 Mathworks is still privately owned
 Used in >3,500 universities, with >500,000 users worldwide
 2005 Revenue: >350 M.
 2005 Employees: 1,400+
 Pricing:
 – starts from $1,900 (commercial use),
 – ~$100 (Student Edition)

“Money is better than poverty, if only for financial reasons…”

Matlab 7.3
 R2006b, Released on Sept 1 2006
 – Distributed computing
 – Better support for large files
 – New optimization Toolbox
 – Matlab builder for Java
   • create Java classes from Matlab
 – Demos, Webinars in Flash format
   (http://www.mathworks.com/products/matlab/demos.html)

Who needs Matlab?
 R&D companies for easy application deployment
 Professors
 – Lab assignments
 – Matlab allows focus on algorithms not on language features
 Students
 – Batch processing of files
   • No more incomprehensible perl code!
 – Great environment for testing ideas
   • Quick coding of ideas, then porting to C/Java etc
 – Easy visualization
 – It’s cheap! (for students at least…)

Starting up Matlab

“Personally I'm always ready to learn, although I do not always like being taught.”
  -- Sir Winston Churchill

 Dos/Unix like directory navigation
 Commands like:
   – cd
   – pwd
   – mkdir
 For navigation it is easier to just
  copy/paste the path from explorer
  E.g.:
  cd 'c:\documents'

Matlab Environment

  Command Window:
  - type commands
  - load scripts

  Workspace:
  Loaded Variables/Types/Size
Help contains a comprehensive
introduction to all functions
Excellent demos and tutorials of the various
features and toolboxes

Starting with Matlab
 Everything is an array
 Vectorized operations on whole arrays are faster than
  element-by-element manipulation with for-loops

 a = [1 2 3 4 5 6 7 9 10] % define an array

Populating arrays
 Plot sinusoid function
 a = [0:0.3:2*pi] % generate values from 0 to 2pi (with step of 0.3)
 b = cos(a) % evaluate cos at each value contained in array [a]
 plot(a,b) % plot b (y-axis) against a (x-axis)




Related:
linspace(-100,100,15); % generate 15 values between -100 and 100
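
The difference between the colon operator and linspace can be seen in a short sketch. The variable names and the choice of 22 points are only illustrative: the colon form fixes the step (so the last value may fall short of the upper limit), while linspace fixes the number of points and always includes both endpoints.

```matlab
% Two ways to sample the interval [0, 2*pi]; names are illustrative only.
a_step = 0:0.3:2*pi;             % fixed step: stops at the last value <= 2*pi
a_npts = linspace(0, 2*pi, 22);  % fixed count: both endpoints are included

fprintf('colon:    %d points, last = %.4f\n', numel(a_step), a_step(end));
fprintf('linspace: %d points, last = %.4f\n', numel(a_npts), a_npts(end));
```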

Array Access
 Access array elements
 >> a(1)
 ans =
      0

 >> a(1:3)
 ans =
      0    0.3000    0.6000

 Set array elements
 >> a(1) = 100
 >> a(1:3) = [100 100 100]

2D Arrays
 Can access whole columns or rows
 – Let’s define a 2D array
 >> a = [1 2 3; 4 5 6]
 a =
      1     2     3
      4     5     6

 >> a(2,2)          % single element
 ans =
      5

 >> a(1,:)          % row-wise access
 ans =
      1     2     3

 >> a(:,1)          % column-wise access
 ans =
      1
      4




   A good listener is not only popular everywhere, but after a while he gets to know something. –Wilson Mizner

Column-wise computation
 For 2D arrays, functions such as max, mean and sort
 operate column-by-column by default

 >> a = [1 2 3; 3 2 1]
 a =
      1     2     3
      3     2     1

 >> max(a)
 ans =
      3     2     3

 >> mean(a)
 ans =
     2.0000    2.0000    2.0000

 >> sort(a)
 ans =
      1     2     1
      3     2     3
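
When the computation should run along rows instead, these functions accept a dimension argument. The following is a minimal sketch using the same array; the output variable names are illustrative.

```matlab
a = [1 2 3; 3 2 1];

col_max  = max(a);         % default: one result per column
row_max  = max(a, [], 2);  % along dimension 2: one result per row
row_mean = mean(a, 2);     % mean of each row
row_sort = sort(a, 2);     % sort the elements within each row

disp(row_max);
disp(row_sort);
```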

Concatenating arrays
 Column-wise or row-wise

 Row next to row:
 >> a = [1 2 3];
 >> b = [4 5 6];
 >> c = [a b]
 c =
      1     2     3     4     5     6

 Column next to column:
 >> a = [1;2];
 >> b = [3;4];
 >> c = [a b]
 c =
      1     3
      2     4

 Row below row:
 >> a = [1 2 3];
 >> b = [4 5 6];
 >> c = [a; b]
 c =
      1     2     3
      4     5     6

 Column below column:
 >> a = [1;2];
 >> b = [3;4];
 >> c = [a; b]
 c =
      1
      2
      3
      4

Initializing arrays
 Create array of ones [ones]
  >> a = ones(1,3)
  a =
       1     1     1

  >> a = ones(2,2)*5
  a =
       5     5
       5     5

  >> a = ones(1,3)*inf
  a =
     Inf   Inf   Inf

 Create array of zeroes [zeros]
 – Good for initializing arrays
  >> a = zeros(1,4)
  a =
       0     0     0     0

  >> a = zeros(3,1) + [1 2 3]'
  a =
       1
       2
       3

Reshaping and Replicating Arrays
 Changing the array shape [reshape]
  – (eg, for easier column-wise computation)
  reshape(X,[M,N]): M-by-N matrix filled column-wise from the elements of X

 >> a = [1 2 3 4 5 6]'; % make it into a column
 >> reshape(a,2,3)
 ans =
      1     3     5
      2     4     6

 Replicating an array [repmat]
  repmat(X,[M,N]): make [M,N] tiles of X

 >> a = [1 2 3];
 >> repmat(a,1,2)
 ans =
      1     2     3     1     2     3

 >> repmat(a,2,1)
 ans =
      1     2     3
      1     2     3
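
As an illustration of why these two functions are handy for column-wise computation, here is a small sketch (not from the slides) that reshapes a vector into a matrix and then uses repmat to subtract each column's mean. The variable names are hypothetical.

```matlab
x = (1:6)';                      % column vector 1..6
m = reshape(x, 2, 3);            % 2x3 matrix, filled column by column
col_means = mean(m);             % one mean per column (1x3)

% Tile the row of means so it matches the shape of m, then subtract
centered = m - repmat(col_means, size(m, 1), 1);

disp(m);
disp(centered);
```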

Useful Array functions
 Last element of array [end]
   >> a = [1 3 2 5];
   >> a(end)
   ans =
        5

   >> a(end-1)
   ans =
        2

 Length of array [length]
   >> length(a)
   ans =
        4

 Dimensions of array [size]
   >> [rows, columns] = size(a)
   rows = 1
   columns = 4

Useful Array functions
 Find a specific element [find] **
   >> a = [1 3 2 5 10 5 2 3];
   >> b = find(a==2)
   b =
        3     7

 Sorting [sort] ***
   >> a = [1 3 2 5];
   >> [s,i]=sort(a)
   s =
        1     2     3     5
   i =
        1     3     2     4

   i indicates the index in a where each element of s came from
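
The index vector returned by sort is useful for reordering a second array in the same way, and a logical condition can be used directly as an index without find. A minimal sketch, where the labels array is a made-up companion to the values:

```matlab
values = [1 3 2 5];
labels = {'a', 'b', 'c', 'd'};   % hypothetical labels paired with values

[s, i] = sort(values);           % i records where each sorted element came from
sorted_labels = labels(i);       % apply the same permutation to the labels

big     = values(values > 2);    % logical indexing returns the elements
big_idx = find(values > 2);      % find returns their positions

disp(sorted_labels);
disp(big_idx);
```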

 Visualizing Data and Exporting Figures
  Use Fisher’s Iris dataset
     >> load fisheriris

     – 4 dimensions, 3 species
     – Petal length & width, sepal length & width
     – Iris:
          • virginica/versicolor/setosa

     meas (150x4 array):
     Holds 4D measurements

     species (150x1 cell array):
     Holds name of species for the specific measurement
       ...
       'versicolor'
       'versicolor'
       'virginica'
       'virginica'
       ...
Functions used: strcmp, scatter, hold on


Visualizing Data (2D)
 >> idx_setosa = strcmp(species, 'setosa'); % rows of setosa data
 >> idx_virginica = strcmp(species, 'virginica'); % rows of virginica
 >>
 >> setosa = meas(idx_setosa,[1:2]);
 >> virgin = meas(idx_virginica,[1:2]);
 >> scatter(setosa(:,1), setosa(:,2)); % plot in blue circles by default
 >> hold on;
 >> scatter(virgin(:,1), virgin(:,2), 'rs'); % red[r] squares[s] for these

 idx_setosa is an array of zeros and ones indicating the positions
 where the keyword 'setosa' was found (... 1 1 1 0 0 0 ...)




      The world is governed more by appearances rather than realities… --Daniel Webster
Functions used: scatter3


 Visualizing Data (3D)
      >> idx_setosa = strcmp(species, 'setosa'); % rows of setosa data
      >> idx_virginica = strcmp(species, 'virginica'); % rows of virginica
      >> idx_versicolor = strcmp(species, 'versicolor'); % rows of versicolor

      >> setosa = meas(idx_setosa,[1:3]);
      >> virgin = meas(idx_virginica,[1:3]);
      >> versi = meas(idx_versicolor,[1:3]);
      >> scatter3(setosa(:,1), setosa(:,2),setosa(:,3)); % plot in blue circles by default
      >> hold on;
      >> scatter3(virgin(:,1), virgin(:,2),virgin(:,3), 'rs'); % red[r] squares[s] for these
      >> scatter3(versi(:,1), versi(:,2),versi(:,3), 'gx'); % green x’s



      >> grid on; % show grid on axis
      >> rotate3d on; % rotate with mouse

      [Figure: 3D scatter plot of the three species]

Changing Plots Visually

  Figure toolbar: Zoom in, Zoom out, Select Object, Add text, Create line, Create Arrow

  “Computers are useless. They can only give you answers…”

Changing Plots Visually
                                           Add titles
                                           Add labels on axis
                                           Change tick labels
                                           Add grids to axis
                                           Change color of line
                                           Change thickness/
                                           Linestyle
                                           etc

Changing Plots Visually (Example)

  Change color and width of a line

  [Screenshot: steps A, B, C; right click on the line]

Changing Plots Visually (Example)

  The result …

  [Figure: the modified line plot, and a second plot showing other styles]

Changing Figure Properties with Code
 GUIs are easy, but sooner or later we realize that
 coding is faster
>> a = cumsum(randn(365,1)); % random walk of 365 values

 If this represents a year’s worth of measurements of an
 imaginary quantity, we will change:
 • x-axis annotation to months
 • Axis labels
 • Put title in the figure
 • Include some Greek letters in the title just for fun




    Real men do it command-line… --Anonymous

Changing Figure Properties with Code
 Axis annotation to months
>> axis tight; % irrelevant but useful...
>> xx = [15:30:365];
>> set(gca, 'xtick',xx)

 The result …





Changing Figure Properties with Code
 Axis annotation to months
>> set(gca,'xticklabel',['Jan'; ...
                         'Feb';'Mar'])

 The result …





 Changing Figure Properties with Code
  Axis labels and title
>> title('My measurements (\epsilon/\pi)')
>> ylabel('Imaginary Quantity')
>> xlabel('Month of 2005')

  Other TeX examples:
  \alpha, \beta, e^{-\alpha} etc
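
Putting the steps from the last few slides together, the whole figure can be produced by one short script. This is a sketch under a few assumptions: the full list of twelve month names is filled in here (the slide shows only the first three), and a cell array of strings is used for the tick labels instead of a character matrix.

```matlab
a = cumsum(randn(365, 1));        % random walk of 365 values
plot(a);
axis tight;

xx = 15:30:365;                   % 12 tick positions, roughly mid-month
months = {'Jan','Feb','Mar','Apr','May','Jun', ...
          'Jul','Aug','Sep','Oct','Nov','Dec'};   % assumed label set
set(gca, 'xtick', xx, 'xticklabel', months);

title('My measurements (\epsilon/\pi)');   % TeX markup renders Greek letters
xlabel('Month of 2005');
ylabel('Imaginary Quantity');
```

A cell array avoids the requirement that every row of a character matrix have the same length, which matters for labels of unequal length.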




Saving Figures
 Matlab lets you save figures (.fig) for later
 processing

 A .fig file can later be reopened in Matlab




   You can always put-off for tomorrow, what you can do today. -Anonymous

Exporting Figures




                                             Export to:
                                          emf, eps, jpg, etc

Exporting figures (code)
 You can also achieve the same result with Matlab code

 Matlab code:
 % export to color eps
 print -depsc myImage.eps; % command syntax, literal file name
 print(gcf,'-depsc','myImage') % function syntax, file name can be a variable

Visualizing Data - 2D Bars

 [Figure: bar chart with four bars, each colored by the colormap entry (1 to 4) assigned to it]
 time = [100 120 80 70]; % our data
 h = bar(time); % get handle
 cmap = [1 0 0; 0 1 0; 0 0 1; .5 0 1]; % colors
 colormap(cmap); % create colormap

 cdata = [1 2 3 4]; % assign colors
 set(h,'CDataMapping','direct','CData',cdata);

Visualizing Data - 3D Bars
 [Figure: 3D bar chart of the 6x3 data array below, next to a 64-row colormap of RGB triples]
 data = [ 10 8 7; 9 6 5; 8 6 4; 6 5 4; 6 3 2; 3 2 1];
 bar3([1 2 3 5 6 7], data);

 c = colormap(gray); % get colors of colormap
 c = c(20:55,:); % get some colors
 colormap(c); % new colormap

Visualizing Data - Surfaces

 [Figure: surface plot of a 10x10 data array whose rows are all 1 2 3 … 10]

 The value at position x-y of the array indicates the height of the surface
data = [1:10];
data = repmat(data,10,1); % create data
surface(data,'FaceColor',[1 1 1], 'Edgecolor', [0 0 1]); % plot data
view(3); grid on; % change viewpoint and put axis lines

Creating .m files
 Standard text files
 – Script: A series of Matlab commands (no input/output arguments)
 – Functions: Programs that accept input and return output




                              Right click

Creating .m files


                                          M editor


                           Double click
Functions used: cumsum, num2str, save


 Creating .m files
  The following script will create:
          – An array with 10 random walk vectors
          – Will save them under text files: 1.dat, …, 10.dat

 Sample script (myScript.m):
a = cumsum(randn(100,10)); % 10 random walk data of length 100
for i=1:size(a,2),         % number of columns
    data = a(:,i) ;
    fname = [num2str(i) '.dat']; % a string is a vector of characters!
    save(fname, 'data','-ASCII'); % save each column in a text file
end

 cumsum example: A = [1 2 3 4 5] gives cumsum(A) = [1 3 6 10 15]

 Write this in the M editor and execute it by typing the script name
 on the Matlab command line

 [Figure: a random walk time-series]

Functions in .m scripts
 When we need to:
     – Organize our code
     – Frequently change parameters in our scripts

 In the first line below, "function" is the keyword, dataN the output
 argument, zNorm the function name and data the input argument

function dataN = zNorm(data)
% ZNORM zNormalization of vector
% subtract mean and divide by std

% (the comment block above is the help text, shown by "help zNorm")
if (nargin<1), % check parameters
    error('Not enough arguments');
end
% function body
data = data - mean(data); % subtract mean
data = data/std(data); % divide by std
dataN = data;

 function [a,b] = myFunc(data, x, y) % pass & return more arguments

See also: varargin, varargout
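
To try the normalization without creating a separate .m file, the same computation can be written as an anonymous function. This is only a sketch: the name znorm and the sample data are illustrative, and the argument check from zNorm is left out.

```matlab
% Anonymous-function version of the z-normalization (no input checking)
znorm = @(x) (x - mean(x)) / std(x);

data  = [2 4 4 4 5 5 7 9];        % made-up sample values
dataN = znorm(data);

% After normalization the mean should be ~0 and the standard deviation ~1
fprintf('mean = %.4f, std = %.4f\n', mean(dataN), std(dataN));
```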

Cell Arrays
 Cells that hold other Matlab arrays
 – Let’s read the files of a directory
 >> f = dir('*.dat') % list the .dat files in the current directory
 f =
 15x1 struct array with fields:
     name
     date
     bytes
     isdir

 f is a struct array: each element has the fields name, date, bytes
 and isdir, e.g. f(1).name

 for i=1:length(f),
     a{i} = load(f(i).name);
     N = length(a{i});
     plot3([1:N], a{i}(:,1), a{i}(:,2), ...
           'r-', 'Linewidth', 1.5);
     grid on;
     pause;
     cla;
 end

 [Figure: 3D line plot of one of the loaded files]
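
The point of a cell array is that its cells can hold arrays of different sizes. A small self-contained sketch (it builds its own data instead of reading files, and the names are illustrative) shows the difference between curly-brace and parenthesis indexing:

```matlab
c = cell(1, 3);                   % preallocate a 1x3 cell array
c{1} = cumsum(ones(5, 1));        % columns of different lengths
c{2} = cumsum(ones(8, 1));
c{3} = cumsum(ones(3, 1));

lens  = cellfun(@length, c);      % length of the array stored in each cell
first = c{2}(1);                  % {} returns the content, which is then indexed
sub   = c(1:2);                   % () returns a smaller cell array

disp(lens);
```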

Reading/Writing Files
 Load/Save are faster than C style I/O operations
 – But fscanf, fprintf can be useful for file formatting
   or reading non-Matlab files
fid = fopen('fischer.txt', 'wt');

for i=1:length(species),
    fprintf(fid, '%6.4f %6.4f %6.4f %6.4f %s\n', meas(i,:), species{i});
end
fclose(fid);

 [Screenshot: the output file]

 Elements are accessed column-wise (again…)
x = 0:.1:1; y = [x; exp(x)];
fid = fopen('exp.txt','w');
fprintf(fid,'%6.2f %12.8f\n',y);
fclose(fid);

 y =
      0     0.1     0.2      0.3      0.4      0.5      0.6      0.7   …
      1    1.1052   1.2214   1.3499   1.4918   1.6487   1.8221   2.0138   …
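
The column-wise behaviour can be checked with a round trip: write a two-row array with fprintf, read it back with fscanf, and compare. This is a sketch that writes to a temporary file (via tempname) rather than to exp.txt, and uses a shorter x than the slide.

```matlab
x = 0:0.25:1;
y = [x; exp(x)];                      % 2x5: row 1 holds x, row 2 holds exp(x)

fname = [tempname '.txt'];            % temporary file, removed at the end
fid = fopen(fname, 'w');
fprintf(fid, '%6.2f %12.8f\n', y);    % y is consumed column by column,
fclose(fid);                          % so each line holds one (x, exp(x)) pair

fid = fopen(fname, 'r');
z = fscanf(fid, '%f %f', [2 Inf]);    % read back into a 2xN array, column by column
fclose(fid);
delete(fname);

disp(size(z));
```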

Flow Control/Loops
 if (else/elseif) , switch
 – Check logical conditions
 while
 – Execute statements an indefinite number of times, while a condition holds
 for
 – Execute statements a fixed number of times
 break, continue
 return
 – Return execution to the invoking function



   Life is pleasant. Death is peaceful. It’s the transition that’s troublesome. –Isaac Asimov
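
A short sketch that exercises most of the constructs listed on this slide. The task itself (summing odd numbers until the total exceeds 20) is made up purely for illustration.

```matlab
total = 0;
k = 0;
while true                    % loop until an explicit break
    k = k + 1;
    if mod(k, 2) == 0
        continue;             % skip even numbers
    end
    total = total + k;        % accumulate odd numbers 1, 3, 5, ...
    if total > 20
        break;                % leave the loop once the sum exceeds 20
    end
end

switch mod(total, 3)          % branch on the remainder
    case 0
        msg = 'divisible by 3';
    otherwise
        msg = 'not divisible by 3';
end
fprintf('k = %d, total = %d (%s)\n', k, total, msg);
```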
Functions used: tic, toc, clear all


For-Loop or vectorization?
 clear all;
 tic;
 for i=1:50000
      a(i) = sin(i);
 end
 toc
 elapsed_time =
       5.0070

 clear all;
 a = zeros(1,50000);
 tic;
 for i=1:50000
      a(i) = sin(i);
 end
 toc
 elapsed_time =
       0.1400

 clear all;
 tic;
 i = [1:50000];
 a = sin(i);
 toc;
 elapsed_time =
       0.0200

 Pre-allocate arrays that store output results
  – No need for Matlab to resize every time
 Functions are faster than scripts
  – Compiled into pseudo-code
 Load/Save faster than Matlab I/O functions
 After v. 6.5 of Matlab the interpreter accelerates for-loops (JIT)
 Vectorization helps, but it is often not obvious how to achieve it

   Time not important…only life important. –The Fifth Element

  Matlab Profiler
 Find which portions of code take up most of the execution time
   – Identify bottlenecks
   – Vectorize offending code
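
A minimal sketch of a profiling session, assuming the standard profile command; the loop is just the slow example from the previous slide, used here to give the profiler something to measure.

```matlab
profile on                     % start collecting timing data
a = zeros(1, 50000);
for i = 1:50000                % deliberately loop-based code to be measured
    a(i) = sin(i);
end
profile off                    % stop collecting
p = profile('info');           % struct holding the collected statistics
% profile viewer               % in Matlab, opens the graphical report
```

The graphical report lists time per function and per line, which is usually the quickest way to spot the loop worth vectorizing.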





Hints & Tips
 There is always an easier (and faster) way
 – Typically there is a specialized function for what you want to
   achieve
 Learn vectorization techniques by ‘peeking’ at the
 actual Matlab files:
 – edit [fname], e.g.
 – edit mean
 – edit princomp
 Matlab Help contains many
 vectorization examples

Debugging

Beware of bugs in the above code; I have only proved it correct, not tried it.
-- D. Knuth

 Not as frequently required as in C/C++
 – Set breakpoints, step, step in, check variable values
Debugging

Either this man is dead or my watch has stopped.

 Full control over variables and execution path
    – F10: step, F11: step in (visit functions, as well)
[Figure: debugger screenshots with callouts A, B, C and F10]

Advanced Features – 3D modeling/Volume Rendering
 Very easy volume manipulation and rendering

Advanced Features – Making Animations (Example)
 Create animation by changing the camera viewpoint

[Figure: three frames of the same 3D line plot, seen from different camera viewpoints]

% a is assumed to be an N-by-2 matrix (two series plotted against time)
azimuth = [50:100 99:-1:50]; % azimuth range of values (sweep forth and back)
for k = 1:length(azimuth),
    plot3(1:length(a), a(:,1), a(:,2), 'r', 'Linewidth',2);
    grid on;
    view(azimuth(k),30); % set the new camera viewpoint (elevation 30)
    M(k) = getframe; % save the frame
end

movie(M,20); % play movie 20 times


      See also: movie2avi

Advanced Features – GUIs
 Built-in Development Environment
 – Buttons, figures, menus, sliders, etc.

 Several examples in Help
 – Directory listing
 – Address book reader
 – GUI with multiple axes

Advanced Features – Using Java
 Matlab ships with a Java Virtual
 Machine (JVM)
 Access the Java API (e.g. I/O or networking)
 Import Java classes and construct objects
 Pass data between Java objects and
 Matlab variables
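
A small sketch of these points, assuming Matlab was started with its JVM enabled. It only uses classes from the standard Java library; the variable names are illustrative.

```matlab
% Construct a Java object from Matlab and call its methods
s = java.lang.String('time series');
u = char(s.toUpperCase());     % convert the Java string back to a Matlab char array
n = s.length();                % Java int, returned to Matlab as a double

% Java collections accept Matlab values; numbers are boxed as java.lang.Double
list = java.util.ArrayList();
list.add(3.5);
list.add(1.5);
first = list.get(0);           % unboxed back to a Matlab double
```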

Advanced Features – Using Java (Example)
 Stock Quote Query
 – Connect to Yahoo server
 – http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4069&objectType=file




% (excerpt; the first lines sit inside an enclosing if-block)
    disp('Contacting YAHOO server using ...');
    disp(['url = java.net.URL(' urlString ')']);
end;
url = java.net.URL(urlString);

try
    stream = openStream(url);
    ireader = java.io.InputStreamReader(stream);
    breader = java.io.BufferedReader(ireader);
    connect_query_data = 1;  % connection made
catch
    connect_query_data = -1; % could not connect
    disp(['URL: ' urlString]);
    error('Could not connect to server. It may be unavailable. Try again later.');
    stockdata = {};
    return;
end

Matlab Toolboxes
 You can buy many specialized toolboxes from Mathworks
 – Image Processing, Statistics, Bio-Informatics, etc.


 There are many equivalent free toolboxes too:
 – SVM toolbox
   • http://theoval.sys.uea.ac.uk/~gcc/svm/toolbox/

 – Wavelets
   • http://www.math.rutgers.edu/~ojanen/wavekit/

 – Speech Processing
   • http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html

 – Bayesian Networks
   • http://www.cs.ubc.ca/~murphyk/Software/BNT/bnt.html

In case I get stuck…

I’ve had a wonderful evening. But this wasn’t it…

 help [command] (on the command line)
  e.g. help fft
 Menu: help -> matlab help
   – Excellent introduction on various topics
 Matlab webinars
   – http://www.mathworks.com/company/events/archived_webinars.html?fp

 Google groups
   – comp.soft-sys.matlab
   – You can find *anything* here
   – Someone else had the same
     problem before you!




PART B: Mathematical notions

Eighty percent of success is showing up.

Overview of Part B
1. Introduction and geometric intuition
2. Coordinates and transforms
   – Fourier transform (DFT)
   – Wavelet transform (DWT)
   – Incremental DWT
   – Principal components (PCA)
   – Incremental PCA
3. Quantized representations
   – Piecewise quantized / symbolic
   – Vector quantization (VQ) / K-means
4. Non-Euclidean distances
   – Dynamic time warping (DTW)

What is a time-series
Definition: A sequence of measurements over time
 Medicine
 Stock Market
 Meteorology
 Geology
 Astronomy
 Chemistry
 Biometrics
 Robotics

Example values: 64.0, 62.8, 62.0, 66.0, 62.0, 32.0, 86.4, ..., 21.6, 45.2, 43.2, 53.0, 43.2, 42.8, 43.2, 36.4
[Figures: ECG, Sunspot and Earthquake series plotted against time]


Applications
 Images: image, color histogram, time-series
 Shapes (e.g. Acer platanoides, Salix fragilis)
 Motion capture
 …more to come
[Figure: an image with its three color histograms (bins 50 to 250) read as time-series; shape outlines of the two species]

Time Series
[Figure: values x1, …, x6 plotted against time]

  x = (3, 8, 4, 1, 9, 6)

  Sequence of numeric values
          – Finite: N-dimensional vectors/points
          – Infinite: infinite-dimensional vectors

Mean
 Definition:



 From now on, we will generally assume zero mean —
  mean normalization:

Variance
 Definition:




  or, if zero mean, then




 From now on, we will generally assume unit variance
  — variance normalization:

Mean and variance




[Figure: a series annotated with its mean µ and variance σ]

Why and when to normalize
 Intuitively, the notion of “shape” is generally
  independent of
   – Average level (mean)
   – Magnitude (variance)
 Unless otherwise specified, we normalize to zero
  mean and unit variance
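
A minimal sketch of this normalization, applied to the example series x = (3, 8, 4, 1, 9, 6) from earlier. It assumes the 1/N convention for the variance (the slide formulas did not survive extraction, so this choice is an assumption); Matlab's std(x,1) uses the same convention.

```matlab
x = [3 8 4 1 9 6];                     % example series
N = numel(x);
mu    = sum(x) / N;                    % mean: average level
sigma = sqrt(sum((x - mu).^2) / N);    % standard deviation, 1/N convention (assumption)
z = (x - mu) / sigma;                  % normalized series: zero mean, unit variance
```

After this step, two series that differ only in offset and scale become identical, which is what "shape" similarity needs.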

Variance “=” Length
 Variance of zero-mean series:



 Length of N-dimensional vector (L2-norm):



 So that:
[Figure: vector x in the (x1, x2) plane, with its length ||x||]
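
The link between variance and length can be checked numerically. This sketch again assumes the 1/N convention for the variance, under which the squared L2 length of a zero-mean series equals N times its variance.

```matlab
x = [3 8 4 1 9 6];
x = x - mean(x);               % zero-mean series
N = numel(x);
v   = sum(x.^2) / N;           % variance of a zero-mean series (1/N convention, assumption)
len = norm(x);                 % Euclidean (L2) length of the N-dimensional vector
gap = abs(len^2 - N * v);      % zero up to round-off
```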

Covariance and correlation
 Definition




  or, if zero mean and unit variance, then

Correlation and similarity

   How “strong” is the linear relationship
        between xt and yt ?
   For normalized series,
[Figure: two scatter plots with fitted line (slope) and residual: CAD vs. FRF with ρ = -0.23, and BEF vs. FRF with ρ = 0.99]

Correlation “=” Angle
 Correlation of normalized series:



 Cosine law:



 So that:
[Figure: vectors x and y separated by angle θ, with the projection x.y]

Correlation and distance
 For normalized series,




  i.e., correlation and squared Euclidean distance are
  linearly related.
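
A numeric check of the two relations, as a sketch only. It assumes series normalized to zero mean and unit variance with the 1/N convention, so that ||x||² = N; under that assumption the correlation equals the cosine of the angle, and ||x − y||² = 2N(1 − ρ). The two test series are made up for the example.

```matlab
N = 200;
t = (1:N)';
x = sin(2*pi*t/20) + 0.3*cos(2*pi*t/7);          % made-up series 1
y = sin(2*pi*t/20 + 0.5) + 0.2*sin(2*pi*t/11);   % made-up series 2
znorm = @(v) (v - mean(v)) / std(v, 1);          % zero mean, unit variance (1/N convention)
x = znorm(x);
y = znorm(y);
rho   = (x' * y) / N;                            % correlation of normalized series
cosTh = (x' * y) / (norm(x) * norm(y));          % cosine of the angle between the vectors
d2    = norm(x - y)^2;                           % squared Euclidean distance
```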

[Figure: vectors x and y at angle θ, the projection x.y, and the difference ||x-y||]

Ergodicity
Example



   Assume I eat chicken at the same restaurant every day
     and


   Question: How often is the food good?
          – Answer one:


          – Answer two:


   Answers are equal ⇒ ergodic
          – “If the chicken is usually good, then my guests today can
            safely order other things.”
Ergodicity
Example

 Ergodicity is a common and fundamental
   assumption, but sometimes can be wrong:

 “Total number of murders this year is 5% of the
   population”
 “If I live 100 years, then I will commit about 5
   murders, and if I live 60 years, I will commit about 3
   murders”
 … non-ergodic!
 Such ergodicity assumptions on population
   ensembles are commonly called “racism.”
Stationarity
Example



   Is the chicken quality consistent?
          – Last week:


          – Two weeks ago:


          – Last month:


          – Last year:


   Answers are equal ⇒ stationary

Autocorrelation
 Definition:



 Is well-defined if and only if the series is (weakly)
  stationary
 Depends only on lag ℓ, not time t
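
A sketch of a sample autocorrelation for a normalized series. The estimator below (sum of lagged products divided by N) is one common choice and is an assumption here, since the slide's own formula was lost; the test series is a made-up period-20 sinusoid.

```matlab
N = 200;
t = (1:N)';
x = sin(2*pi*t/20);                    % made-up series with period 20
x = (x - mean(x)) / std(x, 1);         % zero mean, unit variance
maxLag = 40;
r = zeros(maxLag + 1, 1);              % r(lag+1) holds the value at that lag
for lag = 0:maxLag
    % average product of the series with a copy of itself shifted by 'lag'
    r(lag + 1) = sum(x(1:N-lag) .* x(1+lag:N)) / N;
end
```

For this series r peaks again at lag 20 (one full period) and is strongly negative at lag 10 (half a period).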

 Time-domain “coordinates”
[Figure: the series (-0.5, 4, 1.5, -2, 2, 6, 3.5, 1) shown as the sum of eight series, each holding one of these values at its own time instant and zero elsewhere]

 Time-domain “coordinates”
[Figure: the same decomposition, written with the unit basis series e1, …, e8]
$x = x_1 e_1 + x_2 e_2 + \dots + x_8 e_8 = -0.5\,e_1 + 4\,e_2 + 1.5\,e_3 - 2\,e_4 + 2\,e_5 + 6\,e_6 + 3.5\,e_7 + 1\,e_8$

Orthonormal basis
 Set of N vectors, { e1, e2, …, eN }
   – Normal: ||ei|| = 1, for all 1 ≤ i ≤ N
   – Orthogonal: ei·ej = 0, for i ≠ j



 Describe a Cartesian coordinate system
   – Preserve length (aka. “Parseval theorem”)
   – Preserve angles (inner-product, correlations)
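
These properties can be checked on any orthonormal basis. In the sketch below the basis is simply the Q factor of a QR decomposition of an arbitrary matrix (magic(N) is used only as a convenient input); x is the eight-point example series from the slides and y is a made-up second series.

```matlab
x = [-0.5 4 1.5 -2 2 6 3.5 1]';    % example series
N = numel(x);
y = (1:N)';                        % made-up second series
[Q, R] = qr(magic(N));             % columns of Q form an orthonormal basis
cx = Q' * x;                       % coefficients: inner products with each basis vector
cy = Q' * y;
xr = Q * cx;                       % reconstruction from the coefficients
```

Length (norm(cx) vs. norm(x)) and inner products (cx'*cy vs. x'*y) agree up to round-off, which is the "Parseval" property listed above.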

Orthonormal basis
 Note that the coefficients xi w.r.t. the basis { e1, …, eN }
   are the corresponding “similarities” of x to each
   basis vector/series:




[Figure: the series x = (-0.5, 4, 1.5, -2, 2, 6, 3.5, 1) written as -0.5 e1 + 4 e2 + …, with the coefficient x2 shown as the projection of x onto e2]

Orthonormal bases

  The time-domain basis is a trivial tautology:
    – Each coefficient is simply the value at one time instant


  What other bases may be of interest? Coefficients may
   correspond to:
    – Frequency (Fourier)
    – Time/scale (wavelets)
    – Features extracted from series collection (PCA)
  Frequency domain “coordinates”
  Preview


[Figure: the same series (-0.5, 4, 1.5, -2, 2, 6, 3.5, 1) written as a weighted sum of eight sinusoidal basis series, with coefficients 5.6, -2.2, 0, 2.8, -4.9, -3, 0, 0.05]
Time series geometry
Summary


 Basic concepts:
   – Series / vector
   – Mean: “average level”
   – Variance: “magnitude/length”
   – Correlation: “similarity”, “distance”, “angle”
   – Basis: “Cartesian coordinate system”
Time series geometry
Preview — Applications


 The quest for the right basis…
 Compression / pattern extraction
     – Filtering
     – Similarity / distance
     – Indexing
     – Clustering
     – Forecasting
     – Periodicity estimation
     – Correlation

Frequency




 One cycle every 20 time units (period)

Frequency and time




[Figure: series · (sinusoid with period = 8) = 0]

 Why is the period 20?
 It’s not 8, because its “similarity” (projection) to a
  period-8 series (of the same length) is zero.

Frequency and time




[Figure: series · (sinusoid with period = 10) = 0]

  Why is the period 20?
  It’s not 10, because its “similarity” (projection) to a
   period-10 series (of the same length) is zero.

Frequency and time




[Figure: series · (sinusoid with period = 40) = 0]

  Why is the period 20?
  It’s not 40, because its “similarity” (projection) to a
   period-40 series (of the same length) is zero.

                         …and so on
Frequency
Fourier transform - Intuition


 To find the period, we compared the time series with
    sinusoids of many different periods
 Therefore, a good “description” (or basis) would
    consist of all these sinusoids
 This is precisely the idea behind the discrete Fourier
    transform
      – The coefficients capture the similarity (in terms of amplitude
        and phase) of the series with sinusoids of different periods
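
The comparison described above can be sketched directly. The length N = 80 is chosen (as an assumption) so that every tested period divides it, which is what makes the sinusoids exactly orthogonal; "similarity" is taken here as the length of the projection onto the sine/cosine pair at each period.

```matlab
N = 80;                                % series length, a multiple of every period tested
t = (0:N-1)';
x = sin(2*pi*t/20);                    % one cycle every 20 time units
periods = [8 10 20 40];
sim = zeros(size(periods));
for k = 1:numel(periods)
    p = periods(k);
    s = sin(2*pi*t/p);
    c = cos(2*pi*t/p);
    % projection onto the sine/cosine pair, scaled so a unit sinusoid gives 1
    sim(k) = sqrt((x'*s)^2 + (x'*c)^2) * 2 / N;
end
```

Only the entry for period 20 is non-zero; the others vanish up to round-off.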
Frequency
Fourier transform - Intuition


 Technical details:
      – We have to ensure we get an orthonormal basis
      – Real form: sines and cosines at N/2 different frequencies
      – Complex form: exponentials at N different frequencies
Fourier transform
Real form


 For odd-length series,




 The pair of bases at frequency fk are




plus the zero-frequency (mean) component
Fourier transform
Real form — Amplitude and phase


 Observe that, for any fk, we can write




   where




   are the amplitude and phase, respectively.
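
One common way to write this identity, given here as an assumption because the slide's own formulas were lost in extraction: a·cos(2πft) + b·sin(2πft) = r·cos(2πft − θ), with r = sqrt(a² + b²) and θ = atan2(b, a). The coefficients a and b below are arbitrary.

```matlab
N = 80;
t = (0:N-1)';
f = 1/20;                              % one cycle per 20 time units
a = 0.6;                               % arbitrary cosine coefficient
b = -0.8;                              % arbitrary sine coefficient
r     = sqrt(a^2 + b^2);               % amplitude
theta = atan2(b, a);                   % phase
lhs = a*cos(2*pi*f*t) + b*sin(2*pi*f*t);
rhs = r*cos(2*pi*f*t - theta);
```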
Fourier transform
Real form — Amplitude and phase


 It is often easier to think in terms of amplitude rk and
   phase θk, e.g.,
[Figure: a sinusoid plotted over time 0 to 80 with values between -1 and 1; a shift of 5 is marked]
Fourier transform
Complex form

 The equations become easier to handle if we allow
   the series and the Fourier coefficients Xk to take
   complex values:




 Matlab note: fft omits the 1/√N scaling factor and
   is not unitary; however, ifft includes a 1/N
   scaling factor, so ifft(fft(x)) == x always holds (up to round-off).
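
A short check of this scaling behaviour on the eight-point example series; the variable names are illustrative.

```matlab
x = [-0.5 4 1.5 -2 2 6 3.5 1]';
N = numel(x);
X  = fft(x);                   % unnormalized DFT as returned by fft
Xu = X / sqrt(N);              % unitary scaling: preserves length (Parseval)
xr = ifft(X);                  % ifft applies the 1/N factor, recovering x
```

Without the division by sqrt(N), norm(X) is sqrt(N) times norm(x), which matters whenever distances are computed on the coefficients.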
Fourier transform
Example


[Figure: the GBP series approximated using 1, 2, 3, 5, 10 and 20 frequencies]
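
An approximation of this kind can be sketched by keeping only the strongest frequencies of the DFT. The series below is a hypothetical stand-in for the GBP data (which is not part of this text), and "strongest" is taken to mean largest coefficient magnitude, which is an assumption about how the figure was produced.

```matlab
N = 256;
t = (0:N-1)';
% hypothetical stand-in for the GBP series: a few sinusoids plus a slow trend
x = sin(2*pi*t/64) + 0.5*sin(2*pi*t/16 + 1) + 0.2*cos(2*pi*t/8) + t/N;
x = (x - mean(x)) / std(x, 1);         % normalize; the zero-frequency term becomes 0
X = fft(x);
half = 2:N/2;                          % positive frequencies (zero and Nyquist excluded)
[mags, order] = sort(abs(X(half)), 'descend');
ks = [1 2 3 5 10 20];                  % number of frequencies kept
err = zeros(size(ks));
for j = 1:numel(ks)
    keep = half(order(1:ks(j)));       % indices of the k strongest frequencies
    Y = zeros(N, 1);
    Y(keep) = X(keep);
    Y(N + 2 - keep) = conj(X(keep));   % mirror coefficients so the result is real
    xk = real(ifft(Y));                % approximation with k frequencies
    err(j) = norm(x - xk);             % approximation error
end
```

The error can only shrink as more frequencies are kept, since each added coefficient removes its share of the remaining energy.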

Other frequency-based transforms
 Discrete Cosine Transform (DCT)
  – Matlab: dct / idct
 Modified Discrete Cosine Transform (MDCT)

Frequency and time




[Figure: series · (sinusoid with period = 20) ≠ 0, series · (sinusoid with period = 10) ≠ 0, etc…]

   What is the cycle now?
   No single cycle, because the series isn’t exactly similar
    to any series of the same length.

Frequency and time

  Fourier is successful for summarization of series with a
   few, stable periodic components
  However, content is “smeared” across frequencies
   when there are
    – Frequency shifts or jumps, e.g.,




    – Discontinuities (jumps) in time, e.g.,

Frequency and time
 If there are discontinuities in time/frequency or
  frequency shifts, then we should seek an alternate
  “description” or basis
 Main idea: Localize bases in time
   – Short-time Fourier transform (STFT)
   – Discrete wavelet transform (DWT)
Frequency and time
Intuition




    What if we examined, e.g., eight values at a time?
    Can only compare with periods up to eight.
            – Results may be different for each group (window)
Frequency and time
Intuition




    Can “adapt” to localized phenomena


    Fixed window: short-time Fourier (STFT)
            – How to choose window size?


    Variable windows: wavelets
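
The fixed-window idea can be sketched with plain fft on non-overlapping windows of eight values. This is a simplification of the STFT (no overlap, no tapering), and the test series, whose period changes from 8 to 4 halfway through, is made up for the example.

```matlab
N = 64;
t = (0:N-1)';
% made-up series: period 8 in the first half, period 4 in the second half
x = [sin(2*pi*t(1:N/2)/8); sin(2*pi*t(N/2+1:N)/4)];
w = 8;                                 % window size: eight values at a time
W = reshape(x, w, N/w);                % one window per column (non-overlapping)
S = abs(fft(W)) / w;                   % magnitude spectrum of each window
[pk, bin] = max(S(2:w/2+1, :));        % strongest non-zero frequency bin per window
period = w ./ bin;                     % dominant period within each window
```

Here period comes out as 8 for the first four windows and 4 for the last four, a change that a single DFT over the whole series would smear across frequencies.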
Wavelets
Intuition


 Main idea
      – Use small windows for small periods
             • Remove high-frequency component, then
      – Use larger windows for larger periods
             • Twice as large
      – Repeat recursively


 Technical details
      – Need to ensure we get an orthonormal basis
Wavelets
Intuition: tiling time and frequency
[Figure: three tilings of the time-frequency plane]
 Fourier, DCT, …: frequency only
 STFT: time vs. frequency
 Wavelets: time vs. scale (frequency)
Wavelet transform
Pyramid algorithm
[Figure: cascade of high-pass / low-pass filter pairs]
 x ≡ w0 → high pass → w1
        → low pass  → v1 → high pass → w2
                         → low pass  → v2 → high pass → w3
                                          → low pass  → v3
Wavelet transforms
General form


 A high-pass / low-pass filter pair
     – Example: pairwise difference / average (Haar)
     – In general: Quadrature Mirror Filter (QMF) pair
          • Orthogonal spans, which cover the entire space
     – Additional requirements to ensure orthonormality of overall
       transform…
 Used recursively to split the series into the top / bottom
   halves of the frequency band
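
A minimal sketch of the pyramid algorithm with the Haar pair (pairwise difference / average), without any toolbox. The 1/sqrt(2) scaling is the usual choice that makes the overall transform orthonormal; the series length is assumed to be a power of two.

```matlab
x = [-0.5 4 1.5 -2 2 6 3.5 1];         % length must be a power of two for this sketch
v = x;                                 % current low-pass (smooth) signal
w = {};                                % detail (high-pass) coefficients, one cell per level
level = 0;
while numel(v) > 1
    level = level + 1;
    odd  = v(1:2:end);
    even = v(2:2:end);
    w{level} = (odd - even) / sqrt(2); % high pass: scaled pairwise difference
    v        = (odd + even) / sqrt(2); % low pass: scaled pairwise average
end
coeffs = [v, w{end:-1:1}];             % final average, then details from coarse to fine
```

Each pass halves the length of the smooth signal, so the eight-point series needs three levels, and the coefficient vector has the same length (and the same L2 norm) as x.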
Wavelet transforms
Other filters — examples



[Figure: wavelet filter (mother filter, high-pass) and scaling filter (father filter, low-pass) for each of the following]
 Haar (Daubechies-1)
 Daubechies-2
 Daubechies-3
 Daubechies-4
 Going down the list: better frequency isolation, worse time localization
Wavelets
Example

[Figure: Wavelet coefficients of the GBP series, Haar (left) vs. Daubechies-3 (right). Panels, top to bottom: GBP, W1 to W6, V6.]
Wavelets
Example

[Figure: Multi-resolution analysis of the GBP series, Haar (left) vs. Daubechies-3 (right). Panels, top to bottom: GBP, D1 to D6, A6.]
Wavelets
Example

[Figure: Multi-resolution analysis of the GBP series, Haar (left) vs. Daubechies-3 (right). Panels, top to bottom: GBP, D1 to D6, A6.]

 Analysis levels are orthogonal: Di·Dj = 0, for i ≠ j
 Haar analysis: simple, piecewise constant
 Daubechies-3 analysis: less artifacting
Wavelets
Matlab


 Wavelet GUI: wavemenu
 Single level: dwt / idwt
 Multiple level: wavedec / waverec
     – Maximum decomposition level: wmaxlev
 Wavelet bases: wavefun
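The functions listed above might be combined as in the following sketch. It assumes the Wavelet Toolbox is installed; the random-walk signal is a synthetic stand-in for the GBP series, and `wrcoef` (not listed on the slide) is used here to rebuild the multi-resolution levels.

```matlab
% Requires the Wavelet Toolbox. The signal is a synthetic stand-in (assumption).
rng(0);
x = cumsum(randn(1, 1024));

% Single level: approximation (cA) and detail (cD) coefficients, and back.
[cA, cD] = dwt(x, 'haar');
x1 = idwt(cA, cD, 'haar');

% Multiple levels: choose a depth no larger than wmaxlev allows.
nLev = min(6, wmaxlev(numel(x), 'haar'));
[C, L] = wavedec(x, nLev, 'haar');
xN = waverec(C, L, 'haar');

% Multi-resolution analysis: details D1..D6 and approximation A6,
% each reconstructed back to the length of x.
D = zeros(nLev, numel(x));
for j = 1:nLev
    D(j, :) = wrcoef('d', C, L, 'haar', j);
end
A = wrcoef('a', C, L, 'haar', nLev);

errSingle = max(abs(x - x1));                 % single-level round trip
errMulti  = max(abs(x - xN));                 % multi-level round trip
errMra    = max(abs(x - (sum(D, 1) + A)));    % levels add up to the signal
G = D * D';                                   % inner products Di . Dj
offDiag = max(max(abs(G - diag(diag(G)))));   % ~0: levels are orthogonal

% Wavelet basis shapes: scaling (phi) and wavelet (psi) functions for db3.
[phi, psi, xval] = wavefun('db3', 8);
```

Haar with a power-of-two length is used for the orthogonality check because no boundary extension is involved; with longer filters the boundary handling can introduce small deviations.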

 Other wavelets
  Only scratching the surface…
  Wavelet packets
     – All possible tilings (binary)
     – Best-basis transform
  Overcomplete wavelet transform (ODWT), a.k.a.
    maximum-overlap wavelets (MODWT), a.k.a.
    shift-invariant wavelets



Further reading:
1. Donald B. Percival, Andrew T. Walden, Wavelet Methods for Time Series Analysis,
Cambridge Univ. Press, 2006.
2. Gilbert Strang, Truong Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press, 1996.
3. Tao Li, Qi Li, Shenghuo Zhu, Mitsunori Ogihara, A Survey of Wavelet Applications in
Data Mining, SIGKDD Explorations, 4(2), 2002.

More on wavelets
 Signal representation and compressibility

[Figure: Partial energy for the GBP series (left) and the Light series (right). x-axis: Compression (% coefficients); y-axis: Quality (% energy); curves: Time, FFT, Haar, DB3.]
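A partial-energy curve of this kind could be computed roughly as follows. This is only a sketch: the signal is a synthetic random walk rather than the GBP or Light data, only the Time and FFT curves are shown, and `partialEnergy` is a hypothetical helper name.

```matlab
% Fraction of energy retained when keeping the k largest coefficients,
% for k = 1..N, in the time domain vs. an orthonormal DFT.
rng(1);
x = cumsum(randn(1, 1024));                   % synthetic stand-in (assumption)
N = numel(x);

% Hypothetical helper: cumulative share of energy, largest coefficients first.
partialEnergy = @(c) cumsum(sort(abs(c).^2, 'descend')) / sum(abs(c).^2);

peTime = partialEnergy(x);
peFFT  = partialEnergy(fft(x) / sqrt(N));     % 1/sqrt(N) makes the DFT unitary

k = round(0.05 * N);                          % keep 5% of the coefficients
fprintf('Energy kept with %d coefficients: time %.1f%%, FFT %.1f%%\n', ...
        k, 100 * peTime(k), 100 * peFFT(k));

pct = 100 * (1:N) / N;
plot(pct, 100 * peTime, pct, 100 * peFFT);
xlabel('Compression (% coefficients)');
ylabel('Quality (% energy)');
legend('Time', 'FFT', 'Location', 'southeast');
```

The steeper a curve rises, the more compressible the signal is in that representation.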

 More wavelets
  Keeping the largest-magnitude coefficients of an orthonormal
    transform minimizes the total squared error (L2 distance)
  Other coefficient selection/thresholding schemes for
    different error metrics (e.g., maximum per-instant
    error, or L1 distance)
     – Typically use Haar bases
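The L2 claim can be checked numerically with a hand-written orthonormal Haar transform, as sketched below. The data is synthetic, the length is assumed to be a power of two, and `k = 16` is an arbitrary choice.

```matlab
rng(6);
x = cumsum(randn(1, 256));          % synthetic stand-in; power-of-2 length (assumption)
n = numel(x);

% Forward transform: full orthonormal Haar decomposition, stored as
% w = [approximation, coarsest details, ..., finest details].
a = x; w = [];
while numel(a) > 1
    d = (a(1:2:end) - a(2:2:end)) / sqrt(2);
    a = (a(1:2:end) + a(2:2:end)) / sqrt(2);
    w = [d, w];
end
w = [a, w];

% Keep only the k largest-magnitude coefficients, zero the rest.
k = 16;
[~, order] = sort(abs(w), 'descend');
wk = zeros(1, n);
wk(order(1:k)) = w(order(1:k));

% Inverse transform of the thresholded coefficients.
a = wk(1); pos = 2;
while pos <= n
    m = numel(a);
    d = wk(pos:pos + m - 1);
    up = zeros(1, 2 * m);
    up(1:2:end) = (a + d) / sqrt(2);
    up(2:2:end) = (a - d) / sqrt(2);
    a = up;
    pos = pos + m;
end
xk = a;

errTime  = norm(x - xk);            % L2 error in the time domain
errCoeff = norm(w - wk);            % norm of the discarded coefficients

% For comparison: keeping the first k coefficients instead of the largest k.
errFirstK = norm(w(k + 1:end));
```

Because the transform is orthonormal, `errTime` and `errCoeff` agree, so discarding the smallest coefficients gives the smallest L2 error for a given `k`.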




Further reading:
1. Minos Garofalakis, Amit Kumar, Wavelet Synopses for General Error Metrics, ACM
TODS, 30(4), 2005.
2. Panagiotis Karras, Nikos Mamoulis, One-pass Wavelet Synopses for Maximum-Error
Metrics, VLDB 2005.

Overview
1.   Introduction and geometric intuition
2.   Coordinates and transforms
          Fourier transform (DFT)
          Wavelet transform (DWT)
          Incremental DWT
          Principal components (PCA)
          Incremental PCA
3.   Quantized representations
          Piecewise quantized / symbolic
          Vector quantization (VQ) / K-means
4.   Non-Euclidean distances
          Dynamic time warping (DTW)
Wavelets
Incremental estimation
[Figure sequence: step-by-step incremental computation of the wavelet coefficients, visiting the coefficient tree in post-order traversal.]


 Forward transform:
     – Post-order traversal of wavelet coefficient tree
     – O(1) time per new value (amortized)
     – O(log N) buffer space (total)
     – Constant factor: filter length
 Inverse transform:
     – Pre-order traversal of wavelet coefficient tree
     – Same complexity
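The forward transform can be sketched for the Haar case as below. This is an illustration under stated assumptions, not the tutorial's own implementation: the stream is synthetic, its length is a power of two, and the variable names (`buffer`, `waiting`, `details`) are made up. Each level holds at most one waiting value, and a coefficient is emitted as soon as both of its children are available, which gives the post-order emission.

```matlab
rng(5);
x = cumsum(randn(1, 64));           % stand-in stream; power-of-2 length (assumption)
nLev = log2(numel(x));

buffer  = zeros(1, nLev + 1);       % at most one waiting value per level: O(log N) space
waiting = false(1, nLev + 1);
details = cell(1, nLev);            % emitted detail coefficients, grouped by level
nCombine = 0;                       % total number of combine steps

for t = 1:numel(x)
    v = x(t);
    lev = 1;
    % While a left sibling is waiting at this level, combine and carry upward.
    while lev <= nLev && waiting(lev)
        details{lev}(end + 1) = (buffer(lev) - v) / sqrt(2);
        v = (buffer(lev) + v) / sqrt(2);
        waiting(lev) = false;
        lev = lev + 1;
        nCombine = nCombine + 1;
    end
    buffer(lev) = v;                % wait here for the right sibling
    waiting(lev) = true;
end
approx = buffer(nLev + 1);          % final scaling coefficient
```

Over N arriving values there are N - 1 combine steps in total, which is where the amortized O(1) cost per value comes from.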
Time series collections
Overview


 Fourier and wavelets are the most prevalent and
   successful “descriptions” of time series.


 Next, we will consider collections of M time series,
   each of length N.
    – What is the series that is “most similar” to all series in the
      collection?
    – What is the second “most similar”, and so on…

Time series collections
 Some notation:
     – xt: values (of all series) at time t
     – x(i): the i-th series
Principal Component Analysis
Example


[Figure, left: Exchange rates (vs. USD) over time for AUD, BEF, CAD, FRF, DEM, JPY, NLG, NZL, ESP, SEK, CHF, GBP.]
[Figure, right: Principal components 1-4 (µ ≠ 0), u1 to u4, over time.]

 Cumulative contribution: u1 = 48%; + u2 (33%) = 81%; + u3 (11%) = 92%; + u4 (4%) = 96%
 “Best” basis: { u1, u2, u3, u4 }
 Coefficients of each time series w.r.t. basis { u1, u2, u3, u4 }, e.g.:
     x(2) = 49.1u1 + 8.1u2 + 7.8u3 + 3.6u4 + ε1

Principal component analysis

[Figure: First two principal components. Each series i is plotted at (υi,1, υi,2) and labeled with its currency: AUD, BEF, CAD, CHF, DEM, ESP, FRF, GBP, JPY, NLG, NZL, SEK.]
 Principal Component Analysis
 Matrix notation — Singular Value Decomposition (SVD)




     X = UΣV^T

   X = [ x(1) x(2) … x(M) ]: the time series (columns)
   U = [ u1 u2 … uk ]: basis for time series (columns)
   Σ = diag(σ1, σ2, …, σk): scaling factors
   V^T, with rows v1, v2, …, vk: basis for measurements (rows)
   ΣV^T = [ υ1 υ2 … υM ]: coefficients w.r.t. basis in U (columns)
     – Its rows v'1, v'2, …, v'k are the scaled basis for measurements
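In Matlab the decomposition is available through the built-in `svd`. The sketch below uses a synthetic collection (M = 12 series of length N = 500 built from three made-up common trends), not the exchange-rate data, and `k = 4` simply mirrors the four components used in the example.

```matlab
% Synthetic stand-in for a collection of M series of length N, stored as the
% columns of X (assumption: the tutorial's exchange-rate data is not used here).
rng(2);
N = 500; M = 12; k = 4;
trends = cumsum(randn(N, 3));                 % three hidden common trends
X = trends * randn(3, M) + 0.1 * randn(N, M);

[U, S, V] = svd(X, 'econ');                   % X = U*S*V'
coeffs = S * V';                              % column i: coefficients of x(i) w.r.t. U

% Share of total energy captured by the first 1, 2, ... components.
sigma = diag(S);
cumEnergy = cumsum(sigma.^2) / sum(sigma.^2);

% Rank-k approximation: keep only the first k basis vectors.
Xk = U(:, 1:k) * coeffs(1:k, :);
relErr = norm(X - Xk, 'fro') / norm(X, 'fro');
```

Plotting `coeffs(1, :)` against `coeffs(2, :)` places each series at its first two coefficients.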
 Principal component analysis
 Properties — Singular Value Decomposition (SVD)


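  The columns of V are the eigenvectors of the covariance matrix X^T X,
     since X^T X = (UΣV^T)^T (UΣV^T) = V Σ^2 V^T   (using U^T U = I)

  The columns of U are the eigenvectors of the Gram (inner-product)
     matrix X X^T, since X X^T = (UΣV^T)(UΣV^T)^T = U Σ^2 U^T   (using V^T V = I)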
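Both properties are easy to check numerically. The following sketch uses a small random matrix as made-up data and compares the eigen-decomposition of X'X and XX' with the output of `svd`.

```matlab
rng(3);
X = randn(50, 6);                   % N = 50 time instants, M = 6 series (made-up data)
[U, S, V] = svd(X, 'econ');
sigma2 = diag(S).^2;

% Eigenvalues of X'X are the squared singular values.
C = X' * X;
C = (C + C') / 2;                   % guard against round-off asymmetry
evals = sort(eig(C), 'descend');
gapEig = max(abs(evals - sigma2));

% Each column of V is an eigenvector of X'X, each column of U one of XX'.
gapV = norm(X' * X * V - V * diag(sigma2), 'fro');
gapU = norm(X * X' * U - U * diag(sigma2), 'fro');
```

All three gaps should be at the level of floating-point round-off.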




Further reading:
1. Ian T. Jolliffe, Principal Component Analysis (2nd ed), Springer, 2002.
2. Gilbert Strang, Linear Algebra and Its Applications (4th ed), Brooks Cole, 2005.

 Kernels and KPCA
  What are kernels?
     – Usual definition of inner product w.r.t.
       vector coordinates is x·y = ∑i xi yi
     – However, other definitions that preserve
       the fundamental properties are possible
  Why kernels?
     – We no longer have explicit “coordinates”
          • Objects do not even need to be numeric
     – But we can still talk about distances and angles
     – Many algorithms rely just on these two concepts

[Figure: Exchange rates. Scatter plot with points labeled SEK, ESP, GBP, CAD, AUD, NZL, FRF, BEF, DEM, NLG, CHF, JPY.]
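To make the points above concrete, the sketch below recovers distances and angles from a kernel matrix alone and then outlines a kernel PCA embedding. Everything specific here is an assumption for illustration: the data is random, the Gaussian (RBF) kernel and its width `gamma` are arbitrary choices, and the variable names are made up.

```matlab
% Made-up data: each row of Z is one object with explicit coordinates, used
% here only so that the kernel values can be computed.
rng(4);
Z = randn(8, 3);
n = size(Z, 1);

% Linear kernel = ordinary inner product x.y = sum_i x_i y_i.
Klin = Z * Z';

% Distances and angles recovered from a kernel matrix K alone:
%   d(i,j)^2 = K(i,i) + K(j,j) - 2 K(i,j)
%   cos(i,j) = K(i,j) / sqrt(K(i,i) K(j,j))
kd = diag(Klin);
D2 = max(bsxfun(@plus, kd, kd') - 2 * Klin, 0);
distFromK = sqrt(D2);
cosFromK  = Klin ./ sqrt(kd * kd');

% A different inner product: Gaussian (RBF) kernel.
gamma = 0.5;                        % kernel width (arbitrary choice)
Krbf = exp(-gamma * D2);

% Kernel PCA sketch: center K in feature space, then eigendecompose.
J = eye(n) - ones(n) / n;
Kc = J * Krbf * J;
Kc = (Kc + Kc') / 2;                % remove round-off asymmetry
[A, Lam] = eig(Kc);
[lam, order] = sort(diag(Lam), 'descend');
A = A(:, order);
embedding = A(:, 1:2) * diag(sqrt(max(lam(1:2), 0)));   % 2-D coordinates per object
```

With the linear kernel this reduces to ordinary PCA on the centered data; other kernels change only how `K` is filled in.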


Further reading:
1. Bernhard Schölkopf, Alexander J. Smola, Learning with Kernels: Support Vector
Machines, Regularization, Optimization and Beyond, MIT Press, 2001.
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab
Hands-On Time-Series Analysis with Matlab

Hands-On Time-Series Analysis with Matlab

  • 1. Hands-On Time-Series Analysis with Matlab Michalis Vlachos and Spiros Papadimitriou IBM T.J. Watson Research Center
  • 2. Tutorial | Time-Series with Matlab Disclaimer Feel free to use any of the following slides for educational purposes, however kindly acknowledge the source. We would also like to know how you have used these slides, so please send us emails with comments or suggestions.
  • 3. About this tutorial
      ▪ The goal of this tutorial is to show you that time-series research (or research in general) can be made fun when it involves visualizing ideas, which can be achieved with concise programming.
      ▪ Matlab enables us to do that.
      "Will I be able to use this MATLAB right away after the tutorial?"
      "I am definitely smarter than her, but I am not a time-series person, per se. I wonder what I gain from this tutorial…"
  • 4. Disclaimer
      ▪ We are not affiliated with Mathworks in any way
      ▪ … but we do like using Matlab a lot, since it makes our lives easier
      ▪ Errors and bugs are most likely contained in this tutorial.
      ▪ We might be responsible for some of them.
  • 5. What this tutorial is NOT about
      ▪ Moving averages
      ▪ Autoregressive models
      ▪ Forecasting/Prediction
      ▪ Stationarity
      ▪ Seasonality
  • 6. Overview
      PART A: The Matlab programming environment
      PART B: Basic mathematics
        ▪ Introduction / geometric intuition
        ▪ Coordinates and transforms
        ▪ Quantized representations
        ▪ Non-Euclidean distances
      PART C: Similarity Search and Applications
        ▪ Introduction
        ▪ Representations
        ▪ Distance Measures
        ▪ Lower Bounding
        ▪ Clustering/Classification/Visualization
        ▪ Applications
  • 7. PART A: Matlab Introduction
  • 8. Why does anyone need Matlab?
      ▪ Matlab enables efficient Exploratory Data Analysis (EDA)
      "Science progresses through observation" -- Isaac Newton
      "The greatest value of a picture is that it forces us to notice what we never expected to see" -- John Tukey
  • 9. Matlab
      ▪ Interpreted Language
        – Easy code maintenance (code is very compact)
        – Very fast array/vector manipulation
        – Support for OOP
      ▪ Easy plotting and visualization
      ▪ Easy Integration with other Languages/OSs
        – Interact with C/C++, COM Objects, DLLs
        – Built-in Java support (and compiler)
        – Ability to make executable files
        – Multi-Platform Support (Windows, Mac, Linux)
      ▪ Extensive number of Toolboxes
        – Image, Statistics, Bioinformatics, etc.
  • 10. History of Matlab (MATrix LABoratory)
      "The most important thing in the programming language is the name. I have recently invented a very good name and now I am looking for a suitable language." -- D. Knuth
      Programmed by Cleve Moler as an interface for EISPACK & LINPACK
      ▪ 1957: Moler goes to Caltech. Studies numerical analysis
      ▪ 1961: Goes to Stanford. Works with G. Forsythe on Laplacian eigenvalues.
      ▪ 1977: First edition of Matlab; 2000 lines of Fortran
        – 80 functions (now more than 8000 functions)
      ▪ 1979: Met with Jack Little at Stanford. Started working on porting it to C
      ▪ 1984: Mathworks is founded
      Video: http://www.mathworks.com/company/aboutus/founders/origins_of_matlab_wm.html
  • 11. Tutorial | Time-Series with Matlab
  • 12. Current State of Matlab/Mathworks
      ▪ Matlab, Simulink, Stateflow
      ▪ Matlab version 7.3, R2006b
      ▪ Used in a variety of industries
        – Aerospace, defense, computers, communication, biotech
      ▪ Mathworks is still privately owned
      ▪ Used in >3,500 universities, with >500,000 users worldwide
      ▪ 2005 Revenue: >350 M.
      ▪ 2005 Employees: 1,400+
      ▪ Pricing:
        – starts from $1900 (commercial use),
        – ~$100 (Student Edition)
      "Money is better than poverty, if only for financial reasons…"
  • 13. Matlab 7.3
      ▪ R2006b, released on Sept 1 2006
        – Distributed computing
        – Better support for large files
        – New optimization Toolbox
        – Matlab builder for Java
          • create Java classes from Matlab
        – Demos, Webinars in Flash format (http://www.mathworks.com/products/matlab/demos.html)
  • 14. Who needs Matlab?
      ▪ R&D companies for easy application deployment
      ▪ Professors
        – Lab assignments
        – Matlab allows focus on algorithms, not on language features
      ▪ Students
        – Batch processing of files
          • No more incomprehensible perl code!
        – Great environment for testing ideas
          • Quick coding of ideas, then porting to C/Java etc.
        – Easy visualization
        – It's cheap! (for students at least…)
  • 15. Starting up Matlab
      "Personally I'm always ready to learn, although I do not always like being taught." -- Sir Winston Churchill
      ▪ DOS/Unix-like directory navigation
      ▪ Commands like:
        – cd
        – pwd
        – mkdir
      ▪ For navigation it is easier to just copy/paste the path from Explorer
        E.g.: cd 'c:\documents'
  • 16. Matlab Environment
      Command Window: type commands, load scripts
      Workspace: loaded variables, their types and sizes
  • 17. Matlab Environment
      Command Window: type commands, load scripts
      Workspace: loaded variables, their types and sizes
      Help contains a comprehensive introduction to all functions
  • 18. Matlab Environment
      Command Window: type commands, load scripts
      Workspace: loaded variables, their types and sizes
      Excellent demos and tutorials of the various features and toolboxes
  • 19. Starting with Matlab
      ▪ Everything is an array
      ▪ Manipulating whole arrays is faster than element-by-element manipulation with for-loops
        a = [1 2 3 4 5 6 7 9 10] % define an array
  • 20. Populating arrays
      ▪ Plot a sinusoid function
        a = [0:0.3:2*pi] % generate values from 0 to 2pi (with step of 0.3)
        b = cos(a)       % evaluate cos at the values contained in array a
        plot(a,b)        % plot a (x-axis) against b (y-axis)
      Related:
        linspace(-100,100,15); % generate 15 values between -100 and 100
  • 21. Array Access
      ▪ Access array elements
        >> a(1)
        ans =
             0
        >> a(1:3)
        ans =
             0    0.3000    0.6000
      ▪ Set array elements
        >> a(1) = 100
        >> a(1:3) = [100 100 100]
  • 22. 2D Arrays
      ▪ Can access whole columns or rows
        – Let's define a 2D array
        >> a = [1 2 3; 4 5 6]
        a =
             1     2     3
             4     5     6
        >> a(2,2)
        ans =
             5
        >> a(1,:)       % row-wise access
        ans =
             1     2     3
        >> a(:,1)       % column-wise access
        ans =
             1
             4
      A good listener is not only popular everywhere, but after a while he gets to know something. --Wilson Mizner
  • 23. Column-wise computation
      ▪ For arrays with more than one dimension, functions such as max, mean and sort operate column-by-column by default
        >> a = [1 2 3; 3 2 1]
        a =
             1     2     3
             3     2     1
        >> max(a)
        ans =
             3     2     3
        >> mean(a)
        ans =
            2.0000    2.0000    2.0000
        >> sort(a)
        ans =
             1     2     1
             3     2     3
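When a per-row result is needed instead, most of these functions take a dimension argument. The lines below are a minimal sketch using the same array as the slide; the variable names are only illustrative.

```matlab
a = [1 2 3; 3 2 1];

colMax  = max(a);          % default: one result per column
rowMax  = max(a, [], 2);   % dim = 2: one result per row
rowMean = mean(a, 2);      % mean of each row
rowSort = sort(a, 2);      % sort within each row
allMax  = max(a(:));       % a(:) stacks everything into one column first
```

Note that max needs the empty second argument, because max(a, 2) would compare every element of a against the scalar 2.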
  • 24. Concatenating arrays
      ▪ Column-wise or row-wise
        Row next to row:
        >> a = [1 2 3];
        >> b = [4 5 6];
        >> c = [a b]
        c =
             1     2     3     4     5     6
        Column next to column:
        >> a = [1;2];
        >> b = [3;4];
        >> c = [a b]
        c =
             1     3
             2     4
        Row below row:
        >> a = [1 2 3];
        >> b = [4 5 6];
        >> c = [a; b]
        c =
             1     2     3
             4     5     6
        Column below column:
        >> a = [1;2];
        >> b = [3;4];
        >> c = [a; b]
        c =
             1
             2
             3
             4
  • 25. Initializing arrays
      ▪ Create array of ones [ones]
        >> a = ones(1,3)
        a =
             1     1     1
        >> a = ones(2,2)*5;
        a =
             5     5
             5     5
        >> a = ones(1,3)*inf
        a =
           Inf   Inf   Inf
      ▪ Create array of zeroes [zeros]
        – Good for initializing arrays
        >> a = zeros(1,4)
        a =
             0     0     0     0
        >> a = zeros(3,1) + [1 2 3]'
        a =
             1
             2
             3
  • 26. Reshaping and Replicating Arrays
      ▪ Changing the array shape [reshape]
        – (e.g., for easier column-wise computation)
        reshape(X,[M,N]): M-by-N matrix whose elements are taken column-wise from X
        >> a = [1 2 3 4 5 6]'; % make it into a column
        >> reshape(a,2,3)
        ans =
             1     3     5
             2     4     6
      ▪ Replicating an array [repmat]
        repmat(X,[M,N]): tile M-by-N copies of X
        >> a = [1 2 3];
        >> repmat(a,1,2)
        ans =
             1     2     3     1     2     3
        >> repmat(a,2,1)
        ans =
             1     2     3
             1     2     3
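As one illustration of why reshaping helps column-wise computation, the sketch below cuts a series into equal windows, averages each window, and tiles the averages back to the original length. The series values and the window length w are made up for the example, and w is assumed to divide the series length.

```matlab
x = [2 4 6 8 1 3 5 7 9 9 0 0];    % toy series of length 12 (made-up values)
w = 4;                             % window length; must divide length(x)

segments = reshape(x, w, []);      % each column holds one window of w values
segMeans = mean(segments);         % column-wise mean: one value per window

% Repeat each mean w times to get a piecewise-constant approximation of x
approx = reshape(repmat(segMeans, w, 1), 1, []);
```

The empty brackets in reshape ask Matlab to work out the number of columns itself.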
  • 27. Useful Array functions
      ▪ Last element of array [end]
        >> a = [1 3 2 5];
        >> a(end)
        ans =
             5
        >> a(end-1)
        ans =
             2
      ▪ Length of array [length]
        >> length(a)
        ans =
             4
      ▪ Dimensions of array [size]
        >> [rows, columns] = size(a)
        rows =
             1
        columns =
             4
  • 28. Useful Array functions
      ▪ Find a specific element [find] **
        >> a = [1 3 2 5 10 5 2 3];
        >> b = find(a==2)
        b =
             3     7
      ▪ Sorting [sort] ***
        >> a = [1 3 2 5];
        >> [s,i] = sort(a)
        s =
             1     2     3     5
        i =
             1     3     2     4
        i indicates the index where each element of s came from
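Two common follow-ups to find and sort are selecting elements with a logical mask and reusing the sort index to reorder a second array. This is a small sketch; the labels in names and the values in scores are hypothetical.

```matlab
a = [1 3 2 5 10 5 2 3];

idx  = find(a > 3);       % positions of the elements larger than 3
big  = a(idx);            % the elements themselves
big2 = a(a > 3);          % same result with a logical mask, no find needed

% Reorder a second array with the index returned by sort
names  = {'d', 'b', 'c', 'a'};              % hypothetical labels
scores = [1 3 2 5];
[sorted, order] = sort(scores, 'descend');
ranked = names(order);    % labels in order of decreasing score
```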
  • 29. Visualizing Data and Exporting Figures
      ▪ Use Fisher's Iris dataset
        >> load fisheriris
        – 4 dimensions, 3 species
        – Petal length & width, sepal length & width
        – Iris: virginica/versicolor/setosa
      meas (150x4 array): holds the 4D measurements
      species (150x1 cell array): holds the name of the species for each measurement
        ... 'versicolor' 'versicolor' 'virginica' 'virginica' ...
  • 30. Visualizing Data (2D) [strcmp, scatter, hold on]
        >> idx_setosa = strcmp(species, 'setosa');        % rows of setosa data
        >> idx_virginica = strcmp(species, 'virginica');  % rows of virginica
        >>
        >> setosa = meas(idx_setosa,[1:2]);
        >> virgin = meas(idx_virginica,[1:2]);
        >> scatter(setosa(:,1), setosa(:,2));             % plot in blue circles by default
        >> hold on;
        >> scatter(virgin(:,1), virgin(:,2), 'rs');       % red[r] squares[s] for these
      idx_setosa (... 1 1 1 0 0 0 ...): an array of zeros and ones indicating the positions where the keyword 'setosa' was found
      The world is governed more by appearances than by realities… --Daniel Webster
  • 31. Visualizing Data (3D) [scatter3]
        >> idx_setosa = strcmp(species, 'setosa');          % rows of setosa data
        >> idx_virginica = strcmp(species, 'virginica');    % rows of virginica
        >> idx_versicolor = strcmp(species, 'versicolor');  % rows of versicolor
        >> setosa = meas(idx_setosa,[1:3]);
        >> virgin = meas(idx_virginica,[1:3]);
        >> versi = meas(idx_versicolor,[1:3]);
        >> scatter3(setosa(:,1), setosa(:,2), setosa(:,3));        % plot in blue circles by default
        >> hold on;
        >> scatter3(virgin(:,1), virgin(:,2), virgin(:,3), 'rs');  % red[r] squares[s] for these
        >> scatter3(versi(:,1), versi(:,2), versi(:,3), 'gx');     % green x's
        >> grid on;       % show grid on axis
        >> rotate3d on;   % rotate with mouse
      [Figure: 3D scatter plot of the three species]
  • 32. Changing Plots Visually
      Figure toolbar: Zoom in, Zoom out, Create line, Create arrow, Select object, Add text
      "Computers are useless. They can only give you answers…"
  • 33. Changing Plots Visually
      ▪ Add titles
      ▪ Add labels on axis
      ▪ Change tick labels
      ▪ Add grids to axis
      ▪ Change color of line
      ▪ Change thickness/linestyle
      ▪ etc.
  • 34. Changing Plots Visually (Example)
      Change color and width of a line
      [Screenshot: right-click on the line; steps marked A, B, C]
  • 35. Changing Plots Visually (Example)
      The result … Other styles:
      [Figure: two line plots of the same series drawn with different line styles]
  • 36. Changing Figure Properties with Code
      ▪ GUIs are easy, but sooner or later we realize that coding is faster
        >> a = cumsum(randn(365,1)); % random walk of 365 values
      If this represents a year's worth of measurements of an imaginary quantity, we will change:
        • x-axis annotation to months
        • Axis labels
        • Put title in the figure
        • Include some Greek letters in the title just for fun
      Real men do it command-line… --Anonymous
  • 37. Changing Figure Properties with Code
      ▪ Axis annotation to months
        >> axis tight; % irrelevant but useful...
        >> xx = [15:30:365];
        >> set(gca, 'xtick', xx)
      The result … [figure]
  • 38. Changing Figure Properties with Code
      ▪ Axis annotation to months
        >> set(gca,'xticklabel',['Jan'; ...
                                 'Feb';'Mar'])
      The result … [figure]
  • 39. Changing Figure Properties with Code
      ▪ Axis labels and title
        >> title('My measurements (\epsilon/\pi)')
        >> ylabel('Imaginary Quantity')
        >> xlabel('Month of 2005')
      Other LaTeX examples: \alpha, \beta, e^{-\alpha} etc.
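Put together, the steps from the last four slides might look like the script below. It is a sketch rather than the authors' exact code: the cell array months with all twelve labels is an assumption added here, since the slide only spells out the first three.

```matlab
a = cumsum(randn(365,1));            % random walk of 365 values
plot(a);
axis tight;

xx = 15:30:365;                      % one tick roughly in the middle of each month
months = {'Jan','Feb','Mar','Apr','May','Jun', ...
          'Jul','Aug','Sep','Oct','Nov','Dec'};   % assumed full label list
set(gca, 'xtick', xx, 'xticklabel', months(1:numel(xx)));

title('My measurements (\epsilon/\pi)');   % TeX markup renders the Greek letters
ylabel('Imaginary Quantity');
xlabel('Month of 2005');
```

A cell array is used for the labels because a character matrix such as ['Jan'; 'Feb'] requires every row to have the same length.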
  • 40. Saving Figures
      ▪ Matlab lets you save figures (.fig) for later processing
      A .fig file can be reopened later in Matlab
      You can always put off for tomorrow what you can do today. --Anonymous
  • 41. Exporting Figures
      Export to: emf, eps, jpg, etc.
  • 42. Exporting figures (code)
      ▪ You can also achieve the same result with Matlab code
      ▪ Matlab code:
        % export to color eps
        print -depsc myImage.eps;       % from command-line
        print(gcf,'-depsc','myImage')   % using variable as name
  • 43. Visualizing Data - 2D Bars
        time = [100 120 80 70];                  % our data
        h = bar(time);                           % get handle
        cmap = [1 0 0; 0 1 0; 0 0 1; .5 0 1];    % colors
        colormap(cmap);                          % create colormap
        cdata = [1 2 3 4];                       % assign colors
        set(h,'CDataMapping','direct','CData',cdata);
      [Figure: four bars, numbered 1 to 4, each drawn in its own colormap color]
  • 44. Visualizing Data - 3D Bars
        data = [ 10 8 7; 9 6 5; 8 6 4; 6 5 4; 6 3 2; 3 2 1];
        bar3([1 2 3 5 6 7], data);
        c = colormap(gray);   % get colors of colormap
        c = c(20:55,:);       % get some colors
        colormap(c);          % new colormap
      [Figure: 3D bar chart of data, next to a listing of the 64-row colormap]
  • 45. Visualizing Data - Surfaces
        data = [1:10];
        data = repmat(data,10,1);   % create data
        surface(data,'FaceColor',[1 1 1], 'Edgecolor', [0 0 1]);   % plot data
        view(3); grid on;           % change viewpoint and put axis lines
      The value at position x-y of the array indicates the height of the surface
      [Figure: the 10x10 data array (each row is 1 2 3 … 10) and the resulting surface]
  • 46. Creating .m files
      ▪ Standard text files
        – Script: a series of Matlab commands (no input/output arguments)
        – Functions: programs that accept input and return output
      [Screenshot: right-click to create a new .m file]
  • 47. Creating .m files
      [Screenshot: double-click the file to open it in the M editor]
  • 48. Creating .m files [cumsum, num2str, save]
      ▪ The following script will create:
        – An array with 10 random walk vectors
        – Will save them under text files: 1.dat, …, 10.dat
      Sample script (myScript.m):
        a = cumsum(randn(100,10));         % 10 random walks of length 100
        for i=1:size(a,2),                 % number of columns
          data = a(:,i);
          fname = [num2str(i) '.dat'];     % a string is a vector of characters!
          save(fname, 'data', '-ASCII');   % save each column in a text file
        end
      cumsum example: A = [1 2 3 4 5] gives cumsum(A) = [1 3 6 10 15]
      Write this in the M editor… and execute by typing the name on the Matlab command line
      [Figure: a random walk time-series]
  • 49. Functions in .m scripts
      ▪ When we need to:
        – Organize our code
        – Frequently change parameters in our scripts
      Anatomy of the first line: keyword (function), output argument (dataN), function name (zNorm), input argument (data). The leading comment lines are the help text (shown by: help function_name); the rest is the function body.
        function dataN = zNorm(data)
        % ZNORM zNormalization of vector
        % subtract mean and divide by std
        if (nargin<1),                     % check parameters
          error('Not enough arguments');
        end
        data = data - mean(data);          % subtract mean
        data = data/std(data);             % divide by std
        dataN = data;
      To pass and return more arguments:
        function [a,b] = myFunc(data, x, y)
      See also: varargin, varargout
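The zNorm function above handles a single vector. As a sketch of how the same idea might extend to a matrix holding one series per column, the lines below normalize every column at once with repmat; this inline version is an illustration added here, not part of the original function.

```matlab
X = cumsum(randn(100, 3));    % three random walks, one per column

n  = size(X, 1);
mu = mean(X);                 % 1-by-3 row of column means
sd = std(X);                  % 1-by-3 row of column standard deviations

% Tile the statistics so they match the shape of X, then normalize
Xn = (X - repmat(mu, n, 1)) ./ repmat(sd, n, 1);
```

After this step every column of Xn has mean 0 and standard deviation 1, up to rounding error.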
  • 50. Tutorial | Time-Series with Matlab Cell Arrays  Cells that hold other Matlab arrays – Let’s read the files of a directory >> f = dir(‘*.dat’) % read file contents f = 15x1 struct array with fields: name me date Struct Array ).na bytes name f(1 date isdir 1 bytes for i=1:length(f), isdir a{i} = load(f(i).name); 2 N = length(a{i}); plot3([1:N], a{i}(:,1), a{i}(:,2), ... 3 ‘r-’, ‘Linewidth’, 1.5); grid on; 4 pause; 600 5 cla; 500 end 400 300 200 100 0 1000 1500 500 1000 500
  • 51. Tutorial | Time-Series with Matlab Reading/Writing Files  Load/Save are faster than C style I/O operations – But fscanf, fprintf can be useful for file formatting or reading non-Matlab files fid = fopen('fischer.txt', 'wt'); for i=1:length(species), fprintf(fid, '%6.4f %6.4f %6.4f %6.4f %sn', meas(i,:), species{i}); end fclose(fid); Output file:  Elements are accessed column-wise (again…) x = 0:.1:1; y = [x; exp(x)]; fid = fopen('exp.txt','w'); fprintf(fid,'%6.2f %12.8fn',y); fclose(fid); 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 1 1.1052 1.2214 1.3499 1.4918 1.6487 1.8221 2.0138
  • 52. Tutorial | Time-Series with Matlab Flow Control/Loops
    - if (else/elseif), switch
      – Check logical conditions
    - while
      – Execute statements an indefinite number of times, for as long as a condition holds
    - for
      – Execute statements a fixed number of times
    - break, continue
    - return
      – Return execution to the invoking function
    Life is pleasant. Death is peaceful. It’s the transition that’s troublesome. –Isaac Asimov
  • 53. Tutorial | Time-Series with Matlab For-Loop or vectorization? (tic, toc, clear all)
    - Pre-allocate arrays that store output results
      – No need for Matlab to resize every time
    - Functions are faster than scripts
      – Compiled into pseudo-code
    - Load/Save faster than Matlab I/O functions
    - After v. 6.5 of Matlab there is for-loop vectorization (interpreter)
    - Vectorization helps, but it is often not obvious how to achieve it
    No pre-allocation (elapsed_time = 5.0070):
    clear all;
    tic;
    for i=1:50000
      a(i) = sin(i);
    end
    toc
    With pre-allocation (elapsed_time = 0.1400):
    clear all;
    a = zeros(1,50000);
    tic;
    for i=1:50000
      a(i) = sin(i);
    end
    toc
    Vectorized (elapsed_time = 0.0200):
    clear all;
    tic;
    i = [1:50000];
    a = sin(i);
    toc;
    Time not important…only life important. –The Fifth Element
  • 54. Tutorial | Time-Series with Matlab Matlab Profiler  Find which portions of code take up most of the execution time – Identify bottlenecks – Vectorize offending code Time not important…only life important. –The Fifth Element
  • 55. Tutorial | Time-Series with Matlab Hints & Tips
    - There is always an easier (and faster) way
      – Typically there is a specialized function for what you want to achieve
    - Learn vectorization techniques by ‘peeking’ at the actual Matlab files:
      – edit [fname], e.g.
      – edit mean
      – edit princomp
    - Matlab Help contains many vectorization examples
  • 56. Tutorial | Time-Series with Matlab Debugging
    Beware of bugs in the above code; I have only proved it correct, not tried it. -- D. Knuth
    - Not as frequently required as in C/C++
      – Set breakpoints, step, step in, check variable values
    [Figure: setting breakpoints in the editor]
  • 57. Tutorial | Time-Series with Matlab Debugging
    Either this man is dead or my watch has stopped.
    - Full control over variables and execution path
      – F10: step, F11: step in (visit functions, as well)
    [Figure: stepping with F10 through points A, B, C in the editor]
  • 58. Tutorial | Time-Series with Matlab Advanced Features – 3D modeling/Volume Rendering  Very easy volume manipulation and rendering
  • 59. Tutorial | Time-Series with Matlab Advanced Features – Making Animations (Example)
    - Create animation by changing the camera viewpoint
    [Figure: three views of the same 3D trajectory at different azimuths]
    azimuth = [50:100 99:-1:50];   % azimuth range of values
    for k = 1:length(azimuth),
      plot3(1:length(a), a(:,1), a(:,2), 'r', 'Linewidth',2);
      grid on;
      view(azimuth(k),30);         % change to the new viewpoint
      M(k) = getframe;             % save the frame
    end
    movie(M,20);                   % play movie 20 times
    See also: movie2avi
  • 60. Tutorial | Time-Series with Matlab Advanced Features – GUI’s  Built-in Development Environment – Buttons, figures, Menus, sliders, etc  Several Examples in Help – Directory listing – Address book reader – GUI with multiple axis
  • 61. Tutorial | Time-Series with Matlab Advanced Features – Using Java  Matlab is shipped with Java Virtual Machine (JVM)  Access Java API (eg I/O or networking)  Import Java classes and construct objects  Pass data between Java objects and Matlab variables
  • 62. Tutorial | Time-Series with Matlab Advanced Features – Using Java (Example)  Stock Quote Query – Connect to Yahoo server – http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=4069&objectType=file disp('Contacting YAHOO server using ...'); disp(['url = java.net.URL(' urlString ')']); end; url = java.net.URL(urlString); try stream = openStream(url); ireader = java.io.InputStreamReader(stream); breader = java.io.BufferedReader(ireader); connect_query_data= 1; %connect made; catch connect_query_data= -1; %could not connect case; disp(['URL: ' urlString]); error(['Could not connect to server. It may be unavailable. Try again later.']); stockdata={}; return; end
  • 63. Tutorial | Time-Series with Matlab Matlab Toolboxes
    - You can buy many specialized toolboxes from Mathworks
      – Image Processing, Statistics, Bio-Informatics, etc
    - There are many equivalent free toolboxes too:
      – SVM toolbox • http://theoval.sys.uea.ac.uk/~gcc/svm/toolbox/
      – Wavelets • http://www.math.rutgers.edu/~ojanen/wavekit/
      – Speech Processing • http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
      – Bayesian Networks • http://www.cs.ubc.ca/~murphyk/Software/BNT/bnt.html
  • 64. Tutorial | Time-Series with Matlab In case I get stuck…
    I’ve had a wonderful evening. But this wasn’t it…
    - help [command] (on the command line), e.g. help fft
    - Menu: help -> matlab help
      – Excellent introduction on various topics
    - Matlab webinars
      – http://www.mathworks.com/company/events/archived_webinars.html?fp
    - Google groups
      – comp.soft-sys.matlab
      – You can find *anything* here
      – Someone else had the same problem before you!
  • 65. Tutorial | Time-Series with Matlab PART B: Mathematical notions
    Eighty percent of success is showing up.
  • 66. Tutorial | Time-Series with Matlab Overview of Part B 1. Introduction and geometric intuition 2. Coordinates and transforms  Fourier transform (DFT)  Wavelet transform (DWT)  Incremental DWT  Principal components (PCA)  Incremental PCA 3. Quantized representations  Piecewise quantized / symbolic  Vector quantization (VQ) / K-means 4. Non-Euclidean distances  Dynamic time warping (DTW)
  • 67. Tutorial | Time-Series with Matlab What is a time-series
    Definition: A sequence of measurements over time
    - Medicine
    - Stock Market
    - Meteorology
    - Geology
    - Astronomy
    - Chemistry
    - Biometrics
    - Robotics
    [Figure: example series (ECG, sunspot, earthquake) next to a column of raw measurements over time: 64.0, 62.8, 62.0, 66.0, 62.0, 32.0, 86.4, ..., 21.6, 45.2, 43.2, 53.0, 43.2, 42.8, 43.2, 36.4]
  • 68. Tutorial | Time-Series with Matlab Applications Images Shapes Motion capture Image Color Histogram 600 400 200 Acer platanoides 0 50 100 150 200 250 400 200 0 50 100 150 200 250 800 600 400 200 0 50 100 150 200 250 Time-Series …more to come Salix fragilis
  • 69. Tutorial | Time-Series with Matlab Time Series value x5 x2 x6 x3 x1 x4 time
  • 70. Tutorial | Time-Series with Matlab Time Series
    x = (3, 8, 4, 1, 9, 6)
    [Figure: the six values plotted against time]
    - Sequence of numeric values
      – Finite: N-dimensional vectors/points
      – Infinite: infinite-dimensional vectors
  • 71. Tutorial | Time-Series with Matlab Mean
    - Definition: $\mu = \frac{1}{N} \sum_{t=1}^{N} x_t$
    - From now on, we will generally assume zero mean (mean normalization): $x_t \leftarrow x_t - \mu$
  • 72. Tutorial | Time-Series with Matlab Variance
    - Definition: $\sigma^2 = \frac{1}{N} \sum_{t=1}^{N} (x_t - \mu)^2$ or, if zero mean, then $\sigma^2 = \frac{1}{N} \sum_{t=1}^{N} x_t^2$
    - From now on, we will generally assume unit variance (variance normalization): $x_t \leftarrow x_t / \sigma$
  • 73. Tutorial | Time-Series with Matlab Mean and variance variance σ mean µ
  • 74. Tutorial | Time-Series with Matlab Why and when to normalize  Intuitively, the notion of “shape” is generally independent of – Average level (mean) – Magnitude (variance)  Unless otherwise specified, we normalize to zero mean and unit variance
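    A minimal sketch of why normalization isolates "shape": two series that differ only in level and magnitude become identical after z-normalization. The series and variable names here are illustrative, not from the tutorial; std is used with its default N-1 divisor, as in the zNorm function shown earlier.

```matlab
% Two series with the same shape but different level and magnitude
t = (1:100)';
x = sin(2*pi*t/25);
y = 50 + 7*x;                        % shifted and scaled copy of x

% z-normalize: subtract the mean, divide by the standard deviation
xn = (x - mean(x)) / std(x);
yn = (y - mean(y)) / std(y);

% After normalization the two series coincide (up to rounding)
maxDiff = max(abs(xn - yn));
fprintf('mean = %.2g, std = %.2g, max difference = %.2g\n', ...
        mean(yn), std(yn), maxDiff);
```

    Had y been a differently shaped series, the normalized versions would still differ, which is the part a similarity measure should pick up.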
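  • 75. Tutorial | Time-Series with Matlab Variance “=” Length
    - Variance of zero-mean series: $\sigma^2 = \frac{1}{N} \sum_{t=1}^{N} x_t^2$
    - Length of N-dimensional vector (L2-norm): $\|x\| = \sqrt{\sum_{t=1}^{N} x_t^2}$
    - So that: $\sigma^2 = \|x\|^2 / N$
    [Figure: vector x in the (x1, x2) plane, with length ||x||]
  • 76. Tutorial | Time-Series with Matlab Covariance and correlation
    - Definition: $\rho = \frac{\frac{1}{N} \sum_{t=1}^{N} (x_t - \mu_x)(y_t - \mu_y)}{\sigma_x \sigma_y}$ or, if zero mean and unit variance, then $\rho = \frac{1}{N} \sum_{t=1}^{N} x_t y_t$
  • 77. Tutorial | Time-Series with Matlab Correlation and similarity
    - How “strong” is the linear relationship between xt and yt?
    - For normalized series, $y_t = \rho x_t + \varepsilon_t$, where ρ is the slope and εt is the residual
    [Figure: two scatter plots against FRF: CAD (ρ = -0.23) and BEF (ρ = 0.99)]
  • 78. Tutorial | Time-Series with Matlab Correlation “=” Angle
    - Correlation of normalized series: $\rho = \frac{1}{N}\, x \cdot y$
    - Cosine law: $x \cdot y = \|x\| \, \|y\| \cos\theta$
    - So that: $\rho = \cos\theta$, since $\|x\| = \|y\| = \sqrt{N}$ for normalized series
    [Figure: vectors x and y at angle θ, with the projection x·y]
  • 79. Tutorial | Time-Series with Matlab Correlation and distance
    - For normalized series, $\|x - y\|^2 = \|x\|^2 + \|y\|^2 - 2\, x \cdot y = 2N(1 - \rho)$, i.e., correlation and squared Euclidean distance are linearly related.
    [Figure: vectors x and y at angle θ, with the distance ||x - y|| and the projection x·y]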
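    The relation between correlation, angle and distance can be checked numerically. This is a sketch under the assumption that series are normalized with the 1/N variance convention (std(x,1) in Matlab), so that each normalized series has squared length N; the two test series are made up for illustration.

```matlab
N = 200;
t = (1:N)';
x = sin(2*pi*t/40) + 0.1*cos(2*pi*t/7);
y = sin(2*pi*t/40 + 0.5) + 0.01*t;

% Normalize to zero mean and unit variance (std(.,1) divides by N)
xn = (x - mean(x)) / std(x, 1);
yn = (y - mean(y)) / std(y, 1);

rho   = (xn' * yn) / N;          % correlation of the normalized series
theta = acos(rho);               % angle between the two vectors (radians)
d2    = sum((xn - yn).^2);       % squared Euclidean distance

% Cross-check against Matlab's own correlation coefficient
c = corrcoef(x, y);
fprintf('rho = %.4f (corrcoef: %.4f), theta = %.4f rad\n', rho, c(1,2), theta);
fprintf('d^2 = %.4f, 2N(1-rho) = %.4f\n', d2, 2*N*(1 - rho));
```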
  • 80. Tutorial | Time-Series with Matlab Ergodicity Example  Assume I eat chicken at the same restaurant every day and  Question: How often is the food good? – Answer one: – Answer two:  Answers are equal ⇒ ergodic – “If the chicken is usually good, then my guests today can safely order other things.”
  • 81. Tutorial | Time-Series with Matlab Ergodicity Example
    - Ergodicity is a common and fundamental assumption, but sometimes it can be wrong:
    - “Total number of murders this year is 5% of the population”
    - “If I live 100 years, then I will commit about 5 murders, and if I live 60 years, I will commit about 3 murders”
    - … non-ergodic!
    - Such ergodicity assumptions on population ensembles are commonly called “racism.”
  • 82. Tutorial | Time-Series with Matlab Stationarity Example  Is the chicken quality consistent? – Last week: – Two weeks ago: – Last month: – Last year:  Answers are equal ⇒ stationary
  • 83. Tutorial | Time-Series with Matlab Autocorrelation
    - Definition: $\rho_\ell = \frac{E[(x_t - \mu)(x_{t+\ell} - \mu)]}{\sigma^2}$
    - Is well-defined if and only if the series is (weakly) stationary
    - Depends only on lag ℓ, not time t
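    A sketch of a sample autocorrelation estimate in base Matlab, assuming a normalized (zero mean, unit variance) series and the simple 1/N estimator; the sinusoid with period 20 is an illustrative input, and the variable names are not from the tutorial.

```matlab
N = 400;
t = (1:N)';
x = sin(2*pi*t/20);                    % periodic series, period 20
x = (x - mean(x)) / std(x, 1);         % normalize (1/N variance)

maxLag = 40;
acf = zeros(maxLag + 1, 1);            % acf(lag+1) holds the value at that lag
for lag = 0:maxLag
    acf(lag + 1) = sum(x(1:N-lag) .* x(1+lag:N)) / N;
end

% Strongest autocorrelation among lags 10..40 (skipping the trivial lag 0)
[~, idx] = max(acf(11:end));
bestLag = idx + 9;
fprintf('acf(0) = %.3f, strongest lag in 10..40 = %d\n', acf(1), bestLag);
```

    The estimate depends only on the lag because the same sum is taken over all time positions, which is only meaningful when the series is stationary.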
  • 84. Tutorial | Time-Series with Matlab Time-domain “coordinates”
    x = (-0.5, 4, 1.5, -2, 2, 6, 3.5, 1)
    [Figure: the series written as a sum of eight single-spike series, one per time instant, with heights -0.5, 4, 1.5, -2, 2, 6, 3.5, 1]
  • 85. Tutorial | Time-Series with Matlab Time-domain “coordinates”
    x = x1×e1 + x2×e2 + … + x8×e8 = -0.5×e1 + 4×e2 + 1.5×e3 + (-2)×e4 + 2×e5 + 6×e6 + 3.5×e7 + 1×e8
  • 86. Tutorial | Time-Series with Matlab Orthonormal basis
    - Set of N vectors, { e1, e2, …, eN }
      – Normal: ||ei|| = 1, for all 1 ≤ i ≤ N
      – Orthogonal: ei·ej = 0, for i ≠ j
    - Describe a Cartesian coordinate system
      – Preserve length (aka. “Parseval theorem”)
      – Preserve angles (inner-product, correlations)
  • 87. Tutorial | Time-Series with Matlab Orthonormal basis
    - Note that the coefficients xi w.r.t. the basis { e1, …, eN } are the corresponding “similarities” of x to each basis vector/series: xi = x·ei
    [Figure: x = -0.5×e1 + 4×e2 + …, highlighting the coefficient x2 as the projection of x onto e2]
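    A short numerical check that changing to another orthonormal basis preserves lengths and inner products. The basis here is an arbitrary one obtained from a QR factorization, chosen only for illustration; x is the eight-point series from the previous slides and y is a made-up second series.

```matlab
N = 8;
x = [-0.5 4 1.5 -2 2 6 3.5 1]';
y = [1 0 -1 2 3 -2 0.5 4]';

% Columns of Q form an orthonormal basis (Q from any QR factorization is orthogonal)
A = reshape(sin(1:N^2), N, N);
[Q, ~] = qr(A);

cx = Q' * x;                     % coefficients of x w.r.t. the new basis
cy = Q' * y;

orthoErr  = norm(Q' * Q - eye(N));        % basis is orthonormal
lengthErr = abs(norm(cx) - norm(x));      % length is preserved
innerErr  = abs(cx' * cy - x' * y);       % inner product (angle) is preserved
reconErr  = norm(Q * cx - x);             % x is recovered from its coefficients
fprintf('%.2g %.2g %.2g %.2g\n', orthoErr, lengthErr, innerErr, reconErr);
```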
  • 88. Tutorial | Time-Series with Matlab Orthonormal bases  The time-domain basis is a trivial tautology: – Each coefficient is simply the value at one time instant  What other bases may be of interest? Coefficients may correspond to: – Frequency (Fourier) – Time/scale (wavelets) – Features extracted from series collection (PCA)
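  • 89. Tutorial | Time-Series with Matlab Frequency domain “coordinates” (Preview)
    [Figure: the same series x = (-0.5, 4, 1.5, -2, 2, 6, 3.5, 1) written as a sum of eight frequency-domain basis series, with coefficients 5.6, -2.2, 0, 2.8, -4.9, -3, 0, 0.05]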
  • 90. Tutorial | Time-Series with Matlab Time series geometry Summary  Basic concepts: – Series / vector – Mean: “average level” – Variance: “magnitude/length” – Correlation: “similarity”, “distance”, “angle” – Basis: “Cartesian coordinate system”
  • 91. Tutorial | Time-Series with Matlab Time series geometry Preview — Applications  The quest for the right basis…  Compression / pattern extraction – Filtering – Similarity / distance – Indexing – Clustering – Forecasting – Periodicity estimation – Correlation
  • 92. Tutorial | Time-Series with Matlab Overview 1. Introduction and geometric intuition 2. Coordinates and transforms  Fourier transform (DFT)  Wavelet transform (DWT)  Incremental DWT  Principal components (PCA)  Incremental PCA 3. Quantized representations  Piecewise quantized / symbolic  Vector quantization (VQ) / K-means 4. Non-Euclidean distances  Dynamic time warping (DTW)
  • 93. Tutorial | Time-Series with Matlab Frequency  One cycle every 20 time units (period)
  • 94. Tutorial | Time-Series with Matlab Frequency and time
    - Why is the period 20?
    - It’s not 8, because its “similarity” (projection) to a period-8 series (of the same length) is zero.
    [Figure: inner product of the series with a period-8 series = 0]
  • 95. Tutorial | Time-Series with Matlab Frequency and time
    - Why is the cycle 20?
    - It’s not 10, because its “similarity” (projection) to a period-10 series (of the same length) is zero.
    [Figure: inner product of the series with a period-10 series = 0]
  • 96. Tutorial | Time-Series with Matlab Frequency and time
    - Why is the cycle 20?
    - It’s not 40, because its “similarity” (projection) to a period-40 series (of the same length) is zero.
    [Figure: inner product of the series with a period-40 series = 0]
    …and so on
  • 97. Tutorial | Time-Series with Matlab Frequency Fourier transform - Intuition  To find the period, we compared the time series with sinusoids of many different periods  Therefore, a good “description” (or basis) would consist of all these sinusoids  This is precisely the idea behind the discrete Fourier transform – The coefficients capture the similarity (in terms of amplitude and phase) of the series with sinusoids of different periods
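    The comparison with sinusoids of different periods can be sketched as follows. The assumptions here: a pure sine of period 20, a series length of 80 (a whole number of cycles for every candidate period), and zero phase, so that comparing against sines alone is enough. In general both sines and cosines are needed to capture phase.

```matlab
N = 80;
t = (0:N-1)';
x = sin(2*pi*t/20);                  % series with period 20

periods = [8 10 20 40];              % candidate periods to compare against
proj = zeros(size(periods));
for k = 1:numel(periods)
    s = sin(2*pi*t/periods(k));      % sinusoid of the candidate period
    proj(k) = x' * s;                % "similarity" (projection) of x onto s
end

for k = 1:numel(periods)
    fprintf('period %2d: projection = %8.4f\n', periods(k), proj(k));
end
```

    Only the period-20 sinusoid gives a nonzero projection (N/2 = 40 here); the others are zero up to rounding.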
  • 98. Tutorial | Time-Series with Matlab Frequency Fourier transform - Intuition  Technical details: – We have to ensure we get an orthonormal basis – Real form: sines and cosines at N/2 different frequencies – Complex form: exponentials at N different frequencies
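  • 99. Tutorial | Time-Series with Matlab Fourier transform: Real form
    - For odd-length series, $x_t = \frac{a_0}{\sqrt{N}} + \sum_{k=1}^{(N-1)/2} \left[ a_k \sqrt{\tfrac{2}{N}} \cos(2\pi f_k t) + b_k \sqrt{\tfrac{2}{N}} \sin(2\pi f_k t) \right]$, with $f_k = k/N$
    - The pair of bases at frequency fk are $\sqrt{2/N} \cos(2\pi f_k t)$ and $\sqrt{2/N} \sin(2\pi f_k t)$, plus the zero-frequency (mean) component $1/\sqrt{N}$
  • 100. Tutorial | Time-Series with Matlab Fourier transform: Real form, amplitude and phase
    - Observe that, for any fk, we can write $a_k \cos(2\pi f_k t) + b_k \sin(2\pi f_k t) = r_k \cos(2\pi f_k t - \theta_k)$, where ak and bk are the coefficients of the cosine and sine at frequency fk, and $r_k = \sqrt{a_k^2 + b_k^2}$ and $\theta_k = \operatorname{atan2}(b_k, a_k)$ are the amplitude and phase, respectively.
  • 101. Tutorial | Time-Series with Matlab Fourier transform: Real form, amplitude and phase
    - It is often easier to think in terms of amplitude rk and phase θk
    [Figure: an example sinusoid plotted over t = 0 to 80]
  • 102. Tutorial | Time-Series with Matlab Fourier transform: Complex form
    - The equations become easier to handle if we allow the series and the Fourier coefficients Xk to take complex values: $X_k = \frac{1}{\sqrt{N}} \sum_{t=0}^{N-1} x_t e^{-2\pi i k t / N}$ and $x_t = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k e^{2\pi i k t / N}$
    - Matlab note: fft omits the $1/\sqrt{N}$ scaling factor and is not unitary; however, ifft includes a 1/N scaling factor, so ifft(fft(x)) always recovers x (up to floating-point rounding).
  • 103. Tutorial | Time-Series with Matlab Fourier transform: Example
    [Figure: the GBP series approximated using 1, 2, 3, 5, 10, and 20 frequencies]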
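    The Matlab note on fft scaling can be verified directly. In this sketch the unitary coefficients are obtained by dividing the output of fft by sqrt(N), which is an extra step we add and not something fft does for us; the input is the eight-point example series used earlier.

```matlab
x = [-0.5 4 1.5 -2 2 6 3.5 1]';
N = numel(x);

X  = fft(x);                   % unscaled DFT, as returned by Matlab
Xu = X / sqrt(N);              % rescaled so that the transform is unitary

energyTime = sum(x.^2);        % squared length in the time domain
energyFreq = sum(abs(Xu).^2);  % squared length in the frequency domain (Parseval)

xr = real(ifft(X));            % ifft applies the 1/N factor itself
roundTripErr = max(abs(xr - x));

fprintf('energy: time = %.6f, frequency = %.6f\n', energyTime, energyFreq);
fprintf('round-trip error = %.2g, X(1) = %.4f, sum(x) = %.4f\n', ...
        roundTripErr, real(X(1)), sum(x));
```

    Without the division by sqrt(N), the frequency-domain energy comes out N times larger, which matters whenever distances are compared across the two domains.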
  • 104. Tutorial | Time-Series with Matlab Other frequency-based transforms  Discrete Cosine Transform (DCT) – Matlab: dct / idct  Modified Discrete Cosine Transform (MDCT)
  • 105. Tutorial | Time-Series with Matlab Overview 1. Introduction and geometric intuition 2. Coordinates and transforms  Fourier transform (DFT)  Wavelet transform (DWT)  Incremental DWT  Principal components (PCA)  Incremental PCA 3. Quantized representations  Piecewise quantized / symbolic  Vector quantization (VQ) / K-means 4. Non-Euclidean distances  Dynamic time warping (DTW)
  • 106. Tutorial | Time-Series with Matlab Frequency and time
    [Figure: a series whose inner products with a period-20 series and with a period-10 series are both nonzero, etc…]
    - What is the cycle now?
    - No single cycle, because the series isn’t exactly similar to any single periodic series of the same length.
  • 107. Tutorial | Time-Series with Matlab Frequency and time  Fourier is successful for summarization of series with a few, stable periodic components  However, content is “smeared” across frequencies when there are – Frequency shifts or jumps, e.g., – Discontinuities (jumps) in time, e.g.,
  • 108. Tutorial | Time-Series with Matlab Frequency and time  If there are discontinuities in time/frequency or frequency shifts, then we should seek an alternate “description” or basis  Main idea: Localize bases in time – Short-time Fourier transform (STFT) – Discrete wavelet transform (DWT)
  • 109. Tutorial | Time-Series with Matlab Frequency and time Intuition  What if we examined, e.g., eight values at a time?
  • 110. Tutorial | Time-Series with Matlab Frequency and time Intuition  What if we examined, e.g., eight values at a time?  Can only compare with periods up to eight. – Results may be different for each group (window)
  • 111. Tutorial | Time-Series with Matlab Frequency and time Intuition  Can “adapt” to localized phenomena  Fixed window: short-window Fourier (STFT) – How to choose window size?  Variable windows: wavelets
  • 112. Tutorial | Time-Series with Matlab Wavelets Intuition  Main idea – Use small windows for small periods • Remove high-frequency component, then – Use larger windows for larger periods • Twice as large – Repeat recursively  Technical details – Need to ensure we get an orthonormal basis
  • 113. Tutorial | Time-Series with Matlab Wavelets Intuition Scale (frequency) Frequency Time Time
  • 114. Tutorial | Time-Series with Matlab Wavelets Intuition — Tiling time and frequency Scale (frequency) Frequency Frequency Time Time Fourier, DCT, … STFT Wavelets
  • 115. Tutorial | Time-Series with Matlab Wavelet transform Pyramid algorithm High pass Low pass
  • 116. Tutorial | Time-Series with Matlab Wavelet transform Pyramid algorithm High pass Low pass
  • 117. Tutorial | Time-Series with Matlab Wavelet transform Pyramid algorithm High pass Low pass
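  • 118. Tutorial | Time-Series with Matlab Wavelet transform: Pyramid algorithm
    [Figure: filter cascade. x ≡ w0 goes through a high-pass filter to give w1 and a low-pass filter to give v1; v1 goes through high pass to give w2 and low pass to give v2; v2 goes through high pass to give w3 and low pass to give v3]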
  • 119. Tutorial | Time-Series with Matlab Wavelet transforms General form  A high-pass / low-pass filter pair – Example: pairwise difference / average (Haar) – In general: Quadrature Mirror Filter (QMF) pair • Orthogonal spans, which cover the entire space – Additional requirements to ensure orthonormality of overall transform…  Use to recursively analyze into top / bottom half of frequency band
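    A minimal sketch of the pyramid algorithm for the Haar case, written in base Matlab so that it does not depend on the Wavelet Toolbox. Assumptions: the series length is a power of two, the filter pair is the scaled pairwise difference and average, and the sign convention of the difference (first minus second) is our choice. Variable names are illustrative.

```matlab
x = [-0.5 4 1.5 -2 2 6 3.5 1];       % length must be a power of two here

v = x;                               % current smooth (low-pass) series
w = {};                              % detail (high-pass) coefficients per level
level = 0;
while numel(v) > 1
    level = level + 1;
    first  = v(1:2:end);             % 1st, 3rd, 5th, ... values
    second = v(2:2:end);             % 2nd, 4th, 6th, ... values
    w{level} = (first - second) / sqrt(2);   % high pass: pairwise difference
    v        = (first + second) / sqrt(2);   % low pass: pairwise average
end

% Full transform: final smooth value, then details from coarsest to finest
coeffs = [v, w{end:-1:1}];

fprintf('levels = %d, energy in = %.4f, energy out = %.4f\n', ...
        level, sum(x.^2), sum(coeffs.^2));
```

    The division by sqrt(2) at each level is what keeps the overall transform orthonormal, so the energy of the coefficients equals the energy of the series.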
  • 120. Tutorial | Time-Series with Matlab Wavelet transforms: Other filters, examples
    - Haar (Daubechies-1), Daubechies-2, Daubechies-3, Daubechies-4
    - Going down this list: better frequency isolation, worse time localization
    - For each: wavelet filter, or mother filter (high-pass); scaling filter, or father filter (low-pass)
    [Figure: plots of the wavelet and scaling filters for each of the four families]
  • 121. Tutorial | Time-Series with Matlab Wavelets: Example
    [Figure: wavelet coefficients of the GBP series, Haar (left) and Daubechies-3 (right); panels GBP, W1 to W6, and V6]
  • 122. Tutorial | Time-Series with Matlab Wavelets: Example
    [Figure: multi-resolution analysis of the GBP series, Haar (left) and Daubechies-3 (right); panels GBP, D1 to D6, and A6, all over the full time axis]
  • 123. Tutorial | Time-Series with Matlab Wavelets: Example
    [Figure: the same multi-resolution analysis of the GBP series (Haar and Daubechies-3), with callouts]
    - Analysis levels are orthogonal, Di·Dj = 0, for i ≠ j
    - Haar analysis: simple, piecewise constant
    - Daubechies-3 analysis: less artifacting
  • 124. Tutorial | Time-Series with Matlab Wavelets Matlab  Wavelet GUI: wavemenu  Single level: dwt / idwt  Multiple level: wavedec / waverec – wmaxlev  Wavelet bases: wavefun
  • 125. Tutorial | Time-Series with Matlab Other wavelets  Only scratching the surface…  Wavelet packets – All possible tilings (binary) – Best-basis transform  Overcomplete wavelet transform (ODWT), aka. maximum-overlap wavelets (MODWT), aka. shift- invariant wavelets Further reading: 1. Donald B. Percival, Andrew T. Walden, Wavelet Methods for Time Series Analysis, Cambridge Univ. Press, 2006. 2. Gilbert Strang, Truong Nguyen, Wavelets and Filter Banks, Wellesley College, 1996. 3. Tao Li, Qi Li, Shenghuo Zhu, Mitsunori Ogihara, A Survey of Wavelet Applications in Data Mining, SIGKDD Explorations, 4(2), 2002.
  • 126. Tutorial | Time-Series with Matlab More on wavelets
    - Signal representation and compressibility
    [Figure: partial energy plots for the GBP and Light series; x-axis: compression (% coefficients), y-axis: quality (% energy); curves: Time, FFT, Haar, DB3]
  • 127. Tutorial | Time-Series with Matlab More wavelets
    - Keeping the largest-magnitude coefficients minimizes total error (L2-distance)
    - Other coefficient selection/thresholding schemes for different error metrics (e.g., maximum per-instant error, or L1-dist.)
      – Typically use Haar bases
    Further reading: 1. Minos Garofalakis, Amit Kumar, Wavelet Synopses for General Error Metrics, ACM TODS, 30(4), 2005. 2. Panagiotis Karras, Nikos Mamoulis, One-pass Wavelet Synopses for Maximum-Error Metrics, VLDB 2005.
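    The claim about the largest coefficients holds for any orthonormal basis, because the squared L2 error equals the sum of squares of the dropped coefficients. The sketch below illustrates this with a hand-built orthonormal DCT-II basis (our choice, to stay in base Matlab) and compares "keep the k largest" against "keep the first k"; the test signal and k are made up.

```matlab
N = 64;
t = (0:N-1)';
x = sin(2*pi*t/16) + 0.5*sin(2*pi*t/4);

% Orthonormal DCT-II basis: column j+1 is a cosine at frequency index j
Q = cos(pi * (t + 0.5) * (0:N-1) / N);
Q(:,1)     = Q(:,1) / sqrt(N);
Q(:,2:end) = Q(:,2:end) * sqrt(2/N);

c = Q' * x;                                  % coefficients w.r.t. the basis
k = 8;                                       % number of coefficients to keep

[~, order] = sort(abs(c), 'descend');
cLargest = zeros(N,1);  cLargest(order(1:k)) = c(order(1:k));
cFirst   = zeros(N,1);  cFirst(1:k)          = c(1:k);

errLargest = norm(x - Q*cLargest);           % keep the k largest
errFirst   = norm(x - Q*cFirst);             % keep the first k
dropped    = sqrt(sum(c(order(k+1:end)).^2));% energy of what was dropped

fprintf('error (k largest) = %.4f, error (first k) = %.4f\n', errLargest, errFirst);
```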
  • 128. Tutorial | Time-Series with Matlab Overview 1. Introduction and geometric intuition 2. Coordinates and transforms  Fourier transform (DFT)  Wavelet transform (DWT)  Incremental DWT  Principal components (PCA)  Incremental PCA 3. Quantized representations  Piecewise quantized / symbolic  Vector quantization (VQ) / K-means 4. Non-Euclidean distances  Dynamic time warping (DTW)
  • 129. Tutorial | Time-Series with Matlab Wavelets Incremental estimation
  • 130. Tutorial | Time-Series with Matlab Wavelets Incremental estimation
  • 131. Tutorial | Time-Series with Matlab Wavelets Incremental estimation
  • 132. Tutorial | Time-Series with Matlab Wavelets Incremental estimation
  • 133. Tutorial | Time-Series with Matlab Wavelets Incremental estimation
  • 134. Tutorial | Time-Series with Matlab Wavelets Incremental estimation post-order traversal
  • 135. Tutorial | Time-Series with Matlab Wavelets Incremental estimation  Forward transform : – Post-order traversal of wavelet coefficient tree – O(1) time (amortized) – O(logN) buffer space (total) constant factor: filter length  Inverse transform: – Pre-order traversal of wavelet coefficient tree – Same complexity
  • 136. Tutorial | Time-Series with Matlab Overview 1. Introduction and geometric intuition 2. Coordinates and transforms  Fourier transform (DFT)  Wavelet transform (DWT)  Incremental DWT  Principal components (PCA)  Incremental PCA 3. Quantized representations  Piecewise quantized / symbolic  Vector quantization (VQ) / K-means 4. Non-Euclidean distances  Dynamic time warping (DTW)
  • 137. Tutorial | Time-Series with Matlab Time series collections Overview  Fourier and wavelets are the most prevalent and successful “descriptions” of time series.  Next, we will consider collections of M time series, each of length N. – What is the series that is “most similar” to all series in the collection? – What is the second “most similar”, and so on…
  • 138. Tutorial | Time-Series with Matlab Time series collections  Some notation: values at time t, xt i-th series, x(i)
  • 139. Tutorial | Time-Series with Matlab Principal Component Analysis: Example
    [Figure, left: exchange rates (vs. USD) over time for AUD, BEF, CAD, FRF, DEM, JPY, NLG, NZL, ESP, SEK, CHF, GBP]
    [Figure, right: principal components 1-4 (µ ≠ 0), u1 to u4, with cumulative energy 48%, + 33% = 81%, + 11% = 92%, + 4% = 96%]
    - “Best” basis: { u1, u2, u3, u4 }
    - x(2) = 49.1u1 + 8.1u2 + 7.8u3 + 3.6u4 + ε
    - Coefficients of each time series w.r.t. basis { u1, u2, u3, u4 }
  • 140. Tutorial | Time-Series with Matlab Principal component analysis
    [Figure: scatter plot of the first two principal component coefficients (υi,1 vs. υi,2) of each currency: CAD, ESP, SEK, GBP, AUD, FRF, BEF, NZL, CHF, NLG, DEM, JPY]
  • 141. Tutorial | Time-Series with Matlab Principal Component Analysis: Matrix notation, Singular Value Decomposition (SVD)
    X = UΣV^T
    - X = [ x(1) x(2) … x(M) ]: time series (columns)
    - U = [ u1 u2 … uk ]: basis for time series
    - ΣV^T = [ υ1 υ2 υ3 … υM ]: coefficients w.r.t. basis in U
  • 142. Tutorial | Time-Series with Matlab Principal Component Analysis: Matrix notation, Singular Value Decomposition (SVD)
    X = UΣV^T
    - X = [ x(1) x(2) … x(M) ]: time series (columns)
    - U = [ u1 u2 … uk ]: basis for time series
    - ΣV^T, with columns υ1, υ2, υ3, … (coefficients w.r.t. basis in U) and rows v’1, v’2, …, v’k: basis for measurements (rows)
  • 143. Tutorial | Time-Series with Matlab Principal Component Analysis: Matrix notation, Singular Value Decomposition (SVD)
    X = UΣV^T
    - X = [ x(1) x(2) … x(M) ]: time series (columns)
    - U = [ u1 u2 … uk ]: basis for time series
    - Σ = diag(σ1, σ2, …, σk): scaling factors
    - V^T, with rows v1, v2, …, vk: basis for measurements (rows)
  • 144. Tutorial | Time-Series with Matlab Principal component analysis: Properties, Singular Value Decomposition (SVD)
    - V are the eigenvectors of the covariance matrix X^T X, since $X^T X = V \Sigma U^T U \Sigma V^T = V \Sigma^2 V^T$
    - U are the eigenvectors of the Gram (inner-product) matrix X X^T, since $X X^T = U \Sigma V^T V \Sigma U^T = U \Sigma^2 U^T$
    Further reading: 1. Ian T. Jolliffe, Principal Component Analysis (2nd ed), Springer, 2002. 2. Gilbert Strang, Linear Algebra and Its Applications (4th ed), Brooks Cole, 2005.
  • 145. Tutorial | Time-Series with Matlab Kernels and KPCA
    - What are kernels?
      – Usual definition of inner product w.r.t. vector coordinates is x·y = ∑i xiyi
      – However, other definitions that preserve the fundamental properties are possible
    - Why kernels?
      – We no longer have explicit “coordinates” • Objects do not even need to be numeric
      – But we can still talk about distances and angles
      – Many algorithms rely just on these two concepts
    [Figure: exchange rates (SEK, ESP, GBP, CAD, AUD, NZL, FRF, BEF, DEM, NLG, CHF, JPY) plotted as points]
    Further reading: 1. Bernhard Schölkopf, Alexander J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond, MIT Press, 2001.
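    The SVD properties can be checked on a small collection. This sketch assumes time series are stored as columns of X (as in the matrix-notation slides), centers each column first, and uses made-up series; the "energy" curve mirrors the cumulative percentages shown in the exchange-rate example.

```matlab
N = 50;  M = 4;
t = (1:N)';
X = [sin(2*pi*t/25), ...
     2*sin(2*pi*t/25) + 0.1*cos(2*pi*t/5), ...
     cos(2*pi*t/10), ...
     t/N];                               % M time series of length N, as columns
X = X - repmat(mean(X), N, 1);           % zero mean for each series

[U, S, V] = svd(X, 'econ');              % X = U*S*V'
sigma  = diag(S);
energy = cumsum(sigma.^2) / sum(sigma.^2);   % cumulative energy per component

reconErr = norm(X - U*S*V', 'fro');          % decomposition reproduces X
eigErrV  = norm(X'*X*V - V*S^2, 'fro');      % V: eigenvectors of X'X
eigErrU  = norm(X*X'*U - U*S^2, 'fro');      % U: eigenvectors of XX'

coeffs = S * V';                         % column i: coefficients of series i w.r.t. U
disp(energy');
```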

Editor's Notes

  1. Nice synopsis of what we can achieve through the use of Matlab. Manipulate, analyse and visualize data. Pinpoint errors and correct them
  2. 4 options. Columns or row next to each other or below one another
  3. Solid line, dashed line, dotted line, etc
  4. 4 attributes or fields
  5. Never again coredump
  6. After you exhaust the 8000 built-in Matlab commands…
  7. “Will consider finite (at any given time), although in streaming context, N grows”
  8. “Note that the number of coefficients is still eight…”
  9. …easier for interpretation, not for algebraic manipulation. But, algebraic, even easier with complex form (next slide)
  10. Callouts: “bases are zero outside window boundaries”
  11. Say about relationship (or lack thereof) between window “size” and filter length…
  12. Export setup: 6 x 5 in (expand axes)
  13. Explain “MRA” in words: reconstruction using the coefficients *only* from that level Export setup: 6 x 5 in (expand axes)
  14. Export setup: 6 x 5 in (expand axes)
  15. PE plot export: 4x4in (expand) Inset export: in (expand)
  16. Previous slide: more from signal-processing – this slide is DB-specific
  17. Export setup for t.s. plot: 5x7 in (expand axes)
  18. First: what exactly do we mean by “correlation”? (Answer: linear correlations)
  19. So: all we have to do, is estimate the slope. Starting with the first two points, this is really very easy and fast.
  20. We are “lucky” so far. Next: what happens when we have to update the slope.
  21. Answer: rotate the slope to “fix” the error. [Unanswered question: rotate around *which* point?] This is a simple vector addition (and re-normalization) -> O(n) very simple operations
  22. Mention that this converges assuming no “drifts” (technically: stationarity).
  23. Done with intuition, now give real names.
  24. Just point out very special case (but.. APCA more elaborate time segmentation…)
  25. 1. Why variable-length segmentation is good (if goal is piecewise-constant) 2. Also shows weakness of Haar… APCA-21: 24% RMS error Haar (lv 7): 44% RMS error
  26. 1. Why variable-length segmentation is good (if goal is piecewise-constant) 2. Also shows weakness of Haar… APCA-21: 24% RMS error Haar (lv 7): 44% RMS error
  27. APCA-15: 27% RMS error DB3 (lv-7): 38% RMS error
  28. Show case k=2 (for which the equivalence is exact). First, two clusters are always separable on the 1st PC (i.e., reduces to 1-D problem, easy). Furthermore, related objectives: K-means: minimize green length; PCA: minimize red length (or, equivalently, angle). For k > 2, things get more complicated – see reference
  29. Say: gray cells are prefix subsequences – we use only these in recursive definition/estimation
  30. This is a sketch of the idea… Works like this under certain(?) “smoothness” conditions (may have to look at all four sub-rectangles separately, lb property does not have to guarantee “inclusion”…)
  31. (+)No need to know anything about the distance. Just pairwise distances
  32. Distance functions that are robust to outliers or to extremely noisy data will typically violate the triangular inequality. These functions achieve this by not considering the most dissimilar parts of the objects. These functions are extremely useful, because they represent an accurate model of the human perception, since when comparing any kind of data (images, time-series etc), we mostly focus on the portions that are similar and we are willing to pay less attention to regions of great dissimilarity.
  33. Nx1 vector. It does not show the compression, but it does show the quality of the approximation
  34. Trajectory data in other applications too.
  35. All of these applications are a different application, a different twist of similarity measures and similarity matching.