TimeSeries Technical Presentation




Jacques Roy




April 6, 2011                          © 2010 IBM Corporation
Agenda
■    What is TimeSeries
■    Why TimeSeries
■    Components
■    Usage
    “Give me the Jan 1st element from time series ‘X’”


    Most useful when a range of data is normally read

      “Give me the Jan 1st through Jan 10th elements from time series ‘X’”


    Access to one time series is usually completed before moving to
    the next time series.
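
The two access patterns above (a single element, or a contiguous range from one series) can be sketched in a few lines of Python. This is only an illustration of why ordered storage suits these reads: the in-memory list, the helper names `get_elem` and `read_range`, and the sample values are all assumptions, not the Informix API.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Hypothetical series "X": (date, value) pairs kept sorted by date.
series_x = [(date(2010, 1, d), 10.0 + d) for d in range(1, 32)]
stamps = [stamp for stamp, _ in series_x]

def get_elem(day):
    """Return the element stored for one day, or None if there is none."""
    i = bisect_left(stamps, day)
    if i < len(stamps) and stamps[i] == day:
        return series_x[i]
    return None

def read_range(first, last):
    """Return the contiguous run of elements with first <= date <= last."""
    return series_x[bisect_left(stamps, first):bisect_right(stamps, last)]

print(get_elem(date(2010, 1, 1)))
print(len(read_range(date(2010, 1, 1), date(2010, 1, 10))))
```

Because the elements are ordered, a range read is one binary search followed by a sequential slice, which is the pattern the slide describes as most useful.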

Challenges Managing Time Series Data
■   Slow Performance
      – Extremely slow data access, especially for ordered sets of rows, due to
        the data layout and disk I/O
      – Operations hard or impossible to do in standard SQL

■   High Storage Requirements
      – Time series are usually stored as "tall, thin" tables with a very large
        number of rows
      – May need one index to enforce uniqueness and another for index-only
        reads, so more space is used for indexes than for data
      – Huge space requirements in standard relational layout, due to the
        volume of data

■   Complex Querying
      – Can be difficult to write SQL to work with the data
Informix Solution
■   TimeSeries Data Type: Native time series support
■   Store time series elements as an ordered set of elements
    – Uses less space because the "key" is factored out and the time field
      takes either 0 (for regular) or 11 (for irregular) bytes
    – Access is faster than index-only-read
    – SQL can be made much simpler
■   Freedom to manage time series data:
    – Freedom to choose what and how it is stored
    – Freedom to choose the time series interval
    – Freedom to choose where the time series is stored

MeterId   Reading
 1001     2010-01-01,daily,{(12.34,12567),(12.56,9000),(12.34,55567),..}
 2001     2010-05-05,daily,{(199.08,6780),(198.55,3400),(198.12,250),..}
 2011     2010-09-01,daily,{(9.34,8067),(9.56,9000),(9.40,10780),..}

No other RDBMS has native time series support


Key Strengths of Informix TimeSeries
    Performance
     – Extremely fast data access: Data clustered on disk to reduce I/O
     – Provides very high degree of parallelism on reads and writes
     – Provides continuous loading of data with minimal impact on concurrent
       queries

    Space Savings
     – Provides high level of compression
     – Space savings can exceed 50% over a standard relational layout

    Usability
     – Time series tool kit allows custom analytics to be written
     – Handles operations hard or impossible to do in standard SQL
     – Conceptually closer to how users think of time series
     – No other RDBMS has native time series support

Smart Meters Data: Schema Example
Relational Schema (primary key: mtr_id, date)

    mtr_id   date     Col1       Col2       ...   ColN
      1      Mon      Value 1    Value 2    ...   Value N
      1      Tue      Value 1    Value 2    ...   Value N
      1      Wed      Value 1    Value 2    ...   Value N
     ...     ...      ...        ...        ...   ...
     13      Mon      Value 1    Value 2    ...   Value N
     13      Tue      Value 1    Value 2    ...   Value N
     13      Wed      Value 1    Value 2    ...   Value N
     ...     ...      ...        ...        ...   ...

Above schema using Informix TimeSeries

    mtr_id   Series
    (int)    timeseries(mtr_data)
      1      [(Mon, v1, ...)(Tue,v1…)]
      2      [(Mon, v1, ...)(Tue,v1…)]
      3      [(Mon, v1, ...)(Tue,v1…)]
      4      [(Mon, v1, ...)(Tue,v1…)]
      …      …



    Save space and increase performance through faster data access with Informix
TimeSeries Space Savings Example
●   TimeSeries data type takes much less space than traditional relational storage


        – Proof of concept example:
           • Regular TimeSeries, 15 minute interval
           • Relational database used ~ 1TB (1000GB)
           • Informix used ~340GB



    The reason for this is:
        – The TimeSeries does not repeat data
              • MeterID: 4 bytes per reading
              • TimeStamp: could be 12 bytes per reading
              • Assuming an 8-byte reading, that is ~66% savings
                   • 3x less storage!
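
The arithmetic behind the ~66% figure can be checked with a short Python sketch. It uses only the per-reading sizes quoted on the slide and, as a simplifying assumption, ignores row overhead, indexes, and the TimeSeries header; the variable names are illustrative.

```python
# Per-reading sizes from the slide (the 12-byte timestamp is the slide's "could be" figure).
METER_ID_BYTES = 4
TIMESTAMP_BYTES = 12
READING_BYTES = 8

# Relational layout: every row repeats the meter id and the timestamp.
relational_row = METER_ID_BYTES + TIMESTAMP_BYTES + READING_BYTES

# Regular TimeSeries: the key is factored out and no timestamp is stored per element.
timeseries_elem = READING_BYTES

savings = 1 - timeseries_elem / relational_row
ratio = relational_row / timeseries_elem

print(f"relational: {relational_row} B per reading, TimeSeries: {timeseries_elem} B per reading")
print(f"savings: {savings:.1%}, ratio: {ratio:.0f}x")
```

The proof-of-concept numbers on the same slide agree with this estimate: going from ~1000 GB to ~340 GB is a 66% reduction.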



[Chart: Data Storage Comparison for 1 million meters]



TimeSeries Performance

    Performance
     – Faster access to sets of data
              • Ordered data
     – Much faster combining of time series
     – For data loading into TimeSeries, Informix outperforms the nearest
       competition by more than 30 times
     – For report generation from TimeSeries, Informix outperforms the
       nearest competition by more than 90 times

[Chart: Performance Comparison for Data Loads and Reports for 1 Million Meters]
Who’s Interested in TimeSeries

     Energy: smart meters
     Capital Markets
     – Arbitrage opportunities, breakout signals, risk/return optimization,
       portfolio management, VaR calculations, simulations, backtesting...
     Telecommunications:
     – Network monitoring, load prediction, blocked calls (lost revenue)
       from load, phone usage, fraud detection and analysis...
     Manufacturing:
     – Machinery going out of spec; process sampling and analysis
     Logistics:
     – Location of a fleet (e.g. GPS); route analysis
     Scientific research:
     – Temperature over time...



10                    Informix Dynamic Server, TimeSeries DataBlade Module class   © 2007 IBM Corporation
TimeSeries: Key Concepts
      ■   Containers
            – Specialized storage for TimeSeries
              EXECUTE PROCEDURE
                 TSContainerCreate('raw_container', 'rootdbs',
                                          'meter_data', 100, 50);
      ■   Timeseries data element: row type
            – Flexibility to define as many parts as needed
              CREATE ROW TYPE meter_data (
                   tstamp datetime year to fraction(5),
                   value decimal(14,3)
              );
      ■   Timeseries types: regular, irregular
            – Covers regular intervals and sparse data distribution
      ■   Calendar
            – Defines business patterns



Features Unique to Regular TimeSeries

     Only one element per “on” interval
     Value "persists" to end of interval
     An element for an “on” interval may be missing; the entire
     element is then NULL
     Calendar determines the offset in the TimeSeries of a given time point
     Elements can be accessed by offset or time point
     Time point is not stored; it is calculated from the header plus date/time
     arithmetic
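
The offset arithmetic described above can be sketched as follows. This is a simplified model, not the server's implementation: it assumes a 15-minute calendar in which every interval is "on", and the origin, interval, function names, and sample values are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

ORIGIN = datetime(2010, 11, 10)     # assumed origin held in the series header
INTERVAL = timedelta(minutes=15)    # assumed calendar interval, every interval "on"

def offset_to_time(offset):
    """Time point of the element at a given offset (never stored, only computed)."""
    return ORIGIN + offset * INTERVAL

def time_to_offset(ts):
    """Offset of the interval containing ts; a value persists to the end of its interval."""
    return (ts - ORIGIN) // INTERVAL

# Elements hold only values; None stands for a missing (NULL) element.
values = [1.25, 1.30, None, 1.42]

print(offset_to_time(3))
print(values[time_to_offset(datetime(2010, 11, 10, 0, 20))])
```

With an "off" interval in the calendar, the mapping would have to skip those intervals, which is why the calendar, and not a fixed step alone, determines the offset.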




Features Unique to Irregular TimeSeries

     Data can be entered at any time point within a valid "on"
     interval
     Elements persist until the next element
     No NULL elements
     Elements can only be accessed by time
     No duplicate time points allowed
     If an element already exists at the given time point, either an error is
     raised or a unique time point is found:
      – round the time point up to the nearest second
      – search back for the first element
      – add 10 microseconds; this is the new time point
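
The slide gives the three steps only in outline, so the Python sketch below is one possible reading of them, not a description of the server's actual behaviour. It assumes that "search back" means finding the latest existing element at or before the rounded-up second, and it does not handle the edge case where the new time point would cross into the next second. The function names are hypothetical; the sample timestamps are the ones from the "Populating a TimeSeries" example in this deck.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

STEP = timedelta(microseconds=10)   # the 10-microsecond increment named on the slide

def round_up_to_second(ts):
    """Step 1: round a time point up to the next whole second."""
    if ts.microsecond == 0:
        return ts
    return ts.replace(microsecond=0) + timedelta(seconds=1)

def unique_time_point(stamps, ts):
    """Return a free time point for ts; stamps is the sorted list of existing ones."""
    if ts not in stamps:
        return ts                       # no duplicate, nothing to resolve
    bound = round_up_to_second(ts)      # step 1
    i = bisect_right(stamps, bound)     # stamps[:i] are all <= bound
    last = stamps[i - 1]                # step 2 (assumed meaning): latest element before bound
    return last + STEP                  # step 3

existing = [
    datetime(2007, 4, 3, 6, 30, 3, 30),     # 06:30:03.00003
    datetime(2007, 4, 3, 6, 30, 3, 1190),   # 06:30:03.00119
]
print(unique_time_point(existing, datetime(2007, 4, 3, 6, 30, 3, 30)))
```

Under this reading, a duplicate is placed just after the last element already recorded within the same second.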




Accessing Timeseries

     Access through standard tabular view
     – Makes TimeSeries look like a standard relational table

     SQL Functions
     – 103 functions

     Customized functions
     – Written in Stored Procedure Language (SPL), “C”, Java
     – 65 “C” functions




TimeSeries Header

     A TimeSeries needs information that sets its context:
      – Calendar: Time period where data is found

      – Origin: Time origin of the TimeSeries

      – Threshold: in-row storage threshold

      – Container: where to store the out-of-row data

      – Metadata: optional data added by the TimeSeries creator




Calendar and Calendar Patterns
       A calendar pattern is needed before we can create a calendar:
     INSERT INTO CalendarPatterns
       VALUES('day', '{1 on, 2 off, 4 on}, day' );

       A Calendar defines a set of valid times at which the TimeSeries can record
       data. (July 8, 2005 is a Friday)
     INSERT INTO CalendarTable(c_name, c_calendar)
       VALUES('calday',
       'startdate(2005-07-08 00:00:00.00000),
        pattstart(2005-07-08 00:00:00.00000),
        pattname(day)' );

       You can provide a pattern explicitly:
     INSERT INTO CalendarTable(c_name, c_calendar)
       VALUES('weekcal',
       'startdate(2005-07-08 00:00:00.00000),
        pattstart(2005-07-08 00:00:00.00000),
        pattern({1 on, 2 off, 4 on}, day)' );
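
To see which days the pattern above marks as valid, the run-length notation can be expanded in Python. This is only an illustration of how `{1 on, 2 off, 4 on}, day` repeats from the pattern start date; the function names and the list-of-pairs encoding are assumptions, and the real calendar is evaluated by the server.

```python
from datetime import date, timedelta
from itertools import cycle

PATTERN = [(1, True), (2, False), (4, True)]   # {1 on, 2 off, 4 on}, day
PATT_START = date(2005, 7, 8)                  # pattstart from the slide (a Friday)

def expand(pattern):
    """Flatten run-length pairs into one on/off flag per interval."""
    return [flag for count, flag in pattern for _ in range(count)]

def valid_days(n):
    """Pair each of the first n days from PATT_START with its on/off flag."""
    flags = cycle(expand(PATTERN))             # the pattern repeats indefinitely
    days = (PATT_START + timedelta(days=i) for i in range(n))
    return list(zip(days, flags))

for day, on in valid_days(7):
    print(day.isoformat(), "on" if on else "off")
```

Starting on a Friday, the seven-day cycle is on for Friday, off for Saturday and Sunday, and on for Monday through Thursday, so this pattern describes a weekday calendar.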

TimeSeries: Table

      A TimeSeries resides in a table:

     CREATE TABLE ts_data (
          loc_esi_id   char(20) NOT NULL,
          measure_unit varchar(10) NOT NULL,
          direction    char(1) NOT NULL,
          multiplier   TimeSeries(meter_data),
          raw_reads    timeseries(meter_data),
          PRIMARY KEY(loc_esi_id, measure_unit, direction)
     ) LOCK MODE ROW;




Populating a TimeSeries

       A TimeSeries must first be created:
         INSERT INTO taqtrade_day
         VALUES("IBM.N",
         TSCreate('calday', '2005-07-08 00:00:00.00000',
           20, 0, 0, 'taqtrade_day')
         );
       It can be created through the input function:
          INSERT INTO taqtrade
          VALUES("AA.N",
           'irregular, container(taqtrade),
              origin(2007-04-03 06:30:00.00000),
              calendar(calsec),
           [(4.48, . . .)@2007-04-03 06:30:03.00003,
             (4.50,. . .)@2007-04-03 06:30:03.00119,
              . . .]'
          );
The Virtual Table Interface

     Makes a TimeSeries look like a table:
      EXECUTE PROCEDURE
        TSCreateVirtualTab('ts_data_v', 'ts_data',
             'origin(2010-11-10 00:00:00.00000),
              calendar(cal15min),container(raw_container),
              threshold(0), regular',
          0, 'raw_reads');

     Virtual table created:
       CREATE TABLE ts_data_v            (
          loc_esi_id                     char(20),
          measure_unit                   varchar(10,0),
          direction                      char(1),
          tstamp                         datetime year to fraction(5),
          value                          decimal(14,3)
       );
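
Conceptually, the virtual table turns each element of a series into one row that carries the key columns of the base table. The Python sketch below models that flattening for a regular 15-minute series. It is not how the server implements the interface; the sample row, the `virtual_rows` name, and the choice to skip missing elements are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical base-table row: key columns plus one regular 15-minute series.
ts_data = [
    {
        "loc_esi_id": "A1",
        "measure_unit": "kWh",
        "direction": "R",
        "origin": datetime(2010, 11, 10),
        "raw_reads": [0.125, 0.250, None, 0.500],   # None = missing element
    },
]

def virtual_rows(table, interval=timedelta(minutes=15)):
    """Yield (loc_esi_id, measure_unit, direction, tstamp, value) per element."""
    for row in table:
        for offset, value in enumerate(row["raw_reads"]):
            if value is None:
                continue    # assumption of this sketch: missing elements produce no row
            tstamp = row["origin"] + offset * interval
            yield (row["loc_esi_id"], row["measure_unit"], row["direction"], tstamp, value)

for r in virtual_rows(ts_data):
    print(r)
```

The resulting tuples have the same columns as `ts_data_v`, which is what lets standard SQL operate on the series.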

Quick Review
     A TimeSeries resides in a container
     – The container resides in a dbspace
     – The container is for a specific element type (row type)
     – A container is for either a regular or irregular TimeSeries (not both)
     – A container can contain multiple TimeSeries

     A TimeSeries requires a calendar
     – Defines when the data starts and a pattern of valid times

     TimeSeries data is defined as a row type
     – Defines the values tracked

     You can operate on TimeSeries through special SQL functions or
     use the virtual table interface and standard SQL


DEMO


More Related Content

Viewers also liked

Viewers also liked (20)

Chris Sharman-SPEDDEXES 2014
Chris Sharman-SPEDDEXES 2014Chris Sharman-SPEDDEXES 2014
Chris Sharman-SPEDDEXES 2014
 
Environment Canada's Data Management Service
Environment Canada's Data Management ServiceEnvironment Canada's Data Management Service
Environment Canada's Data Management Service
 
Event sourcing with the GetEventStore
Event sourcing with the GetEventStoreEvent sourcing with the GetEventStore
Event sourcing with the GetEventStore
 
Io t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moeIo t world_2016_iot_smart_gateways_moe
Io t world_2016_iot_smart_gateways_moe
 
Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...Scaling ingest pipelines with high performance computing principles - Rajiv K...
Scaling ingest pipelines with high performance computing principles - Rajiv K...
 
Choosing the right platform for your Internet -of-Things solution
Choosing the right platform for your Internet -of-Things solutionChoosing the right platform for your Internet -of-Things solution
Choosing the right platform for your Internet -of-Things solution
 
Why Smart Meters Need Informix TimeSeries
Why Smart Meters Need Informix TimeSeriesWhy Smart Meters Need Informix TimeSeries
Why Smart Meters Need Informix TimeSeries
 
Need for Time series Database
Need for Time series DatabaseNeed for Time series Database
Need for Time series Database
 
Titan and Cassandra at WellAware
Titan and Cassandra at WellAwareTitan and Cassandra at WellAware
Titan and Cassandra at WellAware
 
SignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series DatabaseSignalFx: Making Cassandra Perform as a Time Series Database
SignalFx: Making Cassandra Perform as a Time Series Database
 
Deep dive into event store using Apache Cassandra
Deep dive into event store using Apache CassandraDeep dive into event store using Apache Cassandra
Deep dive into event store using Apache Cassandra
 
Why Gateways are Important in Your IoT Architecture
Why Gateways are Important in Your IoT ArchitectureWhy Gateways are Important in Your IoT Architecture
Why Gateways are Important in Your IoT Architecture
 
Interactive analytics at scale with druid
Interactive analytics at scale with druidInteractive analytics at scale with druid
Interactive analytics at scale with druid
 
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandracodecentric AG: CQRS and Event Sourcing Applications with Cassandra
codecentric AG: CQRS and Event Sourcing Applications with Cassandra
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
ML on Big Data: Real-Time Analysis on Time Series
ML on Big Data: Real-Time Analysis on Time SeriesML on Big Data: Real-Time Analysis on Time Series
ML on Big Data: Real-Time Analysis on Time Series
 
Time series slideshare
Time series slideshareTime series slideshare
Time series slideshare
 
Using spark for timeseries graph analytics
Using spark for timeseries graph analyticsUsing spark for timeseries graph analytics
Using spark for timeseries graph analytics
 
time series analysis
time series analysistime series analysis
time series analysis
 
Stream Computing & Analytics at Uber
Stream Computing & Analytics at UberStream Computing & Analytics at Uber
Stream Computing & Analytics at Uber
 

Similar to Ugif 04 2011 france ug04042011-jroy_ts

Ugif 12 2011-france ug12142011-tech_ts
Ugif 12 2011-france ug12142011-tech_tsUgif 12 2011-france ug12142011-tech_ts
Ugif 12 2011-france ug12142011-tech_ts
UGIF
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series database
Florian Lautenschlager
 

Similar to Ugif 04 2011 france ug04042011-jroy_ts (20)

Ugif 12 2011-france ug12142011-tech_ts
Ugif 12 2011-france ug12142011-tech_tsUgif 12 2011-france ug12142011-tech_ts
Ugif 12 2011-france ug12142011-tech_ts
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series database
 
Chronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the BlockChronix Time Series Database - The New Time Series Kid on the Block
Chronix Time Series Database - The New Time Series Kid on the Block
 
The new time series kid on the block
The new time series kid on the blockThe new time series kid on the block
The new time series kid on the block
 
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
Redis For Distributed & Fault Tolerant Data Plumbing Infrastructure
 
Ibm_IoT_Architecture_and_Capabilities
Ibm_IoT_Architecture_and_CapabilitiesIbm_IoT_Architecture_and_Capabilities
Ibm_IoT_Architecture_and_Capabilities
 
WW Historian 10
WW Historian 10WW Historian 10
WW Historian 10
 
PyTables
PyTablesPyTables
PyTables
 
Large Data Analyze With PyTables
Large Data Analyze With PyTablesLarge Data Analyze With PyTables
Large Data Analyze With PyTables
 
PyTables
PyTablesPyTables
PyTables
 
Py tables
Py tablesPy tables
Py tables
 
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
OSDC 2016 - Chronix - A fast and efficient time series storage based on Apach...
 
A Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache SolrA Fast and Efficient Time Series Storage Based on Apache Solr
A Fast and Efficient Time Series Storage Based on Apache Solr
 
Chronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache SolrChronix: A fast and efficient time series storage based on Apache Solr
Chronix: A fast and efficient time series storage based on Apache Solr
 
IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud IBM IoT Architecture and Capabilities at the Edge and Cloud
IBM IoT Architecture and Capabilities at the Edge and Cloud
 
Apache con 2020 use cases and optimizations of iotdb
Apache con 2020 use cases and optimizations of iotdbApache con 2020 use cases and optimizations of iotdb
Apache con 2020 use cases and optimizations of iotdb
 
IBM i Performance management and performance data collectors june 2012
IBM i Performance management and performance data collectors june 2012IBM i Performance management and performance data collectors june 2012
IBM i Performance management and performance data collectors june 2012
 
How to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutesHow to run your Hadoop Cluster in 10 minutes
How to run your Hadoop Cluster in 10 minutes
 
Cloud Standards and Virtualization
Cloud Standards and VirtualizationCloud Standards and Virtualization
Cloud Standards and Virtualization
 
Python and MongoDB as a Market Data Platform by James Blackburn
Python and MongoDB as a Market Data Platform by James BlackburnPython and MongoDB as a Market Data Platform by James Blackburn
Python and MongoDB as a Market Data Platform by James Blackburn
 

More from UGIF

UGIF 09 2013 Fy13 q3, corporate presentation the inflection point in the ap...
UGIF 09 2013 Fy13 q3, corporate presentation   the inflection point in the ap...UGIF 09 2013 Fy13 q3, corporate presentation   the inflection point in the ap...
UGIF 09 2013 Fy13 q3, corporate presentation the inflection point in the ap...
UGIF
 
Ugif 09 2013 open source - session tech
Ugif 09 2013   open source - session techUgif 09 2013   open source - session tech
Ugif 09 2013 open source - session tech
UGIF
 
Ugif 09 2013 new environment and dynamic setting in ids 12.10
Ugif 09 2013   new environment and dynamic setting in ids 12.10Ugif 09 2013   new environment and dynamic setting in ids 12.10
Ugif 09 2013 new environment and dynamic setting in ids 12.10
UGIF
 
Ugif 09 2013 open source
Ugif 09 2013   open sourceUgif 09 2013   open source
Ugif 09 2013 open source
UGIF
 
Ugif 10 2012 ppt0000001
Ugif 10 2012 ppt0000001Ugif 10 2012 ppt0000001
Ugif 10 2012 ppt0000001
UGIF
 
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
UGIF
 
Ugif 10 2012 beauty ofifmxdiskstructs ugif
Ugif 10 2012 beauty ofifmxdiskstructs ugifUgif 10 2012 beauty ofifmxdiskstructs ugif
Ugif 10 2012 beauty ofifmxdiskstructs ugif
UGIF
 
Ugif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutesUgif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutes
UGIF
 
Ugif 10 2012 genero ugif october 3, 2012 ibm france, français
Ugif 10 2012 genero   ugif october 3, 2012  ibm france, français Ugif 10 2012 genero   ugif october 3, 2012  ibm france, français
Ugif 10 2012 genero ugif october 3, 2012 ibm france, français
UGIF
 
Ugif 10 2012 iiug paris-business-update
Ugif 10 2012 iiug paris-business-updateUgif 10 2012 iiug paris-business-update
Ugif 10 2012 iiug paris-business-update
UGIF
 
Ugif 10 2012 ppt0000002
Ugif 10 2012 ppt0000002Ugif 10 2012 ppt0000002
Ugif 10 2012 ppt0000002
UGIF
 
Ugif 12 2011-smart meters-11102011
Ugif 12 2011-smart meters-11102011Ugif 12 2011-smart meters-11102011
Ugif 12 2011-smart meters-11102011
UGIF
 
Ugif 12 2011-informix iwa
Ugif 12 2011-informix iwaUgif 12 2011-informix iwa
Ugif 12 2011-informix iwa
UGIF
 
Ugif 12 2011-ibm cap-seine
Ugif 12 2011-ibm cap-seineUgif 12 2011-ibm cap-seine
Ugif 12 2011-ibm cap-seine
UGIF
 
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
UGIF
 
Ugif 12 2011-discover informix keynote 2012
Ugif 12 2011-discover informix keynote 2012Ugif 12 2011-discover informix keynote 2012
Ugif 12 2011-discover informix keynote 2012
UGIF
 
Ugif 04 2011 storage prov-pot_march_2011
Ugif 04 2011   storage prov-pot_march_2011Ugif 04 2011   storage prov-pot_march_2011
Ugif 04 2011 storage prov-pot_march_2011
UGIF
 

More from UGIF (20)

UGIF 09 2013 Fy13 q3, corporate presentation the inflection point in the ap...
UGIF 09 2013 Fy13 q3, corporate presentation   the inflection point in the ap...UGIF 09 2013 Fy13 q3, corporate presentation   the inflection point in the ap...
UGIF 09 2013 Fy13 q3, corporate presentation the inflection point in the ap...
 
Ugif 09 2013 open source - session tech
Ugif 09 2013   open source - session techUgif 09 2013   open source - session tech
Ugif 09 2013 open source - session tech
 
Ugif 09 2013 new environment and dynamic setting in ids 12.10
Ugif 09 2013   new environment and dynamic setting in ids 12.10Ugif 09 2013   new environment and dynamic setting in ids 12.10
Ugif 09 2013 new environment and dynamic setting in ids 12.10
 
Ugif 09 2013 open source
Ugif 09 2013   open sourceUgif 09 2013   open source
Ugif 09 2013 open source
 
Ugif 09 2013
Ugif 09 2013Ugif 09 2013
Ugif 09 2013
 
Ugif 09 2013 psm
Ugif 09 2013   psmUgif 09 2013   psm
Ugif 09 2013 psm
 
Ugif 09 2013 friug 201309 axional web studio
Ugif 09 2013 friug 201309   axional web studioUgif 09 2013 friug 201309   axional web studio
Ugif 09 2013 friug 201309 axional web studio
 
Ugif 10 2012 ppt0000001
Ugif 10 2012 ppt0000001Ugif 10 2012 ppt0000001
Ugif 10 2012 ppt0000001
 
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
Ugif 10 2012 informix pssc-benchmark -l.revel_oct2012
 
Ugif 10 2012 beauty ofifmxdiskstructs ugif
Ugif 10 2012 beauty ofifmxdiskstructs ugifUgif 10 2012 beauty ofifmxdiskstructs ugif
Ugif 10 2012 beauty ofifmxdiskstructs ugif
 
Ugif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutesUgif 10 2012 lycia2 introduction in 45 minutes
Ugif 10 2012 lycia2 introduction in 45 minutes
 
Ugif 10 2012 genero ugif october 3, 2012 ibm france, français
Ugif 10 2012 genero   ugif october 3, 2012  ibm france, français Ugif 10 2012 genero   ugif october 3, 2012  ibm france, français
Ugif 10 2012 genero ugif october 3, 2012 ibm france, français
 
Ugif 10 2012 iiug paris-business-update
Ugif 10 2012 iiug paris-business-updateUgif 10 2012 iiug paris-business-update
Ugif 10 2012 iiug paris-business-update
 
Ugif 10 2012 ppt0000002
Ugif 10 2012 ppt0000002Ugif 10 2012 ppt0000002
Ugif 10 2012 ppt0000002
 
Ugif 12 2011-smart meters-11102011
Ugif 12 2011-smart meters-11102011Ugif 12 2011-smart meters-11102011
Ugif 12 2011-smart meters-11102011
 
Ugif 12 2011-informix iwa
Ugif 12 2011-informix iwaUgif 12 2011-informix iwa
Ugif 12 2011-informix iwa
 
Ugif 12 2011-ibm cap-seine
Ugif 12 2011-ibm cap-seineUgif 12 2011-ibm cap-seine
Ugif 12 2011-ibm cap-seine
 
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
Ugif 12 2011-four js primer presentation - new graphic charter - short versio...
 
Ugif 12 2011-discover informix keynote 2012
Ugif 12 2011-discover informix keynote 2012Ugif 12 2011-discover informix keynote 2012
Ugif 12 2011-discover informix keynote 2012
 
Ugif 04 2011 storage prov-pot_march_2011
Ugif 04 2011   storage prov-pot_march_2011Ugif 04 2011   storage prov-pot_march_2011
Ugif 04 2011 storage prov-pot_march_2011
 

Ugif 04 2011 france ug04042011-jroy_ts

  • 1. TimeSeries Technical Presentation Jacques Roy April 6, 2011 © 2010 IBM Corporation
  • 2. Agenda ■ What is TimeSeries ■ Why TimeSeries ■ Components ■ Usage ■ 2 © 2010 IBM Corporation
  • 3. “Give me the Jan 1st element from time series “X” Most useful when a range of data is normally read “Give me the Jan 1st thru Jan 10th elements from time series “X” Access to one time series is usually completed before moving to the next time series. 3 © 2010 IBM Corporation
  • 4. Challenges Managing Time Series Data ■ Slow Performance – Extremely slow data access specially for ordered set of rows due to the data layout and disk I/O – Operations hard or impossible to do in standard SQL ■ High Storage Requirements – Time series are usually stored as "tall – thin" tables with a very large number of rows – May need one index to enforce uniqueness and another for index only read, more space used for index than data – Huge space requirements in standard relational layout, due to the volume and data ■ Complex Querying – Can be difficult to write SQL to work with the data 4
  • 5. Informix Solution ● TimeSeries Data Type : Native time series support ■ Store time series elements as an ordered set of elements – Uses less space because the "key" is factored out and the time field takes either 0 (for regular) or 11 ( for irregular) bytes – Access is faster than index-only-read – SQL can be made much simpler ■ Freedom to manage time series data: – Freedom to choose what and how it is stored – Freedom to choose the time series interval – Freedom to choose where the time series is stored MeterId Reading No other RDBMS 1001 2010-01-01,daily,{(12.34,12567),(12.56,9000),(12.34,55567),..} has native time 2001 2010-05-05,daily,{(199.08,6780),(198.55,3400),(198.12,250),..} series support 2011 2010-09-01,daily,{(9.34,8067),(9.56,9000),(9.40,10780),..} 5
  • 6. Key Strengths of Informix TimeSeries Performance – Extremely fast data access: Data clustered on disk to reduce I/O – Provides very high degree of parallelism on reads and writes – Provides continuous loading of data with minimal impact on concurrent queries Space Savings – Provides high level of compression – Can be over 50% space savings over standard relational layout Usability – Time series tool kit allows custom analytics to be written – Handles operations hard or impossible to do in standard SQL – Conceptually closer to how users think of time series – No other RDBMS has native time series support 6 © 2010 IBM Corporation
  • 7. Smart Meters Data: Schema Example Primary Key mtr_id date Col1 Col2 ColN 1 Mon Value 1 Value 2 ……. Value N 1 Tue Value 1 Value 2 ……. Value N 1 Wed Value 1 Value 2 ……. Value N Relational Schema ... ... ... ... ……. ... 13 Mon Value 1 Value 2 ……. Value N 13 Tue Value 1 Value 2 ……. Value N 13 Wed Value 1 Value 2 ……. Value N ... ... ... ... ……. ... mtr_id Series (int) timeseries(mtr_data) 1 [(Mon, v1, ...)(Tue,v1…)] 2 [(Mon, v1, ...)(Tue,v1…)] Above schema using 3 [(Mon, v1, ...)(Tue,v1…)] Informix TimeSeries 4 [(Mon, v1, ...)(Tue,v1…)] … … Save space and increase performance with faster data access with Informix 7 © 2010 IBM Corporation
  • 8. TimeSeries Space Savings Example TimeSeries data type takes much less space than traditional relational storage ● – Proof of concept example: • Regular TimeSeries, 15 minute interval • Relational database used ~ 1TB (1000GB) • Informix used ~340GB The reason for this is: – The TimeSeries does not repeat data •MeterID: 4 bytes per reading •TimeStamp: Could be 12 bytes per reading •Assuming a 8 byte reading, that ~66% savings •3X less storage! Data Storage Comparison for 1 million meters 8 © 2010 IBM Corporation
  • 9. TimeSeries Performance Performance – Faster accessing sets of data • Ordered data – Much faster combining time series – For data loading into timeseries, Informix outperforms the nearest competition by more than 30x times – For report generation from Performance Comparison for Data timeseries, Informix outperforms Loads and Reports for 1 Million Meters the nearest competition by more than 90x times 9 © 2010 IBM Corporation
  • 10. Who’s Interested in TimeSeries
    Energy:
      – Smart meters
    Capital Markets:
      – Arbitrage opportunities, breakout signals, risk/return optimization, portfolio management, VaR calculations, simulations, backtesting...
    Telecommunications:
      – Network monitoring, load prediction, blocked calls (lost revenue) from load, phone usage, fraud detection and analysis...
    Manufacturing:
      – Machinery going out of spec; process sampling and analysis
    Logistics:
      – Location of a fleet (e.g. GPS); route analysis
    Scientific research:
      – Temperature over time...
    Informix Dynamic Server, TimeSeries DataBlade Module class © 2007 IBM Corporation
  • 11. TimeSeries: Key Concepts
    ■ Containers
      – Specialized storage for TimeSeries
          EXECUTE PROCEDURE TSContainerCreate('raw_container', 'rootdbs', 'meter_data', 100, 50);
    ■ TimeSeries data element: row type
      – Flexibility to define as many parts as needed
          CREATE ROW TYPE meter_data (
              tstamp datetime year to fraction(5),
              value  decimal(14,3)
          );
    ■ TimeSeries types: regular, irregular
      – Covers regular intervals and sparse data distribution
    ■ Calendar
      – Defines business patterns
  • 12. Features Unique to Regular TimeSeries
      – Only one element per “on” interval
      – A value "persists" to the end of its interval
      – An element for an “on” interval may be missing; the entire element is then NULL
      – The calendar determines the offset in the TimeSeries of a given time point
      – Elements can be accessed by offset or by time point
      – The time point is not stored; it is calculated from the header plus date/time arithmetic
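The last two points of this slide, access by offset and a time point that is calculated rather than stored, can be illustrated with a small sketch. It assumes the simplest possible calendar, in which every interval is "on" and all intervals have the same fixed length; a real calendar with "off" periods would need to skip them, which this sketch does not attempt. The function names are hypothetical and are not part of the TimeSeries API.

```python
from datetime import datetime, timedelta


def offset_to_time(origin: datetime, interval: timedelta, offset: int) -> datetime:
    """Start of the interval at `offset`, assuming every interval is "on"."""
    return origin + offset * interval


def time_to_offset(origin: datetime, interval: timedelta, when: datetime) -> int:
    """Offset of the interval containing `when`.

    Floor division reflects the rule that a value persists to the end of
    its interval: any time inside the interval maps to the same element.
    """
    if when < origin:
        raise ValueError("time point precedes the origin of the series")
    return (when - origin) // interval


if __name__ == "__main__":
    origin = datetime(2010, 11, 10)        # origin kept in the series header
    interval = timedelta(minutes=15)       # 15-minute regular series
    print(offset_to_time(origin, interval, 4))
    print(time_to_offset(origin, interval, datetime(2010, 11, 10, 1, 7)))
```

Because the mapping is pure arithmetic on the origin, a regular element needs no stored timestamp at all.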
  • 13. Features Unique to Irregular TimeSeries
      – Data can be entered at any time point within a valid "on" interval
      – An element persists until the next element
      – No NULL elements
      – Elements can only be accessed by time
      – No duplicate time points allowed
      – If an element already exists at the given time point, either an error is raised or a unique time point is found:
          • round the time point up to the nearest second
          • search back for the first element
          • add 10 microseconds; this is the new time point
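The three-step rule for finding a unique time point is terse, so here is one possible reading of it as a sketch. It is an interpretation, not the server's actual algorithm. The assumptions are: "round up" means moving to the start of the next whole second, "search back" means finding the latest existing element before that boundary, and the 10-microsecond step matches the resolution of `datetime year to fraction(5)`. The function name and the list-based storage are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

# Smallest step representable in "datetime year to fraction(5)".
TICK = timedelta(microseconds=10)


def resolve_time_point(existing: list, when: datetime) -> datetime:
    """Return the time point to use for a new element.

    `existing` is a sorted list of time points already in the series.
    If `when` is free it is returned unchanged; otherwise the three
    steps from the slide are applied (under the assumptions above).
    """
    i = bisect_left(existing, when)
    if i == len(existing) or existing[i] != when:
        return when  # no duplicate, nothing to resolve

    # Step 1: round the time point up to the next whole second.
    boundary = when.replace(microsecond=0) + timedelta(seconds=1)
    # Step 2: search back from the boundary for the first element found.
    j = bisect_left(existing, boundary)   # first index at or after boundary
    last_before = existing[j - 1]         # exists, since `when` < boundary
    # Step 3: add 10 microseconds; this is the new time point.
    return last_before + TICK


if __name__ == "__main__":
    series = [
        datetime(2007, 4, 3, 6, 30, 3, 30),     # 06:30:03.00003
        datetime(2007, 4, 3, 6, 30, 3, 1190),   # 06:30:03.00119
    ]
    print(resolve_time_point(series, datetime(2007, 4, 3, 6, 30, 3, 30)))
```

Under this reading a colliding element is appended just after the last element of the same second, so arrival order is preserved within that second.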
  • 14. Accessing Timeseries
    Access through standard tabular view
      – Makes TimeSeries look like a standard relational table
    SQL Functions
      – 103 functions
    Customized functions
      – Written in Stored Procedure Language (SPL), “C”, Java
      – 65 “C” functions
  • 15. TimeSeries Header
    A TimeSeries needs information that sets its context:
      – Calendar: Time period where data is found
      – Origin: Time origin of the TimeSeries
      – Threshold: in-row storage threshold
      – Container: where to store the out-of-row data
      – Metadata: optional data added by the TimeSeries creator
  • 16. Calendar and Calendar Patterns
    A calendar pattern is needed before we can create a calendar:
        INSERT INTO CalendarPatterns
        VALUES('day', '{1 on, 2 off, 4 on}, day');
    A calendar defines a set of valid times at which the TimeSeries can record data (July 8, 2005 is a Friday):
        INSERT INTO CalendarTable(c_name, c_calendar)
        VALUES('calday',
               'startdate(2005-07-08 00:00:00.00000), pattstart(2005-07-08 00:00:00.00000), pattname(day)');
    You can provide a pattern explicitly:
        INSERT INTO CalendarTable(c_name, c_calendar)
        VALUES('weekcal',
               'startdate(2005-07-08 00:00:00.00000), pattstart(2005-07-08 00:00:00.00000), pattern({1 on, 2 off, 4 on}, day)');
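To see what the pattern `{1 on, 2 off, 4 on}, day` means when it starts on Friday, July 8, 2005, the sketch below expands it day by day. It only models the repeating on/off sequence for day-sized intervals. The tuple representation of the pattern and the function name are illustrative assumptions and say nothing about how Informix stores calendars.

```python
from datetime import date, timedelta
from itertools import cycle, islice

# '{1 on, 2 off, 4 on}, day' written as (length, is_on) runs.
PATTERN = [(1, True), (2, False), (4, True)]
PATTERN_START = date(2005, 7, 8)  # a Friday


def expand_day_pattern(pattern, start, count):
    """Return `count` (day, is_on) pairs, repeating the pattern from `start`."""
    flags = [is_on for length, is_on in pattern for _ in range(length)]
    return [
        (start + timedelta(days=i), flag)
        for i, flag in enumerate(islice(cycle(flags), count))
    ]


if __name__ == "__main__":
    for day, is_on in expand_day_pattern(PATTERN, PATTERN_START, 7):
        print(day.strftime("%a %Y-%m-%d"), "on" if is_on else "off")
```

Started on a Friday, the seven-day pattern marks Friday on, Saturday and Sunday off, and Monday through Thursday on, which amounts to a weekday calendar.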
  • 17. TimeSeries: Table
    A TimeSeries resides in a table:
        CREATE TABLE ts_data (
            loc_esi_id   char(20) NOT NULL,
            measure_unit varchar(10) NOT NULL,
            direction    char(1) NOT NULL,
            multiplier   TimeSeries(meter_data),
            raw_reads    TimeSeries(meter_data),
            PRIMARY KEY(loc_esi_id, measure_unit, direction)
        ) LOCK MODE ROW;
  • 18. Populating a TimeSeries
    A TimeSeries must first be created:
        INSERT INTO taqtrade_day
        VALUES("IBM.N",
               TSCreate('calday', '2005-07-08 00:00:00.00000', 20, 0, 0, 'taqtrade_day'));
    It can also be created through the input function:
        INSERT INTO taqtrade
        VALUES("AA.N",
               'irregular, container(taqtrade), origin(2007-04-03 06:30:00.00000), calendar(calsec), [(4.48, . . .)@2007-04-03 06:30:03.00003, (4.50,. . .)@2007-04-03 06:30:03.00119, . . .]');
  • 19. The Virtual Table Interface
    Makes a TimeSeries look like a table:
        EXECUTE PROCEDURE TSCreateVirtualTab('ts_data_v', 'ts_data',
            'origin(2010-11-10 00:00:00.00000), calendar(cal15min),container(raw_container), threshold(0), regular',
            0, 'raw_reads');
    Virtual table created:
        CREATE TABLE ts_data_v (
            loc_esi_id   char(20),
            measure_unit varchar(10,0),
            direction    char(1),
            tstamp       datetime year to fraction(5),
            value        decimal(14,3)
        );
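Conceptually, the virtual table flattens each time series into one row per element and repeats the key columns of the base row beside it. The sketch below mimics that shape in plain Python. It is a conceptual illustration only, not how the virtual table interface is implemented, and the sample values are invented for the example.

```python
def virtual_rows(base_rows, key_columns, series_column, element_fields):
    """Yield one flat row per time series element.

    Each output row carries the key columns of its base row plus the
    fields of a single element, like the columns of ts_data_v.
    """
    for row in base_rows:
        keys = {name: row[name] for name in key_columns}
        for element in row[series_column]:
            yield {**keys, **dict(zip(element_fields, element))}


# One base row shaped like ts_data; the readings are made-up sample values.
TS_DATA = [
    {
        "loc_esi_id": "1001",
        "measure_unit": "KWH",
        "direction": "P",
        "raw_reads": [
            ("2010-11-10 00:00:00.00000", 1.250),
            ("2010-11-10 00:15:00.00000", 1.375),
        ],
    },
]

if __name__ == "__main__":
    flat = virtual_rows(
        TS_DATA,
        key_columns=("loc_esi_id", "measure_unit", "direction"),
        series_column="raw_reads",
        element_fields=("tstamp", "value"),
    )
    for flat_row in flat:
        print(flat_row)
```

Only the series column named when the view is built (`raw_reads` in the slide) is expanded, which is why `multiplier` does not appear in `ts_data_v`.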
  • 20. Quick Review
    A TimeSeries resides in a container
      – The container resides in a dbspace
      – The container is for a specific element type (row type)
      – A container is for either regular or irregular TimeSeries (not both)
      – A container can contain multiple TimeSeries
    A TimeSeries requires a calendar
      – Defines when the data starts and defines a pattern of valid times
    TimeSeries data is defined as a row type
      – Defines the values tracked
    You can operate on a TimeSeries through special SQL functions, or use the virtual table interface and standard SQL
  • 21. DEMO