SlideShare a Scribd company logo
1 of 28
Astronomical Data Processing Using
   SciQL, an SQL Based Query
     Language for Array Data



 Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes
                          CWI Amsterdam

             ADASS XXI, Nov. 06-10, 2011, Paris, France

                                       !"#$%&'()*+,#-&$.#/(012#&+$#%3$%#,(
                                       2.#(4&#$5()*+,#-&$".1(6&$&



                                       !"#$%&'()&"#*+,-(     ./0/123
                                       4")*'()5"%%,%*'(*#-(( 6!7(8 9:7;;9
Why Not RDBMS?

             SQL is difficult

                No appropriate array denotations

                No functional complete operation set

             DBMSs are slow

                Too much overhead

                Size limitations (due to BLOB representations)

                Existing foreign files

                Scale

                ...




2011-11-09                               ADASS XXI                  3
SciQL
             An array query language based on SQL:2003

             To lower the entrance fee to RDBMSs



             Distinguish features (Kersten et al. AD ’11; Zhang et al.
             IDEAS2011):

                Arrays and tables as first class citizens of DBMSs

                Seamless integration of relational and array paradigms

                Named dimensions with constraints

                Flexible structure-based grouping




             LOFAR Transient Key Project use case

2011-11-09                                ADASS XXI                          4
Array Definitions
                        y               null

                    3       0.0   0.0          0.0     0.0
                    2       0.0   0.0          0.0     0.0
             null                                                null
                    1       0.0   0.0          0.0     0.0
                    0       0.0   0.0          0.0     0.0
                                                             x
                            0     1            2       3
                                        null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);




2011-11-09                        ADASS XXI                             5
Array Definitions
                        y                   null

                    3       0.0       0.0          0.0     0.0
                    2       0.0       0.0          0.0     0.0
             null                                                    null
                    1       0.0       0.0          0.0     0.0
                    0       0.0       0.0          0.0     0.0
                                                                 x
                            0          1           2       3
                                            null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);


                                      dimensions,
                                  any scalar data type



2011-11-09                             ADASS XXI                            5
Array Definitions
                        y                    null

                    3       0.0        0.0          0.0         0.0
                    2       0.0        0.0          0.0         0.0
             null                                                         null
                    1       0.0        0.0          0.0         0.0
                    0       0.0        0.0          0.0         0.0
                                                                      x
                            0            1           2          3
                                             null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);

                            dimensional range:
                            [(start|∗) : (step|∗) : (stop|∗)]




2011-11-09                              ADASS XXI                                6
Array Definitions
                        y                null

                    3       0.0   0.0           0.0     0.0
                    2       0.0   0.0           0.0     0.0
             null                                                 null
                    1       0.0   0.0           0.0     0.0
                    0       0.0   0.0           0.0     0.0
                                                              x
                            0        1           2      3
                                         null

             CREATE ARRAY A1 (
              x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4],
              v FLOAT DEFAULT 0.0);




                   cell values,
              any column data type


2011-11-09                           ADASS XXI                           7
Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y              null

                    3       0.0   0.0         0.0   0.0

                    2       0.0   0.0         0.0   0.0
             null                                         null
                    1       0.0   0.5         0.5   0.5

                    0       0.0   0.0         0.0   0.0
                             0     1           2     3
                                                           x
                                       null




2011-11-09                        ADASS XXI                                 8
Array Tiling
                      SELECT [x], [y], AVG(v) FROM A1
                      GROUP BY A1[x:x+2][y:y+2];


                              y              null

                          3       0.0   0.0         0.0   0.0


   Anchor point:          2       0.0   0.0         0.0   0.0
     A1[x][y]      null                                         null
                          1       0.0   0.5         0.5   0.5

                          0       0.0   0.0         0.0   0.0
                                   0     1           2     3
                                                                 x
                                             null




2011-11-09                              ADASS XXI                                 8
Array Tiling
                      SELECT [x], [y], AVG(v) FROM A1
                      GROUP BY A1[x:x+2][y:y+2];


                              y              null

                          3       0.0   0.0         0.0   0.0


   Anchor point:          2       0.0   0.0         0.0   0.0
     A1[x][y]      null                                         null
                          1       0.0   0.5         0.5   0.5

                          0       0.0   0.0         0.0   0.0
                                   0     1           2     3
                                                                 x
                                             null




2011-11-09                              ADASS XXI                                 8
Array Tiling
                      SELECT [x], [y], AVG(v) FROM A1
                      GROUP BY A1[x:x+2][y:y+2];


                              y              null

                          3       0.0   0.0         0.0   0.0


   Anchor point:          2       0.0   0.0         0.0   0.0
     A1[x][y]      null                                         null
                          1       0.0   0.5         0.5   0.5

                          0       0.0   0.0         0.0   0.0
                                   0     1           2     3
                                                                 x
                                             null




2011-11-09                              ADASS XXI                                 8
Array Tiling
                      SELECT [x], [y], AVG(v) FROM A1
                      GROUP BY A1[x:x+2][y:y+2];


                              y              null

                          3       0.0   0.0         0.0   0.0


   Anchor point:          2       0.0   0.0         0.0   0.0
     A1[x][y]      null                                         null
                          1       0.0   0.5         0.5   0.5

                          0       0.0   0.0         0.0   0.0
                                   0     1           2     3
                                                                 x
                                             null




2011-11-09                              ADASS XXI                                 8
Array Tiling
                      SELECT [x], [y], AVG(v) FROM A1
                      GROUP BY A1[x:x+2][y:y+2];


                              y              null

                          3       0.0   0.0         0.0   0.0


   Anchor point:          2       0.0   0.0         0.0   0.0
     A1[x][y]      null                                         null
                          1       0.0   0.5         0.5   0.5

                          0       0.0   0.0         0.0   0.0
                                   0     1           2     3
                                                                 x
                                             null




2011-11-09                              ADASS XXI                                 8
Array Tiling
                SELECT [x], [y], AVG(v) FROM A1
                GROUP BY A1[x:x+2][y:y+2];


                        y               null

                    3        0.0   0.0         0.0   0.0

                    2        0.0   0.0         0.0   0.0
             null                                          null
                    1       0.125 0.25 0.25 0.25

                    0       0.125 0.25 0.25 0.25
                              0     1           2     3
                                                            x
                                        null




2011-11-09                         ADASS XXI                                 9
LOFAR Catalogue
                                                                                                         ra DOUBLE,
  zone (Gray et al. 2006)                                frequency                                       decl DOUBLE,
                                                                                                         ra_err DOUBLE,
  90                                                                                                     decl_err DOUBLE,
                                                         ...                                             flux DOUBLE,
   ...                                                                                                   ...
                                                         ν4
   2
                                                         ν3
    1
                                                         ν2                                          V
    0                                                                                            U
                                                         ν1                                  Q
   -1                                                                                    I
   -2                                                          t1   t2   t3   t4   ...           time
   ...
  -90

          0   1   2   3   ...   357   358   359   meridian



         CREATE ARRAY LOFARsrc (
           zone INT DIMENSION[-90:1:91], mrdn INT DIMENSION[0:1:360],
           ts   TIMESTAMP DIMENSION,     freq INT DIMENSION[30:10:241],
           id   INT DIMENSION[0:1:*],    stks CHAR(1) DIMENSION
                   CHECK(stks=`I' OR stks=`Q' OR stks=`U' OR stks=`V'),
           ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE,
           flux DOUBLE, ...);

2011-11-09                                               ADASS XXI                                                          10
LOFAR Use Case
                                                                                                          ra DOUBLE,
  zone (Gray et al. 2006)                                 frequency                                       decl DOUBLE,
                                                                                                          ra_err DOUBLE,
  90                                                                                                      decl_err DOUBLE,
                                                          ...                                             flux DOUBLE,
   ...                                                                                                    ...
                                                          ν4
   2
                                                          ν3
    1
                                                          ν2                                          V
    0                                                                                             U
                                                          ν1                                  Q
   -1                                                                                     I
   -2                                                           t1   t2   t3   t4   ...           time
   ...
  -90

         0    1   2   3    ...   357   358   359   meridian



             Similarity of the flux of a LOFAR source at frequencies 30
             MHz and 200 MHz

                          cross-correlation of two time series


2011-11-09                                                ADASS XXI                                                          11
Cross-Correlation

        idx       0         1        2        3
  F
        val       4         3        6        2

                                              idx        0       1       2
                                     G
                                              val        1       5       7




                      idx       -3       -2         -1       0       1       2
             Cr
                      val




2011-11-09                                               ADASS XXI                          12
Cross-Correlation
                                                                                      Cr.idx = -3

                  idx     0        1        2          3
         F                                                                            F [3 : 4]
                  val     4        3        6          2

                                            idx        0       1       2              G [0 : 1]
                                   G
                                            val        1       5       7




                    idx       -3       -2         -1       0       1       2
             Cr
                    val       2




2011-11-09                                             ADASS XXI                                  13
Cross-Correlation
                                                                                          Cr.idx = -2

                            idx        0        1          2       3
                  F                                                                       F [2 : 4]
                            val        4        3          6       2

                                                idx        0       1       2              G [0 : 2]
                                       G
                                                val        1       5       7




                      idx         -3       -2         -1       0       1       2
             Cr
                      val         2        16




2011-11-09                                                 ADASS XXI                                  14
Cross-Correlation
                                                                                      Cr.idx = -1

                                 idx        0          1       2       3
                        F                                                             F [1 : 4]
                                 val        4          3       6       2

                                            idx        0       1       2              G [0 : 3]
                                 G
                                            val        1       5       7




                  idx       -3         -2         -1       0       1       2
             Cr
                  val       2        16         47




2011-11-09                                             ADASS XXI                                  15
Cross-Correlation
                                                                                 Cr.idx = 0

                                      idx        0        1       2       3
                             F                                                   F [0 : 3]
                                      val        4        3       6       2

                                      idx        0        1       2              G [0 : 3]
                             G
                                      val        1        5       7




                  idx   -3       -2         -1       0        1       2
             Cr
                  val   2        16     47           61




2011-11-09                                       ADASS XXI                                   16
Cross-Correlation
                                                                                    Cr.idx = 1

                                                 idx       0        1       2   3
                                      F                                             F [0 : 2]
                                                 val       4        3       6   2

                                      idx        0         1        2               G [1 : 3]
                             G
                                      val        1         5        7




                  idx   -3       -2         -1         0       1        2
             Cr
                  val   2        16       47         61        41




2011-11-09                                       ADASS XXI                                      17
Cross-Correlation
                                                                                         Cr.idx = 2

                                                          idx       0        1   2   3
                                                 F                                       F [0 : 1]
                                                          val       4        3   6   2

                                      idx        0        1         2                    G [2 : 3]
                             G
                                      val        1        5         7




                  idx   -3       -2         -1       0          1       2
             Cr
                  val   2        16     47           61       41        28




2011-11-09                                       ADASS XXI                                           18
LOFAR Use Case
                                                                                                           ra DOUBLE,
  zone (Gray et al. 2006)                                  frequency                                       decl DOUBLE,
                                                                                                           ra_err DOUBLE,
  90                                                                                                       decl_err DOUBLE,
                                                           ...                                             flux DOUBLE,
   ...                                                                                                     ...
                                                           ν4
   2
                                                           ν3
    1
                                                           ν2                                          V
    0                                                                                              U
                                                           ν1                                  Q
   -1                                                                                      I
   -2                                                            t1   t2   t3   t4   ...           time
   ...
  -90

            0   1   2   3   ...   357   358   359   meridian
         DECLARE fcnt INT, gcnt INT;
         SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’];
         SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][30][11][‘I’];
         CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
         INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
           GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];



2011-11-09                                                 ADASS XXI                                                          19
LOFAR Use Case
                                                                                                           ra DOUBLE,
  zone (Gray et al. 2006)                                  frequency                                       decl DOUBLE,
                                                                                                           ra_err DOUBLE,
  90                                                                                                       decl_err DOUBLE,
                                                           ...                                             flux DOUBLE,
   ...                                                                                                     ...
                                                           ν4
   2
                                                           ν3
    1
                                                           ν2                                          V
    0                                                                                              U
                                                           ν1                                  Q
   -1                                                                                      I
   -2                                                            t1   t2   t3   t4   ...           time
   ...
  -90

            0   1   2   3   ...   357   358   359   meridian                           retrieve the time series

         DECLARE fcnt INT, gcnt INT;
         SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’];
         SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][30][11][‘I’];
         CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
         INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
           GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];



2011-11-09                                                 ADASS XXI                                                          19
LOFAR Use Case
                                                                                                           ra DOUBLE,
   zone                                                    frequency                                       decl DOUBLE,
                                                                                                           ra_err DOUBLE,
  90                                                                                                       decl_err DOUBLE,
                                                           ...                                             flux DOUBLE,
   ...                                                                                                     ...
                                                           ν4
   2
                                                           ν3
    1
                                                           ν2                                          V
    0                                                                                              U
                                                           ν1                                  Q
   -1                                                                                      I
   -2                                                            t1   t2   t3   t4   ...           time
   ...
  -90

            0   1   2   3   ...   357   358   359   meridian     dynamic grouping for every iteration

         DECLARE fcnt INT, gcnt INT;
         SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’];
         SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][30][11][‘I’];
         CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM
            LOFARsrc[*][*][*][200][11][‘I’];

         CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0);
         INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C
           GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)];



2011-11-09                                                 ADASS XXI                                                          20
Conclusion

             SciQL: a novel query language for scientific data

                A symbiosis of relational and array paradigm

             Simplifies expression of complex scientific algorithms

             Leave optimisation to DBMS kernel

             Opens opportunities to enhance scientific data mining



             Under active implementation


                                                   !"#$%&'()*+,#-&$.#/(012#&+$#%3$%#,(

                   www.scilens.org          www.monetdb.org
                                                   2.#(4&#$5()*+,#-&$".1(6&$&



                                                   !"#$%&'()&"#*+,-(     ./0/123
                                                   4")*'()5"%%,%*'(*#-(( 6!7(8 9:7;;9




2011-11-09                             ADASS XXI                                                 21

More Related Content

More from PlanetData Network of Excellence

Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamPlanetData Network of Excellence
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingPlanetData Network of Excellence
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...PlanetData Network of Excellence
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchPlanetData Network of Excellence
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSPlanetData Network of Excellence
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReducePlanetData Network of Excellence
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...PlanetData Network of Excellence
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsPlanetData Network of Excellence
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...PlanetData Network of Excellence
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsPlanetData Network of Excellence
 
Exploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataExploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataPlanetData Network of Excellence
 

More from PlanetData Network of Excellence (20)

Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstreamDemo: tablet-based visualisation of transport data in Madrid using SPARQLstream
Demo: tablet-based visualisation of transport data in Madrid using SPARQLstream
 
On the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream ProcessingOn the need for a W3C community group on RDF Stream Processing
On the need for a W3C community group on RDF Stream Processing
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatchLinking Smart Cities Datasets with Human Computation: the case of UrbanMatch
Linking Smart Cities Datasets with Human Computation: the case of UrbanMatch
 
SciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMSSciQL, Bridging the Gap between Science and Relational DBMS
SciQL, Bridging the Gap between Science and Relational DBMS
 
CLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data ArchitectureCLODA: A Crowdsourced Linked Open Data Architecture
CLODA: A Crowdsourced Linked Open Data Architecture
 
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduceScalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
 
Data and Knowledge Evolution
Data and Knowledge Evolution  Data and Knowledge Evolution
Data and Knowledge Evolution
 
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...Evolution of Workflow Provenance Information in the Presence of Custom Infere...
Evolution of Workflow Provenance Information in the Presence of Custom Infere...
 
Access Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract ModelsAccess Control for RDF graphs using Abstract Models
Access Control for RDF graphs using Abstract Models
 
Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?Arrays in Databases, the next frontier?
Arrays in Databases, the next frontier?
 
Abstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF DatasetsAbstract Access Control Model for Dynamic RDF Datasets
Abstract Access Control Model for Dynamic RDF Datasets
 
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of FactsTowards Parallel Nonmonotonic Reasoning with Billions of Facts
Towards Parallel Nonmonotonic Reasoning with Billions of Facts
 
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
Automation in Cytomics: A Modern RDBMS Based Platform for Image Analysis and ...
 
Heuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQLHeuristic based Query Optimisation for SPARQL
Heuristic based Query Optimisation for SPARQL
 
Adaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of EndpointsAdaptive Semantic Data Management Techniques for Federations of Endpoints
Adaptive Semantic Data Management Techniques for Federations of Endpoints
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 
OntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image CollectionsOntoGen Extension for Exploring Image Collections
OntoGen Extension for Exploring Image Collections
 
Exploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor DataExploring The Hubness-Related Properties of Oceanographic Sensor Data
Exploring The Hubness-Related Properties of Oceanographic Sensor Data
 
Exposing Real World Information for the Web of Things
Exposing Real World Information for the Web of ThingsExposing Real World Information for the Web of Things
Exposing Real World Information for the Web of Things
 

Recently uploaded

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 

Recently uploaded (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 

Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data

  • 1. Astronomical Data Processing Using SciQL, an SQL Based Query Language for Array Data Ying Zhang, Bart Scheers, Martin Kersten, Milena Ivanova, Niels Nes CWI Amsterdam ADASS XXI, Nov. 06-10, 2011, Paris, France !"#$%&'()*+,#-&$.#/(012#&+$#%3$%#,( 2.#(4&#$5()*+,#-&$".1(6&$& !"#$%&'()&"#*+,-( ./0/123 4")*'()5"%%,%*'(*#-(( 6!7(8 9:7;;9
  • 2.
  • 3. Why Not RDBMS? SQL is difficult No appropriate array denotations No functional complete operation set DBMSs are slow Too much overhead Size limitations (due to BLOB representations) Existing foreign files Scale ... 2011-11-09 ADASS XXI 3
  • 4. SciQL An array query language based on SQL:2003 To lower the entrance fee to RDBMSs Distinguish features (Kersten et al. AD ’11; Zhang et al. IDEAS2011): Arrays and tables as first class citizens of DBMSs Seamless integration of relational and array paradigms Named dimensions with constraints Flexible structure-based grouping LOFAR Transient Key Project use case 2011-11-09 ADASS XXI 4
  • 5. Array Definitions y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 x 0 1 2 3 null CREATE ARRAY A1 ( x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4], v FLOAT DEFAULT 0.0); 2011-11-09 ADASS XXI 5
  • 6. Array Definitions y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 x 0 1 2 3 null CREATE ARRAY A1 ( x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4], v FLOAT DEFAULT 0.0); dimensions, any scalar data type 2011-11-09 ADASS XXI 5
  • 7. Array Definitions y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 x 0 1 2 3 null CREATE ARRAY A1 ( x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4], v FLOAT DEFAULT 0.0); dimensional range: [(start|∗) : (step|∗) : (stop|∗)] 2011-11-09 ADASS XXI 6
  • 8. Array Definitions y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.0 0.0 0.0 0.0 0 0.0 0.0 0.0 0.0 x 0 1 2 3 null CREATE ARRAY A1 ( x INT DIMENSION [0:1:4], y INT DIMENSION [0:1:4], v FLOAT DEFAULT 0.0); cell values, any column data type 2011-11-09 ADASS XXI 7
  • 9. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 10. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 Anchor point: 2 0.0 0.0 0.0 0.0 A1[x][y] null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 11. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 Anchor point: 2 0.0 0.0 0.0 0.0 A1[x][y] null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 12. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 Anchor point: 2 0.0 0.0 0.0 0.0 A1[x][y] null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 13. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 Anchor point: 2 0.0 0.0 0.0 0.0 A1[x][y] null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 14. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 Anchor point: 2 0.0 0.0 0.0 0.0 A1[x][y] null null 1 0.0 0.5 0.5 0.5 0 0.0 0.0 0.0 0.0 0 1 2 3 x null 2011-11-09 ADASS XXI 8
  • 15. Array Tiling SELECT [x], [y], AVG(v) FROM A1 GROUP BY A1[x:x+2][y:y+2]; y null 3 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 null null 1 0.125 0.25 0.25 0.25 0 0.125 0.25 0.25 0.25 0 1 2 3 x null 2011-11-09 ADASS XXI 9
  • 16. LOFAR Catalogue ra DOUBLE, zone (Gray et al. 2006) frequency decl DOUBLE, ra_err DOUBLE, 90 decl_err DOUBLE, ... flux DOUBLE, ... ... ν4 2 ν3 1 ν2 V 0 U ν1 Q -1 I -2 t1 t2 t3 t4 ... time ... -90 0 1 2 3 ... 357 358 359 meridian CREATE ARRAY LOFARsrc ( zone INT DIMENSION[-90:1:91], mrdn INT DIMENSION[0:1:360], ts TIMESTAMP DIMENSION, freq INT DIMENSION[30:10:241], id INT DIMENSION[0:1:*], stks CHAR(1) DIMENSION CHECK(stks=`I' OR stks=`Q' OR stks=`U' OR stks=`V'), ra DOUBLE, decl DOUBLE, ra_err DOUBLE, decl_err DOUBLE, flux DOUBLE, ...); 2011-11-09 ADASS XXI 10
  • 17. LOFAR Use Case ra DOUBLE, zone (Gray et al. 2006) frequency decl DOUBLE, ra_err DOUBLE, 90 decl_err DOUBLE, ... flux DOUBLE, ... ... ν4 2 ν3 1 ν2 V 0 U ν1 Q -1 I -2 t1 t2 t3 t4 ... time ... -90 0 1 2 3 ... 357 358 359 meridian Similarity of the flux of a LOFAR source at frequencies 30 MHz and 200 MHz cross-correlation of two time series 2011-11-09 ADASS XXI 11
  • 18. Cross-Correlation idx 0 1 2 3 F val 4 3 6 2 idx 0 1 2 G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2011-11-09 ADASS XXI 12
  • 19. Cross-Correlation Cr.idx = -3 idx 0 1 2 3 F F [3 : 4] val 4 3 6 2 idx 0 1 2 G [0 : 1] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 2011-11-09 ADASS XXI 13
  • 20. Cross-Correlation Cr.idx = -2 idx 0 1 2 3 F F [2 : 4] val 4 3 6 2 idx 0 1 2 G [0 : 2] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 16 2011-11-09 ADASS XXI 14
  • 21. Cross-Correlation Cr.idx = -1 idx 0 1 2 3 F F [1 : 4] val 4 3 6 2 idx 0 1 2 G [0 : 3] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 16 47 2011-11-09 ADASS XXI 15
  • 22. Cross-Correlation Cr.idx = 0 idx 0 1 2 3 F F [0 : 3] val 4 3 6 2 idx 0 1 2 G [0 : 3] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 16 47 61 2011-11-09 ADASS XXI 16
  • 23. Cross-Correlation Cr.idx = 1 idx 0 1 2 3 F F [0 : 2] val 4 3 6 2 idx 0 1 2 G [1 : 3] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 16 47 61 41 2011-11-09 ADASS XXI 17
  • 24. Cross-Correlation Cr.idx = 2 idx 0 1 2 3 F F [0 : 1] val 4 3 6 2 idx 0 1 2 G [2 : 3] G val 1 5 7 idx -3 -2 -1 0 1 2 Cr val 2 16 47 61 41 28 2011-11-09 ADASS XXI 18
  • 25. LOFAR Use Case ra DOUBLE, zone (Gray et al. 2006) frequency decl DOUBLE, ra_err DOUBLE, 90 decl_err DOUBLE, ... flux DOUBLE, ... ... ν4 2 ν3 1 ν2 V 0 U ν1 Q -1 I -2 t1 t2 t3 t4 ... time ... -90 0 1 2 3 ... 357 358 359 meridian DECLARE fcnt INT, gcnt INT; SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’]; SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][30][11][‘I’]; CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0); INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)]; 2011-11-09 ADASS XXI 19
  • 26. LOFAR Use Case ra DOUBLE, zone (Gray et al. 2006) frequency decl DOUBLE, ra_err DOUBLE, 90 decl_err DOUBLE, ... flux DOUBLE, ... ... ν4 2 ν3 1 ν2 V 0 U ν1 Q -1 I -2 t1 t2 t3 t4 ... time ... -90 0 1 2 3 ... 357 358 359 meridian retrieve the time series DECLARE fcnt INT, gcnt INT; SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’]; SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][30][11][‘I’]; CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0); INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)]; 2011-11-09 ADASS XXI 19
  • 27. LOFAR Use Case ra DOUBLE, zone frequency decl DOUBLE, ra_err DOUBLE, 90 decl_err DOUBLE, ... flux DOUBLE, ... ... ν4 2 ν3 1 ν2 V 0 U ν1 Q -1 I -2 t1 t2 t3 t4 ... time ... -90 0 1 2 3 ... 357 358 359 meridian dynamic grouping for every iteration DECLARE fcnt INT, gcnt INT; SET fcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][30][11][‘I’]; SET gcnt = SELECT COUNT(*) FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY VIEW F (idx INT DIMENSION[0:1:fcnt], flux DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][30][11][‘I’]; CREATE ARRAY VIEW G (idx INT DIMENSION[0:1:gcnt], val DOUBLE DEFAULT 0.0) AS SELECT flux FROM LOFARsrc[*][*][*][200][11][‘I’]; CREATE ARRAY CrCorr30_200 (idx INT DIMENSION[-fcnt+1:1:gcnt], val DOUBLE DEFAULT 0.0); INSERT INTO CrCorr SELECT SUM(F.flux * G.flux) FROM F, G, CrCorr30_200 AS C GROUP BY F[MAX(0, -C.idx) : MIN(fcnt, gcnt-C.idx)], G[MAX(0, C.idx) : MIN(gcnt, fcnt+C.idx)]; 2011-11-09 ADASS XXI 20
  • 28. Conclusion SciQL: a novel query language for scientific data A symbiosis of relational and array paradigm Simplifies expression of complex scientific algorithms Leave optimisation to DBMS kernel Opens opportunities to enhance scientific data mining Under active implementation !"#$%&'()*+,#-&$.#/(012#&+$#%3$%#,( www.scilens.org www.monetdb.org 2.#(4&#$5()*+,#-&$".1(6&$& !"#$%&'()&"#*+,-( ./0/123 4")*'()5"%%,%*'(*#-(( 6!7(8 9:7;;9 2011-11-09 ADASS XXI 21