Conformal prediction of air pollution concentrations for
         the Barcelona Metropolitan Region
                     PhD Thesis summary


                           Olga Ivina

                        University of Girona
                       GRECS research group
              CIBER de Epidemiología y la Salud Pública


                      November 22, 2012




Outline
   Introduction
          Air pollution and its effects
          Air pollution exposure assessment
          Conformal predictors for air pollution problem
   Objectives
   Methods and data
          Kriging
          Conformal predictors
          Computing
          Data
   Results
          Ordinary kriging and RRCM models in default setting
          Kernelisation: a Gaussian kernel
          Kernelisation: other kernels
          Comparison of models
   Discussion
   Conclusion
          Conformal predictors and geostatistics
          Future research
Air pollution and its effects
Introduction


Air pollution is a problem of growing concern all over the world.
There is a large body of scientific evidence on the hazardous effects of air
pollution on people's health and well-being, as well as on the general
ecological condition of our planet.
In people: association with adverse health outcomes, both in adults and
in children. Children are especially susceptible to pollution: they are
affected from the very first stages of life onward. Linked outcomes
(to name a few):
- preterm birth and low birth weight
- asthma aggravation, cough and bronchitis
- allergies: hay fever, rhinitis, ...
- excess risk of mortality

Air pollution and its effects - 2
Introduction




Adults are affected by pollution as well. In adults, pollution is linked to
both long-term and short-term health effects (to name a few):
- respiratory: COPD, asthma, chronic bronchitis
- lung cancer
- cardiovascular morbidity
- mortality: cancer, all-cause, cardiopulmonary, non-accidental,...

Special factors of impact: a person's socioeconomic status (SES) and geographical location.




Air pollution and its effects - 3
Introduction




           Global air pollution map produced by Envisat’s SCIAMACHY.
       Authors: S. Beirle, U. Platt and T. Wagner, University of Heidelberg’s Institute for Environmental Physics.




Air pollution and its effects - 4
Introduction

The main contributor to air pollution in urban areas is traffic. Two
traffic-related "criteria" air pollutants are considered in this study:
- nitrogen dioxide (NO2)
- particulate matter (PM10)

NO2 effects:
      short-term: respiratory effects and asthma aggravation
      long-term: risk of coronary heart disease and fatal events

PM10 effects:
      short-term: aggravation of respiratory and cardiovascular diseases,
      premature death, ...
      long-term: development of heart and lung diseases, premature
      death,...
Air pollution exposure assessment
Introduction




Problem: direct measurements of pollution are not always available.
A large number of models aim to predict pollution at a given
location. The main classes are:
- proximity models
- geostatistical models
- land use regression (LUR) models
- dispersion models
- integrated meteorological emission (IME) models
- hybrid models



Conformal predictors for air pollution problem
Introduction




Problem: existing methods for air pollution exposure
assessment may lack confidence in their predictions.
To tackle this problem, this research proposes the use of a
recently developed approach: conformal predictors. A conformal
predictor is a "confidence predictor" for which the level of confidence in the
prediction is chosen in advance. Under the iid assumption, its predictions are
always valid; this follows from the definition of a conformal predictor.




Conformal predictors for air pollution problem - 2
Introduction




A conformal predictor is defined by some nonconformity measure, and it
has two major desiderata:
- validity of predictions
- efficiency of predictions

Conformal predictors are flexible: they can be based upon almost any
underlying statistical algorithm.
In air pollution modeling, if a regression-based algorithm such as LUR or
kriging is used, the regression residuals (in absolute value) serve as the
nonconformity measure.




Objectives




This dissertation has two major objectives:
  1   To demonstrate the capacity of conformal predictors as a method for
      spatial environmental modeling.
  2   To provide valid estimates of nitrogen dioxide and fine particulate
      matter concentrations for the Barcelona Metropolitan Region.




Kriging
Methods and data



Kriging is a spatial interpolation method. It predicts a quantity
of interest at an unobserved point on the basis of a set of observed points,
and also provides an estimate of the error variance (called the "kriging variance").
It was first introduced in 1951 by the South African engineer D. G. Krige in his
master's thesis, devoted to the estimation of a mineral ore body. The method has
since been developed further: nowadays the term "kriging" stands for a set of
methods such as ordinary kriging, simple kriging, co-kriging, Bayesian
kriging, etc.
In its simplest form, a kriging estimate of the data at an unobserved
location is a linear combination of the observed data. The coefficients of
the combination depend on the spatial structure of the data and on the spatial
covariance.


Kriging - 2
Methods and data

The most common form of kriging is ordinary kriging. It is used when the mean
of the second-order stationary process is unknown. It is based on the
geostatistical concept of the variogram and on the closely related covariance function.
Let there be n neighboring observed locations, $x_1, \ldots, x_n$, and an
unobserved location $x_0$, on a spatial domain $D$. Let $\{Z(x) : x \in D\}$ denote
the process, and let it have a variogram $\gamma(h)$. Then the ordinary kriging
estimate $Z^{*}_{OK}(x_0)$ at the unobserved point $x_0$ takes the following
analytical form:

$$ Z^{*}_{OK}(x_0) = \sum_{\alpha=1}^{n} \omega_\alpha Z(x_\alpha), \qquad (1) $$

where $\omega_\alpha$ are the kriging weights. Ordinary kriging provides best
linear unbiased (BLUE) estimates of a random field, together with an error
variance estimate (the kriging variance).
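
As an illustration of equation (1), the following Python sketch solves the ordinary kriging system in its variogram form and returns the weights, the estimate and the kriging variance. It is not the geoR code used in the thesis: the exponential variogram and its parameters (`sill`, `rng`, `nugget`) are assumptions chosen only for this example, and the function names are hypothetical.

```python
import numpy as np

def exp_variogram(h, sill=1.0, rng=10.0, nugget=0.0):
    """Exponential variogram model (an assumed choice for this sketch)."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / rng)), 0.0)

def ordinary_kriging(coords, values, x0, variogram=exp_variogram):
    """Return the OK estimate, the kriging variance and the weights at x0."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    n = len(values)
    # Distances between observed locations, and from each of them to x0.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - x0, axis=-1)
    # Kriging system: variogram matrix bordered by the constraint sum(w) = 1.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(d)
    lhs[n, n] = 0.0
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(d0)
    sol = np.linalg.solve(lhs, rhs)
    weights, mu = sol[:n], sol[n]      # mu is the Lagrange multiplier
    estimate = weights @ values        # equation (1)
    variance = weights @ rhs[:n] + mu  # kriging variance
    return estimate, variance, weights
```

With no nugget, the estimate at an observed location reproduces the observed value and the kriging variance there is zero, which is a quick sanity check for the system.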

New methods. Conformal predictors
Methods and data




How does it work? We are given pairs of observations $(x_i, y_i)$, where $x_i$ is an
object and $y_i$ is a label. Then

$$ Z := X \times Y \qquad (2) $$

denotes the example space; $Z$ is a measurable space. Given an incomplete
data sequence $(x_1, y_1), (x_2, y_2), \ldots, (x_{n-1}, y_{n-1}) \in Z^{*}$, the aim is to predict
a label $y_n$ for an object $x_n$. An operator

$$ D : Z^{*} \times X \to Y \qquad (3) $$

then denotes a simple predictor (e.g., an ordinary kriging predictor).



New methods. Conformal predictors - 2
Methods and data



The prediction can be described as:

$$ \hat{y}_n = D(x_1, y_1, x_2, y_2, \ldots, x_{n-1}, y_{n-1}; x_n), \quad \hat{y}_n \in Y. \qquad (4) $$

Let us allow the predictor to output prediction sets $Y_n \subseteq Y$ large enough to
provide confidence in the prediction. This means that the true value of $y_n$
will fall in $Y_n$ with a given level of confidence, which is chosen and
supplied to the predictor in advance.
A conformal predictor is a confidence predictor defined by some
nonconformity measure. Given the measure, a conformal predictor outputs
the set of labels for which the new example conforms with the
observed ones.


New methods. Conformal predictors - 3
Methods and data



The ridge regression confidence machine (RRCM) is a regression-based
conformal predictor. It uses the ridge regression procedure (Hoerl and
Kennard, 1970) as its underlying algorithm.
Suppose $X_n$ is the $n \times p$ matrix of objects (independent variables), and $Y_n$
is the vector of labels (dependent variable). Then the ridge regression estimate of
the parameters $\omega$ takes the form:

$$ \omega = (X_n^{\top} X_n + a I_p)^{-1} X_n^{\top} Y_n, \qquad (5) $$

where $a$ is the ridge factor; $a = 0$ yields the standard least squares estimate.
The nonconformity scores for this predictor are the absolute regression residuals:
$|e_i| := |y_i - \hat{y}_i|$.
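
The ridge estimate and the residual-based nonconformity scores can be sketched in a few lines of Python. This is a minimal illustration under assumed names (`ridge_fit`, `nonconformity_scores`), not the PredictiveRegression implementation used in the thesis.

```python
import numpy as np

def ridge_fit(X, y, a):
    """Solve (X^T X + a I_p) w = X^T y for the ridge coefficients w."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + a * np.eye(p), X.T @ y)

def nonconformity_scores(X, y, a):
    """Absolute residuals |y_i - yhat_i| of the ridge fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.abs(y - X @ ridge_fit(X, y, a))
```

Setting `a = 0` recovers ordinary least squares (when `X^T X` is invertible), while a larger `a` shrinks the coefficients and generally increases the residuals.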


New methods. Conformal predictors - 4
Methods and data




Given a significance level for the prediction (roughly, the probability of
error that must not be exceeded), an RRCM predictor outputs a set of
labels $y$ for $y_n$, constructed from the sets

$$ S_i := \{y : \alpha_i(y) \ge \alpha_n(y)\} = \{y : |a_i + b_i y| \ge |a_n + b_n y|\}, \qquad (6) $$

where $a_i$ and $b_i$ are the components of the vectors $A$ and $B$.
RRCM outputs prediction sets rather than point predictions (which is what kriging
produces). These sets can take the form of a point, an interval, a ray, a union of
two rays, the whole real line, or the empty set. Usually, the set is an interval.
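
A rough Python sketch of how equation (6) leads to a prediction set is given below. Two things are assumptions rather than content of the slides: the vectors A and B are taken, as in the RRCM literature, to be the ridge residual operator applied to (y_1, ..., y_{n-1}, 0) and to (0, ..., 0, 1), so that the residuals for a candidate label y are A + By; and the set is evaluated on a grid of candidate labels instead of being computed exactly from the sets S_i. A candidate y is kept when the fraction of indices i with y in S_i exceeds the significance level epsilon. Function names are hypothetical.

```python
import numpy as np

def rrcm_vectors(X, y_obs, x_new, a):
    """Vectors A and B such that the residuals for a candidate label y are A + B*y."""
    X_aug = np.vstack([np.asarray(X, dtype=float), np.asarray(x_new, dtype=float)])
    n, p = X_aug.shape
    # Ridge "hat" matrix of the augmented design, and the residual operator I - H.
    hat = X_aug @ np.linalg.solve(X_aug.T @ X_aug + a * np.eye(p), X_aug.T)
    resid_op = np.eye(n) - hat
    y0 = np.append(np.asarray(y_obs, dtype=float), 0.0)  # unknown label set to 0
    e_n = np.zeros(n)
    e_n[-1] = 1.0                                        # position of the new label
    return resid_op @ y0, resid_op @ e_n

def conformal_set_on_grid(X, y_obs, x_new, a, epsilon, grid):
    """Keep the grid values y whose conformal p-value exceeds epsilon."""
    A, B = rrcm_vectors(X, y_obs, x_new, a)
    n = len(A)
    kept = []
    for y in grid:
        scores = np.abs(A + B * y)                   # alpha_1(y), ..., alpha_n(y)
        p_value = np.sum(scores >= scores[-1]) / n   # share of i with y in S_i
        if p_value > epsilon:
            kept.append(y)
    return np.array(kept)
```

Lowering epsilon (asking for more confidence) can only enlarge the returned set, which is the trade-off between validity and efficiency mentioned earlier.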




New methods. Conformal predictors - 5
Methods and data


When the number of parameters $p$ is large, computation is hard. The "kernel
trick" is a method that helps deal with high-dimensional data. It also allows
nonlinearity to be introduced into RRCM.
A kernel is a similarity measure that operates in a feature space. Given
an input space $X$ with a dot product, and an operator $\Phi$ that maps $X$ to a
feature space $H$:

$$ \Phi : X \to H, \qquad x \mapsto \mathbf{x} := \Phi(x), $$

a kernel is defined as follows. For $x_\alpha, x_\beta \in X$:

$$ k(x_\alpha, x_\beta) = \langle \Phi(x_\alpha), \Phi(x_\beta) \rangle \qquad (7) $$
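
Equation (7) can be checked numerically for one simple case. The sketch below assumes two-dimensional inputs and the inhomogeneous polynomial kernel of degree two with offset 1, for which an explicit feature map is known; the function names are hypothetical and chosen only for this example.

```python
import numpy as np

def poly2_kernel(x, z):
    """Inhomogeneous polynomial kernel of degree two: (1 + <x, z>)^2."""
    return (1.0 + np.dot(x, z)) ** 2

def poly2_feature_map(x):
    """Explicit feature map Phi for two-dimensional inputs (six features)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([0.3, -1.2])
z = np.array([2.0, 0.5])
lhs = poly2_kernel(x, z)                                # kernel in input space
rhs = np.dot(poly2_feature_map(x), poly2_feature_map(z))  # dot product in H
```

Here `lhs` and `rhs` agree up to rounding, which is the point of the kernel trick: the dot product in the feature space is obtained without ever forming the feature vectors.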


New methods. Conformal predictors - 6
Methods and data




Any conventional covariance function for kriging can be taken up as
a kernel for RRCM. This research uses three (positive definite) kernels:

     a dot product kernel (default)
     a radial basis Gaussian kernel
     an inhomogeneous polynomial kernel of a second degree
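
For illustration, the three kernels can be written as Gram-matrix functions in Python. The bandwidth `sigma` and the offset `c` are assumed parameters (the slides do not state the values used), and the helper `is_positive_semidefinite` is a hypothetical check added for this sketch.

```python
import numpy as np

def linear_gram(A, B):
    """Dot product kernel: k(x, z) = <x, z>."""
    return A @ B.T

def gaussian_gram(A, B, sigma=1.0):
    """Radial basis Gaussian kernel: k(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    # Clip tiny negative values caused by rounding before exponentiating.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def poly2_gram(A, B, c=1.0):
    """Inhomogeneous polynomial kernel of degree two: k(x, z) = (<x, z> + c)^2."""
    return (A @ B.T + c) ** 2

def is_positive_semidefinite(K, tol=1e-8):
    """Check symmetry and non-negative eigenvalues up to a tolerance."""
    return bool(np.allclose(K, K.T) and np.linalg.eigvalsh(K).min() >= -tol)
```

Each function takes two arrays of points (one point per row) and returns the matrix of pairwise kernel values; on any set of points the resulting Gram matrix should pass the positive semidefiniteness check.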




Computing
Methods and data




All computational work was done in R.
- Kriging: geoR package, function krige.conv.
- RRCM: PredictiveRegression package, function iidpred.
- "Kernel trick": self-developed functions (built on the PredictiveRegression
package) for RRCM in "dual form" and for implementing the kernels.
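
The thesis code is in R and is not reproduced here. As a hedged illustration of what "dual form" means for ridge regression, the following Python sketch compares the primal prediction with the dual one, in which the data enter only through a kernel. The function names and the `kernel(A, B)` calling convention are assumptions made for this example.

```python
import numpy as np

def primal_ridge_predict(X, y, x_new, a):
    """Primal form: x_new^T (X^T X + a I_p)^{-1} X^T y."""
    w = np.linalg.solve(X.T @ X + a * np.eye(X.shape[1]), X.T @ y)
    return x_new @ w

def dual_ridge_predict(X, y, x_new, a, kernel):
    """Dual form: k(x_new)^T (K + a I_n)^{-1} y, using only kernel values."""
    K = kernel(X, X)                            # n x n Gram matrix
    k_new = kernel(x_new[None, :], X).ravel()   # kernel values against x_new
    alpha = np.linalg.solve(K + a * np.eye(len(y)), y)
    return k_new @ alpha
```

With the dot product kernel, `kernel = lambda A, B: A @ B.T`, the two functions return the same prediction for any ridge factor `a > 0`; replacing the kernel by a Gaussian or polynomial one gives a nonlinear predictor at no extra algebraic cost.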




Data
Methods and data




The data for this study were kindly provided by the XVPCA (Network for
Monitoring and Forecasting of Air Pollution) of the Generalitat de
Catalunya.
Mean annual concentrations of two criteria pollutants, NO2 and PM10, are
provided for the Barcelona Metropolitan Region, together with the
geographical coordinates of the monitoring stations (Mercator, UTM 31).
Time frames:
     - NO2: 1998 - 2009, excluding 2003
     - PM10: 2001 - 2009, excluding 2003




Data - 2
Methods and data




There are 49 monitoring stations in the area in total.
The Barcelona Metropolitan Region (BMR) has a territory of about 3,200 km² and
accommodates over 5 million inhabitants.
About 107 million trips are made in the BMR each week, 54.1% of
them by motorized transport.




Data - 3
Methods and data




Table 1. Available observations of mean annual nitrogen dioxide concentrations, by year

    Year          1998  1999  2000  2001  2002  2004  2005  2006  2007  2008  2009
    Observations    24    25    25    25    25    24    22    24    25    25    24


Table 2. Available observations of mean annual particulate matter concentrations, by year

    Year          2001  2002  2004  2005  2006  2007  2008  2009
    Observations    22    24    28    28    29    30    33    36




Data - 4
Methods and data




Two major limiting factors of the data set:
     Size: there was a small number of observations for each year and
     pollutant.
     Distribution: the measurement sites are situated quite far apart
     from one another and are distributed unevenly over
     the geographic region.

Also, the data are annual means; more frequent observations were
unavailable for this study.




Ordinary kriging and RRCM modeling results
Results

[Figures: ordinary kriging and RRCM modeling results, slides 24 to 26 of the original deck; images not preserved]

Kernelisation: a Gaussian kernel
Results

[Figures: RRCM results with a Gaussian kernel, slides 27 to 29 of the original deck; images not preserved]

Comparison of the RRCM models
Results

[Figures: comparison of the RRCM models, slides 30 and 31 of the original deck; images not preserved]

Comparison of the RRCM models - 3
Results




           Table: Comparison of RRCM models for different ridge factors, PM10 (µg/m³)
                    linear iid                   RBF                     polynomial
   ridge     0.01       1          2      0.01     1       2      0.01       1          2
   2001     64.46     64.44      67.13   71.08   63.11   66.06   71.95     74.63      77.24
   2002     43.43     42.46      45.54   47.41   42.91   45.05   50.44     53.17      55.82
   2004     47.26     39.17      34.59   51.48   39.29   35.19   34.66     37.00      39.51
   2005     39.65     45.14      49.28   35.50   47.60   51.91   51.44     54.76      57.76
   2006     47.68     45.40      48.63   55.51   46.09   48.86   52.48     55.27      57.86
   2007     91.43     94.02      96.45   85.40   94.09   96.65   99.83     102.11     104.29
   2008     49.48     50.90      52.58   45.42   55.27   58.21   55.60     57.26      58.91
   2009     28.42     27.32      29.01   29.16   26.11   27.79   32.26     33.67      35.09




Comparison of the RRCM models - 4
Results

[Figures: comparison of the RRCM models, slides 33 and 34 of the original deck; images not preserved]

Comparison of the RRCM models - 6
Results




            Table: Comparison of RRCM models for different ridge factors, NO2 (µg/m³)
                      linear iid                   RBF                     polynomial
    ridge      0.01       1          2      0.01     1       2      0.01        1         2
    1998      76.08     72.33      68.27   65.81   72.37   68.37   65.27     64.71      65.99
    1999      66.31     60.11      61.44   67.68   60.57   60.39   65.32     68.20      70.87
    2000      51.69     55.27      57.89   50.91   52.90   55.63   61.89     64.19      66.38
    2001      36.25     41.30      44.90   35.32   38.65   42.36   49.54     52.34      54.95
    2002      52.12      46.57     49.51   47.78   51.44   57.38   54.51     56.99      59.37
    2004      53.65     59.11      62.46   53.89   56.95   60.41   67.06     69.36      71.60
    2005      78.75     84.77      88.57   79.44   82.18   86.14   94.41     96.94      99.43
    2006      61.79     66.39      69.78   61.24   63.82   67.38   74.90     77.36      79.76
    2007      47.01     49.35      53.13   48.15   47.11   51.04   57.15     59.91      62.48
    2008      46.96     50.15      53.58   47.45   48.04   51.55   57.63     60.21      62.63
    2009      55.59     55.17      53.89   48.38   54.35   52.68   52.79     55.19      57.57




Efficiency of predictions
Discussion




Kriging predictions are smooth and vary little, and they are made for mean annual
data. The error estimates, however, are very large for nitrogen dioxide and
small for airborne particles, which reflects the properties of the substances:
NO2 is known to be generally more variable than PM10.
Kriging intervals can be derived by assuming that the data are Gaussian.
This assumption is common, but not always correct. RRCM
makes no assumption about the data distribution, apart from the observations being iid.
Two factors help boost the efficiency of RRCM predictions: the kernel and
the ridge factor. The latter is chosen by brute force (the method
of successive approximations).
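
To make the Gaussianity assumption concrete, here is a minimal Python sketch of how an interval is usually derived from a kriging estimate and its kriging variance. The function name and the example inputs are hypothetical; the interval is only as trustworthy as the Gaussian assumption behind it.

```python
import math
from statistics import NormalDist

def gaussian_interval(estimate, kriging_variance, confidence=0.95):
    """Interval estimate +/- z * sqrt(variance), valid only under Gaussianity."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided normal quantile
    half_width = z * math.sqrt(kriging_variance)
    return estimate - half_width, estimate + half_width

# Hypothetical example: estimate of 40 µg/m3 with kriging variance 25.
lower, upper = gaussian_interval(40.0, 25.0, confidence=0.95)
```

An RRCM prediction set plays the same role as this interval, but its coverage relies on the iid assumption rather than on a Gaussian model.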



Conformal predictors and geostatistics
Conclusion




                     Table: Comparison of OK and RRCM

    OK                                        RRCM
    point predictions                         prediction sets (usually intervals)
    regression algorithm                      regression algorithm
    Gaussianity assumption                    iid assumption
    estimates error variance                  -
    uses a variogram and the related          uses any appropriate kernel
      covariance function
    -                                         ridge factor
    may lack confidence                       confidence level is chosen and guaranteed

Future research
Conclusion




     Extend the existing data set for BMR
     Provide additional validation for the methods
     Test these models on the data for other cities
     Develop conformal predictors on the basis of other popular air
     pollution exposure modeling algorithms (land use regression,
     dispersion models etc.)




Selected references


    V. Vovk, A. Gammerman, G. Shafer, Algorithmic learning in a random
    world, Springer (2005).
    V. Vovk, I. Nouretdinov, A. Gammerman, On-line predictive linear
    regression, The Annals of Statistics (2009).
    H. Wackernagel, Multivariate geostatistics: an introduction with
    applications, Springer (2003).
    B. Schölkopf, A. J. Smola, Learning with kernels: support vector
    machines, regularization, optimization, and beyond, MIT Press
    (2002).
    A. Lertxundi-Manterola, M. Saez, Modelling of nitrogen dioxide (NO2)
    and fine particulate matter (PM10) air pollution in the metropolitan
    areas of Barcelona and Bilbao, Spain, Environmetrics (2009).


Selected references - 2



    A. Hoerl, R. Kennard, Ridge regression: Biased estimation for
    nonorthogonal problems, Technometrics 12.1 (1970).
    P. Diggle, P. Ribeiro Jr., Model-Based Geostatistics, Springer (2007).
    P. Ribeiro Jr., P. Diggle, geoR: a package for geostatistical analysis,
    R-NEWS 1.2 (2001).
    N. Cressie, Statistics for spatial data, Wiley (1993).
    M. Jerrett et al., A review and evaluation of intraurban air pollution
    exposure models, Journal of exposure analysis and environmental
    epidemiology (2005).





 
Calculation of solar radiation by using regression methods
Calculation of solar radiation by using regression methodsCalculation of solar radiation by using regression methods
Calculation of solar radiation by using regression methodsmehmet şahin
 
Effects of missing observations on
Effects of missing observations onEffects of missing observations on
Effects of missing observations onijcsa
 
Boosting ced using robust orientation estimation
Boosting ced using robust orientation estimationBoosting ced using robust orientation estimation
Boosting ced using robust orientation estimationijma
 
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...IJMER
 
Spatial Point Processes and Their Applications in Epidemiology
Spatial Point Processes and Their Applications in EpidemiologySpatial Point Processes and Their Applications in Epidemiology
Spatial Point Processes and Their Applications in EpidemiologyLilac Liu Xu
 
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...gerogepatton
 
article_mdimagh_haddar_2012
article_mdimagh_haddar_2012article_mdimagh_haddar_2012
article_mdimagh_haddar_2012Mdimagh Ridha
 
Ellipsometry- non destructive measuring method
Ellipsometry- non destructive measuring methodEllipsometry- non destructive measuring method
Ellipsometry- non destructive measuring methodViji Vijitha
 
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...ijrap
 
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...Cell hole identification in carcinogenic segment using Geodesic Methodology: ...
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...Soumen Santra
 

Similar a Air pollution prediction using conformal kriging and machine learning (20)

Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
 
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
Moving Target Detection Using CA, SO and GO-CFAR detectors in Nonhomogeneous ...
 
Improvement of Anomaly Detection Algorithms in Hyperspectral Images Using Dis...
Improvement of Anomaly Detection Algorithms in Hyperspectral Images Using Dis...Improvement of Anomaly Detection Algorithms in Hyperspectral Images Using Dis...
Improvement of Anomaly Detection Algorithms in Hyperspectral Images Using Dis...
 
CLIM: Transition Workshop - Incorporating Spatial Dependence in Remote Sensin...
CLIM: Transition Workshop - Incorporating Spatial Dependence in Remote Sensin...CLIM: Transition Workshop - Incorporating Spatial Dependence in Remote Sensin...
CLIM: Transition Workshop - Incorporating Spatial Dependence in Remote Sensin...
 
Boosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation EstimationBoosting CED Using Robust Orientation Estimation
Boosting CED Using Robust Orientation Estimation
 
NIR imaging
NIR imaging NIR imaging
NIR imaging
 
Path Loss Prediction by Robust Regression Methods
Path Loss Prediction by Robust Regression MethodsPath Loss Prediction by Robust Regression Methods
Path Loss Prediction by Robust Regression Methods
 
Photoacoustic tomography based on the application of virtual detectors
Photoacoustic tomography based on the application of virtual detectorsPhotoacoustic tomography based on the application of virtual detectors
Photoacoustic tomography based on the application of virtual detectors
 
A statistical approach to spectrum sensing using bayes factor and p-Values
A statistical approach to spectrum sensing using bayes factor and p-ValuesA statistical approach to spectrum sensing using bayes factor and p-Values
A statistical approach to spectrum sensing using bayes factor and p-Values
 
MAXENTROPIC APPROACH TO DECOMPOUND AGGREGATE RISK LOSSES
MAXENTROPIC APPROACH TO DECOMPOUND AGGREGATE RISK LOSSESMAXENTROPIC APPROACH TO DECOMPOUND AGGREGATE RISK LOSSES
MAXENTROPIC APPROACH TO DECOMPOUND AGGREGATE RISK LOSSES
 
Calculation of solar radiation by using regression methods
Calculation of solar radiation by using regression methodsCalculation of solar radiation by using regression methods
Calculation of solar radiation by using regression methods
 
Effects of missing observations on
Effects of missing observations onEffects of missing observations on
Effects of missing observations on
 
Boosting ced using robust orientation estimation
Boosting ced using robust orientation estimationBoosting ced using robust orientation estimation
Boosting ced using robust orientation estimation
 
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...
Composite Analysis of Phase Resolved Partial Discharge Patterns using Statist...
 
Spatial Point Processes and Their Applications in Epidemiology
Spatial Point Processes and Their Applications in EpidemiologySpatial Point Processes and Their Applications in Epidemiology
Spatial Point Processes and Their Applications in Epidemiology
 
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...
Modeling the Chlorophyll-a from Sea Surface Reflectance in West Africa by Dee...
 
article_mdimagh_haddar_2012
article_mdimagh_haddar_2012article_mdimagh_haddar_2012
article_mdimagh_haddar_2012
 
Ellipsometry- non destructive measuring method
Ellipsometry- non destructive measuring methodEllipsometry- non destructive measuring method
Ellipsometry- non destructive measuring method
 
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...
OPTIMIZATION OF MANUFACTURE OF FIELDEFFECT HETEROTRANSISTORS WITHOUT P-NJUNCT...
 
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...Cell hole identification in carcinogenic segment using Geodesic Methodology: ...
Cell hole identification in carcinogenic segment using Geodesic Methodology: ...
 

Air pollution prediction using conformal kriging and machine learning

  • 1. Conformal prediction of air pollution concentrations for the Barcelona Metropolitan Region. PhD Thesis summary. Olga Ivina, University of Girona, GRECS research group, CIBER de Epidemiología y la Salud Pública. November 22, 2012.
  • 2. Outline. Introduction: Air pollution and its effects; Air pollution exposure assessment; Conformal predictors for air pollution problem. Objectives. Methods and data: Kriging; Conformal predictors; Computing; Data. Results: Ordinary kriging and RRCM models in default setting; Kernelisation: a Gaussian kernel; Kernelisation: other kernels; Comparison of models. Discussion. Conclusion: Conformal predictors and geostatistics; Future research.
  • 3. Air pollution and its effects (Introduction). Air pollution is a problem of growing concern all over the world. A large body of scientific evidence documents its hazardous effects on people's health and well-being, as well as on the general ecological condition of the planet. In people, it is associated with adverse health outcomes in both adults and children. Children are especially susceptible to pollution: they are affected from the very first stages of their lives onwards. Linked outcomes include, to name a few: preterm birth and low birth weight; aggravation of asthma, cough and bronchitis; allergies (hay fever, rhinitis, ...); excess risk of mortality.
  • 4. Air pollution and its effects - 2 (Introduction). Adults are affected by pollution as well. In adults, pollution is linked to both long-term and short-term health effects, to name a few: respiratory disease (COPD, asthma, chronic bronchitis); lung cancer; cardiovascular morbidity; mortality (cancer, all-cause, cardiopulmonary, non-accidental, ...). Special factors of impact: a person's socioeconomic status (SES) and geographical location.
  • 5. Air pollution and its effects - 3 (Introduction). Figure: global air pollution map produced by Envisat's SCIAMACHY. Authors: S. Beirle, U. Platt and T. Wagner, University of Heidelberg's Institute for Environmental Physics.
  • 6. Air pollution and its effects - 4 (Introduction). The main contributor to air pollution in urban areas is traffic. Two "criteria" traffic-related air pollutants are considered in this study: nitrogen dioxide (NO2) and particulate matter (PM10). NO2 effects: short-term, respiratory effects and asthma aggravation; long-term, risk of coronary heart disease and fatal events. PM10 effects: short-term, aggravation of respiratory and cardiovascular diseases, premature death, ...; long-term, development of heart and lung diseases, premature death, ...
  • 7. Air pollution exposure assessment (Introduction). Problem: direct measurements of pollution are not always available. A large number of models aim to predict pollution at a given spot. The main classes are: proximity models; geostatistical models; land use regression (LUR) models; dispersion models; integrated meteorological emission (IME) models; hybrid models.
  • 8. Conformal predictors for air pollution problem (Introduction). Problem: existing methods for air pollution exposure assessment may lack confidence in their predictions. To tackle this problem, this research suggests using a recently developed approach, conformal predictors. A conformal predictor is a "confidence predictor": the confidence level for the prediction is chosen by the user in advance. The prediction is always valid, which follows from the definition of a conformal predictor.
  • 9. Conformal predictors for air pollution problem - 2 (Introduction). A conformal predictor is defined by some nonconformity measure, and it has two major desiderata: validity of predictions and efficiency of predictions. Conformal predictors are flexible: they can be based upon almost any underlying statistical algorithm. In air pollution modeling, if a regression-based algorithm such as LUR or kriging is used, the regression residuals serve as the nonconformity measure.
  • 10. Objectives. This dissertation has two major objectives: (1) to demonstrate the capacity of conformal predictors as a method for spatial environmental modeling; (2) to provide valid estimates of nitrogen dioxide (NO2) and particulate matter (PM10) concentrations for the Barcelona Metropolitan Region.
  • 11. Kriging (Methods and data). Kriging is a spatial interpolation method. It predicts a quantity of interest at an unobserved point on the basis of a set of observed points, and it also provides an estimate of the error variance (called the "kriging variance"). It was first introduced in 1951 by the South African engineer D. G. Krige in his master's thesis, devoted to the estimation of a mineral ore body. The method has been developed further: nowadays the term "kriging" stands for a set of methods such as ordinary kriging, simple kriging, co-kriging, Bayesian kriging etc. In its simplest form, a kriging estimate at an unobserved location is a linear combination of the observed data. The coefficients of this combination depend on the spatial structure of the data and on the spatial covariance.
  • 12. Kriging - 2 (Methods and data). The most common form of kriging is ordinary kriging. It is used when the mean of the second-order stationary process is unknown. It is based on the geostatistical concept of the variogram and on its counterpart, the covariance function. Let there be $n$ neighboring observed locations $x_1, \dots, x_n$ and an unobserved location $x_0$ on a spatial domain $D$. Let $\{Z(x) : x \in D\}$ denote the process, and let it have a variogram $\gamma(h)$. Then the ordinary kriging estimate $Z^*_{OK}(x_0)$ at the unobserved point $x_0$ takes the following analytical form: $Z^*_{OK}(x_0) = \sum_{\alpha=1}^{n} \omega_\alpha Z(x_\alpha)$ (1), where $\omega_\alpha$ are the kriging weights. Ordinary kriging provides best linear unbiased estimates (BLUE) of a random field, together with an error variance estimate (the kriging variance).
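Equation (1) can be illustrated with a minimal sketch. The thesis itself used the geoR function krige.conv in R; the Python code below is only an illustration under stated assumptions: an exponential variogram with hypothetical `sill` and `rng` parameters, made-up station coordinates and concentrations, and a small hand-written linear solver. The weights are obtained from the usual ordinary kriging system, in which they are constrained to sum to one through a Lagrange multiplier.

```python
import math

def exp_variogram(h, sill=1.0, rng=10.0):
    # Assumed exponential variogram model; sill and rng are illustrative values.
    return sill * (1.0 - math.exp(-h / rng))

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ordinary_kriging(points, values, x0, gamma=exp_variogram):
    # System: sum_b w_b * gamma(x_a - x_b) + mu = gamma(x_a - x0), sum_b w_b = 1.
    n = len(points)
    A = [[gamma(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, x0)) for p in points] + [1.0]
    sol = solve(A, b)
    w, mu = sol[:n], sol[n]
    estimate = sum(wi * zi for wi, zi in zip(w, values))   # equation (1)
    variance = sum(wi * gi for wi, gi in zip(w, b[:n])) + mu  # kriging variance
    return estimate, variance, w

# Hypothetical stations (coordinates in km) and concentrations.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
no2 = [40.0, 50.0, 45.0, 60.0]
print(ordinary_kriging(stations, no2, (5.0, 5.0)))
```

At an observed location the sketch reproduces the observed value with zero kriging variance, which is the exact-interpolation property of kriging without a nugget effect.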
  • 13. New methods. Conformal predictors (Methods and data). How does it work? We are given pairs of observations $(x_i, y_i)$, where $x_i$ is an object and $y_i$ is its label. Then $Z := X \times Y$ (2) denotes the example space; $Z$ is a measurable space. Given an incomplete data sequence $(x_1, y_1), (x_2, y_2), \dots, (x_{n-1}, y_{n-1}) \in Z^*$, the aim is to predict the label $y_n$ of an object $x_n$. An operator $D : Z^* \times X \to Y$ (3) then denotes a simple predictor (e.g., an ordinary kriging predictor).
  • 14. New methods. Conformal predictors - 2 (Methods and data). The prediction can be written as $\hat{y}_n = D(x_1, y_1, x_2, y_2, \dots, x_{n-1}, y_{n-1}; x_n) \in Y$ (4). Let us allow the predictor to output prediction sets $Y_n \subseteq Y$ large enough to provide confidence in the prediction. This means that the true value of $y_n$ falls in $Y_n$ with a given level of confidence, which is chosen in advance and supplied to the predictor. A conformal predictor is a confidence predictor defined by some nonconformity measure. Given the measure, a conformal predictor outputs the prediction set under the assumption that the new example conforms with the observed ones.
  • 15. New methods. Conformal predictors - 3 (Methods and data). The ridge regression confidence machine (RRCM) is a regression-based conformal predictor. It uses the ridge regression procedure (Hoerl and Kennard, 1970) as its underlying algorithm. Suppose $X_n$ is the $n \times p$ matrix of objects (independent variables) and $Y_n$ is the vector of labels (dependent variable). Then the ridge regression estimate of the parameters $\omega$ takes the form $\omega = (X_n' X_n + a I_p)^{-1} X_n' Y_n$ (5), where $a$ is the ridge factor; $a = 0$ yields the standard least squares estimate. The nonconformity scores for this predictor are the absolute regression residuals: $|e_i| := |y_i - \hat{y}_i|$.
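As a small worked instance of equation (5), the sketch below takes the special case of a single feature and no intercept (p = 1), where the matrix expression reduces to a ratio of sums. The function names and the toy numbers are illustrative assumptions, not part of the thesis code.

```python
def ridge_1d(xs, ys, a):
    # Equation (5) with p = 1: omega = (sum x_i^2 + a)^(-1) * sum x_i y_i.
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + a)

def nonconformity_scores(xs, ys, a):
    # Absolute residuals |y_i - omega * x_i| serve as nonconformity scores.
    omega = ridge_1d(xs, ys, a)
    return [abs(y - omega * x) for x, y in zip(xs, ys)]

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(ridge_1d(xs, ys, 0.0))   # least squares slope
print(ridge_1d(xs, ys, 14.0))  # a larger ridge factor shrinks the slope
print(nonconformity_scores(xs, ys, 14.0))
```

With a = 0 the estimate coincides with ordinary least squares; increasing the ridge factor shrinks the coefficient towards zero and enlarges the residuals.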
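  • 16. New methods. Conformal predictors - 4 (Methods and data). Given a significance level $\epsilon$ for the prediction (roughly, the probability of error that must not be exceeded), the RRCM considers each candidate label $y$ for $y_n$ and, for every example $i$, the set $S_i := \{y : \alpha_i(y) \ge \alpha_n(y)\} = \{y : |a_i + b_i y| \ge |a_n + b_n y|\}$ (6), where $\alpha_i(y)$ is the nonconformity score of example $i$ and $a_i$ and $b_i$ are the components of the vectors $A$ and $B$, so that the residual vector for candidate label $y$ is $A + By$. The prediction set consists of the labels $y$ that belong to more than a fraction $\epsilon$ of the sets $S_i$. RRCM thus outputs prediction sets instead of the point predictions that kriging gives. These sets can take the form of a point, an interval, a ray, a union of two rays, the whole real line, or the empty set. Usually the set is an interval.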
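The construction in equation (6) can be sketched for the single-feature ridge case. The code below is not the PredictiveRegression implementation: it derives the vectors A and B by hand for p = 1 (an assumption made to keep the example short) and evaluates p-values on a grid of candidate labels, whereas the actual RRCM computes the prediction set exactly from the points where the sets S_i change. All names and data are hypothetical.

```python
def rrcm_ab(xs, ys, x_new, a):
    # For single-feature ridge, the residual of example i with candidate
    # label y for x_new is linear in y: residual_i(y) = A[i] + B[i] * y.
    d = sum(x * x for x in xs) + x_new * x_new + a
    s0 = sum(x * y for x, y in zip(xs, ys))
    A = [y - x * s0 / d for x, y in zip(xs, ys)] + [-x_new * s0 / d]
    B = [-x * x_new / d for x in xs] + [1.0 - x_new * x_new / d]
    return A, B

def p_value(A, B, y):
    # Fraction of examples i with alpha_i(y) >= alpha_n(y), i.e. with y in S_i.
    scores = [abs(ai + bi * y) for ai, bi in zip(A, B)]
    return sum(s >= scores[-1] for s in scores) / len(scores)

def prediction_set(xs, ys, x_new, a, epsilon, grid):
    # Grid approximation of the conformal prediction set {y : p(y) > epsilon}.
    A, B = rrcm_ab(xs, ys, x_new, a)
    return [y for y in grid if p_value(A, B, y) > epsilon]

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
grid = [i / 10 for i in range(0, 101)]
region = prediction_set(xs, ys, x_new=2.5, a=1.0, epsilon=0.25, grid=grid)
print(min(region), max(region))
```

With only five examples the p-values move in steps of 1/5, so the attainable confidence levels are coarse; this mirrors the small-sample limitation discussed for the monitoring data.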
  • 17. New methods. Conformal predictors - 5 (Methods and data). When the number of parameters $p$ is large, computation is hard. The "kernel trick" is a method that helps to deal with high-dimensional data. It also makes it possible to account for nonlinearity in RRCM. A kernel is a similarity measure that operates in a feature space. Given an input space $X$ with a dot product and an operator $\Phi$ that maps $X$ to a feature space $H$, $\Phi : X \to H$, $x \mapsto \mathbf{x} := \Phi(x)$, a kernel is defined as follows. For $x_\alpha, x_\beta \in X$: $k(x_\alpha, x_\beta) = \langle \Phi(x_\alpha), \Phi(x_\beta) \rangle$ (7).
  • 18. New methods. Conformal predictors - 6 (Methods and data). Any conventional covariance function for kriging can be used as a kernel for RRCM. This research uses three (positive definite) kernels: a dot product kernel (default); a radial basis Gaussian kernel; an inhomogeneous polynomial kernel of the second degree.
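The three kernels named above can be written down in a few lines. This is a sketch only: the bandwidth `sigma` and the offset `c` are assumed parameters whose values are not given in the slides, and `phi_poly2` is a hypothetical explicit feature map for two-dimensional inputs, included to show that the polynomial kernel satisfies equation (7).

```python
import math

def dot_kernel(u, v):
    # Dot product (linear) kernel: k(u, v) = <u, v>.
    return sum(a * b for a, b in zip(u, v))

def gaussian_kernel(u, v, sigma=1.0):
    # Radial basis Gaussian kernel; sigma is an assumed bandwidth.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def poly2_kernel(u, v, c=1.0):
    # Inhomogeneous polynomial kernel of degree 2: (<u, v> + c)^2.
    return (dot_kernel(u, v) + c) ** 2

def phi_poly2(x, c=1.0):
    # Explicit feature map for 2-D inputs such that
    # <phi(u), phi(v)> equals poly2_kernel(u, v, c).
    x1, x2 = x
    r = math.sqrt(2.0 * c)
    return (x1 * x1, x2 * x2, math.sqrt(2.0) * x1 * x2, r * x1, r * x2, c)

u, v = (1.0, 2.0), (3.0, 4.0)
print(dot_kernel(u, v), gaussian_kernel(u, v), poly2_kernel(u, v))
print(dot_kernel(phi_poly2(u), phi_poly2(v)))
```

The last line evaluates the same quantity as `poly2_kernel(u, v)` through the feature space, which is what the kernel trick avoids doing explicitly.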
  • 19. Computing (Methods and data). All computational work was done in R. Kriging: geoR package, function krige.conv. RRCM: PredictiveRegression package, function iidpred. "Kernel trick": self-developed functions, built on the PredictiveRegression package, for RRCM in "dual form" and for implementing the kernels.
  • 20. Data (Methods and data). The data for this study were kindly provided by XVPCA (Network for Monitoring and Forecasting of Air Pollution) of the Generalitat de Catalunya. Mean annual concentrations of two criteria pollutants, NO2 and PM10, are provided for the Barcelona Metropolitan Region, together with the geographical coordinates of the monitoring stations (Mercator, UTM 31). Time frames: NO2, 1998 - 2009, excluding 2003; PM10, 2001 - 2009, excluding 2003.
  • 21. Data - 2 (Methods and data). There are 49 monitoring stations over the area in total. The Barcelona Metropolitan Region (BMR) covers about 3200 km² and has over 5 million inhabitants. About 107 million trips are made in the BMR every week, 54.1% of them by motorized transport.
  • 22. Data - 3 (Methods and data).
      Table 1. Data on mean annual nitrogen dioxide concentrations: available observations for each year
        Year          1998  1999  2000  2001  2002  2004  2005  2006  2007  2008  2009
        Observations    24    25    25    25    25    24    22    24    25    25    24
      Table 2. Data on mean annual particulate matter concentrations: available observations for each year
        Year          2001  2002  2004  2005  2006  2007  2008  2009
        Observations    22    24    28    28    29    30    33    36
  • 23. Data - 4 (Methods and data). The data set has two major drawbacks, or limiting factors. Size: there is only a small number of observations for each year and pollutant. Distribution: the measurement sites lie quite far apart from one another and are spread unevenly over the geographic region. In addition, the data are annual means; more frequent observations were unavailable for this study.
  • 24. Ordinary kriging and RRCM modeling results (Results).
  • 25. Ordinary kriging and RRCM modeling results - 2 (Results).
  • 26. Ordinary kriging and RRCM modeling results - 3 (Results).
  • 27. Kernelisation: a Gaussian kernel (Results).
  • 28. Kernelisation: a Gaussian kernel - 2 (Results).
  • 29. Kernelisation: a Gaussian kernel - 3 (Results).
  • 30. Comparison of the RRCM models (Results).
  • 31. Comparison of the RRCM models - 2 (Results).
  • 32. Comparison of the RRCM models - 3 (Results).
      Table: Comparison of models for different ridge factors (µg/m³)
        Kernel      linear iid               RBF                      polynomial
        Ridge    0.01      1       2      0.01      1       2      0.01      1       2
        2001    64.46   64.44   67.13    71.08   63.11   66.06    71.95   74.63   77.24
        2002    43.43   42.46   45.54    47.41   42.91   45.05    50.44   53.17   55.82
        2004    47.26   39.17   34.59    51.48   39.29   35.19    34.66   37.00   39.51
        2005    39.65   45.14   49.28    35.50   47.60   51.91    51.44   54.76   57.76
        2006    47.68   45.40   48.63    55.51   46.09   48.86    52.48   55.27   57.86
        2007    91.43   94.02   96.45    85.40   94.09   96.65    99.83  102.11  104.29
        2008    49.48   50.90   52.58    45.42   55.27   58.21    55.60   57.26   58.91
        2009    28.42   27.32   29.01    29.16   26.11   27.79    32.26   33.67   35.09
  • 33. Comparison of the RRCM models - 4 (Results).
  • 34. Comparison of the RRCM models - 5 (Results).
  • 35. Comparison of the RRCM models - 6 (Results).
      Table: Comparison of models for different ridge factors (µg/m³)
        Kernel      linear iid               RBF                      polynomial
        Ridge    0.01      1       2      0.01      1       2      0.01      1       2
        1998    76.08   72.33   68.27    65.81   72.37   68.37    65.27   64.71   65.99
        1999    66.31   60.11   61.44    67.68   60.57   60.39    65.32   68.20   70.87
        2000    51.69   55.27   57.89    50.91   52.90   55.63    61.89   64.19   66.38
        2001    36.25   41.30   44.90    35.32   38.65   42.36    49.54   52.34   54.95
        2002    52.12   46.57   49.51    47.78   51.44   57.38    54.51   56.99   59.37
        2004    53.65   59.11   62.46    53.89   56.95   60.41    67.06   69.36   71.60
        2005    78.75   84.77   88.57    79.44   82.18   86.14    94.41   96.94   99.43
        2006    61.79   66.39   69.78    61.24   63.82   67.38    74.90   77.36   79.76
        2007    47.01   49.35   53.13    48.15   47.11   51.04    57.15   59.91   62.48
        2008    46.96   50.15   53.58    47.45   48.04   51.55    57.63   60.21   62.63
        2009    55.59   55.17   53.89    48.38   54.35   52.68    52.79   55.19   57.57
  • 36. Efficiency of predictions (Discussion). Kriging predictions, made here for mean annual data, are smooth and vary little. The error estimates, however, are very large for nitrogen dioxide and small for airborne particles, which reflects the properties of the substances: NO2 is known to be generally more variable than PM10. Kriging prediction intervals can be derived by assuming that the data are Gaussian. This assumption is common, but not always correct. RRCM makes no assumption about the data distribution other than that the examples are iid. Two factors help boost the efficiency of RRCM predictions: kernels and the ridge factor. The latter is chosen by brute-force search (the method of successive approximations).
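For comparison with the RRCM prediction sets, the Gaussian kriging interval mentioned in the discussion can be sketched as the kriging estimate plus or minus a normal quantile times the kriging standard deviation. The function name and the numbers are illustrative assumptions; the interval is only as trustworthy as the Gaussianity assumption behind it.

```python
import math
from statistics import NormalDist

def kriging_interval(estimate, kriging_variance, confidence=0.95):
    # Two-sided interval under the assumption of Gaussian prediction errors.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(kriging_variance)
    return estimate - half_width, estimate + half_width

# Hypothetical kriging output: estimate 50 µg/m³ with kriging variance 25.
print(kriging_interval(50.0, 25.0, confidence=0.95))
```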
  • 37. Conformal predictors and geostatistics (Conclusion).
      Table: Comparison of OK and RRCM
        OK                                     | RRCM
        point predictions                      | prediction sets (usually intervals)
        regression algorithm                   | regression algorithm
        Gaussianity assumption                 | iid assumption
        estimates error variance               | -
        uses variogram and covariance function | uses any appropriate kernel
          to approach it                       |
        -                                      | ridge factor
        may lack confidence                    | confidence level is chosen and guaranteed
  • 38. Future research (Conclusion). Extend the existing data set for the BMR. Provide additional validation for the methods. Test these models on data for other cities. Develop conformal predictors on the basis of other popular air pollution exposure modeling algorithms (land use regression, dispersion models etc.).
  • 39. Selected references. V. Vovk, A. Gammerman, G. Shafer, Algorithmic learning in a random world, Springer (2005). V. Vovk, I. Nouretdinov, A. Gammerman, On-line predictive linear regression, The Annals of Statistics (2009). H. Wackernagel, Multivariate geostatistics: an introduction with applications, Springer (2003). B. Schölkopf, A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond, MIT Press (2002). A. Lertxundi-Manterola, M. Saez, Modelling of nitrogen dioxide (NO2) and fine particulate matter (PM10) air pollution in the metropolitan areas of Barcelona and Bilbao, Spain, Environmetrics (2009).
  • 40. Selected references - 2. A. Hoerl, R. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12.1 (1970). P. Diggle, P. Ribeiro Jr., Model-Based Geostatistics, Springer (2007). P. Ribeiro Jr., P. Diggle, geoR: a package for geostatistical analysis, R-NEWS 1.2 (2001). N. Cressie, Statistics for spatial data, Wiley (1993). M. Jerrett et al., A review and evaluation of intraurban air pollution exposure models, Journal of Exposure Analysis and Environmental Epidemiology (2005).