Perspectives of Predictive Modeling

Arthur Charpentier
charpentier.arthur@uqam.ca
http://freakonometrics.hypotheses.org/

(SOA Webcast, November 2013)

Agenda

• Introduction to Predictive Modeling
  ◦ Prediction, best estimate, expected value and confidence interval
  ◦ Parametric versus nonparametric models
• Linear Models and (Ordinary) Least Squares
  ◦ From least squares to the Gaussian model
  ◦ Smoothing continuous covariates
• From Linear Models to G.L.M.
• Modeling a TRUE-FALSE variable
  ◦ The logistic regression
  ◦ R.O.C. curve
  ◦ Classification tree (and random forests)
• From individual to functional data

Prediction? Best estimate?

E.g. predicting someone's weight (Y).

Consider a sample {y1, ..., yn}.

Model: Yi = β0 + εi, with E(ε) = 0 and Var(ε) = σ²;
ε is some unpredictable noise.

The sample mean, Y̅ = ȳ, is our 'best guess'...

[Figure: density of the weight sample, Weight (lbs.) from 100 to 250.]

Predicting means estimating E(Y). Recall that

E(Y) = argmin_{y ∈ R} { ‖Y − y‖_{L²} } = argmin_{y ∈ R} { E([Y − y]²) },

i.e. least squares.

Best estimate with some confidence

E.g. predicting someone's weight (Y).

Give an interval [y−, y+] such that

P(Y ∈ [y−, y+]) = 1 − α.

Confidence intervals can be derived if we can estimate the distribution of Y,

F(y) = P(Y ≤ y)  or  f(y) = dF(x)/dx evaluated at x = y.

[Figure: density of the weight sample with a central 1 − α region, Weight (lbs.) from 100 to 250.]

(related to the idea of "quantifying uncertainty" in our prediction...)

Parametric inference

E.g. predicting someone's weight (Y).

Assume that F ∈ F = {Fθ, θ ∈ Θ}.

1. Provide an estimate θ̂.
2. Compute bound estimates,

y− = Fθ̂⁻¹(α/2)  and  y+ = Fθ̂⁻¹(1 − α/2).

Standard estimation technique: maximum likelihood,

θ̂ = argmax_{θ ∈ Θ} Σ_{i=1}^n log fθ(yi)   (the log-likelihood),

with either an explicit (analytical) expression for θ̂, or numerical optimization (Newton–Raphson).

[Figure: fitted parametric density with a 90% central region, Weight (lbs.) from 100 to 250.]

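A minimal R sketch of this parametric recipe (an illustration, not the deck's own code), assuming a hypothetical numeric sample `weight`, simulated here:

library(MASS)                      # fitdistr() for maximum likelihood fits
set.seed(1)
weight <- rnorm(200, mean = 170, sd = 25)   # hypothetical sample

fit <- fitdistr(weight, "normal")           # ML estimates of (mean, sd)
theta <- fit$estimate

alpha <- 0.10                               # 90% interval
qnorm(c(alpha / 2, 1 - alpha / 2), mean = theta["mean"], sd = theta["sd"])
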
Non-parametric inference

E.g. predicting someone's weight (Y).

1. Empirical distribution function,

F̂(y) = (1/n) Σ_{i=1}^n 1(yi ≤ y) = #{i such that yi ≤ y} / n,

the natural estimator for P(Y ≤ y).

2. Compute bound estimates,

y− = F̂⁻¹(α/2)  and  y+ = F̂⁻¹(1 − α/2).

[Figure: sample density with an empirical 90% central region, Weight (lbs.) from 100 to 250.]

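The nonparametric version in R, reusing the hypothetical `weight` sample from the sketch above:

Fhat <- ecdf(weight)       # empirical distribution function F_hat
Fhat(170)                  # natural estimate of P(Y <= 170)

alpha <- 0.10
quantile(weight, probs = c(alpha / 2, 1 - alpha / 2))   # y-, y+
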
Prediction using some covariates

E.g. predicting someone's weight (Y) based on his/her sex (X1).

Model: Yi = βF + εi if X1,i = F, and Yi = βM + εi if X1,i = M,
with E(ε) = 0 and Var(ε) = σ²;

or, equivalently, Yi = β0 + β1 1(X1,i = F) + εi, with β0 = βM and β1 = βF − βM.

[Figure: weight densities for the Female and Male subsamples, with βM and βF − βM marked, Weight (lbs.) from 100 to 250.]

Prediction using some (categorical) covariates

E.g. predicting someone's weight (Y) based on his/her sex (X1).

Conditional parametric model: assume that Y | X1 = x1 ∼ Fθ(x1),

i.e. Yi ∼ FθF if X1,i = F, and Yi ∼ FθM if X1,i = M.

→ our prediction will be conditional on the covariate.

[Figure: fitted weight densities for the Female and Male subsamples, Weight (lbs.) from 100 to 250.]

Prediction using some (categorical) covariates

[Figure: two panels, "Prediction of Y when X1 = F" and "Prediction of Y when X1 = M", each showing the fitted density with a 90% central region, Weight (lbs.) from 100 to 250.]

Linear Models, and Ordinary Least Squares

E.g. predicting someone's weight (Y) based on his/her height (X2).

Linear Model: Yi = β0 + β2 X2,i + εi, with E(ε) = 0 and Var(ε) = σ².

Conditional parametric model: assume that Y | X2 = x2 ∼ Fθ(x2).

E.g. Gaussian Linear Model,

Y | X2 = x2 ∼ N(µ(x2), σ²(x2)),  with µ(x2) = β0 + β2 x2 and σ²(x2) = σ².

[Figure: scatterplot of Weight (lbs.) against Height (in.), 5.0 to 6.5, with the fitted line β0 + β2 x2 and the residual variance σ² marked.]

→ ordinary least squares,

β̂ = argmin_β { Σ_{i=1}^n [Yi − Xiᵀβ]² };

β̂ is also the M.L. estimator of β.

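A minimal R sketch of ordinary least squares (illustration only), assuming a hypothetical data frame `df` with columns `height` (in.) and `weight` (lbs.), simulated here and reused in later sketches:

set.seed(1)
df <- data.frame(height = runif(200, 5, 6.5))
df$weight <- -180 + 60 * df$height + rnorm(200, sd = 20)

fit <- lm(weight ~ height, data = df)   # ordinary least squares
coef(fit)                               # beta0_hat, beta2_hat (also the MLE)
summary(fit)$sigma                      # residual standard deviation, sigma_hat
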
Prediction using no covariates

[Figure: scatterplot of Weight (lbs.) against Height (in.), 5.0 to 6.5, with a horizontal 90% prediction band that ignores height.]

Prediction using a categorical covariate

E.g. predicting someone's weight (Y) based on his/her sex (X1).

E.g. Gaussian linear model, Y | X1 = M ∼ N(µM, σ²),

Ê(Y | X1 = M) = (1/nM) Σ_{i : X1,i = M} Yi = Y̅(M),

Y ∈ Y̅(M) ± u_{1−α/2} · σ̂   (with u_{1−α/2} ≈ 1.96 on the slide).

[Figure: scatterplot of Weight (lbs.) against Height (in.), with the horizontal band Y̅(M) ± 1.96 σ̂ for the Male subsample.]

Remark: In the linear model, Var(ε) = σ² does not depend on X1.

Prediction using a categorical covariate

E.g. predicting someone's weight (Y) based on his/her sex (X1).

E.g. Gaussian linear model, Y | X1 = F ∼ N(µF, σ²),

Ê(Y | X1 = F) = (1/nF) Σ_{i : X1,i = F} Yi = Y̅(F),

Y ∈ Y̅(F) ± u_{1−α/2} · σ̂   (with u_{1−α/2} ≈ 1.96 on the slide).

[Figure: scatterplot of Weight (lbs.) against Height (in.), with the horizontal band Y̅(F) ± 1.96 σ̂ for the Female subsample.]

Remark: In the linear model, Var(ε) = σ² does not depend on X1.

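A minimal R sketch of these per-sex bands, assuming a hypothetical factor column `sex` (levels "F", "M") added to the `df` from the earlier sketch:

df$sex <- factor(sample(c("F", "M"), nrow(df), replace = TRUE))

fit_sex <- lm(weight ~ sex, data = df)       # one mean per sex, common sigma
mu <- predict(fit_sex, newdata = data.frame(sex = c("F", "M")))
u <- qnorm(0.95)                             # u_{1-alpha/2} for alpha = 10%
cbind(lower  = mu - u * summary(fit_sex)$sigma,
      center = mu,
      upper  = mu + u * summary(fit_sex)$sigma)
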
Prediction using a continuous covariate

E.g. predicting someone's weight (Y) based on his/her height (X2).

E.g. Gaussian linear model, Y | X2 = x2 ∼ N(β0 + β1 x2, σ²),

Ê(Y | X2 = x2) = β̂0 + β̂1 x2 = Ŷ(x2),

Y ∈ Ŷ(x2) ± u_{1−α/2} · σ̂   (with u_{1−α/2} ≈ 1.96 on the slide).

[Figure: scatterplot of Weight (lbs.) against Height (in.), with the fitted line and the band Ŷ(x2) ± 1.96 σ̂.]

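In R, a prediction interval along the fitted line, reusing the `fit` and `df` from the OLS sketch above:

new <- data.frame(height = seq(5, 6.5, by = 0.25))
predict(fit, newdata = new, interval = "prediction", level = 0.90)
# slightly wider than Y_hat(x2) +/- qnorm(0.95) * sigma_hat, since
# predict() also accounts for the estimation error in beta_hat
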
Improving our prediction?

(Empirical) residuals, ε̂i = Yi − Xiᵀβ̂.

[Figure: distributions of the residuals with no covariates, with height as covariate, and with sex as covariate, on a common scale from −50 to 100.]

Compare models with R² or the log-likelihood.

Parsimony principle? → penalizing the likelihood with the number of covariates:
Akaike (AIC) or Schwarz (BIC) criteria.

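A minimal R sketch of this comparison, reusing the hypothetical `df` (with its `sex` column) from the sketches above:

fit0 <- lm(weight ~ 1, data = df)        # no covariates
fit1 <- lm(weight ~ height, data = df)   # height as covariate
fit2 <- lm(weight ~ sex, data = df)      # sex as covariate

AIC(fit0, fit1, fit2)   # -2 loglik + 2 * (number of parameters)
BIC(fit0, fit1, fit2)   # -2 loglik + log(n) * (number of parameters)
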
Relaxing the linear assumption in predictions

Use of a b-spline function basis to estimate µ(·), where µ(x) = E(Y | X = x).

[Figure: two panels, scatterplots of Weight (lbs.) against Height (in.), 5.0 to 6.5, each with a b-spline regression fit.]

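A minimal R sketch of a b-spline regression (illustration only; the basis dimension `df = 5` is an arbitrary choice), on the hypothetical `df` from above:

library(splines)
fit_bs <- lm(weight ~ bs(height, df = 5), data = df)   # b-spline basis

new <- data.frame(height = seq(5, 6.5, length.out = 100))
plot(weight ~ height, data = df, xlab = "Height (in.)", ylab = "Weight (lbs.)")
lines(new$height, predict(fit_bs, newdata = new), lwd = 2)
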
Relaxing the linear assumption in predictions

E.g. predicting someone's weight (Y) based on his/her height (X2).

E.g. Gaussian linear model, Y | X2 = x2 ∼ N(µ(x2), σ²),

Ê(Y | X2 = x2) = µ̂(x2) = Ŷ(x2),  and  Y ∈ Ŷ(x2) ± u_{1−α/2} · σ̂   (≈ 1.96 on the slide).

[Figure: scatterplot of Weight (lbs.) against Height (in.), with the smoothed fit µ̂(x2) and its prediction band.]

Gaussian model: E(Y | X = x) = µ(x) (e.g. xᵀβ) and Var(Y | X = x) = σ².

Nonlinearities and missing covariates

E.g. predicting someone's weight (Y) based on his/her height and sex.

→ nonlinearities can be related to model misspecification.

E.g. Gaussian linear model,

Yi = β0,F + β2,F X2,i + εi if X1,i = F,  and  Yi = β0,M + β2,M X2,i + εi if X1,i = M.

[Figure: scatterplot of Weight (lbs.) against Height (in.), 5.0 to 6.5, with separate fitted lines by sex.]

→ local linear regression,

β̂x = argmin_β { Σ_{i=1}^n ωi(x) · [Yi − Xiᵀβ]² },  and set Ŷ(x) = xᵀβ̂x.

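A minimal R sketch of local linear regression with loess() (locally weighted least squares with degree-1 polynomials; the span is an arbitrary choice), on the hypothetical `df` from above:

lo <- loess(weight ~ height, data = df, degree = 1, span = 0.75)

new <- data.frame(height = seq(5, 6.5, length.out = 100))
plot(weight ~ height, data = df, xlab = "Height (in.)", ylab = "Weight (lbs.)")
lines(new$height, predict(lo, newdata = new), lwd = 2)
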
Local regression and smoothing techniques

[Figure: two panels, scatterplots of Weight (lbs.) against Height (in.), 5.0 to 6.5, each with a local regression fit.]

k-nearest neighbours and smoothing techniques

[Figure: two panels, scatterplots of Weight (lbs.) against Height (in.), 5.0 to 6.5, each with a k-nearest-neighbour fit.]

Kernel regression and smoothing techniques

[Figure: two panels, scatterplots of Weight (lbs.) against Height (in.), 5.0 to 6.5, each with a kernel regression fit.]

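A minimal R sketch of Nadaraya–Watson kernel regression with base R's ksmooth() (the bandwidth is an arbitrary choice), on the hypothetical `df` from above:

ks <- ksmooth(df$height, df$weight, kernel = "normal", bandwidth = 0.3)
plot(weight ~ height, data = df, xlab = "Height (in.)", ylab = "Weight (lbs.)")
lines(ks, lwd = 2)
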
Multiple linear regression

E.g. predicting someone's misperception of his/her weight (Y) based on his/her height and weight.

→ linear model,

E(Y | X2, X3) = β0 + β2 X2 + β3 X3,  Var(Y | X2, X3) = σ².

[Figure: 3-D surface of the fitted plane over Weight (lbs.), 100 to 250, and Height (in.), 5.0 to 6.0, with the response from −10 to 10.]

Multiple non-linear regression

E.g. predicting someone's misperception of his/her weight (Y) based on his/her height and weight.

→ non-linear model,

E(Y | X2, X3) = h(X2, X3),  Var(Y | X2, X3) = σ².

[Figure: 3-D surface of the fitted non-linear function h over Weight (lbs.) and Height (in.).]

Away from the Gaussian model

Y is not necessarily Gaussian.

Y can be a counting variable, e.g. Poisson, Y ∼ P(λ(x)).
Y can be a FALSE-TRUE variable, e.g. Binomial, Y ∼ B(p(x)).

→ Generalized Linear Model (see next section),

e.g. Y | X2 = x2 ∼ P(e^{β0 + β2 x2}).

[Figure: scatterplot with the fitted exponential mean curve e^{β0 + β2 x2}.]

Remark: With a Poisson model, E(Y | X2 = x2) = Var(Y | X2 = x2).

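A minimal R sketch of a Poisson GLM with log link, on hypothetical count data simulated for the purpose:

set.seed(1)
x <- runif(200, 5, 6.5)
n_counts <- rpois(200, lambda = exp(-3 + 0.8 * x))   # hypothetical counts

fit_pois <- glm(n_counts ~ x, family = poisson(link = "log"))
coef(fit_pois)                                       # beta0_hat, beta2_hat
predict(fit_pois, newdata = data.frame(x = 6), type = "response")  # lambda_hat
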
Logistic regression

E.g. predicting someone's misperception of his/her weight (Y):

Yi = 1 if prediction > observed weight,  Yi = 0 if prediction ≤ observed weight.

Bernoulli variable, P(Y = y) = p^y (1 − p)^{1−y}, where y ∈ {0, 1}.

→ logistic regression,

P(Y = y | X = x) = p(x)^y (1 − p(x))^{1−y}, where y ∈ {0, 1}.

[Figure: two panels of 0/1 responses, against Height (in.) and against Weight (lbs.).]

Logistic regression

→ logistic regression,

P(Y = y | X = x) = p(x)^y (1 − p(x))^{1−y}, where y ∈ {0, 1},

with odds ratio

P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(xᵀβ),

so that

E(Y | X = x) = exp(xᵀβ) / (1 + exp(xᵀβ)).

Estimation of β? → maximum likelihood, β̂ (Newton–Raphson).

[Figure: fitted probability surface over Height (in.) and Weight (lbs.), from 0 to 1.]

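A minimal R sketch of a logistic regression with glm() (which fits by iteratively reweighted least squares, a Newton-type scheme), on hypothetical 0/1 data added to the `df` from above:

set.seed(1)
df$y <- rbinom(nrow(df), 1, prob = plogis(-10 + 1.8 * df$height))

fit_logit <- glm(y ~ height, data = df, family = binomial)
coef(fit_logit)                                  # beta_hat
p_hat <- predict(fit_logit, type = "response")   # E(Y | X = x)
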
Smoothed logistic regression

GLMs are linear, since

P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(xᵀβ).

→ use a smooth nonlinear function instead,

P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(h(x)),

so that

E(Y | X = x) = exp(h(x)) / (1 + exp(h(x))).

[Figure: two panels of 0/1 responses, against Height (in.) and against Weight (lbs.), each with a smoothed fitted probability curve.]

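One way to fit such an h(·) in R (an assumption on tooling, not the deck's own method) is a penalized-spline GAM via the mgcv package, reusing `df$y` from the previous sketch:

library(mgcv)
fit_gam <- gam(y ~ s(height), data = df, family = binomial)  # h(.) = smooth spline
p_smooth <- predict(fit_gam, type = "response")
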
Smoothed logistic regression

→ non-linear logistic regression,

P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(h(x)),
E(Y | X = x) = exp(h(x)) / (1 + exp(h(x))).

[Figure: smoothed probability surface over Height (in.) and Weight (lbs.), from 0 to 1.]

Remark: we do not predict Y here, but E(Y | X = x).


Predictive modeling for a {0, 1} variable
1.0

What is a good {0, 1}-model ?
Observed

Observed

−→ decision theory

0.2

Y=1

q

0.6

q

Y=0

^
Y=1

qq qqq qqqqq qq qq q
qq qqq qqqq qqq
qq qqq qqqq qq q
q qqq qq
q
qq
qq
q

0.4

^
Y=0

q

0.8

Predicted

0.0


 if P(Y |X = x) ≤ s, then Y = 0
 if P(Y |X = x) > s, then Y = 1

q qqqqqqqqqqqqqqqqq q
q qqqqqqqqqqqqq qq q
qqq qqqq qqq q
qq qqqqqq q
qqq q q q
qq
qq qqq
q
q

0.0

0.2

0.4

q

0.6

q

0.8

1.0

0.8

1.0

1.0

Predicted

error

Y =1

error

fine

0.6

fine

0.4

Y =0

q

0.2

Y =1

q

0.0

Y =0

True Positive Rate

0.8

q

0.0

0.2

0.4

0.6

False Positive Rate

29
R.O.C. curve

True positive rate,

TP(s) = P(Ŷ(s) = 1 | Y = 1) = n_{Ŷ=1, Y=1} / n_{Y=1}.

False positive rate,

FP(s) = P(Ŷ(s) = 1 | Y = 0) = n_{Ŷ=1, Y=0} / n_{Y=0}.

The R.O.C. curve is

{(FP(s), TP(s)), s ∈ (0, 1)}.

[Figure: predicted probabilities by observed class, and the R.O.C. curve in the (False Positive Rate, True Positive Rate) unit square.]

(see also model gain curve)

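A minimal R sketch computing the R.O.C. curve directly from its definition, reusing `df$y` and `p_hat` from the logistic sketch above:

s_grid <- seq(0, 1, by = 0.01)
roc <- t(sapply(s_grid, function(s) {
  yhat <- as.numeric(p_hat > s)
  c(FP = mean(yhat[df$y == 0]),   # share of Y_hat = 1 among Y = 0
    TP = mean(yhat[df$y == 1]))   # share of Y_hat = 1 among Y = 1
}))
plot(roc[, "FP"], roc[, "TP"], type = "l",
     xlab = "False Positive Rate", ylab = "True Positive Rate")
abline(0, 1, lty = 2)   # the 'no information' diagonal
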
Classification tree (CART)

If Y is a TRUE-FALSE variable, prediction is a classification problem.

E(Y | X = x) = pj if x ∈ Aj,

where A1, ..., Ak are disjoint regions of the X-space.

[Figure: scatterplot of Weight (lbs.) against Height (in.), 5.0 to 6.5, with points marked by class.]

Classification tree (CART)

→ iterative process.

Step 1. Find two subsets of indices,

either A1 = {i, X1,i < s} and A2 = {i, X1,i > s},
or A1 = {i, X2,i < s} and A2 = {i, X2,i > s},

to maximize homogeneity within subsets and maximize heterogeneity between subsets.

[Figure: a first split of the (Height, Weight) plane; overall P(Y=1) = 38.2% and P(Y=0) = 61.8%, and in one region P(Y=1) = 25% and P(Y=0) = 75%.]

Classification tree (CART)

Need an impurity criterion. E.g. the Gini index,

− Σ_{x ∈ {A1, A2}} (nx / n) Σ_{y ∈ {0,1}} (nx,y / nx) · (1 − nx,y / nx).

[Figure: a candidate split of the (Height, Weight) plane, with values 0.317, 0.523 and 0.273 attached to regions.]

Classification tree (CART)

Step k. Given a partition A1, ..., Ak, find which subset Aj will be divided,
either according to X1 or according to X2,
again maximizing homogeneity within subsets and heterogeneity between subsets.

[Figure: a refined partition of the (Height, Weight) plane, with values 0.472, 0.621, 0.444, 0.111 and 0.273 attached to regions.]

Visualizing classification trees (CART)

[Figure: two tree diagrams, with splits Y < 124.561, X < 5.39698, X < 5.69226, X < 5.52822 and Y < 162.04, and leaf proportions 0.3429, 0.1500, 0.2727, 0.4444, 0.5231, 0.3175, 0.6207, 0.1111 and 0.4722.]

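A minimal R sketch of growing and plotting a classification tree with the rpart package (illustration only), on the hypothetical `df` (with its `y` column) from the sketches above:

library(rpart)
tree <- rpart(factor(y) ~ height + weight, data = df, method = "class")
plot(tree); text(tree, use.n = TRUE)             # a diagram of this kind
predict(tree, newdata = df[1:3, ], type = "prob")
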
From trees to forests

Problem: CART trees are not robust → boosting and bagging.

Use bootstrap: resample in the data and generate a classification tree;
repeat this resampling strategy, then aggregate all the trees.

[Figure: aggregated predicted probabilities over the (Height, Weight) plane, with values such as 0.808, 0.520, 0.467, 0.576, 0.040 and 0.217 in different regions.]

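A minimal R sketch of this bootstrap-and-aggregate idea via the randomForest package (one common implementation, not necessarily the deck's), on the hypothetical `df` from above:

library(randomForest)
set.seed(1)
rf <- randomForest(factor(y) ~ height + weight, data = df, ntree = 500)
predict(rf, newdata = df[1:3, ], type = "prob")   # votes aggregated over trees
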
A short word on functional data

Individual data, {Yi, (X1,i, ..., Xk,i, ...)};
functional data, {Yi = (Yi,1, ..., Yi,t, ...)}.

E.g. winter temperature in Montréal, QC.

[Figure: two panels of Temperature (in °Celsius), one by week of winter and one by day of winter (from Dec. 1st till March 31st).]

A short word on functional data

Daily dataset: principal component analysis of the winter temperature curves.

[Figure: PCA Comp. 1 and PCA Comp. 2 as functions of the day of winter (from Dec. 1st till March 31st, with December, January, February and March marked), and the winters 1953–2011 plotted in the (Comp.1, Comp.2) plane.]


To go further...

forthcoming book entitled
Computational Actuarial Science with R

39

Más contenido relacionado

La actualidad más candente (20)

Slides barcelona Machine Learning
Slides barcelona Machine LearningSlides barcelona Machine Learning
Slides barcelona Machine Learning
 
Slides ineq-4
Slides ineq-4Slides ineq-4
Slides ineq-4
 
Slides guanauato
Slides guanauatoSlides guanauato
Slides guanauato
 
Slides smart-2015
Slides smart-2015Slides smart-2015
Slides smart-2015
 
Slides amsterdam-2013
Slides amsterdam-2013Slides amsterdam-2013
Slides amsterdam-2013
 
Slides sales-forecasting-session2-web
Slides sales-forecasting-session2-webSlides sales-forecasting-session2-web
Slides sales-forecasting-session2-web
 
Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017Graduate Econometrics Course, part 4, 2017
Graduate Econometrics Course, part 4, 2017
 
Slides erasmus
Slides erasmusSlides erasmus
Slides erasmus
 
Slides dauphine
Slides dauphineSlides dauphine
Slides dauphine
 
Slides univ-van-amsterdam
Slides univ-van-amsterdamSlides univ-van-amsterdam
Slides univ-van-amsterdam
 
Quantile and Expectile Regression
Quantile and Expectile RegressionQuantile and Expectile Regression
Quantile and Expectile Regression
 
Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2Slides econometrics-2017-graduate-2
Slides econometrics-2017-graduate-2
 
Slides edf-2
Slides edf-2Slides edf-2
Slides edf-2
 
transformations and nonparametric inference
transformations and nonparametric inferencetransformations and nonparametric inference
transformations and nonparametric inference
 
Econometrics, PhD Course, #1 Nonlinearities
Econometrics, PhD Course, #1 NonlinearitiesEconometrics, PhD Course, #1 Nonlinearities
Econometrics, PhD Course, #1 Nonlinearities
 
Lundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentierLundi 16h15-copules-charpentier
Lundi 16h15-copules-charpentier
 
Berlin
BerlinBerlin
Berlin
 
Slides compiegne
Slides compiegneSlides compiegne
Slides compiegne
 
Inequalities #3
Inequalities #3Inequalities #3
Inequalities #3
 
Slides toulouse
Slides toulouseSlides toulouse
Slides toulouse
 

Destacado (16)

Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1Slides act6420-e2014-ts-1
Slides act6420-e2014-ts-1
 
Slides arbres-ubuntu
Slides arbres-ubuntuSlides arbres-ubuntu
Slides arbres-ubuntu
 
Slides 2040-6
Slides 2040-6Slides 2040-6
Slides 2040-6
 
Slides inequality 2017
Slides inequality 2017Slides inequality 2017
Slides inequality 2017
 
Slides 2040-4
Slides 2040-4Slides 2040-4
Slides 2040-4
 
Julien slides - séminaire quantact
Julien slides - séminaire quantactJulien slides - séminaire quantact
Julien slides - séminaire quantact
 
Econometrics 2017-graduate-3
Econometrics 2017-graduate-3Econometrics 2017-graduate-3
Econometrics 2017-graduate-3
 
Soutenance julie viard_partie_1
Soutenance julie viard_partie_1Soutenance julie viard_partie_1
Soutenance julie viard_partie_1
 
Slides ensae-2016-4
Slides ensae-2016-4Slides ensae-2016-4
Slides ensae-2016-4
 
Slides ensae-2016-7
Slides ensae-2016-7Slides ensae-2016-7
Slides ensae-2016-7
 
Slides ensae-2016-9
Slides ensae-2016-9Slides ensae-2016-9
Slides ensae-2016-9
 
Slides ensae-2016-5
Slides ensae-2016-5Slides ensae-2016-5
Slides ensae-2016-5
 
Slides ensae-2016-6
Slides ensae-2016-6Slides ensae-2016-6
Slides ensae-2016-6
 
Slides ensae-2016-8
Slides ensae-2016-8Slides ensae-2016-8
Slides ensae-2016-8
 
Slides ensae-2016-11
Slides ensae-2016-11Slides ensae-2016-11
Slides ensae-2016-11
 
Slides ensae-2016-10
Slides ensae-2016-10Slides ensae-2016-10
Slides ensae-2016-10
 

Similar a So a webinar-2013-2

Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Dahua Lin
 
Bayes Independence Test
Bayes Independence TestBayes Independence Test
Bayes Independence TestJoe Suzuki
 
Orthogonal basis and gram schmidth process
Orthogonal basis and gram schmidth processOrthogonal basis and gram schmidth process
Orthogonal basis and gram schmidth processgidc engineering college
 
1 hofstad
1 hofstad1 hofstad
1 hofstadYandex
 
Minimum mean square error estimation and approximation of the Bayesian update
Minimum mean square error estimation and approximation of the Bayesian updateMinimum mean square error estimation and approximation of the Bayesian update
Minimum mean square error estimation and approximation of the Bayesian updateAlexander Litvinenko
 
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...Alexander Litvinenko
 
A Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series DistributionA Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series Distributioninventionjournals
 
orthogonal.pptx
orthogonal.pptxorthogonal.pptx
orthogonal.pptxJaseSharma
 
Solovay Kitaev theorem
Solovay Kitaev theoremSolovay Kitaev theorem
Solovay Kitaev theoremJamesMa54
 
Weighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsWeighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsPremier Publishers
 
Wasserstein gan
Wasserstein ganWasserstein gan
Wasserstein ganJinho Lee
 

Similar a So a webinar-2013-2 (20)

MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
MUMS Opening Workshop - Panel Discussion: Facts About Some Statisitcal Models...
 
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
Appendix to MLPI Lecture 2 - Monte Carlo Methods (Basics)
 
On Uq(sl2)-actions on the quantum plane
On Uq(sl2)-actions on the quantum planeOn Uq(sl2)-actions on the quantum plane
On Uq(sl2)-actions on the quantum plane
 
Bayes Independence Test
Bayes Independence TestBayes Independence Test
Bayes Independence Test
 
Orthogonal basis and gram schmidth process
Orthogonal basis and gram schmidth processOrthogonal basis and gram schmidth process
Orthogonal basis and gram schmidth process
 
1 hofstad
1 hofstad1 hofstad
1 hofstad
 
lec12.ppt
lec12.pptlec12.ppt
lec12.ppt
 
HPWFcorePRES--FUR2016
HPWFcorePRES--FUR2016HPWFcorePRES--FUR2016
HPWFcorePRES--FUR2016
 
Minimum mean square error estimation and approximation of the Bayesian update
Minimum mean square error estimation and approximation of the Bayesian updateMinimum mean square error estimation and approximation of the Bayesian update
Minimum mean square error estimation and approximation of the Bayesian update
 
Probable
ProbableProbable
Probable
 
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...
Tensor Completion for PDEs with uncertain coefficients and Bayesian Update te...
 
A Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series DistributionA Characterization of the Zero-Inflated Logarithmic Series Distribution
A Characterization of the Zero-Inflated Logarithmic Series Distribution
 
2018 MUMS Fall Course - Sampling-based techniques for uncertainty propagation...
2018 MUMS Fall Course - Sampling-based techniques for uncertainty propagation...2018 MUMS Fall Course - Sampling-based techniques for uncertainty propagation...
2018 MUMS Fall Course - Sampling-based techniques for uncertainty propagation...
 
Aed.pptx
Aed.pptxAed.pptx
Aed.pptx
 
orthogonal.pptx
orthogonal.pptxorthogonal.pptx
orthogonal.pptx
 
Solovay Kitaev theorem
Solovay Kitaev theoremSolovay Kitaev theorem
Solovay Kitaev theorem
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Weighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with ApplicationsWeighted Analogue of Inverse Maxwell Distribution with Applications
Weighted Analogue of Inverse Maxwell Distribution with Applications
 
Wasserstein gan
Wasserstein ganWasserstein gan
Wasserstein gan
 
Least Squares
Least SquaresLeast Squares
Least Squares
 

Más de Arthur Charpentier (20)

Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
ACT6100 introduction
ACT6100 introductionACT6100 introduction
ACT6100 introduction
 
Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)Family History and Life Insurance (UConn actuarial seminar)
Family History and Life Insurance (UConn actuarial seminar)
 
Control epidemics
Control epidemics Control epidemics
Control epidemics
 
STT5100 Automne 2020, introduction
STT5100 Automne 2020, introductionSTT5100 Automne 2020, introduction
STT5100 Automne 2020, introduction
 
Family History and Life Insurance
Family History and Life InsuranceFamily History and Life Insurance
Family History and Life Insurance
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & Insurance
 
Reinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and FinanceReinforcement Learning in Economics and Finance
Reinforcement Learning in Economics and Finance
 
Optimal Control and COVID-19
Optimal Control and COVID-19Optimal Control and COVID-19
Optimal Control and COVID-19
 
Slides OICA 2020
Slides OICA 2020Slides OICA 2020
Slides OICA 2020
 
Lausanne 2019 #3
Lausanne 2019 #3Lausanne 2019 #3
Lausanne 2019 #3
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Lausanne 2019 #2
Lausanne 2019 #2Lausanne 2019 #2
Lausanne 2019 #2
 
Lausanne 2019 #1
Lausanne 2019 #1Lausanne 2019 #1
Lausanne 2019 #1
 
Side 2019 #10
Side 2019 #10Side 2019 #10
Side 2019 #10
 
Side 2019 #11
Side 2019 #11Side 2019 #11
Side 2019 #11
 
Side 2019 #12
Side 2019 #12Side 2019 #12
Side 2019 #12
 
Side 2019 #9
Side 2019 #9Side 2019 #9
Side 2019 #9
 
Side 2019 #8
Side 2019 #8Side 2019 #8
Side 2019 #8
 
Side 2019 #7
Side 2019 #7Side 2019 #7
Side 2019 #7
 

So a webinar-2013-2

  • 1. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Perspectives of Predictive Modeling Arthur Charpentier charpentier.arthur@uqam.ca http ://freakonometrics.hypotheses.org/ (SOA Webcast, November 2013) 1
  • 2. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Agenda • ◦ ◦ • ◦ ◦ • • ◦ ◦ ◦ • Introduction to Predictive Modeling Prediction, best estimate, expected value and confidence interval Parametric versus nonparametric models Linear Models and (Ordinary) Least Squares From least squares to the Gaussian model Smoothing continuous covariates From Linear Models to G.L.M. Modeling a TRUE-FALSE variable The logistic regression R.O.C. curve Classification tree (and random forests) From individual to functional data 2
  • 3. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 0.015 Prediction ? Best estimate ? E.g. predicting someone’s weight (Y ) −→ q qqq qqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqq qqqq q qqqqqq qq qqqq q qq qqq q 0.010 Consider a sample {y1 , · · · , yn } Density Model : Yi = β0 + εi with E(ε) = 0 and Var(ε) = σ 2 0.005 ε is some unpredictable noise 0.000 Y = y is our ‘best guess’... 100 150 200 250 Weight (lbs.) Predicting means estimating E(Y ). Recall that E(Y ) = argmin{ Y − y y∈R ε L2 } = argmin{E [Y − y]2 } y∈R least squares 3
  • 4. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Best estimate with some confidence 0.015 E.g. predicting someone’s weight (Y ) Give an interval [y− , y+ ] such that q qq qqq q we can estimate the distribution of Y dF (x) F (y) = P(Y ≤ y) or f (y) = dx x=y 0.000 Confidence intervals can be derived if 0.005 Density 0.010 P(Y ∈ [y− , y+ ]) = 1 − α qqq qqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqq qqqq q qqqqqq qq qqqq q 100 150 200 250 Weight (lbs.) (related to the idea of “quantifying uncertainty” in our prediction...) 4
  • 5. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Parametric inference 0.015 E.g. predicting someone’s weight (Y ) Assume that F ∈ F = {Fθ , θ ∈ Θ} 0.010 1. Provide an estimate θ Density 2. Compute bound estimates y+ = θ F −1 (1 θ − α/2) 0.005 y− = F −1 (α/2) Standard estimation technique : 0.000 −→ maximum likelihood techniques 90% 100 150 200 250 Weight (lbs.) n θ = argmax θ∈Θ log fθ (yi ) i=1   explicit (analytical) expression for θ  numerical optimization (Newton Raphson) log likelihood 5
  • 6. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Non-parametric inference natural estimator for P(Y ≤ y) qq qqq q 0.010 qqq qqqqqqqqqqqqqqqqqqqqqqqqq qqqqqqqqqqqqq qqqq q qqqqqq qq qqqq q 0.005 #{i such that yi ≤y} q Density 1. Empirical distribution function n 1 1(yi ≤ y) F (y) = n i=1 0.015 E.g. predicting someone’s weight (Y ) 2. Compute bound estimates 90% (α/2) y+ = F −1 (1 − α/2) 0.000 y− = F −1 100 150 200 250 Weight (lbs.) 6
  • 7. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Prediction using some covariates βM 1(X1,i = F) + εi 0.010 0.005 β1 Male βF −βM with E(ε) = 0 and Var(ε) = σ 2 0.000 or Yi = β0 + Female Density based on his/her sex (X1 )   β + ε if X = F F i 1,i Model : Yi =  βH + εi if X1,i = M 0.015 E.g. predicting someone’s weight (Y ) 100 150 200 250 Weight (lbs.) 7
  • 8. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Prediction using some (categorical) covariates E.g. predicting someone’s weight (Y ) 0.015 based on his/her sex (X1 ) Conditional parametric model 0.005 Male 0.000 −→ our prediction will be Female Density   F if X = F θF 1,i i.e. Yi ∼  Fθ if X1,i = M M 0.010 assume that Y |X1 = x1 ∼ Fθ(x1 ) 100 conditional on the covariate 150 200 250 Weight (lbs.) 8
  • 9. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Prediction using some (categorical) covariates 0.010 0.005 Density 0.005 Density 0.010 0.015 Prediction of Y when X1 = M 0.015 Prediction of Y when X1 = F 0.000 90% 0.000 90% 100 150 200 Weight (lbs.) 250 100 150 200 250 Weight (lbs.) 9
  • 10. Arthur CHARPENTIER - Predictive Modeling - SoA Webinar, 2013 Linear Models, and Ordinary Least Squares E.g. predicting someone’s weight (Y ) 250 q based on his/her height (X2 ) q q q Linear Model : Yi = β0 + β2 X2,i + εi 200 2 Conditional parametric model q q q 150 q q 100 q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q E.g. Gaussian Linear Model β0 +β2 x2 q q q Y |X2 = x2 ∼ N ( µ(x2 ) , σ 2 (x2 )) q q q q q q q q q q q q assume that Y |X2 = x2 ∼ Fθ(x2 ) q q q q Weight (lbs.) with E(ε) = 0 and Var(ε) = σ q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q σ2 5.0 5.5 6.0 6.5 Height (in.) n [Yi − X T β]2 i −→ ordinary least squares, β = argmin i=1 β is also the M.L. estimator of β 10
• 11. Prediction using no covariates. [Figure: scatter of weight (lbs.) against height (in.), with a constant 90% prediction band around the overall mean]
• 12. Prediction using a categorical covariate. E.g. predicting someone's weight (Y) based on his/her sex (X1). E.g. Gaussian linear model: Y | X1 = M ∼ N(µM, σ²), with Ê(Y | X1 = M) = (1/nM) Σ{i: X1,i = M} Yi = Ȳ(M), and Ŷ ∈ Ȳ(M) ± u1−α/2 · σ̂ (u1−α/2 = 1.96 for α = 5%). Remark: in the linear model, Var(ε) = σ² does not depend on X1. [Figure: the interval for men on the weight-height scatter]
• 13. Prediction using a categorical covariate. Similarly, for women: Y | X1 = F ∼ N(µF, σ²), with Ê(Y | X1 = F) = (1/nF) Σ{i: X1,i = F} Yi = Ȳ(F), and Ŷ ∈ Ȳ(F) ± u1−α/2 · σ̂. [Figure: the interval for women on the weight-height scatter]
• 14. Prediction using a continuous covariate. E.g. predicting someone's weight (Y) based on his/her height (X2). E.g. Gaussian linear model: Y | X2 = x2 ∼ N(β0 + β1 x2, σ²), with Ê(Y | X2 = x2) = β̂0 + β̂1 x2 = Ŷ(x2), and Ŷ ∈ Ŷ(x2) ± u1−α/2 · σ̂. [Figure: regression line with its pointwise prediction band]
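Slides 12 to 14 all reduce to a conditional mean plus a ± u1−α/2 · σ̂ band. A sketch in R on simulated data (the sex and height effects below are invented numbers):

    set.seed(2)
    sex <- factor(sample(c("F", "M"), 200, replace = TRUE))
    ht  <- runif(200, 5.0, 6.5)
    wt  <- 100 + 30 * (sex == "M") + 10 * ht + rnorm(200, 0, 15)
    tapply(wt, sex, mean)                      # Ybar(F) and Ybar(M)
    fit <- lm(wt ~ ht)                         # continuous covariate
    s   <- summary(fit)$sigma                  # sigma hat
    predict(fit, data.frame(ht = 6)) + c(-1, 1) * 1.96 * s   # Yhat(x2) +/- 1.96 sigma hat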
• 15. Improving our prediction ? Compare models through their (empirical) residuals, ε̂i = Yi − Xiᵀβ̂: no covariates, sex as covariate, height as covariate. [Figure: residuals under each of the three models, plotted on a common −50 to 100 scale] Fit can be measured by R² or by the log-likelihood; the parsimony principle suggests penalizing the likelihood by the number of covariates, as in the Akaike (AIC) or Schwarz (BIC) criteria.
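Continuing with the simulated sample from the sketch above, the comparison is immediate in R:

    m0 <- lm(wt ~ 1)       # no covariates
    m1 <- lm(wt ~ sex)     # sex as covariate
    m2 <- lm(wt ~ ht)      # height as covariate
    AIC(m0, m1, m2)        # log-likelihood penalized by the number of parameters
    BIC(m0, m1, m2)        # Schwarz criterion, with a heavier penalty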
• 16. Relaxing the linear assumption in predictions. Use of a b-spline function basis to estimate µ(·), where µ(x) = E(Y | X = x). [Figure, two panels: weight against height, with spline estimates of µ(·)]
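One implementation of a b-spline basis in R is bs() from the splines package (shipped with base R); a sketch, reusing the simulated ht and wt above (df = 5 is an arbitrary illustrative choice):

    library(splines)
    fit_bs <- lm(wt ~ bs(ht, df = 5))             # regression on a b-spline basis of height
    grid   <- data.frame(ht = seq(5, 6.5, by = 0.01))
    mu_hat <- predict(fit_bs, newdata = grid)     # estimate of mu(x) = E(Y | X = x)
    plot(ht, wt)
    lines(grid$ht, mu_hat, lwd = 2)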
• 17. Relaxing the linear assumption in predictions. E.g. predicting someone's weight (Y) based on his/her height (X2). E.g. Gaussian model: Y | X2 = x2 ∼ N(µ(x2), σ²), with Ê(Y | X2 = x2) = µ̂(x2) = Ŷ(x2), and Ŷ ∈ Ŷ(x2) ± u1−α/2 · σ̂. Gaussian model: E(Y | X = x) = µ(x) (e.g. xᵀβ) and Var(Y | X = x) = σ². [Figure: smooth fit with its pointwise band]
• 18. Nonlinearities and missing covariates. E.g. predicting someone's weight (Y) based on his/her height and sex; −→ apparent nonlinearities can be related to model misspecification. E.g. Gaussian linear model: Yi = β0,F + β2,F X2,i + εi if X1,i = F, and Yi = β0,M + β2,M X2,i + εi if X1,i = M. −→ local linear regression: β̂x = argminβ Σi ωi(x) · [Yi − Xiᵀβ]², and set Ŷ(x) = xᵀβ̂x. [Figure: sex-specific regression lines on the weight-height scatter]
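A sketch of the local linear estimator in R, with Gaussian kernel weights as one possible choice of ωi(x) (the bandwidth h = 0.25 is arbitrary); loess() in base R implements a closely related estimator:

    set.seed(1)
    ht <- runif(200, 5.0, 6.5)
    wt <- -170 + 55 * ht + rnorm(200, 0, 20)
    local_fit <- function(x0, h = 0.25) {
      w <- dnorm((ht - x0) / h)              # omega_i(x0): Gaussian kernel weights
      b <- coef(lm(wt ~ ht, weights = w))    # weighted least squares, beta_x0
      b[1] + b[2] * x0                       # Yhat(x0) = x0' beta_x0
    }
    yhat <- sapply(seq(5, 6.5, by = 0.05), local_fit)
    # or simply: loess(wt ~ ht, degree = 1)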
• 19. Local regression and smoothing techniques. [Figure, two panels: local regression fits of weight against height]
• 20. k-nearest neighbours and smoothing techniques. [Figure, two panels: k-nearest-neighbour fits of weight against height]
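A k-nearest-neighbour smoother takes a few lines of R by hand (k = 25 is an arbitrary choice), reusing the simulated ht and wt:

    knn_fit <- function(x0, k = 25) {
      idx <- order(abs(ht - x0))[1:k]    # the k observations closest in height
      mean(wt[idx])                      # average their weights
    }
    yhat_knn <- sapply(seq(5, 6.5, by = 0.05), knn_fit)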
• 21. Kernel regression and smoothing techniques. [Figure, two panels: kernel regression fits of weight against height]
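In base R, ksmooth() computes such a kernel-weighted moving average (the bandwidth is again an arbitrary choice):

    ks <- ksmooth(ht, wt, kernel = "normal", bandwidth = 0.5)
    plot(ht, wt)
    lines(ks, lwd = 2)    # kernel-weighted local average of Y given X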
• 22. Multiple linear regression. E.g. predicting someone's misperception of his/her weight (Y) based on his/her height (X2) and weight (X3); −→ linear model: E(Y | X2, X3) = β0 + β2 X2 + β3 X3, Var(Y | X2, X3) = σ². [Figure: fitted regression plane over the (height, weight) plane]
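With two covariates the lm() call simply gains a term; the misperception score y below is a simulated stand-in:

    set.seed(3)
    ht <- runif(200, 5.0, 6.5)
    wt <- runif(200, 100, 250)
    y  <- -2 + 0.3 * ht + 0.01 * wt + rnorm(200)   # hypothetical misperception score
    fit2 <- lm(y ~ ht + wt)                        # E(Y | X2, X3) = b0 + b2 X2 + b3 X3
    coef(fit2)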
• 23. Multiple non-linear regression. E.g. predicting someone's misperception of his/her weight (Y) based on his/her height and weight; −→ non-linear model: E(Y | X2, X3) = h(X2, X3), Var(Y | X2, X3) = σ². [Figure: fitted non-linear surface]
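One way (among several) to estimate a smooth surface h(X2, X3) is a bivariate thin-plate spline, e.g. with the mgcv package, reusing the simulated data above:

    library(mgcv)
    fit_nl <- gam(y ~ s(ht, wt))   # h(X2, X3) estimated by a penalized thin-plate spline
    vis.gam(fit_nl)                # perspective plot of the fitted surface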
• 24. Away from the Gaussian model. Y is not necessarily Gaussian: Y can be a counting variable, e.g. Poisson, Y ∼ P(λ(x)); Y can be a FALSE-TRUE variable, e.g. Binomial, Y ∼ B(p(x)); −→ Generalized Linear Model (see next section). E.g. Y | X2 = x2 ∼ P(exp(β0 + β2 x2)). Remark: with a Poisson model, E(Y | X2 = x2) = Var(Y | X2 = x2). [Figure: Poisson regression fit on the scatter]
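A Poisson GLM in R, on hypothetical count data (think of claim counts against a single covariate):

    set.seed(4)
    x <- runif(500, 0, 2)
    n <- rpois(500, exp(-1 + 1.2 * x))                   # counts with a log-linear mean
    fit_p <- glm(n ~ x, family = poisson(link = "log"))  # E(Y | x) = exp(beta0 + beta2 x)
    predict(fit_p, newdata = data.frame(x = 1), type = "response")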
• 25. Logistic regression. E.g. predicting someone's misperception of his/her weight (Y): Yi = 1 if prediction > observed weight, Yi = 0 if prediction ≤ observed weight. Y is then a Bernoulli variable, P(Y = y) = p^y (1 − p)^(1−y), y ∈ {0, 1}; −→ logistic regression, P(Y = y | X = x) = p(x)^y (1 − p(x))^(1−y), y ∈ {0, 1}. [Figure, two panels: the binary outcome against height and against weight]
• 26. Logistic regression. −→ logistic regression: P(Y = y | X = x) = p(x)^y (1 − p(x))^(1−y), y ∈ {0, 1}, with odds ratio P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(xᵀβ), so that E(Y | X = x) = exp(xᵀβ) / (1 + exp(xᵀβ)). Estimation of β ? −→ maximum likelihood, β̂ computed numerically (Newton-Raphson). [Figure: fitted probability surface over height and weight]
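In R this is glm() with a binomial family; the iteratively reweighted least squares it runs is the Newton-Raphson scheme mentioned on the slide. The data below are simulated stand-ins for the misperception indicator:

    set.seed(5)
    ht <- runif(300, 5.0, 6.5)
    wt <- runif(300, 100, 250)
    y  <- rbinom(300, 1, plogis(-8 + 0.5 * ht + 0.035 * wt))  # hypothetical 0/1 outcome
    fit_l <- glm(y ~ ht + wt, family = binomial(link = "logit"))
    coef(fit_l)                                  # beta, fitted by maximum likelihood
    predict(fit_l, newdata = data.frame(ht = 5.8, wt = 180),
            type = "response")                   # estimate of E(Y | X = x)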
• 27. Smoothed logistic regression. GLMs are linear in the sense that P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(xᵀβ); −→ use a smooth nonlinear function instead, P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(h(x)), i.e. E(Y | X = x) = exp(h(x)) / (1 + exp(h(x))). [Figure, two panels: smoothed fits of the probability against height and against weight]
• 28. Smoothed logistic regression. −→ non-linear logistic regression: P(Y = 1 | X = x) / P(Y = 0 | X = x) = exp(h(x)), and Ê(Y | X = x) = exp(ĥ(x)) / (1 + exp(ĥ(x))). Remark: we do not predict Y here, but E(Y | X = x). [Figure: fitted probability surface]
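One standard way to fit a smooth h(·), among others, is a generalized additive model, here with gam() from the mgcv package and an additive h (a sketch, reusing the simulated data above):

    library(mgcv)
    fit_s <- gam(y ~ s(ht) + s(wt), family = binomial)   # log-odds = h1(ht) + h2(wt)
    predict(fit_s, newdata = data.frame(ht = 5.8, wt = 180), type = "response")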
• 29. Predictive modeling for a {0, 1} variable. What is a good {0, 1}-model ? −→ decision theory: for a threshold s, set Ŷ = 0 if P̂(Y | X = x) ≤ s, and Ŷ = 1 if P̂(Y | X = x) > s. Crossing predictions and observations:

             Ŷ = 0   Ŷ = 1
    Y = 0    fine    error
    Y = 1    error   fine

[Figure: predicted probabilities against observed outcomes, and the resulting curve of True Positive Rate against False Positive Rate]
• 30. R.O.C. curve. True positive rate: TP(s) = P(Ŷ(s) = 1 | Y = 1) = nŶ=1,Y=1 / nY=1. False positive rate: FP(s) = P(Ŷ(s) = 1 | Y = 0) = nŶ=1,Y=0 / nY=0. The R.O.C. curve is {(FP(s), TP(s)), s ∈ (0, 1)} (see also model gain curve). [Figure: predicted vs. observed outcomes, and the ROC curve]
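The curve takes a few lines of R by hand (dedicated packages such as ROCR or pROC also exist); reusing the logistic fit above:

    p <- fitted(fit_l)                 # scores: estimates of P(Y = 1 | X = x)
    roc <- t(sapply(seq(0, 1, by = 0.01), function(s) {
      yhat <- as.numeric(p > s)        # classify with threshold s
      c(FP = sum(yhat == 1 & y == 0) / sum(y == 0),   # false positive rate
        TP = sum(yhat == 1 & y == 1) / sum(y == 1))   # true positive rate
    }))
    plot(roc, type = "l")
    abline(0, 1, lty = 2)              # the diagonal: a non-informative classifier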
• 31. Classification tree (CART). If Y is a TRUE-FALSE variable, prediction is a classification problem: E(Y | X = x) = pj if x ∈ Aj, where A1, ..., Ak are disjoint regions of the X-space. [Figure: partition of the (height, weight) plane]
• 32. Classification tree (CART). −→ iterative process. Step 1: find two subsets of indices, either A1 = {i, X1,i < s} and A2 = {i, X1,i > s}, or A1 = {i, X2,i < s} and A2 = {i, X2,i > s}, so as to maximize homogeneity within subsets and maximize heterogeneity between subsets. [Figure: a first split; at the root P(Y = 1) = 38.2% and P(Y = 0) = 61.8%, in one child node P(Y = 1) = 25% and P(Y = 0) = 75%]
• 33. Classification tree (CART). Need an impurity criterion, e.g. the Gini index: −Σx∈{A1,A2} (nx / n) Σy∈{0,1} (nx,y / nx) · (1 − nx,y / nx). [Figure: a candidate split, with node impurities 0.523, 0.317 and 0.273]
• 34. Classification tree (CART). Step k: given the partition A1, ..., Ak, find which subset Aj will be divided, either according to X1 or according to X2, again maximizing homogeneity within subsets and heterogeneity between subsets. [Figure: a further split, with node impurities 0.472, 0.621, 0.444, 0.111 and 0.273]
• 35. Visualizing classification trees (CART). [Figure: two tree diagrams, with splits such as Y < 124.561, X < 5.39698, X < 5.69226, X < 5.52822 and Y < 162.04, and leaf probabilities between 0.1111 and 0.6207]
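In R, the rpart package fits and draws such trees (a sketch on hypothetical data; for classification, rpart uses the Gini index by default):

    library(rpart)
    set.seed(6)
    X  <- runif(200, 5.0, 6.5)                            # height-like covariate
    Y  <- runif(200, 100, 250)                            # weight-like covariate
    cl <- factor(rbinom(200, 1, plogis(-10 + 0.05 * Y)))  # hypothetical 0/1 label
    tree <- rpart(cl ~ X + Y, method = "class")           # recursive binary splits
    plot(tree); text(tree)                                # tree diagram, as on the slide
    predict(tree, newdata = data.frame(X = 5.5, Y = 150)) # p_j of the leaf containing x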
• 36. From trees to forests. Problem: CART trees are not robust; −→ boosting and bagging. Use the bootstrap: resample the data, generate a classification tree on the resample, repeat this resampling strategy, then aggregate all the trees. [Figure: aggregated partition of the (height, weight) plane, with region probabilities 0.808, 0.520, 0.467, 0.576, 0.040 and 0.217]
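Bootstrap-aggregated trees, with random feature selection added, are what the randomForest package implements; a sketch reusing the hypothetical data above:

    library(randomForest)
    rf <- randomForest(cl ~ X + Y, ntree = 500)   # aggregate 500 trees grown on bootstrap resamples
    predict(rf, newdata = data.frame(X = 5.5, Y = 150), type = "prob")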
• 37. A short word on functional data. Individual data, {Yi, (X1,i, ..., Xk,i, ...)}, versus functional data, {Yi = (Yi,1, ..., Yi,t, ...)}. E.g. winter temperature in Montréal, QC. [Figure, two panels: temperature (in °C) by week of winter, and daily temperature curves, from Dec. 1st till March 31st]
• 38. A short word on functional data. Daily dataset: principal component analysis of the temperature curves. [Figure: PCA Comp. 1 and PCA Comp. 2 plotted against the day of winter (from Dec. 1st till March 31st), with the months December to March marked, and the individual winters (1953 to 2011) plotted in the (Comp.1, Comp.2) score plane]
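A functional PCA can be sketched as an ordinary PCA on the (winters × days) matrix; the curves below are simulated stand-ins for the Montréal temperature data:

    # Hypothetical stand-in: 60 winters x 120 daily temperatures
    set.seed(7)
    days <- 1:120
    temp <- t(sapply(1:60, function(i)
      -10 + 8 * cos(2 * pi * days / 240) + rnorm(1, 0, 2) + rnorm(120, 0, 3)))
    pca <- prcomp(temp)                      # rows = winters, columns = days
    matplot(days, pca$rotation[, 1:2], type = "l",
            ylab = "loadings")               # Comp. 1 and Comp. 2 across the winter
    plot(pca$x[, 1:2])                       # winters in the (Comp.1, Comp.2) score plane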
• 39. To go further... forthcoming book entitled Computational Actuarial Science with R.