Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Actuariat de l’Assurance Non-Vie # 9
A. Charpentier (UQAM & Université de Rennes 1)
ENSAE ParisTech, October 2015 - January 2016.
http://freakonometrics.hypotheses.org
@freakonometrics
A Grab Bag on Ratemaking
• individual model vs. collective model
• the high-dimensional case
• variable selection
• model selection
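Individual or Collective Model? The Tweedie Distribution
Consider a Tweedie distribution with variance power $p \in (1, 2)$, mean $\mu$ and scale parameter $\phi$; it is then a compound Poisson model, with
• $N \sim \mathcal{P}(\lambda)$ with $\lambda = \dfrac{\phi\,\mu^{2-p}}{2-p}$
• $Y_i \sim \mathcal{G}(\alpha, \beta)$ with $\alpha = -\dfrac{p-2}{p-1}$ and $\beta = \dfrac{\phi\,\mu^{1-p}}{p-1}$
Conversely, consider a compound Poisson model with $N \sim \mathcal{P}(\lambda)$ and $Y_i \sim \mathcal{G}(\alpha, \beta)$; then
• the variance power is $p = \dfrac{\alpha+2}{\alpha+1}$
• the mean is $\mu = \dfrac{\lambda\alpha}{\beta}$
• the scale parameter is $\phi = \dfrac{[\lambda\alpha]^{\frac{\alpha+2}{\alpha+1}-1}\,\beta^{2-\frac{\alpha+2}{\alpha+1}}}{\alpha+1}$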
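The two parametrizations above can be checked numerically. The following is a minimal sketch in base R, assuming that $\beta$ is the rate of the gamma distribution (so that $E[Y] = \alpha/\beta$, as the formula for $\mu$ suggests); the function names `tweedie_to_cp` and `cp_to_tweedie` and the numerical values are illustrative only.

```r
# Tweedie (p, mu, phi) -> compound Poisson-gamma (lambda, alpha, beta),
# using the formulas of the slide; beta is assumed to be the gamma rate.
tweedie_to_cp <- function(p, mu, phi) {
  stopifnot(p > 1, p < 2, mu > 0, phi > 0)
  list(lambda = phi * mu^(2 - p) / (2 - p),
       alpha  = (2 - p) / (p - 1),
       beta   = phi * mu^(1 - p) / (p - 1))
}

# Inverse map: compound Poisson-gamma -> Tweedie
cp_to_tweedie <- function(lambda, alpha, beta) {
  p <- (alpha + 2) / (alpha + 1)
  list(p   = p,
       mu  = lambda * alpha / beta,
       phi = (lambda * alpha)^(p - 1) * beta^(2 - p) / (alpha + 1))
}

cp   <- tweedie_to_cp(p = 1.5, mu = 2, phi = 0.5)
back <- cp_to_tweedie(cp$lambda, cp$alpha, cp$beta)   # should return p, mu, phi

# Exact moments of the compound sum S = Y_1 + ... + Y_N
mean_S <- cp$lambda * cp$alpha / cp$beta
var_S  <- cp$lambda * cp$alpha * (cp$alpha + 1) / cp$beta^2
c(mean_S = mean_S, var_S = var_S)
```

With these formulas the variance of the compound sum equals $\mu^p/\phi$, so $\phi$ plays the role of the reciprocal of the dispersion in the common $\text{Var} = \phi\,\mu^p$ convention.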
Individual or Collective Model? Tweedie Regression
In a regression context,
$N_i \sim \mathcal{P}(\lambda_i)$ with $\lambda_i = \exp[X_i^T\beta_\lambda]$
$Y_{j,i} \sim \mathcal{G}(\mu_i, \phi)$ with $\mu_i = \exp[X_i^T\beta_\mu]$
Then $S_i = Y_{1,i} + \cdots + Y_{N_i,i}$ has a Tweedie distribution:
• the variance power is $p = \dfrac{\phi+2}{\phi+1}$
• the mean is $\lambda_i\mu_i$
• the scale parameter is $\lambda_i^{\frac{1}{\phi+1}-1}\,\mu_i^{\frac{\phi}{\phi+1}}\,\dfrac{\phi}{1+\phi}$
There are $1 + 2\dim(X)$ degrees of freedom.
Individual or Collective Model? Tweedie Regression
Remark: the scale parameter should not depend on $i$.
A Tweedie regression is specified by
• a variance power $p \in (1, 2)$
• a mean $\mu_i = \exp[X_i^T\beta_{\text{Tweedie}}]$
• a scale parameter $\phi$
There are $2 + \dim(X)$ degrees of freedom.
Two-Part Model: Frequency and Individual Cost
Consider the following datasets, for third-party liability (RC). For the frequency,
> freq = merge(contrat, nombre_RC)
for individual claim costs,
> sinistre_RC = sinistre[(sinistre$garantie=="1RC") & (sinistre$cout>0), ]
> sinistre_RC = merge(sinistre_RC, contrat)
and for costs aggregated by policy,
> agg_RC = aggregate(sinistre_RC$cout, by=list(sinistre_RC$nocontrat), FUN='sum')
> names(agg_RC) = c('nocontrat', 'cout_RC')
> global_RC = merge(contrat, agg_RC, all.x=TRUE)
> global_RC$cout_RC[is.na(global_RC$cout_RC)] = 0
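The aggregation step can be illustrated on a toy example. This is a sketch on hypothetical data (four policies, three claims), not on the course datasets; only the column names `nocontrat`, `exposition`, `cout` and `cout_RC` are borrowed from the code above.

```r
# Hypothetical toy data: 4 policies, 3 claims on 2 of them
contrat  <- data.frame(nocontrat = 1:4, exposition = c(1, 0.5, 1, 0.25))
sinistre <- data.frame(nocontrat = c(2, 2, 4), cout = c(100, 250, 80))

# Total cost per policy; policies without claims are absent from agg
agg <- aggregate(sinistre$cout, by = list(sinistre$nocontrat), FUN = sum)
names(agg) <- c("nocontrat", "cout_RC")

# Left join keeps every policy; missing totals come back as NA and are set to 0
global <- merge(contrat, agg, all.x = TRUE)
global$cout_RC[is.na(global$cout_RC)] <- 0
global
```

The `all.x=TRUE` argument is what keeps claim-free policies in the table, which matters because they carry exposure and a zero aggregate cost.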
Two-Part Model: Frequency and Individual Cost
> library(splines)
> reg_f = glm(nb_RC~zone+bs(ageconducteur)+carburant, offset=log(exposition),
+             data=freq, family=poisson)
> reg_c = glm(cout~zone+bs(ageconducteur)+carburant, data=sinistre_RC,
+             family=Gamma(link="log"))
Single Model: Cost per Policy
> library(tweedie)
> library(statmod)
> reg_a = glm(cout_RC~zone+bs(ageconducteur)+carburant, offset=log(exposition),
+             data=global_RC, family=tweedie(var.power=1.5, link.power=0))
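To make the frequency and cost models concrete without the course data, here is a sketch on a simulated portfolio using base R only. The data frame `pol`, the single rating factor `zone` and all numerical values are hypothetical; the Tweedie fit is left out because it needs the `statmod` package.

```r
set.seed(1)
n <- 2000
# Hypothetical portfolio: one rating factor and an exposure in (0.1, 1)
pol <- data.frame(id = 1:n,
                  zone = factor(sample(c("A", "B"), n, replace = TRUE)),
                  exposition = runif(n, 0.1, 1))
# Claim counts: annual frequency 0.10 in zone A, 0.20 in zone B
pol$nb <- rpois(n, pol$exposition * ifelse(pol$zone == "A", 0.10, 0.20))
# One row per claim, gamma costs with mean 1000 (A) or 1500 (B)
claims <- pol[rep(seq_len(n), pol$nb), c("id", "zone")]
claims$cout <- rgamma(nrow(claims), shape = 2,
                      rate = 2 / ifelse(claims$zone == "A", 1000, 1500))

# Frequency model on all policies, cost model on claims only
reg_f <- glm(nb ~ zone, offset = log(exposition), data = pol, family = poisson)
reg_c <- glm(cout ~ zone, data = claims, family = Gamma(link = "log"))

# Pure premium for one year of exposure = expected frequency x expected cost
new <- data.frame(zone = factor(c("A", "B")), exposition = 1)
prime <- predict(reg_f, newdata = new, type = "response") *
         predict(reg_c, newdata = new, type = "response")
prime
```

Setting `exposition = 1` in the prediction data gives annualized premiums, which is the same device used with `freq2` in the course code.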
Comparing Premiums
> freq2 = freq
> freq2$exposition = 1
> P_f = predict(reg_f, newdata=freq2, type="response")
> P_c = predict(reg_c, newdata=freq2, type="response")
> prime1 = P_f*P_c
> k = 1.5
> reg_a = glm(cout_DO~zone+bs(ageconducteur)+carburant, offset=log(exposition),
+             data=global_DO, family=tweedie(var.power=k, link.power=0))
> prime2 = predict(reg_a, newdata=freq2, type="response")
> arrows(1:100, prime1[1:100], 1:100, prime2[1:100], length=.1)
Impact of the Tweedie Power on Pure Premiums
Comparison of pure premiums for policyholders no. 1, no. 2 and no. 3 (DO)
'Optimizing' the Tweedie Parameter
> dev = function(k){
+   reg = glm(cout_RC~zone+bs(ageconducteur)+carburant, data=global_RC,
+             family=tweedie(var.power=k, link.power=0), offset=log(exposition))
+   reg$deviance
+ }
Ratemaking and Big Data
Classical issues with massive data:
• many explanatory variables, $k$ large, so $X^TX$ may not be invertible
• large volumes of data, e.g. telematics data
• non-quantitative data, e.g. text, location, etc.
The Fascination with Unbiased Estimators
In mathematical statistics, unbiased estimators are popular because they have several appealing properties. But could we not consider biased estimators that are potentially better?
Consider an i.i.d. sample $\{y_1, \cdots, y_n\}$ with distribution $\mathcal{N}(\mu, \sigma^2)$. Define $\hat\theta = \alpha\bar{Y}$. What is the optimal $\alpha^\star$ to get the best estimator of $\mu$?
• bias: $\text{bias}(\hat\theta) = E(\hat\theta) - \mu = (\alpha-1)\mu$
• variance: $\text{Var}(\hat\theta) = \dfrac{\alpha^2\sigma^2}{n}$
• mse: $\text{mse}(\hat\theta) = (\alpha-1)^2\mu^2 + \dfrac{\alpha^2\sigma^2}{n}$
The optimal value is $\alpha^\star = \dfrac{\mu^2}{\mu^2 + \dfrac{\sigma^2}{n}} < 1$.
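The optimal shrinkage factor can be checked numerically. The sketch below uses hypothetical values of $\mu$, $\sigma$ and $n$, and compares the closed form with a one-dimensional numerical minimization of the mse.

```r
mu <- 2; sigma <- 3; n <- 10   # hypothetical values

# mse of alpha * Ybar as an estimator of mu: squared bias + variance
mse <- function(alpha) (alpha - 1)^2 * mu^2 + alpha^2 * sigma^2 / n

alpha_star <- mu^2 / (mu^2 + sigma^2 / n)                 # closed form
alpha_num  <- optimize(mse, interval = c(0, 2))$minimum   # numerical check

c(closed_form = alpha_star, numerical = alpha_num,
  mse_unbiased = mse(1), mse_shrunk = mse(alpha_star))
```

The shrunk estimator has a smaller mse than the unbiased one ($\alpha = 1$), but $\alpha^\star$ depends on the unknown $\mu$ and $\sigma$, so it is an oracle benchmark rather than a usable estimator.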
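Linear Model
Consider a linear model $y_i = x_i^T\beta + \varepsilon_i$ for all $i = 1, \cdots, n$.
Assume that the $\varepsilon_i$ are i.i.d. with $E(\varepsilon) = 0$ (and finite variance). Write
$$\underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}}_{y,\ n\times 1} = \underbrace{\begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,k} \end{pmatrix}}_{X,\ n\times(k+1)} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}}_{\beta,\ (k+1)\times 1} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{\varepsilon,\ n\times 1}.$$
Assuming $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, the maximum likelihood estimator of $\beta$ is
$$\hat\beta = \text{argmin}\{\|y - X\beta\|_{\ell_2}^2\} = (X^TX)^{-1}X^Ty$$
... under the assumption that $X^TX$ is a full-rank matrix.
What if $X^TX$ cannot be inverted? Then $\hat\beta = [X^TX]^{-1}X^Ty$ does not exist, but $\hat\beta_\lambda = [X^TX + \lambda I]^{-1}X^Ty$ always exists if $\lambda > 0$.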
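A small simulated example shows the invertibility issue and how the $\lambda I$ term removes it. This is a sketch in base R with a hypothetical design in which one column is an exact linear combination of two others.

```r
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
X  <- cbind(1, x1, x2, x1 + x2)   # last column = x1 + x2: exact collinearity
y  <- 1 + 2 * x1 - x2 + rnorm(n)

qr(X)$rank                        # 3, although X has 4 columns
XtX <- crossprod(X)               # X^T X, singular up to rounding error

# Plain least squares fails, or returns numerically meaningless values
beta_ols <- tryCatch(solve(XtX, crossprod(X, y)), error = function(e) NULL)

# Adding lambda * I makes the matrix positive definite, hence invertible
lambda <- 0.1
beta_ridge <- solve(XtX + lambda * diag(ncol(X)), crossprod(X, y))
beta_ridge
```

As in the formula above, the penalty is applied to every coefficient, intercept included; penalized regression software often leaves the intercept unpenalized.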
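Ridge Regression
The estimator $\hat\beta = [X^TX + \lambda I]^{-1}X^Ty$ is the Ridge estimate, obtained as the solution of
$$\hat\beta = \underset{\beta}{\text{argmin}}\left\{\sum_{i=1}^n [y_i - \beta_0 - x_i^T\beta]^2 + \lambda\underbrace{\|\beta\|_{\ell_2}^2}_{\mathbf{1}^T\beta^2}\right\}$$
for some tuning parameter $\lambda$. One can also write
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_2} \le s}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
Remark: note that we solve $\hat\beta = \underset{\beta}{\text{argmin}}\{\text{objective}(\beta)\}$ where
$$\text{objective}(\beta) = \underbrace{\mathcal{L}(\beta)}_{\text{training loss}} + \underbrace{\mathcal{R}(\beta)}_{\text{regularization}}$$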
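The link between the closed form and the "loss + regularization" objective can be verified on simulated data. The sketch below assumes a hypothetical design without intercept, to keep the penalty and the closed form aligned, and minimizes the objective with a general-purpose optimizer.

```r
set.seed(2)
n <- 100; k <- 3
X <- matrix(rnorm(n * k), n, k)            # hypothetical design, no intercept
y <- drop(X %*% c(1, 0.5, -2) + rnorm(n))
lambda <- 5

# Closed form: (X^T X + lambda I)^{-1} X^T y
beta_closed <- drop(solve(crossprod(X) + lambda * diag(k), crossprod(X, y)))

# Same estimate by minimizing training loss + regularization numerically
objective <- function(b) sum((y - X %*% b)^2) + lambda * sum(b^2)
beta_optim <- optim(rep(0, k), objective, method = "BFGS")$par

# Least squares for comparison: the ridge estimate has a smaller norm
beta_ols <- drop(solve(crossprod(X), crossprod(X, y)))
rbind(beta_ols, beta_closed, beta_optim)
```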
Going Further on Sparsity Issues
In several applications, $k$ can be (very) large, but many features are just noise: $\beta_j = 0$ for many $j$'s. Let $s$ denote the number of relevant features, with $s \ll k$, cf. Hastie, Tibshirani & Wainwright (2015),
$$s = \text{card}\{\mathcal{S}\} \text{ where } \mathcal{S} = \{j;\ \beta_j \neq 0\}$$
The model is now $y = X_{\mathcal{S}}\beta_{\mathcal{S}} + \varepsilon$, where $X_{\mathcal{S}}^TX_{\mathcal{S}}$ is a full-rank matrix.
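Going Further on Sparsity Issues
Define $\|a\|_{\ell_0} = \sum \mathbf{1}(|a_i| > 0)$. Here $\dim(\beta) = s$.
We wish we could solve
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_0} \le s}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
Problem: it is usually not possible to enumerate all possible constraints, since $\binom{k}{s}$ sets of coefficients would have to be considered (with $k$ (very) large).
Idea: solve the dual problem
$$\hat\beta = \underset{\beta;\ \|Y - X\beta\|_{\ell_2} \le h}{\text{argmin}}\{\|\beta\|_{\ell_0}\}$$
where we might convexify the $\ell_0$ norm, $\|\cdot\|_{\ell_0}$.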
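The combinatorial cost of the $\ell_0$ constraint can be seen on a small simulated example, where exhaustive search is still feasible. This is a sketch with hypothetical dimensions ($k = 8$, $s = 2$) in base R.

```r
set.seed(3)
n <- 80; k <- 8; s <- 2
X <- matrix(rnorm(n * k), n, k)
y <- drop(3 * X[, 2] - 2 * X[, 5] + rnorm(n, sd = 0.5))  # only features 2 and 5 matter

# Number of candidate supports of size s
choose(k, s)       # 28: manageable
choose(1000, 10)   # about 2.6e23: hopeless

# Exhaustive search: least squares on every subset, keep the smallest RSS
subsets <- combn(k, s)
rss <- apply(subsets, 2, function(S) {
  fit <- lm.fit(X[, S, drop = FALSE], y)
  sum(fit$residuals^2)
})
best <- subsets[, which.min(rss)]
best
```

With a strong signal the search recovers the true support, but the number of subsets grows too fast for this approach to scale, which motivates the convex relaxation.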
Regularization: $\ell_0$, $\ell_1$ and $\ell_2$
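Going Further on Sparsity Issues
On $[-1, +1]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $\|\beta\|_{\ell_1}$.
On $[-a, +a]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $a^{-1}\|\beta\|_{\ell_1}$.
Hence,
$$\hat\beta = \underset{\beta;\ \|\beta\|_{\ell_1} \le \tilde{s}}{\text{argmin}}\{\|Y - X\beta\|_{\ell_2}^2\}$$
is equivalent (Kuhn-Tucker theorem) to the Lagrangian optimization problem
$$\hat\beta = \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$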
LASSO: Least Absolute Shrinkage and Selection Operator
$$\hat\beta \in \text{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$$
is a convex problem (several algorithms are available), but not a strictly convex one (the minimizer need not be unique). Nevertheless, the predictions $\hat{y} = x^T\hat\beta$ are unique.
Algorithms: MM (majorize-minimize), coordinate descent; see Hunter (2003).
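Coordinate descent for this problem reduces to repeated soft-thresholding. The following is a minimal sketch in base R, assuming standardized features, a centered response and no intercept; it runs a fixed number of sweeps and is not the algorithm of any particular package. The names `soft` and `lasso_cd` are illustrative.

```r
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)   # soft-thresholding operator

# Coordinate descent for ||y - X b||^2 + lambda * ||b||_1 (no intercept)
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  b  <- rep(0, ncol(X))
  xx <- colSums(X^2)
  for (it in seq_len(n_iter)) {
    for (j in seq_along(b)) {
      r_j  <- y - X[, -j, drop = FALSE] %*% b[-j]      # partial residual without feature j
      b[j] <- soft(sum(X[, j] * r_j), lambda / 2) / xx[j]
    }
  }
  b
}

set.seed(4)
n <- 100; k <- 6
X <- scale(matrix(rnorm(n * k), n, k))
y <- drop(2 * X[, 1] - 1.5 * X[, 3] + rnorm(n))
y <- y - mean(y)

b_none <- lasso_cd(X, y, lambda = 0)       # no penalty: least squares
b_mid  <- lasso_cd(X, y, lambda = 60)      # moderate penalty: shrinkage and exact zeros
b_big  <- lasso_cd(X, y, lambda = 2 * max(abs(crossprod(X, y))))  # all coefficients zero
round(rbind(b_none, b_mid, b_big), 3)
```

The threshold is $\lambda/2$ because the loss is written without the usual factor $1/2$; the smallest penalty that sets every coefficient to zero is then $2\max_j |x_j^Ty|$.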
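Optimal LASSO Penalty
Use cross-validation, e.g. K-fold: fit without fold $I_k$,
$$\hat\beta_{(-k)}(\lambda) = \text{argmin}\left\{\sum_{i \notin I_k} [y_i - x_i^T\beta]^2 + \lambda\|\beta\|_{\ell_1}\right\}$$
then compute the sum of squared errors on fold $I_k$,
$$Q_k(\lambda) = \sum_{i \in I_k} [y_i - x_i^T\hat\beta_{(-k)}(\lambda)]^2$$
and finally solve
$$\lambda^\star = \text{argmin}\left\{\bar{Q}(\lambda) = \frac{1}{K}\sum_k Q_k(\lambda)\right\}$$
Note that this might overfit, so Hastie, Tibshirani & Friedman (2009) suggest the largest $\lambda$ such that
$$\bar{Q}(\lambda) \le \bar{Q}(\lambda^\star) + \text{se}[\lambda^\star] \quad\text{with}\quad \text{se}[\lambda]^2 = \frac{1}{K^2}\sum_{k=1}^K [Q_k(\lambda) - \bar{Q}(\lambda)]^2$$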
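The selection rule can be sketched end to end on simulated data. The block below is self-contained base R: it reuses a simple coordinate-descent LASSO (an illustrative helper, `lasso_cd`, not a package function), a hypothetical grid of penalties and $K = 5$ folds, and applies the standard-error formula as written above.

```r
soft <- function(z, t) sign(z) * pmax(abs(z) - t, 0)
# Coordinate descent for ||y - X b||^2 + lambda * ||b||_1 (no intercept)
lasso_cd <- function(X, y, lambda, n_iter = 100) {
  b <- rep(0, ncol(X)); xx <- colSums(X^2)
  for (it in seq_len(n_iter)) for (j in seq_along(b)) {
    r_j  <- y - X[, -j, drop = FALSE] %*% b[-j]
    b[j] <- soft(sum(X[, j] * r_j), lambda / 2) / xx[j]
  }
  b
}

set.seed(5)
n <- 120; k <- 8; K <- 5
X <- scale(matrix(rnorm(n * k), n, k))
y <- drop(2 * X[, 1] - X[, 2] + rnorm(n)); y <- y - mean(y)
fold    <- sample(rep(1:K, length.out = n))            # fold labels I_1, ..., I_K
lambdas <- exp(seq(log(400), log(1), length.out = 30)) # hypothetical grid

# Q[k, l]: squared error on fold k of the model fitted without fold k
Q <- sapply(lambdas, function(l) sapply(1:K, function(k_) {
  b <- lasso_cd(X[fold != k_, ], y[fold != k_], l)
  sum((y[fold == k_] - X[fold == k_, ] %*% b)^2)
}))
Q_bar <- colMeans(Q)
se    <- sqrt(colSums(sweep(Q, 2, Q_bar)^2) / K^2)

i_min      <- which.min(Q_bar)
lambda_min <- lambdas[i_min]
# One-standard-error rule: largest lambda within one se of the minimum
lambda_1se <- max(lambdas[Q_bar <= Q_bar[i_min] + se[i_min]])
c(lambda_min = lambda_min, lambda_1se = lambda_1se)
```

By construction `lambda_1se` is at least as large as `lambda_min`, so it selects a model that is no less sparse.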
> freq = merge(contrat, nombre_RC)
> freq = merge(freq, nombre_DO)
> freq[,10] = as.factor(freq[,10])
> mx = cbind(freq[,c(4,5,6)], freq[,9]=="D", freq[,3]%in%c("A","B","C"))
> colnames(mx) = c(names(freq)[c(4,5,6)], "diesel", "zone")
> for(i in 1:ncol(mx)) mx[,i] = (mx[,i]-mean(mx[,i]))/sd(mx[,i])
> names(mx)
[1] puissance agevehicule ageconducteur diesel zone
> library(glmnet)
> fit = glmnet(x=as.matrix(mx), y=freq[,11], offset=log(freq[,2]), family="poisson")
> plot(fit, xvar="lambda", label=TRUE)
[Figure: LASSO, claim frequency (RC). Coefficients against Log Lambda.]
LASSO, Claim Frequency (RC)
> plot(fit, label=TRUE)
> cvfit = cv.glmnet(x=as.matrix(mx), y=freq[,11], offset=log(freq[,2]), family="poisson")
> plot(cvfit)
> cvfit$lambda.min
[1] 0.0002845703
> log(cvfit$lambda.min)
[1] -8.16453
• Cross-validation curve + error bars
[Figure: coefficients against the L1 norm; Poisson deviance against log(Lambda), with error bars.]
> freq = merge(contrat, nombre_RC)
> freq = merge(freq, nombre_DO)
> freq[,10] = as.factor(freq[,10])
> mx = cbind(freq[,c(4,5,6)], freq[,9]=="D", freq[,3]%in%c("A","B","C"))
> colnames(mx) = c(names(freq)[c(4,5,6)], "diesel", "zone")
> for(i in 1:ncol(mx)) mx[,i] = (mx[,i]-mean(mx[,i]))/sd(mx[,i])
> names(mx)
[1] puissance agevehicule ageconducteur diesel zone
> library(glmnet)
> fit = glmnet(x=as.matrix(mx), y=freq[,12], offset=log(freq[,2]), family="poisson")
> plot(fit, xvar="lambda", label=TRUE)
[Figure: LASSO, claim frequency (DO). Coefficients against Log Lambda.]
LASSO, Claim Frequency (DO)
> plot(fit, label=TRUE)
> cvfit = cv.glmnet(x=as.matrix(mx), y=freq[,12], offset=log(freq[,2]), family="poisson")
> plot(cvfit)
> cvfit$lambda.min
[1] 0.0004744917
> log(cvfit$lambda.min)
[1] -7.653266
• Cross-validation curve + error bars
[Figure: coefficients against the L1 norm; Poisson deviance against log(Lambda), with error bars.]
Model Selection and Gini/Lorenz (on incomes)
Consider an ordered sample $\{y_1, \cdots, y_n\}$; the Lorenz curve is then
$$\{F_i, L_i\} \quad\text{with}\quad F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$
The theoretical curve, given a distribution $F$, is
$$u \mapsto L(u) = \frac{\int_{-\infty}^{F^{-1}(u)} t\,dF(t)}{\int_{-\infty}^{+\infty} t\,dF(t)}$$
see Gastwirth (1972, econpapers.repec.org).
Model Selection and Gini/Lorenz (on incomes)
> library(ineq)
> set.seed(1)
> (x <- sort(rlnorm(5,0,1)))
[1] 0.4336018 0.5344838 1.2015872 1.3902836 4.9297132
> Lc.sim <- Lc(x)
> plot(Lc.sim)
> points((1:4)/5, (cumsum(x)/sum(x))[1:4], pch=19, col="blue")
> lines(Lc.lognorm, parameter=1, lty=2)
[Figure: Lorenz curve, L(p) against p.]
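The empirical Lorenz points can also be computed directly from the definition, without the `ineq` package. The sketch below hard-codes the sorted sample printed above.

```r
# Sorted sample printed on the slide
x <- c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132)
n <- length(x)

F_i <- (1:n) / n              # cumulative share of the population
L_i <- cumsum(x) / sum(x)     # cumulative share of the total held by the i smallest values

# Add the origin so that the curve starts at (0, 0)
lorenz <- data.frame(F = c(0, F_i), L = c(0, L_i))
lorenz
```

Since the sample is sorted in increasing order, every point lies on or below the diagonal $L = F$.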
Model Selection and Gini/Lorenz (on incomes)
The Gini index is the ratio of the areas $\dfrac{A}{A+B}$. Thus,
$$G = \frac{2}{n(n-1)\bar{x}}\sum_{i=1}^n i\cdot x_{i:n} - \frac{n+1}{n-1} = \frac{1}{E(Y)}\int_0^\infty F(y)(1-F(y))\,dy$$
> Gini(x)
[1] 0.4640003
[Figure: Lorenz curve, L(p) against p, with areas A and B.]
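The sample formula can be evaluated by hand on the five values above. This base R sketch computes the statistic with and without the $n-1$ finite-sample correction, plus the equivalent pairwise-difference form; the variable names are illustrative.

```r
x <- c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132)  # sorted sample from the slide
n <- length(x)

# Formula above, with the n - 1 finite-sample correction
G_corrected <- 2 / (n * (n - 1) * mean(x)) * sum((1:n) * x) - (n + 1) / (n - 1)

# Same statistic without the correction, i.e. G_corrected * (n - 1) / n
G_plain <- 2 / (n^2 * mean(x)) * sum((1:n) * x) - (n + 1) / n

# Equivalent pairwise form: mean absolute difference over twice the mean
G_pairs <- sum(abs(outer(x, x, "-"))) / (2 * n^2 * mean(x))

c(G_corrected = G_corrected, G_plain = G_plain, G_pairs = G_pairs)
```

On this sample the uncorrected version is about 0.464, the value printed by `Gini(x)`, whereas the corrected formula gives about 0.58. In `ineq`, the correction is, to my knowledge, requested with `Gini(x, corr = TRUE)`.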
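Comparing Models
Consider an ordered sample $\{y_1, \cdots, y_n\}$ of incomes, with $y_1 \le y_2 \le \cdots \le y_n$; the Lorenz curve is then
$$\{F_i, L_i\} \quad\text{with}\quad F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$
We have observed losses $y_i$ and premiums $\hat\pi(x_i)$. Consider a sample ordered by the model, see Frees, Meyers & Cummings (2014), $\hat\pi(x_1) \ge \hat\pi(x_2) \ge \cdots \ge \hat\pi(x_n)$, then plot
$$\{F_i, L_i\} \quad\text{with}\quad F_i = \frac{i}{n} \quad\text{and}\quad L_i = \frac{\sum_{j=1}^i y_j}{\sum_{j=1}^n y_j}$$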
[Figure: Lorenz curve of incomes, Income (%) against Proportion (%), from poorest to richest; ordered curve of losses, Losses (%) against Proportion (%), from more risky to less risky.]
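The model-ordered curve can be sketched on a toy portfolio. The losses and the two premium vectors below are hypothetical; the helper `ordered_lorenz` is an illustrative name, not a function from any package.

```r
# Hypothetical observed losses and premiums from two competing models
y   <- c(0, 0, 500, 0, 1200, 0, 300, 0, 0, 2000)
pi1 <- c(50, 60, 300, 40, 500, 80, 200, 30, 70, 700)         # ranks the risks well
pi2 <- c(200, 150, 180, 210, 190, 220, 160, 230, 170, 140)   # ranks them poorly

# Sort policies from the highest to the lowest premium, then cumulate observed losses
ordered_lorenz <- function(y, premium) {
  o <- order(premium, decreasing = TRUE)
  data.frame(F = seq_along(y) / length(y), L = cumsum(y[o]) / sum(y))
}

c1 <- ordered_lorenz(y, pi1)
c2 <- ordered_lorenz(y, pi2)
cbind(F = c1$F, L_model1 = c1$L, L_model2 = c2$L)
```

A model that ranks risks well concentrates the losses among the policies it prices highest, so its curve rises faster; here the first model captures all losses within the top 40% of policies.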
Model Choice and Comparison in Ratemaking
See Frees et al. (2010) or Tevet (2013).
Call Girls in Yamuna Vihar (delhi) call me [🔝9953056974🔝] escort service 24X7
 

Slides ensae 9

  • 1. Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9 (Non-Life Insurance Actuarial Science, # 9). A. Charpentier (UQAM & Université de Rennes 1), ENSAE ParisTech, October 2015 - January 2016. http://freakonometrics.hypotheses.org @freakonometrics
  • 2. Pricing miscellany (Fourre-Tout sur la Tarification): collective model vs. individual model; the high-dimensional case; variable selection; model selection.
  • 3. Individual model or collective model? The Tweedie distribution. Consider a Tweedie distribution with variance function power $p \in (1, 2)$, mean $\mu$ and scale parameter $\phi$. It is then a compound Poisson model with
      $N \sim \mathcal{P}(\lambda)$, where $\lambda = \dfrac{\phi\mu^{2-p}}{2-p}$,
      $Y_i \sim \mathcal{G}(\alpha, \beta)$, where $\alpha = -\dfrac{p-2}{p-1}$ and $\beta = \dfrac{\phi\mu^{1-p}}{p-1}$.
    Conversely, consider a compound Poisson model with $N \sim \mathcal{P}(\lambda)$ and $Y_i \sim \mathcal{G}(\alpha, \beta)$. Then
      the variance function power is $p = \dfrac{\alpha+2}{\alpha+1}$,
      the mean is $\mu = \dfrac{\lambda\alpha}{\beta}$,
      the scale parameter is $\phi = \dfrac{[\lambda\alpha]^{\frac{\alpha+2}{\alpha+1}-1}\,\beta^{2-\frac{\alpha+2}{\alpha+1}}}{\alpha+1}$.
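The two-way mapping on slide 3 can be checked numerically. The sketch below is only an illustration in base R: the function names `tweedie_to_cp` and `cp_to_tweedie` are hypothetical, and $\beta$ is assumed to be a gamma rate parameter (which is what makes $\mu = \lambda\alpha/\beta$ hold).

```r
# Slide-3 mapping: Tweedie (p, mu, phi) -> compound Poisson-gamma (lambda, alpha, beta).
# Assumption: beta is the gamma RATE, so that E[Y] = alpha / beta.
tweedie_to_cp <- function(p, mu, phi) {
  stopifnot(p > 1, p < 2, mu > 0, phi > 0)
  list(lambda = phi * mu^(2 - p) / (2 - p),
       alpha  = (2 - p) / (p - 1),          # same as -(p - 2) / (p - 1)
       beta   = phi * mu^(1 - p) / (p - 1))
}

# Converse mapping: compound Poisson-gamma -> Tweedie (p, mu, phi).
cp_to_tweedie <- function(lambda, alpha, beta) {
  p <- (alpha + 2) / (alpha + 1)
  list(p   = p,
       mu  = lambda * alpha / beta,
       phi = (lambda * alpha)^(p - 1) * beta^(2 - p) / (alpha + 1))
}

# Round trip on illustrative values.
cp   <- tweedie_to_cp(p = 1.5, mu = 100, phi = 2)
back <- do.call(cp_to_tweedie, cp)

# Variance of the compound sum: lambda * E[Y^2] = lambda * alpha * (alpha + 1) / beta^2.
var_S <- cp$lambda * cp$alpha * (cp$alpha + 1) / cp$beta^2
```

With these formulas the variance of the aggregate loss works out to $\mu^p/\phi$ (here $100^{1.5}/2 = 500$), so $\phi$ enters as the reciprocal of the dispersion used in the convention $\mathrm{Var}(S) = \phi\mu^p$.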
  • 4. Individual model or collective model? Tweedie regression. In the context of regression,
      $N_i \sim \mathcal{P}(\lambda_i)$ with $\lambda_i = \exp[X_i^T\beta_\lambda]$,
      $Y_{j,i} \sim \mathcal{G}(\mu_i, \phi)$ with $\mu_i = \exp[X_i^T\beta_\mu]$.
    Then $S_i = Y_{1,i} + \cdots + Y_{N,i}$ has a Tweedie distribution, where
      the variance function power is $p = \dfrac{\phi+2}{\phi+1}$,
      the mean is $\lambda_i\mu_i$,
      the scale parameter is $\dfrac{\lambda_i^{\frac{1}{\phi+1}-1}}{\mu_i^{\frac{\phi}{\phi+1}}}\cdot\dfrac{\phi}{1+\phi}$.
    There are $1 + 2\dim(X)$ degrees of freedom.
  • 5. Individual model or collective model? Tweedie regression. Remark: note that the scale parameter should not depend on $i$. A Tweedie regression is specified by
      a variance function power $p \in (1, 2)$,
      a mean $\mu_i = \exp[X_i^T\beta_{\text{Tweedie}}]$,
      a scale parameter $\phi$.
    There are $2 + \dim(X)$ degrees of freedom.
  • 6. Two-part model: frequency and individual cost (Double Modèle Fréquence - Coût Individuel). Consider the following datasets for the third-party liability coverage (RC). For the frequency:
      > freq = merge(contrat, nombre_RC)
    for the individual costs:
      > sinistre_RC = sinistre[(sinistre$garantie=="1RC") & (sinistre$cout>0), ]
      > sinistre_RC = merge(sinistre_RC, contrat)
    and for the costs aggregated by policy:
      > agg_RC = aggregate(sinistre_RC$cout, by=list(sinistre_RC$nocontrat), FUN='sum')
      > names(agg_RC) = c('nocontrat', 'cout_RC')
      > global_RC = merge(contrat, agg_RC, all.x=TRUE)
      > # policies without any claim get an aggregate cost of zero
      > global_RC$cout_RC[is.na(global_RC$cout_RC)] = 0
  • 7. Two-part model: frequency and individual cost.
      > library(splines)
      > reg_f = glm(nb_RC ~ zone + bs(ageconducteur) + carburant, offset=log(exposition), data=freq, family=poisson)
      > reg_c = glm(cout ~ zone + bs(ageconducteur) + carburant, data=sinistre_RC, family=Gamma(link="log"))
    Single model: cost per policy (Simple Modèle Coût par Police).
      > library(tweedie)
      > library(statmod)
      > reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset=log(exposition), data=global_RC, family=tweedie(var.power=1.5, link.power=0))
  • 8. Comparing premiums (Comparaison des primes).
      > freq2 = freq
      > freq2$exposition = 1
      > P_f = predict(reg_f, newdata=freq2, type="response")
      > P_c = predict(reg_c, newdata=freq2, type="response")
      > prime1 = P_f*P_c
      > k = 1.5
      > reg_a = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, offset=log(exposition), data=global_RC, family=tweedie(var.power=k, link.power=0))
      > prime2 = predict(reg_a, newdata=freq2, type="response")
      > arrows(1:100, prime1[1:100], 1:100, prime2[1:100], length=.1)
  • 9. Impact of the Tweedie power on pure premiums (Impact du degré Tweedie sur les Primes Pures). [Figure]
  • 10. Impact of the Tweedie power on pure premiums. Comparison of pure premiums for policyholders no. 1, no. 2 and no. 3 (DO coverage). [Figure]
  • 11. ‘Optimisation’ of the Tweedie parameter.
      > dev = function(k){
      +   reg = glm(cout_RC ~ zone + bs(ageconducteur) + carburant, data=global_RC, family=tweedie(var.power=k, link.power=0), offset=log(exposition))
      +   reg$deviance
      + }
  • 12. Pricing and massive data (Big Data). Classical problems with massive data: many explanatory variables, with $k$ large, so that $X^TX$ may not be invertible; large volumes of data, e.g. telematics data; non-quantitative data, e.g. text, location, etc.
  • 13. The fascination with unbiased estimators. In mathematical statistics, unbiased estimators are popular because they have several attractive properties. But could we not consider biased estimators that are potentially better? Consider an i.i.d. sample $\{y_1, \cdots, y_n\}$ with distribution $\mathcal{N}(\mu, \sigma^2)$. Define $\hat\theta = \alpha\bar{Y}$. What is the optimal $\alpha$ to get the best estimator of $\mu$?
      bias: $\text{bias}(\hat\theta) = E(\hat\theta) - \mu = (\alpha - 1)\mu$
      variance: $\text{Var}(\hat\theta) = \dfrac{\alpha^2\sigma^2}{n}$
      mse: $\text{mse}(\hat\theta) = (\alpha - 1)^2\mu^2 + \dfrac{\alpha^2\sigma^2}{n}$
    Setting the derivative of the mse with respect to $\alpha$ to zero, the optimal value is $\alpha^\star = \dfrac{\mu^2}{\mu^2 + \dfrac{\sigma^2}{n}} < 1$.
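The shrinkage result on slide 13 can be verified numerically. A minimal base-R sketch, with illustrative values for $\mu$, $\sigma$ and $n$ (these values and the function names are assumptions, not taken from the slides):

```r
# Mean squared error of theta_hat = alpha * Ybar, as written on the slide.
mse_alpha <- function(alpha, mu, sigma, n) {
  (alpha - 1)^2 * mu^2 + alpha^2 * sigma^2 / n
}

# Closed-form minimizer from the slide.
alpha_star <- function(mu, sigma, n) mu^2 / (mu^2 + sigma^2 / n)

mu <- 1; sigma <- 2; n <- 10          # illustrative values only

a_closed  <- alpha_star(mu, sigma, n)
a_numeric <- optimize(mse_alpha, interval = c(0, 2),
                      mu = mu, sigma = sigma, n = n)$minimum

mse_shrunk   <- mse_alpha(a_closed, mu, sigma, n)  # biased, shrunk estimator
mse_unbiased <- mse_alpha(1, mu, sigma, n)         # alpha = 1 is the sample mean
```

The numerical minimizer agrees with the closed form, and the biased estimator has a smaller mse than the unbiased sample mean. In practice $\alpha^\star$ depends on the unknown $\mu$ and $\sigma$, so this is a motivation for shrinkage rather than a usable estimator.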
  • 14. Linear model. Consider a linear model $y_i = x_i^T\beta + \varepsilon_i$ for all $i = 1, \cdots, n$. Assume that the $\varepsilon_i$ are i.i.d. with $E(\varepsilon) = 0$ (and finite variance). Write
      $\underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}}_{y,\ n\times 1} = \underbrace{\begin{pmatrix} 1 & x_{1,1} & \cdots & x_{1,k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n,1} & \cdots & x_{n,k} \end{pmatrix}}_{X,\ n\times(k+1)} \underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}}_{\beta,\ (k+1)\times 1} + \underbrace{\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{\varepsilon,\ n\times 1}$
    Assuming $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$, the maximum likelihood estimator of $\beta$ is
      $\hat\beta = \operatorname{argmin}\{\|y - X\beta\|_{\ell_2}^2\} = (X^TX)^{-1}X^Ty$
    under the assumption that $X^TX$ is a full-rank matrix. What if $X^TX$ cannot be inverted? Then $\hat\beta = [X^TX]^{-1}X^Ty$ does not exist, but $\hat\beta_\lambda = [X^TX + \lambda I]^{-1}X^Ty$ always exists if $\lambda > 0$.
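  • 15. Ridge regression. The estimator $\hat\beta = [X^TX + \lambda I]^{-1}X^Ty$ is the Ridge estimate, obtained as the solution of
      $\hat\beta = \operatorname{argmin}_{\beta}\Big\{\sum_{i=1}^n [y_i - \beta_0 - x_i^T\beta]^2 + \lambda\underbrace{\|\beta\|_{\ell_2}^2}_{\mathbf{1}^T\beta^2}\Big\}$
    for some tuning parameter $\lambda$. One can also write
      $\hat\beta = \operatorname{argmin}_{\beta;\ \|\beta\|_{\ell_2}^2 \le s}\{\|Y - X\beta\|_{\ell_2}^2\}$
    Remark: note that we solve $\hat\beta = \operatorname{argmin}_{\beta}\{\text{objective}(\beta)\}$, where
      $\text{objective}(\beta) = \underbrace{L(\beta)}_{\text{training loss}} + \underbrace{R(\beta)}_{\text{regularization}}$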
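To illustrate why the ridge estimate exists even when $X^TX$ is singular, here is a minimal base-R sketch on simulated data. The data, the helper name `ridge` and the choice $\lambda = 0.1$ are assumptions made for the example; the model has no intercept, so every coefficient is penalized, as in the closed form $[X^TX + \lambda I]^{-1}X^Ty$.

```r
set.seed(1)
n  <- 20
x1 <- rnorm(n)
X  <- cbind(x1, x1, rnorm(n))   # first two columns are identical: X'X is singular
y  <- 2 * x1 + rnorm(n)

# Ridge estimate in closed form: solve (X'X + lambda I) beta = X'y.
ridge <- function(X, y, lambda) {
  k <- ncol(X)
  drop(solve(crossprod(X) + lambda * diag(k), crossprod(X, y)))
}

rank_XtX   <- qr(crossprod(X))$rank   # 2 < 3, so (X'X)^{-1} does not exist
beta_ridge <- ridge(X, y, lambda = 0.1)
```

Ordinary least squares has no unique solution here, while the ridge system is well posed for any $\lambda > 0$. By symmetry, the two duplicated columns receive the same coefficient.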
  • 16. Going further on sparsity issues. In several applications, $k$ can be (very) large, but many features are just noise: $\beta_j = 0$ for many $j$'s. Let $s$ denote the number of relevant features, with $s \ll k$, cf. Hastie, Tibshirani & Wainwright (2015),
      $s = \text{card}\{S\}$ where $S = \{j;\ \beta_j \neq 0\}$
    The model is now $y = X_S\beta_S + \varepsilon$, where $X_S^TX_S$ is a full-rank matrix.
  • 17. Going further on sparsity issues. Define $\|a\|_{\ell_0} = \sum_i \mathbf{1}(|a_i| > 0)$. Here $\dim(\beta) = s$. We wish we could solve
      $\hat\beta = \operatorname{argmin}_{\beta;\ \|\beta\|_{\ell_0} \le s}\{\|Y - X\beta\|_{\ell_2}^2\}$
    Problem: it is usually not possible to describe all possible constraints, since $\binom{k}{s}$ subsets of coefficients would have to be considered here (with $k$ (very) large). Idea: solve the dual problem
      $\hat\beta = \operatorname{argmin}_{\beta;\ \|Y - X\beta\|_{\ell_2}^2 \le h}\{\|\beta\|_{\ell_0}\}$
    where we might convexify the $\ell_0$ norm, $\|\cdot\|_{\ell_0}$.
  • 18. Regularization with $\ell_0$, $\ell_1$ and $\ell_2$. [Figure]
  • 19. Going further on sparsity issues. On $[-1, +1]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $\|\beta\|_{\ell_1}$. On $[-a, +a]^k$, the convex hull of $\|\beta\|_{\ell_0}$ is $a^{-1}\|\beta\|_{\ell_1}$. Hence,
      $\hat\beta = \operatorname{argmin}_{\beta;\ \|\beta\|_{\ell_1} \le \tilde{s}}\{\|Y - X\beta\|_{\ell_2}^2\}$
    is equivalent (Kuhn-Tucker theorem) to the Lagrangian optimization problem
      $\hat\beta = \operatorname{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$
  • 20. LASSO: Least Absolute Shrinkage and Selection Operator.
      $\hat\beta \in \operatorname{argmin}\{\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}\}$
    is a convex problem (several algorithms are available), but not a strictly convex one (the minimum need not be unique). Nevertheless, the predictions $\hat{y} = x^T\hat\beta$ are unique. Algorithms: MM (minimization by majorization) and coordinate descent, see Hunter (2003).
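Coordinate descent for the LASSO objective above can be sketched in a few lines of base R. This is an illustrative implementation under stated assumptions, not the algorithm used by any particular package: no intercept, no standardization, and the helper names `soft_threshold` and `lasso_cd` are hypothetical. Because the objective is $\|Y - X\beta\|_{\ell_2}^2 + \lambda\|\beta\|_{\ell_1}$ (without a factor 1/2), the soft-threshold level is $\lambda/2$.

```r
# Soft-thresholding operator: S(z, t) = sign(z) * max(|z| - t, 0).
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for  ||y - X beta||^2 + lambda * ||beta||_1.
lasso_cd <- function(X, y, lambda, max_iter = 1000, tol = 1e-10) {
  k      <- ncol(X)
  beta   <- rep(0, k)
  col_ss <- colSums(X^2)                       # x_j' x_j for each column
  for (iter in seq_len(max_iter)) {
    beta_old <- beta
    for (j in seq_len(k)) {
      # partial residual: remove the fit of all features except j
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
      beta[j] <- soft_threshold(sum(X[, j] * r_j), lambda / 2) / col_ss[j]
    }
    if (max(abs(beta - beta_old)) < tol) break  # stop when updates stall
  }
  beta
}

set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3)               # simulated design, illustrative
y <- drop(X %*% c(2, 0, -1) + rnorm(50))

beta_ols <- drop(solve(crossprod(X), crossprod(X, y)))
beta_0   <- lasso_cd(X, y, lambda = 0)          # no penalty: least squares
beta_big <- lasso_cd(X, y, lambda = 2 * max(abs(crossprod(X, y))) + 1)
```

With $\lambda = 0$ the iterations converge to the least-squares solution, and once $\lambda \ge 2\max_j |x_j^Ty|$ every coefficient is set exactly to zero, which is the selection effect of the $\ell_1$ penalty.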
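  • 21. Optimal LASSO penalty. Use cross-validation, e.g. $K$-fold: fit the model without the observations of fold $I_k$,
      $\hat\beta_{(-k)}(\lambda) = \operatorname{argmin}\Big\{\sum_{i \notin I_k} [y_i - x_i^T\beta]^2 + \lambda\|\beta\|_{\ell_1}\Big\}$
    then compute the sum of the squared errors on fold $I_k$,
      $Q_k(\lambda) = \sum_{i \in I_k} [y_i - x_i^T\hat\beta_{(-k)}(\lambda)]^2$
    and finally solve
      $\lambda^\star = \operatorname{argmin}\Big\{\bar{Q}(\lambda) = \dfrac{1}{K}\sum_k Q_k(\lambda)\Big\}$
    Note that this might overfit, so Hastie, Tibshirani & Friedman (2009) suggest the largest $\lambda$ such that
      $\bar{Q}(\lambda) \le \bar{Q}(\lambda^\star) + \text{se}[\lambda^\star]$ with $\text{se}[\lambda]^2 = \dfrac{1}{K^2}\sum_{k=1}^K [Q_k(\lambda) - \bar{Q}(\lambda)]^2$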
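The selection step of slide 21 (minimum of the averaged error, then the one-standard-error rule) can be sketched as follows, assuming the per-fold errors $Q_k(\lambda)$ have already been computed on a grid. The function name `one_se_rule` and the toy matrix `Q` are hypothetical; the standard error uses the formula as written on the slide.

```r
# Q: K x length(lambda) matrix with Q[k, l] = Q_k(lambda[l]).
one_se_rule <- function(Q, lambda) {
  K     <- nrow(Q)
  Q_bar <- colMeans(Q)                                  # averaged CV error
  se    <- sqrt(colSums(sweep(Q, 2, Q_bar)^2) / K^2)    # se[lambda] from the slide
  i_min <- which.min(Q_bar)
  ok    <- Q_bar <= Q_bar[i_min] + se[i_min]            # within one se of the minimum
  list(lambda_min = lambda[i_min], lambda_1se = max(lambda[ok]))
}

# Toy example: 3 folds, 4 candidate penalties (made-up numbers).
lambda <- c(0.01, 0.1, 1, 10)
Q <- rbind(c(1.0, 0.9, 0.95, 2.0),
           c(1.2, 1.1, 1.15, 2.2),
           c(0.8, 0.7, 0.75, 1.8))
choice <- one_se_rule(Q, lambda)
```

On this toy grid the averaged error is smallest at $\lambda = 0.1$, but $\lambda = 1$ is within one standard error of that minimum, so the rule selects the larger, more parsimonious penalty.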
  • 22. LASSO, RC frequency.
      > freq = merge(contrat, nombre_RC)
      > freq = merge(freq, nombre_DO)
      > freq[,10] = as.factor(freq[,10])
      > mx = cbind(freq[,c(4,5,6)], freq[,9]=="D", freq[,3]%in%c("A","B","C"))
      > colnames(mx) = c(names(freq)[c(4,5,6)], "diesel", "zone")
      > for(i in 1:ncol(mx)) mx[,i] = (mx[,i]-mean(mx[,i]))/sd(mx[,i])
      > names(mx)
      [1] puissance agevehicule ageconducteur diesel zone
      > library(glmnet)
      > fit = glmnet(x=as.matrix(mx), y=freq[,11], offset=log(freq[,2]), family="poisson")
      > plot(fit, xvar="lambda", label=TRUE)
    [Figure: LASSO, RC frequency. Coefficients against Log Lambda.]
  • 23. LASSO, RC frequency.
      > plot(fit, label=TRUE)
      > cvfit = cv.glmnet(x=as.matrix(mx), y=freq[,11], offset=log(freq[,2]), family="poisson")
      > plot(cvfit)
      > cvfit$lambda.min
      [1] 0.0002845703
      > log(cvfit$lambda.min)
      [1] -8.16453
    Cross-validation curve with error bars.
    [Figures: coefficients against the L1 norm; Poisson deviance against log(Lambda).]
  • 24. LASSO, DO frequency.
      > freq = merge(contrat, nombre_RC)
      > freq = merge(freq, nombre_DO)
      > freq[,10] = as.factor(freq[,10])
      > mx = cbind(freq[,c(4,5,6)], freq[,9]=="D", freq[,3]%in%c("A","B","C"))
      > colnames(mx) = c(names(freq)[c(4,5,6)], "diesel", "zone")
      > for(i in 1:ncol(mx)) mx[,i] = (mx[,i]-mean(mx[,i]))/sd(mx[,i])
      > names(mx)
      [1] puissance agevehicule ageconducteur diesel zone
      > library(glmnet)
      > fit = glmnet(x=as.matrix(mx), y=freq[,12], offset=log(freq[,2]), family="poisson")
      > plot(fit, xvar="lambda", label=TRUE)
    [Figure: LASSO, DO frequency. Coefficients against Log Lambda.]
  • 25. LASSO, DO frequency.
      > plot(fit, label=TRUE)
      > cvfit = cv.glmnet(x=as.matrix(mx), y=freq[,12], offset=log(freq[,2]), family="poisson")
      > plot(cvfit)
      > cvfit$lambda.min
      [1] 0.0004744917
      > log(cvfit$lambda.min)
      [1] -7.653266
    Cross-validation curve with error bars.
    [Figures: coefficients against the L1 norm; Poisson deviance against log(Lambda).]
  • 26. Model selection and Gini/Lorenz (on incomes). Consider an ordered sample $\{y_1, \cdots, y_n\}$; the Lorenz curve is then $\{F_i, L_i\}$ with
      $F_i = \dfrac{i}{n}$ and $L_i = \dfrac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$
    The theoretical curve, given a distribution $F$, is
      $u \mapsto L(u) = \dfrac{\int_{-\infty}^{F^{-1}(u)} t\,dF(t)}{\int_{-\infty}^{+\infty} t\,dF(t)}$
    see Gastwirth (1972, econpapers.repec.org).
  • 27. Model selection and Gini/Lorenz (on incomes).
      > library(ineq)
      > set.seed(1)
      > (x <- sort(rlnorm(5, 0, 1)))
      [1] 0.4336018 0.5344838 1.2015872 1.3902836 4.9297132
      > Lc.sim <- Lc(x)
      > plot(Lc.sim)
      > points((1:4)/5, (cumsum(x)/sum(x))[1:4], pch=19, col="blue")
      > lines(Lc.lognorm, parameter=1, lty=2)
    [Figure: Lorenz curve, L(p) against p.]
  • 28. Model selection and Gini/Lorenz (on incomes). The Gini index is the ratio of the areas $\dfrac{A}{A + B}$. Thus,
      $G = \dfrac{2}{n(n-1)\bar{x}}\sum_{i=1}^{n} i \cdot x_{i:n} - \dfrac{n+1}{n-1} = \dfrac{1}{E(Y)}\int_0^\infty F(y)(1 - F(y))\,dy$
      > Gini(x)
      [1] 0.4640003
    [Figure: Lorenz curve, L(p) against p, with areas A and B.]
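The empirical Lorenz points and the Gini index of slides 26 to 28 can be recomputed in base R, without the `ineq` package, from the five sorted values printed on slide 27. The function names `lorenz_points` and `gini` and the `corrected` argument are hypothetical names chosen for this sketch.

```r
# Empirical Lorenz curve: F_i = i / n, L_i = cumulative share of the total.
lorenz_points <- function(y) {
  y <- sort(y)
  n <- length(y)
  data.frame(Fi = (1:n) / n, Li = cumsum(y) / sum(y))
}

# Gini index from the order statistics.
# corrected = TRUE  divides by n - 1 (the formula displayed on slide 28),
# corrected = FALSE divides by n.
gini <- function(y, corrected = FALSE) {
  y <- sort(y)
  n <- length(y)
  g <- 2 * sum((1:n) * y) / sum(y) - (n + 1)
  if (corrected) g / (n - 1) else g / n
}

# Sample printed on slide 27.
x <- c(0.4336018, 0.5344838, 1.2015872, 1.3902836, 4.9297132)

lp     <- lorenz_points(x)
g_raw  <- gini(x)                    # normalization by n
g_corr <- gini(x, corrected = TRUE)  # normalization by n - 1
```

On this sample, the normalization by $n$ reproduces the value 0.4640003 printed on slide 28, whereas the displayed formula with $n - 1$ gives about 0.58. The two versions differ only by the factor $n/(n-1)$, which matters for a sample of five observations.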
  • 29. Comparing models. Consider an ordered sample $\{y_1, \cdots, y_n\}$ of incomes, with $y_1 \le y_2 \le \cdots \le y_n$; the Lorenz curve is then $\{F_i, L_i\}$ with
      $F_i = \dfrac{i}{n}$ and $L_i = \dfrac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$
    We have observed losses $y_i$ and premiums $\hat\pi(x_i)$. Consider a sample ordered by the model, see Frees, Meyers & Cummins (2014), $\hat\pi(x_1) \ge \hat\pi(x_2) \ge \cdots \ge \hat\pi(x_n)$, then plot $\{F_i, L_i\}$ with
      $F_i = \dfrac{i}{n}$ and $L_i = \dfrac{\sum_{j=1}^{i} y_j}{\sum_{j=1}^{n} y_j}$
    [Figures: Income (%) against Proportion (%), from poorest to richest; Losses (%) against Proportion (%), from more risky to less risky.]
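The model-ordered curve of slide 29 differs from the usual Lorenz curve only in the ordering: policies are sorted by decreasing predicted premium, and observed losses are accumulated in that order. A minimal base-R sketch, with a hypothetical function name `ordered_lorenz` and made-up toy data:

```r
# Sort policies by decreasing premium (most risky according to the model first),
# then accumulate the observed losses in that order.
ordered_lorenz <- function(losses, premiums) {
  stopifnot(length(losses) == length(premiums))
  ord <- order(premiums, decreasing = TRUE)
  n   <- length(losses)
  data.frame(Fi = (1:n) / n,
             Li = cumsum(losses[ord]) / sum(losses))
}

# Toy portfolio: four policies, observed losses and model premiums (illustrative).
losses    <- c(10, 0, 5, 0)
premiums  <- c(4, 1, 3, 2)
curve_pts <- ordered_lorenz(losses, premiums)
```

In this toy example the two policies with the highest premiums carry all of the losses, so the curve reaches 1 after half of the portfolio. A model that ranks risks well concentrates the losses among the first policies, whereas an uninformative ranking gives a curve close to the diagonal.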
  • 30. Model choice and comparison in pricing (Choix et comparaison de modèle en tarification). See Frees et al. (2010) or Tevet (2013).