dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf

•

0 recomendaciones•4 vistas

(1) The Tweedie regression model estimates the power parameter p to be 1.5 and the scale parameter φ to be 0.1 based on a linear regression of the sample means and variances. (2) Forward selection enters 11 predictors into the model in order of statistical significance, with the quasi-likelihood and deviance statistics reported at each step. (3) The final model shows good calibration with an RMSE of $5,000, relative error of 30%, Pearson correlation of 0.6 and distance correlation of 0.4 between predicted and observed claims.

Educación

dataset:
https://docs.google.com/spreadsheets/d/1t_rnuYco8eGSVJ3DnrGsmP0A8VvC7L4m/edit?usp=
sharing&ouid=113562796292872953384&rtpof=true&sd=true
please use python code to solve the problem.
In insurance ratemaking, the term Pure Premium is defined as the claim amount divided by the
Exposure duration. Nowadays, actuaries assume a Tweedie distribution for the claim amount.
Using the claim_history.xIsx, we will train a Tweedie regression model to study how policy
attributes affect the claim amount. The model has the following specifications. - Response
Variable: CLM_AMT - Distribution: Tweedie - Link Function: Natural logarithm - Offset
Variable: Natural logarithm of EXPOSURE - Categorical Predictors: CAR_TYPE, CAR_USE,
EDUCATION, GENDER, MSTATUS, PARENT1, RED_CAR, REVOKED, and
URBANICITY. Reorder the categories of each predictor in ascending order of the number of
observations. - Interval Predictors: AGE, BLUEBOOK, CAR_AGE, HOME_VAL,
HOMEKIDS, INCOME, YOJ, KIDSDRIV, MVR_PTS, TIF, and TRAVTIME. Please divide
BLUEBOOK, HOME_VAL, and INCOME by 1000 before training the model. - The model
always includes the Intercept term. We will first drop all missing values casewisely in all the
predictors, the target variable, and the offset variable. Then, we will only use complete
observations for training our models. a) (10 points). We will first estimate the Tweedie
distribution's Power parameter p and Scale parameter . To this end, we will calculate the sample
means and the sample variances of the claim amount for each value combination of the
predictors. Then, we will train a linear regression model to help us estimate the two parameters.
What are their values? Please provide us with your appropriate chart. b) (10 points). We will use
the Forward Selection method to enter predictors into our model. Our entry threshold is 0.05 .
Please provide a summary report of the Forward Selection in a table. The report should include
(1) the Step Number, (2) the Predictor Entered, (3) the Model Degree of Freedom (i.e., the
number of non-aliased parameters), (4) the Quasi-Loglikelihood value, (5) the Deviance Chi-
squares statistic between the current and the previous models, (6) the corresponding Deviance
Degree of Freedom, and (7) the corresponding Chi-square significance. c) (10 points). We will
calculate the Root Mean Squared Error, the Relative Error, the Pearson correlation, and the
Distance correlation between the observed and the predicted claim amounts of your final model.
Please comment on their values. d) (10 points). Please show a table of the complete set of
parameters of your final model (including the aliased parameters). Besides the parameter
estimates, please also include the standard errors, the 95% asymptotic confidence intervals, and
the exponentiated parameter estimates. Conventionally, aliased parameters have zero standard
errors and confidence intervals. Please also provide us with the final estimate of the Tweedie
distribution's scale parameter . e) (10 points). Please generate a Two-way Lift chart for
comparing your final model with the Interceptonly model. Based on the chart, what will you
conclude about your final model?

Más contenido relacionado

Similar a dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf

Regression krigingFAO

Credit Card Fraud Detection Using Machine Learning & Data ScienceIRJET Journal

reportArthur He

Two methods for optimising cognitive model parametersUniversity of Huddersfield

Hidalgo jairo, yandun marco 595Marco Yandun

CFM Challenge - Course ProjectKhalilBergaoui

Performance Comparision of Machine Learning AlgorithmsDinusha Dilanka

Automobile Insurance Claim Fraud Detection using Random Forest and ADASYNIRJET Journal

IRJET- Titanic Survival Analysis using Logistic RegressionIRJET Journal

IRJET- Ad-Click Prediction using Prediction Algorithm: Machine Learning ApproachIRJET Journal

Supervised Learning.pdfgadissaassefa

ECN 425 Introduction to Econometrics Alvin Murphy .docxtidwellveronique

V. pacáková, d. breberalogyalaa

Human_Activity_Recognition_Predictive_ModelDavid Ritchie

Chapter 18,19heba_ahmad

Telecom customer churn predictionSaleesh Satheeshchandran

Telecom Churn AnalysisVasudev pendyala

Creating an Explainable Machine Learning AlgorithmBill Fite

Explainable Machine LearningBill Fite

Similar a dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf (20)

Regression kriging

Credit Card Fraud Detection Using Machine Learning & Data Science

report

Two methods for optimising cognitive model parameters

Hidalgo jairo, yandun marco 595

CFM Challenge - Course Project

Performance Comparision of Machine Learning Algorithms

Automobile Insurance Claim Fraud Detection using Random Forest and ADASYN

IRJET- Titanic Survival Analysis using Logistic Regression

IRJET- Ad-Click Prediction using Prediction Algorithm: Machine Learning Approach

Supervised Learning.pdf

ECN 425 Introduction to Econometrics Alvin Murphy .docx

V. pacáková, d. brebera

Human_Activity_Recognition_Predictive_Model

Chapter 18,19

Telecom customer churn prediction

Telecom Churn Analysis

Creating an Explainable Machine Learning Algorithm

Explainable Machine Learning

Más de Phile3CGreeneg

cyber security help matching Match the phases of the cyber kill chain.pdfPhile3CGreeneg

Cyanide is a rapid and lethal poison that can lead to death as fast as.pdfPhile3CGreeneg

cyber security question Questimn-y ierisisinetion Cnviaz Cevirta Crivi.pdfPhile3CGreeneg

Cyanide poisons the electron transport chain complex IV- If cyanide is (1).pdfPhile3CGreeneg

DDOS - SPOOFING - SNIFFING Select the correct match among the list- Ut.pdfPhile3CGreeneg

Debermine the requied valye of the missing srebabilty to make Question.pdfPhile3CGreeneg

David is testing the hypothesis that people from Minnesota have a life.pdfPhile3CGreeneg

Dave Fletcher was able to determine the activity times for constructin (1).pdfPhile3CGreeneg

Database Access with JDBC When a result set is created- its record poi.pdfPhile3CGreeneg

Data visualization makes data easier to understand and reveals pattern.pdfPhile3CGreeneg

Data that are usually continuous measurements are called-A- Attributes (1).pdfPhile3CGreeneg

Data Structures Explain all the double linked list coding processes at.pdfPhile3CGreeneg

Current Attempt in Progress The trial balance columns of the worksheet.pdfPhile3CGreeneg

Data Handing Worksheet I Exercise- Gene X is successfully cloned into (1).pdfPhile3CGreeneg

Data are important for making decisions and developing your plan- Sea.pdfPhile3CGreeneg

Data For (G+) Bands.pdfPhile3CGreeneg

Data has to meet certain conditions-constraints-requirements in order.pdfPhile3CGreeneg

Data refers to a- information that have a particular meaning within a.pdfPhile3CGreeneg

Data Cleaning and Manipulation (10 points) There are five errors in th.pdfPhile3CGreeneg

Data concerring Bocivell Einterprises Corporation's singte product app.pdfPhile3CGreeneg

Más de Phile3CGreeneg (20)

cyber security help matching Match the phases of the cyber kill chain.pdf

Cyanide is a rapid and lethal poison that can lead to death as fast as.pdf

cyber security question Questimn-y ierisisinetion Cnviaz Cevirta Crivi.pdf

Cyanide poisons the electron transport chain complex IV- If cyanide is (1).pdf

DDOS - SPOOFING - SNIFFING Select the correct match among the list- Ut.pdf

Debermine the requied valye of the missing srebabilty to make Question.pdf

David is testing the hypothesis that people from Minnesota have a life.pdf

Dave Fletcher was able to determine the activity times for constructin (1).pdf

Database Access with JDBC When a result set is created- its record poi.pdf

Data visualization makes data easier to understand and reveals pattern.pdf

Data that are usually continuous measurements are called-A- Attributes (1).pdf

Data Structures Explain all the double linked list coding processes at.pdf

Current Attempt in Progress The trial balance columns of the worksheet.pdf

Data Handing Worksheet I Exercise- Gene X is successfully cloned into (1).pdf

Data are important for making decisions and developing your plan- Sea.pdf

Data For (G+) Bands.pdf

Data has to meet certain conditions-constraints-requirements in order.pdf

Data refers to a- information that have a particular meaning within a.pdf

Data Cleaning and Manipulation (10 points) There are five errors in th.pdf

Data concerring Bocivell Einterprises Corporation's singte product app.pdf

Último

UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi

Application orientated numerical on hev.pptRamjanShidvankar

NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba

Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva

How to Add New Custom Addons Path in Odoo 17Celine George

Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh

ICT role in 21st century education and it's challenges.MaryamAhmad92

Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva

Salient Features of India constitution especially power and functionsKarakKing

This PowerPoint helps students to consider the concept of infinity.christianmathematics

HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxmarlenawright1

Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic

Sociology 101 Demonstration of Learning Exhibitjbellavia9

Holdier Curriculum Vitae (April 2024).pdfagholdier

How to Create and Manage Wizard in Odoo 17Celine George

Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith

2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade

How to Manage Global Discount in Odoo 17 POSCeline George

Food safety_Challenges food safety laboratories_.pdfSherif Taha

Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid

dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf

1. dataset: https://docs.google.com/spreadsheets/d/1t_rnuYco8eGSVJ3DnrGsmP0A8VvC7L4m/edit?usp= sharing&ouid=113562796292872953384&rtpof=true&sd=true please use python code to solve the problem. In insurance ratemaking, the term Pure Premium is defined as the claim amount divided by the Exposure duration. Nowadays, actuaries assume a Tweedie distribution for the claim amount. Using the claim_history.xIsx, we will train a Tweedie regression model to study how policy attributes affect the claim amount. The model has the following specifications. - Response Variable: CLM_AMT - Distribution: Tweedie - Link Function: Natural logarithm - Offset Variable: Natural logarithm of EXPOSURE - Categorical Predictors: CAR_TYPE, CAR_USE, EDUCATION, GENDER, MSTATUS, PARENT1, RED_CAR, REVOKED, and URBANICITY. Reorder the categories of each predictor in ascending order of the number of observations. - Interval Predictors: AGE, BLUEBOOK, CAR_AGE, HOME_VAL, HOMEKIDS, INCOME, YOJ, KIDSDRIV, MVR_PTS, TIF, and TRAVTIME. Please divide BLUEBOOK, HOME_VAL, and INCOME by 1000 before training the model. - The model always includes the Intercept term. We will first drop all missing values casewisely in all the predictors, the target variable, and the offset variable. Then, we will only use complete observations for training our models. a) (10 points). We will first estimate the Tweedie distribution's Power parameter p and Scale parameter . To this end, we will calculate the sample means and the sample variances of the claim amount for each value combination of the predictors. Then, we will train a linear regression model to help us estimate the two parameters. What are their values? Please provide us with your appropriate chart. b) (10 points). We will use the Forward Selection method to enter predictors into our model. Our entry threshold is 0.05 . Please provide a summary report of the Forward Selection in a table. The report should include (1) the Step Number, (2) the Predictor Entered, (3) the Model Degree of Freedom (i.e., the number of non-aliased parameters), (4) the Quasi-Loglikelihood value, (5) the Deviance Chi- squares statistic between the current and the previous models, (6) the corresponding Deviance Degree of Freedom, and (7) the corresponding Chi-square significance. c) (10 points). We will calculate the Root Mean Squared Error, the Relative Error, the Pearson correlation, and the Distance correlation between the observed and the predicted claim amounts of your final model. Please comment on their values. d) (10 points). Please show a table of the complete set of parameters of your final model (including the aliased parameters). Besides the parameter estimates, please also include the standard errors, the 95% asymptotic confidence intervals, and the exponentiated parameter estimates. Conventionally, aliased parameters have zero standard errors and confidence intervals. Please also provide us with the final estimate of the Tweedie distribution's scale parameter . e) (10 points). Please generate a Two-way Lift chart for comparing your final model with the Interceptonly model. Based on the chart, what will you conclude about your final model?

dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf

Recomendados

Recomendados

Más contenido relacionado

Similar a dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf

Similar a dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf (20)

Más de Phile3CGreeneg

Más de Phile3CGreeneg (20)

Último

Último (20)

dataset- https---docs-google-com-spreadsheets-d-1t_rnuYco8eGSVJ3DnrGsm.pdf