Estimating Time to Default
using Survival Analysis tools


  Problems arising due to insufficient
  samples and censoring




           Estimating Time to Default using Survival Analysis   1
Contents

• Purpose of the presentation
• State the problem
• Formalize the problem and the model
• Data used and method applied
• Model assessment
• Results
• Conclusions


Purpose of the presentation

• Adapt the author’s paper “Survival prediction
  using gene expression data: a review and
  comparison” to a company default setting
• Establish the framework in which company
  defaults can be viewed as a Survival Analysis
  problem
• Define the data generated and the model
  applied
• Present the findings and the conclusions

Formalizing the task 1
The problem
• Assume that there is a way to measure many
  features of few companies (questionnaire,
  qualitative research)
• This results in large quantities of data, but only
  a few independent samples
• It would be useful to know which features are
  relevant, so that statistical methods can be applied
• Standard statistical methods cannot be used, since
  there are far more features than samples
• Observations might be censored, i.e. the
  company does not default while observed

Formalizing the task 2
Requirements of a solution
• Apply a method which can incorporate the
  censoring of the data
• It also has to be able to reduce the number of
  predictors efficiently
• It should be qualitatively well posed, i.e.
  characterizing the relevant features
• It should be efficient in both time and
  computing power


Formalizing the task 3
Definitions I
• A company is in default if it fails to meet its
  obligations
• A merger or acquisition does not qualify as a default
• An event occurs if the company defaults or if it
  leaves the scope of observation for any other reason
  (including the end of observation)
• Time observed is understood as the time
  between the beginning of the observation and
  the occurrence of an event

Formalizing the task 4
Definitions II
• An event is censored if it is not a default
• An indicator shows whether an event is
  censored or not (True/False)
• For every sample, there is an observation of
  (censored) time to default
• On every sample (i.e. company) the same
  features are measured (predictors)
• A model defines a connection between the
  predictors and the observations
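
The definitions above can be sketched as a small data structure. This is only an illustration: the names `Observation`, `observe`, `default_time` and `exit_time` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    time: float      # time from the start of observation to the event
    censored: bool   # True if the event was not a default


def observe(default_time: Optional[float], exit_time: float) -> Observation:
    """Build one (possibly censored) time-to-default observation.

    default_time: time at which the company defaults, or None if it never does.
    exit_time:    time at which it leaves scope for any other reason,
                  e.g. a merger or the end of the observation period.
    """
    if default_time is not None and default_time <= exit_time:
        # The default happened while the company was still observed.
        return Observation(time=default_time, censored=False)
    # Some other event came first, so the time to default is only
    # known to be larger than exit_time.
    return Observation(time=exit_time, censored=True)
```

For example, `observe(2.5, 5.0)` gives an uncensored observation at time 2.5, while `observe(None, 5.0)` gives a censored one at time 5.0.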
Data used

• Since no real-life data is available, this
  presentation is based on a simulation
• The simulation assumes 500 features
  measured on 50 companies
• Only 50 features are relevant predictors, i.e. the
  simulated time to default is dependent only on
  50 features
• 1/3 of the observed times are censored
• The simulation has been run 1000 times
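
A minimal sketch of one simulation run with these dimensions could look as follows. The slides do not state the distributions used, so the standard normal features, the exponential time to default, the common weight of 0.3 and the uniform censoring mechanism are all assumptions made for illustration. The sketch also does not model the extra-noise feature group that appears in the results.

```python
import numpy as np


def simulate(n_companies=50, n_features=500, n_relevant=50,
             censored_share=1 / 3, seed=0):
    """Simulate one data set: features, observed times, censoring flags."""
    rng = np.random.default_rng(seed)

    # Assumption: features are independent standard normal draws.
    X = rng.normal(size=(n_companies, n_features))

    # Assumption: only the first n_relevant features drive the hazard,
    # all with the same (arbitrary) weight.
    beta = np.zeros(n_features)
    beta[:n_relevant] = 0.3

    # Assumption: exponential time to default with hazard exp(X @ beta).
    default_time = rng.exponential(scale=np.exp(-X @ beta))

    # Censor a randomly chosen share of companies: their observation
    # stops at a uniformly drawn fraction of the true time to default.
    n_censored = round(censored_share * n_companies)
    censored = np.zeros(n_companies, dtype=bool)
    censored[rng.choice(n_companies, size=n_censored, replace=False)] = True
    fraction = rng.uniform(size=n_companies)
    time = np.where(censored, default_time * fraction, default_time)

    return X, time, censored
```

Repeating the run with 1000 different seeds would mirror the setup described above.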

Method applied

• The method is called Supervised Principal
  Component Analysis
• It was developed by Bair et al. for similar
  setups
• It has the advantage of first finding the relevant
  predictors (thus the name supervised) and then
  building quick-to-use predictors from them
  (principal component analysis)
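
The two steps can be sketched in a few lines of NumPy. This is a simplified illustration and not the code behind the slides: it assumes the univariate Cox score statistic as the measure of relevance, assumes there are no tied times, and the function names and the `threshold` parameter are hypothetical.

```python
import numpy as np


def cox_scores(X, time, event):
    """Univariate Cox score statistic (at beta = 0) for every column of X.

    event[i] is True if sample i is an observed default (not censored).
    Assumes no tied times.
    """
    order = np.argsort(-time)            # longest observed time first
    Xs, ev = X[order], event[order]
    # With this ordering, the risk set of row k is rows 0..k.
    n_at_risk = np.arange(1, len(time) + 1)[:, None]
    risk_mean = np.cumsum(Xs, axis=0) / n_at_risk
    risk_var = np.cumsum(Xs ** 2, axis=0) / n_at_risk - risk_mean ** 2
    # Only observed defaults contribute to the score and its variance.
    U = ((Xs - risk_mean) * ev[:, None]).sum(axis=0)
    V = (risk_var * ev[:, None]).sum(axis=0)
    return U / np.sqrt(V)


def supervised_pc(X, time, event, threshold):
    """Select features by |Cox score|, then return the first principal
    component of the selected features."""
    scores = cox_scores(X, time, event)
    selected = np.flatnonzero(np.abs(scores) > threshold)
    if selected.size == 0:
        raise ValueError("no feature passes the threshold")
    Xc = X[:, selected] - X[:, selected].mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return selected, U[:, 0] * s[0]
```

The returned component is a single derived predictor per company, which can then be used in a survival model in place of the original features. In practice the threshold would be chosen by cross-validation or a similar procedure.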



Model assessment

• Assessing the model is based on a measure of
  success in estimation
• The most straightforward measures are
  applied:
  •   What share of the relevant and of the irrelevant
      predictors was classified as relevant
  •   p-value: the probability of achieving such success
      by chance
• The results are shown in the following slides
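
Both measures are simple to compute once the selections of all runs are stored. The sketch below assumes a boolean matrix with one row per run and one column per feature, and it assumes the p-value comes from a normal approximation to a score statistic `z`. The slides do not say which test was used, and all names here are hypothetical.

```python
import math

import numpy as np


def selection_frequency(selected):
    """Share of runs in which each feature was selected.

    selected: boolean array of shape (n_runs, n_features).
    """
    return selected.mean(axis=0)


def selection_rates(selected, relevant):
    """Average selection frequency of the relevant and of the
    irrelevant features (relevant: boolean array, one entry per feature)."""
    freq = selection_frequency(selected)
    return freq[relevant].mean(), freq[~relevant].mean()


def two_sided_p_value(z):
    """Two-sided p-value of a statistic that is standard normal
    under the null hypothesis (an assumption of this sketch)."""
    return math.erfc(abs(z) / math.sqrt(2))
```

A feature could then be declared relevant when its selection frequency exceeds a chosen level, for example `selection_frequency(selected) > 0.9`.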

Selection of each feature (% over the 1000 runs)

[Figure not reproduced. Labelled groups: relevant features; irrelevant features with extra noise; irrelevant features with no extra noise. Marked levels: 90%, 30%, 15%.]

Histogram of relevant features selected




Histogram of irrelevant features selected




P-values of the principal component constructed




Conclusions

• The method found the relevant features in most
  cases
• With a good threshold (appearance in over 90% of
  the runs), the relevant features can be found
• The p-values indicate high predictive power
• The estimates can be used for evaluation
• Very noisy features can mislead the method
• Human involvement in the evaluation cannot be
  omitted
