SlideShare a Scribd company logo
1 of 34
Download to read offline
Experiences with using
        R in credit risk
                   Hong Ooi
Introduction

• A bit about myself:
    • Did an actuarial degree, found it terribly boring and switched to
      statistics after graduating
    • Worked for several years as a general insurance risk modeller
    • Switched to banking for a change of scenery (at the top of the boom)
    • Now with ANZ Bank in Melbourne, Australia

• Contents:
   • LGD haircut modelling
   • Through-the-cycle calibration
   • Stress testing simulation app
   • SAS and R
   • Closing comments




                                     Page 2
Mortgage haircut model

• When a mortgage defaults, the bank can take possession of the property
  and sell it to recoup the loss1
                      p
• We have some idea of the market value of the property
• Actual sale price tends to be lower on average than the market value (the
  haircut)2
• If sale price > exposure at default, we don’t make a loss (excess is passed
       l    i                 t d f lt    d ’t    k    l    (       i       d
  on to customer); otherwise, we make a loss

  Expected loss = P(default) x EAD x P(possess) x exp.shortfall
    p              (       )          (p      )     p




Notes:
1. For ANZ, <10% of defaults actually result in possession
2. Meaning of “haircut” depends on context; very different when talking
   about, say, US mortgages


                                    Page 3
Sale price distribution
                          p


                                                  Valuation




Expected shortfall




                                     Haircut




                                     Exposure
                                     at default




                                $

                            Page 4
Stat modelling

• Modelling part is in finding parameters for the sale price distribution
    • Assumed distributional shape, eg Gaussian
    • Mean haircut relates average sale price to valuation
    • Spread (volatility) of sale price around haircut
• Once model is found, calculating expected shortfall is just (complicated)
  arithmetic




                                     Page 5
Valuation at origination                                       Valuation at kerbside                                            Valuation after possession
                 15




                                                                                15




                                                                                                                                                 15
                 14




                                                                                14




                                                                                                                                                 14
                 13
                  3




                                                                                13
                                                                                 3




                                                                                                                                                 13
                                                                                                                                                  3
log sale price




                                                               log sale price




                                                                                                                                log sale price
                 12




                                                                                12




                                                                                                                                                 12
                 11




                                                                                11




                                                                                                                                                 11
                 10




                                                                                10




                                                                                                                                                 10
                 9




                                                                                9




                                                                                                                                                 9
                      11        12           13      14   15                         10   11      12             13   14   15                         10        11      12           13   14   15

                                     log valuation                                               log valuation                                                       log valuation




                                                                                                Page 6
Volatility

Volatility of haircut (=sale price/valuation) appears to vary systematically:


                Property type                         SD(haircut)*

                A                                         11.6%

                B                                         9.3%
                                                          9 3%

                C                                         31.2%



                 State/territory                      SD(haircut)*
                 1                                        NA

                 2                                       13.3%

                 3                                        7.7%

                 4                                        9.2%

                 5                                       15.6%

                 6                                       18.4%

                 7                                       14.8%

                 * valued after possession




                                             Page 7
Volatility modelling

• Use regression to estimate haircut as a function of loan characteristics
• Use log-linear structure for volatility to ensure +ve variance estimates
      log linear
• Data is more heavy-tailed than Gaussian (even after deleting outliers), so
  extend to use t distribution
   • df acts as a shape parameter, controls how much influence outlying
     observations h
       b      ti    have

• Constant volatility/Gaussian error model is ordinary linear regression
• Varying-volatility models can be fit by generalised least squares, using gls
  Varying volatility
  in the nlme package; simpler and faster to directly maximise the Gaussian
  likelihood with optim/nlminb (latter will reproduce gls fit)
• Directly maximising the likelihood easily extended to t case: just replace all
  *norm functions with *t
    norm                t




                                      Page 8
Normal model residuals




          0.7




                                                                                    6
          0.6




                                                                                    4
          0.5




                                                                Sample quantiles

                                                                                    2
          0.4
Density

          0.3




                                                                                    0
          0.2




                                                                                    -2
          0.1




                                                                                    -4
          0.0




                -4   -2         0            2    4     6                                -3   -2       -1       0        1        2   3

                              Std residual                                                         Theoretical normal quantiles


                                                 t5-model residuals
           .4
          0.




                                                                                    10
          0.3




                                                                Sampl e quantiles

                                                                                    5
Density

          0.2
          0




                                                                                    0
          0.1




                                                                                    -5
          0.0




                -5        0             5          10                                    -6   -4       -2       0        2        4   6

                              Std residual                                                            Theoretical t-quantiles

                                                            Page 9
Example impact

Property type = C, state = 7, valuation $250k, EAD $240k

                                 Gaussian model

         Mean formula       Volatility formula         Expected
                                                      shortfall ($)
         ~1
          1                 ~1
                             1                             7,610
                                                           7 610
         ~1                 ~proptype                      23,686
         ~1                 ~proptype + state              29,931


                                   t5 model

         Mean formula       Volatility formula         Expected
                                                      shortfall ($)
         ~1                 ~1                             4,493
         ~1                 ~proptype                      10,190
         ~1                 ~proptype + state               ,
                                                           5,896


                                    Page 10
Notes on model behaviour

• Why does using a heavy-tailed error distribution reduce the expected
  shortfall?
    • With normal distrib, volatility is overestimated
       → Likelihood of low sale price is also inflated
       • t distrib corrects this
    • Extreme tails of the t less important
       • At lower end, sale price cannot go below 0
       • At higher end, sale price > EAD is gravy
    • This is not a monotonic relationship! At low enough thresholds
                                                             thresholds,
      eventually heavier tail of the t will make itself felt

• In most regression situations, assuming sufficient data, distributional
  assumptions (ie, normality, homoskedasticity) are not so critical: CLT
  comes into play
• Here, they are important: changing the distributional assumptions can
  change expected shortfall by big amounts
      g                           g


                                     Page 11
In SAS

• SAS has PROC MIXED for modelling variances, but only allows one grouping
  variable and assumes a normal distribution
• PROC NLIN does general nonlinear optimisation
• Also possible in PROC IML

• None of these are as flexible or powerful as R
• The R modelling function returns an object, which can be used to generate
  predictions, compute summaries, etc
• SAS 9.2 now has PROC PLM that does something similar, but requires the
  modelling proc to execute a STORE statement first
    • Only a few procs support this currently
    • If you’re fitting a custom model (like this one), you’re out of luck




                                    Page 12
Through-the-cycle calibration

• For capital purposes, we would like an estimate of default probability that
  doesn’t depend on the current state of the economy
              p                                        y
• This is called a through-the-cycle or long-run PD
    • Contrast with a point-in-time or spot PD, which is what most models will
      give you (data is inherently point-in-time)
•EExactly what l
       tl    h t long-run means can be the subject of philosophical debate; I’ll
                                      b th      bj t f hil       hi l d b t
  define it as a customer’s average risk, given their characteristics, across the
  different economic conditions that might arise
    • This is not a lifetime estimate: eg their age/time on books doesn’t
      change
    • Which variables are considered to be cyclical can be a tricky decision to
      make (many behavioural variables eg credit card balance are probably
      correlated with the economy) y)
    • During bad economic times, the long-run PD will be below the spot, and
      vice-versa during good times
        • You don’t want to have to raise capital during a crisis



                                      Page 13
TTC approach

• Start with the spot estimate:

                              PD(x, e) = f(x, e)

• x = individual customer’s characteristics
• e = economic variables (constant for all customers at any point in time)
• Average over the possible values of e to get a TTC estimate
                   PD




                              Economic cycle




                                               Page 14
TTC approach

• This is complicated numerically, can be done in various ways eg Monte
  Carlo
• Use backcasting for simplicity: take historical values of e, substitute into
  prediction equation, average the results
    • As we are interested in means rather than quantiles, this shouldn’t
      affect accuracy much (other practical issues will have much more
      impact)
• R used to handle backcasting, combining multiple spot PDs into one output
  value




                                     Page 15
TTC calculation

• Input from spot model is a prediction equation, along with sample of
  historical economic data

spot_predict <‐ function(data)
{
  # code copied from SAS; with preprocessing, can be arbitrarily complex
  xb <‐ with(data, b0 + x1 * b1 + x2 * b2 + ... )
            (                                   )
  plogis(xb)
}

ttc_predict <‐ function(data, ecodata, from = "2000‐01‐01", to = "2010‐12‐01")
{
  dates <‐ seq(as.Date(from), as.Date(to), by = "months")
  evars <‐ names(ecodata)
  pd <‐ matrix(nrow(data), length(dates)) # not very space‐efficient!
  for(i in seq_along(dates))
  {
    data[evars] <‐ subset(ecodata, date == dates[i], evars)
    pd[, i] <‐ spot_predict(data)
  }
  apply(pd, 1, mean)
   pp y(p          )
}


                                                 Page 16
Binning/cohorting

• Raw TTC estimate is a combination of many spot PDs, each of which is from
  a logistic regression
      g        g
  → TTC estimate is a complicated function of customer attributes
• Need to simplify for communication, implementation purposes

• Turn into bins or cohorts based on customer attributes: estimate for each
  cohort is the average for customers within the cohort
• Take pragmatic approach to defining cohorts
    • Create tiers based on small selection of variables that will split out
      riskiest customers
    • Within each tier, create contingency table using attributes deemed most
      interesting/important to the business
    • Number of cohorts limited by need for simplicity/manageability, <1000
                                             simplicity/manageability
      desirable
    • Not a data-driven approach, although selection of variables informed by
      data exploration/analysis


                                    Page 17
SAS

                                      Model specification code                 Portfolio data         Economic data



R
                                            Empirical logits
      TTC averaging function                                                   Portfolio data         Economic data
                                             PIT predictor




                                                                                   PIT PD




       Cohorting functions             Cohorting specification                     TTC PD




      Empty cell imputation                Initial cohort table




                                           Final cohort table                                   Cohorted TTC PD



SAS

                     Initialisation code                          Cohorting code




                                                                   Page 18
Binning/cohorting

          Example from nameless portfolio:


                        Raw TTC PD                                                            Cohorted TTC PD
                        Distribution of ILS long-run PD                                        Distribution of cohorted ILS long-run PD
          0.4




                                                                                0.4
          0.3




                                                                                0.3
   sity




                                                                         sity
          0.2
            2
Dens




                                                                      Dens

                                                                                0.2
          0.1




                                                                                0.1
          0.0




                                                                                0.0




                0.01%    0.1%                1%           10%                         0.01%          0.1%                1%               10%

                                      PD                                                                         PD




                                                                Page 19
Binning input

 varlist <‐ list(
     low_doc2=list(name="low_doc",
                   breaks=c("N", "Y"),
                   midp=c("N", "Y"),
                    id   ("N" "Y")
                   na.val="N"),
     enq     =list(name="wrst_nbr_enq",
                   breaks=c(‐Inf, 0, 5, 15, Inf),
                   midp=c(0, 3, 10, 25),
                   na.val=0),
     lvr     =list(name="new lvr basel",
     lvr      list(name new_lvr_basel ,
                   breaks=c(‐Inf, 60, 70, 80, 90, Inf),
                   midp=c(50, 60, 70, 80, 95),
                   na.val=70),
     ...




                                                      low_doc wrst_nbr_enq new_lvr_basel ... tier1 tier2
                   by application of               1        N            0            50 ...     1     1
                  expand.grid() and                2        N            0            60 ...     1     2
                                                   3        N            0            70 ...     1     3
                       friends...
                       friends                     4        N            0            80 ...     1     4
                                                   5        N            0            95 ...     1     5
                                                   6        N            3            50 ...     1     6
                                                   7        N            3            60 ...     1     7
                                                   8        N            3            70 ...     1     8
                                                   9        N            3            80 ...     1     9
                                                   10       N            3            95 ...     1    10
                                                   10       N            3            95         1    10



                                                    Page 20
Binning output

if low_doc = ' ' then low_doc2 = 1;
else if low_doc = 'Y' then low_doc2 = 1;
else low_doc2 = 2;

if wrst_nbr_enq = . then enq = 0;
else if wrst_nbr_enq <= 0 then enq = 0;
else if wrst_nbr_enq <= 5 then enq = 3;
else if wrst_nbr_enq <= 15 then enq = 10;
else enq = 25;

if new_lvr_basel = . then lvr = 70;
else if new_lvr_basel <= 60 then lvr = 50;
else if new_lvr_basel <= 70 then lvr = 60;
else if new_lvr_basel <= 80 then lvr = 70;
else if new_lvr_basel <= 90 then lvr = 80;
else lvr = 95;
...




if lvr = 50 and enq = 0 and low_doc2 = 'N' then do; tier2 = 1; ttc_pd = ________; end;
else if lvr = 60 and enq = 0 and low_doc2 = 'N' then do; tier2 = 2; ttc_pd = ________; end;
else if lvr = 70 and enq = 0 and low_doc2 = 'N' then do; tier2 = 3; ttc_pd = ________; end;
 l   if l          d           d l   d      ' ' h    d    i              d               d
else if lvr = 80 and enq = 0 and low_doc2 = 'N' then do; tier2 = 4; ttc_pd = ________; end;
else if lvr = 95 and enq = 0 and low_doc2 = 'N' then do; tier2 = 5; ttc_pd = ________; end;
else if lvr = 50 and enq = 3 and low_doc2 = 'N' then do; tier2 = 6; ttc_pd = ________; end;
else if lvr = 60 and enq = 3 and low_doc2 = 'N' then do; tier2 = 7; ttc_pd = ________; end;
...




                                                    Page 21
Stress testing simulation

• Banks run stress tests on their loan portfolios, to see what a downturn
  would do to their financial health
• Mathematical framework is similar to the “Vasicek model”:
    • Represent the economy by a parameter X
    • Each loan has a transition matrix that is shifted based on X, determining
      its i k
      it risk grade in year t given its grade i year t - 1
                 d i           i    it     d in
        • Defaults if bottom grade reached
• Take a scenario/simulation-based approach: set X to a stressed value, run
  N times, take the average
          ,                g
    • Contrast to VaR: “average result for a stressed economy”, as opposed to
      “stressed result for an average economy”
• Example data: portfolio of 100,000 commercial loans along with current risk
  grade,
  grade split by subportfolio
• Simulation horizon: ~3 years




                                     Page 22
Application outline

• Front end in Excel (because the business world lives in Excel)
    • Calls SAS to setup datasets
       • Calls R to do the actual computations

• Previous version was an ad-hoc script written entirely in SAS, took ~4
  hours to run, often crashed due to lack of disk space
    • Series of DATA steps (disk-bound)
    • Transition matrices represented by unrolled if-then-else statements
      (25x25 matrix becomes 625 lines of code)
• Reduced to 2 minutes with R, 1 megabyte of code cut to 10k
    • No rocket science involved: simply due to using a better tool
    • Similar times achievable with PROC IML, of which more later




                                     Page 23
Application outline

• For each subportfolio and year, get the median result and store it
• Next year s simulation uses this year s median portfolio
       year’s                      year’s
• To avoid having to store multiple transited copies of the portfolio, we
  manipulate random seeds

for(p in 1:nPortfolios) # varies by project
f ( i 1 P tf li ) #          i   b     j t
{
  for(y in 1:nYears)    # usually 2‐5
  {
    seed <‐ .GlobalEnv$.Random.seed
                      $
    for(i in 1:nIters)  # around 1,000, but could be less
      result[i] <‐ summary(doTransit(portfolio[p, y], T[p, y]))

        med <‐ which(result == median(result))
        portfolio[p, y + 1] <‐ doTransit(portfolio[p, y], T[p, y], seed, med)
    }
}




                                                 Page 24
Data structures

• For business reasons, we want to split the simulation by subportfolio
• And also present results for each year separately
    → Naturally have 2-dimensional (matrix) structure for output: [i, j]th
     entry is the result for the ith subportfolio, jth year

• But desired output for each [i, j] might be a bunch of summary statistics,
  diagnostics, etc
    → Output needs to be a list

• Similarly, we have a separate input transition matrix for each subportfolio
  and year
    → Input should be a matrix of matrices




                                     Page 25
Data structures

R allows matrices whose elements are lists:

T <‐ matrix(list(), nPortfolios, nYears)
for(i in 1:nPortfolios) for(j in 1:nYears)
    T[[i, j]] <‐ getMatrix(i, j, ...)

M <‐ matrix(list(), nPortfolios, nYears)
M[[i, j]]$result <‐ doTransit(i, j, ...)
M[[i, j]]$sumstat <‐ summary(M[[i, j]]$result) 


• Better than the alternatives:
   • List of lists loses indexing ability
   • Separate matrices for each output stat is clumsy
   • Multi-way arrays conflate data and metadata




                                      Page 26
PROC IML: a gateway to R

• As of SAS 9.2, you can use IML to execute R code, and transfer datasets to
  and from R:

   PROC IML;
     call ExportDataSetToR('portfol', 'portfol');  /* creates a data frame */
     call ExportMatrixToR("&Rfuncs", 'rfuncs');
     call ExportMatrixToR("&Rscript", 'rscript');
     call ExportMatrixToR("&nIters", 'nIters');
     ...
     submit /R;
       source(rfuncs)
       source(rscript)
     endsubmit;
     call ImportDataSetFromR('result', 'result');
   QUIT;


• No messing around with import/export via CSV, transport files, etc
• Half the code in an earlier version was for import/export



                                           Page 27
IML: a side-rant

• IML lacks:
    • Logical vectors: everything has to be numeric or character
    • Support for zero-length vectors (you don’t realise how useful they are
      until they’re gone)
    • Unoriented vectors: everything is either a row or column vector
      (technically,
      (t h i ll everything i a matrix)
                         thi  is     t i )
• So something like x = x + y[z < 0]; fails in three ways

• IML also lacks anything like a dataset/data frame: everything is a matrix
    → It’s easier to transfer a SAS dataset to and from R, than IML

• Everything is a matrix: no lists, let alone lists of lists, or matrices of lists
   • Not even multi-way arrays

• Which puts the occasional online grouching about R into perspective



                                        Page 28
Other SAS/R interfaces

• SAS has a proprietary dataset format (or, many proprietary dataset
  formats) )
    • R’s foreign package includes read.ssd and read.xport for importing,
      and write.foreign(*, package="SAS") for exporting
    • Package Hmisc has sas.get
    • Package sas7bdat has an experimental reader for this format
                    7bd t
    • Revolution R can read SAS datasets
    • All have glitches, are not widely available, or not fully functional
        • First 2 also need SAS installed
• SAS 9.2 and IML make these issues moot
    • You just have to pay for it
    • Caveat: only works with R <= 2.11.1 (2.12 changed the locations of
      binaries)
        • SAS 9.3 will support R 2.12+




                                     Page 29
R and SAS rundown

• Advantages of R
   • Free! (base distribution, anyway)
   • Very powerful statistical programming environment: SAS takes 3
     languages to do what R does with 1
   • Flexible and extensible
   • Lots of features (if you can find them)
       • User-contributed packages are a blessing and a curse
   • Ability to handle large datasets is improving
• Advantages of SAS
   • Pervasive presence in large firms
       • “Nobody got fired for buying IBM SAS”
       • Compatibility with existing processes/metadata
              p      y              gp
       • Long-term support
   • Tremendous data processing/data warehousing capability
   • Lots of features (if you can afford them)
   • Sometimes cleaner than R, especially f d
                   l       h              ll for data manipulation
                                                            l

                                   Page 30
R and SAS rundown

Example: get weighted summary statistics by groups

   proc summary data = indat nway;
     class a b;
     var x y;
     weight w;
     output sum(w)=sumwt mean(x)=xmean mean(y)=ymean var(y)=yvar
       out = outdat;
   run;



   outdat <‐ local({
       res <‐ t(sapply(split(indat, indat[c("a", "b")]), function(subset) {
           c(xmean = weighted.mean(subset$x, subset$w),
             ymean = weighted.mean(subset$y, subset$w),
             yvar = cov.wt(subset["y"], subset$w)$cov)
       }))
       levs <‐ aggregate(w ~ a + b, data=indat, sum)
       cbind(levs, as.data.frame(res))                                    Thank god for plyr
   })


                                           Page 31
Challenges for deploying R

• Quality assurance, or perception thereof
    • Core is excellent, but much of R s attraction is in extensibility, ie
                                       R’s
      contributed packages
    • Can I be sure that the package I just downloaded does what it says? Is
      it doing more than it says?
    •B k
      Backward compatibility
                 d       tibilit
         • Telling people to use the latest version is not always helpful
• No single point of contact
    • Who do I yell at if things go wrong?
    • How can I be sure everyone is using the same version?
• Unix roots make package development clunky on Windows
    • Process is more fragile because it assumes Unix conventions
    • Why must I download a set of third-party tools to compile code?
    • Difficult to integrate with Visual C++/Visual Studio
• Interfacing with languages other than Fortran, C, C++ not (yet) integrated
  into core

                                    Page 32
Commercial R: an aside

• Many of these issues are fundamental in nature
• Third parties like Revolution Analytics can address them, without having to
  dilute R’s focus (I am not a Revo R user)
    • Other R commercialisations existed but seem to have disappeared
    • Anyone remember S-Plus?
    • Challenge is not to negate R’s drawcards
        • Low cost: important even for big companies
        • Community and ecosystem
            • Can I use Revo R with that package I got from CRAN?
            • If I use Revo R, can I/should I participate on R-Help,
              StackOverflow.com?
• Also why SAS, SPSS, etc can include interfaces to R without risking too
  much




                                     Page 33
Good problems to have

• Sign of R’s movement into the mainstream
• S-Plus now touts R compatibility, rather than R touting S compatibility
  S Plus
• Nobody cares that SHAZAM, GLIM, XLisp-Stat etc don’t support XML, C# or
  Java




                                  Page 34

More Related Content

What's hot

R in finance: Introduction to R and Its Applications in Finance
R in finance: Introduction to R and Its Applications in FinanceR in finance: Introduction to R and Its Applications in Finance
R in finance: Introduction to R and Its Applications in FinanceLiang C. Zhang (張良丞)
 
Chapter 16
Chapter 16Chapter 16
Chapter 16bmcfad01
 
Top 8 Different Types Of Charts In Statistics And Their Uses
Top 8 Different Types Of Charts In Statistics And Their UsesTop 8 Different Types Of Charts In Statistics And Their Uses
Top 8 Different Types Of Charts In Statistics And Their UsesStat Analytica
 
An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)Dataspora
 
Governance, Risk & Compliance Management Solution
Governance, Risk & Compliance Management SolutionGovernance, Risk & Compliance Management Solution
Governance, Risk & Compliance Management SolutionRishabh Software
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionsaba khan
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programmingizahn
 
R basics
R basicsR basics
R basicsFAO
 
Bayesian statistics
Bayesian statisticsBayesian statistics
Bayesian statisticsSagar Kamble
 
Data Analytics Training Course
Data Analytics Training CourseData Analytics Training Course
Data Analytics Training CourseDINESHKUMARANAPPONIX
 
Linear regression theory
Linear regression theoryLinear regression theory
Linear regression theorySaurav Mukherjee
 
Montgomery
Montgomery Montgomery
Montgomery bazz8
 
Introduction to data analysis using R
Introduction to data analysis using RIntroduction to data analysis using R
Introduction to data analysis using RVictoria LĂłpez
 
Managing Risk in a digital world: successfully enabling the quest for new rev...
Managing Risk in a digital world: successfully enabling the quest for new rev...Managing Risk in a digital world: successfully enabling the quest for new rev...
Managing Risk in a digital world: successfully enabling the quest for new rev...accenture
 
EY - Remaking risk management in banking
EY - Remaking risk management in bankingEY - Remaking risk management in banking
EY - Remaking risk management in bankingEY
 
Chapter 10
Chapter 10Chapter 10
Chapter 10guest3720ca
 

What's hot (20)

R in finance: Introduction to R and Its Applications in Finance
R in finance: Introduction to R and Its Applications in FinanceR in finance: Introduction to R and Its Applications in Finance
R in finance: Introduction to R and Its Applications in Finance
 
Chapter 16
Chapter 16Chapter 16
Chapter 16
 
Analytics 2
Analytics 2Analytics 2
Analytics 2
 
Top 8 Different Types Of Charts In Statistics And Their Uses
Top 8 Different Types Of Charts In Statistics And Their UsesTop 8 Different Types Of Charts In Statistics And Their Uses
Top 8 Different Types Of Charts In Statistics And Their Uses
 
An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)An Interactive Introduction To R (Programming Language For Statistics)
An Interactive Introduction To R (Programming Language For Statistics)
 
Governance, Risk & Compliance Management Solution
Governance, Risk & Compliance Management SolutionGovernance, Risk & Compliance Management Solution
Governance, Risk & Compliance Management Solution
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
R basics
R basicsR basics
R basics
 
Bayesian statistics
Bayesian statisticsBayesian statistics
Bayesian statistics
 
Data Analytics Training Course
Data Analytics Training CourseData Analytics Training Course
Data Analytics Training Course
 
Linear regression theory
Linear regression theoryLinear regression theory
Linear regression theory
 
Compliance Risk Assessment
Compliance Risk AssessmentCompliance Risk Assessment
Compliance Risk Assessment
 
Montgomery
Montgomery Montgomery
Montgomery
 
Introduction to data analysis using R
Introduction to data analysis using RIntroduction to data analysis using R
Introduction to data analysis using R
 
Managing Risk in a digital world: successfully enabling the quest for new rev...
Managing Risk in a digital world: successfully enabling the quest for new rev...Managing Risk in a digital world: successfully enabling the quest for new rev...
Managing Risk in a digital world: successfully enabling the quest for new rev...
 
EY - Remaking risk management in banking
EY - Remaking risk management in bankingEY - Remaking risk management in banking
EY - Remaking risk management in banking
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Central tedancy & correlation project - 1
Central tedancy & correlation project - 1Central tedancy & correlation project - 1
Central tedancy & correlation project - 1
 
IBM Planning Analytics
IBM Planning AnalyticsIBM Planning Analytics
IBM Planning Analytics
 

Viewers also liked

Using R for Analyzing Loans, Portfolios and Risk: From Academic Theory to Fi...
Using R for Analyzing Loans, Portfolios and Risk:  From Academic Theory to Fi...Using R for Analyzing Loans, Portfolios and Risk:  From Academic Theory to Fi...
Using R for Analyzing Loans, Portfolios and Risk: From Academic Theory to Fi...Revolution Analytics
 
Using R in Finance
Using R in FinanceUsing R in Finance
Using R in FinanceCeShine Lee
 
Backtesting Trading Strategies with R
Backtesting Trading Strategies with RBacktesting Trading Strategies with R
Backtesting Trading Strategies with Reraviv
 
Business Finance KPI's - tutorial
Business Finance KPI's -  tutorialBusiness Finance KPI's -  tutorial
Business Finance KPI's - tutorialSmeebi
 
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...Zurich_R_User_Group
 
Pairs Trading - Hedge Fund Strategies
Pairs Trading - Hedge Fund StrategiesPairs Trading - Hedge Fund Strategies
Pairs Trading - Hedge Fund StrategiesHedge Fund South Africa
 
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...Johann Schleier-Smith
 
Cross sell - concept and related analytics
Cross sell - concept and related analyticsCross sell - concept and related analytics
Cross sell - concept and related analyticsGaurav Sharma
 
Introduction to pairtrading
Introduction to pairtradingIntroduction to pairtrading
Introduction to pairtradingKohta Ishikawa
 
Analytics cross-selling-retail-banking
Analytics cross-selling-retail-bankingAnalytics cross-selling-retail-banking
Analytics cross-selling-retail-bankingSantosh Tiwari
 
Pairs Trading from NYC Algorithmic Trading Meetup November '13
Pairs Trading from NYC Algorithmic Trading Meetup November '13Pairs Trading from NYC Algorithmic Trading Meetup November '13
Pairs Trading from NYC Algorithmic Trading Meetup November '13Quantopian
 
Cross Selling Opportunities In Banking Industry
Cross Selling Opportunities In Banking IndustryCross Selling Opportunities In Banking Industry
Cross Selling Opportunities In Banking Industrytutkuozmen
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source CommunitiesRevolution Analytics
 
DIY Quant Strategies on Quantopian
DIY Quant Strategies on QuantopianDIY Quant Strategies on Quantopian
DIY Quant Strategies on QuantopianJess Stauth
 

Viewers also liked (20)

Using R for Analyzing Loans, Portfolios and Risk: From Academic Theory to Fi...
Using R for Analyzing Loans, Portfolios and Risk:  From Academic Theory to Fi...Using R for Analyzing Loans, Portfolios and Risk:  From Academic Theory to Fi...
Using R for Analyzing Loans, Portfolios and Risk: From Academic Theory to Fi...
 
Using R in Finance
Using R in FinanceUsing R in Finance
Using R in Finance
 
Backtesting Trading Strategies with R
Backtesting Trading Strategies with RBacktesting Trading Strategies with R
Backtesting Trading Strategies with R
 
Business Finance KPI's - tutorial
Business Finance KPI's -  tutorialBusiness Finance KPI's -  tutorial
Business Finance KPI's - tutorial
 
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
How to use R in different professions: R In Finance (Speaker: Gabriel Foix, M...
 
Demystifying PD Terminologies
Demystifying PD TerminologiesDemystifying PD Terminologies
Demystifying PD Terminologies
 
Pairs Trading - Hedge Fund Strategies
Pairs Trading - Hedge Fund StrategiesPairs Trading - Hedge Fund Strategies
Pairs Trading - Hedge Fund Strategies
 
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...
ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent P...
 
Pairs trading using R
Pairs trading using R Pairs trading using R
Pairs trading using R
 
The New Rules of Cross-Selling
The New Rules of Cross-SellingThe New Rules of Cross-Selling
The New Rules of Cross-Selling
 
Cross sell - concept and related analytics
Cross sell - concept and related analyticsCross sell - concept and related analytics
Cross sell - concept and related analytics
 
Introduction to pairtrading
Introduction to pairtradingIntroduction to pairtrading
Introduction to pairtrading
 
Machine Learning, Stock Market and Chaos
Machine Learning, Stock Market and Chaos Machine Learning, Stock Market and Chaos
Machine Learning, Stock Market and Chaos
 
Analytics cross-selling-retail-banking
Analytics cross-selling-retail-bankingAnalytics cross-selling-retail-banking
Analytics cross-selling-retail-banking
 
Prezi v sway
Prezi v swayPrezi v sway
Prezi v sway
 
Pairs Trading from NYC Algorithmic Trading Meetup November '13
Pairs Trading from NYC Algorithmic Trading Meetup November '13Pairs Trading from NYC Algorithmic Trading Meetup November '13
Pairs Trading from NYC Algorithmic Trading Meetup November '13
 
Cross Selling Opportunities In Banking Industry
Cross Selling Opportunities In Banking IndustryCross Selling Opportunities In Banking Industry
Cross Selling Opportunities In Banking Industry
 
The Value of Open Source Communities
The Value of Open Source CommunitiesThe Value of Open Source Communities
The Value of Open Source Communities
 
DIY Quant Strategies on Quantopian
DIY Quant Strategies on QuantopianDIY Quant Strategies on Quantopian
DIY Quant Strategies on Quantopian
 
Algorithmic Trading
Algorithmic TradingAlgorithmic Trading
Algorithmic Trading
 

More from Revolution Analytics

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudRevolution Analytics
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureRevolution Analytics
 
The case for R for AI developers
The case for R for AI developersThe case for R for AI developers
The case for R for AI developersRevolution Analytics
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudRevolution Analytics
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondRevolution Analytics
 
Reproducible Data Science with R
Reproducible Data Science with RReproducible Data Science with R
Reproducible Data Science with RRevolution Analytics
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with RRevolution Analytics
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceRevolution Analytics
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudRevolution Analytics
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorRevolution Analytics
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalRevolution Analytics
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint packageRevolution Analytics
 
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15Revolution Analytics
 

More from Revolution Analytics (20)

Speeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the CloudSpeeding up R with Parallel Programming in the Cloud
Speeding up R with Parallel Programming in the Cloud
 
Migrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to AzureMigrating Existing Open Source Machine Learning to Azure
Migrating Existing Open Source Machine Learning to Azure
 
R in Minecraft
R in Minecraft R in Minecraft
R in Minecraft
 
The case for R for AI developers
The case for R for AI developersThe case for R for AI developers
The case for R for AI developers
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R Then and Now
R Then and NowR Then and Now
R Then and Now
 
Predicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per SecondPredicting Loan Delinquency at One Million Transactions per Second
Predicting Loan Delinquency at One Million Transactions per Second
 
Reproducible Data Science with R
Reproducible Data Science with RReproducible Data Science with R
Reproducible Data Science with R
 
The R Ecosystem
The R EcosystemThe R Ecosystem
The R Ecosystem
 
R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)R at Microsoft (useR! 2016)
R at Microsoft (useR! 2016)
 
Building a scalable data science platform with R
Building a scalable data science platform with RBuilding a scalable data science platform with R
Building a scalable data science platform with R
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
The Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data ScienceThe Business Economics and Opportunity of Open Source Data Science
The Business Economics and Opportunity of Open Source Data Science
 
Taking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the CloudTaking R Analytics to SQL and the Cloud
Taking R Analytics to SQL and the Cloud
 
The Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductorThe Network structure of R packages on CRAN & BioConductor
The Network structure of R packages on CRAN & BioConductor
 
The network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 finalThe network structure of cran 2015 07-02 final
The network structure of cran 2015 07-02 final
 
Simple Reproducibility with the checkpoint package
Simple Reproducibilitywith the checkpoint packageSimple Reproducibilitywith the checkpoint package
Simple Reproducibility with the checkpoint package
 
R at Microsoft
R at MicrosoftR at Microsoft
R at Microsoft
 
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
Revolution R Enterprise 7.4 - Presentation by Bill Jacobs 11Jun15
 

Recently uploaded

Stock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfStock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfMichael Silva
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfGale Pooley
 
20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdfAdnet Communications
 
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...Call Girls in Nagpur High Profile
 
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...priyasharma62062
 
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...dipikadinghjn ( Why You Choose Us? ) Escorts
 
The Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfThe Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfGale Pooley
 
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...ssifa0344
 
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbai
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbaiVasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbai
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbaipriyasharma62062
 
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
The Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfThe Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfGale Pooley
 
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdf
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdfShrambal_Distributors_Newsletter_Apr-2024 (1).pdf
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdfvikashdidwania1
 
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...Call Girls in Nagpur High Profile
 
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...dipikadinghjn ( Why You Choose Us? ) Escorts
 

Recently uploaded (20)

From Luxury Escort Service Kamathipura : 9352852248 Make on-demand Arrangemen...
From Luxury Escort Service Kamathipura : 9352852248 Make on-demand Arrangemen...From Luxury Escort Service Kamathipura : 9352852248 Make on-demand Arrangemen...
From Luxury Escort Service Kamathipura : 9352852248 Make on-demand Arrangemen...
 
Stock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdfStock Market Brief Deck (Under Pressure).pdf
Stock Market Brief Deck (Under Pressure).pdf
 
The Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdfThe Economic History of the U.S. Lecture 19.pdf
The Economic History of the U.S. Lecture 19.pdf
 
(Vedika) Low Rate Call Girls in Pune Call Now 8250077686 Pune Escorts 24x7
(Vedika) Low Rate Call Girls in Pune Call Now 8250077686 Pune Escorts 24x7(Vedika) Low Rate Call Girls in Pune Call Now 8250077686 Pune Escorts 24x7
(Vedika) Low Rate Call Girls in Pune Call Now 8250077686 Pune Escorts 24x7
 
Call Girls in New Ashok Nagar, (delhi) call me [9953056974] escort service 24X7
Call Girls in New Ashok Nagar, (delhi) call me [9953056974] escort service 24X7Call Girls in New Ashok Nagar, (delhi) call me [9953056974] escort service 24X7
Call Girls in New Ashok Nagar, (delhi) call me [9953056974] escort service 24X7
 
20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf20240429 Calibre April 2024 Investor Presentation.pdf
20240429 Calibre April 2024 Investor Presentation.pdf
 
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...Top Rated  Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
Top Rated Pune Call Girls Sinhagad Road ⟟ 6297143586 ⟟ Call Me For Genuine S...
 
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...
Mira Road Awesome 100% Independent Call Girls NUmber-9833754194-Dahisar Inter...
 
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Koregaon Park Call Me 7737669865 Budget Friendly No Advance Booking
 
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Shivane  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Shivane 6297143586 Call Hot Indian Gi...
 
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
VIP Call Girl in Mumbai 💧 9920725232 ( Call Me ) Get A New Crush Everyday Wit...
 
The Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdfThe Economic History of the U.S. Lecture 30.pdf
The Economic History of the U.S. Lecture 30.pdf
 
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
(INDIRA) Call Girl Mumbai Call Now 8250077686 Mumbai Escorts 24x7
 
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
TEST BANK For Corporate Finance, 13th Edition By Stephen Ross, Randolph Weste...
 
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbai
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbaiVasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbai
Vasai-Virar Fantastic Call Girls-9833754194-Call Girls MUmbai
 
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Wadgaon Sheri  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Wadgaon Sheri 6297143586 Call Hot Ind...
 
The Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdfThe Economic History of the U.S. Lecture 21.pdf
The Economic History of the U.S. Lecture 21.pdf
 
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdf
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdfShrambal_Distributors_Newsletter_Apr-2024 (1).pdf
Shrambal_Distributors_Newsletter_Apr-2024 (1).pdf
 
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...Top Rated  Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
Top Rated Pune Call Girls Viman Nagar ⟟ 6297143586 ⟟ Call Me For Genuine Sex...
 
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...
VIP Call Girl Service Andheri West ⚡ 9920725232 What It Takes To Be The Best ...
 

Successful Uses of R in Banking

  • 1. Experiences with using R in credit risk Hong Ooi
  • 2. Introduction • A bit about myself: • Did an actuarial degree, found it terribly boring and switched to statistics after graduating • Worked for several years as a general insurance risk modeller • Switched to banking for a change of scenery (at the top of the boom) • Now with ANZ Bank in Melbourne, Australia • Contents: • LGD haircut modelling • Through-the-cycle calibration • Stress testing simulation app • SAS and R • Closing comments Page 2
  • 3. Mortgage haircut model • When a mortgage defaults, the bank can take possession of the property and sell it to recoup the loss1 p • We have some idea of the market value of the property • Actual sale price tends to be lower on average than the market value (the haircut)2 • If sale price > exposure at default, we don’t make a loss (excess is passed l i t d f lt d ’t k l ( i d on to customer); otherwise, we make a loss Expected loss = P(default) x EAD x P(possess) x exp.shortfall p ( ) (p ) p Notes: 1. For ANZ, <10% of defaults actually result in possession 2. Meaning of “haircut” depends on context; very different when talking about, say, US mortgages Page 3
  • 4. Sale price distribution p Valuation Expected shortfall Haircut Exposure at default $ Page 4
  • 5. Stat modelling • Modelling part is in finding parameters for the sale price distribution • Assumed distributional shape, eg Gaussian • Mean haircut relates average sale price to valuation • Spread (volatility) of sale price around haircut • Once model is found, calculating expected shortfall is just (complicated) arithmetic Page 5
  • 6. Valuation at origination Valuation at kerbside Valuation after possession 15 15 15 14 14 14 13 3 13 3 13 3 log sale price log sale price log sale price 12 12 12 11 11 11 10 10 10 9 9 9 11 12 13 14 15 10 11 12 13 14 15 10 11 12 13 14 15 log valuation log valuation log valuation Page 6
  • 7. Volatility Volatility of haircut (=sale price/valuation) appears to vary systematically: Property type SD(haircut)* A 11.6% B 9.3% 9 3% C 31.2% State/territory SD(haircut)* 1 NA 2 13.3% 3 7.7% 4 9.2% 5 15.6% 6 18.4% 7 14.8% * valued after possession Page 7
  • 8. Volatility modelling • Use regression to estimate haircut as a function of loan characteristics • Use log-linear structure for volatility to ensure +ve variance estimates log linear • Data is more heavy-tailed than Gaussian (even after deleting outliers), so extend to use t distribution • df acts as a shape parameter, controls how much influence outlying observations h b ti have • Constant volatility/Gaussian error model is ordinary linear regression • Varying-volatility models can be fit by generalised least squares, using gls Varying volatility in the nlme package; simpler and faster to directly maximise the Gaussian likelihood with optim/nlminb (latter will reproduce gls fit) • Directly maximising the likelihood easily extended to t case: just replace all *norm functions with *t norm t Page 8
  • 9. Normal model residuals 0.7 6 0.6 4 0.5 Sample quantiles 2 0.4 Density 0.3 0 0.2 -2 0.1 -4 0.0 -4 -2 0 2 4 6 -3 -2 -1 0 1 2 3 Std residual Theoretical normal quantiles t5-model residuals .4 0. 10 0.3 Sampl e quantiles 5 Density 0.2 0 0 0.1 -5 0.0 -5 0 5 10 -6 -4 -2 0 2 4 6 Std residual Theoretical t-quantiles Page 9
  • 10. Example impact Property type = C, state = 7, valuation $250k, EAD $240k Gaussian model Mean formula Volatility formula Expected shortfall ($) ~1 1 ~1 1 7,610 7 610 ~1 ~proptype 23,686 ~1 ~proptype + state 29,931 t5 model Mean formula Volatility formula Expected shortfall ($) ~1 ~1 4,493 ~1 ~proptype 10,190 ~1 ~proptype + state , 5,896 Page 10
  • 11. Notes on model behaviour • Why does using a heavy-tailed error distribution reduce the expected shortfall? • With normal distrib, volatility is overestimated → Likelihood of low sale price is also inflated • t distrib corrects this • Extreme tails of the t less important • At lower end, sale price cannot go below 0 • At higher end, sale price > EAD is gravy • This is not a monotonic relationship! At low enough thresholds thresholds, eventually heavier tail of the t will make itself felt • In most regression situations, assuming sufficient data, distributional assumptions (ie, normality, homoskedasticity) are not so critical: CLT comes into play • Here, they are important: changing the distributional assumptions can change expected shortfall by big amounts g g Page 11
  • 12. In SAS • SAS has PROC MIXED for modelling variances, but only allows one grouping variable and assumes a normal distribution • PROC NLIN does general nonlinear optimisation • Also possible in PROC IML • None of these are as flexible or powerful as R • The R modelling function returns an object, which can be used to generate predictions, compute summaries, etc • SAS 9.2 now has PROC PLM that does something similar, but requires the modelling proc to execute a STORE statement first • Only a few procs support this currently • If you’re fitting a custom model (like this one), you’re out of luck Page 12
  • 13. Through-the-cycle calibration • For capital purposes, we would like an estimate of default probability that doesn’t depend on the current state of the economy p y • This is called a through-the-cycle or long-run PD • Contrast with a point-in-time or spot PD, which is what most models will give you (data is inherently point-in-time) •EExactly what l tl h t long-run means can be the subject of philosophical debate; I’ll b th bj t f hil hi l d b t define it as a customer’s average risk, given their characteristics, across the different economic conditions that might arise • This is not a lifetime estimate: eg their age/time on books doesn’t change • Which variables are considered to be cyclical can be a tricky decision to make (many behavioural variables eg credit card balance are probably correlated with the economy) y) • During bad economic times, the long-run PD will be below the spot, and vice-versa during good times • You don’t want to have to raise capital during a crisis Page 13
  • 14. TTC approach • Start with the spot estimate: PD(x, e) = f(x, e) • x = individual customer’s characteristics • e = economic variables (constant for all customers at any point in time) • Average over the possible values of e to get a TTC estimate PD Economic cycle Page 14
  • 15. TTC approach • This is complicated numerically, can be done in various ways eg Monte Carlo • Use backcasting for simplicity: take historical values of e, substitute into prediction equation, average the results • As we are interested in means rather than quantiles, this shouldn’t affect accuracy much (other practical issues will have much more impact) • R used to handle backcasting, combining multiple spot PDs into one output value Page 15
  • 16. TTC calculation • Input from spot model is a prediction equation, along with sample of historical economic data spot_predict <‐ function(data) { # code copied from SAS; with preprocessing, can be arbitrarily complex xb <‐ with(data, b0 + x1 * b1 + x2 * b2 + ... ) ( ) plogis(xb) } ttc_predict <‐ function(data, ecodata, from = "2000‐01‐01", to = "2010‐12‐01") { dates <‐ seq(as.Date(from), as.Date(to), by = "months") evars <‐ names(ecodata) pd <‐ matrix(nrow(data), length(dates)) # not very space‐efficient! for(i in seq_along(dates)) { data[evars] <‐ subset(ecodata, date == dates[i], evars) pd[, i] <‐ spot_predict(data) } apply(pd, 1, mean) pp y(p ) } Page 16
  • 17. Binning/cohorting • Raw TTC estimate is a combination of many spot PDs, each of which is from a logistic regression g g → TTC estimate is a complicated function of customer attributes • Need to simplify for communication, implementation purposes • Turn into bins or cohorts based on customer attributes: estimate for each cohort is the average for customers within the cohort • Take pragmatic approach to defining cohorts • Create tiers based on small selection of variables that will split out riskiest customers • Within each tier, create contingency table using attributes deemed most interesting/important to the business • Number of cohorts limited by need for simplicity/manageability, <1000 simplicity/manageability desirable • Not a data-driven approach, although selection of variables informed by data exploration/analysis Page 17
  • 18. SAS Model specification code Portfolio data Economic data R Empirical logits TTC averaging function Portfolio data Economic data PIT predictor PIT PD Cohorting functions Cohorting specification TTC PD Empty cell imputation Initial cohort table Final cohort table Cohorted TTC PD SAS Initialisation code Cohorting code Page 18
  • 19. Binning/cohorting Example from nameless portfolio: Raw TTC PD Cohorted TTC PD Distribution of ILS long-run PD Distribution of cohorted ILS long-run PD 0.4 0.4 0.3 0.3 sity sity 0.2 2 Dens Dens 0.2 0.1 0.1 0.0 0.0 0.01% 0.1% 1% 10% 0.01% 0.1% 1% 10% PD PD Page 19
  • 20. Binning input varlist <‐ list( low_doc2=list(name="low_doc", breaks=c("N", "Y"), midp=c("N", "Y"), id ("N" "Y") na.val="N"), enq     =list(name="wrst_nbr_enq", breaks=c(‐Inf, 0, 5, 15, Inf), midp=c(0, 3, 10, 25), na.val=0), lvr     =list(name="new lvr basel", lvr list(name new_lvr_basel , breaks=c(‐Inf, 60, 70, 80, 90, Inf), midp=c(50, 60, 70, 80, 95), na.val=70), ... low_doc wrst_nbr_enq new_lvr_basel ... tier1 tier2 by application of 1        N            0            50 ...     1     1 expand.grid() and 2        N            0            60 ...     1     2 3        N            0            70 ...     1     3 friends... friends 4        N            0            80 ...     1     4 5        N            0            95 ...     1     5 6        N            3            50 ...     1     6 7        N            3            60 ...     1     7 8        N            3            70 ...     1     8 9        N            3            80 ...     1     9 10       N            3            95 ...     1    10 10 N 3 95 1 10 Page 20
  • 21. Binning output if low_doc = ' ' then low_doc2 = 1; else if low_doc = 'Y' then low_doc2 = 1; else low_doc2 = 2; if wrst_nbr_enq = . then enq = 0; else if wrst_nbr_enq <= 0 then enq = 0; else if wrst_nbr_enq <= 5 then enq = 3; else if wrst_nbr_enq <= 15 then enq = 10; else enq = 25; if new_lvr_basel = . then lvr = 70; else if new_lvr_basel <= 60 then lvr = 50; else if new_lvr_basel <= 70 then lvr = 60; else if new_lvr_basel <= 80 then lvr = 70; else if new_lvr_basel <= 90 then lvr = 80; else lvr = 95; ... if lvr = 50 and enq = 0 and low_doc2 = 'N' then do; tier2 = 1; ttc_pd = ________; end; else if lvr = 60 and enq = 0 and low_doc2 = 'N' then do; tier2 = 2; ttc_pd = ________; end; else if lvr = 70 and enq = 0 and low_doc2 = 'N' then do; tier2 = 3; ttc_pd = ________; end; l if l d d l d ' ' h d i d d else if lvr = 80 and enq = 0 and low_doc2 = 'N' then do; tier2 = 4; ttc_pd = ________; end; else if lvr = 95 and enq = 0 and low_doc2 = 'N' then do; tier2 = 5; ttc_pd = ________; end; else if lvr = 50 and enq = 3 and low_doc2 = 'N' then do; tier2 = 6; ttc_pd = ________; end; else if lvr = 60 and enq = 3 and low_doc2 = 'N' then do; tier2 = 7; ttc_pd = ________; end; ... Page 21
  • 22. Stress testing simulation • Banks run stress tests on their loan portfolios, to see what a downturn would do to their financial health • Mathematical framework is similar to the “Vasicek model”: • Represent the economy by a parameter X • Each loan has a transition matrix that is shifted based on X, determining its i k it risk grade in year t given its grade i year t - 1 d i i it d in • Defaults if bottom grade reached • Take a scenario/simulation-based approach: set X to a stressed value, run N times, take the average , g • Contrast to VaR: “average result for a stressed economy”, as opposed to “stressed result for an average economy” • Example data: portfolio of 100,000 commercial loans along with current risk grade, grade split by subportfolio • Simulation horizon: ~3 years Page 22
  • 23. Application outline • Front end in Excel (because the business world lives in Excel) • Calls SAS to setup datasets • Calls R to do the actual computations • Previous version was an ad-hoc script written entirely in SAS, took ~4 hours to run, often crashed due to lack of disk space • Series of DATA steps (disk-bound) • Transition matrices represented by unrolled if-then-else statements (25x25 matrix becomes 625 lines of code) • Reduced to 2 minutes with R, 1 megabyte of code cut to 10k • No rocket science involved: simply due to using a better tool • Similar times achievable with PROC IML, of which more later Page 23
  • 24. Application outline • For each subportfolio and year, get the median result and store it • Next year s simulation uses this year s median portfolio year’s year’s • To avoid having to store multiple transited copies of the portfolio, we manipulate random seeds for(p in 1:nPortfolios) # varies by project f ( i 1 P tf li ) # i b j t { for(y in 1:nYears)    # usually 2‐5 { seed <‐ .GlobalEnv$.Random.seed $ for(i in 1:nIters)  # around 1,000, but could be less result[i] <‐ summary(doTransit(portfolio[p, y], T[p, y])) med <‐ which(result == median(result)) portfolio[p, y + 1] <‐ doTransit(portfolio[p, y], T[p, y], seed, med) } } Page 24
  • 25. Data structures • For business reasons, we want to split the simulation by subportfolio • And also present results for each year separately → Naturally have 2-dimensional (matrix) structure for output: [i, j]th entry is the result for the ith subportfolio, jth year • But desired output for each [i, j] might be a bunch of summary statistics, diagnostics, etc → Output needs to be a list • Similarly, we have a separate input transition matrix for each subportfolio and year → Input should be a matrix of matrices Page 25
  • 26. Data structures R allows matrices whose elements are lists: T <‐ matrix(list(), nPortfolios, nYears) for(i in 1:nPortfolios) for(j in 1:nYears) T[[i, j]] <‐ getMatrix(i, j, ...) M <‐ matrix(list(), nPortfolios, nYears) M[[i, j]]$result <‐ doTransit(i, j, ...) M[[i, j]]$sumstat <‐ summary(M[[i, j]]$result)  • Better than the alternatives: • List of lists loses indexing ability • Separate matrices for each output stat is clumsy • Multi-way arrays conflate data and metadata Page 26
  • 27. PROC IML: a gateway to R • As of SAS 9.2, you can use IML to execute R code, and transfer datasets to and from R: PROC IML; call ExportDataSetToR('portfol', 'portfol');  /* creates a data frame */ call ExportMatrixToR("&Rfuncs", 'rfuncs'); call ExportMatrixToR("&Rscript", 'rscript'); call ExportMatrixToR("&nIters", 'nIters'); ... submit /R; source(rfuncs) source(rscript) endsubmit; call ImportDataSetFromR('result', 'result'); QUIT; • No messing around with import/export via CSV, transport files, etc • Half the code in an earlier version was for import/export Page 27
  • 28. IML: a side-rant • IML lacks: • Logical vectors: everything has to be numeric or character • Support for zero-length vectors (you don’t realise how useful they are until they’re gone) • Unoriented vectors: everything is either a row or column vector (technically, (t h i ll everything i a matrix) thi is t i ) • So something like x = x + y[z < 0]; fails in three ways • IML also lacks anything like a dataset/data frame: everything is a matrix → It’s easier to transfer a SAS dataset to and from R, than IML • Everything is a matrix: no lists, let alone lists of lists, or matrices of lists • Not even multi-way arrays • Which puts the occasional online grouching about R into perspective Page 28
  • 29. Other SAS/R interfaces • SAS has a proprietary dataset format (or, many proprietary dataset formats) ) • R’s foreign package includes read.ssd and read.xport for importing, and write.foreign(*, package="SAS") for exporting • Package Hmisc has sas.get • Package sas7bdat has an experimental reader for this format 7bd t • Revolution R can read SAS datasets • All have glitches, are not widely available, or not fully functional • First 2 also need SAS installed • SAS 9.2 and IML make these issues moot • You just have to pay for it • Caveat: only works with R <= 2.11.1 (2.12 changed the locations of binaries) • SAS 9.3 will support R 2.12+ Page 29
  • 30. R and SAS rundown • Advantages of R • Free! (base distribution, anyway) • Very powerful statistical programming environment: SAS takes 3 languages to do what R does with 1 • Flexible and extensible • Lots of features (if you can find them) • User-contributed packages are a blessing and a curse • Ability to handle large datasets is improving • Advantages of SAS • Pervasive presence in large firms • “Nobody got fired for buying IBM SAS” • Compatibility with existing processes/metadata p y gp • Long-term support • Tremendous data processing/data warehousing capability • Lots of features (if you can afford them) • Sometimes cleaner than R, especially f d l h ll for data manipulation l Page 30
  • 31. R and SAS rundown Example: get weighted summary statistics by groups proc summary data = indat nway; class a b; var x y; weight w; output sum(w)=sumwt mean(x)=xmean mean(y)=ymean var(y)=yvar out = outdat; run; outdat <‐ local({ res <‐ t(sapply(split(indat, indat[c("a", "b")]), function(subset) { c(xmean = weighted.mean(subset$x, subset$w), ymean = weighted.mean(subset$y, subset$w), yvar = cov.wt(subset["y"], subset$w)$cov) })) levs <‐ aggregate(w ~ a + b, data=indat, sum) cbind(levs, as.data.frame(res)) Thank god for plyr }) Page 31
  • 32. Challenges for deploying R • Quality assurance, or perception thereof • Core is excellent, but much of R s attraction is in extensibility, ie R’s contributed packages • Can I be sure that the package I just downloaded does what it says? Is it doing more than it says? •B k Backward compatibility d tibilit • Telling people to use the latest version is not always helpful • No single point of contact • Who do I yell at if things go wrong? • How can I be sure everyone is using the same version? • Unix roots make package development clunky on Windows • Process is more fragile because it assumes Unix conventions • Why must I download a set of third-party tools to compile code? • Difficult to integrate with Visual C++/Visual Studio • Interfacing with languages other than Fortran, C, C++ not (yet) integrated into core Page 32
  • 33. Commercial R: an aside • Many of these issues are fundamental in nature • Third parties like Revolution Analytics can address them, without having to dilute R’s focus (I am not a Revo R user) • Other R commercialisations existed but seem to have disappeared • Anyone remember S-Plus? • Challenge is not to negate R’s drawcards • Low cost: important even for big companies • Community and ecosystem • Can I use Revo R with that package I got from CRAN? • If I use Revo R, can I/should I participate on R-Help, StackOverflow.com? • Also why SAS, SPSS, etc can include interfaces to R without risking too much Page 33
  • 34. Good problems to have • Sign of R’s movement into the mainstream • S-Plus now touts R compatibility, rather than R touting S compatibility S Plus • Nobody cares that SHAZAM, GLIM, XLisp-Stat etc don’t support XML, C# or Java Page 34