## Archive for **April 2013**

## A SAS Macro Calculating PDO

In the development of credit scorecards, the model developer usually scales the predicted probability of default / delinquency into a range of discrete score points for the purpose of operational convenience. While there are multiple ways to perform scaling, the most popular one in the credit risk arena is to scale the predicted probability logarithmically such that the odds, i.e. the ratio of goods to bads, doubles (or halves) with every increase (or decrease) of a fixed number of score points, known in the industry as PDO (points to double the odds).
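The logarithmic scaling described above can be written as score = offset + factor * ln(odds), where factor = PDO / ln(2). A minimal Python sketch of the arithmetic (the reference point of 680 points at 20:1 odds with PDO = 45 is chosen for illustration; it matches the targets used in the macro calls later in this post):

```python
import math

def scaling(ref_score, ref_odds, pdo):
    """Return (offset, factor) such that score = offset + factor * ln(odds)."""
    factor = pdo / math.log(2)          # points added per doubling of the odds
    offset = ref_score - factor * math.log(ref_odds)
    return offset, factor

def score_from_odds(odds, offset, factor):
    return offset + factor * math.log(odds)

offset, factor = scaling(ref_score=680, ref_odds=20, pdo=45)
print(round(score_from_odds(20, offset, factor)))   # 680 at the reference odds
print(round(score_from_odds(40, offset, factor)))   # 725: odds doubled, +45 points
```

Doubling the odds from any starting point adds exactly one PDO (45 points here); halving subtracts one.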

In practice, the PDO observed over time carries strong implications for credit policies and strategies and is often used as an effectiveness measure for a scorecard. Chances are that a scorecard could still maintain satisfactory predictive power, e.g. in terms of the K-S statistic or AUC, but might lose its effectiveness, e.g. through an increase in PDO. For instance, if a credit strategy was originally designed to take effect at credit score = 700, it might require credit score = 720 later after an increase in PDO, significantly discounting the effectiveness of such a strategy. As a result, it is critically important to monitor the PDO of a scorecard in the production environment.

Below is a SAS macro to calculate the PDO of a scorecard.

```sas
options nocenter nonumber nodate mprint mlogic symbolgen orientation = landscape
        ls = 125 formchar = "|----|+|---+=|-/\<>*";

%macro get_pdo(data = , score = , y = , wt = NONE, ref_score = , target_odds = , target_pdo = );
**********************************************************************;
* THIS MACRO IS TO CALCULATE OBSERVED ODDS AND PDO FOR ANY SCORECARD *;
* AND COMPUTE ALIGNMENT BETAS TO REACH TARGET ODDS AND PDO           *;
* ------------------------------------------------------------------ *;
* PARAMETERS:                                                        *;
*  DATA       : INPUT DATASET                                        *;
*  SCORE      : SCORECARD VARIABLE                                   *;
*  Y          : RESPONSE VARIABLE IN (0, 1)                          *;
*  WT         : WEIGHT VARIABLE IN POSITIVE INTEGER                  *;
*  REF_SCORE  : REFERENCE SCORE POINT FOR TARGET ODDS AND PDO        *;
*  TARGET_ODDS: TARGET ODDS AT REFERENCE SCORE OF SCORECARD          *;
*  TARGET_PDO : TARGET POINTS TO DOUBLE ODDS OF SCORECARD            *;
* ------------------------------------------------------------------ *;
* OUTPUTS:                                                           *;
*  REPORT : PDO REPORT WITH THE CALIBRATION FORMULA IN HTML FORMAT   *;
* ------------------------------------------------------------------ *;
* AUTHOR: WENSUI.LIU@53.COM                                          *;
**********************************************************************;

options nonumber nodate orientation = landscape nocenter;

*** CHECK IF THERE IS WEIGHT VARIABLE ***;
%if %upcase(&wt) = NONE %then %do;
  data _tmp1 (keep = &y &score _wt);
    set &data;
    where &y in (1, 0) and &score ~= .;
    _wt = 1;
    &score = round(&score., 1);
  run;
%end;
%else %do;
  data _tmp1 (keep = &y &score _wt);
    set &data;
    where &y in (1, 0) and &score ~= . and round(&wt., 1) > 0;
    _wt = round(&wt., 1);
    &score = round(&score., 1);
  run;
%end;

proc logistic data = _tmp1 desc outest = _est1 noprint;
  model &y = &score;
  freq _wt;
run;

proc sql noprint;
  select round(min(&score), 0.01) into :min_score from _tmp1;
  select round(max(&score), 0.01) into :max_score from _tmp1;
quit;

data _est2;
  set _est1 (keep = intercept &score rename = (&score = slope));
  adjust_beta0 = &ref_score - (&target_pdo * log(&target_odds) / log(2)) - intercept * &target_pdo / log(2);
  adjust_beta1 = -1 * (&target_pdo * slope / log(2));
  do i = -5 to 5;
    old_pdo = round(-log(2) / slope, 0.01);
    old_ref = &ref_score + (i) * old_pdo;
    old_odd = exp(-(slope * old_ref + intercept));
    if old_ref >= &min_score and old_ref <= &max_score then output;
  end;
run;

data _tmp2;
  set _tmp1;
  if _n_ = 1 then do;
    set _est2(obs = 1);
  end;
  adjusted = adjust_beta0 + adjust_beta1 * &score;
run;

proc logistic data = _tmp2 desc noprint outest = _est3;
  model &y = adjusted;
  freq _wt;
run;

data _est4;
  set _est3 (keep = intercept adjusted rename = (adjusted = slope));
  adjust_beta0 = &ref_score - (&target_pdo * log(&target_odds) / log(2)) - intercept * &target_pdo / log(2);
  adjust_beta1 = -1 * (&target_pdo * slope / log(2));
  do i = -5 to 5;
    new_pdo = round(-log(2) / slope, 0.01);
    new_ref = &ref_score + (i) * new_pdo;
    new_odd = exp(-(slope * new_ref + intercept));
    if new_ref >= &min_score and new_ref <= &max_score then output;
  end;
run;

proc sql noprint;
  create table _final as
  select &target_pdo    as target_pdo,
         &target_odds   as target_odds,
         a.old_pdo      as pdo1,
         a.old_ref      as ref1,
         a.old_odd      as odd1,
         log(a.old_odd) as ln_odd1,
         a.adjust_beta0 as adjust_beta0,
         a.adjust_beta1 as adjust_beta1,
         b.new_pdo      as pdo2,
         b.new_ref      as ref2,
         b.new_odd      as odd2,
         log(b.new_odd) as ln_odd2
  from _est2 as a inner join _est4 as b on a.i = b.i;

  select round(pdo1, 1) into :pdo1 from _final;
  select put(max(pdo1 / pdo2 - 1, 0), percent10.2) into :compare from _final;
  select case when pdo1 > pdo2 then 1 else 0 end into :flag from _final;
  select put(adjust_beta0, 12.8) into :beta0 from _final;
  select put(adjust_beta1, 12.8) into :beta1 from _final;
quit;
%put &compare;

ods html file = "%upcase(%trim(&score))_PDO_SUMMARY.html" style = sasweb;
title;
proc report data = _final box spacing = 1 split = "/"
  style(header) = [font_face = "courier new"]
  style(column) = [font_face = "courier new"]
  style(lines)  = [font_face = "courier new" font_size = 2]
  style(report) = [font_face = "courier new"];
  column("/SUMMARY OF POINTS TO DOUBLE ODDS FOR %upcase(&score) WEIGHTED BY %upcase(&wt) IN DATA %upcase(&data)
          /( TARGET PDO = &target_pdo, TARGET ODDS = &target_odds AT REFERENCE SCORE &ref_score ) / "
         pdo1 ref1 odd1 ln_odd1 pdo2 ref2 odd2 ln_odd2);
  define pdo1    / "OBSERVED/SCORE PDO"  width = 10 format = 4.   center;
  define ref1    / "OBSERVED/REF. SCORE" width = 15 format = 5.   center order order = data;
  define odd1    / "OBSERVED/ODDS"       width = 15 format = 14.4 center;
  define ln_odd1 / "OBSERVED/LOG ODDS"   width = 15 format = 8.2  center;
  define pdo2    / "ADJUSTED/SCORE PDO"  width = 10 format = 4.   center;
  define ref2    / "ADJUSTED/REF. SCORE" width = 15 format = 5.   center;
  define odd2    / "ADJUSTED/ODDS"       width = 15 format = 14.4 center;
  define ln_odd2 / "ADJUSTED/LOG ODDS"   width = 15 format = 8.2  center;
  compute after;
  %if &flag = 1 %then %do;
    line @15 "THE SCORE ODDS IS DETERIORATED BY %trim(&compare).";
    line @15 "CALIBRATION FORMULA: ADJUSTED SCORE = %trim(&beta0) + %trim(&beta1) * %trim(%upcase(&score)).";
  %end;
  %else %do;
    line @25 "THERE IS NO DETERIORATION IN THE SCORE ODDS.";
  %end;
  endcomp;
run;
ods html close;

*************************************************;
*              END OF THE MACRO                 *;
*************************************************;
%mend get_pdo;
```
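The alignment betas computed in the macro's _est2 step follow directly from the fitted logit. With the goods/bads odds given by O(s) = exp(-(intercept + slope * s)), the adjusted score beta0 + beta1 * s works out to ref_score + PDO * log2(O / target_odds), so it hits the target odds exactly at the reference score and doubles the odds every PDO points. A quick numerical check of this identity in Python (the intercept and slope values below are made up for illustration, standing in for a logistic fit of the bad indicator on the score):

```python
import math

ln2 = math.log(2)
ref_score, target_odds, target_pdo = 680, 20, 45
intercept, slope = 9.0, -0.015   # hypothetical fit: logit(bad) = intercept + slope * score

# alignment betas, exactly as computed in the macro's _est2 step
adjust_beta0 = (ref_score - target_pdo * math.log(target_odds) / ln2
                - intercept * target_pdo / ln2)
adjust_beta1 = -target_pdo * slope / ln2

def score_at_odds(odds):
    """Raw score where the fitted goods/bads odds equal `odds`."""
    return -(math.log(odds) + intercept) / slope

def adjusted(s):
    return adjust_beta0 + adjust_beta1 * s

print(round(adjusted(score_at_odds(20)), 6))   # 680.0: target odds land on the reference score
print(round(adjusted(score_at_odds(40)), 6))   # 725.0: doubling the odds adds exactly one PDO
```

The identity holds for any intercept and slope, which is why a second logistic fit on the adjusted score (the macro's _est3 / _est4 step) recovers the target PDO.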

In the following example, I called the macro to calculate the PDO of bureau_score in a test dataset. The parameter inputs "ref_score", "target_odds", and "target_pdo" could be acquired from the development or benchmark sample. As shown in the output, the observed PDO is higher than the target PDO, implying a 2.93% deterioration in the score effectiveness. In addition to the PDO calculation, the SAS macro also generates a calibration formula to align the bureau_score back to the target PDO.

```sas
data tmp1;
  set data.accepts;
  where bureau_score ~= .;
run;

%get_pdo(data = tmp1, score = bureau_score, y = bad, wt = weight,
         ref_score = 680, target_odds = 20, target_pdo = 45);
```

```
 SUMMARY OF POINTS TO DOUBLE ODDS FOR BUREAU_SCORE WEIGHTED BY WEIGHT IN DATA TMP1
 ( TARGET PDO = 45, TARGET ODDS = 20 AT REFERENCE SCORE 680 )

 OBSERVED    OBSERVED      OBSERVED    OBSERVED    ADJUSTED    ADJUSTED      ADJUSTED    ADJUSTED
 SCORE PDO   REF. SCORE    ODDS        LOG ODDS    SCORE PDO   REF. SCORE    ODDS        LOG ODDS
 ------------------------------------------------------------------------------------------------
     46         448          0.6404      -0.45        45          455          0.6250      -0.47
     46         495          1.2809       0.25        45          500          1.2500       0.22
     46         541          2.5618       0.94        45          545          2.5000       0.92
     46         587          5.1239       1.63        45          590          5.0000       1.61
     46         634         10.2483       2.33        45          635         10.0000       2.30
     46         680         20.4976       3.02        45          680         20.0000       3.00
     46         726         40.9971       3.71        45          725         40.0000       3.69
     46         773         81.9981       4.41        45          770         80.0000       4.38
     46         819        164.0038       5.10        45          815        160.0000       5.08
 ------------------------------------------------------------------------------------------------
 THE SCORE ODDS IS DETERIORATED BY 2.93%.
 CALIBRATION FORMULA: ADJUSTED SCORE = 20.92914838 + 0.97156812 * BUREAU_SCORE.
```

In the second example, I used the calibration formula above to derive adjust_score from bureau_score and then calculated the PDO again. As shown in the output, after calibration, the observed PDO of adjust_score becomes equivalent to the target PDO of bureau_score. With every 45-point increase in the score, e.g. from 680 to 725, the odds double as anticipated, e.g. from 20 to 40.

```sas
data tmp2;
  set tmp1;
  adjust_score = 20.92914838 + 0.97156812 * bureau_score;
run;

%get_pdo(data = tmp2, score = adjust_score, y = bad, wt = weight,
         ref_score = 680, target_odds = 20, target_pdo = 45);
```

```
 SUMMARY OF POINTS TO DOUBLE ODDS FOR ADJUST_SCORE WEIGHTED BY WEIGHT IN DATA TMP2
 ( TARGET PDO = 45, TARGET ODDS = 20 AT REFERENCE SCORE 680 )

 OBSERVED    OBSERVED      OBSERVED    OBSERVED    ADJUSTED    ADJUSTED      ADJUSTED    ADJUSTED
 SCORE PDO   REF. SCORE    ODDS        LOG ODDS    SCORE PDO   REF. SCORE    ODDS        LOG ODDS
 ------------------------------------------------------------------------------------------------
     45         455          0.6249      -0.47        45          455          0.6250      -0.47
     45         500          1.2498       0.22        45          500          1.2500       0.22
     45         545          2.4996       0.92        45          545          2.5000       0.92
     45         590          4.9991       1.61        45          590          5.0000       1.61
     45         635          9.9980       2.30        45          635         10.0000       2.30
     45         680         19.9956       3.00        45          680         20.0000       3.00
     45         725         39.9905       3.69        45          725         40.0000       3.69
     45         770         79.9796       4.38        45          770         80.0000       4.38
     45         815        159.9565       5.07        45          815        160.0000       5.08
 ------------------------------------------------------------------------------------------------
 THERE IS NO DETERIORATION IN THE SCORE ODDS.
```

## Disaggregating Annual Losses into Each Quarter

In loss forecasting, it is often necessary to disaggregate annual losses into quarters. The simplest way to convert a low-frequency time series to a high-frequency one is interpolation, such as the one implemented in the EXPAND procedure of SAS/ETS. In the example below, there is a series of annual loss projections from 2013 through 2016, and natural-spline interpolation is used to convert the annual losses into quarterly ones.

**SAS Code:**

```sas
data annual;
  input loss year : mmddyy8.;
  format year mmddyy8.;
datalines;
19270175 12/31/13
18043897 12/31/14
17111193 12/31/15
17011107 12/31/16
;
run;

proc expand data = annual out = quarterly from = year to = quarter;
  id year;
  convert loss / observed = total method = spline(natural);
run;

proc sql;
  select year(year) as year,
         sum(case when qtr(year) = 1 then loss else 0 end) as qtr1,
         sum(case when qtr(year) = 2 then loss else 0 end) as qtr2,
         sum(case when qtr(year) = 3 then loss else 0 end) as qtr3,
         sum(case when qtr(year) = 4 then loss else 0 end) as qtr4,
         sum(loss) as total
  from quarterly
  group by calculated year;
quit;
```

**Output:**

```
year      qtr1      qtr2      qtr3      qtr4      total
2013   4868536   4844486   4818223   4738931   19270175
2014   4560049   4535549   4510106   4438194   18043897
2015   4279674   4276480   4287373   4267666   17111193
2016   4215505   4220260   4287373   4296247   17011107
```
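The OBSERVED = TOTAL option asks PROC EXPAND for quarterly values that add back to each annual total. One standard way to mimic this behavior outside SAS is to fit a natural cubic spline through the cumulative annual losses and difference it at quarter ends; a self-contained pure-Python sketch (the exact quarterly figures may differ slightly from PROC EXPAND's internal algorithm, but the quarterly values sum to the annual totals by construction):

```python
def natural_spline(ts, ys):
    """Natural cubic spline through (ts, ys); returns an evaluator S(x)."""
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    a = [0.0] * (n + 1); b = [0.0] * (n + 1)
    c = [0.0] * (n + 1); d = [0.0] * (n + 1)
    for i in range(1, n):   # tridiagonal system for interior 2nd derivatives
        a[i] = h[i - 1]
        b[i] = 2 * (h[i - 1] + h[i])
        c[i] = h[i]
        d[i] = 6 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    m = [0.0] * (n + 1)     # natural boundary: m[0] = m[n] = 0
    for i in range(2, n):   # Thomas algorithm: forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    for i in range(n - 1, 0, -1):   # back substitution
        m[i] = (d[i] - c[i] * m[i + 1]) / b[i]
    def S(x):
        i = min(next(j for j in range(n) if x <= ts[j + 1]), n - 1)
        t0, t1, hi = ts[i], ts[i + 1], h[i]
        return (m[i] * (t1 - x) ** 3 / (6 * hi) + m[i + 1] * (x - t0) ** 3 / (6 * hi)
                + (ys[i] / hi - m[i] * hi / 6) * (t1 - x)
                + (ys[i + 1] / hi - m[i + 1] * hi / 6) * (x - t0))
    return S

annual = [19270175, 18043897, 17111193, 17011107]   # 2013-2016 annual losses
cum = [0.0]
for v in annual:
    cum.append(cum[-1] + v)                         # cumulative totals at year ends
S = natural_spline(list(range(len(cum))), cum)      # t = 0 is the end of 2012
cum_q = [S(k / 4) for k in range(1, 17)]            # cumulative at each quarter end
quarterly = [cum_q[0]] + [cum_q[k] - cum_q[k - 1] for k in range(1, 16)]
for yr in range(4):
    print(2013 + yr, [round(q) for q in quarterly[4 * yr: 4 * yr + 4]])
```

Because the spline interpolates the cumulative series exactly at year ends, the four quarterly differences within each year telescope back to that year's annual loss.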

While mathematical interpolation is easy to implement, it might be difficult to justify and interpret from a business standpoint. In reality, the loss trend is often assumed to follow the movement of the macro-economy. Therefore, it might be advantageous to disaggregate annual losses into quarterly ones with the inclusion of one or more economic indicators. This approach is implemented in the tempdisagg package of R. Below is a demo with the same loss data used above; this time, however, the disaggregation of annual losses is driven by a macro-economic indicator.

**R Code:**

```r
library(tempdisagg)

loss <- c(19270175, 18043897, 17111193, 17011107)
loss.a <- ts(loss, frequency = 1, start = 2013)

econ <- c(7.74, 7.67, 7.62, 7.48, 7.32, 7.11, 6.88, 6.63,
          6.41, 6.26, 6.12, 6.01, 5.93, 5.83, 5.72, 5.59)
econ.q <- ts(econ, frequency = 4, start = 2013)

summary(mdl <- td(loss.a ~ econ.q))
print(predict(mdl))
```

**Output:**

```
Call:
td(formula = loss.a ~ econ.q)

Residuals:
Time Series:
Start = 2013
End = 2016
Frequency = 1
[1]  199753 -234384 -199257  233888

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2416610     359064   6.730   0.0214 *
econ.q        308226      53724   5.737   0.0291 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

'chow-lin-maxlog' disaggregation with 'sum' conversion
4 low-freq. obs. converted to 16 high-freq. obs.
Adjusted R-squared: 0.9141   AR1-Parameter: 0 (truncated)

         Qtr1    Qtr2    Qtr3    Qtr4
2013  4852219 4830643 4815232 4772080
2014  4614230 4549503 4478611 4401554
2015  4342526 4296292 4253140 4219235
2016  4302864 4272041 4238136 4198067
```
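Since the AR(1) parameter was truncated to zero in the output above, chow-lin-maxlog reduces here to a regression of the annual losses on the annually aggregated indicator, with each year's residual spread equally across its four quarters (the sum-aggregation matrix C satisfies CC' = 4I, so GLS coincides with OLS and each annual residual is distributed as one quarter each). A pure-Python sketch of this special case:

```python
annual = [19270175, 18043897, 17111193, 17011107]
econ = [7.74, 7.67, 7.62, 7.48, 7.32, 7.11, 6.88, 6.63,
        6.41, 6.26, 6.12, 6.01, 5.93, 5.83, 5.72, 5.59]

# aggregate the quarterly regressors to annual sums
x1 = [4.0] * 4                                     # intercept column aggregates to 4
x2 = [sum(econ[4 * i: 4 * i + 4]) for i in range(4)]

# 2x2 normal equations solved by Cramer's rule
s11 = sum(a * a for a in x1)
s12 = sum(a * b for a, b in zip(x1, x2))
s22 = sum(b * b for b in x2)
t1 = sum(a * y for a, y in zip(x1, annual))
t2 = sum(b * y for b, y in zip(x2, annual))
det = s11 * s22 - s12 * s12
beta0 = (t1 * s22 - t2 * s12) / det
beta1 = (s11 * t2 - s12 * t1) / det

pred_q = [beta0 + beta1 * e for e in econ]         # quarterly regression fit
resid_a = [annual[i] - sum(pred_q[4 * i: 4 * i + 4]) for i in range(4)]
quarterly = [pred_q[k] + resid_a[k // 4] / 4 for k in range(16)]
for i in range(4):
    print(2013 + i, [round(q) for q in quarterly[4 * i: 4 * i + 4]])
```

The fitted coefficients should reproduce the Coefficients block above (roughly 2416610 and 308226, up to rounding), and the quarterly values sum back to the annual losses by construction.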

In practice, if a simple and flexible solution is desired without the need for business interpretation, then mathematical interpolation might be a good choice. On the other hand, if there is a strong belief that the macro-economy drives the loss trend, then the regression-based method implemented in the tempdisagg package might be preferred. In our example, though, both methods generate very similar results.

## How to Construct Piecewise Linear Spline in SAS

```sas
options nocenter;

data tmp1;
  do i = 1 to 5000;
    x = ranuni(1);
    y = x + rannor(1) * 0.5;
    if x >= 0.3 then y = y + 6 * (x - 0.3);
    if x >= 0.6 then y = y - 10 * (x - 0.6);
    output;
  end;
run;

*** Manually Construct Piecewise Spline ***;
data tmp2;
  set tmp1;
  x1 = x;
  x2 = max(x - 0.3, 0);
  x3 = max(x - 0.6, 0);
run;

proc reg data = tmp2;
  model y = x1 - x3;
run;
quit;
/*
                Parameter    Standard
Variable   DF    Estimate       Error   t Value   Pr > |t|
Intercept   1     0.02627     0.02432      1.08     0.2801
x1          1     0.81570     0.11576      7.05     <.0001
x2          1     6.29682     0.18477     34.08     <.0001
x3          1   -10.19025     0.14870    -68.53     <.0001
*/

*** Automatically Construct Piecewise Spline ***;
proc transreg data = tmp1 ss2;
  model identity(y) = pspline(x / knots = 0.3 0.6 degree = 1);
run;
/*
                                  Type II
                                  Sum of      Mean
Variable     DF  Coefficient      Squares     Square    F Value   Pr > F   Label
Intercept     1     0.026272         0.28       0.28       1.17   0.2801   Intercept
Pspline.x_1   1     0.815702        12.12      12.12      49.65   <.0001   x 1
Pspline.x_2   1     6.296817       283.50     283.50    1161.34   <.0001   x 2
Pspline.x_3   1   -10.190247      1146.47    1146.47    4696.43   <.0001   x 3
*/
```
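Both SAS approaches fit the same truncated-power basis {1, x, max(x - k1, 0), max(x - k2, 0)}, so each coefficient beyond the first two is the change in slope at its knot. A pure-Python sketch that recovers the segment slopes from noise-free data via the normal equations (the knots 0.3 and 0.6 and the true coefficients 1, 6, -10 mirror the simulation above; without the rannor noise the recovery is exact):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

knots = (0.3, 0.6)
xs = [i / 100 for i in range(101)]   # grid on [0, 1]
basis = [[1.0, x, max(x - knots[0], 0.0), max(x - knots[1], 0.0)] for x in xs]
ys = [x + 6 * max(x - 0.3, 0.0) - 10 * max(x - 0.6, 0.0) for x in xs]  # noise-free

# normal equations X'X beta = X'y
XtX = [[sum(r[i] * r[j] for r in basis) for j in range(4)] for i in range(4)]
Xty = [sum(r[i] * y for r, y in zip(basis, ys)) for i in range(4)]
beta = solve(XtX, Xty)
print([round(b, 6) for b in beta])   # ~[0, 1, 6, -10]: intercept plus slope changes
```

Reading the fit: the slope is 1 below 0.3, 1 + 6 = 7 between the knots, and 7 - 10 = -3 above 0.6, which matches the sign pattern of the PROC REG and PROC TRANSREG estimates.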