I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

## Scorecard Development with Data from Multiple Sources

This week, one of my friends asked me a very interesting and practical question about scorecard development. The model development data were collected from multiple independent sources with varying sample sizes, heterogeneous risk profiles, and different bad rates. While the performance statistics look satisfactory on the model training dataset, the model doesn't generalize well to new accounts that might come from an unknown source. This situation is very common in a consulting company, where a risk or marketing model is sometimes developed with data pooled from multiple organizations.

To better understand the issue, I simulated a dataset consisting of two groups. In this dataset, x0 and x1 govern the group segmentation, while x2 and x3 define the bad flag. It is important to point out that the group indicator "grp" is only known in the model development sample but is unknown in the production population. Therefore, the variable "grp", albeit predictive, cannot be explicitly used in the model estimation.

```
data one;
  do i = 1 to 100000;
    x0 = ranuni(0);
    x1 = ranuni(1);
    x2 = ranuni(2);
    x3 = ranuni(3);
    * initialize the bad flag so that non-bad records are 0 instead of missing;
    bad = 0;
    * x0 and x1 govern the group segmentation;
    if 1 + x0 * 2 + x1 * 4 + rannor(1) > 5 then do;
      grp = 1;
      * x2 and x3 define the bad flag within each group;
      if x2 * 2 + x3 * 4 + rannor(2) > 5 then bad = 1;
    end;
    else do;
      grp = 0;
      if x2 * 4 + x3 * 2 + rannor(3) > 4 then bad = 1;
    end;
    output;
  end;
run;
```
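For readers without SAS, below is a minimal Python sketch of the same simulation using only the standard `random` module. The RNG differs from SAS, so the exact counts will not match the output above, but the group share (~28%) and the group-level bad rates (~28% for grp = 0, ~12% for grp = 1) should come out close.

```python
import random

random.seed(42)

def simulate(n=100000):
    """Two-group simulation: x0/x1 drive group membership, x2/x3 drive the bad flag."""
    rows = []
    for _ in range(n):
        x0, x1, x2, x3 = (random.random() for _ in range(4))
        # group segmentation, analogous to the SAS data step
        if 1 + x0 * 2 + x1 * 4 + random.gauss(0, 1) > 5:
            grp = 1
            bad = 1 if x2 * 2 + x3 * 4 + random.gauss(0, 1) > 5 else 0
        else:
            grp = 0
            bad = 1 if x2 * 4 + x3 * 2 + random.gauss(0, 1) > 4 else 0
        rows.append((x0, x1, x2, x3, grp, bad))
    return rows

rows = simulate()
n1 = sum(r[4] for r in rows)                              # size of group 1
bad1 = sum(r[5] for r in rows if r[4] == 1) / n1          # bad rate in group 1
bad0 = sum(r[5] for r in rows if r[4] == 0) / (len(rows) - n1)
print(n1 / len(rows), bad0, bad1)
```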

Our first approach is to use all four variables, x0 through x3, to build a logistic regression and then evaluate the model both on the overall sample and within each group.

```
proc logistic data = one desc noprint;
  model bad = x0 x1 x2 x3;
  score data = one out = mdl1 (rename = (p_1 = score1));
run;
```
```

GOOD BAD SEPARATION REPORT FOR SCORE1 IN DATA MDL1
MAXIMUM KS = 59.5763 AT SCORE POINT 0.2281
( AUC STATISTICS = 0.8800, GINI COEFFICIENT = 0.7599, DIVERGENCE = 2.6802 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.6800     0.9699       2,057      7,943     10,000   79.43%      79.43%    33.81%      33.81%
|      0.4679     0.6799       4,444      5,556     10,000   55.56%      67.50%    23.65%      57.46%
|      0.3094     0.4679       6,133      3,867     10,000   38.67%      57.89%    16.46%      73.92%
|      0.1947     0.3094       7,319      2,681     10,000   26.81%      50.12%    11.41%      85.33%
|      0.1181     0.1946       8,364      1,636     10,000   16.36%      43.37%     6.96%      92.29%
|      0.0690     0.1181       9,044        956     10,000    9.56%      37.73%     4.07%      96.36%
|      0.0389     0.0690       9,477        523     10,000    5.23%      33.09%     2.23%      98.59%
|      0.0201     0.0389       9,752        248     10,000    2.48%      29.26%     1.06%      99.64%
V      0.0085     0.0201       9,925         75     10,000    0.75%      26.09%     0.32%      99.96%
GOOD     0.0005     0.0085       9,991          9     10,000    0.09%      23.49%     0.04%     100.00%
========== ========== ========== ========== ==========
0.0005     0.9699      76,506     23,494    100,000

GOOD BAD SEPARATION REPORT FOR SCORE1 IN DATA MDL1(WHERE = (GRP = 0))
MAXIMUM KS = 61.0327 AT SCORE POINT 0.2457
( AUC STATISTICS = 0.8872, GINI COEFFICIENT = 0.7744, DIVERGENCE = 2.8605 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.7086     0.9699       1,051      6,162      7,213   85.43%      85.43%    30.51%      30.51%
|      0.5019     0.7086       2,452      4,762      7,214   66.01%      75.72%    23.58%      54.10%
|      0.3407     0.5019       3,710      3,504      7,214   48.57%      66.67%    17.35%      71.45%
|      0.2195     0.3406       4,696      2,517      7,213   34.90%      58.73%    12.46%      83.91%
|      0.1347     0.2195       5,650      1,564      7,214   21.68%      51.32%     7.74%      91.66%
|      0.0792     0.1347       6,295        919      7,214   12.74%      44.89%     4.55%      96.21%
|      0.0452     0.0792       6,737        476      7,213    6.60%      39.42%     2.36%      98.56%
|      0.0234     0.0452       7,000        214      7,214    2.97%      34.86%     1.06%      99.62%
V      0.0099     0.0234       7,150         64      7,214    0.89%      31.09%     0.32%      99.94%
GOOD     0.0007     0.0099       7,201         12      7,213    0.17%      27.99%     0.06%     100.00%
========== ========== ========== ========== ==========
0.0007     0.9699      51,942     20,194     72,136

GOOD BAD SEPARATION REPORT FOR SCORE1 IN DATA MDL1(WHERE = (GRP = 1))
MAXIMUM KS = 53.0942 AT SCORE POINT 0.2290
( AUC STATISTICS = 0.8486, GINI COEFFICIENT = 0.6973, DIVERGENCE = 2.0251 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.5863     0.9413       1,351      1,435      2,786   51.51%      51.51%    43.48%      43.48%
|      0.3713     0.5862       2,136        651      2,787   23.36%      37.43%    19.73%      63.21%
|      0.2299     0.3712       2,340        446      2,786   16.01%      30.29%    13.52%      76.73%
|      0.1419     0.2298       2,525        262      2,787    9.40%      25.07%     7.94%      84.67%
|      0.0832     0.1419       2,584        202      2,786    7.25%      21.50%     6.12%      90.79%
|      0.0480     0.0832       2,643        144      2,787    5.17%      18.78%     4.36%      95.15%
|      0.0270     0.0480       2,682        104      2,786    3.73%      16.63%     3.15%      98.30%
|      0.0140     0.0270       2,741         46      2,787    1.65%      14.76%     1.39%      99.70%
V      0.0058     0.0140       2,776         10      2,786    0.36%      13.16%     0.30%     100.00%
GOOD     0.0005     0.0058       2,786          0      2,786    0.00%      11.84%     0.00%     100.00%
========== ========== ========== ========== ==========
0.0005     0.9413      24,564      3,300     27,864
```

As shown in the output above, while the overall model performance looks fine, the model doesn't generalize well to the 2nd group with the smaller size. While the overall KS is close to 60, the KS for the 2nd group is merely 53. The reason is that the overall model fit is dominated by the 1st group with the larger size. As a result, the estimated model is biased toward the risk profile reflected in the 1st group.
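For reference, the KS statistic reported above is the maximum gap between the cumulative distributions of goods and bads across score cutoffs. A minimal Python sketch (the function name `ks_stat` is my own, not part of the SAS report macro, and it does not handle tied scores specially):

```python
def ks_stat(scores, bads):
    """Kolmogorov-Smirnov separation: max |CDF_bad - CDF_good| over score cutoffs.

    scores -- model scores, higher = riskier; bads -- 0/1 bad flags.
    """
    pairs = sorted(zip(scores, bads), reverse=True)  # riskiest first
    n_bad = sum(bads)
    n_good = len(bads) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    for _, bad in pairs:
        if bad:
            cum_bad += 1
        else:
            cum_good += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
    return ks

# perfectly separating scores give KS = 1
print(ks_stat([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # -> 1.0
```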

To alleviate the bias in the first model, we could first build a look-alike model driven by x0 and x1 to profile the group membership, and then build two separate risk models, driven by x2 and x3 only, for the 1st and 2nd groups respectively. The final predicted probability is then the composite of all three sub-models, as shown below. The model evaluation is also provided for comparison with the first model.

```
proc logistic data = one desc noprint;
  where grp = 0;
  model bad = x2 x3;
  score data = one out = mdl20 (rename = (p_1 = p_10));
run;

proc logistic data = one desc noprint;
  where grp = 1;
  model bad = x2 x3;
  score data = one out = mdl21 (rename = (p_1 = p_11));
run;

* look-alike model profiling the group membership;
proc logistic data = one desc noprint;
  model grp = x0 x1;
  score data = one out = seg;
run;

data mdl2;
  merge seg mdl20 mdl21;
  by i;
  * composite: P(bad) = P(bad | grp = 0) * P(grp = 0) + P(bad | grp = 1) * P(grp = 1);
  score2 = p_10 * (1 - p_1) + p_11 * p_1;
run;
```
```

GOOD BAD SEPARATION REPORT FOR SCORE2 IN DATA MDL2
MAXIMUM KS = 60.6234 AT SCORE POINT 0.2469
( AUC STATISTICS = 0.8858, GINI COEFFICIENT = 0.7715, DIVERGENCE = 2.8434 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.6877     0.9677       2,011      7,989     10,000   79.89%      79.89%    34.00%      34.00%
|      0.4749     0.6876       4,300      5,700     10,000   57.00%      68.45%    24.26%      58.27%
|      0.3125     0.4748       6,036      3,964     10,000   39.64%      58.84%    16.87%      75.14%
|      0.1932     0.3124       7,451      2,549     10,000   25.49%      50.51%    10.85%      85.99%
|      0.1142     0.1932       8,379      1,621     10,000   16.21%      43.65%     6.90%      92.89%
|      0.0646     0.1142       9,055        945     10,000    9.45%      37.95%     4.02%      96.91%
|      0.0345     0.0646       9,533        467     10,000    4.67%      33.19%     1.99%      98.90%
|      0.0166     0.0345       9,800        200     10,000    2.00%      29.29%     0.85%      99.75%
V      0.0062     0.0166       9,946         54     10,000    0.54%      26.10%     0.23%      99.98%
GOOD     0.0001     0.0062       9,995          5     10,000    0.05%      23.49%     0.02%     100.00%
========== ========== ========== ========== ==========
0.0001     0.9677      76,506     23,494    100,000

GOOD BAD SEPARATION REPORT FOR SCORE2 IN DATA MDL2(WHERE = (GRP = 0))
MAXIMUM KS = 61.1591 AT SCORE POINT 0.2458
( AUC STATISTICS = 0.8880, GINI COEFFICIENT = 0.7759, DIVERGENCE = 2.9130 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.7221     0.9677       1,075      6,138      7,213   85.10%      85.10%    30.40%      30.40%
|      0.5208     0.7221       2,436      4,778      7,214   66.23%      75.66%    23.66%      54.06%
|      0.3533     0.5208       3,670      3,544      7,214   49.13%      66.82%    17.55%      71.61%
|      0.2219     0.3532       4,726      2,487      7,213   34.48%      58.73%    12.32%      83.92%
|      0.1309     0.2219       5,617      1,597      7,214   22.14%      51.41%     7.91%      91.83%
|      0.0731     0.1309       6,294        920      7,214   12.75%      44.97%     4.56%      96.39%
|      0.0387     0.0731       6,762        451      7,213    6.25%      39.44%     2.23%      98.62%
|      0.0189     0.0387       7,009        205      7,214    2.84%      34.86%     1.02%      99.63%
V      0.0074     0.0189       7,152         62      7,214    0.86%      31.09%     0.31%      99.94%
GOOD     0.0002     0.0073       7,201         12      7,213    0.17%      27.99%     0.06%     100.00%
========== ========== ========== ========== ==========
0.0002     0.9677      51,942     20,194     72,136

GOOD BAD SEPARATION REPORT FOR SCORE2 IN DATA MDL2(WHERE = (GRP = 1))
MAXIMUM KS = 57.6788 AT SCORE POINT 0.1979
( AUC STATISTICS = 0.8717, GINI COEFFICIENT = 0.7434, DIVERGENCE = 2.4317 )

  MIN        MAX          GOOD        BAD      TOTAL    BAD     CUMULATIVE    BAD     CUMULATIVE
SCORE      SCORE             #          #          #    RATE      BAD RATE  PERCENT      PERCENT
--------------------------------------------------------------------------------------------------------
BAD     0.5559     0.9553       1,343      1,443      2,786   51.79%      51.79%    43.73%      43.73%
|      0.3528     0.5559       2,001        786      2,787   28.20%      40.00%    23.82%      67.55%
|      0.2213     0.3528       2,364        422      2,786   15.15%      31.71%    12.79%      80.33%
|      0.1372     0.2213       2,513        274      2,787    9.83%      26.24%     8.30%      88.64%
|      0.0840     0.1372       2,588        198      2,786    7.11%      22.42%     6.00%      94.64%
|      0.0484     0.0840       2,683        104      2,787    3.73%      19.30%     3.15%      97.79%
|      0.0256     0.0483       2,729         57      2,786    2.05%      16.84%     1.73%      99.52%
|      0.0118     0.0256       2,776         11      2,787    0.39%      14.78%     0.33%      99.85%
V      0.0040     0.0118       2,781          5      2,786    0.18%      13.16%     0.15%     100.00%
GOOD     0.0001     0.0040       2,786          0      2,786    0.00%      11.84%     0.00%     100.00%
========== ========== ========== ========== ==========
0.0001     0.9553      24,564      3,300     27,864
```

Comparing the KS statistics from the two modeling approaches, we can see that, while the performance of the 2nd approach on the overall sample is only slightly better than that of the 1st approach, the KS on the 2nd group with the smaller size, i.e. grp = 1, increases from 53 to 58, an 8.6% improvement. While the example covers only two groups, it is straightforward to generalize to cases with more than two groups.
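The generalization is just the law of total probability: P(bad) = Σ_k P(bad | grp = k) · P(grp = k), where the group-level risk models supply the conditional probabilities and a multinomial look-alike model supplies the group probabilities. A minimal Python sketch of the blending step (the function name and the input probabilities are illustrative, not taken from the SAS output above):

```python
def composite_score(p_sub, p_seg):
    """Blend K per-group risk scores with segmentation probabilities.

    p_sub -- list of P(bad | grp = k) from the K group-level risk models
    p_seg -- list of P(grp = k) from the look-alike model; must sum to 1
    """
    assert len(p_sub) == len(p_seg)
    assert abs(sum(p_seg) - 1.0) < 1e-9
    return sum(p_bad * p_grp for p_bad, p_grp in zip(p_sub, p_seg))

# the two-group case reduces to score2 = p_10 * (1 - p_1) + p_11 * p_1
print(composite_score([0.28, 0.12], [0.72, 0.28]))
# three groups work the same way
print(composite_score([0.1, 0.2, 0.3], [0.5, 0.3, 0.2]))
```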