I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

## PART – A Rule-Learning Algorithm

```> require('RWeka')
> require('pROC')
>
> # SEPARATE DATA INTO TRAINING AND TESTING SETS
> df2 <- df1[df1\$CARDHLDR == 1, 2:12]
> set.seed(2013)
> rows <- sample(1:nrow(df2), nrow(df2) - 1000)
> set1 <- df2[rows, ]
> set2 <- df2[-rows, ]
>
> # BUILD A PART RULE MODEL
> mdl1 <- PART(factor(BAD) ~., data = set1)
> print(mdl1)
PART decision list
------------------

EXP_INC > 0.000774 AND
AGE > 21.833334 AND
INCOME > 2100 AND
MAJORDRG <= 0 AND
OWNRENT > 0 AND
MINORDRG <= 1: 0 (2564.0/103.0)

AGE > 21.25 AND
EXP_INC > 0.000774 AND
INCPER > 17010 AND
INCOME > 1774.583333 AND
MINORDRG <= 0: 0 (2278.0/129.0)

AGE > 20.75 AND
EXP_INC > 0.016071 AND
OWNRENT > 0 AND
SELFEMPL > 0 AND
EXP_INC <= 0.233759 AND
MINORDRG <= 1: 0 (56.0)

AGE > 20.75 AND
EXP_INC > 0.016071 AND
SELFEMPL <= 0 AND
OWNRENT > 0: 0 (1123.0/130.0)

OWNRENT <= 0 AND
AGE > 20.75 AND
MINORDRG > 0 AND

OWNRENT <= 0 AND
AGE > 20.75 AND

INCOME > 1423 AND
OWNRENT <= 0 AND
MINORDRG <= 1 AND
SELFEMPL <= 0 AND
MINORDRG <= 0: 0 (943.0/124.0)

SELFEMPL > 0 AND
MAJORDRG <= 0 AND

SELFEMPL > 0 AND
MAJORDRG <= 1 AND
MAJORDRG <= 0 AND
MINORDRG <= 0 AND
INCOME > 2708.333333: 0 (17.0)

SELFEMPL > 0 AND
MAJORDRG <= 1 AND
OWNRENT <= 0 AND
MINORDRG <= 0 AND
INCPER <= 8400: 0 (13.0)

SELFEMPL <= 0 AND
OWNRENT > 0 AND
MINORDRG <= 0 AND
MAJORDRG <= 0: 0 (107.0/15.0)

OWNRENT <= 0 AND
MINORDRG > 0 AND
MINORDRG <= 1 AND
MAJORDRG <= 1 AND
MAJORDRG <= 0 AND
SELFEMPL <= 0: 0 (87.0/13.0)

OWNRENT <= 0 AND
SELFEMPL <= 0 AND
MAJORDRG <= 0 AND
MINORDRG <= 1: 0 (373.0/100.0)

MAJORDRG > 0 AND
MINORDRG > 0 AND
MAJORDRG <= 1 AND
MINORDRG <= 1: 0 (29.0)

SELFEMPL <= 0 AND
OWNRENT > 0 AND
MAJORDRG <= 0: 0 (199.0/57.0)

OWNRENT <= 0 AND
SELFEMPL <= 0: 0 (84.0/24.0)

MAJORDRG > 1: 0 (17.0/3.0)

MAJORDRG > 0: 0 (10.0)

MAJORDRG <= 0 AND
OWNRENT <= 0: 0 (29.0/7.0)

OWNRENT > 0 AND
SELFEMPL > 0 AND
EXP_INC <= 0.218654 AND
MINORDRG <= 2 AND
MINORDRG <= 1: 0 (8.0/1.0)

OWNRENT > 0 AND
INCOME <= 2041.666667 AND
MAJORDRG > 0 AND

OWNRENT > 0 AND
AGE > 33.416668 AND
SELFEMPL > 0: 0 (10.0/1.0)

OWNRENT > 0 AND
SELFEMPL <= 0 AND
MINORDRG <= 1 AND
AGE > 33.5 AND
EXP_INC > 0.006737: 0 (6.0)

EXP_INC > 0.001179: 1 (16.0/1.0)

: 0 (3.0)

Number of Rules  : 	25

> pred1 <- data.frame(prob = predict(mdl1, newdata = set2, type = 'probability')[, 2])
> # ROC FOR TESTING SET

Call:
roc.default(response = set2\$BAD, predictor = pred1\$prob)

Data: pred1\$prob in 905 controls (set2\$BAD 0) < 95 cases (set2\$BAD 1).
Area under the curve: 0.6794
>
> # BUILD A LOGISTIC REGRESSION
> mdl2 <- Logistic(factor(BAD) ~., data = set1)
> print(mdl2)
Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
Class
Variable           0
====================
AGE           0.0112
MAJORDRG     -0.2312
MINORDRG     -0.1991
OWNRENT       0.2244
INCOME        0.0004
SELFEMPL     -0.1206
INCPER             0
EXP_INC       0.4472
Intercept     0.7965

Odds Ratios...
Class
Variable           0
====================
AGE           1.0113
MAJORDRG      0.7936
MINORDRG      0.8195
OWNRENT       1.2516
INCOME        1.0004
SELFEMPL      0.8864
INCPER             1
EXP_INC       1.5639

> pred2 <- data.frame(prob = predict(mdl2, newdata = set2, type = 'probability')[, 2])
> # ROC FOR TESTING SET

Call:
roc.default(response = set2\$BAD, predictor = pred2\$prob)

Data: pred2\$prob in 905 controls (set2\$BAD 0) < 95 cases (set2\$BAD 1).
Area under the curve: 0.6529
>
> # COMPARE TWO ROCS
> roc.test(roc1, roc2)

DeLong's test for two correlated ROC curves

data:  roc1 and roc2
Z = 1.0344, p-value = 0.301
alternative hypothesis: true difference in AUC is not equal to 0
sample estimates:
AUC of roc1 AUC of roc2
0.6793894   0.6528875
```