Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

Quasi-Binomial Model in SAS

Similar to quasi-Poisson regressions, quasi-binomial regressions try to address the excessive variance by the inclusion of a dispersion parameter. In addition to addressing the over-dispersion, quasi-binomial regressions also demonstrate extra values in other areas, such as LGD model development in credit risk modeling, due to its flexible distributional assumption.

Measuring the ratio between NCO and GCO, LGD could take any value in the range [0, 1] with no unanimous consensus on the distributional assumption currently in the industry. An advantage of quasi-binomial regression is that it makes no assumption of a specific distribution but merely specifies the conditional mean for a given model response. As a result, the trade-off is the lack of likelihood-based measures such as AIC and BIC.

Below is a demonstration on how to estimate a quasi-binomial model with GLIMMIX procedure in SAS.

proc glimmix data = _last_;
  model y = age number start / link = logit solution;
  _variance_ = _mu_ * (1-_mu_);
  random _residual_;
              Model Information
Data Set                     WORK.KYPHOSIS   
Response Variable            y               
Response Distribution        Unknown         
Link Function                Logit           
Variance Function            _mu_ * (1-_mu_) 
Variance Matrix              Diagonal        
Estimation Technique         Quasi-Likelihood
Degrees of Freedom Method    Residual        

                       Parameter Estimates 
Effect       Estimate       Error       DF    t Value    Pr > |t|
Intercept     -2.0369      1.3853       77      -1.47      0.1455
age           0.01093    0.006160       77       1.77      0.0800
number         0.4106      0.2149       77       1.91      0.0598
start         -0.2065     0.06470       77      -3.19      0.0020
Residual       0.9132           .        .        .         .    

For the comparison purpose, the same model is also estimated with R glm() function, showing identical outputs.

summary(glm(data = kyphosis, Kyphosis ~ ., family = quasibinomial))
#            Estimate Std. Error t value Pr(>|t|)   
#(Intercept) -2.03693    1.38527  -1.470  0.14552   
#Age          0.01093    0.00616   1.774  0.07996 . 
#Number       0.41060    0.21489   1.911  0.05975 . 
#Start       -0.20651    0.06470  -3.192  0.00205 **
#(Dispersion parameter for quasibinomial family taken to be 0.913249)

Written by statcompute

November 1, 2015 at 1:20 pm