
## Lagrange Multiplier (LM) Test for Over-Dispersion

While Poisson regression is often used as a baseline model for count data, its assumption of equi-dispersion (variance equal to the mean) is too restrictive for many empirical applications. In practice, the variance of observed counts usually exceeds the mean, a condition known as over-dispersion, because of unobserved heterogeneity and/or excess zeros. Much like heteroskedasticity in linear regression, over-dispersion in a Poisson regression leads to understated standard errors for the parameter estimates and therefore inflated t-statistics. After fitting a Poisson regression, it is sound practice to run an additional check for over-dispersion.

Below is a SAS macro that tests for over-dispersion using the Lagrange Multiplier (LM) test described by William Greene (2002) in “Econometric Analysis”. The null hypothesis is equi-dispersion in the outcomes from the fitted Poisson model, and under that null the statistic follows a chi-square distribution with 1 degree of freedom.

```
%macro lm(data = , y = , pred_y = );
***************************************************;
* This macro is to test the over-dispersion based *;
* on outcomes from a poisson model                *;
*                            -- wensui.liu@53.com *;
***************************************************;
* parameters:                                     *;
*  data  : the input dataset                      *;
*  y     : observed count outcome                 *;
*  pred_y: predicted outcome from poisson model   *;
***************************************************;
* reference:                                      *;
*  w. greene (2002), econometric analysis         *;
***************************************************;

proc iml;
use &data;
read all var {&y} into y;
read all var {&pred_y} into lambda;
close &data;

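/* residuals from the fitted Poisson means */
e = (y - lambda);
n = nrow(y);
/* sample mean of y, so n * ybar equals sum(y) */
ybar = y`[, :];
/* LM = (sum(e^2) - sum(y))^2 / (2 * sum(lambda^2)) */
LM = (e` * e - n * ybar) ** 2 / (2 * lambda` * lambda);
/* upper-tail p-value from a chi-square with 1 df */
Pvalue = 1 - probchi(LM, 1);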
title 'LM TEST FOR OVER-DISPERSION';
print LM Pvalue;
title;
quit;

***************************************************;
*                 end of macro                    *;
***************************************************;
%mend lm;
```
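In scalar notation, the statistic computed by the macro can be written as shown here. In this sketch $y_i$ is the observed count, $\hat{\lambda}_i$ is the fitted Poisson mean passed in as `pred_y`, and $e_i = y_i - \hat{\lambda}_i$. The subscripted symbols are notation introduced here to mirror the IML code.

$$
LM = \frac{\left[\sum_{i=1}^{n}\left(e_i^{2} - y_i\right)\right]^{2}}{2\sum_{i=1}^{n}\hat{\lambda}_i^{2}}
   = \frac{\left(e'e - n\bar{y}\right)^{2}}{2\,\hat{\lambda}'\hat{\lambda}}
$$

Under equi-dispersion each term $e_i^{2} - y_i$ has an expectation of approximately zero, so a large LM value indicates that the squared residuals differ systematically from the counts themselves.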

Next, a use case of the LM test is demonstrated. First, a vector of 1,000 outcomes is simulated in which 900 are Poisson draws and the remaining 100 (10%) are forced to zero, which induces over-dispersion.

```
*** SIMULATE A POISSON VECTOR WITH EXCESSIVE ZEROS ***;
data one;
do i = 1 to 1000;
x = ranuni(i);
if i <= 900 then y = ranpoi(i, exp(x * 2));
else y = 0;
output;
end;
run;
```
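To see why the forced zeros create over-dispersion, the simulated design can be sketched as a mixture. Here $\mu_i = \exp(2x_i)$ with $x_i$ uniform on $(0, 1)$, and $\pi$ is a symbol introduced for the share of forced zeros ($\pi = 0.1$ in this design). The data step assigns the zeros by row position rather than at random, so the mixture view is an approximation that treats an observation's position as random.

$$
E[y_i \mid x_i] = (1-\pi)\,\mu_i, \qquad
\operatorname{Var}[y_i \mid x_i] = (1-\pi)\,\mu_i\,(1 + \pi\mu_i)
$$

The variance-to-mean ratio is therefore $1 + \pi\mu_i$, which exceeds 1 whenever $\pi > 0$ and reduces to the Poisson case when $\pi = 0$.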

A Poisson regression is estimated on the simulated counts, including the excess zeros. After the predicted values are calculated, the LM test is applied to test for over-dispersion. As shown below, the null hypothesis of equi-dispersion is rejected with an LM statistic of 31.18.

```
*** TEST DISPERSION WITH EXCESSIVE ZEROS ***;
ods listing close;
proc genmod data = one;
model y =  x / dist = poisson;
output out = out1 p = predicted;
run;
ods listing;

%lm(data = out1, y = y, pred_y = predicted);
/*
LM TEST FOR OVER-DISPERSION

LM    PVALUE

31.182978 2.3482E-8
*/
```
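As an informal cross-check on the LM result, the Pearson chi-square divided by its residual degrees of freedom can be computed from the same output dataset. This is only a sketch and not part of the macro above. The divisor assumes two estimated parameters (the intercept and `x`), and the column alias `pearson_dispersion` is a hypothetical name chosen for illustration. Values well above 1 are usually read as a sign of over-dispersion.

```sas
*** PEARSON DISPERSION FROM THE GENMOD OUTPUT (ROUGH CROSS-CHECK) ***;
proc sql;
  /* sum of squared Pearson residuals divided by (n - 2);      */
  /* 2 = assumed number of estimated parameters (intercept, x) */
  select sum((y - predicted) ** 2 / predicted) / (count(*) - 2)
         as pearson_dispersion
  from out1;
quit;
```

Unlike the LM test, this ratio is a descriptive diagnostic and does not come with a p-value here.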

Another Poisson regression is estimated on the simulated counts after excluding the 10% excess zeros. As expected, with outcomes from this newly estimated Poisson model, the null hypothesis of equi-dispersion is not rejected (LM statistic of 0.05, p-value of 0.82).

```
*** TEST DISPERSION WITHOUT EXCESSIVE ZEROS ***;
ods listing close;
proc genmod data = one;
where i <= 900;
model y =  x / dist = poisson;
output out = out2 p = predicted;
run;
ods listing;

%lm(data = out2, y = y, pred_y = predicted);
/*
LM TEST FOR OVER-DISPERSION

LM    PVALUE

0.052131 0.8193959
*/
```