I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

## Estimating GLM with Julia

```using DataFrames, GLM

df2 = df1[df1[:CARDHLDR] .== 1, [:DEFAULT, :MAJORDRG, :MINORDRG, :INCOME, :OWNRENT]];

mdl = glm(DEFAULT ~ MAJORDRG + MINORDRG + INCOME + OWNRENT, df2, Binomial());

print(mdl);
# Coefficients:
#                  Estimate  Std.Error  z value Pr(>|z|)
# (Intercept)      -1.20444  0.0908218 -13.2616  < eps()
# MAJORDRG         0.203135  0.0692537  2.93319   0.0034
# MINORDRG         0.202727  0.0479741  4.22575   2.4e-5
# INCOME       -0.000442229 4.04222e-5 -10.9402  < eps()
# OWNRENT         -0.201223  0.0716217 -2.80953    0.005

print(deviance(mdl));
# 6376.220859525586
```

Written by statcompute

March 8, 2014 at 7:13 pm

## Learning Pig Latin on 2014-03-01

```grunt> sh head -4 2008.csv;
Year,Month,DayofMonth,DayOfWeek,DepTime,CRSDepTime,ArrTime,CRSArrTime,UniqueCarrier,FlightNum,TailNum,ActualElapsedTime,CRSElapsedTime,AirTime,ArrDelay,DepDelay,Origin,Dest,Distance,TaxiIn,TaxiOut,Cancelled,CancellationCode,Diverted,CarrierDelay,WeatherDelay,NASDelay,SecurityDelay,LateAircraftDelay
2008,1,3,4,628,620,804,750,WN,448,N428WN,96,90,76,14,8,IND,BWI,515,3,17,0,,0,NA,NA,NA,NA,NA

grunt> sh sed '1d' 2008.csv > 2008.csv2;

2008,1,3,4,628,620,804,750,WN,448,N428WN,96,90,76,14,8,IND,BWI,515,3,17,0,,0,NA,NA,NA,NA,NA
2008,1,3,4,926,930,1054,1100,WN,1746,N612SW,88,90,78,-6,-4,IND,BWI,515,3,7,0,,0,NA,NA,NA,NA,NA

grunt> A = LOAD '2008.csv2' USING PigStorage(',');

grunt> B = GROUP A BY (\$0, \$1);

grunt> C = FOREACH B GENERATE group, COUNT(A);

grunt> D = FILTER C BY \$0.\$1 IN (1, 2, 3);

grunt> SPLIT D INTO D1 IF \$0.\$1 == 1, D2 IF \$0.\$1 == 2, D3 IF \$0.\$1 == 3;

grunt> dump D3;
((2008,3),616090)
```

Written by statcompute

March 1, 2014 at 11:05 pm

Posted in Big Data, Pig Latin