Yet Another Blog in Statistical Computing

I can calculate the motion of heavenly bodies but not the madness of people. -Isaac Newton

Generate and Retrieve Many Objects with Sequential Names

While coding ensemble methods in data mining with R, e.g. bagging, we often need to generate many data and models objects with sequential names. Below is a quick example how to use assign() function to generate many prediction objects on the fly and then retrieve these predictions with mget() to do the model averaging.

data(Boston, package = "MASS")

for (i in 1:10) {
  set.seed(i)
  smp <- Boston[sample(1:nrow(Boston), nrow(Boston), replace = TRUE), ]
  glm <- glm(medv ~ ., data = smp)
  prd <- predict(glm, Boston)
  ### ASSIGN A LIST OF SEQUENTIAL NAMES TO PREDICTIONS ###
  assign(paste("p", i, sep = ""), prd)
}

### RETURN NAMED OBJECTS TO A LIST ###
plist <- mget(paste('p', 1:i, sep = ''))
### AGGREGATE ALL PREDICTIONS ###
pcols <- do.call('cbind', plist)
pred_medv <- rowSums(pcols) / i

### A SIMPLE FUNCTION CALCULATION R-SQUARE ###
r2 <- function(y, yhat) {
  ybar <- mean(y)
  r2 <- sum((yhat - ybar) ^ 2) / sum((y - ybar) ^ 2)
  return(r2)
}
print(r2(Boston$medv, pred_medv))
# OUTPUT:
# [1] 0.7454225
About these ads

Written by statcompute

September 8, 2013 at 2:09 pm

Follow

Get every new post delivered to your Inbox.

Join 80 other followers

%d bloggers like this: