Yet Another Blog in Statistical Computing

"Did you always know?" "No, I didn't. But I believed."

Generate and Retrieve Many Objects with Sequential Names

When coding ensemble methods such as bagging in R, we often need to generate many data and model objects with sequential names. Below is a quick example of how to use the assign() function to generate many prediction objects on the fly and then retrieve these predictions with mget() for model averaging.
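Before the full example, here is a minimal sketch of the two functions in isolation. The object names (v1, v2, v3) and the separate environment `env` are illustrative assumptions, used only to keep the toy objects out of the global workspace.

```r
# Create three objects named v1, v2, v3 inside a fresh environment
env <- new.env()
for (k in 1:3) {
  # assign() binds a value to a name that is built at run time
  assign(paste0("v", k), k * 10, envir = env)
}

# mget() looks up several names at once and returns a named list
vals <- mget(paste0("v", 1:3), envir = env)
print(names(vals))
print(unlist(vals))
```

The bagging example that follows applies the same pattern to model predictions, working in the global environment instead of a separate one.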

data(Boston, package = "MASS")

for (i in 1:10) {
  set.seed(i)
  smp <- Boston[sample(1:nrow(Boston), nrow(Boston), replace = TRUE), ]
  ### NAME THE MODEL OBJECT fit TO AVOID MASKING THE glm() FUNCTION ###
  fit <- glm(medv ~ ., data = smp)
  prd <- predict(fit, Boston)
  ### ASSIGN EACH PREDICTION VECTOR TO A SEQUENTIALLY NAMED OBJECT (p1, p2, ...) ###
  assign(paste("p", i, sep = ""), prd)
}

### COLLECT THE NAMED OBJECTS INTO A LIST ###
### (i KEEPS ITS LAST LOOP VALUE, 10, AFTER THE LOOP ENDS) ###
plist <- mget(paste('p', 1:i, sep = ''))
### AGGREGATE ALL PREDICTIONS BY AVERAGING ACROSS MODELS ###
pcols <- do.call('cbind', plist)
pred_medv <- rowSums(pcols) / i

### A SIMPLE FUNCTION TO CALCULATE R-SQUARE ###
### (EXPLAINED SUM OF SQUARES OVER TOTAL SUM OF SQUARES) ###
r2 <- function(y, yhat) {
  ybar <- mean(y)
  r2 <- sum((yhat - ybar) ^ 2) / sum((y - ybar) ^ 2)
  return(r2)
}
print(r2(Boston$medv, pred_medv))
# OUTPUT:
# [1] 0.7454225
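As a point of comparison, the same averaging can be sketched without creating named objects at all, by letting lapply() collect the predictions in a list. This is an alternative layout rather than the approach shown above; the names `n_models`, `pred_list`, `fit`, and `pred_avg` are assumptions introduced for illustration. Because it reuses the same seeds and sampling call, it should give the same averaged predictions up to floating-point rounding.

```r
data(Boston, package = "MASS")

n_models <- 10  # number of bootstrap replicates, matching the loop above

# Fit one model per bootstrap sample and return its predictions;
# lapply() gathers the prediction vectors into a list directly.
pred_list <- lapply(seq_len(n_models), function(i) {
  set.seed(i)
  smp <- Boston[sample(1:nrow(Boston), nrow(Boston), replace = TRUE), ]
  fit <- glm(medv ~ ., data = smp)
  predict(fit, Boston)
})

# Bind the predictions column-wise and average across models
pred_avg <- rowMeans(do.call(cbind, pred_list))
```

The assign() and mget() pattern remains handy when the objects already exist under sequential names, for example after loading them from a saved workspace.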

Written by statcompute

September 8, 2013 at 2:09 pm
