Tuning hyper-parameters might be the most tedious yet crucial task in training machine learning algorithms, such as neural networks, SVMs, or boosting. The configuration of hyper-parameters not only impacts the computational efficiency of a learning algorithm but also determines its prediction accuracy.
Thus far, manual tuning and grid search are still the most prevailing strategies. In the paper http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf, Bergstra and Bengio showed that random search is more efficient for hyper-parameter optimization than both grid search and manual tuning. In the same spirit as random search, one can draw candidate values from a Sobol sequence, a series of quasi-random numbers designed to cover the sampling space more evenly than uniform random numbers.
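As a quick illustration of the coverage difference, the minimal sketch below (using the same randtoolbox package as the script that follows) draws ten points on the unit interval from each generator; the Sobol points spread almost evenly, while the uniform points may cluster and leave gaps:

library(randtoolbox)
# ten scrambled quasi-random Sobol points in (0, 1)
sobol(10, dim = 1, scrambling = 1, seed = 2019)
# ten pseudo-random uniform points in (0, 1) for comparison
set.seed(2019)
runif(10)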
The demonstration below compares the Sobol sequence and the uniform random number generator in the hyper-parameter tuning of a General Regression Neural Network (GRNN). In this particular example, the Sobol sequence outperforms the uniform random number generator in two respects. First, it picks hyper-parameter values that yield better performance, e.g. a higher R^2, in the cross-validation. Second, the outcome is more consistent across multiple trials, with a lower variance.
data(Boston, package = "MASS")

# fit a GRNN on x and y with smoothing parameter sigma
grnn.fit <- function(x, y, sigma) {
  return(grnn::smooth(grnn::learn(data.frame(y, x)), sigma))
}
# score new data in parallel, one row per worker call
grnn.predict <- function(nn, x) {
  n_cores <- parallel::detectCores() - 1
  return(do.call(rbind,
                 parallel::mcMap(function(i) grnn::guess(nn, as.matrix(x[i, ])),
                                 1:nrow(x), mc.cores = n_cores))[, 1])
}
# r-squared between actual and predicted values
r2 <- function(act, pre) {
  rss <- sum((pre - act) ^ 2)
  tss <- sum((act - mean(act)) ^ 2)
  return(1 - rss / tss)
}
# n-fold cross-validation over a vector of candidate sigmas;
# returns the sigma with the highest cross-validated R2
grnn.cv <- function(nn, sigmas, nfolds, seed) {
  dt <- nn$set
  set.seed(seed)
  folds <- caret::createFolds(1:nrow(dt), k = nfolds, list = FALSE)
  cv <- function(s) {
    r <- do.call(rbind,
                 lapply(1:nfolds,
                        function(i) data.frame(Ya = nn$Ya[folds == i],
                                               Yp = grnn.predict(grnn.fit(nn$Xa[folds != i, ], nn$Ya[folds != i], s),
                                                                 data.frame(nn$Xa[folds == i, ])))))
    return(data.frame(sigma = s, R2 = r2(r$Ya, r$Yp)))
  }
  r2_lst <- Reduce(rbind, Map(cv, sigmas))
  return(r2_lst[r2_lst$R2 == max(r2_lst$R2), ])
}
# draw n candidate sigmas from a scrambled Sobol sequence in [min, max]
gen_sobol <- function(min, max, n, seed) {
  return(round(min + (max - min) * randtoolbox::sobol(n, dim = 1, scrambling = 1, seed = seed), 4))
}

# draw n candidate sigmas uniformly at random in [min, max]
gen_unifm <- function(min, max, n, seed) {
  set.seed(seed)
  return(round(min + (max - min) * runif(n), 4))
}
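# a side-by-side look at the two generators over the sigma search range
# [5, 10] used below (illustrative; the exact values depend on the seed)
gen_sobol(5, 10, 10, seed = 1)
gen_unifm(5, 10, 10, seed = 1)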
# fit the GRNN on the Boston data (column 14, medv, is the response)
net <- grnn.fit(Boston[, -14], Boston[, 14], sigma = 2)
# repeat the 4-fold CV search 10 times with each sequence type
sobol_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_sobol(5, 10, 10, x), 4, 2019), seq(1, 10)))
unifm_out <- Reduce(rbind, Map(function(x) grnn.cv(net, gen_unifm(5, 10, 10, x), 4, 2019), seq(1, 10)))
out <- rbind(cbind(type = rep("sobol", 10), sobol_out),
             cbind(type = rep("unifm", 10), unifm_out))
boxplot(R2 ~ type, data = out, main = "Sobol Sequence vs. Uniform Random",
        ylab = "CV RSquare", xlab = "Sequence Type")