In this example, I am trying to optimize the custom function "my_subset_mean" using a custom neighbourhood function "nb" and the "TAopt" algorithm from the NMOF package:
library(dplyr)
#create data
df <- data.frame(b = rnorm(100,5,5), d = rnorm(100,2,2),
c = rnorm(100,10,10))
a <- c("a", "b", "c", "d", "e")
a <- sample(a, 100, replace=TRUE, prob=c(0.3, 0.2, 0.3, 0.1, 0.1))
df$a <- a
e <- c("a", "b", "c", "d", "e")
e <- sample(e, 100, replace=TRUE, prob=c(0.3, 0.2, 0.3, 0.1, 0.1))
df$e <- e
#create function to be optimized
my_subset_mean <- function(x) {
    subset <- df %>% filter(a %in% names(x$r1)[x$r1], e %in% names(x$r4)[x$r4],
                            b > x$r2,
                            d < x$r3)
    ans <- -mean(subset$c)
    if (!is.finite(ans))
        ans <- 100
    ans
}
#store values of categorical variables into temporary objects
tmp <- !logical(length(sort(unique(a))))
names(tmp) <- sort(unique(a))
tmp1 <- !logical(length(sort(unique(e))))
names(tmp1) <- sort(unique(e))
x <- list(r1 = tmp, r4 = tmp1,
r2 = 0.5,
r3 = 0.5)
### optimization
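nb <- function(x) {
    # pick one parameter to perturb
    i <- sample(c("r1", "r2", "r3", "r4"), 1)
    if (i == "r1" || i == "r4") {
        # categorical: flip one randomly chosen level
        j <- sample(length(x[[i]]), 1)
        x[[i]][j] <- !x[[i]][j]
    } else {
        # numeric: small random step, clamped to [0, 1]
        x[[i]] <- x[[i]] + runif(1, min = -0.1, max = 0.1)
        x[[i]] <- max(min(1, x[[i]]), 0)
    }
    x
}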
library("NMOF")
ans <- TAopt(my_subset_mean, list(x0 = x, neighbour = nb, nI = 1000))
-my_subset_mean(ans$xbest)
Can someone please tell me: in this process, where would I specify the upper and lower bounds for r2 and r3? For example, if I want r2 to lie in (0, 2) and r3 in (0, 1.5), where exactly can I specify this?
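My guess is that the bounds belong in the neighbourhood function, since the line `x[[i]] <- max(min(1, x[[i]]), 0)` already clamps every numeric parameter to [0, 1], and as far as I can tell TAopt takes no separate bounds argument. Below is a minimal sketch of what I have in mind. The names `bounds` and `nb_bounded` are my own (hypothetical), and the small `x` is only a stand-in for the real starting solution:

```r
# Hypothetical per-parameter bounds: c(lower, upper)
bounds <- list(r2 = c(0, 2), r3 = c(0, 1.5))

nb_bounded <- function(x) {
    # pick one parameter to perturb
    i <- sample(c("r1", "r2", "r3", "r4"), 1)
    if (i %in% c("r1", "r4")) {
        # categorical (logical vector): flip one randomly chosen level
        j <- sample(length(x[[i]]), 1)
        x[[i]][j] <- !x[[i]][j]
    } else {
        # numeric: small random step, then clamp to that parameter's own bounds
        x[[i]] <- x[[i]] + runif(1, min = -0.1, max = 0.1)
        x[[i]] <- max(min(bounds[[i]][2], x[[i]]), bounds[[i]][1])
    }
    x
}

# Stand-in starting solution, only to check that the bounds hold
x <- list(r1 = c(a = TRUE, b = TRUE), r4 = c(a = TRUE, b = TRUE),
          r2 = 0.5, r3 = 0.5)
set.seed(1)
for (k in 1:1000) x <- nb_bounded(x)
```

If this is the right idea, I would pass `neighbour = nb_bounded` to TAopt instead of `nb`. Is that the intended way, or is there a dedicated place for bounds?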
Thanks!