I am working with the R language. I am trying to follow this tutorial (A quick tour of GA) on function optimization.
I created some data for this example:
#load libraries
library(dplyr)
library(GA)
# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)
I then defined the following function ("funct_set"), which returns a vector of four values:
funct_set <- function(x) {
  x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]
  f <- numeric(4)
  # bin data according to random criteria
  train_data <- train_data %>%
    mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
                 ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
  train_data$cat = as.factor(train_data$cat)
  # new splits
  a_table = train_data %>%
    filter(cat == "a") %>%
    select(a1, b1, c1, cat)
  b_table = train_data %>%
    filter(cat == "b") %>%
    select(a1, b1, c1, cat)
  c_table = train_data %>%
    filter(cat == "c") %>%
    select(a1, b1, c1, cat)
  # flag rows whose c1 exceeds the threshold for each bin ("quant")
  table_a = data.frame(a_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[5], 1, 0)))
  table_b = data.frame(b_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[6], 1, 0)))
  table_c = data.frame(c_table %>% group_by(cat) %>%
    mutate(quant = ifelse(c1 > x[7], 1, 0)))
  f[1] = -mean(table_a$quant)
  f[2] = -mean(table_b$quant)
  f[3] = -mean(table_c$quant)
  # group all tables
  final_table = rbind(table_a, table_b, table_c)
  # calculate the total mean: this is what needs to be optimized
  f[4] = -mean(final_table$quant)
  return(f)
}
Then, I ran the optimization algorithm:
GA <- ga(type = "real-valued",
         fitness = funct_set,
         lower = c(80, 1, 80, 1, 90, 180, 365),
         upper = c(100, 20, 100, 20, 140, 400, 720),
         popSize = 50, maxiter = 20, run = 20)
This runs successfully, for example:
GA | iter = 1 | Mean = -0.9686663 | Best = -0.8636364
GA | iter = 2 | Mean = -0.9408230 | Best = -0.8571429
GA | iter = 3 | Mean = -0.9107766 | Best = -0.8571429
GA | iter = 4 | Mean = -0.8995899 | Best = -0.8571429
etc
But it also produces the following message:
There were 50 or more warnings (use warnings() to see the first 50)
When I view the warnings, I see the same warning repeated many times:
#view warnings
warnings()
Warning messages:
1: In Fitness[i] <- fit :
number of items to replace is not a multiple of replacement length
2: In Fitness[i] <- fit :
number of items to replace is not a multiple of replacement length
3: In Fitness[i] <- fit :
number of items to replace is not a multiple of replacement length
etc
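For reference, the same warning can be reproduced in base R without the GA package. This is a minimal sketch that assumes the warning comes from a length-4 value being assigned into a single element, as the `Fitness[i] <- fit` call in the message suggests. The vector `Fitness` and the numbers in `fit` are hypothetical stand-ins, not values taken from `ga()`:

```r
# Hypothetical stand-in for the per-individual fitness vector that ga() fills in
Fitness <- rep(NA_real_, 3)

# A length-4 value, shaped like the return value of funct_set (numbers are made up)
fit <- c(-0.90, -0.80, -0.70, -0.85)

# Capture the warning raised by assigning four values into a single slot
warning_text <- tryCatch(
  {
    Fitness[1] <- fit
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)
print(warning_text)

# With the warning silenced, only the first element of fit is stored
Fitness2 <- rep(NA_real_, 3)
suppressWarnings(Fitness2[1] <- fit)
print(Fitness2[1])
```

If this is what happens inside `ga()`, then the reported "Best" values would correspond to f[1] rather than f[4], since R keeps only the first element in such an assignment.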
Does anyone know why these warnings are being produced? Is there a way to avoid these warnings?
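One possible direction, sketched under the assumption that the fitness function is expected to return a single number and that f[4] (the total mean) is the value to optimize, as the comment in funct_set states. The helper `as_scalar_fitness` and the function `toy_funct` are hypothetical names introduced only for illustration; they are not part of the GA package:

```r
# Hypothetical helper (not part of GA): wrap a vector-valued function so that
# only one chosen element is returned as the fitness
as_scalar_fitness <- function(f, index) {
  function(x) f(x)[index]
}

# Toy vector-valued function standing in for funct_set
toy_funct <- function(x) c(-x[1], -x[2], -sum(x), -mean(x))

# Keep only the fourth element, mirroring f[4] in funct_set
toy_scalar <- as_scalar_fitness(toy_funct, 4)
value <- toy_scalar(c(2, 4))
print(length(value))
print(value)
```

In the ga() call above, this would amount to passing fitness = as_scalar_fitness(funct_set, 4), or equivalently having funct_set return f[4] instead of f. I have not confirmed that this is the intended way to handle several objectives in GA.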
Thanks