R Error: (all(xi <= xj) && any(xi < xj)) { : missing value where TRUE/FALSE needed

I am working with the R programming language. I am trying to use the following library to optimize an arbitrary function I wrote: https://cran.r-project.org/web/packages/nsga2R/nsga2R.pdf

First I created some data for this example:

#load library
library(dplyr)
library(nsga2r)

# create some data for this example
a1 = rnorm(1000,100,10)
b1 = rnorm(1000,100,5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1,b1,c1)

Then, I defined the function for optimization (7 inputs, 4 outputs):

#define function

funct_set <- function (x) {
    x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]; x7 <- x[7]
    f <- numeric(4)
    
    
    #bin data according to random criteria
    train_data <- train_data %>%
        mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
                            ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
    
    train_data$cat = as.factor(train_data$cat)
    
    #new splits
    a_table = train_data %>%
        filter(cat == "a") %>%
        select(a1, b1, c1, cat)
    
    b_table = train_data %>%
        filter(cat == "b") %>%
        select(a1, b1, c1, cat)
    
    c_table = train_data %>%
        filter(cat == "c") %>%
        select(a1, b1, c1, cat)
    
    
    
    #calculate  quantile ("quant") for each bin
    
    table_a = data.frame(a_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[5],1,0 )))
    
    table_b = data.frame(b_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[6],1,0 )))
    
    table_c = data.frame(c_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[7],1,0 )))
    
    f[1] = mean(table_a$quant)
    f[2] = mean(table_b$quant)
    f[3] = mean(table_c$quant)
    
    
    #group all tables
    
    final_table = rbind(table_a, table_b, table_c)
    # calculate the total mean : this is what needs to be optimized
    
    f[4] = mean(final_table$quant)
    
    
    return(f)
}

Then, I ran the optimization code:

#optimization
results <- nsga2R(fn=funct_set, varNo=7, objDim=4, lowerBounds=c(80,80,80,80, 100, 200, 300), upperBounds=c(120,120,120,120,200,300,400),
                  popSize=50, tourSize=2, generations=50, cprob=0.9, XoverDistIdx=20, mprob=0.1,MuDistIdx=3)

But this returns the following error:

Error in if (all(xi <= xj) && any(xi < xj)) { : 
  missing value where TRUE/FALSE needed

Does anyone know if the error being produced is because of the way I have defined the function/data for this problem? Or is there another reason why this error is being produced?

Thanks

Hello @swaheera,

Two remarks to start with:

  • Always use set.seed so that we can reproduce your results exactly.
  • Please always include a reprex. I know you included all parts of your code, but you made a mistake while copying: you wrote library(nsga2r) instead of library(nsga2R).
    This is easy to fix, but it could have been avoided if you had used the reprex package to show us the code.
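
To illustrate the first remark, here is a minimal sketch of what set.seed buys you. The seed value 2021 and the variable names are arbitrary choices for this illustration:

```r
# Fixing the seed before drawing makes the random data identical on every run.
set.seed(2021)
first_draw <- rnorm(5, mean = 100, sd = 10)

# Resetting the same seed restarts the random number stream.
set.seed(2021)
second_draw <- rnorm(5, mean = 100, sd = 10)

identical(first_draw, second_draw)  # TRUE
```

Placing a single set.seed call at the top of the script, before the rnorm and sample.int calls, should be enough for others to get the same train_data.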

That said, I don't understand the purpose of your optimization so I can't help you there.
What strikes me is that you want to optimize (in some way?) four dependent numbers: the mean of a variable and then the mean of subsets thereof. Apparently this is not accepted here. By the way, I think you could have shown more of the context of the error message:

initializing the population
ranking the initial population
Error in if (all(xj <= xi) && any(xj < xi)) { : 
  missing value where TRUE/FALSE needed

I think your code is not wrong because I can run it when I slightly change the function and the optimization call:

set.seed(2021)

#define function

funct_set <- function (x) {
 ....
return(f[3:4])
}

#optimization
results <- nsga2R(fn=funct_set, varNo=7, objDim=2, 
                  lowerBounds=c(80,80,80,80, 100, 200, 300), 
                  upperBounds=c(120,120,120,120,200,300,400),
                  popSize=50, tourSize=2, generations=50, 
                  cprob=0.9, XoverDistIdx=20, mprob=0.1,MuDistIdx=3)


I think this suggests that the code is not wrong but that this optimization is not fit for your purpose (?)
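
One possible explanation, offered as a hypothesis and not something confirmed in this thread: if a bin such as "a" or "b" ends up empty for some candidate x, the mean of its quant column is NaN, and a NaN objective value can turn the dominance comparison into NA. The sketch below uses only base R and reuses the condition from the error message. The objective vectors xi and xj are made-up values for illustration, not output from nsga2R:

```r
# mean() of a zero-length vector is NaN, as would happen for an empty bin
empty_bin_mean <- mean(numeric(0))
is.nan(empty_bin_mean)

# two hypothetical objective vectors; the first one contains the NaN
xi <- c(empty_bin_mean, 0.5, 0.4, 0.5)
xj <- c(0.3, 0.5, 0.4, 0.5)

# the condition from the error message evaluates to NA instead of TRUE/FALSE
cond <- all(xi <= xj) && any(xi < xj)
is.na(cond)

# passing NA to if() raises an error; capture its message instead of stopping
msg <- tryCatch(
  if (cond) "dominates" else "does not dominate",
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happens, checking whether funct_set returns any NaN for a few vectors inside the bounds (for example with anyNA(funct_set(x))) would confirm or rule it out. It would also be consistent with the f[3:4] version running, assuming bin "c" is rarely empty.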
