“Strategic” Way to Insert Constraints into a Function (Is this Correct?)

I am working with R and trying to perform optimization on the following function using the "mco" library (https://cran.r-project.org/web/packages/mco/mco.pdf):

#load libraries
library(mco)
library(dplyr)

# create some data for this example
    a1 = rnorm(1000,100,10)
    b1 = rnorm(1000,100,5)
    c1 = sample.int(1000, 1000, replace = TRUE)
    train_data = data.frame(a1,b1,c1)

Now I define the function "funct_set" (7 inputs, 4 outputs) for optimization. In this function, I want to set "constraints" such that "n5, n6, n7" cannot be smaller than 200. To do this, I tried to "trick" the computer into assigning the affected outputs (i.e. f[1], f[2], f[3]) an illogical value (e.g. -999999 in the code below) whenever the corresponding count falls below 200:

#define function 
funct_set <- function (x) {
    x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]; x7 <- x[7]
    f <- numeric(4)
    
    
    #bin data according to random criteria
    train_data <- train_data %>%
        mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
                            ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
    
    train_data$cat = as.factor(train_data$cat)
    
    #new splits
    a_table = train_data %>%
        filter(cat == "a") %>%
        select(a1, b1, c1, cat)
    
    b_table = train_data %>%
        filter(cat == "b") %>%
        select(a1, b1, c1, cat)
    
    c_table = train_data %>%
        filter(cat == "c") %>%
        select(a1, b1, c1, cat)
    
    
    
    #calculate  quantile ("quant") for each bin
    
    table_a = data.frame(a_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[5],1,0 )))
    
    table_b = data.frame(b_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[6],1,0 )))
    
    table_c = data.frame(c_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[7],1,0 )))
    
    
    
    
    #group all tables
    
    final_table = rbind(table_a, table_b, table_c)
    # calculate the total mean 
    
    
    
    #count number of rows in each table
    n5 = data.frame(table_a %>% 
                        summarise(count = n()))
    
    n6 = data.frame(table_b %>% 
                        summarise(count = n()))
    
    n7 = data.frame(table_c %>% 
                        summarise(count = n()))
    #penalise any bin with fewer than 200 rows
    if (n5 < 200) {
        f[1] = -999999
    } else {
        f[1] = mean(table_a$quant)
    }

    if (n6 < 200) {
        f[2] = -999999
    } else {
        f[2] = mean(table_b$quant)
    }

    if (n7 < 200) {
        f[3] = -999999
    } else {
        f[3] = mean(table_c$quant)
    }
    #f[2] = mean(table_b$quant)
    #f[3] = mean(table_c$quant)
   
    f[4] = mean(final_table$quant)
    
    return (f);
}
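
The penalty idea used in funct_set can be factored into one small helper. This is only a sketch: the name penalised_mean and its arguments n_min and penalty are hypothetical, and it assumes the objectives are minimized (as I understand nsga2 to do), so the penalty is a large positive number rather than a negative one.

```r
# Hypothetical helper: return the bin mean, or a large penalty when the
# bin holds fewer than n_min rows. Assumes the optimizer minimizes, so a
# large positive value marks the solution as undesirable.
penalised_mean <- function(values, n_min = 200, penalty = 999999) {
  if (length(values) < n_min) {
    penalty
  } else {
    mean(values)
  }
}

# A bin with only 10 rows is penalised; a bin with 200 rows is not
small_bin <- rep(1, 10)
large_bin <- rep(c(0, 1), 100)
print(penalised_mean(small_bin))
print(penalised_mean(large_bin))
```

Inside funct_set this would be called as, for example, f[1] = penalised_mean(table_a$quant).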

The "mco" library offers a standard way to add constraints to the inputs of the function you are optimizing (i.e. x[1], x[2], x[3], x[4], x[5], x[6], x[7] ):

#add constraints
gn <- function(x) {
     g1 <- x[3] - x[1] 
     g2<- x[4] - x[2] 
     g3 <- x[7] - x[6]
     g4 <- x[6] - x[5] 
     #g5 <- n5 > 200
     #g6 <- n6 > 200
     #g7 <- n7 > 200
     return(c(g1,g2,g3,g4))
}

But since "n5, n6, n7" are not the final outputs of the function ("funct_set"), I could not see how to place constraints on them within the "gn" object. This is why I tried to define these constraints within the original function ("funct_set") itself.
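
One possible alternative, since the constraint function receives the same x as funct_set, is to recompute the bin counts inside it. The sketch below uses base R only; bin_counts and gn_counts are hypothetical names, the data is regenerated so the block runs on its own, and it assumes a constraint counts as satisfied when its value is non-negative (which matches how gn is written).

```r
# Example data, generated the same way as train_data in the question
set.seed(1)
train_data <- data.frame(
  a1 = rnorm(1000, 100, 10),
  b1 = rnorm(1000, 100, 5),
  c1 = sample.int(1000, 1000, replace = TRUE)
)

# Hypothetical helper: number of rows in bins "a", "b" and "c",
# using the same binning rule as funct_set
bin_counts <- function(x, data = train_data) {
  in_a <- data$a1 <= x[1] & data$b1 <= x[3]
  in_b <- !in_a & data$a1 <= x[2] & data$b1 <= x[4]
  n_a <- sum(in_a)
  n_b <- sum(in_b)
  c(n_a, n_b, nrow(data) - n_a - n_b)
}

# Hypothetical constraint function: the four ordering constraints from
# gn, plus one value per bin that is non-negative only when the bin
# holds at least 200 rows
gn_counts <- function(x) {
  c(x[3] - x[1],
    x[4] - x[2],
    x[7] - x[6],
    x[6] - x[5],
    bin_counts(x) - 200)
}

x_example <- c(100, 105, 100, 105, 100, 200, 300)
print(bin_counts(x_example))
print(gn_counts(x_example))
```

With this version the call to nsga2 would pass constraints = gn_counts and cdim = 7 instead of 4.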

Finally, I ran the optimization:

#run optimization:
optimization <- nsga2(funct_set, idim = 7, odim = 4 , constraints = gn, cdim = 4,
                       
                       generations=150,
                       popsize=100,
                      cprob=0.7,
                       cdist=20,
                       mprob=0.2,
                       mdist=20,
                       # one bound per input: c() builds the 7-element vectors (rep() does not)
                       lower.bounds=c(80,80,80,80,100,200,300),
                       upper.bounds=c(120,120,120,120,200,300,400)
 )
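
Because idim = 7, each bounds vector needs seven entries, one per input. A minimal sketch of checking this before calling nsga2 (the names lower and upper are hypothetical, and the values are the ones intended in the call above):

```r
# Intended per-input bounds, built with c() so each input gets its own value
lower <- c(80, 80, 80, 80, 100, 200, 300)
upper <- c(120, 120, 120, 120, 200, 300, 400)

# Sanity checks: seven entries each, and every lower bound below its upper bound
idim <- 7
bounds_ok <- length(lower) == idim &&
  length(upper) == idim &&
  all(lower < upper)
print(bounds_ok)
```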

The above code seems to run successfully, and the results appear to respect the logical constraints:

#view results

#optimized input parameters
head(optimization$par)

          [,1]      [,2]     [,3]     [,4]     [,5]     [,6]     [,7]
[1,]  96.33968 102.80793 103.3724 103.8658 116.9360 119.4670 119.9997
[2,] 102.42030 100.27308 104.6474 105.9168 119.5517 119.9530 119.9992
[3,]  92.77710 100.52490 105.4731 100.6363 108.3434 119.6574 119.9990

#view optimized outputs
 head(optimization$value)

          [,1]         [,2]      [,3]  [,4]
[1,] 0.8982456 8.038278e-01 0.8833992 0.871
[2,] 0.8546169 9.999990e+05 0.8820961 0.870
[3,] 0.9061033 9.999990e+05 0.8522167 0.871
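
One way to check the reported parameters against gn is to apply it to each row. The sketch below copies the three rows printed above into a matrix (named par here for illustration) and assumes a constraint counts as satisfied when its value is non-negative.

```r
# Rows copied from the head(optimization$par) output above
par <- rbind(
  c(96.33968, 102.80793, 103.3724, 103.8658, 116.9360, 119.4670, 119.9997),
  c(102.42030, 100.27308, 104.6474, 105.9168, 119.5517, 119.9530, 119.9992),
  c(92.77710, 100.52490, 105.4731, 100.6363, 108.3434, 119.6574, 119.9990)
)

# Same four constraints as gn in the question
gn <- function(x) {
  c(x[3] - x[1], x[4] - x[2], x[7] - x[6], x[6] - x[5])
}

# One row of constraint values per candidate solution
g <- t(apply(par, 1, gn))

# TRUE for each row whose four constraint values are all non-negative
feasible <- apply(g >= 0, 1, all)
print(feasible)
```

In these rows x[5] to x[7] all lie between 108 and 120, so it may also be worth comparing them with the intended bounds of 100 to 400.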

My Question: Can someone please tell me whether the way I have placed constraints on "n5, n6, n7" is correct? Are there other ways to place these constraints?

Thanks
