“Strategic” Way to Insert Constraints into a Function (Is this Correct?)

swaheera · July 16, 2021, 1:51am

I am working with R and trying to perform optimization on the following function using the "mco" library (https://cran.r-project.org/web/packages/mco/mco.pdf):

#load libraries
library(mco)
library(dplyr)

# create some data for this example
a1 = rnorm(1000, 100, 10)
b1 = rnorm(1000, 100, 5)
c1 = sample.int(1000, 1000, replace = TRUE)
train_data = data.frame(a1, b1, c1)

Now, I define the function "funct_set" (7 inputs, 4 outputs) for optimization. In this function, I want to set "constraints" such that "n5, n6, n7" (the number of rows in each bin) cannot be smaller than 200. To do this, I tried to "trick" the computer by assigning the affected "outputs" (i.e. f[1], f[2], f[3]) some illogical value (e.g. 99999) whenever a bin is too small:

#define function 
funct_set <- function (x) {
    x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]; x7 <- x[7]
    f <- numeric(4)
    
    
    #bin data according to random criteria
    train_data <- train_data %>%
        mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
                            ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
    
    train_data$cat = as.factor(train_data$cat)
    
    #new splits
    a_table = train_data %>%
        filter(cat == "a") %>%
        select(a1, b1, c1, cat)
    
    b_table = train_data %>%
        filter(cat == "b") %>%
        select(a1, b1, c1, cat)
    
    c_table = train_data %>%
        filter(cat == "c") %>%
        select(a1, b1, c1, cat)
    
    
    
    #calculate  quantile ("quant") for each bin
    
    table_a = data.frame(a_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[5],1,0 )))
    
    table_b = data.frame(b_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[6],1,0 )))
    
    table_c = data.frame(c_table%>% group_by(cat) %>%
                             mutate(quant = ifelse(c1 > x[7],1,0 )))
    
    
    
    
    #group all tables
    
    final_table = rbind(table_a, table_b, table_c)
    # calculate the total mean 
    
    
    
    #count number of rows in each table
    n5 = nrow(table_a)
    n6 = nrow(table_b)
    n7 = nrow(table_c)

    #penalize bins with fewer than 200 rows; nsga2 minimizes, so the
    #penalty is a large positive value (matching the output shown below)
    if (n5 < 200) {
        f[1] = 999999
    } else {
        f[1] = mean(table_a$quant)
    }

    if (n6 < 200) {
        f[2] = 999999
    } else {
        f[2] = mean(table_b$quant)
    }

    if (n7 < 200) {
        f[3] = 999999
    } else {
        f[3] = mean(table_c$quant)
    }

    f[4] = mean(final_table$quant)
    
    return (f);
}
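
As an aside, the penalty idea used inside "funct_set" can be isolated in a small helper, which makes it easier to check on its own. This is only a minimal sketch: the name penalized_mean, its arguments, and the toy vectors are hypothetical and not part of the original code. It assumes a large positive penalty, because nsga2 minimizes every objective; if the means are meant to be maximized, they would need to be negated instead.

```r
# Hypothetical helper (not in the original post): return the mean of
# `values`, or a large penalty when the bin has fewer than `min_n` rows.
penalized_mean <- function(values, min_n = 200, penalty = 999999) {
  if (length(values) < min_n) {
    return(penalty)
  }
  mean(values)
}

# Toy bins: one with 250 rows, one with only 50 rows
big_bin   <- rep(c(0, 1), times = 125)
small_bin <- rep(c(0, 1), times = 25)

penalized_mean(big_bin)    # large enough: ordinary mean
penalized_mean(small_bin)  # too small: penalty value
```

With such a helper, each of f[1], f[2], f[3] would become a single call, e.g. f[1] = penalized_mean(table_a$quant).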

The "mco" library offers a standard way to add constraints to the inputs of the function you are optimizing (i.e. x[1], x[2], x[3], x[4], x[5], x[6], x[7] ):

#add constraints
gn <- function(x) {
     g1 <- x[3] - x[1]
     g2 <- x[4] - x[2]
     g3 <- x[7] - x[6]
     g4 <- x[6] - x[5]
     #g5 <- n5 > 200
     #g6 <- n6 > 200
     #g7 <- n7 > 200
     return(c(g1,g2,g3,g4))
}

But since "n5, n6, n7" are computed inside the function ("funct_set") and are not among its final outputs, constraints cannot be placed on them directly within the "gn" function. This is why I tried to define these constraints within the original function ("funct_set") itself.
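
One possible alternative, sketched here under assumptions, is to recompute the bin sizes from x inside the constraint function, since the counts depend only on x[1] to x[4] and the data. The helper names bin_counts and gn_counts, the seed, and the example vector are hypothetical. The sketch also assumes that mco treats a constraint as satisfied when its value is non-negative, which is how the ordering constraints in "gn" appear to be used; this should be checked against the mco documentation.

```r
# Toy data with the same shape as train_data in the post (seed is arbitrary)
set.seed(1)
train_data <- data.frame(
  a1 = rnorm(1000, 100, 10),
  b1 = rnorm(1000, 100, 5),
  c1 = sample.int(1000, 1000, replace = TRUE)
)

# Hypothetical helper: rows per bin, using the same rules as the nested ifelse()
bin_counts <- function(x, data = train_data) {
  in_a <- data$a1 <= x[1] & data$b1 <= x[3]
  in_b <- !in_a & data$a1 <= x[2] & data$b1 <= x[4]
  in_c <- !in_a & !in_b
  c(a = sum(in_a), b = sum(in_b), c = sum(in_c))
}

# Hypothetical constraint function: 4 ordering constraints plus 3 size
# constraints; each entry is >= 0 when the corresponding constraint holds
gn_counts <- function(x) {
  ordering <- c(x[3] - x[1], x[4] - x[2], x[7] - x[6], x[6] - x[5])
  sizes <- bin_counts(x) - 200
  unname(c(ordering, sizes))
}

x_example <- c(100, 105, 102, 104, 150, 250, 350)
bin_counts(x_example)
gn_counts(x_example)
```

If that assumption about the sign convention holds, the nsga2 call would pass constraints = gn_counts with cdim = 7, and the penalty values inside "funct_set" would no longer be needed. Note that the size constraints change in steps as x moves, so the optimizer gets no gradual signal from them.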

Finally, I ran the optimization:

#run optimization:
optimization <- nsga2(funct_set, idim = 7, odim = 4, constraints = gn, cdim = 4,
                      generations = 150,
                      popsize = 100,
                      cprob = 0.7,
                      cdist = 20,
                      mprob = 0.2,
                      mdist = 20,
                      # c(), not rep(): rep(80, 80, ...) only repeats the single value 80
                      lower.bounds = c(80, 80, 80, 80, 100, 200, 300),
                      upper.bounds = c(120, 120, 120, 120, 200, 300, 400)
)

The code above seems to run successfully, and the results appear to respect the logical constraints:

#view results

#optimized input parameters
head(optimization$par)

          [,1]      [,2]     [,3]     [,4]     [,5]     [,6]     [,7]
[1,]  96.33968 102.80793 103.3724 103.8658 116.9360 119.4670 119.9997
[2,] 102.42030 100.27308 104.6474 105.9168 119.5517 119.9530 119.9992
[3,]  92.77710 100.52490 105.4731 100.6363 108.3434 119.6574 119.9990

#view optimized outputs
 head(optimization$value)

          [,1]         [,2]      [,3]  [,4]
[1,] 0.8982456 8.038278e-01 0.8833992 0.871
[2,] 0.8546169 9.999990e+05 0.8820961 0.870
[3,] 0.9061033 9.999990e+05 0.8522167 0.871
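
A quick way to see which of the returned solutions still carry the penalty value is to flag those rows in the objective matrix. The sketch below copies the three rows printed above into a small matrix; the names values, penalty, and penalized are hypothetical, and with the real result the matrix would be optimization$value.

```r
# Objective values copied from the output shown above
values <- rbind(
  c(0.8982456, 8.038278e-01, 0.8833992, 0.871),
  c(0.8546169, 9.999990e+05, 0.8820961, 0.870),
  c(0.9061033, 9.999990e+05, 0.8522167, 0.871)
)

penalty <- 999999

# A row is flagged when any of its objectives equals (or exceeds) the penalty
penalized <- apply(values >= penalty, 1, any)

penalized          # which solutions hit the penalty
which(!penalized)  # indices of solutions with all bins large enough
```

In this excerpt, rows 2 and 3 have the penalty in the second objective, which suggests that solutions with an undersized bin can still end up in the returned set.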

My Question: Can someone please tell me if the way I have placed constraints on "n5, n6, n7" is correct? Are there any other ways to place these constraints?

Thanks
