Hello, I want to know how I find the optimal values of my regression model for four factors?

Hello, I want to know how I find the optimal values ​​of my regression model for four factors?
The optimal values ​​obtained by other methods is
longitud= 319.9 paso= 158.976 helice= 69.7967 radior= 0.144

Code:

rm(list=objects()) # Borra todas los objetos creados

### Requires ### Paquetes necesarios para compilar el c?digo
require(lmtest) 
require(lawstat)
require(agricolae)
require(car)
require(nortest)
require(normtest)
require(multcomp)
require(pwr)
require(readr)
require(ggplot2)
require(rsm)
require(moments)
require(MASS)
require(descriptr)
require(olsrr)
require(AER)

library(MVN)
library(dplyr)
library(lmtest)
library(car)
library(randtests)
library(normtest)
library(corrplot)
library(het.test)
library(agricolae)
require(lawstat)
require(nortest)
require(multcomp)
require(pwr)
require(readr)
install.packages("DescTools")
library(DescTools)



datos <- matrix(0, 27, 5) 
datos <- as.data.frame(datos)
dim(datos) 
datos 


colnames(datos) <- c("cp","longitud","paso","helice","radior")
datos

cp <- c(0.4153,0.4572,0.26,0.1446,0.0749,0.2778,0.294,0.4074,0.0912,0.2541,0.0855,0.444,0.0983,0.4263,0.0712,0.1286,0.2951,0.1389,0.128,0.3865,0.5113,0.5478,0.2631,0.4023,0.2962,0.2962,0.2962)


longitud<- c(340,340,320,300,340,320,340,340,340,320,320,300,300,340,300,340,320,340,300,300,300,320,300,300,320,320,320)
paso<-c(200,120,200,120,200,160,160,120,120,160,160,200,120,200,200,120,120,200,200,200,120,160,160,120,160,160,160)
helice<- c(80,60,70,60,80,60,70,80,80,80,70,80,80,60,80,60,70,60,60,60,60,70,70,80,70,70,70)
radior<- c(0.2,0.2,0.4,0.6,0.6,0.4,0.4,0.2,0.6,0.4,0.6,0.2,0.6,0.2,0.6,0.6,0.4,0.6,0.6,0.2,0.2,0.2,0.4,0.2,0.4,0.4,0.4)


datos[,1] <- cp
datos[,2] <- longitud
datos[,3] <- paso
datos[,4] <- helice
datos[,5] <- radior

datos


### An?lisis de varianza

mod <- aov(cp~I(longitud)+I(paso)+I(helice)+I(radior)+I(longitud^2)+I(paso^2)+I(helice^2)+I(radior^2)+
             (I(longitud)*I(paso))+(I(longitud)*I(helice))+(I(longitud)*I(radior))+(I(paso)*I(helice))
           +(I(paso)*I(radior))+(I(helice)*I(radior)))

mod
summary(mod)

lm <- lm(cp~I(longitud)+I(paso)+I(helice)+I(radior)+I(longitud^2)+I(paso^2)+I(helice^2)+I(radior^2)+
           (I(longitud)*I(paso))+(I(longitud)*I(helice))+(I(longitud)*I(radior))+(I(paso)*I(helice))
         +(I(paso)*I(radior))+(I(helice)*I(radior)))
lm
summary(lm)



### Validaci?n del Modelo

# 1er Supuesto
# Independencia

# Durbin-Watson test
dwtest(lm,alternative="two.sided") 
#Hip?tesis nula: Los residuales son independientes
#Hip?tesis alternativa: Los residuales est?n correlacionados
#Si el valor-p es menor que el nivel de significancia (alpha) debe ser 
#rechazada la hip?tesis nula

bgtest(lm) #Breusch-Godfrey test Independencia

#An?lisis gr?fico de la independencia de los residuales
par(mfrow=c(1,1))
plot(lm$residuals,xlab="Prueba",ylab="Residuales por prueba",main="",col=18,pch=17)
abline(0,0,col=12,lwd=2,lty=2)

par(mfrow=c(1,1))
# Funci?n de Autocorrelaci?n
# Bandas con un intervalo de confianza del 95%
acf(lm$res,ylim=c(-1,1),col=18,lwd=2,xlab="",ylab="Autocorrelaci?n",main="Autocorrelaci?n")

par(mfrow=c(1,1))
# Funci?n de Autocorrelaci?n parcial
# Bandas on un intervalo de confianza del 95%
pacf(lm$res,ylim=c(-1,1),col=18,lwd=2,xlab="",ylab="Autocorrelaci?n parcial",main="Autocorrelaci?n parcial")


# 2do Supuesto
# Normalidad

# Shapiro Wilk test 
#recomendado para 50 observaciones o menos
shapiro.test(lm$residuals)
#Hip?tesis nula: Los Residuales son normales
#Hip?tesis alternativa:  Los Residuales No son normales
#Si el valor-p es menor que el nivel de significancia (alpha) debe ser 
#rechazada la hip?tesis nula

lillie.test(lm$residuals)
#Hip?tesis nula: Los Residuales son normales
#Hip?tesis alternativa:  Los Residuales No son normales
#Si el valor-p es menor que el nivel de significancia (alpha) debe ser 
#rechazada la hip?tesis nula

ad.test(lm$residuals)
#Hip?tesis nula: Los Residuales son normales
#Hip?tesis alternativa:  Los Residuales No son normales
#Si el valor-p es menor que el nivel de significancia (alpha) debe ser 
#rechazada la hip?tesis nula

cvm.test(lm$residuals)
sf.test(lm$residuals)
frosini.norm.test(lm$residuals)
geary.norm.test(lm$residuals)
jb.norm.test(lm$residuals)
kurtosis.norm.test(lm$residuals)
skewness.norm.test(lm$residuals)

#agostino.test(lm$residuals)

#An?lisis gr?fico de la normalidad de los residuales
summary(lm$residuals)
par(mfcol=c(1,1))
boxplot(lm$residuals,col = c("pink"))
abline(0,0,col=1,lwd=2,lty=2)

par(mfcol=c(1,1))
hist(lm$residuals,ylab="",xlab="",main="",
     col="pink",freq=FALSE)#,ylim=c(0,0.005))
lines(density(lm$residuals),col="black",lwd=2,lty=2)
legend("topright",col="black",legend=c("Curva de densidad"),lwd=2,lty=2,bty="n")


par(mfcol=c(1,1))
qqnorm(lm$residuals,
       col=18,pch=17) 
qqline(lm$residuals,lwd=2,lty=2)

skewness(lm$residuals)
kurtosis(lm$residuals)

# 3re Supuesto
# Varianza constante


bptest(lm) #studentized Breusch-Pagan test %%
#Hip?tesis nula: Las varianzas de los tratamiento son iguales
#Hip?tesis alternativa: Las varianzas de los tratamiento No son iguales
#Si el valor-p es menor que el nivel de significancia (alpha) debe ser 
#rechazada la hip?tesis nula


#An?lisis gr?fico del supuesto de varianza constante

par(mfrow=c(1,1))
plot(lm$fit,lm$res,xlab="Predichos",ylab="Residuales",main="",col=18,pch=17)
abline(h=0,col=1,lty=2,lwd=2)

what is an 'optimal value of a regression' ?

I want to obtain the optimal value and the 4 factors that make the regression model optimal. I would very much appreciate your response.

are you wanting to extract the coefficients of the fitted model ?

Hello, no, I want to obtain the optimal value of the model called lm and the values ​​of length, pitch, helix and radius that maximize the model at that optimum.

I think @nirgrahamuk is asking you to define "optimal." A regression minimizes the sum of squared residuals and (equivalently) maximizes the log likelihood. Do you want to know one of those? The regression chooses coefficients multiplying the variables to do this, not values of the variables themselves.

1 Like

what constraints should be on the inputs to make them realistic ?
I did an unconstrained optimisation and got a much higher result than what would be achieved by the numbers you proposed, I suppose because the inputs were unrealistic.
is radior = 0 realistic ? if I evaluate your inputs and replace radior with 0 I get a higher value 0.736 compared to 0.552

Correct, I want to find the four factors that maximize the regression model, in an article a value of 0.5521 appeared as optimal for the model for values ​​of length= 319.9 pitch= 158.976 helix= 69.7967 radius= 0.144, these factors maximized the model. How could I find them in R study with lm?

The ranges in which each factor must be is:
length= (300 to 340)
step= (120 to 200)
helix=(60 to 80) radius= (0.2 to 0.6)

Correct, I want to find the four factors that maximize the regression model, in an article a value of 0.5521 appeared as optimal for the model for values ​​of length= 319.9 pitch= 158.976 helix= 69.7967 radius= 0.144, these factors maximized the model. How could I find them in R study with lm?
.............................
The ranges in which each factor must be is:
length= (300 to 340)
step= (120 to 200)
helix=(60 to 80) radius= (0.2 to 0.6)


cp <- c(0.4153,0.4572,0.26,0.1446,0.0749,0.2778,0.294,0.4074,0.0912,0.2541,0.0855,0.444,0.0983,0.4263,0.0712,0.1286,0.2951,0.1389,0.128,0.3865,0.5113,0.5478,0.2631,0.4023,0.2962,0.2962,0.2962)
longitud<- c(340,340,320,300,340,320,340,340,340,320,320,300,300,340,300,340,320,340,300,300,300,320,300,300,320,320,320)
paso<-c(200,120,200,120,200,160,160,120,120,160,160,200,120,200,200,120,120,200,200,200,120,160,160,120,160,160,160)
helice<- c(80,60,70,60,80,60,70,80,80,80,70,80,80,60,80,60,70,60,60,60,60,70,70,80,70,70,70)
radior<- c(0.2,0.2,0.4,0.6,0.6,0.4,0.4,0.2,0.6,0.4,0.6,0.2,0.6,0.2,0.6,0.6,0.4,0.6,0.6,0.2,0.2,0.2,0.4,0.2,0.4,0.4,0.4)


lm1 <- lm(cp~I(longitud)+I(paso)+I(helice)+I(radior)+I(longitud^2)+I(paso^2)+I(helice^2)+I(radior^2)+
           (I(longitud)*I(paso))+(I(longitud)*I(helice))+(I(longitud)*I(radior))+(I(paso)*I(helice))
         +(I(paso)*I(radior))+(I(helice)*I(radior)))


fwrap <- function(x){
  indata <- data.frame(
    longitud=x[[1]],
    paso=x[[2]],
    helice=x[[3]],
    radior=x[[4]]
  )
  
  predict(lm1,newdata=indata)
}

#test
fwrap(x=c(
  longitud= 319.9, paso= 158.976 ,helice= 69.7967 ,radior= 0.144
))


(result <- optim(par=c(
  longitud= 319.9, paso= 158.976 ,helice= 69.7967 ,radior= 0.144),
fn = fwrap,
lower = c(300,120,60,.2),
upper = c(340,200,80,.6),
method = "L-BFGS-B",
control = list(
  fnscale = -.01 # make it a maximisation
)))

#verify
fwrap(x=result$par)

Agradezco de antemano tu ayuda hace unos días, asumiendo que no se conocen los valores que óptimizan la efficiencia, como se podrían hallar los valores de los factores A,B,C y D que optimizan la efficiencia?
Teniendo en cuenta que la efficiencia maxima debe dar alrededor de 0.5524.
Hello, I want to know how I find the optimal values ​​of my regression model for four factors?
the optimal value obtained must be greater than 0.5.What is the value of A, B, C and D that maximize the efficiency vector?I greatly appreciate your help

efficiency <- c(0.4153,0.4572,0.26,0.1446,0.0749,0.2778,0.294,0.4074,0.0912,0.2541,0.0855,0.444,0.0983,0.4263,0.0712,0.1286,0.2951,0.1389,0.128,0.3865,0.5113,0.5478,0.2631,0.4023,0.2962,0.2962,0.2962)
A<- c(340,340,320,300,340,320,340,340,340,320,320,300,300,340,300,340,320,340,300,300,300,320,300,300,320,320,320)
B<-c(200,120,200,120,200,160,160,120,120,160,160,200,120,200,200,120,120,200,200,200,120,160,160,120,160,160,160)
C<- c(80,60,70,60,80,60,70,80,80,80,70,80,80,60,80,60,70,60,60,60,60,70,70,80,70,70,70)
D<- c(0.2,0.2,0.4,0.6,0.6,0.4,0.4,0.2,0.6,0.4,0.6,0.2,0.6,0.2,0.6,0.6,0.4,0.6,0.6,0.2,0.2,0.2,0.4,0.2,0.4,0.4,0.4)

lm1 <- lm(cp~I(A)+I(B)+I(C)+I(D)+I(A^2)+I(B^2)+I(C^2)+I(D^2)+
(I(A)*I(B))+(I(A)*I(C))+(I(A)*I(D))+(I(B)*I(C))
+(I(B)*I(D))+(I(C)*I(D)))

Is this setup different from the previous , it looks similar, why not translate from one to the other i.e. lognditude is A paso is B etc. Etc.

Hello, yes it is the same as the previous one but let's assume that we do not know the value of A, B, C and D that maximize efficiency. How could you find in R or another program the value of A, B, C and D that maximize the efficiency?

I dont understand. Why cant you reuse the approach already given?

Hello, I want to know how I find the optimal values ​​of my regression model for four factors?
the optimal value obtained must be greater than 0.5.What is the value of A, B, C and D that maximize the efficiency vector?I greatly appreciate your help

efficiency <- c(0.4153,0.4572,0.26,0.1446,0.0749,0.2778,0.294,0.4074,0.0912,0.2541,0.0855,0.444,0.0983,0.4263,0.0712,0.1286,0.2951,0.1389,0.128,0.3865,0.5113,0.5478,0.2631,0.4023,0.2962,0.2962,0.2962)
A<- c(340,340,320,300,340,320,340,340,340,320,320,300,300,340,300,340,320,340,300,300,300,320,300,300,320,320,320)
B<-c(200,120,200,120,200,160,160,120,120,160,160,200,120,200,200,120,120,200,200,200,120,160,160,120,160,160,160)
C<- c(80,60,70,60,80,60,70,80,80,80,70,80,80,60,80,60,70,60,60,60,60,70,70,80,70,70,70)
D<- c(0.2,0.2,0.4,0.6,0.6,0.4,0.4,0.2,0.6,0.4,0.6,0.2,0.6,0.2,0.6,0.6,0.4,0.6,0.6,0.2,0.2,0.2,0.4,0.2,0.4,0.4,0.4)

lm1 <- lm(cp~I(A)+I(B)+I(C)+I(D)+I(A^2)+I(B^2)+I(C^2)+I(D^2)+
(I(A)*I(B))+(I(A)*I(C))+(I(A)*I(D))+(I(B)*I(C))
+(I(B)*I(D))+(I(C)*I(D)))

At this point you are simply asking for others to do your work. It would serve you to at least attempt to solve this yourself with the aid of the example given. If you get stuck on a specific issue or unusual detail, that would warrant a further question , but not until then.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.