Hello everyone,
For a research project, I need to build a regression model from samples computed by a CFD simulation and then run an optimisation on it. I fitted the regression with lm(...). According to R, the model has an R^2 of 0.9 and an adjusted R^2 of 0.87; with 10-fold cross-validation I get an R^2 of 0.8 and an RMSE of 0.02, which corresponds to about 2%.
For the optimisation I used the package NSGA2R. To do so, I had to copy the coefficients of the regression function and rename the independent variables to x[1], x[2], and so on.
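To illustrate this step, here is a minimal sketch of an objective function that takes the plain vector x and calls predict() on the fitted lm object, which could be an alternative to copying coefficients. The data, the variable names a and b, and the function name objective are hypothetical stand-ins, not the actual DPM model.

```r
# Synthetic stand-in data; a, b and y are hypothetical names
set.seed(2)
d <- data.frame(a = runif(40), b = runif(40))
d$y <- 1 - 0.3 * d$a + 0.4 * d$a * d$b + rnorm(40, sd = 0.01)

fit <- lm(y ~ a * b, data = d)

# Order of the model variables as they appear in x[1], x[2], ...
var_names <- c("a", "b")

# Objective in terms of a plain numeric vector x:
# map x back to named columns and let predict() apply the coefficients
objective <- function(x) {
  newdata <- as.data.frame(as.list(setNames(x, var_names)))
  unname(predict(fit, newdata = newdata))
}

objective(c(0.5, 0.5))
```

Because the model object is reused directly, no coefficient has to be retyped; the only thing to keep consistent is the order in var_names.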
My problem is that the optimisation results differ considerably from the validation values. When I recomputed the regression function in Excel, I found that its values deviate from the simulation results far too much for the reported R^2 values to be correct. In the attached image, the simulation results are shown in blue and the values calculated with the regression in yellow; these are the same data points the regression was fitted to.
My approach to recreating the regression function was to multiply each coefficient by its independent variable (or, for the mixed terms, by the product of both variables). With the lm output below, I would get the following function:
DPM = 1.1934 - 0.3384 * d_Slength + 0.3137 * S_Cpos + (...) + 0.39896 * d_Slength * S_Cpos + 1.026...
Is this correct at all, or do I have to divide the parameters for the mixed terms or adjust them in some other way?
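A minimal sketch of how such a hand-built formula can be checked inside R before it goes to Excel or the optimiser. It uses synthetic data and a hypothetical two-variable model (x1, x2 and y are made-up names) with one mixed term and one squared term, like the model in this post. In R's formula syntax, the coefficient labelled x1:x2 belongs to the plain product x1 * x2 and the one labelled I(x1^2) to the square of x1, so for numeric predictors the manual sum should match fitted() up to floating-point error, provided the unrounded values from coef() are used.

```r
# Synthetic stand-in data; x1, x2 and y are hypothetical names
set.seed(1)
d <- data.frame(x1 = runif(50), x2 = runif(50, 10, 100))
d$y <- 1 + 2 * d$x1 - 0.01 * d$x2 + 0.05 * d$x1 * d$x2 -
  0.5 * d$x1^2 + rnorm(50, sd = 0.01)

fit <- lm(y ~ x1 * x2 + I(x1^2), data = d)
b <- coef(fit)  # full-precision coefficients, not the rounded printout

# Hand-built prediction: each coefficient times its own term
manual <- b["(Intercept)"] +
  b["x1"] * d$x1 +
  b["x2"] * d$x2 +
  b["I(x1^2)"] * d$x1^2 +
  b["x1:x2"] * d$x1 * d$x2

# Same calculation through the design matrix R built for the fit
via_matrix <- drop(model.matrix(fit) %*% b)

# Largest absolute deviation from R's own fitted values
max_diff_manual <- max(abs(manual - fitted(fit)))
max_diff_matrix <- max(abs(via_matrix - fitted(fit)))
```

If both differences are negligible here but the Excel values still deviate, comparing the Excel column against model.matrix(fit) term by term may help locate the mismatch.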
Many thanks for your help and best regards!
Call:
lm(formula = DPM ~ d_Slength + S_Cpos + S_Thickn + d_Salpha +
d_Scpos + S_Length + d_Rthickn + d_Rlenght + d_Slength *
S_Cpos + d_Slength * S_Thickn + d_Slength * d_Salpha + d_Slength *
d_Scpos + d_Slength * S_Length + d_Slength * d_Rthickn +
d_Slength * d_Rlenght + S_Cpos * S_Thickn + S_Cpos * d_Salpha +
S_Cpos * d_Scpos + S_Cpos * S_Length + S_Cpos * d_Rthickn +
S_Cpos * d_Rlenght + S_Thickn * d_Salpha + S_Thickn * d_Scpos +
S_Thickn * S_Length + S_Thickn * d_Rthickn + S_Thickn * d_Rlenght +
d_Salpha * d_Scpos + d_Salpha * S_Length + d_Salpha * d_Rthickn +
d_Salpha * d_Rlenght + d_Scpos * S_Length + d_Scpos * d_Rthickn +
d_Scpos * d_Rlenght + S_Length * d_Rthickn + S_Length * d_Rlenght +
d_Rthickn * d_Rlenght + I(S_Cpos^2) + I(d_Slength^2) + I(S_Thickn^2),
data = DPM_Datenkomplett)
Residuals:
      Min        1Q    Median        3Q       Max
-0.055933 -0.011381  0.000643  0.009256  0.058509
Coefficients:
                        Estimate Std. Error t value             Pr(>|t|)
(Intercept)           1.19343608 0.04592284  25.988 < 0.0000000000000002 ***
d_Slength            -0.33835784 0.03673592  -9.211  0.00000000000000179 ***
S_Cpos                0.31373074 0.09257402   3.389             0.000962 ***
S_Thickn              1.00152539 0.34618128   2.893             0.004564 **
d_Salpha              0.01583747 0.00697137   2.272             0.024959 *
d_Scpos               0.09200863 0.03938820   2.336             0.021227 *
S_Length             -0.00118119 0.00023732  -4.977  0.00000228444905966 ***
d_Rthickn             0.05064585 0.04087589   1.239             0.217863
d_Rlenght            -0.09785938 0.04161753  -2.351             0.020405 *
I(S_Cpos^2)          -0.22322571 0.07364307  -3.031             0.003010 **
I(d_Slength^2)       -0.00374295 0.02470567  -0.152             0.879845
I(S_Thickn^2)        -0.77449438 1.06629556  -0.726             0.469105
d_Slength:S_Cpos      0.39895926 0.03916982  10.185 < 0.0000000000000002 ***
d_Slength:S_Thickn    1.02664204 0.15632897   6.567  0.00000000155059495 ***
d_Slength:d_Salpha    0.02368125 0.00377089   6.280  0.00000000621571566 ***
d_Slength:d_Scpos     0.07567882 0.02313591   3.271             0.001414 **
d_Slength:S_Length   -0.00034069 0.00019462  -1.751             0.082697 .
d_Slength:d_Rthickn  -0.05231748 0.02185388  -2.394             0.018284 *
d_Slength:d_Rlenght  -0.07596702 0.02423901  -3.134             0.002188 **
S_Cpos:S_Thickn      -1.53697942 0.23259719  -6.608  0.00000000127085466 ***
S_Cpos:d_Salpha      -0.03905314 0.00619623  -6.303  0.00000000557503180 ***
S_Cpos:d_Scpos       -0.08924233 0.03563258  -2.505             0.013663 *
S_Cpos:S_Length       0.00153957 0.00029551   5.210  0.00000083947182373 ***
S_Cpos:d_Rthickn      0.00417650 0.03733823   0.112             0.911133
S_Cpos:d_Rlenght      0.14838299 0.03797772   3.907             0.000158 ***
S_Thickn:d_Salpha    -0.04721221 0.02402536  -1.965             0.051814 .
S_Thickn:d_Scpos     -0.42548721 0.15904291  -2.675             0.008555 **
S_Thickn:S_Length     0.00204608 0.00126093   1.623             0.107398
S_Thickn:d_Rthickn    0.05897616 0.14923122   0.395             0.693427
S_Thickn:d_Rlenght    0.36554665 0.14274810   2.561             0.011739 *
d_Salpha:d_Scpos     -0.01532567 0.00369658  -4.146  0.00006502238714769 ***
d_Salpha:S_Length     0.00011998 0.00003445   3.482             0.000703 ***
d_Salpha:d_Rthickn    0.00118219 0.00362268   0.326             0.744767
d_Salpha:d_Rlenght   -0.00185424 0.00370562  -0.500             0.617759
d_Scpos:S_Length      0.00020800 0.00019323   1.076             0.283985
d_Scpos:d_Rthickn     0.00738574 0.02376699   0.311             0.756548
d_Scpos:d_Rlenght     0.06390154 0.02226765   2.870             0.004891 **
S_Length:d_Rthickn   -0.00046057 0.00020336  -2.265             0.025398 *
S_Length:d_Rlenght   -0.00031854 0.00021703  -1.468             0.144915
d_Rthickn:d_Rlenght  -0.01644057 0.02115912  -0.777             0.438756
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01982 on 115 degrees of freedom
Multiple R-squared: 0.9016, Adjusted R-squared: 0.8682
F-statistic: 27.01 on 39 and 115 DF, p-value: < 0.00000000000000022