# Multiple regression: p-values for unstandardized data differ from p-values for standardized data

I fitted a multiple regression on unstandardized data using `lm` and got these values:

``````
Call:
lm(formula = reg.modeldef2d, data = dta2)

Residuals:
     Min       1Q   Median       3Q      Max
-1.48129 -0.47499 -0.03837  0.43007  1.32839

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                -1.96183    0.68485  -2.865  0.00535 **
WPR.Teiler.theta            0.11308    0.13375   0.845  0.40041
WPR.Prim.theta             -0.16453    0.09212  -1.786  0.07794 .
SD.math                     0.38957    0.13936   2.795  0.00650 **
SR3.theta                   0.20984    0.10707   1.960  0.05355 .
SW2.Teiler.theta            0.05043    0.09504   0.531  0.59716
SR3.theta:SW2.Teiler.theta  0.22049    0.08912   2.474  0.01550 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6724 on 79 degrees of freedom
Multiple R-squared:  0.4074,	Adjusted R-squared:  0.3624
F-statistic: 9.052 on 6 and 79 DF,  p-value: 1.583e-07
``````

Then I standardized the data using `scale()` and fitted the very same regression:

``````
Call:
lm(formula = reg.modeldef2d, data = dta2_std)

Residuals:
     Min       1Q   Median       3Q      Max
-1.75916 -0.56409 -0.04557  0.51075  1.57757

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                -0.02616    0.08675  -0.302   0.7638
WPR.Teiler.theta            0.08763    0.10365   0.845   0.4004
WPR.Prim.theta             -0.17215    0.09639  -1.786   0.0779 .
SD.math                     0.36673    0.13119   2.795   0.0065 **
SR3.theta                   0.23590    0.11639   2.027   0.0461 *
SW2.Teiler.theta            0.05409    0.10077   0.537   0.5930
SR3.theta:SW2.Teiler.theta  0.21524    0.08700   2.474   0.0155 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7985 on 79 degrees of freedom
Multiple R-squared:  0.4074,	Adjusted R-squared:  0.3624
F-statistic: 9.052 on 6 and 79 DF,  p-value: 1.583e-07
``````

The p-values are not the same! Also, the intercept isn't 0 (which it should be with standardized data, shouldn't it?).
I checked for NA values, but there are none.

Could the numerical precision of R's floating-point values be the problem? When I manually transform the standardized data back to the original scale and compute the differences from the original unstandardized values, not all differences are exactly zero (though they are tiny, on the order of 1e-17).
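
Differences of that size are typical of floating-point rounding. For reference, a minimal sketch of such a round-trip check, using simulated data as a stand-in for `dta2` (the names `raw`, `scaled`, and `restored` are made up for illustration and do not come from my actual script):

```r
# Hypothetical stand-in for dta2; the real data frame is not shown in the post.
set.seed(1)
raw <- data.frame(x1 = rnorm(86, mean = 2, sd = 3),
                  x2 = rnorm(86, mean = -1, sd = 0.5))

# scale() returns a matrix and stores the column means and SDs as attributes.
scaled  <- scale(raw)
centers <- attr(scaled, "scaled:center")
scales  <- attr(scaled, "scaled:scale")

# Undo the standardization: multiply each column by its SD, then add its mean.
restored <- sweep(sweep(scaled, 2, scales, `*`), 2, centers, `+`)

# Largest absolute deviation from the original values.
max_abs_diff <- max(abs(restored - as.matrix(raw)))
print(max_abs_diff)

# all.equal() compares with a numerical tolerance instead of exact equality.
print(isTRUE(all.equal(as.matrix(raw), restored, check.attributes = FALSE)))
```

Comparing with `all.equal()` rather than `==` avoids flagging rounding noise as a real difference.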

This matters because after standardization the predictor SR3.theta suddenly becomes significant...

Did you standardize the left-hand side variable as well as the explanatory variables?

Thanks for the quick reply. Yes, I did, I standardized the whole data frame.

I would double-check that. I realize that sounds strange to say, but if you standardized everything then you are quite right to be puzzled about the intercept. A fitted linear regression passes through the means of the data. If both the left- and right-hand sides have mean zero, then the intercept should be zero.
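
One thing that may be worth checking, though it is not confirmed anywhere in this thread: the model contains an interaction term, and the product of two standardized variables generally does not have mean zero. In that case the intercept of the standardized fit equals minus the interaction coefficient times the mean of the product column, and the t-values of the two main effects involved in the interaction can change while the interaction's own t-value stays the same. A minimal sketch on simulated data (the variables `x1`, `x2`, `y` and the coefficients used to generate them are purely hypothetical):

```r
# Simulated data with 86 rows; x1 and x2 are correlated on purpose.
set.seed(42)
n  <- 86
x1 <- rnorm(n, mean = 1, sd = 2)
x2 <- 0.5 * x1 + rnorm(n, mean = -1, sd = 1)
y  <- 1 + 0.3 * x1 + 0.2 * x2 + 0.25 * x1 * x2 + rnorm(n)

raw <- data.frame(y, x1, x2)
std <- as.data.frame(scale(raw))   # standardize every column, including y

fit_raw <- lm(y ~ x1 * x2, data = raw)
fit_std <- lm(y ~ x1 * x2, data = std)

# Compare t-values side by side: the interaction row should agree,
# while the main-effect rows are free to differ.
t_raw <- coef(summary(fit_raw))[, "t value"]
t_std <- coef(summary(fit_std))[, "t value"]
print(round(cbind(t_raw, t_std), 3))

# The product column is not centered even though x1 and x2 are.
prod_mean <- mean(std$x1 * std$x2)
print(prod_mean)

# Intercept of the standardized fit versus -b_interaction * mean(product).
b_std <- coef(fit_std)
print(c(intercept = unname(b_std["(Intercept)"]),
        implied   = unname(-b_std["x1:x2"] * prod_mean)))
```

If this is what is going on, the main-effect coefficients answer different questions in the two fits: in the raw fit each one is the slope when the other interacting variable is 0, in the standardized fit it is the slope when the other variable is at its mean.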

Maybe post more of your code? Or maybe a `summary()` of both data frames? (And show what `reg.modeldef2d` looks like.)
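
Something along these lines would already help. This is only a sketch: `column_check` is a hypothetical helper written for this comment, and the `demo_raw` / `demo_std` data frames are simulated placeholders for `dta2` and `dta2_std`.

```r
# Hypothetical helper: mean, SD, and NA count for every numeric column.
column_check <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]
  data.frame(
    mean = vapply(num, mean, numeric(1)),
    sd   = vapply(num, sd, numeric(1)),
    n_na = vapply(num, function(v) sum(is.na(v)), integer(1))
  )
}

# Simulated placeholders for the raw and standardized data frames.
set.seed(7)
demo_raw <- data.frame(a = rnorm(86, mean = 3, sd = 2), b = runif(86))
demo_std <- as.data.frame(scale(demo_raw))

res_raw <- column_check(demo_raw)
res_std <- column_check(demo_std)

# After standardization every mean should be ~0 and every SD ~1.
print(round(res_raw, 4))
print(round(res_std, 10))
```

Running the same helper on both of your data frames would show at a glance whether every column, including the response, was actually centered and scaled.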

