Manual 2SLS Regression Produces Different Results to IVReg

I am currently following along with Wooldridge's Introductory Econometrics, in which he performs a 2SLS regression of the wage equation

\text{log(Wage)} = \beta_0+\beta_1\text{Educ}+\beta_2\text{Exper}+\beta_3\text{Exper}^2+u_1

where \text{Educ} is endogenous, whilst \text{Exper} and \text{Exper}^2 are exogenous. The mother's and father's education are also assumed to be uncorrelated with u_1, so we use both of these as instrumental variables for \text{Educ}. Hence, the reduced form equation for \text{Educ} is

\text{Educ} = \pi_0+\pi_1\text{Exper}+\pi_2\text{Exper}^2+\pi_3\text{MothEduc}+\pi_4\text{FathEduc}+v_2


If I use ivreg to perform the 2SLS estimation, then I get the same results as Wooldridge:

Reg2SLS <- ivreg(formula=log(wage)~educ+exper+expersq | 

              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.0481003  0.4003281   0.120  0.90442   
educ         0.0613966  0.0314367   1.953  0.05147 . 
exper        0.0441704  0.0134325   3.288  0.00109 **
expersq     -0.0008990  0.0004017  -2.238  0.02574 *

However, if I perform the same process manually, like so:

S1OLS <- lm(formula = educ~exper+expersq+fatheduc+motheduc, data = mroz)

frame <- data.frame(Wage = mroz$wage, Educ = mroz$educ, 
                    FittedEduc = S1OLS$fitted.values,
                    Exper = mroz$exper, ExperSq = mroz$expersq)

S2OLS <- lm(formula = log(Wage)~FittedEduc+Exper+ExperSq, data=frame)

The results differ from those above:

              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  0.1332094  0.3817364   0.349  0.72730   
FittedEduc   0.0568605  0.0310692   1.830  0.06793 . 
Exper        0.0421082  0.0142860   2.948  0.00338 **
ExperSq     -0.0008565  0.0004255  -2.013  0.04477 *

What is the reason for this? I am aware that performing this regression manually invalidates the standard errors and t statistics, but to the best of my understanding, the coefficients should not be affected. So why exactly does my manual regression give different coefficients?

PS: The following packages were used:


I believe, but am not sure, that the difference is that ivreg is only using 428 observations while S1OLS uses 753. No idea why this would be true.

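You are quite right. The cause of the problem is the NAs: the first-stage lm is fitted on all 753 rows, whereas ivreg and the second-stage lm only use the 428 rows with no missing values, so the fitted values come from a different sample. After removing the NA rows before running the first stage, the manual regression gives the same coefficient estimates as ivreg.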
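
For anyone who wants to see the effect in isolation, here is a minimal sketch in base R on simulated data (not the mroz data). All variable names (y, educ, x, z1, z2) and the coefficients used to generate the data are made up for illustration. It fits the first stage once on all rows and once on complete cases only, and compares both second-stage results against the closed-form 2SLS estimator computed on the complete cases.

```r
# Simulated stand-in for the situation in the question: the outcome y is
# missing for some rows, while all regressors and instruments are observed.
set.seed(1)
n  <- 753
z1 <- rnorm(n)                         # instrument (hypothetical)
z2 <- rnorm(n)                         # instrument (hypothetical)
x  <- rnorm(n)                         # exogenous regressor
v  <- rnorm(n)
u  <- 0.5 * v + rnorm(n)               # correlated with v, so educ is endogenous
educ <- 1 + 0.5 * z1 + 0.5 * z2 + 0.3 * x + v
y    <- 0.5 + 0.1 * educ + 0.05 * x + u
y[sample(n, 325)] <- NA                # outcome unobserved for 325 rows
d <- data.frame(y, educ, x, z1, z2)

# Mismatched samples: first stage on all 753 rows, second stage on 428 rows
# (lm silently drops the rows where y is NA).
s1_all <- lm(educ ~ x + z1 + z2, data = d)
d$fit_all <- fitted(s1_all)
s2_mismatch <- lm(y ~ fit_all + x, data = d)

# Matching samples: drop incomplete rows before BOTH stages.
dc <- d[complete.cases(d[, c("y", "educ", "x", "z1", "z2")]), ]
s1_cc <- lm(educ ~ x + z1 + z2, data = dc)
dc$fit_cc <- fitted(s1_cc)
s2_match <- lm(y ~ fit_cc + x, data = dc)

# Reference: closed-form 2SLS, b = (Xhat'X)^{-1} Xhat'y with Xhat = Pz X,
# computed on the complete cases.
X <- cbind(1, dc$educ, dc$x)
Z <- cbind(1, dc$x, dc$z1, dc$z2)
Xhat <- Z %*% solve(crossprod(Z), crossprod(Z, X))
b_2sls <- drop(solve(crossprod(Xhat, X), crossprod(Xhat, dc$y)))

# Compare coefficient estimates side by side.
print(cbind(mismatch = coef(s2_mismatch),
            match    = coef(s2_match),
            formula  = b_2sls))
```

On the data from the question, the analogous step would be to drop the rows where wage is NA before fitting S1OLS, so that both stages use the same 428 observations. As noted in the question, the standard errors from the manual second stage are still not valid even when the coefficients agree.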
