 # Difficulty with IV regression

Hey guys, I'm trying to run a very simple IV regression and I keep getting an error I don't know how to fix. This is my first post, so I apologize if I've formatted the code incorrectly. Simply put, we're regressing the log of wage on education, using the number of siblings as the instrumental variable for education. Here's the problem.

iv1 <- ivreg(lwage~educ|sibs, WAGE2)
Error in ivreg(lwage ~ educ | sibs, WAGE2) :
length(formula) == 1L is not TRUE

I'm assuming that the number of observations for the variables isn't identical, but I have no idea how to fix this. Any ideas would be greatly appreciated.

Hi, and welcome!

Please see the FAQ: What's a reproducible example (`reprex`) and how do I do one? A reprex, complete with representative data, will attract more and quicker answers. The data doesn't have to be your complete data, or even your data at all, so long as it reproduces the same error.
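As a rough illustration of what "representative data" could look like for this question, here is a minimal sketch in base R. The data frame name `toy` and every value in it are made up; only the column names (`lwage`, `educ`, `sibs`) come from the question.

```r
# Simulated stand-in data: column names mirror the question, values are invented.
set.seed(1)
n <- 20
toy <- data.frame(
  sibs = rpois(n, lambda = 3),            # number of siblings
  educ = sample(9:18, n, replace = TRUE)  # years of education
)
toy$lwage <- 6 + 0.05 * toy$educ + rnorm(n, sd = 0.3)

# dput() prints R code that recreates the object, so a few rows
# can be pasted straight into a post.
dput(head(toy, 5))
```

Pasting the `dput()` output together with the failing call gives others something they can run as-is.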

Take a look at the signature of `ivreg` in `help(ivreg)` and the details:

Usage

ivreg(formula, instruments, data, subset, na.action, weights, offset, contrasts = NULL, model = TRUE, y = TRUE, x = FALSE, ...)

Arguments

`formula`, `instruments`: formula specification(s) of the regression relationship and the instruments. Either `instruments` is missing and `formula` has three parts as in `y ~ x1 + x2 | z1 + z2 + z3` (recommended) or `formula` is `y ~ x1 + x2` and `instruments` is a one-sided formula `~ z1 + z2 + z3` (only for backward compatibility).

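That is, the call here does not match that signature: the second unnamed argument, `WAGE2`, is matched by position to `instruments` rather than `data`, which is the likely source of the error. Naming the argument, as in `data = WAGE2`, should avoid it.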
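To see how R matches the arguments of a call like the one in the question against that signature, here is a minimal sketch. `ivreg_stub` is a hypothetical stand-in that copies only the first three argument names from the Usage above; it is not the real `ivreg`. It uses base R's `match.call()`, which does not evaluate its arguments, so `WAGE2` does not need to exist.

```r
# Hypothetical stub: same leading arguments as the Usage above, but it only
# reports which formal argument each supplied value was matched to.
ivreg_stub <- function(formula, instruments, data, ...) {
  names(as.list(match.call())[-1])
}

# Second argument unnamed, as in the question:
# the result is c("formula", "instruments").
print(ivreg_stub(lwage ~ educ | sibs, WAGE2))

# Second argument named: the result is c("formula", "data").
print(ivreg_stub(lwage ~ educ | sibs, data = WAGE2))
```

So in the positional form the data frame is handed to `instruments`, and `data` is left missing.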

See the example given there:

``````
suppressPackageStartupMessages(library(AER))

# Examples

## data
data("CigarettesSW", package = "AER")
CigarettesSW$rprice <- with(CigarettesSW, price/cpi)
CigarettesSW$rincome <- with(CigarettesSW, income/population/cpi)
CigarettesSW$tdiff <- with(CigarettesSW, (taxs - tax)/cpi)

## model
fm <- ivreg(log(packs) ~ log(rprice) + log(rincome) | log(rincome) + tdiff + I(tax/cpi), data = CigarettesSW, subset = year == "1995")
summary(fm)
#>
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) +
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)
#> (Intercept)    9.8950     1.0586   9.348 4.12e-12 ***
#> log(rprice)   -1.2774     0.2632  -4.853 1.50e-05 ***
#> log(rincome)   0.2804     0.2386   1.175    0.246
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.1879 on 45 degrees of freedom
#> Multiple R-Squared: 0.4294,  Adjusted R-squared: 0.4041
#> Wald test: 13.28 on 2 and 45 DF,  p-value: 2.931e-05
summary(fm, vcov = sandwich, df = Inf, diagnostics = TRUE)
#>
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) +
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year ==
#>     "1995")
#>
#> Residuals:
#>        Min         1Q     Median         3Q        Max
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227
#>
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)
#> (Intercept)    9.8950     0.9288  10.654  < 2e-16 ***
#> log(rprice)   -1.2774     0.2417  -5.286 1.25e-07 ***
#> log(rincome)   0.2804     0.2458   1.141    0.254
#>
#> Diagnostic tests:
#>                  df1 df2 statistic p-value
#> Weak instruments   2  44   228.738  <2e-16 ***
#> Wu-Hausman         1  44     3.823  0.0569 .
#> Sargan             1  NA     0.333  0.5641
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.1879 on Inf degrees of freedom
#> Multiple R-Squared: 0.4294,  Adjusted R-squared: 0.4041
#> Wald test: 34.51 on 2 DF,  p-value: 3.214e-08
``````

Created on 2020-03-29 by the reprex package (v0.3.0)
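Applied to the wage question, a call along the same lines might look like the sketch below. The thread does not include `WAGE2`, so the data frame `wage_sim` is simulated, and every number used to generate it (including the 0.08 coefficient on `educ`) is an arbitrary assumption. The `ivreg` call is guarded so the script still runs without AER installed; the hand-computed ratio is the textbook IV slope for one regressor and one instrument, shown only as a cross-check.

```r
# Simulated stand-in for WAGE2; all numbers below are arbitrary assumptions.
set.seed(42)
n <- 2000
ability <- rnorm(n)                  # unobserved confounder
sibs    <- rpois(n, lambda = 3)      # instrument
educ    <- 14 - 0.5 * sibs + ability + rnorm(n)
lwage   <- 6 + 0.08 * educ + 0.1 * ability + rnorm(n, sd = 0.3)
wage_sim <- data.frame(lwage, educ, sibs)

# With one instrument z and one regressor x, the IV slope is cov(z, y) / cov(z, x).
iv_by_hand <- with(wage_sim, cov(sibs, lwage) / cov(sibs, educ))
print(iv_by_hand)

# Same model through AER::ivreg, naming `data` so that the data frame
# is not matched to `instruments`.
if (requireNamespace("AER", quietly = TRUE)) {
  fit <- AER::ivreg(lwage ~ educ | sibs, data = wage_sim)
  print(coef(fit))
}
```

With your own data, the only change needed from the original call should be `data = WAGE2`.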
