Difficulty with IV regression

Hi, and welcome!

Please see the FAQ: What's a reproducible example (`reprex`) and how do I create one? A reprex, complete with representative data, will attract faster and more numerous answers. It doesn't have to be your complete data, or even your data at all, so long as it produces the same error.

Take a look at the signature of ivreg in help(ivreg) and the details:

Usage

ivreg(formula, instruments, data, subset, na.action, weights, offset, contrasts = NULL, model = TRUE, y = TRUE, x = FALSE, ...)

Arguments

formula, instruments formula specification(s) of the regression relationship and the instruments. Either instruments is missing and formula has three parts as in y ~ x1 + x2 | z1 + z2 + z3 (recommended) or formula is y ~ x1 + x2 and instruments is a one-sided formula ~ z1 + z2 + z3 (only for backward compatibility).

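That is, the formula in your call is incomplete: ivreg needs the instruments as well, either as a third part of the formula or through the instruments argument.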
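To make the two ways of supplying instruments concrete, here is a minimal sketch on simulated data. The variable names (y, x1, x2, z1, z2), the data frame d, and the data-generating process are all made up for illustration and have nothing to do with your data. It assumes the AER package is installed.

```r
suppressPackageStartupMessages(library(AER))

# Simulated data (hypothetical): x1 is endogenous, x2 is exogenous,
# z1 and z2 are instruments for x1.
set.seed(1)
n  <- 200
z1 <- rnorm(n)
z2 <- rnorm(n)
x2 <- rnorm(n)
u  <- rnorm(n)                                # structural error
x1 <- 0.8 * z1 + 0.5 * z2 + 0.5 * u + rnorm(n)  # correlated with u
y  <- 1 + 2 * x1 - x2 + u
d  <- data.frame(y, x1, x2, z1, z2)

# Recommended: three-part formula, regressors | instruments.
# The exogenous regressor x2 is listed again as its own instrument.
fm_three_part <- ivreg(y ~ x1 + x2 | x2 + z1 + z2, data = d)

# Backward-compatible: two-part formula plus a one-sided instruments formula.
fm_legacy <- ivreg(y ~ x1 + x2, instruments = ~ x2 + z1 + z2, data = d)

# Both specifications describe the same model.
all.equal(coef(fm_three_part), coef(fm_legacy))
```

Either form should work, but the three-part formula is the one the help page recommends.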

See the example given in help(ivreg):

suppressPackageStartupMessages(library(AER)) 

# Examples

## data
data("CigarettesSW", package = "AER")
CigarettesSW$rprice <- with(CigarettesSW, price/cpi)
CigarettesSW$rincome <- with(CigarettesSW, income/population/cpi)
CigarettesSW$tdiff <- with(CigarettesSW, (taxs - tax)/cpi)

## model 
fm <- ivreg(log(packs) ~ log(rprice) + log(rincome) | log(rincome) + tdiff + I(tax/cpi), data = CigarettesSW, subset = year == "1995")
summary(fm)
#> 
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) + 
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year == 
#>     "1995")
#> 
#> Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)    9.8950     1.0586   9.348 4.12e-12 ***
#> log(rprice)   -1.2774     0.2632  -4.853 1.50e-05 ***
#> log(rincome)   0.2804     0.2386   1.175    0.246    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1879 on 45 degrees of freedom
#> Multiple R-Squared: 0.4294,  Adjusted R-squared: 0.4041 
#> Wald test: 13.28 on 2 and 45 DF,  p-value: 2.931e-05
summary(fm, vcov = sandwich, df = Inf, diagnostics = TRUE)
#> 
#> Call:
#> ivreg(formula = log(packs) ~ log(rprice) + log(rincome) | log(rincome) + 
#>     tdiff + I(tax/cpi), data = CigarettesSW, subset = year == 
#>     "1995")
#> 
#> Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -0.6006931 -0.0862222 -0.0009999  0.1164699  0.3734227 
#> 
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)    9.8950     0.9288  10.654  < 2e-16 ***
#> log(rprice)   -1.2774     0.2417  -5.286 1.25e-07 ***
#> log(rincome)   0.2804     0.2458   1.141    0.254    
#> 
#> Diagnostic tests:
#>                  df1 df2 statistic p-value    
#> Weak instruments   2  44   228.738  <2e-16 ***
#> Wu-Hausman         1  44     3.823  0.0569 .  
#> Sargan             1  NA     0.333  0.5641    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1879 on Inf degrees of freedom
#> Multiple R-Squared: 0.4294,  Adjusted R-squared: 0.4041 
#> Wald test: 34.51 on 2 DF,  p-value: 3.214e-08

Created on 2020-03-29 by the reprex package (v0.3.0)