How to test whether a coefficient is statistically significant but economically insignificant?


In my model, I test many different variables (in separate regressions). Many of them are statistically insignificant. Some variables are statistically significant, but their coefficients are quite small, so I do not know whether these variables are economically significant. How can I test this in R?

Significance is a deeply misleading term. Typically, a statistical test is performed, and its p-value measures the probability of seeing a value at least that extreme if chance alone were at work (that is, if the null hypothesis were true). At an \alpha of 0.05, the conventional threshold, we can think of this as accepting a 1 in 20 chance of a false alarm. To put that in perspective, imagine a table with 4 revolvers, each of which holds 5 shots. You are told that all of them are empty except for one chamber in one revolver. Pick one up and hold it to your head. How confident do you feel, pulling the trigger, that you will hear a click?

By the numbers, there is only a 0.05 chance that you won't hear the click (i.e., because you're dead before it registers). But if you keep trying, revolver by revolver, at some point the one bullet's turn will come up and game over.

That puts a new gloss on the distinction between significant and meaningful.

Use the p-value significance test for a given \alpha when you want to put a measure on the likelihood of results being random. That's all it's there for.
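The "keep trying" point above can be put into numbers. This is a minimal sketch that assumes the tests are independent, which separate regressions on the same data usually are not, so treat the figures as a rough guide only. The values of k are arbitrary illustrations.

```r
# Chance of at least one false positive across k independent tests,
# each run at significance level alpha (independence is an assumption)
alpha <- 0.05
k <- c(1, 5, 10, 20, 50)

fwer <- 1 - (1 - alpha)^k

data.frame(tests = k, prob_at_least_one_false_positive = round(fwer, 3))
```

The probability grows quickly with the number of tests, which is why a single p-value near 0.05 out of many regressions deserves little weight.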

Thank you so much. Could you please tell me the exact R package that I should use?


Yes, that is what was explained earlier. If you run multiple regression models, one of them is likely to give you a significant result by pure chance. I would take that into account: if the p-value is around 0.05, I would dismiss it as pure chance. If it's much closer to 0, you can consider the effect significant.

If you want something specific, you can use the lm() function from the stats package, which will give you the p-value.

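# lm() is part of base R's stats package, so no extra library is needed
model <- lm(mpg ~ cyl, data = mtcars)

# The coefficient table reports the p-value in the Pr(>|t|) column
summary(model)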
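One way to take the "many separate regressions" concern into account more formally is to adjust the p-values for the number of tests with p.adjust() from the stats package. The sketch below is only an illustration: the list of predictors is a hypothetical choice from mtcars, and Holm and Benjamini-Hochberg are two of several available methods.

```r
# Hypothetical set of predictors, each tested in its own regression on mpg
predictors <- c("cyl", "disp", "hp", "drat", "wt", "qsec")

# Fit one simple regression per predictor and extract the slope's p-value
p_raw <- sapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "mpg"), data = mtcars)
  coef(summary(fit))[v, "Pr(>|t|)"]
})

# Adjust for the number of tests performed
p_holm <- p.adjust(p_raw, method = "holm")  # controls the family-wise error rate
p_bh   <- p.adjust(p_raw, method = "BH")    # controls the false discovery rate

round(cbind(raw = p_raw, holm = p_holm, BH = p_bh), 4)
```

A coefficient that stays below the chosen \alpha after adjustment is less likely to be one of the chance findings described above.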

I would be very cautious about labeling a coefficient "small" unless all of the measurements have been normalized. In the following example, I look at the relationship between the mass of some objects and their length. As the units of the length change, the absolute size of the fitted coefficient changes, but the p-value is constant and any practical significance would also be constant.

set.seed(1)
DF <- data.frame(Meters = c(1.5, 1.6, 1.7, 1.8), 
                 Mass = c(1.5, 1.6, 1.7, 1.8) * 1.5 + rnorm(4, mean = 0 , sd = 0.1),
                 KM = c(1.5, 1.6, 1.7, 1.8)/1000,
                 MM = c(1.5, 1.6, 1.7, 1.8) * 1000)

summary(lm(Mass ~ Meters, data = DF))
#> 
#> Call:
#> lm(formula = Mass ~ Meters, data = DF)
#> 
#> Residuals:
#>        1        2        3        4 
#>  0.01412  0.03867 -0.11971  0.06692 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)  -0.9237     0.7489  -1.233   0.3427  
#> Meters        2.0646     0.4528   4.559   0.0449 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1013 on 2 degrees of freedom
#> Multiple R-squared:  0.9122, Adjusted R-squared:  0.8684 
#> F-statistic: 20.79 on 1 and 2 DF,  p-value: 0.04489
summary(lm(Mass ~ KM, data = DF))
#> 
#> Call:
#> lm(formula = Mass ~ KM, data = DF)
#> 
#> Residuals:
#>        1        2        3        4 
#>  0.01412  0.03867 -0.11971  0.06692 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)  
#> (Intercept)   -0.9237     0.7489  -1.233   0.3427  
#> KM          2064.5932   452.8179   4.559   0.0449 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1013 on 2 degrees of freedom
#> Multiple R-squared:  0.9122, Adjusted R-squared:  0.8684 
#> F-statistic: 20.79 on 1 and 2 DF,  p-value: 0.04489
summary(lm(Mass ~ MM, data = DF))
#> 
#> Call:
#> lm(formula = Mass ~ MM, data = DF)
#> 
#> Residuals:
#>        1        2        3        4 
#>  0.01412  0.03867 -0.11971  0.06692 
#> 
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)  
#> (Intercept) -0.9236577  0.7488628  -1.233   0.3427  
#> MM           0.0020646  0.0004528   4.559   0.0449 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1013 on 2 degrees of freedom
#> Multiple R-squared:  0.9122, Adjusted R-squared:  0.8684 
#> F-statistic: 20.79 on 1 and 2 DF,  p-value: 0.04489

Created on 2021-03-25 by the reprex package (v0.3.0)
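The same comparison can be made more compactly by collecting the slope row from each fit into one table. This is a sketch that rebuilds the data frame from the example above; slope_row is a helper name introduced here for illustration.

```r
set.seed(1)
meters <- c(1.5, 1.6, 1.7, 1.8)
DF <- data.frame(Meters = meters,
                 Mass = meters * 1.5 + rnorm(4, mean = 0, sd = 0.1),
                 KM = meters / 1000,
                 MM = meters * 1000)

# Hypothetical helper: regress Mass on one length column and
# return that column's row of the coefficient table
slope_row <- function(x) {
  fit <- lm(reformulate(x, response = "Mass"), data = DF)
  coef(summary(fit))[x, ]
}

# One row per unit of length
res <- t(sapply(c("Meters", "KM", "MM"), slope_row))
res
```

The Estimate and Std. Error columns scale with the units, while the t value and Pr(>|t|) columns do not change.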


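Statistical significance is about whether the coefficient can be distinguished from 0: a small p-value means an estimate this far from 0 would be unlikely if the true coefficient were 0. The above answers provide good info.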

I don't think there is a test for "economically significant". To get a feel for the size of your coefficient, you could do the following: multiply a reasonable change in the variable in question (e.g., one or two standard deviations) by the coefficient, and judge whether the resulting change in the dependent variable can be considered "significant in an economic sense". This is of course just a rule of thumb.
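As a rough sketch of that rule of thumb, the code below uses mtcars with mpg regressed on wt. The dataset, the variables, and the one-standard-deviation change are illustrative assumptions, not a recommendation for any particular model.

```r
# Illustrative model: fuel economy explained by vehicle weight
fit <- lm(mpg ~ wt, data = mtcars)

b_wt  <- coef(fit)["wt"]   # estimated change in mpg per unit of wt
sd_wt <- sd(mtcars$wt)     # a "reasonable" change: one standard deviation

# Implied change in mpg when wt rises by one standard deviation
effect_1sd <- b_wt * sd_wt

# Put the effect in context: as a share of the average mpg
effect_share <- effect_1sd / mean(mtcars$mpg)

c(effect_1sd = unname(effect_1sd), share_of_mean = unname(effect_share))
```

Whether a change of that size matters is a subject-matter judgment, not something the output can decide.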

The lm.beta package, last updated in 2014, is a simple way to add the standardized (beta) coefficients to the lm() results. They measure how much Y changes, in units of its standard deviation, when X changes by one of its standard deviations. In the example below, if wt increases by one standard deviation then mpg will decrease by 0.518 standard deviations. This is a fairly large response, and the larger of the two explanatory variables' effects. There is no rule for when something is economically significant.

library(tidyverse)
library(broom)
library(lm.beta)

fit <- lm(mpg ~ cyl + wt, data = mtcars)
fit_b <- lm.beta(fit)
tidy(fit_b)
#> # A tibble: 3 x 6
#>   term        estimate std_estimate std.error statistic  p.value
#>   <chr>          <dbl>        <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    39.7         0         1.71      23.1  3.04e-20
#> 2 cyl            -1.51       -0.447     0.415     -3.64 1.06e- 3
#> 3 wt             -3.19       -0.518     0.757     -4.22 2.22e- 4

Created on 2021-04-05 by the reprex package (v2.0.0)
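If installing lm.beta is not an option, the standardized coefficients for this model can also be computed by hand in base R as the raw slope times sd(X) / sd(Y). This is a minimal sketch for a model with plain numeric predictors; it is not a description of how lm.beta works internally.

```r
fit <- lm(mpg ~ cyl + wt, data = mtcars)

# Raw slopes, dropping the intercept
b <- coef(fit)[-1]

# Standard deviations of each predictor and of the response
sx <- sapply(mtcars[names(b)], sd)
sy <- sd(mtcars$mpg)

# Standardized (beta) coefficients
std_beta <- b * sx / sy
round(std_beta, 3)
```

The values should agree with the std_estimate column shown above.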

Thank you so much. However, I am wondering: is there an R equivalent of the multiple hypothesis testing commands in Stata that are covered in this overview: https://blogs.worldbank.org/impactevaluations/overview-multiple-hypothesis-testing-commands-stata.?