# standard errors using prediction() in R vs margins in Stata

Hi,

When using the `prediction` function to calculate predicted means (or "margins") I get the same predicted values in both R and Stata, but different standard errors (and different confidence intervals as a result). What is the reason for the difference? Am I calculating the standard errors incorrectly below with my call to `summarize` (`se = mean(se.fitted)`). Or, possibly, is Stata calculating the errors incorrectly?

Ideally, I would be able to reproduce the same results in any software package. I would greatly appreciate any advice for how to replicate the results Stata gives in R.

``````suppressWarnings(suppressPackageStartupMessages({
library(tidyverse)
library(prediction)
}))

# model
mod <- lm(mpg ~ cyl + vs + hp, data = mtcars)

# predicted values
pred <- prediction(mod, at = list(cyl = c(4, 6, 8)))

pred %>%
group_by(cyl) %>%
summarise(pred_mean = mean(fitted) ,
se = mean(se.fitted),
ci_low = pred_mean - (1.96 * se),
ci_high = pred_mean + (1.96 * se),
total_n = n()) %>%
as.data.frame() # to carry out decimals further for printing
#>   cyl pred_mean       se   ci_low  ci_high total_n
#> 1   4  25.60488 1.907449 21.86628 29.34348      32
#> 2   6  20.56328 1.415614 17.78867 23.33788      32
#> 3   8  15.52167 1.771571 12.04939 18.99395      32

# stata output, and code to replicate, below:

# library(haven)

# mtcars %>%
#   write_dta("mtcars.dta"))
#
# use mtcars, clear
# reg mpg vs cyl hp
# margins, at(cyl = (4, 6, 8))
#
# Predictive margins                              Number of obs     =         32
# Model VCE    : OLS
#
# Expression   : Linear prediction, predict()
#
# 1._at        : cyl             =           4
#
# 2._at        : cyl             =           6
#
# 3._at        : cyl             =           8
#
``````
``````# |            Delta-method
# |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
``````
``````#_at |
# 1  |   25.60488   1.619802    15.81   0.000     22.28687     28.9229
# 2  |   20.56328   .5809866    35.39   0.000     19.37318    21.75337
# 3  |   15.52167   1.379056    11.26   0.000      12.6968    18.34654
``````
``````
# for comparison, ggpredict gives yet a different (though closer) confidence interval

suppressWarnings(suppressPackageStartupMessages({
library(ggeffects)
}))

ggpredict(mod, terms =  c("cyl [4, 6, 8]"))
#> # Predicted values of mpg
#>
#> cyl | Predicted |         95% CI
#> --------------------------------
#>   4 |     25.60 | [22.43, 28.78]
#>   6 |     20.56 | [19.42, 21.70]
#>   8 |     15.52 | [12.82, 18.22]
#>
#> Adjusted for:
#> * vs =   0.44
#> * hp = 146.69
``````

Created on 2022-03-14 by the reprex package (v2.0.0)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.