Skip to content

standard errors using prediction() in R vs margins in Stata  #54

@sgenter1

Description

@sgenter1
  • a question about package functionality

Hi,

When using the prediction function to calculate predicted means (or "margins") I get the same predicted values in both R and Stata, but different standard errors (and different confidence intervals as a result). What is the reason for the difference? Am I calculating the standard errors incorrectly below with my call to summarize (se = mean(se.fitted)). Or, possibly, is Stata calculating the errors incorrectly?

Ideally, I would be able to reproduce the same results in any software package. I would greatly appreciate any advice for how to replicate the results Stata gives in R.

suppressWarnings(suppressPackageStartupMessages({
  library(tidyverse)
  library(prediction)
}))
  
# model
mod <- lm(mpg ~ cyl + vs + hp, data = mtcars)

# predicted values
pred <- prediction(mod, at = list(cyl = c(4, 6, 8)))

pred %>% 
  group_by(cyl) %>% 
  summarise(pred_mean = mean(fitted) ,
            se = mean(se.fitted),
            ci_low = pred_mean - (1.96 * se),
            ci_high = pred_mean + (1.96 * se),
            total_n = n()) %>% 
  as.data.frame() # to carry out decimals further for printing
#>   cyl pred_mean       se   ci_low  ci_high total_n
#> 1   4  25.60488 1.907449 21.86628 29.34348      32
#> 2   6  20.56328 1.415614 17.78867 23.33788      32
#> 3   8  15.52167 1.771571 12.04939 18.99395      32


# stata output, and code to replicate, below:

# library(haven)

# mtcars %>%
#   write_dta("mtcars.dta"))
# 
# use mtcars, clear
# reg mpg vs cyl hp
# margins, at(cyl = (4, 6, 8))
# 
# Predictive margins                              Number of obs     =         32
# Model VCE    : OLS
# 
# Expression   : Linear prediction, predict()
# 
# 1._at        : cyl             =           4
# 
# 2._at        : cyl             =           6
# 
# 3._at        : cyl             =           8
# 
# |            Delta-method
# |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
#_at |
# 1  |   25.60488   1.619802    15.81   0.000     22.28687     28.9229
# 2  |   20.56328   .5809866    35.39   0.000     19.37318    21.75337
# 3  |   15.52167   1.379056    11.26   0.000      12.6968    18.34654

Created on 2022-03-16 by the reprex package (v2.0.0)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions