Help writing a function for the infer package

I need some help writing a simple function. I'm guessing this involves tidyeval (which I have completely forgotten since the last time I had to use it), and I'm in a bit of a hurry, so any help would be greatly appreciated!

Here is an example using mtcars:

library(infer)
library(dplyr)


d_hat <- mtcars %>% 
  mutate(vs = if_else(vs == 1, "one", "zero")) %>% 
  specify(disp~vs) %>% 
  calculate(stat = "diff in means", order = c("zero", "one"))

mtcars %>% 
  mutate(vs = if_else(vs == 1, "one", "zero")) %>% 
  specify(disp~vs) %>%  
  hypothesize(null = "independence") %>% 
  generate(reps = 5000, type = "permute") %>% 
  calculate(stat = "diff in means", order = c("zero", "one")) %>% 
  get_pvalue(obs_stat = d_hat, direction = "two_sided")
#> # A tibble: 1 x 1
#>   p_value
#>     <dbl>
#> 1       0

Created on 2019-04-10 by the reprex package (v0.2.1)

If I want p-values for variables other than disp, I could of course repeat everything and change disp each time, but I'd prefer a basic function that lets me change the response variable passed to specify().

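rlang has a new_formula() function that can help with this. Combined with ensym(), which captures the bare column name as a symbol, you can build the formula inside the function. It seems to work: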

library(infer)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union


calc_p <- function(var){
  # Capture the bare column name and build the formula `var ~ vs`
  form <- rlang::new_formula(rlang::ensym(var), quote(vs))
  
  # Recode vs once so both pipelines use the same data
  dat <- mtcars %>% 
    mutate(vs = if_else(vs == 1, "one", "zero"))
  
  # Observed difference in means (zero - one)
  d_hat <- dat %>% 
    specify(form) %>% 
    calculate(stat = "diff in means", order = c("zero", "one"))
  
  # Permutation null distribution and two-sided p-value
  dat %>% 
    specify(form) %>%  
    hypothesize(null = "independence") %>% 
    generate(reps = 5000, type = "permute") %>% 
    calculate(stat = "diff in means", order = c("zero", "one")) %>% 
    get_pvalue(obs_stat = d_hat, direction = "two_sided")
}

calc_p(cyl)
#> # A tibble: 1 x 1
#>   p_value
#>     <dbl>
#> 1       0
calc_p(disp)
#> # A tibble: 1 x 1
#>   p_value
#>     <dbl>
#> 1       0

Created on 2019-04-10 by the reprex package (v0.2.1)
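
To see what the new_formula() line in calc_p is doing on its own, here is a minimal sketch that only builds the formulas, without running infer. make_form and make_form_chr are hypothetical helper names, not part of rlang or infer. The string-based version is an assumption about how you might want to loop over several columns: as far as I can tell, ensym() needs a bare name or a string typed directly in the call, so it will not work when the column name arrives through lapply() or purrr::map(), whereas sym() converts the string value itself.

```r
library(rlang)

# Hypothetical helper: bare column name in, formula `<var> ~ vs` out.
make_form <- function(var) {
  new_formula(ensym(var), quote(vs))
}

# Hypothetical helper: same idea, but the column name is a string,
# which is easier to use inside lapply() or purrr::map().
make_form_chr <- function(var_name, explanatory = "vs") {
  new_formula(sym(var_name), sym(explanatory))
}

f1 <- make_form(disp)
forms <- lapply(c("disp", "hp", "mpg"), make_form_chr)

# Inspect the two sides of the formulas
f_lhs(f1)
f_rhs(f1)
lapply(forms, f_lhs)
```

Each element of forms could then be passed to specify() in place of form, for example inside a version of calc_p that takes a string instead of a bare name.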
