Error when running a formula within a function

Hello everyone,
I am running some basic analysis in R with no issue at all. Please, see code below:

#Specifying survey design (Survey data comes with survey weights as well as bootstrap weights)
#That survey information is captured below)
library("survey")

CCHSDesign = svrepdesign(
  data            = MyDat,
  weights         = ~WGT_FULL,
  repweights      = "BSW[1-9]+",                 
  combined.weights = T,   
  type             = "BRR")


###Performing overall analysis
perc.O = as.data.frame(svymean(~IRO, CCHSDesign, na.rm = TRUE))

###Performing analysis by status
perc.status = as.data.frame(svyby(~IRO,~factor(status),
                            CCHSDesign, svymean,na.rm = TRUE)) 


However, when I try to run the same analysis within a function I get error messages. Please, see my function below:

###Creating a function to perform the same analysis
MyFunction = function(nutr,group){
  
  ###Performing overall analysis
  perc.O = as.data.frame(svymean(~nutr, CCHSDesign, na.rm = TRUE))
  
  ###Performing analysis by status
  perc.status = as.data.frame(svyby(~nutr,~factor(group),
                                    CCHSDesign, svymean,na.rm = TRUE))

}

MyEstimates = MyFunction(IRO,status)

I have been struggling with this issue for a few months already. A suggestion to solve my problem was posted in the link below:

However, I tried many combinations of the proposed solution and still not working properly. I would appreciate if anyone can tell me how to make it run.

MyDat = tibble::tribble(
                 ~status, ~IRO,   ~WGT_FULL,       ~BSW1,       ~BSW2,       ~BSW3,
              "pregnant",  45L,     29316.4,           0,    58050.94,           0,
              "pregnant",  17L, 401.9984918, 545.8072449, 561.8532389, 504.8898659,
              "pregnant",  32L,    2203.624,           0,    4915.119,    5293.266,
              "pregnant",  33L, 779.2059197, 2304.176342, 2633.064775,           0,
              "pregnant",  17L,    8958.373,           0,           0,     10007.5,
          "non-pregnant",  52L,    3307.726,           0,    5982.301,    4324.108,
          "non-pregnant",  16L, 304.3146592, 567.9649596,           0, 329.7168172,
          "non-pregnant",  67L,    425.8173,           0,           0,    1285.752,
          "non-pregnant",  29L, 974.3875169,           0, 1191.837293,           0,
          "non-pregnant",  19L,    2502.044,    2963.788,     2939.72,    2785.933,
          "non-pregnant",   9L,    2210.339,    2674.871,           0,    2891.036,
          "non-pregnant",  19L,    642.0245,           0,    765.6621,    649.7655,
          "non-pregnant",  12L,     1322.31,           0,    1497.407,    1383.725,
          "non-pregnant",   4L,     574.426,    1455.648,    1414.688,           0,
          "non-pregnant",  73L,    126.7128,           0,    189.9741,    181.5099,
          "non-pregnant",  14L,    676.4648,           0,    718.8922,    842.7405,
          "non-pregnant",  18L,    4027.879,    4874.392,           0,    5268.307,
          "non-pregnant",  97L,    2196.652,           0,    3226.484,     3420.65,
          "non-pregnant",   7L,    1993.748,           0,           0,    3269.434,
          "non-pregnant",  20L,    485.4677,    606.3933,    2182.711,           0
          )

Thanks a lot,
A.G.

Additional relevant information.
OS: Windows 10 (64-bit)
R version: 3.6.2
R studio version: 3.5

You had a good instinct to share something about the data of concern.

However, the way you did it does not facilitate easy copy and pasting of your data, so a user can easily add it to their R session.

You can read this guide to see how such things are possible via datapasta package, or base::dput()
FAQ: How to do a minimal reproducible example ( reprex ) for beginners

Hi @nirgrahamuk ,
Thanks a lot for your great feedback. This is extremely helpful. I already editted the post to account for this modification. I believe this is a right way to do it.
I cannot wait to heard from someone who could help me to solve the problem I am stuck with for a few months now.
A.G.

All I can say is that x is not in exdf. For a proper reprex, representative data is essential.

is invisible to anyone on any other system than yours. It doesn't have to be all the data or even your data. It just has to produce the same error message.

Hi @technocrat ,
Not sure I understand what you mean. You don't need lm(y~x,data = exdf) at all. It's just a example @nirgrahamuk used. Unfortunately, I could not reproduce his idea in my problem and still waiting for someone to help me solve the issue.

You can use my example data set, which at the bottom of the post. I'll edit my post to remove the line file="Location\Example.csv" to prevent future confusion.
Thanks a lot for taking the time to look into this.

library("survey")
library(tibble)
MyDat = tibble::tribble(
  ~status, ~IRO,   ~WGT_FULL,       ~BSW1,       ~BSW2,       ~BSW3,
  "pregnant",  45L,     29316.4,           0,    58050.94,           0,
  "pregnant",  17L, 401.9984918, 545.8072449, 561.8532389, 504.8898659,
  "pregnant",  32L,    2203.624,           0,    4915.119,    5293.266,
  "pregnant",  33L, 779.2059197, 2304.176342, 2633.064775,           0,
  "pregnant",  17L,    8958.373,           0,           0,     10007.5,
  "non-pregnant",  52L,    3307.726,           0,    5982.301,    4324.108,
  "non-pregnant",  16L, 304.3146592, 567.9649596,           0, 329.7168172,
  "non-pregnant",  67L,    425.8173,           0,           0,    1285.752,
  "non-pregnant",  29L, 974.3875169,           0, 1191.837293,           0,
  "non-pregnant",  19L,    2502.044,    2963.788,     2939.72,    2785.933,
  "non-pregnant",   9L,    2210.339,    2674.871,           0,    2891.036,
  "non-pregnant",  19L,    642.0245,           0,    765.6621,    649.7655,
  "non-pregnant",  12L,     1322.31,           0,    1497.407,    1383.725,
  "non-pregnant",   4L,     574.426,    1455.648,    1414.688,           0,
  "non-pregnant",  73L,    126.7128,           0,    189.9741,    181.5099,
  "non-pregnant",  14L,    676.4648,           0,    718.8922,    842.7405,
  "non-pregnant",  18L,    4027.879,    4874.392,           0,    5268.307,
  "non-pregnant",  97L,    2196.652,           0,    3226.484,     3420.65,
  "non-pregnant",   7L,    1993.748,           0,           0,    3269.434,
  "non-pregnant",  20L,    485.4677,    606.3933,    2182.711,           0
)

CCHSDesign = svrepdesign(
  data            = MyDat,
  weights         = ~WGT_FULL,
  repweights      = "BSW[1-9]+",                 
  combined.weights = T,   
  type             = "BRR")


  MyFunction <- function(nutr,group){
    nutr_s <- deparse(substitute(nutr))
    group_s <-  deparse(substitute(group))

    ###Performing overall analysis
    perc.O = as.data.frame(svymean(as.formula(paste0("~",nutr_s)), CCHSDesign, na.rm = TRUE))
    
    ###Performing analysis by status
    perc.status = as.data.frame(svyby(as.formula(paste0("~",nutr_s)),
                                      as.formula(paste0("~factor(",group_s,")")),
                                      CCHSDesign, svymean,na.rm = TRUE))
    
  }

MyEstimates = MyFunction(IRO,status)
1 Like

That's awesome @nirgrahamuk . Thank you so much!

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.