lonely psu error in survey package for Rmarkdown, but not through console or R script

Hello all,

I am using the survey package to analyze the European Social Survey, 2018. After specifying the survey design object and running an analysis command, such as getting mean trust in parliament by age group and country, it works fine, apparently, via the console and script.

But running these lines through RMarkdown results in a lonely psu error. Since it works via the console, I don't think I've erred in specifying the complex survey design.

Is there an option for the RMarkdown code chunk that would resolve this problem? Any advice or an explanation of what is going on would be really appr. ciated. Thank you!

library(tidyverse)
library(survey)

# ESS data here: https://www.dropbox.com/s/ceq02i2rkyaqrwk/ESS9e03_1.sav
# brief codebook https://www.dropbox.com/s/d9jefeuht8bu6yo/appendix%20general%20codebook.pdf
ess9<-foreign::read.spss("ESS9e03_1.sav", to.data.frame=TRUE)

ess9<-ess9 %>%
  mutate(age=agea)

ess9<-ess9 %>%
  mutate(age = na_if(agea, 999))

ess9<-ess9 %>%
   mutate(agecat= case_when(
     age %in% 15:25 ~ "1. 15 to 25",
     age %in% 26:36 ~ "2. 26 to 36",
     age %in% 37:47 ~ "3. 37 to 47",
     age %in% 48:58 ~ "4. 48 to 58",
     age %in% 59:69 ~ "5. 59 to 69",
     age %in% 70:90 ~ "6. 70 to 90")) 

ess9<-ess9 %>%
mutate(agecat = fct_relevel(agecat, "1. 15 to 25","2. 26 to 36","3. 37 to 47","4. 48 to 58","5. 59 to 69","6. 70 to 90"))

ess9<-ess9 %>%
  mutate(trust_parliament = trstprl) %>%
  mutate(trust_parliament = fct_recode(trust_parliament,
               NULL = "Refusal",
               NULL = "Don't know",
               NULL = "No answer"))
  

ess9$anweight <- ess9$pweight*ess9$pspwght
ess_design<-svydesign(weights=~anweight, ids=~psu, strata=~stratum, survey.lonely.psu = "remove", data = ess9, nest=TRUE)


svyby(~as.numeric(trust_parliament), ~ agecat, ess_design, na.rm=TRUE, svymean)
svyby(~as.numeric(trust_parliament), ~ agecat + cntry, ess_design, na.rm=TRUE, svymean)

You can try using the following option which does a conservative adjustment of adjusting the lonely-PSU (a stratum with only one PSU) around the grand mean. I'm not sure why this wouldn't occur in the interactive session but does occur in markdown unless you applied this option and forgot about it.

options(survey.lonely.psu="adjust")

Now, that I look at your code, I see you try to put a survey.lonely.psu option when you create the design object but that's not where it goes. There's some other odd things in your code. I've redone it below correctly using both survey package and the srvyr package. I'm just doing analysis for Italy to make it run a bit slower as an example.

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.3     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   2.0.1     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(survey)
## Loading required package: grid

## Loading required package: Matrix

## 
## Attaching package: 'Matrix'

## The following objects are masked from 'package:tidyr':
## 
##     expand, pack, unpack

## Loading required package: survival

## 
## Attaching package: 'survey'

## The following object is masked from 'package:graphics':
## 
##     dotchart
library(here)
## here() starts at C:/Users/..
# ESS data here: https://www.dropbox.com/s/ceq02i2rkyaqrwk/ESS9e03_1.sav
# brief codebook https://www.dropbox.com/s/d9jefeuht8bu6yo/appendix%20general%20codebook.pdf

ess9_in <-haven::read_spss(here::here("ESS9e03_1.sav"))

ess9<-ess9_in %>%
  mutate(
    age=agea,
    age = na_if(agea, 999),
    agecat= case_when(
      age %in% 15:25 ~ "1. 15 to 25",
      age %in% 26:36 ~ "2. 26 to 36",
      age %in% 37:47 ~ "3. 37 to 47",
      age %in% 48:58 ~ "4. 48 to 58",
      age %in% 59:69 ~ "5. 59 to 69",
      age %in% 70:90 ~ "6. 70 to 90"),
    agecat = fct_relevel(agecat, "1. 15 to 25","2. 26 to 36","3. 37 to 47","4. 48 to 58","5. 59 to 69","6. 70 to 90"),
    trust_parliament = as.factor(trstprl),
    tp_num=as.character(trust_parliament) %>% as.numeric(),
    anweight=pweight*pspwght
    ) %>%
  select(psu, stratum, anweight, agecat, trust_parliament, tp_num, cntry)%>%
  filter(cntry=="IT") 
  

options(survey.lonely.psu="adjust")
ess_design<-svydesign(weights=~anweight, ids=~psu, strata=~stratum, data = ess9, nest=TRUE)
svyby(~tp_num, ~ agecat, ess_design, na.rm=TRUE, svymean)
##                  agecat   tp_num        se
## 1. 15 to 25 1. 15 to 25 4.640104 0.1577266
## 2. 26 to 36 2. 26 to 36 4.279145 0.1589022
## 3. 37 to 47 3. 37 to 47 4.364255 0.1460727
## 4. 48 to 58 4. 48 to 58 4.413230 0.1417483
## 5. 59 to 69 5. 59 to 69 4.103305 0.1499386
## 6. 70 to 90 6. 70 to 90 3.778527 0.1466672
library(srvyr)
## 
## Attaching package: 'srvyr'

## The following object is masked from 'package:stats':
## 
##     filter
ess_design_2 <- ess9 %>% 
  as_survey_design(ids=psu, strata=stratum, weights=anweight, nest=TRUE)
ess_design_2 %>%
  group_by(agecat) %>%
  summarise(
    tp=survey_mean(tp_num, na.rm=TRUE)
  )
## # A tibble: 7 x 3
##   agecat         tp tp_se
##   <fct>       <dbl> <dbl>
## 1 1. 15 to 25  4.64 0.158
## 2 2. 26 to 36  4.28 0.159
## 3 3. 37 to 47  4.36 0.146
## 4 4. 48 to 58  4.41 0.142
## 5 5. 59 to 69  4.10 0.150
## 6 6. 70 to 90  3.78 0.147
## 7 <NA>         3.85 0.374
1 Like

Ah, thank you StatSteph! It works. --I think I was thrown off by how survey.lonely.psu worked interactively in the design object function, but not RMarkdown, among all the other things...

Also, thank you for fixing the other parts of the survey design. I just found your conference workshop materials for srvyr (https://github.com/szimmer/tidy-survey-aapor-2021) and will work through them --- those notes are exactly what I have been looking for.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.