Just a quick note: when asking for help, you should put at least as much effort into asking the question as you hope the person answering will put into their answer.
A few things that might help you get an answer:
- Don't make people go looking for the data you are using. Provide it, or a link to it, to speed things up.
- When you include a code snippet giving you trouble, be sure to include code which will load any libraries the code requires to run correctly.
- INCLUDE ALL THE CODE TO MAKE ALL THE OBJECTS! What is cg17_pe_design? I had to read through the terrible code/comments in the YouTube description. But there's still no resource for the cg17_pe data.
- You don't even include the command that produced the output you're getting, so there's no way to begin helping you in any meaningful, concrete way.
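To make the points above concrete, here is a rough sketch of what a self-contained question could look like. The toy data frame and its column names (stratum, psu, wt, smoker) are invented for illustration and have nothing to do with the BRFSS file. The point is only that the libraries, the data, every object definition, and the exact command are all present.

```r
# Load every package the snippet needs.
library(survey)

# Small made-up data set so nobody has to hunt for a download.
toy <- data.frame(
  stratum = c(1, 1, 1, 1, 2, 2, 2, 2),
  psu     = c(1, 1, 2, 2, 1, 1, 2, 2),
  wt      = c(10, 12, 9, 11, 20, 18, 22, 19),
  smoker  = c(1, 2, 1, 2, 2, 2, 1, 2)
)

# Define every object that later commands refer to.
# nest = TRUE because the PSU ids repeat across strata in this toy data.
toy_design <- svydesign(ids = ~psu, strata = ~stratum, weights = ~wt,
                        data = toy, nest = TRUE)

# The exact command whose output the question is about.
svymean(~factor(smoker), toy_design)
```

Anyone can paste that into a fresh R session and see the same thing you see, which is what makes it answerable.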
I'm not intending this to be harsh criticism, so please don't take it that way. This is a welcoming place, I want to help you, and you should feel comfortable asking any questions you have. Please though, put more effort into asking better questions.
That said, because I am a glutton for punishment, I dug around to try to find the data. I think it comes from here:
https://www.cdc.gov/brfss/annual_data/2017/files/LLCP2017XPT.zip
But, there's no way to be sure.
I also don't know how the data is trimmed down. At a certain point in the video a data.frame is displayed with 9355 rows and 23 columns. This seems close to what you'd get if you keep only those rows for which X_STATE == 6 and only those 23 columns, but, again, I don't know for sure, and she seems to reference other columns later which aren't in those 23.
That data also appears to have been re-encoded from its original values so that every column is a two-level factor. I attempted to emulate this, but I didn't see any explanation of what the rules for doing so should be. I also tried to clean up the style, which is atrocious and borderline unreadable.
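For what it's worth, the recoding I ended up with behaves as in this small sketch. The vector x is made up, and the rule (treat anything above 2 as missing, then relabel with labels = 2:1) is my guess rather than something the video states. Note that labels = 2:1 swaps the codes: an original 1 ends up labelled "2" and an original 2 ends up labelled "1".

```r
# Made-up responses: 1 and 2 are real answers, 7 and 9 stand in for
# "don't know" / "refused" style codes (an assumption for illustration).
x <- c(1, 2, 9, 7, 1)

# Anything above 2 becomes missing.
x[x > 2] <- NA

# The two remaining values (1, 2) receive the labels "2" and "1",
# in that order, so the codes are swapped.
recoded <- factor(x, labels = 2:1)

as.character(recoded)
```

That swap matters later on, because every filter on `== 1` is then selecting rows whose original code was 2.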
I also advise seeking out tutorials whose authors provide complete source code, so you aren't reliant on magically divining things they neglect to mention or show.
What follows is my modest attempt to rectify the issues I found. At this point I have no idea why you are getting the output you are, but perhaps my efforts can illuminate your way or help someone else in helping you.
# install.packages("survey")
# install.packages("srvyr")
# install.packages("SASxport")
# install.packages("tidyverse")
# Where I located the data:
# https://www.cdc.gov/brfss/annual_data/2017/files/LLCP2017XPT.zip
# You'll need to unzip the files into your working directory.
library(survey)
library(SASxport)
library(srvyr)
library(tidyverse)
### Missing from code description (found in the video) but needed
options(survey.lonely.psu = "adjust")
brfss <- read.xport("LLCP2017.XPT_", name.chars = "_")
# these are the 23 variables in the only data.frame structure I saw
vars <- c("X_PSU", "X_LLCPWT", "X_STSTR", "SEX", "X_RFHLTH",
          "X_PHYS14D", "X_MENT14D", "X_HCVU651", "CHECKUP1",
          "CHOLCHK1", "X_RFSMOK3", "X_RFBING5", "X_TOTINDA",
          "X_FRTLT1A", "X_VEGLT1A", "X_RFBMI5", "DIABETE3",
          "HAVARTH3", "ADDEPEV2", "BPHIGH4", "TOLDHI2",
          "CVDCRHD4", "CVDSTRK3")
# Manual inspection found State 6 has 9358 observations which is
# close to the 9355 shown in the video data.frame.
cg17_pe <- brfss[brfss[["X_STATE"]] == 6, vars]
cg17_pe_dsgn <- svydesign(id = ~1,
                          strata = ~X_STSTR,
                          weights = ~X_LLCPWT,
                          data = cg17_pe)
str(cg17_pe_dsgn)
svymean(~ factor(X_RFSMOK3), cg17_pe_dsgn, na.rm = TRUE)
# some sloppy refactoring of the remaining variables with
# minimal effort invested in matching the video results.
cg17_pe[cg17_pe$SEX > 2, "SEX"] <- NA
cg17_pe$SEX <- factor(cg17_pe$SEX, labels = c("M", "F"))
cg17_pe[cg17_pe[, 5] > 2, 5] <- NA
cg17_pe[, 5] <- factor(cg17_pe[, 5], labels = 2:1)
q <- lapply(cg17_pe[, 6:23],
            function(x) {
              x[x > 2] <- NA
              factor(x, labels = 2:1)
            })
cg17_pe[, 6:23] <- q
#############################################################
# The rest of the code taken from the video description with
# an attempt to make it readable. No effort was made to
# improve the actual coding.
cg17_pe_design <- cg17_pe %>%
  as_survey_design(ids = "X_PSU",
                   strata = "X_STSTR",
                   weights = "X_LLCPWT")
cg17_pe_design %>%
  group_by(SEX, X_RFSMOK3) %>%
  summarize(proportion = survey_mean(vartype = c("se", "ci")))
cg17_pe_design %>%
  group_by(SEX, X_RFSMOK3) %>%
  summarize(proportion = survey_mean(vartype = c("se", "ci"))) %>%
  filter(X_RFSMOK3 == 1)
cg17_pe %>%
  count(vars = X_RFSMOK3, by = SEX) %>%
  filter(vars == 1)
cg17_pe_design %>%
  group_by(SEX, X_RFSMOK3) %>%
  summarize(proportion = survey_mean(vartype = c("se", "ci"))) %>%
  filter(X_RFSMOK3 == 1) %>%
  cbind(cg17_pe %>%
          count(vars = X_RFSMOK3, by = SEX) %>%
          filter(vars == 1))
pe <- function(y) {
  y <- enquo(y)
  cg17_pe_design %>%
    group_by(SEX, !!y) %>%
    summarize(proportion = survey_mean(vartype = c("se", "ci"))) %>%
    filter(!!y == 1) %>%
    cbind(cg17_pe %>%
            count(vars = !!y, by = SEX) %>%
            filter(vars == 1))
}
a <- pe(X_RFHLTH)
b <- pe(X_PHYS14D)
c <- pe(X_MENT14D)
d <- pe(X_HCVU651)
e <- pe(CHECKUP1)
f <- pe(CHOLCHK1)
g <- pe(X_RFSMOK3)
h <- pe(X_RFBING5)
i <- pe(X_TOTINDA)
j <- pe(X_FRTLT1A)
k <- pe(X_VEGLT1A)
l <- pe(X_RFBMI5)
m <- pe(DIABETE3)
n <- pe(HAVARTH3)
o <- pe(ADDEPEV2)
p <- pe(BPHIGH4)
q <- pe(TOLDHI2)
r <- pe(CVDCRHD4)
s <- pe(CVDSTRK3)
pe_table <- bind_rows(a, b, c, d, e,
                      f, g, h, i, j,
                      k, l, m, n, o,
                      p, q, r, s)
health_variable <- c("X_RFHLTH", "X_RFHLTH", "X_PHYS14D",
                     "X_PHYS14D", "X_MENT14D", "X_MENT14D",
                     "X_HCVU651", "X_HCVU651", "CHECKUP1",
                     "CHECKUP1", "CHOLCHK1", "CHOLCHK1",
                     "X_RFSMOK3", "X_RFSMOK3", "X_RFBING5",
                     "X_RFBING5", "X_TOTINDA", "X_TOTINDA",
                     "X_FRTLT1A", "X_FRTLT1A", "X_VEGLT1A",
                     "X_VEGLT1A", "X_RFBMI5", "X_RFBMI5",
                     "DIABETE3", "DIABETE3", "HAVARTH3",
                     "HAVARTH3", "ADDEPEV2", "ADDEPEV2",
                     "BPHIGH4", "BPHIGH4", "TOLDHI2",
                     "TOLDHI2", "CVDCRHD4", "CVDCRHD4",
                     "CVDSTRK3", "CVDSTRK3")
pe_table$health_variable <- health_variable
pe_table
pe_table_2 <- subset(pe_table,
                     select = c(health_variable,
                                SEX,
                                n,
                                proportion:proportion_upp))
pe_table_2
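The nineteen near-identical pe() calls and the hand-typed health_variable vector are easy to get out of sync. One possible way to tidy that up is to loop over the variable names and attach each name inside the function. The sketch below shows the pattern with dplyr only, on a made-up data frame (toy, A, B, and count_ones are all invented names). I have not run it against the survey design object, so treat it as a starting point rather than a drop-in replacement.

```r
library(dplyr)

# Made-up stand-in for the recoded data: SEX plus two two-level factors.
toy <- data.frame(
  SEX = factor(c("M", "F", "M", "F", "M", "F")),
  A   = factor(c(1, 2, 1, 1, 2, 1)),
  B   = factor(c(2, 2, 1, 1, 1, 2))
)

# Hypothetical helper: takes the variable name as a string, counts by SEX,
# keeps the rows where the variable equals 1, and records the name itself
# so no separate label vector has to be typed by hand.
count_ones <- function(var_name) {
  toy %>%
    count(SEX, value = .data[[var_name]]) %>%
    filter(value == 1) %>%
    mutate(health_variable = var_name)
}

# One call per variable name, stacked into a single table.
toy_table <- bind_rows(lapply(c("A", "B"), count_ones))
toy_table
```

Because the label comes from the same string that selects the column, a misspelled variable name fails loudly instead of silently ending up in the output table.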