Help formulating a way to check answers in a dataset from a codebook

Hello all,

I have a large number of people in my study who either did not complete it or failed out because of my attention checks. Since there are 600 people, I would like to create some kind of loop or function (or both?) that checks everyone's answers to find out who (and how many) failed out vs. dropped out.

There are six conditions in total, each with its own three unique attention check questions. It's a between-subjects design, so each person (i.e., row in the data) was assigned to one condition. I imagine the code should first look at the condition tag of row "i" in the data, then match that tag and the appropriate QID to the codebook, and check the answer. A simple 1/0 code for failing/not failing any of the three questions should suffice.

Both example data and the codebook are here:

example_data<-structure(list(ResponseId = c(36706370L, 35165951L, 94751328L, 
91438384L, 21240033L, 18711877L, 3822060L, 37350350L, 60661688L, 
79826766L, 30722991L, 77534285L, 92995145L, 80964044L, 45542745L, 
32141225L, 29057470L, 56307572L, 93353922L), Q134 = c(NA, NA, 
1L, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1L, 
NA), Q135 = c(NA, NA, 3L, NA, NA, NA, NA, 3L, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, 3L, NA), Q136 = c(NA, NA, 1.5, NA, NA, NA, 
NA, 1.5, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1.5, NA), Q529 = c(NA, 
NA, NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA, 2L, NA, NA, NA, 
NA, NA), Q530 = c(NA, NA, NA, NA, NA, 4L, NA, NA, 4L, NA, NA, 
NA, NA, 4L, NA, NA, NA, NA, NA), Q531 = c(NA, NA, NA, NA, NA, 
1.5, NA, NA, 1.5, NA, NA, NA, NA, 1.5, NA, NA, NA, NA, NA), Q534 = c(1L, 
NA, NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, 2L, 
NA, 2L), Q535 = c(6L, NA, NA, NA, NA, NA, NA, NA, NA, 6L, NA, 
NA, NA, NA, NA, NA, 6L, NA, 6L), Q536 = c(3L, NA, NA, NA, NA, 
NA, NA, NA, NA, 3L, NA, NA, NA, NA, NA, NA, 3L, NA, 3L), Q539 = c(NA, 
1L, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, 2L, NA, NA, NA, NA, 
NA, NA), Q540 = c(NA, 3, NA, NA, NA, NA, 7.5, NA, NA, NA, NA, 
NA, 7.5, NA, NA, NA, NA, NA, NA), Q541 = c(NA, 7.5, NA, NA, NA, 
NA, 3, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, NA), Q593 = c(NA, 
NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, 2L, NA, NA, NA, 1L, NA, 
NA, NA), Q594 = c(NA, NA, NA, NA, 14L, NA, NA, NA, NA, NA, NA, 
4L, NA, NA, NA, 14L, NA, NA, NA), Q595 = c(NA, NA, NA, NA, 7L, 
NA, NA, NA, NA, NA, NA, 3L, NA, NA, NA, 7L, NA, NA, NA), Q598 = c(NA, 
NA, NA, 1L, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, 1L, NA, NA, 
NA, NA), Q599 = c(NA, NA, NA, 17.5, NA, NA, NA, NA, NA, NA, 17, 
NA, NA, NA, 17.5, NA, NA, NA, NA), Q600 = c(NA, NA, NA, 7L, NA, 
NA, NA, NA, NA, NA, 7L, NA, NA, NA, 7L, NA, NA, NA, NA), Condition = c("G_B1", 
"G_B2", "G_A1", "G_C2", "G_C1", "G_A2", "G_B2", "G_A1", "G_A2", 
"G_B1", "G_C2", "G_C1", "G_B2", "G_A2", "G_C2", "G_C1", "G_B1", 
"G_A1", "G_B1")), row.names = c(NA, -19L), class = c("tbl_df", 
"tbl", "data.frame"))
codebook<-tibble::tribble(
            ~Condition, ~QuestionID, ~Correct_Answer,
                "G_A1",      "Q134",               1,
                "G_A1",      "Q135",               3,
                "G_A1",      "Q136",             1.5,
                "G_A2",      "Q529",               1,
                "G_A2",      "Q530",               4,
                "G_A2",      "Q531",             1.5,
                "G_B1",      "Q534",               1,
                "G_B1",      "Q535",               6,
                "G_B1",      "Q536",               3,
                "G_B2",      "Q539",               1,
                "G_B2",      "Q540",             7.5,
                "G_B2",      "Q541",               3,
                "G_C1",      "Q593",               1,
                "G_C1",      "Q594",              14,
                "G_C1",      "Q595",               7,
                "G_C2",      "Q598",               1,
                "G_C2",      "Q599",            17.5,
                "G_C2",      "Q600",               7
            )

I've never made a loop or a function before, so I could use some help thinking this through.

Sorry for the late reply; I forgot I was going to look into this later. If you are still looking for a solution, here is one using joins:

library(tidyverse)

example_data <- tibble::tribble(
    ~ResponseId, ~Q134, ~Q135, ~Q136, ~Q529, ~Q530, ~Q531, ~Q534, ~Q535, ~Q536, ~Q539, ~Q540, ~Q541, ~Q593, ~Q594, ~Q595, ~Q598, ~Q599, ~Q600, ~Condition,
      36706370L,    NA,    NA,    NA,    NA,    NA,    NA,    1L,    6L,    3L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B1",
      35165951L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,     3,   7.5,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B2",
      94751328L,    1L,    3L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A1",
      91438384L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,  17.5,    7L,     "G_C2",
      21240033L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,   14L,    7L,    NA,    NA,    NA,     "G_C1",
      18711877L,    NA,    NA,    NA,    1L,    4L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A2",
       3822060L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,   7.5,     3,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B2",
      37350350L,    1L,    3L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A1",
      60661688L,    NA,    NA,    NA,    1L,    4L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A2",
      79826766L,    NA,    NA,    NA,    NA,    NA,    NA,    1L,    6L,    3L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B1",
      30722991L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,    17,    7L,     "G_C2",
      77534285L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    2L,    4L,    3L,    NA,    NA,    NA,     "G_C1",
      92995145L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    2L,   7.5,     3,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B2",
      80964044L,    NA,    NA,    NA,    2L,    4L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A2",
      45542745L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,  17.5,    7L,     "G_C2",
      32141225L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    1L,   14L,    7L,    NA,    NA,    NA,     "G_C1",
      29057470L,    NA,    NA,    NA,    NA,    NA,    NA,    2L,    6L,    3L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B1",
      56307572L,    1L,    3L,   1.5,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_A1",
      93353922L,    NA,    NA,    NA,    NA,    NA,    NA,    2L,    6L,    3L,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,    NA,     "G_B1"
    )

codebook <- tibble::tribble(
    ~Condition, ~QuestionID, ~Correct_Answer,
    "G_A1",      "Q134",               1,
    "G_A1",      "Q135",               3,
    "G_A1",      "Q136",             1.5,
    "G_A2",      "Q529",               1,
    "G_A2",      "Q530",               4,
    "G_A2",      "Q531",             1.5,
    "G_B1",      "Q534",               1,
    "G_B1",      "Q535",               6,
    "G_B1",      "Q536",               3,
    "G_B2",      "Q539",               1,
    "G_B2",      "Q540",             7.5,
    "G_B2",      "Q541",               3,
    "G_C1",      "Q593",               1,
    "G_C1",      "Q594",              14,
    "G_C1",      "Q595",               7,
    "G_C2",      "Q598",               1,
    "G_C2",      "Q599",            17.5,
    "G_C2",      "Q600",               7
)

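example_data %>% 
    # Wide to long: one row per respondent/question pair. Condition is gathered
    # too, so Answer becomes character; those rows have no codebook match.
    gather(QuestionID, Answer, -ResponseId) %>% 
    # Keep only the codebook questions; Condition below comes from the codebook
    right_join(codebook, by = "QuestionID") %>% 
    select(ResponseId, Condition, QuestionID, Answer, Correct_Answer) %>% 
    arrange(ResponseId, Condition, QuestionID) %>% 
    # TRUE = wrong answer, FALSE = correct, NA = question not shown or not answered
    mutate(check = if_else(Answer != Correct_Answer, TRUE, FALSE))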
#> # A tibble: 342 x 6
#>    ResponseId Condition QuestionID Answer Correct_Answer check
#>         <int> <chr>     <chr>      <chr>           <dbl> <lgl>
#>  1    3822060 G_A1      Q134       <NA>              1   NA   
#>  2    3822060 G_A1      Q135       <NA>              3   NA   
#>  3    3822060 G_A1      Q136       <NA>              1.5 NA   
#>  4    3822060 G_A2      Q529       <NA>              1   NA   
#>  5    3822060 G_A2      Q530       <NA>              4   NA   
#>  6    3822060 G_A2      Q531       <NA>              1.5 NA   
#>  7    3822060 G_B1      Q534       <NA>              1   NA   
#>  8    3822060 G_B1      Q535       <NA>              6   NA   
#>  9    3822060 G_B1      Q536       <NA>              3   NA   
#> 10    3822060 G_B2      Q539       1                 1   FALSE
#> # … with 332 more rows

Created on 2020-01-23 by the reprex package (v0.3.0.9000)
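
If you would rather have one row per respondent with a 1/0 flag, here is a minimal sketch of one way to do it. It assumes tidyr >= 1.0.0 (for pivot_longer()) and dplyr >= 1.0.0 (for the .groups argument), and it joins on both Condition and QuestionID so that each person is only compared against the questions for their own condition. The names answers, key, per_person, n_answered and failed are made up for illustration, and the data is a four-row subset of the example above.

```r
library(dplyr)
library(tidyr)

# Subset of the example data: two conditions, four respondents
answers <- tibble::tribble(
  ~ResponseId, ~Q534, ~Q535, ~Q536, ~Q598, ~Q599, ~Q600, ~Condition,
  36706370L,   1,     6,     3,     NA,    NA,    NA,    "G_B1",
  29057470L,   2,     6,     3,     NA,    NA,    NA,    "G_B1",
  91438384L,   NA,    NA,    NA,    1,     17.5,  7,     "G_C2",
  30722991L,   NA,    NA,    NA,    1,     17,    7,     "G_C2"
)

# Matching subset of the codebook
key <- tibble::tribble(
  ~Condition, ~QuestionID, ~Correct_Answer,
  "G_B1",     "Q534",      1,
  "G_B1",     "Q535",      6,
  "G_B1",     "Q536",      3,
  "G_C2",     "Q598",      1,
  "G_C2",     "Q599",      17.5,
  "G_C2",     "Q600",      7
)

per_person <- answers %>%
  # Reshape only the question columns, so Answer stays numeric
  pivot_longer(starts_with("Q"), names_to = "QuestionID", values_to = "Answer") %>%
  # Keep only the questions that belong to each respondent's own condition
  inner_join(key, by = c("Condition", "QuestionID")) %>%
  group_by(ResponseId, Condition) %>%
  summarise(
    n_answered = sum(!is.na(Answer)),  # how many of the checks were answered
    failed = as.integer(any(Answer != Correct_Answer, na.rm = TRUE)),  # 1 = missed at least one
    .groups = "drop"
  )

per_person
```

In this sketch, failed is 1 when at least one answered check is wrong. Treating a respondent with n_answered below 3 as a dropout is my assumption about how dropouts show up in the full data, so check that against your own export.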

Thanks so much!!!! This has been beyond helpful. I was able to adapt this to my full data set and tweak it, along with the following two lines of code, to get exactly what I needed:

fails <- Attention_checks %>% filter(check) # complete list of all INCORRECTLY answered questions

yadonemessedup <- fails %>% distinct(Response.ID, .keep_all = TRUE) # keep one row per person who failed at least one check

Now I just need to Google gather() and right_join() to find out how those commands work and why you used them. I thought I'd need a much more complicated line of code to do this.

No need for googling; there is a free ebook that teaches how to use most tidyverse functions.
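
In the meantime, here is a rough illustration of what the two functions do, using two respondents and two questions taken from the example data. The object names wide, key, long and joined are made up for this sketch. gather() still works, but newer versions of tidyr offer pivot_longer() for the same reshaping.

```r
library(dplyr)
library(tidyr)

# Two respondents from the example data, two question columns
wide <- tibble::tribble(
  ~ResponseId, ~Q134, ~Q534,
  94751328L,   1,     NA,
  29057470L,   NA,    2
)

# The matching codebook rows
key <- tibble::tribble(
  ~QuestionID, ~Correct_Answer,
  "Q134",      1,
  "Q534",      1
)

# gather(): wide to long. Every column except ResponseId is stacked into a
# name column (QuestionID) and a value column (Answer).
long <- wide %>% gather(QuestionID, Answer, -ResponseId)

# right_join(): keep every row of the right-hand table (key) and attach the
# rows of `long` that have the same QuestionID.
joined <- long %>% right_join(key, by = "QuestionID")

long
joined
```

Once the data is in long form, every answer sits next to its correct answer, so the check becomes a single column comparison instead of a loop over rows.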
