# Creating a new variable based on a range of scores in the dataset

I feel this is an easy solution for someone with more experience in R then myself, but I am having trouble completing the correct syntax for the following calculation.

As part of a 9 item questionnaire we use, people can respond between the ranges of 1 - 11.
For questions Q1, Q4 and Q7 if you answered between 8-11 for at least 2 of them AND for question Q3, Q6, and Q9 you answered between 8-11 for at least 2 of them you are classified into a certain category.

My question is how do you properly perform this, specifically the AND/OR option since 2 out of 3 and 3 out of 3 are both valid.

I import the csv file with 10 columns (ID, Q1, Q2....Q9)
Making a new variable at the end (i.e., categoryX) is simply. But the calculation/operator is the part I'm having an issue with.

For example I've been playing around with:
database\$categoryX[(dataset\$Q1 & dataset\$Q4 | dataset\$Q7 > 7) & (dataset\$Q3 & dataset\$Q6 | dataset\$Q9 > 7)] <- 1

But as some of you probably already know, this doesn't work fully. I've tried || and playing around but I can't get it. Does anyone know how to handle this?

Greatly appreciate it any advice !

``````dat = tibble::tribble(
ID,  Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9
001,  9,  3,  9,  8,  3,  3,  1,  3, 10
002,  1, 11,  7,  1,  7,  7,  1,  7,  8
003,  1, 11, 11,  1, 11, 11,  7, 11, 11
004,  4, 10,  9,  9,  9,  6,  6,  8,  6
)````````````

Could you post some of the data and an example of what you are trying to achieve? It sounds like you should be able to do this using `mutate` and `case_when`. Just make sure you read the help file for `case_when`, especially this part:

``````# Like an if statement, the arguments are evaluated in order, so you must
# proceed from the most specific to the most general.
``````

I second clausp request for a REPRoducible EXample (reprex) illustrating your issue.

``````dat = tibble::tribble(
ID,  Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8, Q9
001,  9,  3,  9,  8,  3,  3,  1,  3, 10
002,  1, 11,  7,  1,  7,  7,  1,  7,  8
003,  1, 11, 11,  1, 11, 11,  7, 11, 11
004,  4, 10,  9,  9,  9,  6,  6,  8,  6
)
``````

My apologiez. Here is an example of the data

Is this what you mean?

``````library(dplyr)

dat <- data.frame(
stringsAsFactors = FALSE,
ID = c("001", "002", "003", "004"),
Q1 = c(9, 1, 1, 4),
Q2 = c(3, 11, 11, 10),
Q3 = c(9, 7, 11, 9),
Q4 = c(8, 1, 1, 9),
Q5 = c(3, 7, 11, 9),
Q6 = c(3, 7, 11, 6),
Q7 = c(1, 1, 7, 6),
Q8 = c(3, 7, 11, 8),
Q9 = c(10, 8, 11, 6)
)

dat %>%
rowwise() %>%
mutate(flag = if_else(sum(Q1 > 7, Q4 > 7, Q7 > 7) >= 2 & sum(Q3 > 7, Q6 > 7, Q9 > 7) >= 2,
"Category x",
"Other Category"))
#> Source: local data frame [4 x 11]
#> Groups: <by row>
#>
#> # A tibble: 4 x 11
#>   ID       Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9 flag
#>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 001       9     3     9     8     3     3     1     3    10 Category x
#> 2 002       1    11     7     1     7     7     1     7     8 Other Category
#> 3 003       1    11    11     1    11    11     7    11    11 Other Category
#> 4 004       4    10     9     9     9     6     6     8     6 Other Category
``````

Created on 2020-03-16 by the reprex package (v0.3.0.9001)

1 Like

@andresrcs that is EXACTLY what I mean yes! I can't tell you my appreciation, Thank you. I was not aware you can combined the "if_else(sum(Q1 > 7, Q4 > 7, Q7 > 7) >= 2 ... option. That was a huge savior.

Thank you very much sir!! The lab and I are extremely grateful

1 Like

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.