ifelse() in summarise()

I'm trying to calculate the SD for a variable but only if a particular condition from another variable is fulfilled. This is quite similar to my previous question but I'm still struggling to properly embed ifelse() into another function.

The data frame is the following one:

df <- data.frame(ID = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), var1 = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), var2 = c(1855, 556, 2661, 764, 1021, 589, 684, 661, 668, 565, 860, 508, 804, 652, 680, 628, 693, 492, 572, 638, 1187, 1549, 653, 841, 841, 1121, 1029, 1301, 844, 1117, 669, 1124, 1412, 1165, 1668, 876, 1501, 1069, 1213, 1980), var_condition = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0))

Now I'm trying to calculate the SD but only for the cases where var_condition = 1, i.e. excluding the rows where var_condition = 0. Please note that only the last value has var_condition = 0!

I thought that this script would make sense:

df %>% group_by("ID", "var1") %>% summarise(SD = ifelse(var_condition == 1, sd(var2), NA))

But whatever I try, the resulting SD still includes the row where var_condition = 0 (the last value, which I'm trying to exclude): I get SD = 465,454643, although it should be SD = 442,768494.

I'm using the group_by function because the original data frame is much larger, with more than one ID and more than one value of var1, and I'm trying to calculate the SD for each combination of ID and var1. This example is only one of many, so I could calculate the SD manually here, but for the whole data frame that would be impossible.

I hope somebody can help!

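The ifelse() function does not work like that. It is vectorised: it returns one value per row, and sd(var2) is still computed over every row in the group before ifelse() chooses between that value and NA.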
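To make that concrete, here is a minimal base R sketch. The four values are made up for illustration (they are not the data from the question); only the pattern matters.

```r
# Small illustrative vectors (hypothetical values, not the data from the question)
var2          <- c(10, 20, 30, 100)
var_condition <- c(1, 1, 1, 0)

# sd(var2) is evaluated over ALL values, including the one that should be excluded
sd_all <- sd(var2)

# ifelse() returns one element per element of the test:
# sd_all where the condition holds, NA elsewhere
per_row <- ifelse(var_condition == 1, sd_all, NA)
length(per_row)  # 4 values, not a single summary

# Subsetting before calling sd() is what actually drops the unwanted value
sd_kept <- sd(var2[var_condition == 1])
sd_kept  # 10
```

So the condition has to be applied to the data going into sd(), not to the result coming out of it.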

You could filter() on the entries where var_condition == 1:

library(dplyr)

df %>%
  group_by(ID, var1) %>%
  filter(var_condition == 1) %>%  # keep only the rows that should count towards the SD
  summarise(SD = sd(var2))

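(PS: you don't quote the variables in group_by(). A quoted string such as "ID" is treated as a constant value, so the data is not grouped by the existing ID column.)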
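As an alternative sketch (not something suggested in the original reply), the condition can also be applied inside summarise() by subsetting var2 before it reaches sd(). The data frame below rebuilds the question's data in compact form, assuming 40 rows where only the last var_condition is 0, as the question states; the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Compact rebuild of the question's data (assumes 40 rows, only the last one has var_condition = 0)
df <- data.frame(
  ID = rep(1, 40),
  var1 = rep(1, 40),
  var2 = c(1855, 556, 2661, 764, 1021, 589, 684, 661, 668, 565,
           860, 508, 804, 652, 680, 628, 693, 492, 572, 638,
           1187, 1549, 653, 841, 841, 1121, 1029, 1301, 844, 1117,
           669, 1124, 1412, 1165, 1668, 876, 1501, 1069, 1213, 1980),
  var_condition = c(rep(1, 39), 0)
)

# Subset var2 inside summarise() so that sd() only sees the rows to keep
res <- df %>%
  group_by(ID, var1) %>%
  summarise(SD = sd(var2[var_condition == 1]), .groups = "drop")

res
```

One difference from the filter() version: a group with no rows where var_condition == 1 is dropped entirely by filter(), whereas here it should stay in the result with SD = NA.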

Cool!

I tried to use the filter function like this before:

filter(!var_condition %in% c(0))

That didn't work, so I assumed the filter function wouldn't work at all. But your version works perfectly!

Thank you 🙂
