Confused about case_when

So I'm having some trouble with this question. I conceptually understand what I'm trying to do, but I'm not sure how to make it happen. Essentially, I want to create a tibble that flips a coin 6 times and returns how many times it flipped heads. The twist is that the coin may or may not be fake and this is represented by 5 probabilities for heads (.00, .25, .50, .75, 1). Also I have priors for each of these probabilities.

To tackle this question I created a tibble that samples from my five probabilities with the priors written in the prob section. Then I try and use mutate to create a number_heads column that should show how many heads were flipped that also correspond to my p column. I try and use case_when to then create five different samples that match the probabilities from earlier. I'm not getting anywhere with this approach though.

I know this strategy worked for me when I only had two probabilities and used an if_else command. But now that I have five, I'm hitting a wall. Hopefully this makes sense.

Thanks in advance.

Q2 <- tibble(replicate = 1:1000) %>%
  mutate(p = sample(c("0.00", "0.25", ".50", ".75", "1.00"), size = 1000, replace = TRUE, c(.25, .05, .4, .05, .25))) %>%
  mutate(
    number_heads = case_when(
      
      sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(1, 0, 0, 0, 0, 0, 0)) ~ "0.00",
    
      sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.1779785, 0.355957, 0.2966309, 0.1318359, 0.03295898, 0.004394531, 0.0002441406)) ~ "0.25",
    
      sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.015625, 0.09375, 0.234375, 0.3125, 0.234375, 0.09375, 0.015625)) ~ "0.50",

      sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.0002441406, 0.004394531, 0.03295898, 0.1318359, 0.2966309, 0.355957, 0.1779785)) ~ "0.75", 

      sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0, 0, 0, 0, 0, 0, 1)) ~ "1.00",
      
      TRUE ~ NA    
      ))

Please do not post screenshots; they are not useful. Post formatted code instead. Here is how to do it.

Ideally, you should ask your question with a proper reproducible example, as explained in this guide.

My bad. Sorry about that. I added the formatted code instead. I don't really have a dataset to add in though. I do have my calculations for the probabilities, but that code is kinda a mess so I didn't want to junk up my post.

Edit: I removed the variables and replaced them with the actual values. I have nothing else on my end that isn't the code I posted now.
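
For anyone reading along: the hard-coded weights in the code above appear to be binomial probabilities for six flips. A minimal sketch, assuming that is how they were calculated (the names flips, probs_quarter and probs_half are just for illustration):

```r
# Assumption: each weight vector is P(0..6 heads) for 6 independent flips
# with a fixed probability of heads, i.e. a Binomial(6, p) distribution.
flips <- 6

probs_quarter <- dbinom(0:flips, size = flips, prob = 0.25)
probs_half    <- dbinom(0:flips, size = flips, prob = 0.50)

# Compare with the vectors typed into the post
round(probs_quarter, 7)
probs_half
```

If that assumption holds, the same call with prob = 0.75, 0 or 1 should give the other three vectors.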

Check the documentation for case_when(): the left-hand side of each formula should evaluate to a logical value.

A sequence of two-sided formulas. The left hand side (LHS) determines which values match this case. The right hand side (RHS) provides the replacement value.

The LHS must evaluate to a logical vector. The RHS does not need to be logical, but all RHSs must evaluate to the same type of vector.
....
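
To make that concrete, here is a tiny toy example (not your data; x and labels are made-up names) where every left-hand side is a logical test and every right-hand side is a character value:

```r
library(dplyr)

x <- c(0.00, 0.50, 1.00)

# LHS: a logical test on x. RHS: the replacement value (all character here).
labels <- case_when(
  x == 0 ~ "always tails",
  x == 1 ~ "always heads",
  TRUE   ~ "could go either way"
)

labels
```

In your code the two sides are the wrong way round: the sample() call sits on the left, where a condition on p is expected.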

So you have to do something like this

library(dplyr)

tibble(replicate = 1:1000) %>%
    mutate(p = sample(c("0.00", "0.25", "0.50", "0.75", "1.00"), size = 1000, replace = TRUE, c(.25, .05, .4, .05, .25)),
           number_heads = case_when(
               p == "0.00" ~ sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(1, 0, 0, 0, 0, 0, 0)),
               p == "0.25" ~ sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.1779785, 0.355957, 0.2966309, 0.1318359, 0.03295898, 0.004394531, 0.0002441406)),
               p == "0.50" ~ sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.015625, 0.09375, 0.234375, 0.3125, 0.234375, 0.09375, 0.015625)),
               p == "0.75" ~ sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0.0002441406, 0.004394531, 0.03295898, 0.1318359, 0.2966309, 0.355957, 0.1779785)), 
               p == "1.00" ~ sample(c(0, 1, 2, 3, 4, 5, 6), size = 1000, replace = TRUE, prob = c(0, 0, 0, 0, 0, 0, 1)),
               TRUE ~ NA_real_
           ))
#> # A tibble: 1,000 x 3
#>    replicate p     number_heads
#>        <int> <chr>        <dbl>
#>  1         1 0.50             2
#>  2         2 0.50             3
#>  3         3 0.00             0
#>  4         4 1.00             6
#>  5         5 0.50             2
#>  6         6 1.00             6
#>  7         7 0.50             3
#>  8         8 0.00             0
#>  9         9 0.50             4
#> 10        10 0.75             5
#> # … with 990 more rows

Created on 2019-10-20 by the reprex package (v0.3.0.9000)
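
As a possible shortcut, and only if the hard-coded weights really are binomial probabilities for six flips (an assumption on my part), you could store p as a number and let rbinom() do the sampling, since it is vectorised over prob. A rough sketch, with n and sim as illustrative names:

```r
library(dplyr)

set.seed(1)  # only so the sketch is reproducible
n <- 1000

sim <- tibble(replicate = 1:n) %>%
  mutate(
    # draw each row's heads probability using the priors from the question
    p = sample(c(0, 0.25, 0.50, 0.75, 1), size = n, replace = TRUE,
               prob = c(.25, .05, .4, .05, .25)),
    # one Binomial(6, p) draw per row, using that row's own p
    number_heads = rbinom(n, size = 6, prob = p)
  )

sim
```

This skips case_when() entirely, so it does not answer the original question about its syntax, but it avoids typing out the seven weights for each value of p.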


Thank you so much! This has been killing me and I suspected that I was making a simple mistake with regards to case_when.
