Data <- data.frame(
name = rep('Adam',5),
location = sample(c('A','B','C'),5,TRUE,prob = c(.5,.2,.3)),
money_earned = sample(0:10,5,TRUE,prob = c(.3,rep(.07,10)))
)
Data
#> name location money_earned
#> 1 Adam C 0
#> 2 Adam A 3
#> 3 Adam A 10
#> 4 Adam C 2
#> 5 Adam A 7
library('dplyr')
Data |>
group_by(name,location) |>
mutate(
flag = sum(if_else(money_earned > 0L, 1L, 0L)),
atleast2nonzero = if_else(flag >= 2L, 1L, 0L)
)
#> # A tibble: 5 × 5
#> # Groups: name, location [2]
#> name location money_earned flag atleast2nonzero
#> <chr> <chr> <int> <int> <int>
#> 1 Adam C 0 1 0
#> 2 Adam A 3 3 1
#> 3 Adam A 10 3 1
#> 4 Adam C 2 1 0
#> 5 Adam A 7 3 1
Explanation: The Data gets grouped by name and location, then I create a flag that is 1 if there was money earned and zero otherwise. Taking the sum results in the number of places per name, where there was a positive earning. If this cout is greater or equal to two, the condition is met and hence the value of atleast2nonzero equals 1.
Data <- data.frame(
name = rep('Adam',5),
location = sample(c('A','B','C'),5,TRUE,prob = c(.5,.2,.3)),
money_earned = sample(0:10,5,TRUE,prob = c(.3,rep(.07,10)))
)
Data
#> name location money_earned
#> 1 Adam A 2
#> 2 Adam B 9
#> 3 Adam C 0
#> 4 Adam A 6
#> 5 Adam B 1
library('dplyr')
Data |>
group_by(name,location) |>
mutate(atleast2nonzero = if_else(sum(money_earned > 0L) >= 2L, 1L, 0L))
#> # A tibble: 5 × 4
#> # Groups: name, location [3]
#> name location money_earned atleast2nonzero
#> <chr> <chr> <int> <int>
#> 1 Adam A 2 1
#> 2 Adam B 9 1
#> 3 Adam C 0 0
#> 4 Adam A 6 1
#> 5 Adam B 1 1
When working with indices, you should always use 1L instead of 1 (and other numbers), because 1L is integer, but 1 is double and hence twice as memory demanding. If you are working with large data sets and need indexing, using integers instead of doubles can speed up your code dramatically.
sorry I dont know if this is what I am looking for.
here is the sample data:
name location money_earned
Adam Bora 10
Adam Bora 14
Adam Nala 34
Adam Rina 10
Adam Rina 0
Adam Rina 0
name location money_earned atleast_2nonzero
Adam Bora 10 TRUE
Adam Bora 14 TRUE
Adam Nala 34 FALSE
Adam Rina 10 FALSE
Adam Rina 0 FALSE
Adam Rina 0 FALSE
I am not looking for the sum of money earned to be greater than 2. I need there to be at least two non-zero values in Money_Earned for a given name and location combination. Only then is atleast2_nonzero TRUE. Meaning, two rows must be non-zero.