I am new to RStudio and I want to count the number of rows that meet conditions in two columns. Let's say I have a dataset like this:
df <- data.frame(stringsAsFactors = FALSE,
A = c("X", "Y", "X", "X", "Z"),
B = c(0, 1, 0, 0, 1),
C = c("AD2758", "AD2758", "AD2764", "AD2768", "AD2772")
)
I want to count the Xs in the first column (A) for which the value in the second column (B) in the next row is 1. So it should count the Xs in the first and fourth rows, but it should not count the X in the third row.
Sure, thanks williaml for the reminder. Here is what the data set looks like, but it has a lot more rows of course.
# A tibble: 41 x 3
   A         B C
   <chr> <dbl> <chr>
 1 X         0 AD2758
 2 Y         1 AD2758
 3 X         1 AD2764
 4 Y         1 AD2768
 5 X         0 AD2772
 6 Z         0 AD2780
 7 Y         1 AD2789
 8 X         0 AD2797
 9 Y         1 AD2805
10 X         0 AD2814
# ... with 31 more rows
Something like this using lead()? Here lead() puts the value of B from the next row into a new column D, so each row can be checked against the row that follows it.
library(tidyverse)
df2 <- df %>%
  mutate(D = lead(B, 1)) # value of B in the next row
> df2
  A B      C  D
1 X 0 AD2758  1
2 Y 1 AD2758  0
3 X 0 AD2764  0
4 X 0 AD2768  1
5 Z 1 AD2772 NA
df3 <- df2 %>%
  filter(A == "X" & D == 1) # keep Xs whose next row has B == 1
> df3
  A B      C D
1 X 0 AD2758 1
2 X 0 AD2768 1
df3 %>%
  count(A)
# A tibble: 1 x 2
  A         n
  <chr> <int>
1 X         2
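If only the number is needed, the same steps can probably be collapsed into a single expression. This is a minimal sketch that assumes the five-row example df from the question; the names n_x, next_b and n_x_base are made up for illustration. It shows one version with dplyr's lead() and a base R equivalent that shifts B by hand.

```r
library(dplyr)

# Example data from the question
df <- data.frame(stringsAsFactors = FALSE,
  A = c("X", "Y", "X", "X", "Z"),
  B = c(0, 1, 0, 0, 1),
  C = c("AD2758", "AD2758", "AD2764", "AD2768", "AD2772")
)

# TRUE where A is "X" and B in the following row is 1.
# lead() returns NA for the last row; na.rm = TRUE keeps an X in the
# last row from turning the whole sum into NA.
n_x <- sum(df$A == "X" & lead(df$B) == 1, na.rm = TRUE)
n_x

# Base R equivalent without dplyr: drop the first element of B and
# pad the end with NA, so position i holds the value from row i + 1.
next_b <- c(df$B[-1], NA)
n_x_base <- sum(df$A == "X" & next_b == 1, na.rm = TRUE)
n_x_base
```

Both n_x and n_x_base come out as 2 for the example data, matching the count(A) result above. The longer mutate/filter version is still handy when you want to look at the matching rows rather than just count them.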