CountIF function i am not able to replicate as similiar to Excel in R

Hi folks,

I am having a trouble to execute countif function in R. Basically i am looking for something similar to what excel replicates. Below is what i am looking at, it's in a data frame and I want to count numbers of Hi's & Hello's in Cell_1 to Cell_6 columns and arrive at Count_Hi & Count_Hello columns. I tried using count_if("Hi", dataset), this was not working, also tried using calc function. Can you guys help me out.

Cell_1 Cell_2 Cell_3 Cell_4 Cell_5 Cell_6 Count_Hi Count_Hello
Hi .......Hello.....Hi.......Hello....Hi

Welcome in the Community.

I give here an example not in a data.frame but just with a character vector:

my_cells <-  c('Hi','there','Hello','Hi')
isHi <- my_cells == 'Hi'
print(isHi)
#> [1]  TRUE FALSE FALSE  TRUE
sum(isHi)
#> [1] 2
isHello <- my_cells == 'Hello'
print(isHello)
#> [1] FALSE FALSE  TRUE FALSE
sum(isHello)
#> [1] 1

Created on 2020-07-26 by the reprex package (v0.3.0)

Depending on what you want to do with R the website R for Data Science could be of use to you.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

Thanks guys!

Found a new alternate solution as well. The below one also works well.

Project[,"Overforecast count_2"] <- apply(Project[,c(32:36)],1,function(x) sum(x =="Overforecast"))

Here is an example with a data frame. There are ways to operate on rows but I never use them.

library(dplyr, warn.conflicts = FALSE)
library(tidyr)

set.seed(1)
DF <- data.frame(Cell_1 = sample(c("Hi", "Hello"), 10, replace = TRUE),
                 Cell_2 = sample(c("Hi", "Hello"), 10, replace = TRUE),
                 Cell_3 = sample(c("Hi", "Hello"), 10, replace = TRUE),
                 Cell_4 = sample(c("Hi", "Hello"), 10, replace = TRUE),
                 stringsAsFactors = FALSE)
DF
#>    Cell_1 Cell_2 Cell_3 Cell_4
#> 1      Hi     Hi     Hi  Hello
#> 2   Hello     Hi     Hi  Hello
#> 3      Hi     Hi     Hi     Hi
#> 4      Hi     Hi     Hi  Hello
#> 5   Hello     Hi     Hi     Hi
#> 6      Hi  Hello     Hi     Hi
#> 7      Hi  Hello  Hello  Hello
#> 8      Hi  Hello     Hi     Hi
#> 9   Hello  Hello     Hi  Hello
#> 10  Hello     Hi  Hello  Hello
DF <- DF %>% mutate(RowIndex = row_number())
COUNTS <-  DF %>% pivot_longer(cols = Cell_1:Cell_4, names_to = "Cell", values_to = "Value") %>% 
  group_by(RowIndex) %>% 
  summarize(Cnt_Hi = sum(Value == "Hi"),
            Cnt_Hello = sum(Value == "Hello"))
DF <- inner_join(DF, COUNTS, by = "RowIndex") %>% select(-RowIndex)
DF
#>    Cell_1 Cell_2 Cell_3 Cell_4 Cnt_Hi Cnt_Hello
#> 1      Hi     Hi     Hi  Hello      3         1
#> 2   Hello     Hi     Hi  Hello      2         2
#> 3      Hi     Hi     Hi     Hi      4         0
#> 4      Hi     Hi     Hi  Hello      3         1
#> 5   Hello     Hi     Hi     Hi      3         1
#> 6      Hi  Hello     Hi     Hi      3         1
#> 7      Hi  Hello  Hello  Hello      1         3
#> 8      Hi  Hello     Hi     Hi      3         1
#> 9   Hello  Hello     Hi  Hello      1         3
#> 10  Hello     Hi  Hello  Hello      1         3

Created on 2020-07-26 by the reprex package (v0.3.0)

1 Like

Thank you.

This one worked perfectly.

Project[,"Overforecast count_2"] <- apply(Project[,c(32:36)],1,function(x) sum(x =="Overforecast"))