How to loop over my data frame and find rows containing my character of interest?

cwright1 · January 13, 2021, 7:27pm

I have a dataframe df where two columns are characters, and a third is numeric. Example:

df <- data.frame("col1"= c("A", "B", "C", "D", "E", "F", "G", "A"), "col2"=c("Q", "A", "S", "Z", "A", "C", "F", "X"), "col3"=c(1,2,3,4,5,6,7,8))

df
  col1 col2 col3
1    A    Q    1
2    B    A    2
3    C    S    3
4    D    Z    4
5    E    A    5
6    F    C    6
7    G    F    7
8    A    X    8

My character of interest is 'A'. I want to know how many rows exist where 'A' is either in col1 or col2. The answer here would be 4.

It's easy enough when I'm interested in one value at a time, but how can I write a function to loop over all unique values and return the number of rows for each? My character of interest is always split, with some entries in col1 and others in col2.

I'm guessing that first I could stack col1 and col2, then reduce that to only the unique entries. Then I'd want to ask, for each one, "how many rows exist where col1 or col2 contains 'B', 'C', etc.?"

UPDATE:
I learned more about for loops and came to this (imperfect but working) solution:

#stack() is part of base R (utils), so no extra package needs to be loaded

#Make a dataframe with just two columns of characters
df <- data.frame("col1"= c("A", "B", "C", "D", "E", "F", "G", "A"), "col2"=c("Q", "A", "S", "Z", "A", "C", "F", "X"))

#Allocate empty dataframe for results
newdf = NULL

#For any unique characters between col1 and col2, count how many times they appear in df
for (i in unique(stack(df)$values)) {
  #stack() names its column "values"; count the rows where either column equals i
  newdf <- rbind(newdf, data.frame("col1" = i, "col2" = nrow(df[df$col1 == i | df$col2 == i, ])))
}

newdf
   col1 col2
1     A    4
2     B    1
3     C    2
4     D    1
5     E    1
6     F    2
7     G    1
8     Q    1
9     S    1
10    Z    1
11    X    1

nirgrahamuk · January 13, 2021, 9:26pm

library(tidyverse)
filter(df,
       col1 == "A"  | 
       col2 == "A") %>% 
  count()
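
The filter above counts rows for one value at a time. A loop-free way to get the count for every unique value could look something like the sketch below. It assumes the two character columns are named col1 and col2 as in the question; the names row_id, column, value, n_rows and counts are only illustrative and do not come from the thread.

```r
library(tidyverse)

df <- data.frame(
  col1 = c("A", "B", "C", "D", "E", "F", "G", "A"),
  col2 = c("Q", "A", "S", "Z", "A", "C", "F", "X")
)

counts <- df %>%
  mutate(row_id = row_number()) %>%           # remember which row each value came from
  pivot_longer(c(col1, col2),
               names_to = "column",
               values_to = "value") %>%       # one line per (row, column) pair
  distinct(row_id, value) %>%                 # a row holding the same letter twice counts once
  count(value, name = "n_rows")               # number of rows containing each value

counts
```

For the example data this gives 4 rows for "A", 2 each for "C" and "F", and 1 for every other letter, matching the loop in the question (sorted by value rather than by order of appearance).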

cwright1 · January 13, 2021, 9:54pm

Is there any way I can turn my loop (edited and pasted above) into a function? What if df2 were just like df above, but with different letters?
Could I use a function like myfun(df2) to give me a resulting dataframe?

nirgrahamuk · January 13, 2021, 10:06pm

You can put arbitrary code into a function, so yes, you could.
But I see no advantage to your approach over the non-loop approach I provided.
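
For reference, wrapping the counting logic in a function might look roughly like this base R sketch. The function name count_rows_by_value and its cols argument are hypothetical (it plays the role of the myfun(df2) in the question), and it assumes the columns to search hold character or factor values.

```r
# Hypothetical helper: for every unique value found in `cols`,
# count the rows of `data` where at least one of those columns holds it.
count_rows_by_value <- function(data, cols = c("col1", "col2")) {
  # All distinct values across the chosen columns, in order of appearance
  values <- unique(as.character(unlist(data[cols], use.names = FALSE)))

  n_rows <- vapply(
    values,
    # data[cols] == v is a logical matrix; a row counts if any cell matches
    function(v) sum(rowSums(data[cols] == v, na.rm = TRUE) > 0),
    integer(1)
  )

  data.frame(value = values, n_rows = unname(n_rows))
}

# Example data shaped like the one in the question
df2 <- data.frame(
  col1 = c("A", "B", "C", "D", "E", "F", "G", "A"),
  col2 = c("Q", "A", "S", "Z", "A", "C", "F", "X")
)

result <- count_rows_by_value(df2)
result
```

Called on a data frame with different letters, the same function returns one row per distinct letter, so nothing inside it needs to change between data frames.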
