For loop and function any

Hello everyone,

I am a beginner in R, trying to improve my skills!

In my dataset (see a small sample attached), I have different threshold values (0, 1 or 2 for example). For each threshold, a value of 1 means that the individual (id) manifested the behaviour studied, and 0 means the opposite. I would like to count, for each threshold, the number of individuals that manifested the behaviour, regardless of time.

[Image: sample of the dataset, called table in the code below]

For example, for threshold_0, the 4 individuals manifested the behaviour. But for threshold_2, only 3 manifested the behaviour (not individual c).

I thought about using a for loop to fill in a results table with three columns (threshold, F and M, for sex). Only the two sex columns would need to be filled in with the number of individuals.

This is what I did... but it is obviously not working and I am stuck.

for (individual in unique(table$id)){
    for (i in 2:20){ #This corresponds to the number of columns in my real dataset
        if (any(table[[i]][table$id %in% individual] == 1)){ #Returns missing value where TRUE/FALSE needed
            #Then I would like to add +1 in the right column and line of the latest table
        }
    }
}

Could anybody help me find a potential solution or a different method to do this?

Thank you!

This is easiest with {dplyr}:

your_data %>% group_by(threshold_0) %>% count()

Thank you very much!

But this will only help me add up all the 1s for each column, right?
I am actually trying to add up the number of individuals who manifested the behaviour. For example, for threshold_0, individual a manifested the behaviour 3 times, but that still counts as only 1 individual who manifested the behaviour (individual a). So an individual only needs a single value of 1 for a given threshold to be counted. I hope my problem is clearer.

Or maybe it is possible to do that with {dplyr}?

See the FAQ "How to do a minimal reproducible example (reprex) for beginners" for how to solicit more specific answers.

For id a, the three columns will sum to 3, 1, and 3, respectively, when using group_by and count after library(dplyr).
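
To count individuals rather than 1s, one possible approach is to first reduce each individual to a single TRUE/FALSE per threshold with any(), and then add those up by sex. The sketch below is only an illustration: the data frame dat is made up to match the description in the question (4 individuals, id a summing to 3, 1 and 3, individual c never showing the behaviour at threshold_2), and the column names sex, time and threshold_* are assumptions about the real data. It also assumes {tidyr} is available for reshaping. The na.rm = TRUE argument is there because any() returns NA when a group contains an NA and no 1, which is the usual cause of the "missing value where TRUE/FALSE needed" error in an if() condition.

```r
library(dplyr)
library(tidyr)

# Made-up sample data (assumption): 4 individuals, 3 time points each
dat <- data.frame(
  id          = rep(c("a", "b", "c", "d"), each = 3),
  sex         = rep(c("F", "M", "F", "M"), each = 3),
  time        = rep(1:3, times = 4),
  threshold_0 = c(1, 1, 1,  1, 0, 0,   0, 1, 0,  1, 1, 0),
  threshold_1 = c(1, 0, 0,  0, NA, 0,  0, 1, 0,  1, 0, 0),
  threshold_2 = c(1, 1, 1,  0, 1, 0,   0, 0, 0,  0, 0, 1)
)

# Step 1: one row per individual, TRUE if the behaviour was seen at least once
per_id <- dat %>%
  group_by(id, sex) %>%
  summarise(
    across(starts_with("threshold"), ~ any(.x == 1, na.rm = TRUE)),
    .groups = "drop"
  )

# Step 2: count the individuals per threshold and sex,
# then spread sex into F and M columns
counts <- per_id %>%
  pivot_longer(
    cols = starts_with("threshold"),
    names_to = "threshold",
    values_to = "shown"
  ) %>%
  group_by(threshold, sex) %>%
  summarise(n = sum(shown), .groups = "drop") %>%
  pivot_wider(names_from = sex, values_from = n)

print(counts)
```

With this toy data, counts has one row per threshold and the columns threshold, F and M. For threshold_2 it gives F = 1 and M = 2, i.e. 3 individuals in total, since individual c is not counted. On the real dataset, starts_with("threshold") would need to match however the 19 threshold columns are actually named.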
