Please help with a for loop

I have a very simple question, but I cannot find the answer.
I have a data set of 135 observations described by 5 variables. One of them is the "Sample" variable, which has three levels: month 0, month 2 and month 4.
Another variable is a numeric value (the proportion of gonads), stored in the GT column.
I would like to write a for loop that selects, for each sample level (month 0, month 2 and month 4), the observations that are above that sample's average.
In other words, for each sample, I have to average the proportion of gonads, then select the observations greater than this average within that sample.

I managed to write a for loop that tells me, for each sample, whether or not each observation's proportion of gonads is greater than the average:

for (sample in unique(data$Sample)) {
  print(data$GT[data$Sample == sample] > mean(data$GT[data$Sample == sample]))
}

And R answers:
[1] FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE
[16] TRUE TRUE TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE
[31] TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[46] FALSE FALSE FALSE
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[16] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[31] TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
[16] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE
[31] FALSE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE

But I am unable to retrieve the other information about each of these observations (the 4 other variables). I would like R to show me every variable for the above-average observations in each sample, that is, the full rows where the value is TRUE.

Please help me, it is driving me crazy!

meanG <- mean(data$GT[data$Sample == sample])
data[data$Sample == sample & data$GT > meanG, ]

I think this is what you are trying to do, and I don't think your TRUE/FALSE values are currently right.
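A minimal, runnable sketch of that idea inside a for loop. Only the column names Sample and GT come from the question; the values and the Tank column are made up for illustration, so treat this as a sketch rather than the poster's actual data.

```r
# Hypothetical stand-in for the original data: only Sample and GT are
# taken from the question; the values and the Tank column are invented.
data <- data.frame(
  Sample = rep(c("month 0", "month 2", "month 4"), each = 4),
  GT     = c(0.10, 0.20, 0.30, 0.40,
             0.15, 0.25, 0.35, 0.45,
             0.05, 0.10, 0.15, 0.60),
  Tank   = rep(c("A", "B"), times = 6)
)

for (sample in unique(data$Sample)) {
  # Mean of GT within the current sample only
  meanG <- mean(data$GT[data$Sample == sample])
  # Both conditions go before the comma, so they select rows;
  # leaving the column position empty keeps every variable.
  above <- data[data$Sample == sample & data$GT > meanG, ]
  print(above)
}
```

The logical vector goes in the row position, so each printed piece is a data frame with all of its columns rather than a bare vector of TRUE/FALSE.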

The tidyverse solution is likely much cleaner

Here is a tidyverse flavour:

library(dplyr)

myData <- data.frame(sample = rep(1:3, 10),
                     GT = 1:30,
                     another = rep(letters[1:15], 2))

myData

# Keep the rows whose GT is above the mean of their own sample group
myData %>%
    group_by(sample) %>%
    filter(GT > mean(GT))

That's all the data with GT greater than the mean of its group. You could then sort, filter, or summarise within each sample.
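For comparison, the same grouped filter can be sketched in base R without dplyr. This is not part of the answer above; it swaps in ave(), which returns each row's group mean, and the names groupMean and aboveMean are just illustrative.

```r
# Same toy data as in the tidyverse example
myData <- data.frame(sample = rep(1:3, 10),
                     GT = 1:30,
                     another = rep(letters[1:15], 2))

# ave() returns a vector as long as GT holding the mean of each row's group
# (its FUN argument defaults to mean)
groupMean <- ave(myData$GT, myData$sample)

# Keep the full rows whose GT is above their own group's mean
aboveMean <- myData[myData$GT > groupMean, ]

# Sort within sample, as suggested above
aboveMean[order(aboveMean$sample, aboveMean$GT), ]
```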

First, thank you for your answer.
Yes, I know it would be cleaner with the tidyverse, but I have to use a for loop because my homework requires it ("Please use for loop"...).
So I will repeat my question using the well-known iris data set as a reprex:

data <- iris
for (species in unique(data$Species)) {
  print(data$Petal.Length[data$Species == species] > mean(data$Petal.Length[data$Species == species]))
}

And R answers:
[1] FALSE FALSE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE
[16] TRUE FALSE FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE TRUE
[31] TRUE TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE TRUE TRUE
[46] FALSE TRUE FALSE TRUE FALSE
[1] TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
[16] TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
[31] FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
[46] FALSE FALSE TRUE FALSE FALSE
[1] TRUE FALSE TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
[16] FALSE FALSE TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE TRUE
[31] TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE TRUE
[46] FALSE FALSE FALSE FALSE FALSE

I would like to see all the information for the TRUE observations, as in:

data[data$Petal.Length > mean(data$Petal.Length),]

R answers:

   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
51          7.0         3.2          4.7         1.4 versicolor
52          6.4         3.2          4.5         1.5 versicolor
53          6.9         3.1          4.9         1.5 versicolor
54          5.5         2.3          4.0         1.3 versicolor
55          6.5         2.8          4.6         1.5 versicolor
.......
I would like this kind of output, but with the observations greater than the mean Petal.Length of their own species (not the mean over all observations).

Thank you so much for helping me.

Homework is usually out of bounds, but your error is simple, so go back and look at the non-tidyverse solution I suggested. Simply put BOTH of your criteria inside the [square brackets], so:

data[data$Species == species & data$Petal.Length > mean(data$Petal.Length[data$Species == species]), ]

That is a hideous line that will be prone to error, so you might want to pull the mean out beforehand like I did in my example. You may also want to use attach() (the tidyverse just screamed at me in shock that I would even say that!); if you do, remember to detach().

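data <- iris
for (species in unique(data$Species)) {
      # Mean petal length of the current species only
      speciesMean <- mean(data$Petal.Length[data$Species == species])
      # Rows of this species above that mean, with all columns kept
      print(data[data$Petal.Length > speciesMean & data$Species == species, ])
      print("========")
}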

# tidyverse way

library(tidyverse)
data <- iris
for (species in unique(data$Species)) {
    data %>%
        filter(Species == species) %>%
        filter(Petal.Length > mean(Petal.Length)) %>%
        print()
    print("========")
}

Or, if the tidyverse were allowed, you could still loop with a for statement and filter inside it, as in the tidyverse version above.
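If you would rather end up with one data frame than a series of printed pieces, the loop can also collect its results. This is only a sketch of one common base R pattern; the names pieces, inSpecies and aboveMean are illustrative.

```r
data <- iris
pieces <- list()

for (species in unique(data$Species)) {
  # Logical vector marking the rows of the current species
  inSpecies <- data$Species == species
  speciesMean <- mean(data$Petal.Length[inSpecies])
  # Store this species' above-average rows under its name
  pieces[[species]] <- data[inSpecies & data$Petal.Length > speciesMean, ]
}

# Stack the per-species pieces into a single data frame
aboveMean <- do.call(rbind, pieces)

# Number of above-average observations kept per species
table(aboveMean$Species)
```

Storing the pieces in a list and binding them once at the end avoids growing a data frame inside the loop.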

So, to be clear, instead of

data$GT[criteria]

use

data[criteria, ]
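A small sketch of the difference between the two forms, using iris. The threshold 1.6 is arbitrary and chosen only for illustration.

```r
data <- iris

# Any logical vector with one element per row will do as the criteria
criteria <- data$Species == "setosa" & data$Petal.Length > 1.6

values <- data$Petal.Length[criteria]  # one column only: a numeric vector
rows   <- data[criteria, ]             # whole rows: a data frame with all columns

str(values)
str(rows)
```

Both forms pick the same observations; the second simply keeps every variable for them.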

It's so hard when you are alone and going around in circles, and so easy and clear with some help.

Thank you for helping me, despite my English. Your advice was really, really helpful!
Have a nice and happy evening,
Guillaume
