# Mean function: select rows AND define conditions

Hi community,

My question concerns the `mean` function.

So, if I have a data set, e.g.

```r
library(data.table)
data <- data.table(c(1,2,3,4,5,6,7,8,9,10), c(1,6,4,78,2,14,2,95,2,11))
```

and I want to have the mean of column two, rows five to nine, I write

```r
Average <- mean(data$V2[5:9])
```

In a different case, when I want the mean value of all entries in the second column which are greater than two, I go

```r
Average <- mean(data$V2[data$V2 > 2])
```

Now I want both. How would I get the mean of the entries in rows five to nine of column two that are greater than two, basically combining the two snippets above?

```r
library(data.table)
data <- data.table(c(1,2,3,4,5,6,7,8,9,10), c(1,6,4,78,2,14,2,95,2,11))

# Base R: rows 5 to 9, values greater than 2, then both combined
mean(data$V2[5:9])
mean(data$V2[data$V2 > 2])
mean(data$V2[5:9][data$V2[5:9] > 2])

# tidyverse (dplyr) equivalents of the same three cases
library(tidyverse)
mean(slice(data, 5:9) %>% pull(V2), na.rm = TRUE)
mean(filter(data, V2 > 2) %>% pull(V2), na.rm = TRUE)
mean(slice(data, 5:9) %>%
       filter(V2 > 2) %>%
       pull(V2), na.rm = TRUE)
```

The proper `data.table` syntax for these three cases is:

```r
data[5:9, mean(V2)]
data[V2 > 2, mean(V2)]
data[5:9][V2 > 2, mean(V2)]
```
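As a cross-check on the combined case, here is a minimal sketch that puts the row selection in `i` and the condition inside `j`, next to a base R equivalent. It reuses the `data` object from the question; the variable name `v` is only illustrative.

```r
library(data.table)

data <- data.table(c(1,2,3,4,5,6,7,8,9,10), c(1,6,4,78,2,14,2,95,2,11))

# Select rows 5 to 9 in i, then apply the condition to V2 inside j
data[5:9, mean(V2[V2 > 2])]

# Base R equivalent on the plain vector (v is an illustrative name)
v <- data$V2[5:9]
mean(v[v > 2])
```

Both expressions should give the same value as the chained form `data[5:9][V2 > 2, mean(V2)]`, since each one first restricts to rows five to nine and only then tests the condition.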

Thank you nigrahamuk and martin for your fast replies. The solutions worked nicely for me.

I have a follow-up question, though. I want to get the mean as described above:

```r
data[5:9][V2 != 2, mean(V2)]
```

However, I don't want to exclude ALL the entries that are equal to two, only one of them. So if, for example, I have a vector like

```r
c(1,2,3,2,2,4)
```

I want the mean of

```r
c(1,3,2,2,4)
```

Is there an easy way to do this?

If there is a 2, I want to drop it. But if there is more than one 2, I only want to drop one of them. Since it doesn't affect the mean value, I don't care whether the first, last, or middle 2 is dropped.

The idea behind this is that I intend to do something like an Olympic smooth: for every data point I take the average of, say, the previous 6 data points, excluding one minimum value. But if the minimum value occurs more than once, I don't want every occurrence to be dropped.

So for the vector

```r
c(5,2,3,2,2,4,2)
```

I want either

```r
mean(c(5,3,2,2,4,2))
```

or

```r
mean(c(5,2,3,2,4,2))
```

or

```r
mean(c(5,2,3,2,2,4))
```

but never

```r
mean(c(5,3,4))
```

Or, in pseudocode, for my previous example:

```
data[5:9][ "Drop exactly one entry that is 2", mean(V2)]
```
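The "drop exactly one matching entry" rule can be sketched in base R with `match()`, which returns the position of the first match only. The helper name `mean_drop_one` is hypothetical, and the sketch assumes that an input without any matching entry should simply be averaged as is.

```r
# Hypothetical helper: mean of x after removing at most one entry equal to value.
mean_drop_one <- function(x, value) {
  idx <- match(value, x)   # position of the first match, or NA if there is none
  if (is.na(idx)) mean(x) else mean(x[-idx])
}

mean_drop_one(c(1, 2, 3, 2, 2, 4), 2)     # same as mean(c(1, 3, 2, 2, 4))
mean_drop_one(c(5, 2, 3, 2, 2, 4, 2), 2)  # same as mean(c(5, 3, 2, 2, 4, 2))
```

With the `data.table` from the question, the helper could then be called in `j`, for example `data[5:9, mean_drop_one(V2, 2)]`.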


What you've asked for is ambiguous.
Are you always dropping exactly one 2, in which case an input with only one 2 would lose it?
Or are you dropping every 2 beyond the first, which is allowed to stay, so that an input with only one 2 would keep it?

```r
data[5:9, ifelse(sum(V2 == 2) > 0, (sum(V2) - 2) / (.N - 1), mean(V2))]
```
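The same "subtract one value from the sum and divide by one less" arithmetic can be extended to the Olympic-style smooth mentioned in the follow-up question. The sketch below is one possible reading, not a definitive implementation: the helper name `olympic_smooth` is hypothetical, the window is assumed to be the `n` points ending at (and including) the current one, positions without a full window are assumed to return `NA`, and the input is assumed to contain no missing values.

```r
# Hypothetical helper: trailing mean over windows of n points, dropping exactly
# one occurrence of each window's minimum (assumes n >= 2 and no NA values in x).
olympic_smooth <- function(x, n = 6) {
  vapply(seq_along(x), function(i) {
    if (i < n) return(NA_real_)    # not enough points for a full window yet
    w <- x[(i - n + 1):i]          # the n points ending at position i
    (sum(w) - min(w)) / (n - 1)    # remove one minimum from the sum, then average
  }, numeric(1))
}

olympic_smooth(c(5, 2, 3, 2, 2, 4, 2))
```

Because only a single `min(w)` is subtracted, repeated minimum values inside a window stay in the average, which is the behaviour asked for. If the window should exclude the current point, the index range would need to shift by one.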