Hi. How can I find the mean and median of a given list (column) of numbers after excluding the maximum and minimum values? Thanks.

Suppose the column of interest is named t in the data frame named df.

Then one way to do this is mean(x = sort(x = df$t)[-c(1, length(x = df$t))]), which sorts the column and drops its first (smallest) and last (largest) elements before averaging.

df <- data.frame(r = rnorm(n = 10),
                 s = rnorm(n = 10),
                 t = rnorm(n = 10))

df
#>              r          s           t
#> 1  -1.83591079 -0.1826240  1.45745501
#> 2   1.60511929 -0.7182501 -1.14194643
#> 3   0.39539556  0.3468794 -1.28876245
#> 4   2.25208190 -0.9199469 -2.22691378
#> 5   0.70405190  0.8550220  0.00532131
#> 6  -0.13380410  1.3264477  0.28501685
#> 7  -0.33047812  0.2084321  0.49404108
#> 8   1.34002246  1.4510729 -1.06143957
#> 9   0.47653311 -0.3784250  0.48697362
#> 10  0.03415973 -0.8906984  0.26499137

mean(x = sort(x = df$t)[-c(1, length(x = df$t))]) # what you want
#> [1] -0.2444755

mean(x = df$t) # mean based on all observations
#> [1] -0.2725263

Created on 2019-03-18 by the reprex package (v0.2.1)
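The question also asked about the median, so here is a minimal sketch that wraps the same idea in a small helper. The name drop_extremes is hypothetical (it is not from any package), and the t values are typed in by hand from the output above. It assumes you want to drop exactly one smallest and one largest value, and that missing values should be ignored.

```r
# Hypothetical helper: drop exactly one minimum and one maximum value.
drop_extremes <- function(x) {
  x <- sort(x)  # sort() drops NA values by default
  if (length(x) < 3) {
    stop("need at least three non-missing values")
  }
  x[-c(1, length(x))]
}

# The t column from the reprex above, copied by hand.
t_vals <- c(1.45745501, -1.14194643, -1.28876245, -2.22691378, 0.00532131,
            0.28501685, 0.49404108, -1.06143957, 0.48697362, 0.26499137)

trimmed <- drop_extremes(t_vals)
mean(trimmed)    # same quantity as the one-liner above
median(trimmed)  # median of the remaining values
```

Note that removing one value from each end of a sorted vector leaves the middle untouched, so median(trimmed) is the same as median(t_vals) here; only the mean changes.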

Do you have any particular reason to remove exactly one observation from each end instead of trimming a proportion p from each end?

To complement Anirban's answer, this is a way to do it for all your variables at once:

set.seed(123)
df <- data.frame(r = rnorm(n = 10),
                 s = rnorm(n = 10),
                 t = rnorm(n = 10))
library(dplyr)

df %>% 
    summarise_each(~ mean(.[!. %in% c(min(.), max(.))]))
#>            r         s          t
#> 1 0.03703159 0.2832405 -0.4765888

Created on 2019-03-18 by the reprex package (v0.2.1)
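For readers on a newer dplyr: summarise_each() is deprecated, and a roughly equivalent sketch with across() is shown below. This assumes dplyr 1.0.0 or later is installed. It also illustrates one behavioural difference worth knowing: the %in% filter removes every value tied with the minimum or maximum, while the sort-based approach removes exactly one value from each end.

```r
library(dplyr)

set.seed(123)
df <- data.frame(r = rnorm(n = 10),
                 s = rnorm(n = 10),
                 t = rnorm(n = 10))

# Assumes dplyr >= 1.0.0, where across() is available.
res <- df %>%
  summarise(across(everything(), ~ mean(.x[!.x %in% c(min(.x), max(.x))])))
res

# With ties at the extremes the two approaches differ (made-up data).
x <- c(1, 1, 2, 3, 9, 9)
mean(x[!x %in% c(min(x), max(x))])  # drops both 1s and both 9s
mean(sort(x)[-c(1, length(x))])     # drops a single 1 and a single 9
```

For continuous data such as rnorm() draws, ties are very unlikely, so both approaches give the same answer on the example above.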
I also agree with his suggestion to use a trimmed mean instead of this approach, e.g. mean(df$t, trim = 0.10).
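As a quick check of how trim relates to the manual approach: mean() with trim = p drops floor(n * p) observations from each end, so with 10 observations and trim = 0.10 it removes exactly one from each side. The sketch below uses the t values from the first reprex, typed in by hand.

```r
# The t column from the first reprex, copied by hand.
t_vals <- c(1.45745501, -1.14194643, -1.28876245, -2.22691378, 0.00532131,
            0.28501685, 0.49404108, -1.06143957, 0.48697362, 0.26499137)

manual  <- mean(sort(t_vals)[-c(1, length(t_vals))])  # drop one min, one max
trimmed <- mean(t_vals, trim = 0.10)                  # drop 10% from each end

all.equal(manual, trimmed)
```

With a different number of rows the two can diverge: for 30 observations, trim = 0.10 would drop three values from each end rather than one.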

Thanks very much :) It works perfectly.

Thank you. This way is far easier.
