Hi. How can I find the mean and median of a list (column) of numbers while excluding the maximum and minimum values? Thanks.
Suppose the column of interest is named t in the data frame named df. Then one way of doing this is mean(x = sort(x = df$t)[-c(1, length(x = df$t))]).
df <- data.frame(r = rnorm(n = 10),
                 s = rnorm(n = 10),
                 t = rnorm(n = 10))
df
#> r s t
#> 1 -1.83591079 -0.1826240 1.45745501
#> 2 1.60511929 -0.7182501 -1.14194643
#> 3 0.39539556 0.3468794 -1.28876245
#> 4 2.25208190 -0.9199469 -2.22691378
#> 5 0.70405190 0.8550220 0.00532131
#> 6 -0.13380410 1.3264477 0.28501685
#> 7 -0.33047812 0.2084321 0.49404108
#> 8 1.34002246 1.4510729 -1.06143957
#> 9 0.47653311 -0.3784250 0.48697362
#> 10 0.03415973 -0.8906984 0.26499137
mean(x = sort(x = df$t)[-c(1, length(x = df$t))]) # what you want
#> [1] -0.2444755
mean(x = df$t) # mean based on all observations
#> [1] -0.2725263
Created on 2019-03-18 by the reprex package (v0.2.1)
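The question also asks about the median, which the call above does not cover. A minimal sketch of the same sort-and-drop idea, wrapped in a helper so it can feed either summary: the name drop_extremes is hypothetical (it is not from the thread or any package), and t_vals simply copies the t column printed above.

```r
# Hypothetical helper: remove one smallest and one largest value.
drop_extremes <- function(x) {
  x <- sort(x)            # sort() drops NA values by default
  x[-c(1, length(x))]     # drop the first (minimum) and last (maximum) element
}

# Values copied from the t column in the reprex above.
t_vals <- c(1.45745501, -1.14194643, -1.28876245, -2.22691378, 0.00532131,
            0.28501685, 0.49404108, -1.06143957, 0.48697362, 0.26499137)

mean(drop_extremes(t_vals))     # same quantity as the call in the answer
median(drop_extremes(t_vals))   # median after dropping both extremes
median(t_vals)                  # median of all observations, for comparison
```

Note that removing exactly one value from each end leaves the middle of the sorted vector untouched, so with three or more observations the median does not change; only the mean is affected.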
Do you have any particular reason to remove just one observation from each side instead of trimming a proportion p of the observations?
To complement Anirban's answer, here is a way to do it for all your variables at once:
set.seed(123)
df <- data.frame(r = rnorm(n = 10),
                 s = rnorm(n = 10),
                 t = rnorm(n = 10))
library(dplyr)
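df %>%
  # summarise_all() replaces the deprecated summarise_each();
  # for each column, drop every value equal to its min or max, then average
  summarise_all(~ mean(.[!. %in% c(min(.), max(.))]))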
#> r s t
#> 1 0.03703159 0.2832405 -0.4765888
Created on 2019-03-18 by the reprex package (v0.2.1)
Also, I agree with his observation about using a trimmed mean instead of this approach, e.g. mean(df$t, trim = 0.10).
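To make the comparison concrete, here is a small sketch of how the approaches in this thread relate. It relies on the documented behaviour of mean(): trim is the fraction of observations removed from each end, so floor(n * trim) values are dropped per side. The vectors t_vals (copied from the first reprex) and y (made up to show ties) are illustrative only.

```r
# Values copied from the t column of the first reprex in this thread.
t_vals <- c(1.45745501, -1.14194643, -1.28876245, -2.22691378, 0.00532131,
            0.28501685, 0.49404108, -1.06143957, 0.48697362, 0.26499137)
n <- length(t_vals)

# With n = 10 and trim = 0.10, exactly one value is dropped from each end,
# so the trimmed mean matches the sort-and-drop version.
trimmed <- mean(t_vals, trim = 0.10)
manual  <- mean(sort(t_vals)[-c(1, n)])
all.equal(trimmed, manual)

# The approaches differ when an extreme value is tied (made-up data):
y <- c(1, 1, 2, 3, 10)
by_filter <- mean(y[!y %in% c(min(y), max(y))])  # drops both 1s and the 10
by_sort   <- mean(sort(y)[-c(1, length(y))])     # drops one 1 and the 10
c(by_filter = by_filter, by_sort = by_sort)
```

For other sample sizes, trim = 0.10 removes more than one value per side once n reaches 20, so it only coincides with "drop the min and max" for n between 10 and 19.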
Thanks very much :) It works perfectly.
Thank you. This way is far easier.