Finding top and bottom 1% of multiple columns in a dataframe

Hi, I'd like to write a function that returns the top and bottom 1% of each column in a data frame (tibble). Can anyone help me?

library(tidyverse)
# for every numeric column, store the 1% and 99% quantile cutoffs in a list-column
(result_df <- summarise_if(iris,
                           is.numeric,
                           ~ list(quantile(x = .,
                                           probs = c(0.01, 0.99)))))

# for display purposes: the bottom 1% and top 1% cutoffs per variable
result_df %>% unnest(cols = everything())


# now get the actual values found beyond those cutoffs
# first the low tail, then the high tail
low_1 <- map2(
  .x = names(result_df),
  .y = result_df,
  .f = ~ filter(iris, !!sym(.x) <= (.y %>% unlist() %>% .[[1]])) %>%
    pull(.x)
)

names(low_1) <- names(result_df)
low_1

high_99 <- map2(
  .x = names(result_df),
  .y = result_df,
  .f = ~ filter(iris, !!sym(.x) >= (.y %>% unlist() %>% .[[2]])) %>%
    pull(.x)
)

names(high_99) <- names(result_df)
high_99
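Since the original question asked for a function, the same idea can be wrapped into one. The following is only a minimal sketch in base R, not the approach posted above: the name tail_values and its argument p are made up for illustration, and it assumes "top and bottom 1%" means the values at or beyond the 1% and 99% quantile cutoffs of each numeric column.

```r
# Hypothetical helper (not from the thread): for each numeric column, return
# the values at or below the p quantile and at or above the (1 - p) quantile.
tail_values <- function(df, p = 0.01) {
  # keep only the numeric columns
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  lapply(numeric_cols, function(col) {
    cutoffs <- quantile(col, probs = c(p, 1 - p), na.rm = TRUE)
    list(
      low  = col[!is.na(col) & col <= cutoffs[[1]]],
      high = col[!is.na(col) & col >= cutoffs[[2]]]
    )
  })
}

tails <- tail_values(iris)
tails$Sepal.Length
```

The result is a named list with one entry per numeric column, each holding a low and a high vector, so a single column can be inspected with tails$Sepal.Length$low and so on.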

Hi, my data is representative of a density plot, and I'd like to find the top and bottom 1% of the distribution, not the rows in the column. I hope that's clear, thanks!

map2(
  .x = names(result_df),
  .y = result_df,
  .f = ~ filter(iris, !!sym(.x) >= (.y %>% unlist() %>% .[[2]])) %>% pull(
    .x
  )
)

map2 iterates over two inputs in parallel; in this case the .x and .y params specify them.
Within the .f param, where you supply the function that acts on each pair of elements, you can refer to the current elements as .x and .y.
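To see the .x / .y pairing in isolation, here is a tiny sketch with made-up inputs. The col_names and cutoffs vectors are invented for illustration and have nothing to do with iris.

```r
library(purrr)

# Two parallel inputs of the same length (made-up values for illustration).
col_names <- c("a", "b")
cutoffs   <- c(10, 20)

# On each step, .x is an element of the first input and .y is the
# element at the same position in the second input.
result <- map2(
  .x = col_names,
  .y = cutoffs,
  .f = ~ paste(.x, "cutoff is", .y)
)
result
```

map2 returns a list with one element per pair, which is why low_1 and high_99 above come back as lists that then get their names assigned separately.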

Thanks for your help!

Thanks, I'll give this a go!