Removing outliers from a dataset:

How do I create a filtered version of a dataset that removes the outliers visible in its distribution?

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

The last two functions I typed were:

geom_boxplot(aes(x = trial, y = time)) + coord_flip()
geom_histogram(aes(y = ..density..))

The data table contains two columns: one holds the trial time, and the other numbers the trials (1, 2, 3, 4, ...). I'm supposed to create a second, filtered version of the dataset that removes the outliers visible in the distribution. But how do I do that?

Please read the guide I gave you and at least try to provide a reproducible example. That would make it much easier to help you.

Try this

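library("tidyverse")
# Example data: 96 standard-normal draws plus four extreme values
d <- tibble(x = c(-5, -4, rnorm(96), 4, 5))
d_filtered <- d %>%
  # boxplot(plot = FALSE) returns its statistics; "out" holds the points beyond the whiskers
  mutate(is_outlier = x %in% pluck(boxplot(x, plot = FALSE), "out")) %>%
  # keep only the rows that were not flagged
  filter(!is_outlier)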
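
If you would rather apply the rule directly to your own columns, here is a rough sketch of the same idea written out by hand. It assumes your table has the columns `trial` and `time` (taken from your `geom_boxplot()` call). The `trials` data below is made up for illustration, and the 1.5 × IQR cutoff is only the usual boxplot convention, not necessarily the definition of "outlier" your assignment expects.

```r
library("tidyverse")

# Hypothetical stand-in for your table: a trial number and a trial time
set.seed(1)
trials <- tibble(
  trial = 1:100,
  time  = c(rnorm(98, mean = 10, sd = 1), 25, 40)  # last two values are deliberate outliers
)

# Fences at 1.5 * IQR below the first quartile and above the third quartile
q <- quantile(trials$time, probs = c(0.25, 0.75))
fence <- 1.5 * (q[[2]] - q[[1]])

# Keep only the rows whose time lies inside the fences
trials_filtered <- trials %>%
  filter(time >= q[[1]] - fence, time <= q[[2]] + fence)
```

Note that base `boxplot()` computes its whiskers from hinges rather than from `quantile()`, so a borderline point may be classified differently by the two approaches.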
