Remove factor level

rstudio

#1

Hello users!

In my dataset I have this factor with 3 levels:

> summary(dataset$Frutta.verdura)
   No\\raramente Non regolarmente     Regolarmente 
               1               18               50 

The number below each level is the count of observations at that level.

I want to draw a boxplot comparing a numeric variable (Energia.giornaliera) across these levels, but without the level "No\raramente": it occurs only once, so it adds little to the statistics.

How do I remove the level from that dataframe's factor?

I've only found functions that remove unused factor levels, such as drop.levels(), and I'm having a hard
time solving this one.

Thank you in advance!

Note:
My dataset looks something like this:

index, Frutta.verdura, Energia.giornaliera
[1], Regolarmente, 1
[2], Non Regolarmente, 7
[3], Regolarmente, 4
[4], Regolarmente, 9
...
[69], Non Regolarmente, 10

The index column was generated automatically, by the way.

I want this graph without the level "No\raramente":
[Image: boxplot]


#2

Hello,

To work with factor levels, you should look at the {forcats} package:
https://forcats.tidyverse.org/

In particular, you'll find functions there that change the levels of a factor, such as fct_collapse, fct_lump or fct_relabel.

Take a look at the reference page to find the function that suits you best.
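As a rough illustration of the forcats suggestion, here is a minimal sketch. The vector `frutta` is a hypothetical stand-in for `dataset$Frutta.verdura`, rebuilt from the counts shown in the question, and the threshold of 5 is an arbitrary choice for the example. Note that both options keep the single rare observation and only reassign it to another level; they do not delete it.

```r
library(forcats)

# Hypothetical stand-in for dataset$Frutta.verdura (counts taken from the question)
frutta <- factor(rep(c("No\\raramente", "Non regolarmente", "Regolarmente"),
                     times = c(1, 18, 50)))

# Option 1: merge the rare level into a neighbouring level
merged <- fct_collapse(frutta,
                       "Non regolarmente" = c("No\\raramente", "Non regolarmente"))
table(merged)

# Option 2: lump every level with fewer than 5 observations into "Other"
# (the threshold 5 is an assumption made for this example)
lumped <- fct_lump_min(frutta, min = 5, other_level = "Other")
table(lumped)
```

Whether merging is appropriate depends on whether "No\raramente" can reasonably be treated as part of another category; if not, filtering the row out is probably the safer choice.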


#3

You do not need to remove the level; you can simply filter it out before plotting, like so:

# Load libraries
library('tidyverse')

# Reproducible example
set.seed(636831)

# Create dummy data
d <- tibble(my_lbl = factor(rep(c('A', 'B', 'C'), times = c(2, 9, 9))),
            my_val = rnorm(20))

# View the data we created
d
# A tibble: 20 x 2
   my_lbl  my_val
   <fct>    <dbl>
 1 A       0.491 
 2 A      -0.683 
 3 B      -0.980 
 4 B      -0.369 
 5 B      -2.77  
 6 B       0.204 
 7 B      -0.427 
 8 B       0.0414
 9 B       1.61  
10 B      -1.36  
11 B      -2.81  
12 C      -0.0593
13 C      -0.483 
14 C      -0.856 
15 C       0.217 
16 C       2.06  
17 C      -1.93  
18 C      -0.1000
19 C      -0.994 
20 C       1.61 

# Plot all levels
d %>%
  ggplot(aes(x = my_lbl, y = my_val, fill = my_lbl)) +
  geom_boxplot() +
  theme_bw()

# Plot, but leave out level 'A'
d %>%
  filter(my_lbl != 'A') %>%
  ggplot(aes(x = my_lbl, y = my_val, fill = my_lbl)) +
  geom_boxplot() +
  theme_bw()

Hope it helps!