ggplot boxplot one variable?

Hi!

I am not quite sure I understand ggplot's mapping. I like the look of the box plot from the Boxplot function in the car package, but it doesn't take a horizontal option, and I'd like to understand how to use ggplot for this purpose. I've provided example code, along with the Boxplot code for comparison.

I basically just want to draw a boxplot of the sub_scale item but keep the student_ID values in the data set so that I can track down outliers. I just don't understand how to do this without the student_ID values becoming part of the chart. If I am not clear enough about what I am asking, please let me know and I will try to clarify further, but hopefully the example will be helpful.


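library(ggplot2)
library(dplyr)      # provides %>%
library(tidyr)      # provides pivot_longer()
library(hrbrthemes) # provides theme_ipsum()

sub_scale <- c(1.00, 3.45, 3.00, 3.00, 3.00, 3.24, 4.00, 17, 22, 75)
student_ID <- c(1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009)

xData <- data.frame(sub_scale, student_ID) # data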


# Notice that student_ID values are being used

xData %>%
  ggplot(aes(x = student_ID, y = sub_scale, fill = sub_scale)) +
  geom_point() +
  geom_boxplot() +
  stat_boxplot(geom = "errorbar") +
  theme_ipsum() +
  theme(legend.position = "none") +
  coord_flip() +
  xlab("") +
  ylab("")

## Even if I try pivot longer it doesn't quite do it...

DFplot <- xData %>%
  pivot_longer(cols = c("student_ID", "sub_scale"), names_to = "text", values_to = "value")

DFplot %>%
  ggplot(aes(x = text, y = value)) +
  geom_point() +
  geom_boxplot() +
  stat_boxplot(geom = "errorbar") +
  theme_ipsum() +
  theme(legend.position = "none") +
  coord_flip() +
  xlab("") +
  ylab("")


# Here is the graph I'd like to mimic with ggplot, only displayed horizontally

library(car)
Boxplot(Data$slf_eff_T1,
        main = "Self Efficacy",
        xlab = "Response Values",
        ylab = "Self Efficacy",
        col = "orange",
        border = "brown")

I'm not 100% sure what you want the final plot to look like. This bit of code should make it, but it doesn't seem to run (what is "Data"?). Could you double-check this, or alternatively just upload a picture of the output here? Cheers!

I am not entirely sure what you are up to (partly because the example code at the bottom did not work on my computer), but I suppose you want to have outliers labeled with the student id. I am basically copying the solution suggested on a related StackExchange Thread.

is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}

xData |>
  mutate(outlier = ifelse(is_outlier(sub_scale), student_ID, as.numeric(NA))) |>
  ggplot(aes(x = "", y=sub_scale)) +
  stat_boxplot(geom = "errorbar", width = 0.5) +  
  geom_boxplot() +
  labs(y = NULL, x = NULL) +
  geom_text(aes(label = outlier, x = ""), na.rm = TRUE, vjust = -0.8) +
  coord_flip(clip = 'off') +
  theme_ipsum(grid = "X") 
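If you would rather look up the flagged students in a table instead of reading labels off the plot, the same 1.5 * IQR rule can be applied directly to the data frame. This is only a minimal sketch in base R, reusing the example data from the question; the name `flagged` is my own and not part of any package.

```r
# Example data, copied from the question
sub_scale  <- c(1.00, 3.45, 3.00, 3.00, 3.00, 3.24, 4.00, 17, 22, 75)
student_ID <- c(1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009)
xData <- data.frame(sub_scale, student_ID)

# Same 1.5 * IQR rule as is_outlier() above, written out step by step
is_outlier <- function(x) {
  lower <- quantile(x, 0.25) - 1.5 * IQR(x)
  upper <- quantile(x, 0.75) + 1.5 * IQR(x)
  x < lower | x > upper
}

# Keep only the rows whose sub_scale value falls outside the fences
# (`flagged` is a hypothetical name chosen for this sketch)
flagged <- xData[is_outlier(xData$sub_scale), ]
print(flagged)
```

With the example data, only the value 75 lies beyond the upper fence, so the result contains the single row for student 1009.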
