I'm pretty new to R and I'm a bit stuck. I have to create a variable that categorises study hours into low, medium and high. I already have a column with the study hours, and I created a new variable called Study_Effort, but I don't know how to categorise the hours into low, medium and high.
You can use the cut function for this. For example:
library(tidyverse)

# Fake data
set.seed(3)
d = tibble(study.hours = rnorm(100, 20, 8))
hist(d$study.hours)

d = d %>%
  mutate(study.hours.category = cut(study.hours, breaks=c(0,15,30,Inf),
                                    labels=c("low","medium","high"),
                                    include.lowest=TRUE))
d
#> # A tibble: 100 × 2
#>    study.hours study.hours.category
#>          <dbl> <fct>
#>  1        12.3 low
#>  2        17.7 medium
#>  3        22.1 medium
#>  4        10.8 low
#>  5        21.6 medium
#>  6        20.2 medium
#>  7        20.7 medium
#>  8        28.9 medium
#>  9        10.2 low
#> 10        30.1 high
#> # … with 90 more rows

ggplot(d, aes(study.hours.category, study.hours)) +
  geom_point() +
  expand_limits(y=0)
Created on 2022-01-05 by the reprex package (v2.0.1)
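If it helps, here is a minimal base-R sketch of the same cut approach, written to fill a Study_Effort column directly and to show what happens to hours that fall exactly on a break. The data frame df and the column name study_hours are assumptions made for illustration (swap in your own names), and the 15 and 30 hour cut-offs are just the ones used in the example above.

```r
# Hypothetical data: `df` and `study_hours` are assumed names, not from the question.
df <- data.frame(study_hours = c(0, 5, 15, 15.5, 30, 42))

# Default right = TRUE: the bins are [0,15], (15,30], (30,Inf],
# so exactly 15 counts as "low" and exactly 30 counts as "medium".
df$Study_Effort <- cut(df$study_hours,
                       breaks = c(0, 15, 30, Inf),
                       labels = c("low", "medium", "high"),
                       include.lowest = TRUE)

# right = FALSE: the bins are [0,15), [15,30), [30,Inf],
# so exactly 15 counts as "medium" and exactly 30 counts as "high".
df$Study_Effort_left <- cut(df$study_hours,
                            breaks = c(0, 15, 30, Inf),
                            labels = c("low", "medium", "high"),
                            right = FALSE,
                            include.lowest = TRUE)

# Values outside the breaks (e.g. a negative number of hours) become NA.
outside <- cut(-1, breaks = c(0, 15, 30, Inf),
               labels = c("low", "medium", "high"),
               include.lowest = TRUE)

# Count how many rows fall in each category.
table(df$Study_Effort)
```

Which side of each interval is closed only matters for values that sit exactly on a break, so pick right = TRUE or right = FALSE according to how you want 15 and 30 hours classified. If your real data can contain values below the lowest break, check for NA in the result.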