How to group heights by age range

I have a data frame with an age column and a height column. I want to group the heights by age range (e.g. 21-25, 26-30, etc.), so that the output for the age group 21-25 is the heights 166, 175, 172, etc.
How can I do that? If possible, I'd like to use the function split().

Why do you want to use the function split()?
Is it homework?
A suitable package for data frames would be dplyr, or are you not allowed to use that?

It's for an exercise. Other approaches are okay too.
My main goal is to make a boxplot of heights stratified by age. But rather than using every single age value, I want to group the ages into ranges.
Can you give an example?

I've never used split before. I would use case_when() to group:

library(tidyverse)
df <- tribble(
  ~age, ~ht,
  25,   165,
  30,   170,
  35,   175,
  40,   170,
  45,   165
)

df <- df %>%
  mutate(
    age_g = case_when(
      age > 35 ~ ">35",
      age <= 35 ~ "<=35"
    )
  )

df

Yields:

# A tibble: 5 x 3
    age    ht age_g
  <dbl> <dbl> <chr>
1    25   165 <=35 
2    30   170 <=35 
3    35   175 <=35 
4    40   170 >35  
5    45   165 >35

and the box plot:

df %>%
  ggplot(aes(x = age_g, y = ht)) +
  geom_boxplot()

That gives me a clue.
But suppose the age data is more varied.
Example:

age   ht
23    173
20    168
28    170
22    166
27    175
26    175

I want to stratify it so that the outcome is more or less like this:

$`20-24`
173 168 166

$`25-29`
170 175 175

etc.

This is called 'binning'. Of course, whatever you bin by you can later group by, but binning comes first.
You haven't said whether the binning would be done manually, with break points of your choice, or whether a principled way should be found to choose the breaks. The cut() function is a good built-in way to get equal-width breaks on a variable. I believe the binr package has other, perhaps more sophisticated, binning functions.
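
A minimal sketch of the cut() approach in base R, using the sample data from the question. The break points (5-year bins from 20 to 30), the labels, and the column name age_group are assumptions chosen for illustration, and the labels assume whole-number ages:

```r
# Sample data from the question
df <- data.frame(
  age = c(23, 20, 28, 22, 27, 26),
  ht  = c(173, 168, 170, 166, 175, 175)
)

# Assumed break points: 5-year bins starting at 20.
# right = FALSE makes each interval closed on the left: [20,25), [25,30)
breaks <- seq(20, 30, by = 5)
labels <- c("20-24", "25-29")

# Bin the ages into a factor of age groups
df$age_group <- cut(df$age, breaks = breaks, labels = labels, right = FALSE)

# split() returns a list with one vector of heights per age group
ht_by_group <- split(df$ht, df$age_group)
ht_by_group
```

With these breaks, an age of exactly 30 or above falls outside every bin and becomes NA, so the sequence would need extending for wider data. For the boxplot, something like boxplot(ht ~ age_group, data = df) should then draw one box per bin.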

Yes, I solved it with the cut() function. Thank you.
And thanks again for introducing the term "binning".
