Extracting data

Hi there

I have a dataset with about five variables, and I want to extract the rows that meet certain criteria. For example, my headers could be "Name", "Age", "Date of birth", "Sex" and maybe "Age group", and let's say the dataset is called Class1. Usually, to extract rows or columns, I'd use Class1[1,] to get the first row, or Class1[1,2] to get the value in the first row and second column, right?

Now, with my hypothetical Class1 dataset, I want to extract the information for the person who was born last in each age group. For example, I could have age groups of 18-24, 25-30, etc., and I want to know the name, age, date of birth, sex and age group of the person whose date of birth is the earliest or latest in each group. How would I go about doing that? I know I would need to use the min or max function somehow when extracting, but I just don't know how to go about it.

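I would first find the maximum DateOfBirth within each age group. That is the latest birth date, so it identifies the youngest person; use min() instead of max() to find the oldest.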

library(dplyr)
# Latest date of birth per age group (assumes DateOfBirth is stored as a Date)
MaxDates <- Class1 %>%
  group_by(AgeGroup) %>%
  summarize(MaxDate = max(DateOfBirth))

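I would then filter the original data with semi_join(), which keeps only the rows of Class1 that have a match in MaxDates. Matching on AgeGroup as well as on the date prevents a row from being returned just because its date of birth happens to equal the maximum of a different group.

FilteredData <- semi_join(Class1, MaxDates, by = c("AgeGroup", "DateOfBirth" = "MaxDate"))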

Obviously, I could not test that, not having your data.
Note that I changed your column names to remove the spaces, since spaces in names cause needless trouble.
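
For anyone who wants something to run, here is a minimal sketch on made-up data. The five rows below are invented purely for illustration and are not the original poster's data. It assumes DateOfBirth is stored as a Date, and it uses a one-step alternative to the summarize/semi_join approach: filtering within each group on max() or min().

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
Class1 <- data.frame(
  Name        = c("Ann", "Ben", "Cara", "Dev", "Eli"),
  Age         = c(19, 23, 26, 29, 23),
  DateOfBirth = as.Date(c("2004-03-01", "2000-07-15", "1997-01-20",
                          "1994-11-05", "2000-02-10")),
  Sex         = c("F", "M", "F", "M", "M"),
  AgeGroup    = c("18-24", "18-24", "25-30", "25-30", "18-24"),
  stringsAsFactors = FALSE
)

# Youngest person per age group: the row(s) with the latest date of birth
Youngest <- Class1 %>%
  group_by(AgeGroup) %>%
  filter(DateOfBirth == max(DateOfBirth)) %>%
  ungroup()

# Oldest person per age group: the row(s) with the earliest date of birth
Oldest <- Class1 %>%
  group_by(AgeGroup) %>%
  filter(DateOfBirth == min(DateOfBirth)) %>%
  ungroup()

print(Youngest)  # Ann (18-24) and Cara (25-30)
print(Oldest)    # Eli (18-24) and Dev (25-30)
```

If two people in a group share the same extreme date of birth, both rows are kept, which is also how the semi_join() version behaves.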
