Filtering a vector of dates by year

Hi everybody,

I need to filter a vector of dates by year before calculating how many protests in the data set occurred in each month of 2020 (the data set contains information on protests over several years), but I have not been able to figure out how to do this.

Can anyone explain how I might go about filtering the original data set (protest_dates) by both month and year, bearing in mind that I am only allowed to use the stringr package for this assignment?

Thank you for your help!

Hi, can you provide a reproducible example of your dataset? We don't have your data and won't be able to help otherwise.

Okay, I don't know exactly how to share the data set in isolation, but here's the link to the assignment itself: RStudio Cloud

That is a link to a private project; other people do not have access to it.

Please have a look at our homework policy. Homework-inspired questions are welcome, but they should not include verbatim instructions from your course.

I get that. That's why I tried to phrase my original question so that it would not be necessary to access the data itself. I'm just trying to understand the concepts associated with these types of data.

If you are only allowed to use stringr (which is a suboptimal way of working with dates), you will have to chop up each date string into its components (typically day, month, and year) and go from there.

I personally prefer to deal with dates with the R package lubridate, and filter/subset with dplyr. If you are dealing with a lot of date-times, the clock package is also quite handy.

Here's a nice vignette on extracting date components from a date vector with lubridate.

Here's some info on dplyr's filter function: Subset rows using column values — filter • dplyr
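For comparison, here is a minimal sketch of the lubridate + dplyr approach mentioned above. The `protests` tibble and its `date` column are made up for illustration (they are not the assignment's data), and this would not satisfy a stringr-only restriction.

```r
library(dplyr)
library(lubridate)

# Hypothetical example data: one row per protest, with a Date column
protests <- tibble(
  date = as.Date(c("2019-12-30", "2020-01-05", "2020-01-17",
                   "2020-03-02", "2021-03-09"))
)

# Keep only rows from 2020, then count the rows in each month
protests_2020 <- protests %>%
  filter(year(date) == 2020) %>%
  count(month = month(date))

protests_2020
```

Here `year()` and `month()` pull the numeric year and month out of each date, `filter()` keeps the matching rows, and `count()` tallies them per month.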

With stringr: dates usually have a consistent separator and format, for example year, month, and day separated by a slash or dash. Using only stringr, there's a nice set of split and substring functions:

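library(stringr)  # provides str_split_fixed(), str_sub(), and the %>% pipe

# Example dates in "YYYY-MM-DD" format
date <- c("2022-01-01", "2022-01-02")

# Split each date into year, month, and day columns
date %>% str_split_fixed("-", n = 3)
#>      [,1]   [,2] [,3]
#> [1,] "2022" "01" "01"
#> [2,] "2022" "01" "02"

# Extract the year (characters 1 to 4)
date %>% str_sub(start = 1, end = 4)
#> [1] "2022" "2022"
# Extract the month (characters 6 to 7) as a number
date %>% str_sub(start = 6, end = 7) %>% as.numeric()
#> [1] 1 1
# Extract the day (characters 9 to 10) as a number
date %>% str_sub(start = 9, end = 10) %>% as.numeric()
#> [1] 1 2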

Created on 2022-04-19 by the reprex package (v2.0.1)
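Putting those pieces together, here is a rough sketch of filtering by year and then counting by month using only stringr plus base R's table(). The values in `protest_dates` below are invented stand-ins, and the code assumes the real vector holds "YYYY-MM-DD" strings; if it is a Date vector, converting it with as.character() first should give that format.

```r
library(stringr)

# Hypothetical stand-in for protest_dates; assumes "YYYY-MM-DD" strings
protest_dates <- c("2019-12-30", "2020-01-05", "2020-01-17",
                   "2020-03-02", "2021-03-09")

# Keep only the dates whose string starts with "2020-"
dates_2020 <- str_subset(protest_dates, "^2020-")

# Pull out the two-digit month from each remaining date
months_2020 <- str_sub(dates_2020, start = 6, end = 7)

# Count how many protests fall in each month of 2020
protests_per_month <- table(months_2020)
protests_per_month
```

Anchoring the pattern with `^` keeps it from matching "2020" anywhere other than the year position. If the dates use a different layout (for example "DD/MM/YYYY"), the pattern and the str_sub() positions would need to change accordingly.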
