Separate YYYY-MM-DD into Individual Columns

Hi all -

I am fairly new to R and I am pulling my hair out trying to do what is probably something super simple.

I downloaded the crime data for Los Angeles from 2010 - 2019. There are 2,114,010 rows of data. Right now, it is called 'df' in my Global Environment area.

I want to manipulate one specific column titled "Occurred" - which is a date reference to when the crime occurred.

Right now, it is set up as YYYY-MM-DD (i.e., 2010-02-20).

I am trying to separate all three into individual columns. I have Googled, and Googled, and Googled and tried and tried and tried things from this forum and StackExchange and just cannot get it to work.

I have tried lubridate and followed the instructions in other answers, but it simply won't create new columns (one each for Year, Month, Day).

Any help in getting this to work would be greatly appreciated!

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

Hello,

I think this is what you are looking for below.

I did not include all of the different variables, because they aren't the issue.

As mentioned in the OP, I am trying to separate 'occurred' into individual Year, Month, and Day columns.

> head(df, 10)[c('dr_no','occurred','time','area_name')]
       dr_no   occurred time area_name
1    1307355 2010-02-20 1350    Newton
2   11401303 2010-09-12   45   Pacific
3   70309629 2010-08-09 1515    Newton
4   90631215 2010-01-05  150 Hollywood
5  100100501 2010-01-02 2100   Central
6  100100506 2010-01-04 1650   Central
7  100100508 2010-01-07 2005   Central
8  100100509 2010-01-08 2100   Central
9  100100510 2010-01-09  230   Central
10 100100511 2010-01-06 2100   Central

This is close, as it retrieves a portion of your data frame. However, you haven't used dput(), so what you have shared is not 'runnable'. dput() output would be runnable.
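For reference, here is a minimal sketch of what sharing data with dput() could look like. It assumes the data frame is called df, as in the original post. The small stand-in frame is built from the first rows of the printout above, and its column classes are an assumption, since the real ones are not shown.

```r
# Stand-in for the first rows of the poster's `df`
# (values copied from the printout above; column classes are assumed)
df <- data.frame(
  dr_no = c(1307355, 11401303, 70309629),
  occurred = c("2010-02-20", "2010-09-12", "2010-08-09"),
  time = c(1350, 45, 1515),
  area_name = c("Newton", "Pacific", "Newton"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object, so others can paste and run it
dput(head(df, 10)[c("dr_no", "occurred", "time", "area_name")])

# str() shows the class of each column, e.g. whether `occurred` is character or Date
str(df)
```

Pasting the dput() output into a reply lets others rebuild exactly the same object, including the column classes.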

Not really, that is not copy/paste friendly and it doesn't show the structure of your data, so I can't know the class of the occurred variable.

If I assume that it is of class character, you can do something like this:
Note: Please notice the way I'm posting the solution, that would be a proper reproducible example as explained in the link I gave you.

library(tidyverse)

sample_df <- data.frame(
  stringsAsFactors = FALSE,
             dr_no = c(1307355,11401303,70309629,
                       90631215,100100501,100100506,100100508,100100509,
                       100100510,100100511),
          occurred = c("2010-02-20","2010-09-12",
                       "2010-08-09","2010-01-05","2010-01-02","2010-01-04",
                       "2010-01-07","2010-01-08","2010-01-09","2010-01-06"),
              time = c(1350, 45, 1515, 150, 2100, 1650, 2005, 2100, 230, 2100),
         area_name = c("Newton","Pacific","Newton",
                       "Hollywood","Central","Central","Central","Central",
                       "Central","Central")
)

sample_df %>%
  separate(occurred, into = c("year", "month", "day"), sep = "-")
#>        dr_no year month day time area_name
#> 1    1307355 2010    02  20 1350    Newton
#> 2   11401303 2010    09  12   45   Pacific
#> 3   70309629 2010    08  09 1515    Newton
#> 4   90631215 2010    01  05  150 Hollywood
#> 5  100100501 2010    01  02 2100   Central
#> 6  100100506 2010    01  04 1650   Central
#> 7  100100508 2010    01  07 2005   Central
#> 8  100100509 2010    01  08 2100   Central
#> 9  100100510 2010    01  09  230   Central
#> 10 100100511 2010    01  06 2100   Central

Created on 2020-04-25 by the reprex package (v0.3.0.9001)
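If the column turns out to be of class Date rather than character, or if numeric year, month and day values are preferred, something along these lines might work instead. This is only a sketch in base R on a made-up three-row frame (sample_dates is a hypothetical name, not from the thread). lubridate's year(), month() and day() functions return the same components from a Date column.

```r
# Hypothetical three-row frame, with dates taken from the printout earlier in the thread
sample_dates <- data.frame(
  occurred = c("2010-02-20", "2010-09-12", "2010-08-09"),
  stringsAsFactors = FALSE
)

# Parse the text into Date objects
sample_dates$occurred <- as.Date(sample_dates$occurred, format = "%Y-%m-%d")

# Extract each component as an integer and store it in a new column
sample_dates$year  <- as.integer(format(sample_dates$occurred, "%Y"))
sample_dates$month <- as.integer(format(sample_dates$occurred, "%m"))
sample_dates$day   <- as.integer(format(sample_dates$occurred, "%d"))

print(sample_dates)
```

Note that separate() returns a new data frame and does not change the original, so its result has to be assigned (for example back to df) for the new columns to persist.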
