Lubridate, change factor to date

Hi there, I am looking to change a column of factors (displayed as e.g. 01Apr2015) into dates. There are two years of dates with multiple rows for each date. I need to create a cross section date from 31st March 2017 and create a new dataset to include the 56 days before this. However, I am not able to filter this column (called "date") because it is a factor variable.

Does anyone have any suggestion about how I could do this using lubridate?

Many thanks,

Clare

Hi Clare,

The trick is to convert the factor to character. Try this:

library(lubridate)
xx  <-  as.factor("2020-10-12")

aa  <- ymd(as.character(xx)) 

Thanks for this. I am very new to R and not quite sure what the xx and aa stand for. My data sheet name is Activity2015_2017 and the column with the date is ActivityDate. Do I use these names or do I create new names for xx and aa, and if so, what do they represent?

Many thanks again

If you need more specific help, please provide a proper REPRoducible EXample (reprex) illustrating your issue.

Otherwise, we can only give you general examples like this one:

library(dplyr)
library(lubridate)

# Made up sample data (replace this with your actual data frame)
Activity2015_2017 <- data.frame(
    ActivityDate = factor(c("01Apr2015", "01Apr2015", "02Apr2015", "03Apr2015"))
)

Activity2015_2017 %>% 
    mutate(ActivityDate = dmy(ActivityDate))
#>   ActivityDate
#> 1   2015-04-01
#> 2   2015-04-01
#> 3   2015-04-02
#> 4   2015-04-03

Created on 2020-11-22 by the reprex package (v0.3.0.9001)

Oh sorry. It was a bit late when I wrote that and I see now I was too brief.

I needed a factor to work with so I created a factor called xx with :

xx  <-  as.factor("2020-10-12")

Do

class(xx)

or

str(xx)

to see that it is, in fact, a factor.

It is safest not to convert a factor straight to a date, because a factor stores integer codes behind its text labels. So first we convert the factor value to a character value using as.character(). Then we apply the ymd() function to convert it to date format.

To make it a little simpler, here it is step by step:

## Create factor data
xx  <-  as.factor("2020-10-12")

## load needed library
library(lubridate)

## Convert factor to character
 yy  <-  as.character(xx)

## Convert character data to date
aa  <-  ymd(yy)

Et voilà.
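For anyone wondering why the as.character() step is worth keeping, here is a minimal sketch using only base R. A factor stores integer codes with text labels attached, so turning it into a number gives the codes rather than the labels. The object names (`f`, `codes`, `labels`) and the two dates are made up for illustration.

```r
# A factor with two date-like labels (made-up values)
f <- factor(c("2020-10-12", "2019-01-05"))

# The levels are the sorted unique labels
levels(f)

# The underlying integer codes index into the levels; they are not the dates
codes <- as.integer(f)

# as.character() recovers the original text, which is what ymd() should see
labels <- as.character(f)
labels
```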

Assuming you have a column of data called "xx" in a data.frame or a tibble called "dat1", to make it easy to see what is happening we can do this:

## Convert to character data
yy   <-  as.character(dat1$xx)

## Convert to date data and replace the old format data with the new date format data in the tibble or data.frame
dat1$xx  <-  ymd(yy)

Once you are more familiar with R, it is more convenient to do it in one line, as in:

dat1$xx  <-  ymd(as.character(dat1$xx))

So, assuming your data column is "Activity2015_2017$ActivityDate", and changing to the lubridate function dmy() to match your date format, this should work:

Activity2015_2017$ActivityDate  <-  dmy(as.character(Activity2015_2017$ActivityDate))
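After a conversion like this it is worth checking the result. A rough sketch, assuming an English locale (the month abbreviation "Apr" is matched against the locale's month names) and using a made-up stand-in for the real data; `converted` and `bad` are hypothetical names. Strings that lubridate cannot parse come back as NA, so counting the NAs shows how many rows failed.

```r
library(lubridate)

# Made-up stand-in for the real data frame; the last value is deliberately malformed
Activity2015_2017 <- data.frame(
    ActivityDate = factor(c("01Apr2015", "02Apr2015", "not a date"))
)

# dmy() warns about values it cannot parse; suppressWarnings() keeps the demo quiet
converted <- suppressWarnings(dmy(as.character(Activity2015_2017$ActivityDate)))

class(converted)          # "Date" if the conversion worked
bad <- is.na(converted)   # TRUE where parsing failed
sum(bad)                  # number of rows that did not convert
```

If the count is not zero, looking at `Activity2015_2017$ActivityDate[bad]` shows which original values need cleaning.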

You might find these links useful.

Good luck and welcome to R

Thank you so much for this, you have completely solved the issue! I really appreciate your description of the process too.

You have spent lots of time helping me already and you probably won't have time, but I just wondered if you might know how I can create a subset of my tibble Activity2015_2017 only looking at dates from 31st March 2017 and the 56 days before? (i.e. 2017-02-03 to 2017-03-31)

Thank you so much again, this has been really helpful

A basic subset() should do this.

Sample data

dat1 <- structure(list(dats = structure(c(17167, 17168, 17169, 17170, 
17171, 17172, 17173, 17174, 17175, 17176, 17177, 17178, 17179, 
17180, 17181, 17182, 17183, 17184, 17185, 17186, 17187, 17188, 
17189, 17190, 17191, 17192, 17193, 17194, 17195, 17196, 17197, 
17198, 17199, 17200, 17201, 17202, 17203, 17204, 17205, 17206, 
17207, 17208, 17209, 17210, 17211, 17212, 17213, 17214, 17215, 
17216, 17217, 17218, 17219, 17220, 17221, 17222, 17223, 17224, 
17225, 17226, 17227, 17228, 17229, 17230, 17231, 17232, 17233, 
17234, 17235, 17236, 17237, 17238, 17239, 17240, 17241, 17242, 
17243, 17244, 17245, 17246, 17247, 17248, 17249, 17250, 17251, 
17252, 17253, 17254, 17255, 17256, 17257, 17258, 17259, 17260, 
17261, 17262, 17263, 17264, 17265, 17266, 17267, 17268, 17269, 
17270, 17271, 17272, 17273, 17274, 17275, 17276, 17277, 17278, 
17279, 17280, 17281, 17282, 17283, 17284, 17285, 17286, 17287, 
17288, 17289, 17290, 17291, 17292, 17293, 17294, 17295, 17296, 
17297, 17298, 17299, 17300, 17301, 17302, 17303, 17304, 17305, 
17306, 17307, 17308, 17309, 17310, 17311, 17312, 17313, 17314, 
17315, 17316, 17317, 17318, 17319, 17320, 17321, 17322, 17323, 
17324, 17325, 17326, 17327, 17328, 17329, 17330, 17331, 17332, 
17333, 17334, 17335, 17336, 17337, 17338, 17339, 17340, 17341, 
17342, 17343, 17344, 17345, 17346, 17347, 17348, 17349, 17350, 
17351, 17352, 17353, 17354, 17355, 17356, 17357, 17358, 17359, 
17360, 17361, 17362, 17363, 17364, 17365, 17366, 17367, 17368, 
17369, 17370, 17371, 17372, 17373, 17374, 17375, 17376, 17377, 
17378, 17379, 17380, 17381, 17382, 17383, 17384, 17385, 17386, 
17387, 17388, 17389, 17390, 17391, 17392, 17393, 17394, 17395, 
17396, 17397, 17398, 17399, 17400, 17401, 17402, 17403, 17404, 
17405, 17406, 17407, 17408, 17409, 17410, 17411, 17412, 17413, 
17414, 17415, 17416, 17417, 17418, 17419, 17420, 17421, 17422, 
17423, 17424, 17425, 17426, 17427, 17428, 17429, 17430, 17431, 
17432, 17433, 17434, 17435, 17436, 17437, 17438, 17439, 17440, 
17441, 17442, 17443, 17444, 17445, 17446, 17447, 17448, 17449, 
17450, 17451, 17452, 17453, 17454, 17455, 17456, 17457, 17458, 
17459, 17460, 17461, 17462, 17463, 17464, 17465, 17466, 17467, 
17468, 17469, 17470, 17471, 17472, 17473, 17474, 17475, 17476, 
17477, 17478, 17479, 17480, 17481, 17482, 17483, 17484, 17485, 
17486, 17487, 17488, 17489, 17490, 17491, 17492, 17493, 17494, 
17495, 17496, 17497, 17498, 17499, 17500, 17501, 17502, 17503, 
17504, 17505, 17506, 17507, 17508, 17509, 17510, 17511, 17512, 
17513, 17514, 17515, 17516, 17517, 17518, 17519, 17520, 17521, 
17522, 17523, 17524, 17525, 17526, 17527, 17528, 17529, 17530, 
17531), class = "Date"), nums = c("i", "p", "c", "v", "y", "j", 
"w", "v", "x", "m", "y", "c", "s", "r", "u", "x", "z", "w", "f", 
"t", "i", "g", "s", "f", "e", "l", "h", "p", "a", "d", "w", "s", 
"i", "t", "q", "f", "v", "z", "f", "u", "u", "m", "q", "b", "f", 
"z", "b", "u", "p", "k", "d", "c", "z", "d", "u", "j", "f", "f", 
"i", "k", "s", "y", "v", "n", "h", "q", "m", "y", "p", "b", "f", 
"x", "k", "f", "s", "w", "o", "u", "v", "n", "w", "s", "s", "g", 
"i", "p", "h", "c", "z", "p", "v", "w", "b", "e", "p", "u", "i", 
"c", "h", "c", "h", "s", "y", "d", "x", "t", "s", "z", "g", "q", 
"y", "c", "b", "e", "o", "o", "j", "v", "y", "q", "c", "y", "p", 
"i", "t", "l", "e", "q", "c", "o", "v", "e", "f", "r", "s", "r", 
"w", "k", "p", "f", "g", "h", "d", "k", "o", "q", "k", "g", "j", 
"x", "k", "c", "z", "r", "k", "x", "a", "h", "h", "v", "j", "e", 
"t", "l", "j", "t", "a", "v", "f", "c", "d", "d", "k", "c", "a", 
"s", "a", "y", "l", "s", "j", "u", "m", "l", "a", "u", "m", "j", 
"v", "m", "h", "r", "w", "b", "x", "f", "y", "u", "n", "f", "e", 
"k", "n", "i", "t", "t", "a", "a", "o", "f", "u", "j", "b", "k", 
"y", "r", "l", "k", "s", "e", "y", "b", "f", "b", "e", "w", "m", 
"z", "t", "e", "i", "w", "f", "o", "v", "t", "i", "b", "k", "x", 
"l", "x", "e", "s", "g", "t", "f", "e", "o", "d", "w", "m", "b", 
"j", "d", "r", "t", "j", "b", "q", "d", "v", "j", "x", "s", "m", 
"n", "i", "m", "w", "o", "s", "z", "j", "m", "a", "f", "u", "v", 
"c", "o", "s", "a", "q", "r", "o", "o", "u", "d", "n", "n", "q", 
"j", "p", "w", "v", "j", "q", "n", "f", "v", "v", "o", "g", "u", 
"e", "u", "f", "m", "k", "v", "t", "h", "p", "s", "r", "p", "l", 
"w", "o", "h", "d", "m", "n", "i", "s", "a", "d", "x", "n", "j", 
"x", "m", "m", "j", "s", "k", "o", "u", "o", "k", "d", "v", "q", 
"n", "n", "r", "v", "h", "m", "t", "k", "c", "v", "k", "j", "j", 
"r", "z", "c", "d", "o", "u", "v", "b")), class = "data.frame", row.names = c(NA, 
-365L))

Subset command

dat2  <-  subset(dat1, dats >= as.Date("2017-02-03") & dats <= as.Date("2017-03-31"))

Good luck.

Edit

I forgot you were using lubridate. The above will work, but to keep things consistent here is the same thing in lubridate format.

subset(dat1, dats >= ymd("2017-02-03") & dats <= ymd("2017-03-31"))
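Rather than working out the start date by hand, the window can also be computed. A sketch, assuming the window is the cut-off date plus the 56 days before it; `end_date`, `start_date` and the small `dat1` built here are illustrative stand-ins, not the actual data.

```r
library(lubridate)

# Small stand-in for dat1: one row per day of 2017
dat1 <- data.frame(dats = seq(ymd("2017-01-01"), ymd("2017-12-31"), by = "day"))

end_date   <- ymd("2017-03-31")
start_date <- end_date - days(56)   # 56 days before the cut-off

# Keep rows from start_date to end_date, both ends included
dat2 <- subset(dat1, dats >= start_date & dats <= end_date)

range(dat2$dats)
nrow(dat2)   # 57 rows here: the cut-off day plus the 56 days before it
```

Changing `end_date` or the number passed to `days()` moves the window without any manual date arithmetic.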

Thanks so much. Just working out what dat1 and dats refer to. My data frame is now ActivityCohort and the new dataframe with subset of selected dates would be called ActivityCohortDate - are these objects what dat1 and dats refer to?

Thanks very much again
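In case it helps anyone reading later: in the subset example, `dat1` is the data frame, `dats` is the date column inside it, and `dat2` is the new, filtered data frame. A sketch using the names from the question, assuming the date column in ActivityCohort is still called ActivityDate and has already been converted to Date (the sample rows are made up):

```r
# Made-up stand-in for ActivityCohort with an already-converted Date column
ActivityCohort <- data.frame(
    ActivityDate = as.Date(c("2017-01-15", "2017-02-03", "2017-03-31", "2017-04-01"))
)

# ActivityCohort plays the role of dat1, ActivityDate the role of dats,
# and ActivityCohortDate the role of dat2
ActivityCohortDate <- subset(
    ActivityCohort,
    ActivityDate >= as.Date("2017-02-03") & ActivityDate <= as.Date("2017-03-31")
)

ActivityCohortDate
```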

Thanks so much, got it to work now, really appreciate your help
