Changing xlsx variable to show the date instead of random numbers

I'm doing an assignment for class, and we are using a dataset in xlsx format. The dataset consists of an observation_date variable, and one other variable. When I uploaded the dataset into R, the observation_date appears as numbers, rather than 10/1/2010 for example. How do I change the data in the observation_date variable to make it appear as an actual date instead of 44033 for example?

str(myData)
tibble [253 × 2] (S3: tbl_df/tbl/data.frame)
observation_date: num [1:253] 44033 44034 44035 44036 44039 ... SP500 : num [1:253] 3257 3276 3236 3216 3239 ...

myData$Date <- as.Date(myData$Date, format = "%m/%d/%Y")
Error: Assigned data as.Date(myData$Date, format = "%m/%d/%Y") must be compatible with existing data.
:heavy_multiplication_x: Existing data has 253 rows.
:heavy_multiplication_x: Assigned data has 0 rows.
:information_source: Only vectors of size 1 are recycled.
Run rlang::last_error() to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column: Date.
myData$observation_date <- as.observation_date(myData$observation_date, format = "%m/%d/%Y")
Error in as.observation_date(myData$observation_date, format = "%m/%d/%Y") :
could not find function "as.observation_date"
myData$observation_date <- as.Date(myData$observation_date, format = "%m/%d/%Y")
Error in as.Date.numeric(myData$observation_date, format = "%m/%d/%Y") :
'origin' must be supplied

This was the code I tried to attempt to change it

I'm not sure specifically what you mean by "appear as an actual date", but

as.Date(44348, origin="1900-01-01")

[1] "2021-06-03"

Does that help?

I'm trying to change all of the dates in the dataset from appearing like "44348" to actually appearing like dates "1990-01-01".

You might try:

as.Date(myData$observation_date, origin="1900-01-01")

of course.

head(myData)

A tibble: 6 x 2

observation_date SP500

1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.

try this;

myData %>% 
  mutate(Datex = map_chr(observation_date , as.character))

I don't think that changed the format of the observation date. It still appears the same when I view(myData) or use the head(myData) commands.

if you could share your sample data in a link such as googledrive will help in getting the result or a reproducible example. As you read the file, the column seems to be numeric.
So,
try

 library(lubridate)
 myData[ , 1 ] <- ymd(myData[, 1])

its in the correct format before I imported the file into R. But once I displayed the file in R after I imported it, the date changes to the numeric value that I don't want

library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.

This is what happened when I tried your suggestion

Excel represents dates as the number of days since an origin date, try this

library(dplyr)

myData <- myData %>% 
    mutate(Date = as.Date(Date, origin="1899-12-30"))

library(dplyr)

myData <- myData %>%

  • mutate(Date = as.Date(Date, origin="1899-12-30"))
    Error: Problem with mutate() column Date.
    :information_source: Date = as.Date(Date, origin = "1899-12-30").
    :heavy_multiplication_x: do not know how to convert 'Date' to class “Date”
    Run rlang::last_error() to see where the error occurred.

not sure how to share the data from googledrive link on here

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

here is the link:

here is the code:

library(readxl)
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error: path does not exist: ‘Downloads/SP500Correct.xlsx’
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error: path does not exist: ‘Downloads/SP500Correct.xlsx’
library(readxl)
myData <- read_excel("Downloads/SP500Correct.xlsx")
Error: path does not exist: ‘Downloads/SP500Correct.xlsx’
myData <- read_excel("Downloads/SP500Correct.xlsx")

view(myData)
nrow(myData)
[1] 253
head(myData)

A tibble: 6 x 2

observation_date SP500

1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.

str(myData)
tibble [253 × 2] (S3: tbl_df/tbl/data.frame)
observation_date: num [1:253] 44033 44034 44035 44036 44039 ... SP500 : num [1:253] 3257 3276 3236 3216 3239 ...
myData$Date <- as.Date(myData$Date, format = "%m/%d/%Y")
Error: Assigned data as.Date(myData$Date, format = "%m/%d/%Y") must be compatible with existing data.
:heavy_multiplication_x: Existing data has 253 rows.
:heavy_multiplication_x: Assigned data has 0 rows.
:information_source: Only vectors of size 1 are recycled.
Run rlang::last_error() to see where the error occurred.
In addition: Warning message:
Unknown or uninitialised column: Date.
myData$observation_date <- as.observation_date(myData$observation_date, format = "%m/%d/%Y")
Error in as.observation_date(myData$observation_date, format = "%m/%d/%Y") :
could not find function "as.observation_date"
myData$observation_date <- as.Date(myData$observation_date, format = "%m/%d/%Y")
Error in as.Date.numeric(myData$observation_date, format = "%m/%d/%Y") :
'origin' must be supplied

view(myData)

myData %>%

  • mutate(Datex = map_chr(observation_date , as.character))

A tibble: 253 x 3

observation_date SP500 Datex

1 44033 3257. 44033
2 44034 3276. 44034
3 44035 3236. 44035
4 44036 3216. 44036
5 44039 3239. 44039
6 44040 3218. 44040
7 44041 3258. 44041
8 44042 3246. 44042
9 44043 3271. 44043
10 44046 3295. 44046

… with 243 more rows

view(myData)
head(myData)

A tibble: 6 x 2

observation_date SP500

1 44033 3257.
2 44034 3276.
3 44035 3236.
4 44036 3216.
5 44039 3239.
6 44040 3218.

library(lubridate)

Attaching package: ‘lubridate’

The following objects are masked from ‘package:base’:

date, intersect, setdiff, union

myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(lubridate)
myData[ , 1 ] <- ymd(myData[, 1])
Warning message:
All formats failed to parse. No formats found.
library(dplyr)

myData <- myData %>%

  • mutate(Date = as.Date(Date, origin="1899-12-30"))
    

Error: Problem with mutate() column Date.
:information_source: Date = as.Date(Date, origin = "1899-12-30").
:heavy_multiplication_x: do not know how to convert 'Date' to class “Date”
Run rlang::last_error() to see where the error occurred.

library(dplyr)
myData <- myData %>%

  • mutate(Date = as.Date(Date, origin="1899-12-30"))
    Error: Problem with mutate() column Date.
    :information_source: Date = as.Date(Date, origin = "1899-12-30").
    :heavy_multiplication_x: do not know how to convert 'Date' to class “Date”
    Run rlang::last_error() to see where the error occurred.

Did you try:

myData$observation_date <- as.Date(myData$observation_date, origin="1900-01-01")

That should change the original column in your myData data frame.

This solution worked for me. Thank you so much

1 Like

There are lots of great solutions here already but just to also mention the {janitor} package which works really well with tidyverse packages and has a function convert_to_date() that specifically handles the Excel serial numbers to dates problem. It's a great package that has some other cleaning functions and is really easy to use.

1 Like