I have a directory where several CSV files are stored. The files are created by a job that downloads a new report each morning and appends the date to the end of the file name. Example of files in this directory:
RAM_report2018-05-14
RAM_report2018-05-15
RAM_report2018-05-16
I have a for loop that takes everything in the directory and creates a dataframe for each report. The date the report was run is not contained within the csv file itself so I want to use mutate to add a date column to every dataframe:
file_list <- list.files(pattern = "\\.csv$")
for (i in 1:length(file_list)){
assign(paste0(gsub("\\..*", "", file_list[i]), read.csv(file_list[i])) %>%
mutate(Report.Date = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d")))
}
which returns the following:
Error in UseMethod("mutate_") :
no applicable method for 'mutate_' applied to an object of class "character"
If I run:
str(as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d"))
it returns:
 Date[1:1], format: "2018-05-14"
I am not sure where else to go from here and Google/SO have not returned any results that have resolved the issue.
file_list <- list.files(pattern = "\\.csv$")
for (i in 1:length(file_list)){
assign(paste0(gsub("\\..*", "", file_list[i]), read.csv(file_list[i])) %>%
mutate("Report.Date" = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d")))
}
Report.Date should not be quoted in the left-hand side of your assignment statement in mutate.
file_list <- list.files(pattern = "\\.csv$")
for (i in 1:length(file_list)){
assign(paste0(gsub("\\..*", "", file_list[i]), read.csv(file_list[i])) %>%
mutate(Report.Date = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d")))
}
Edited my post. I had quoted it during an attempt at troubleshooting the problem. The error still persists.
Maybe the problem is with how you are using assign. I rarely use that function. My typical approach would be to stash each file in a list and then possibly combine the elements of the list into a single data frame with bind_rows() (if appropriate). You could try this:
file_list <- list.files(pattern = "\\.csv$")
list.out = list() # initialize empty list
for (i in 1:length(file_list)){
list.out[[paste0(gsub("\\..*", "", file_list[i])]] = read.csv(file_list[i]) %>%
mutate(Report.Date = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d")))
}
Like this (?):
for (i in 1:length(file_list)){
list.out[[paste0(gsub("\\..*", "", file_list[i]))]] = read.csv(file_list[i]) %>%
mutate(Report.Date = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d"))
}
There was a small issue with the code you sent but I think ^ is what you were describing.
What's odd is, I have used the same loop with assign and mutate previously with no issues. I may head your direction but I would still like to know why mutate is throwing this error.
Also, this directory actually has multiple different csvs that will eventually be joined so I can't bind everything in the list together simply. Thanks for the help
Yes, I missed the closing parenthesis in the paste/gsub combo, but that is what I had in mind. Hopefully, someone else will chime in to answer your specific question.
It's a problem of misplaced parentheses. Your original code from inside the for loop, with helpful whitespace added:
assign(
  paste0(
    gsub("\\..*", "", file_list[i]),
    read.csv(file_list[i])
  ) %>%
    mutate(
      Report.Date = as.Date(gsub("\\D", "", file_list[i]), format = "%Y%m%d")
    )
)
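To see what actually reaches mutate(), it can help to evaluate the paste0() call on its own. This is a minimal sketch in base R; file_name and report_df are hypothetical stand-ins for file_list[i] and the result of read.csv(file_list[i]).

```r
file_name <- "RAM_report2018-05-14.csv"  # hypothetical stand-in for file_list[i]
report_df <- data.frame(x = 1:2)         # hypothetical stand-in for read.csv(file_name)

# Same shape as the call in the loop: the data frame sits inside paste0()
pasted <- paste0(gsub("\\..*", "", file_name), report_df)

class(pasted)          # "character"
is.data.frame(pasted)  # FALSE
```

Whatever is piped onward from here is no longer a data frame, which matches the class named in the error message.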
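Because the closing parenthesis of paste0() comes after read.csv(file_list[i]) rather than after gsub(), the data frame is pasted into the name. The value returned by paste0(), which is a character vector, is then passed to mutate(). I suggest splitting up this single statement. One possibility: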
# Using seq_along() to avoid 1:0 problems
for (i in seq_along(file_list)) {
  # More explicit regex to avoid problems if reports have digits in the name
  date_stamp <- gsub(".*(\\d{4}-\\d{2}-\\d{2})\\.csv$", "\\1", file_list[i])
  report_df <- read.csv(file_list[i])
  report_df[["Report.Date"]] <- as.Date(date_stamp, format = "%Y-%m-%d")
  df_name <- tools::file_path_sans_ext(file_list[i])
  assign(df_name, report_df)
}
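In case the "1:0 problems" comment is unclear, here is a small sketch of what happens when no files match. The empty file_list is an assumption made for illustration, and n_bad and n_good are just hypothetical counters.

```r
file_list <- character(0)  # assumed: what list.files() gives when nothing matches

# 1:length(x) counts down from 1 to 0, so the loop body runs twice
n_bad <- 0
for (i in 1:length(file_list)) n_bad <- n_bad + 1

# seq_along(x) is empty, so the loop body is skipped entirely
n_good <- 0
for (i in seq_along(file_list)) n_good <- n_good + 1

n_bad   # 2
n_good  # 0
```

With 1:length(file_list), the loop would try to read file_list[1] and file_list[0], neither of which is a usable file name.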
I'd also recommend @hinkelman's advice to stash each file in a list, but would go further and suggest wrapping the reading logic in a function and using lapply:
read_report_data <- function(path) {
  date_stamp <- gsub(".*(\\d{4}-\\d{2}-\\d{2})\\.csv$", "\\1", path)
  report_df <- read.csv(path)
  report_df[["Report.Date"]] <- as.Date(date_stamp, format = "%Y-%m-%d")
  report_df
}

reports <- lapply(file_list, read_report_data)
names(reports) <- tools::file_path_sans_ext(file_list)
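Since the directory holds more than one kind of report, the named list can still be handy. Below is a rough, self-contained sketch: the temporary directory, the host and ram_gb columns, and the "^RAM_report" prefix filter are all made up for illustration, and do.call(rbind, ...) is just one base R way to stack reports that share a layout.

```r
# Write two small hypothetical reports to a temporary directory
tmp_dir <- tempfile("reports_")
dir.create(tmp_dir)
write.csv(data.frame(host = c("a", "b"), ram_gb = c(8, 16)),
          file.path(tmp_dir, "RAM_report2018-05-14.csv"), row.names = FALSE)
write.csv(data.frame(host = c("a", "b"), ram_gb = c(8, 32)),
          file.path(tmp_dir, "RAM_report2018-05-15.csv"), row.names = FALSE)

read_report_data <- function(path) {
  # Pull the YYYY-MM-DD stamp that sits right before ".csv"
  date_stamp <- gsub(".*(\\d{4}-\\d{2}-\\d{2})\\.csv$", "\\1", path)
  report_df <- read.csv(path)
  report_df[["Report.Date"]] <- as.Date(date_stamp, format = "%Y-%m-%d")
  report_df
}

file_list <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
reports <- lapply(file_list, read_report_data)
names(reports) <- tools::file_path_sans_ext(basename(file_list))

# Look up a single report by name
one_day <- reports[["RAM_report2018-05-14"]]

# Stack only the reports of one type (assumed to share the same columns)
is_ram <- grepl("^RAM_report", names(reports))
ram_reports <- do.call(rbind, reports[is_ram])
```

Other report types could be filtered the same way by their own prefix and joined afterwards.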
Thanks for pointing out the misplaced parentheses. Fixing that does exactly what I am looking for.