What is the difference between read_excel(files) and sapply(files, read_excel)?

Hi community, I am pretty new to R and have a question about the read_excel function.

I am trying to import multiple files from a folder and store them as a single tibble. For this I defined the following:

path <- "c:/My Drive/test/files/"
# pattern is a regular expression, not a glob, so anchor on the extension
files <- list.files(path, pattern = "\\.xlsx$", full.names = TRUE)

The following command doesn't work:

library(readxl)
raw_dump <- read_excel(files, sheet = "Sheet1", col_names = TRUE) #not working
#Error: `path` must be a string

However, the following command, which uses sapply, works:

library(readxl)
library(dplyr) # provides %>% and bind_rows()
raw_dump <- sapply(files, read_excel, simplify = FALSE) %>% bind_rows()

How does read_excel handle multiple files? Isn't files a character vector of length 6? Why would the function be looking for a string?

And how does sapply, together with bind_rows(), manage to produce what I wanted?

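read_excel can't read multiple files in one call. Its path argument has to be a single string (a character vector of length 1), and files is a character vector of length 6, which is why your first approach fails with "`path` must be a string".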
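If it helps to see this in isolation, here is a small sketch. It assumes the readxl package is installed and uses readxl_example("datasets.xlsx"), a sample workbook that ships with readxl, as a stand-in for your own files. The names one_file, two_files and result are just for illustration.

```r
library(readxl)

# Path to a sample workbook bundled with readxl (stand-in for a real file).
one_file  <- readxl_example("datasets.xlsx")
two_files <- c(one_file, one_file)

# A single path is accepted and returns a tibble.
single <- read_excel(one_file)

# A character vector of length 2 is rejected; capture the error message
# instead of stopping the script.
result <- tryCatch(
  read_excel(two_files),
  error = function(e) conditionMessage(e)
)
print(result)
```

The second call should report the same complaint about `path` that you saw with your six files.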

In the second approach, sapply calls read_excel once for each element of files, so every call receives a single path and succeeds. Because of simplify = FALSE, the result is a list of tibbles, one per file and named by file path, with the same length as files. That list is then passed to bind_rows, which stacks the tibbles by rows into a single tibble.
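The same two steps can be written out separately, which may make it clearer what each one does. This is only a sketch: it uses lapply in place of sapply(..., simplify = FALSE), reads the sample workbook bundled with readxl twice instead of your six files, and the labels file_1, file_2 and the column name source are made up for the example.

```r
library(readxl)
library(dplyr)

# Stand-in for list.files(): the bundled sample workbook, listed twice.
sample_file <- readxl_example("datasets.xlsx")
files <- c(sample_file, sample_file)

# Step 1: one read_excel() call per path, giving a list of tibbles.
sheets <- lapply(files, read_excel)

# Hypothetical labels so each row can be traced back to its file.
names(sheets) <- paste0("file_", seq_along(files))

# Step 2: stack the tibbles by rows; .id stores the list names in a column.
combined <- bind_rows(sheets, .id = "source")
```

With your own files you could use the file names as the list names, so that the source column tells you which workbook each row came from.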

Hope this helps.

Thanks for the insight Anirban. Makes sense now.
