Read excel data dynamically in R

Hi all,

I have a excel file in my home directory. I read excel file using below code

df <- read_excel("new_sd.xlsx", sheet = "Sheet1")

I wanted to check if we can read the excel file dynamically similar to how we query sql tables dynamically
For example

My excel has below data

Col1  Col2
1           A
2           A
3           B

While reading above file (df <- read_excel("new_sd.xlsx", sheet = "Sheet1")), can we only import data where Col2 is "A") . So the output would be

df 
Col1  Col2
1           A
2           A

Hello @vinayprakash808,

as far as I know the read_excel variants (in several packages) do not offer a filter option.
I think it is indeed more efficient to separate the functionalities for reading and filtering in specialized packages. So use openxlsx or readxl to read the file and dplyr to do the filtering :

# create an xlsx file with the data
df <- data.frame(
  stringsAsFactors = F,
  Col1 = c(1,2,3),
  Col2 = c("A","A","B")
)

library(openxlsx)
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
writeDataTable(wb, "Sheet1", x = df)
saveWorkbook(wb, file = "new_sd.xlsx", overwrite = TRUE)

rm(list=c('df')) # remove the data.frame df

# read from the xlsx file:
df <- readxl::read_xlsx("new_sd.xlsx",sheet="Sheet1")
# filter the data
df2 <- dplyr::filter(df,Col2=="A")
print(as.data.frame(df2)) # show it as a data.frame
#>   Col1 Col2
#> 1    1    A
#> 2    2    A
Created on 2021-07-14 by the reprex package (v2.0.0)

If you are on Windows you could define the Excell file as an ODBC data source and then use sql to make queries.

thanks. But my goal is to not load excel all at once. :frowning: By filtering the data, I can limit the memory so that my data wrangling becomes easy

Read all of col2 only. And get the rows of the sheet you really want to read from analysing that. Openxlsx read functions support sending rows and column as vectors of what to read.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.