To get chunked writing via SQLAlchemy in R, I created a function that wraps the Python code and lets me write to Redshift using reticulate, SQLAlchemy, and pandas. It works great, but I have some questions about best practices. Right now I have a start_python function that I run before I want to export; it fires up reticulate and imports a couple of libraries. Then I run my write function. In some cases I may then be done with Python/reticulate for a while. Should I shut down the reticulate Python stack somehow? How would I even do that? Or should I just not worry about it? Other thoughts?
start_python <- function() {
  library(reticulate)
  use_condaenv("r-reticulate")
  # assign into the global environment with <<- so that
  # write_to_redshift() can see these module handles later;
  # plain <- would make them local to this function
  pd <<- import("pandas")
  sa <<- import("sqlalchemy")
}
# write a data frame to Redshift in chunks
write_to_redshift <- function(df, tablename, if_exists = "append") {
  pd_df <- r_to_py(df)
  # note: SQLAlchemy expects the postgresql:// scheme; postgres:// is deprecated
  eng <- sa$create_engine("postgresql://user:pwd@redshift_name:5439/db_name")
  conn <- eng$connect()
  write_result <- pd_df$to_sql(name = tablename, con = conn, index = FALSE,
                               if_exists = if_exists, schema = "my_schema",
                               chunksize = 10000L)
  conn$close()
  return(write_result)
}
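For reference, the Python that the wrapper drives through reticulate looks roughly like the sketch below. It swaps in an in-memory SQLite engine (a hypothetical stand-in, so it runs without Redshift credentials) and uses a small chunksize to show the batched INSERT behavior; it also shows eng$dispose() from the R side, i.e. Engine.dispose(), which closes the engine's pooled connections and is about the closest thing to "shutting down" the database side when you are done.

```python
# Sketch of the Python layer behind the R wrapper, assuming pandas and
# SQLAlchemy are installed. SQLite stands in for Redshift here; with a
# real cluster you would use a postgresql:// URL instead.
import pandas as pd
import sqlalchemy as sa

eng = sa.create_engine("sqlite://")  # in-memory stand-in for Redshift
df = pd.DataFrame({"id": range(25), "val": ["x"] * 25})

# chunksize splits the write into batched INSERTs (here, 10 rows each)
df.to_sql(name="my_table", con=eng, index=False,
          if_exists="append", chunksize=10)

n = int(pd.read_sql("SELECT COUNT(*) AS n FROM my_table", eng)["n"][0])
print(n)

# dispose() closes the engine's pooled connections once you are done
eng.dispose()
```

The Python interpreter itself stays embedded in the R session for its lifetime; reticulate does not expose a way to tear it down short of restarting R, so disposing of engines/connections is the cleanup that actually matters.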