I have a pickle file on S3 (created from a Python/pandas DataFrame), and I want to read it into R. I know from a previous question how to read in a CSV, and in Python I would know how to read a pickle from S3, but I am having difficulty combining the two in R with reticulate.
In Python, I run the following:
    import pandas as pd  # needed so the pickled DataFrame can be reconstructed
    import pickle
    import boto3
    from io import BytesIO

    bucket = 'my_bucket'
    filename = 'my_filename.pkl'

    s3 = boto3.resource('s3')
    with BytesIO() as data:
        # Download the object into an in-memory buffer
        s3.Bucket(bucket).download_fileobj(filename, data)
        data.seek(0)  # rewind before reading
        df1 = pickle.load(data)
which works successfully.
So I tried to convert this to R, but it failed:
    library(reticulate)
    reticulate::use_condaenv("base2", required = TRUE)

    boto3 <- reticulate::import("boto3")
    pickle <- reticulate::import("pickle")
    io <- reticulate::import("io")

    data <- io$BytesIO()
    s3 <- boto3$resource("s3")
    bucket <- 'my_bucket'
    filename <- "my_filename.pkl"

    s3$Bucket(bucket)$download_fileobj(filename, data)
    data$seek(0)
    #> Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: integer argument expected, got float
    df1 <- pickle$load(data)
    #> Error in py_call_impl(callable, dots$args, dots$keywords): UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 4: invalid start byte
Created on 2020-07-23 by the reprex package (v0.3.0)
Can anyone help with the Python to R conversion?
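For reference, the first error comes from R treating a bare `0` as a double, which reticulate passes to Python as a float; `0L` is the R integer literal. One possible workaround is to keep the download, rewind, and unpickle steps entirely in Python and expose them as a single function. The sketch below is only that: `load_pickle_via` and `read_pickle_from_s3` are hypothetical names, not part of boto3 or pickle, and I have not confirmed that this also avoids the `UnicodeDecodeError`, whose cause is not clear from the output above.

```python
import pickle
from io import BytesIO


def load_pickle_via(download_fileobj, key):
    """Download `key` into memory and unpickle it.

    `download_fileobj` is any callable taking (key, fileobj), for example
    the bound method `s3.Bucket(bucket).download_fileobj` from boto3.
    (Hypothetical helper, introduced here for illustration.)
    """
    with BytesIO() as buf:
        download_fileobj(key, buf)
        buf.seek(0)  # rewind in Python, so no R double is ever passed to seek()
        return pickle.load(buf)


def read_pickle_from_s3(bucket, key):
    """Hypothetical convenience wrapper around the boto3 calls from the question."""
    import boto3  # imported here so the helper above can be used without boto3

    s3 = boto3.resource("s3")
    return load_pickle_via(s3.Bucket(bucket).download_fileobj, key)
```

If this were saved as, say, `s3_pickle.py` (an assumed file name), it could presumably be loaded from R with `reticulate::source_python("s3_pickle.py")` and called as `df1 <- read_pickle_from_s3("my_bucket", "my_filename.pkl")`. With reticulate's default conversion, a returned pandas DataFrame should arrive as an R data frame, provided pandas is installed in the conda environment. Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.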