Hi mara and jdlong,
Thank you both for the feedback. I had forked reticulate into my GitHub repository, so I am using the latest version. I also see that reticulate defines S3 methods on the py_to_r() generic to handle pandas DataFrame conversion (e.g. py_to_r.pandas.core.frame.DataFrame). I have identified the problem. The following test executes correctly in a new R session.
library(reticulate)
library(testthat)
pd <- import("pandas")
py_config()
## python: /usr/local/bin/python3
## libpython: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/config-3.6m-darwin/libpython3.6.dylib
## pythonhome: /Library/Frameworks/Python.framework/Versions/3.6:/Library/Frameworks/Python.framework/Versions/3.6
## version: 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 05:52:31) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
## numpy: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy
## numpy_version: 1.14.5
## pandas: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas
##
## NOTE: Python version was forced by RETICULATE_PYTHON
before <- iris
expect_is(before, class = "data.frame")
convert <- r_to_py(before)
expect_is(convert, class = "pandas.core.frame.DataFrame")
after <- py_to_r(convert)
expect_is(after, class = "data.frame")
The failure occurs when I call reticulate::import("pandas", as = "pd") with the as argument. As the output below shows, the pandas DataFrame is then not converted into an R data.frame. The R class names are built from the alias (pd.core.frame.DataFrame) rather than from the real module name (pandas.core.frame.DataFrame), so the S3 method py_to_r.pandas.core.frame.DataFrame no longer matches.
pd <- import("pandas",as = "pd")
py_config()
## python: /usr/local/bin/python3
## libpython: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/config-3.6m-darwin/libpython3.6.dylib
## pythonhome: /Library/Frameworks/Python.framework/Versions/3.6:/Library/Frameworks/Python.framework/Versions/3.6
## version: 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 05:52:31) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
## numpy: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/numpy
## numpy_version: 1.14.5
## pandas: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas
##
## NOTE: Python version was forced by RETICULATE_PYTHON
before <- iris
expect_is(before, class = "data.frame")
convert <- r_to_py(before)
expect_is(convert, class = "pd.core.frame.DataFrame")
after <- py_to_r(convert)
print("after class: ")
## [1] "after class: "
print(class(after))
## [1] "pd.core.frame.DataFrame" "pd.core.generic.NDFrame"
## [3] "pd.core.base.PandasObject" "pd.core.base.StringMixin"
## [5] "pd.core.accessor.DirNamesMixin" "pd.core.base.SelectionMixin"
## [7] "python.builtin.object"
# Commented out because it fails: `after` is still a Python object, not a data.frame.
# expect_is(after, class = "data.frame")
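To illustrate why the alias matters, here is a minimal sketch in plain R, with no reticulate or Python required. The generic to_r() and its methods are hypothetical stand-ins that I made up for this example; they are not reticulate code. The sketch only shows that S3 dispatch matches on the literal class string, so an object whose class starts with "pd." cannot reach a method registered under "pandas.".

```r
# Hypothetical generic, for illustration only (not part of reticulate).
to_r <- function(x) UseMethod("to_r")

# Method registered under the full module name, mirroring the naming of
# py_to_r.pandas.core.frame.DataFrame.
to_r.pandas.core.frame.DataFrame <- function(x) "converted to data.frame"

# Fallback used when no class in the class vector has a matching method.
to_r.default <- function(x) "left unconverted"

# Two dummy objects that differ only in the module prefix of their class.
full_name <- structure(list(),
                       class = c("pandas.core.frame.DataFrame",
                                 "python.builtin.object"))
aliased   <- structure(list(),
                       class = c("pd.core.frame.DataFrame",
                                 "python.builtin.object"))

result_full    <- to_r(full_name)  # dispatches to the pandas method
result_aliased <- to_r(aliased)    # no "pd." method exists, falls to default
```

Here result_full is "converted to data.frame" and result_aliased is "left unconverted", which matches what I see with py_to_r(): the object keeps its pd.* classes and is returned unconverted.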
I have reproduced this in two different Docker containers and on my MacBook Pro, and the same failure occurs in all three. I think this should be addressed in the reticulate package.
Thanks,
Brett